40m Log
Entry  Thu Feb 26 13:17:31 2015, ericq, Update, Computer Scripts / Programs, FB IO load 
    Reply  Thu Feb 26 13:55:59 2015, jamie, Update, Computer Scripts / Programs, FB IO load 
       Reply  Mon Mar 23 19:30:36 2015, rana, Update, Computer Scripts / Programs, rsync frames to LDAS cluster 
          Reply  Wed May 13 09:17:28 2015, rana, Update, Computer Scripts / Programs, rsync frames to LDAS cluster 
             Reply  Mon May 18 14:22:05 2015, ericq, Update, Computer Scripts / Programs, rsync frames to LDAS cluster 
Message ID: 11076     Entry time: Thu Feb 26 13:17:31 2015     Reply to this: 11077
Author: ericq 
Type: Update 
Category: Computer Scripts / Programs 
Subject: FB IO load 

Over the past few days, I've occasionally been peeking at the framebuilder IO load to see if I could correlate anything with it, but it has usually been low when I looked; i.e., with daqd and all models running, the %wa time was a few percent at most.

Just now, I was seeing some EPICS sluggishness, and sure enough, the %wa was in the 50-60% range. I ran iostat -xmh 5 on the framebuilder and saw that /dev/sda, the /frames drive, was at 100% utilization, which means it was reading and writing as fast as it possibly could.
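
For reference, the %wa number can be reproduced without top by differencing two samples of the aggregate "cpu" line in /proc/stat. This is only a rough sketch: the helper names are made up for illustration, and it assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def parse_cpu_line(line):
    """Return the first eight jiffy counters from the aggregate 'cpu' line."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Order: user nice system idle iowait irq softirq steal.
    # Guest time is already counted inside user/nice, so stop after steal.
    return [int(x) for x in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of CPU time spent waiting on IO between two samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as f:
        return f.readline()


# Example use on the machine of interest (hypothetical 5 s interval):
#   import time
#   first = read_cpu_line(); time.sleep(5)
#   print("%.1f%% iowait" % iowait_percent(first, read_cpu_line()))
```

Logging this alongside the rsync schedule would be one way to check whether the high-%wa periods line up with the transfers.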

I ssh'd over to nodus, and with iotop found that an rsync job was running (rsync -am --exclude .*.gwf full 131.215.114.19::40m/full), and its IO rates corresponded very closely to the data read rates on the framebuilder from /frames. 

I killed the rsync process on nodus, and the %wa time on the framebuilder dropped to near zero. The ASS striptools, where I had noticed the sluggishness, immediately started updating faster.

While rsync is supposed to play nice with a system's IO demands, maybe it only knows about nodus's IO usage, and not that of fb, the underlying NFS server where the frames live. I think it would be good to throttle these jobs to a fixed bandwidth. 50MB/s seemed like too much, so maybe 10MB/s is ok?
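
One possible way to do the throttling is rsync's --bwlimit option, which caps the average transfer rate and, without a suffix, takes its value in units of 1024 bytes per second. The sketch below just builds the same command as above with that flag added. The function name is hypothetical, and it treats "10MB/s" as 10 MiB/s; whether that is the right number for fb is still an open question.

```python
def throttled_rsync_cmd(limit_mib_per_s,
                        src="full",
                        dest="131.215.114.19::40m/full"):
    """Build the frame rsync command with a bandwidth cap.

    limit_mib_per_s is converted to KiB/s, the default unit for --bwlimit.
    """
    kib_per_s = int(limit_mib_per_s * 1024)
    return [
        "rsync", "-am",
        "--bwlimit=%d" % kib_per_s,
        "--exclude", ".*.gwf",   # same exclude pattern as the existing job
        src, dest,
    ]


# Example use (would actually start the transfer, so left commented out):
#   import subprocess
#   subprocess.run(throttled_rsync_cmd(10), check=True)
```

With a limit of 10 this yields --bwlimit=10240. Passing the arguments as a list avoids the shell, so the .*.gwf pattern reaches rsync unexpanded.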
