Summary:
- Frames were not being written because the disks were full. We deleted old files and restarted everything.
- The FB0 network setup was misconfigured. Joe fixed it with ifconfig.
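The script that cleans up the frame files (so that the disk doesn't become overfull) was set to delete files only when the /frames/full directory reached 99.7% of its capacity. This is ridiculously close to the edge, so we set it to 95% instead. Here's the diff of the /target/fb/wiper.pl script against its backup copy wiper.pl~ (lines marked < are the new values, lines marked > are the old ones). The same edit also changed $minute_frames_percent_keep from 0.005 to 0.2: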
fb0:fb>diff wiper.pl wiper.pl~
26c26
< $full_frames_percent_keep = 95;
---
> $full_frames_percent_keep = 99.7;
32c32
< $minute_frames_percent_keep = 0.2;
---
> $minute_frames_percent_keep = 0.005;
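The threshold check described above can be sketched in shell. This is only an illustration of the idea, not the actual logic in wiper.pl: the function names over_threshold and usage_percent are hypothetical, and it assumes a df that supports the POSIX -P output format.

```shell
# Hypothetical helper: succeeds (exit 0) when the used percentage exceeds
# the keep threshold. awk does the comparison so that fractional values
# such as 99.7 work, which plain shell arithmetic cannot handle.
over_threshold() {
    awk -v used="$1" -v keep="$2" 'BEGIN { exit ((used + 0 > keep + 0) ? 0 : 1) }'
}

# Hypothetical helper: prints the used percentage (without the % sign) of
# the filesystem holding the given path, taken from column 5 of `df -P`.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example with the numbers from this entry: 82% used, 95% threshold.
if over_threshold 82 95; then
    echo "over threshold, old frames should be deleted"
else
    echo "below threshold, nothing to delete"
fi

# On the frame builder itself one might instead run:
#   over_threshold "$(usage_percent /frames/full)" 95 && echo "time to wipe"
```

With the old setting of 99.7 the check would only fire when the disk is essentially full, which leaves no margin if the cleanup script runs late.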
The DAQD process was spitting out core files and had also filled up the / partition on FB0. After we deleted these, the regular system processes were able to run. To check the disk space you can use 'df -h':
fb0:fb>df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      224G   24G  189G  12% /
/dev/sda1              99M   28M   66M  30% /boot
tmpfs                1006M     0 1006M   0% /dev/shm
/dev/sdd1             1.4T  142G  1.3T  11% /frames/trend
/dev/sdc1             917G  707G  165G  82% /frames/full
fb1:/cvs              917G  104G  814G  12% /cvs
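If the / partition fills up again, a quick way to look for leftover core files is sketched below. The function name list_core_files is hypothetical, and the file name pattern (core or core.<number>) is an assumption about how the core files are named on FB0; adjust it to whatever DAQD actually leaves behind.

```shell
# Hypothetical helper: lists regular files named "core" or "core.<digits>..."
# under the given directory. -xdev keeps find on the same filesystem, so a
# search from / does not wander into /frames or NFS mounts such as /cvs.
list_core_files() {
    find "$1" -xdev -type f \( -name 'core' -o -name 'core.[0-9]*' \) -print
}

# Example: show candidate core files on the / partition, but do not delete
# anything automatically.
#   list_core_files / 2>/dev/null
```

Listing first and deleting by hand is the safer choice here, since a file that merely happens to be called core would otherwise be removed too.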
We found that although NTPD was running on FB0, the network on FB0 had been configured in some really screwy way. We used /sbin/ifconfig to remove the configuration for the other network devices (eth1, eth2, eth3) and set it up so that FB0 only talks to the ATF martian network and the router. The router is now configured to NOT filter out the requests from FB0. Now NTPD works and seems to be correctly fixing the computer's system time. There's still the issue that the ATF FE will change this time as long as the FE is running, but I guess the system clock will get fixed once in a while when we restart the FE and NTP takes over.
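One way to confirm that NTPD has actually locked onto a server is to look at the peer list from ntpq, which marks the currently selected sync source with a leading asterisk. A minimal sketch, assuming ntpq is installed on FB0; the function name has_sync_peer is hypothetical:

```shell
# Hypothetical helper: reads `ntpq -p` style output on stdin and succeeds
# if any peer line starts with "*", the marker ntpq uses for the peer
# currently selected as the synchronization source.
has_sync_peer() {
    grep -q '^\*'
}

# Example use on the host itself:
#   if ntpq -pn | has_sync_peer; then
#       echo "NTP has selected a sync source"
#   else
#       echo "NTP has no sync source yet"
#   fi
```

Right after a restart it can take several minutes before any peer gets the asterisk, so a negative result immediately after restarting NTPD is not necessarily a problem.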
Along the way, I also restored the Xinerama dual-head display on ws1. Alastair somehow believed that it had never been dual-head before, but in fact I elogged the procedure in September. Please don't do any auto-updates on any of these machines unless you know what you're doing and are willing to fix things after breaking them.
I attached an image showing that I can, indeed, get real-time data from one of the _DAQ channels of the gyro.