ATF eLog
Entry  Fri Feb 4 19:38:49 2011, rana, joeB, Computing, DAQ, fb0 problems fixed: daqd, nds, and atffe all now running Untitled.png
    Reply  Mon Feb 7 12:59:11 2011, joeB, Computing, DAQ, fb0 problems fixed: daqd, nds, and atffe all now running 
Message ID: 1287     Entry time: Fri Feb 4 19:38:49 2011     Reply to this: 1291
Author: rana, joeB 
Type: Computing 
Category: DAQ 
Subject: fb0 problems fixed: daqd, nds, and atffe all now running 

Summary:

  1. Frames were not being written because the disks were full. We deleted old files and restarted everything.
  2. The FB0 network setup was misconfigured (extra network devices were set up), and NTPD could not set the clock. Joe fixed this with ifconfig.

 


The script that cleans up the frame files (so that the disk doesn't become overfull) was set to delete files only when the /frames/full directory reached 99.7% of full capacity. This is ridiculously close to the edge. We set it to 95% instead. The same edit also changed $minute_frames_percent_keep from 0.005 to 0.2. Here's the diff in the /target/fb/wiper.pl script (wiper.pl~ is the old version):

fb0:fb>diff wiper.pl wiper.pl~
26c26
< $full_frames_percent_keep = 95;
---
> $full_frames_percent_keep = 99.7;
32c32
< $minute_frames_percent_keep = 0.2;
---
> $minute_frames_percent_keep = 0.005;
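
For illustration only, here is a minimal Python sketch of what a keep-percentage threshold like this implies: delete the oldest frame files until usage drops to the threshold. This is NOT the actual wiper.pl logic, which is not shown in this entry. The names FrameFile and select_for_deletion are hypothetical, and the sketch assumes disk usage is just the sum of the frame file sizes.

```python
from typing import List, NamedTuple


class FrameFile(NamedTuple):
    """Hypothetical record for one frame file (not taken from wiper.pl)."""
    path: str
    mtime: float  # modification time, seconds since the epoch
    size: int     # size in bytes


def select_for_deletion(files: List[FrameFile], capacity: int,
                        percent_keep: float) -> List[FrameFile]:
    """Pick the oldest files to delete until usage is <= percent_keep% of capacity.

    Assumption: usage is the sum of the listed file sizes only.
    """
    limit = capacity * percent_keep / 100.0
    used = sum(f.size for f in files)
    doomed = []
    # Walk from oldest to newest, stopping once we are under the limit.
    for f in sorted(files, key=lambda f: f.mtime):
        if used <= limit:
            break
        doomed.append(f)
        used -= f.size
    return doomed
```

With a 99.7% threshold almost nothing is selected until the disk is effectively full, which leaves little margin if the cleanup runs late or something else writes to the same partition. At 95% the same logic starts freeing space much earlier.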

The DAQD process was spitting out core files, which had also filled up the / partition on FB0. After we deleted them, the regular system processes were able to run. To check the disk space you can use 'df -h':

fb0:fb>df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      224G   24G  189G  12% /
/dev/sda1              99M   28M   66M  30% /boot
tmpfs                1006M     0 1006M   0% /dev/shm
/dev/sdd1             1.4T  142G  1.3T  11% /frames/trend
/dev/sdc1             917G  707G  165G  82% /frames/full
fb1:/cvs              917G  104G  814G  12% /cvs
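
A check like this could also be scripted so that a filling partition gets noticed before frames stop being written. The sketch below is an assumption-laden example and not something installed on FB0: the function names and the idea of a limit in percent are hypothetical. It computes the Use% figure the way GNU df appears to, as used / (used + available) rounded up, which reproduces the 82% shown for /frames/full above (707G used, 165G available).

```python
import shutil


def use_percent(used: int, avail: int) -> int:
    """Use% as df reports it: used / (used + avail), rounded up to an integer."""
    total = used + avail
    if total == 0:
        return 0
    return (100 * used + total - 1) // total  # integer ceiling division


def over_limit(path: str, limit_percent: float) -> bool:
    """Return True if the filesystem holding `path` is fuller than limit_percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return use_percent(usage.used, usage.free) > limit_percent


if __name__ == "__main__":
    # Numbers from the df output above, in units of GB.
    print(use_percent(707, 165))  # /frames/full
    print(use_percent(24, 189))   # /
```

Calling over_limit("/frames/full", 95) from a periodic job would be one way to warn when the frame disk passes the same 95% mark used in the wiper script.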

We found that although NTPD was running on FB0, it had been configured in some really screwy way. We used /sbin/ifconfig to remove the configuration for the other network devices (eth1, eth2, eth3), so that FB0 now talks only to the ATF martian network and the router. The router is now configured NOT to filter out the requests from FB0. NTPD now works and seems to be correctly fixing the computer's system time. There's still the issue that the ATF FE will change this time as long as the FE is running, but I guess the system clock will get fixed once in a while when we restart the FE and NTP takes over.

 

Along the way, I also restored the Xinerama dual-head display on ws1. Alastair somehow believed that it had never been dual-head before, but in fact I elogged the procedure in September. Please don't do any auto-updates on any of these machines unless you know what you're doing and are willing to fix things after breaking them.

I attached an image showing that I can, indeed, get real time data from one of the _DAQ channels of the gyro.

Attachment 1: Untitled.png  914 kB