The daqd process is segfaulting and restarting itself every 30 seconds or so. It's pretty frustrating.
Just for kicks, I tried an mxstream restart, clearing the testpoints, and restarting the daqd process, but none of these things changed anything.
Manasa found an elog from a year ago (elog 7105 and preceding), but I'm not sure that it's a similar / related problem. Jamie, please help us.
The problem is not exactly the same as what's described in 7105, but the symptoms are so similar I assumed they must have a similar source.
And sure enough, /frames is completely full:
controls@fb /opt/rtcds/caltech/c1/target/fb 0$ df -h /frames/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        13T   13T     0 100% /frames
controls@fb /opt/rtcds/caltech/c1/target/fb 0$
So the problem in both cases was that it couldn't write out the frames. Unfortunately daqd is apparently too stupid to give us a reasonable error message about what's going on.
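Since daqd doesn't report this itself, a small external check can catch a full frame disk before the segfaults start. The following is only a minimal sketch, not something that exists on fb: the function names and the 95% threshold are made up for illustration. It computes the same percentage that df shows in the Use% column.

```python
import shutil


def usage_fraction(path="/frames"):
    """Return used / (used + available) for the filesystem holding `path`.

    This matches the ratio behind df's Use% column, which excludes blocks
    reserved for root from the denominator.
    """
    usage = shutil.disk_usage(path)
    return usage.used / (usage.used + usage.free)


def is_nearly_full(path="/frames", threshold=0.95):
    """True if the filesystem is at or above `threshold` (hypothetical default)."""
    return usage_fraction(path) >= threshold
```

Run from cron, something like `is_nearly_full("/frames")` could trigger an email or an alarm well before the disk actually hits 100%.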
So why is /frames full? Apparently the wiper script is either not running, or is failing to do its job. My guess is that this is a side effect of the linux1 raid failure we had over xmas.
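For reference, the job a wiper has to do is roughly the following. This is not the actual wiper script, whose logic I haven't shown here; it's a hypothetical sketch that assumes the policy is "delete the oldest files first until usage drops below some target". The function names, the 90% target, and the `bytes_to_free` override are all invented for illustration. It defaults to a dry run, so it only lists what it would delete.

```python
import os
import shutil


def oldest_first(root):
    """Return regular files under `root`, oldest modification time first."""
    files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                files.append(full)
    files.sort(key=os.path.getmtime)
    return files


def bytes_over_target(root, target_fraction):
    """Bytes that must be freed to bring usage down to `target_fraction`."""
    usage = shutil.disk_usage(root)
    capacity = usage.used + usage.free
    return max(0, int(usage.used - target_fraction * capacity))


def wipe_oldest(root, target_fraction=0.9, dry_run=True, bytes_to_free=None):
    """Select (and, unless dry_run, delete) the oldest files under `root`.

    Stops once the selected files add up to at least `bytes_to_free`.
    Returns the list of selected paths, oldest first.
    """
    if bytes_to_free is None:
        bytes_to_free = bytes_over_target(root, target_fraction)
    freed = 0
    selected = []
    for path in oldest_first(root):
        if freed >= bytes_to_free:
            break
        size = os.path.getsize(path)
        if not dry_run:
            os.remove(path)
        freed += size
        selected.append(path)
    return selected
```

Calling `wipe_oldest("/frames")` with the default `dry_run=True` just returns the candidate list, which is a safe way to see what would go before deleting anything for real.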