Quote: | C1susvme2 and C1iovme crashed which sent the optics swinging and tripped the watchdogs.
Koji and I were able to restore c1susvme2 without any trouble.
We have been unable to revive c1iovme. We have tried telneting in and running startup.cmd,
the process runs for a while then hangs with "DAQ init failed -- exiting".
Resetting the board doesn't help. I didn't try keying the whole crate.
All optics are back to normal with damping restored. |
I tried keying the crate, then keying the DAQ controller & AWG, then powering down & restarting the framebuilder.
On coming up, the framebuild doesn't start a daqd process, and I can't get one to start by hand (it just prints "652", and then stops).
No error messages and daqd doesn't appear in the prstat.
I then tried keying the DAQ controller again (after the fb0 reboot), which blew the watchdogs on all the suspensions. So then I went around and keyed all the crates.
Now, the suspension controllers are back online. Still no c1iovme, and now the framebuilder/DAQ/AWG are also hosed. We can try keying all the crates again, in the order that Yoichi did last week.
After some more poking around, I found the daqd log file. It's now complaining about
Jun 28 03:00:39 fb daqd[546]: [ID 355684 user.info] Fatal error: channel `C1: PSL-FSS_MIXERM_F' is duplicated 126
This is the second error message like this. It first complained about C1: PSL-FSS_FAST_F, so I commented that out of C1IOOF.ini and rebooted the framebuilder (note this is an actual reboot of the full solaris machine). Eventually I discovered that C1IOOF.ini and C1IOO.ini are essentially identical. They presumably will keep getting these duplicate channel errors until one of them is completely removed.
C1IOO.ini has a modification time of seven PM on Friday night. Who did this and didn't elog it? I've now modified C1IOOF.ini, and I don't remember when it was last modified. |