Entry  Fri Jun 27 19:59:44 2008, John, Update, Computers, c1iovme 
    Reply  Sat Jun 28 03:10:25 2008, rob, Update, Computers, c1iovme 
Message ID: 587     Entry time: Sat Jun 28 03:10:25 2008     In reply to: 586
Author: rob 
Type: Update 
Category: Computers 
Subject: c1iovme 

Quote:
C1susvme2 and C1iovme crashed which sent the optics swinging and tripped the watchdogs.

Koji and I were able to restore c1susvme2 without any trouble.

We have been unable to revive c1iovme. We have tried telnetting in and running startup.cmd,
the process runs for a while then hangs with "DAQ init failed -- exiting".

Resetting the board doesn't help. I didn't try keying the whole crate.

All optics are back to normal with damping restored.


I tried keying the crate, then keying the DAQ controller & AWG, then powering down & restarting the framebuilder.
On coming up, the framebuilder doesn't start a daqd process, and I can't get one to start by hand (it just prints "652", and then stops).
No error messages and daqd doesn't appear in the prstat.

I then tried keying the DAQ controller again (after the fb0 reboot), which blew the watchdogs on all the suspensions. So then I went around and keyed all the crates.

Now, the suspension controllers are back online. Still no c1iovme, and now the framebuilder/DAQ/AWG are also hosed. We can try keying all the crates again, in the order that Yoichi did last week.

After some more poking around, I found the daqd log file. It's now complaining about

Jun 28 03:00:39 fb daqd[546]: [ID 355684 user.info] Fatal error: channel `C1:PSL-FSS_MIXERM_F' is duplicated 126

This is the second error message like this. It first complained about C1:PSL-FSS_FAST_F, so I commented that out of C1IOOF.ini and rebooted the framebuilder (note that this is an actual reboot of the full Solaris machine). Eventually I discovered that C1IOOF.ini and C1IOO.ini are essentially identical. We will presumably keep getting these duplicate channel errors until one of the two files is completely removed.
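Rather than finding the duplicates one framebuilder reboot at a time, something like the following could list every channel that is defined in more than one .ini file before daqd is restarted. This is only a rough sketch, not a tested tool: it assumes each channel is declared as a bracketed section header such as [C1:PSL-FSS_FAST_F], that commented-out lines begin with "#", and that a [default] section (if present) is not a channel. The function names channels_in and find_duplicates are made up for this example.

```python
import re
from collections import defaultdict

# Assumption: a channel is declared by a section header like [C1:PSL-FSS_FAST_F].
SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]\s*$")


def channels_in(text):
    """Return the channel names declared in one .ini file's text."""
    names = []
    for line in text.splitlines():
        # Assumption: commented-out channels start with '#'.
        if line.lstrip().startswith("#"):
            continue
        match = SECTION_RE.match(line)
        # Assumption: [default] holds shared settings and is not a channel.
        if match and match.group(1) != "default":
            names.append(match.group(1))
    return names


def find_duplicates(files):
    """Map each channel declared more than once to the files declaring it.

    `files` is a dict of {file name: file contents}.
    """
    seen = defaultdict(list)
    for fname, text in files.items():
        for channel in channels_in(text):
            seen[channel].append(fname)
    return {ch: where for ch, where in seen.items() if len(where) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_dups.py C1IOO.ini C1IOOF.ini ...
    contents = {}
    for path in sys.argv[1:]:
        with open(path) as handle:
            contents[path] = handle.read()
    for channel, where in sorted(find_duplicates(contents).items()):
        print(channel, "is defined in", ", ".join(where))
```

Run against C1IOO.ini and C1IOOF.ini, this should print one line per duplicated channel, which would show at a glance whether the two files really overlap completely.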

C1IOO.ini has a modification time of seven PM on Friday night. Who did this and didn't elog it? I've now modified C1IOOF.ini, and I don't remember when it was last modified.