40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Message ID: 6944     Entry time: Mon Jul 9 11:27:27 2012
Author: Jenne 
Type: Update 
Category: Computers 
Subject: c1oaf has been down for several days - BURT restore wasn't done correctly on startup 

The c1oaf model hasn't been running for a few days (since the leap second problems we were having last week).  I had looked into it, but finally figured it out (with Jamie's help) today. 

The BURT restore has to be given to the model during startup, but for whatever reason it wasn't BURT restoring until *after* the model had already failed to start.  The symptoms were:  no 'heartbeat' for the oaf model, no connection to the fb, NO SYNC on the GDS screen, 0x4000.  the BURT restore button was green, which threw me off the scent, but that's just because it did, in fact, get set, just way too late.

I ended up looking in the dmesg of the lsc computer, and the last set of stuff was several lines of "[3354303.626446] c1oaf: Epics burt restore is 0".  Nothing else was written after that.  Jamie pointed out that this meant the BURT restore wasn't getting sent before the model unloaded itself and decided not to run.

The solution:  restart the model, and manually click the BURT restore button as soon as you're able (after everything comes back from being white).  We used to have to do this, but then there was a "fix", which apparently isn't super robust and failed for the oaf (even though it used to work just fine).  Bugzilla report submitted.

ELOG V3.1.3-