Entry  Thu Jul 5 10:49:38 2012, Jamie, Update, CDS, front-end/fb communication lost, likely again due to timing offsets 
    Reply  Thu Jul 5 11:12:53 2012, Jenne, Update, CDS, front-end/fb communication lost, likely again due to timing offsets 
    Reply  Thu Jul 5 12:27:05 2012, Jamie, Update, CDS, front-end/fb communication lost, likely again due to timing offsets 
Message ID: 6920     Entry time: Thu Jul 5 12:27:05 2012     In reply to: 6917
Author: Jamie 
Type: Update 
Category: CDS 
Subject: front-end/fb communication lost, likely again due to timing offsets 

Quote:

All the front-ends are showing 0x4000 status and have lost communication with the frame builder.  It looks like the timing skew is back again.  The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds!  I'm looking into it now.

This was indeed another leap-second timing issue.  I'm guessing nodus resync'd from whatever server is posting the wrong time, and it brought everything out of sync again.  It really looks like the Caltech server is off.  When I manually sync from there the time is off by a second, and then when I manually sync from the global pool it is correct.

I went ahead and updated nodus's config (/etc/inet/ntp.conf) to point to the global pool (pool.ntp.org).  I then restarted the ntp daemon:

  nodus$ sudo /etc/init.d/xntpd stop
  nodus$ sudo /etc/init.d/xntpd start

That brought nodus's time in sync.

At that point all I had to do was resync the time on fb:

  fb$ sudo /etc/init.d/ntp-client restart

When I did that, daqd died, but it immediately restarted and everything was in sync.
