40m Log
Entry  Tue Oct 17 23:07:52 2017, gautam, Update, CDS, FEs unresponsive 
    Reply  Wed Oct 18 01:41:32 2017, jamie, Update, CDS, FEs unresponsive 
       Reply  Wed Oct 18 02:09:32 2017, gautam, Update, CDS, FEs unresponsive 
          Reply  Wed Oct 18 09:21:22 2017, jamie, Update, CDS, FEs unresponsive 
             Reply  Wed Oct 18 23:11:53 2017, gautam, Update, CDS, FEs unresponsive 
Message ID: 13388     Entry time: Wed Oct 18 09:21:22 2017     In reply to: 13387     Reply to this: 13394
Author: jamie 
Type: Update 
Category: CDS 
Subject: FEs unresponsive 
Quote:

I was looking at the ASDC channel on dataviewer, and toggling various settings like whitening gain. At some point, the signal just froze. So I quit dataviewer and tried restarting it, at which point it complained about not being able to connect to FB. This is when I brought up the CDS_OVERVIEW medm screen, and noticed the frozen 1pps indicator lights. There was certainly something going on with the end FEs, because I was able to ping the machine, but not ssh into it. Once the 1pps lights came back, I was able to ssh into c1iscex and c1iscey, no problems.

Could it be that some of the mx processes stalled, but the systemctl routine automatically restarted them after some time?

An mx_stream glitch would have interrupted data flowing from the front end to the DAQ, but it wouldn't have affected the heartbeat. The heartbeat stopping could mean either that the front end process froze, or that the EPICS communication stopped. The fact that everything came back fine after a couple of minutes indicates to me that the front end processes all kept running fine. If they hadn't, I'm sure the machines would have locked up. The fact that you couldn't ssh into the FE machines is also suspicious.

My best guess is that there was a network glitch on the martian network.  I don't know how to account for the fact that pings still worked, though.
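
If this happens again, it might help to log ping-style reachability and ssh-port reachability separately, since ping only exercises ICMP while ssh needs a full TCP connection to the sshd port. Below is a rough Python sketch of the TCP half. The function names, port 22, and the timeout and interval values are my assumptions, not anything that exists in the 40m scripts.

```python
import socket
import time


def tcp_port_open(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection does the name lookup and the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name resolution.
        return False


def probe(hosts, port=22, timeout=2.0):
    """Make one pass over hosts and return {host: reachable}."""
    return {h: tcp_port_open(h, port, timeout) for h in hosts}


def watch(hosts, passes=1, interval=10.0):
    """Print a timestamped line per host for a fixed number of passes."""
    for i in range(passes):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for host, ok in probe(hosts).items():
            state = "open" if ok else "unreachable"
            print(f"{stamp} {host} ssh port {state}")
        if i + 1 < passes:
            time.sleep(interval)
```

Something like `watch(["c1iscex", "c1iscey"], passes=360)` left running next to a plain ping would show whether the ssh port drops out while ICMP keeps answering. It would not by itself say whether the cause is the network or the machine.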
