Quote: |
I was looking at the ASDC channel on dataviewer, and toggling various settings like whitening gain. At some point, the signal just froze. So I quit dataviewer and tried restarting it, at which point it complained about not being able to connect to FB. This is when I brought up the CDS_OVERVIEW medm screen, and noticed the frozen 1pps indicator lights. There was certainly something going on with the end FEs, because I was able to ping the machine, but not ssh into it. Once the 1pps lights came back, I was able to ssh into c1iscex and c1iscey, no problems.
Could it be that some of the mx processes stalled, but the systemctl routine automatically restarted them after some time?
|
An mx_stream glitch would have interrupted data flowing from the front end to the DAQ, but it wouldn't have affected the heartbeat. The heartbeat stopping could mean either that the front end process froze, or that EPICS communication stopped. The fact that everything came back fine after a couple of minutes indicates to me that the front end processes all kept running fine. If they hadn't, I'm sure the machines would have locked up. The fact that you couldn't ssh into the FE machine is also suspicious.
My best guess is that there was a network glitch on the martian network. I don't know how to account for the fact that pings still worked, though. |
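One way to narrow down the "ping works, ssh doesn't" puzzle next time it happens: ICMP echo replies are normally generated by the kernel, whereas an ssh login needs the userspace sshd to accept the connection and send its banner. The sketch below is only a rough diagnostic, not something that was run during this event. The function names `probe_ssh` and `interpret` are hypothetical, port 22 is assumed for sshd, and the readings returned by `interpret` are my guesses at plausible interpretations rather than a definitive diagnosis.

```python
import socket


def probe_ssh(host, port=22, timeout=2.0):
    """Classify how far an ssh connection attempt gets.

    Returns one of:
      'no_tcp'   - TCP connection could not be established
      'tcp_only' - TCP connected, but no SSH banner arrived in time
      'banner'   - the server sent an 'SSH-...' identification string
    """
    try:
        # The timeout applies to the connect and to the later recv().
        with socket.create_connection((host, port), timeout=timeout) as s:
            try:
                data = s.recv(64)
            except OSError:  # timed out or reset while waiting for the banner
                return "tcp_only"
            return "banner" if data.startswith(b"SSH-") else "tcp_only"
    except OSError:  # refused, timed out, unreachable, or name lookup failed
        return "no_tcp"


def interpret(ping_ok, probe):
    """Map a ping result plus a probe_ssh() result to a tentative reading."""
    if probe == "banner":
        return "sshd is responding; the problem is elsewhere or has cleared"
    if ping_ok and probe == "tcp_only":
        return "kernel network stack is alive, but sshd is not responding"
    if ping_ok and probe == "no_tcp":
        return "ICMP gets through, but TCP to the ssh port does not"
    return "host or network path appears to be down"
```

For example, `interpret(True, probe_ssh("c1iscex"))` during a freeze would hint at whether the front end's userspace was stalled or whether TCP traffic on the martian network was being dropped while ICMP still got through.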