Summary:
The GPS time mismatch between /proc/gps and gpstime seems to be resolved. However, the 0x4000 DC errors still persist, and it is not clear to me why.
Details:
On the phone on Friday, J Hanks reminded me that c1sus seems to be the only machine with an IRIG-B timing card installed. I can't find the elog, but I remember that Jamie, ericq and I did this work in 2016 (?), and that Jamie said it wasn't working exactly as expected. Since the DAQ was working fine before this card was installed, and since there are no problems recording channels from the other four FE machines, which do not have this card, I decided to simply pull the card out of the expansion chassis. The card has been stored in the CDS/FE cabinet along the Y arm for now. The cable that interfaces to the card and brings over the 1pps from the GPS unit has also been stored in the CDS/FE cabinet.
This seems to have resolved the mismatch between the GPS time reported by cat /proc/gps and by the gpstime command - Attachment #1 (the <1 second mismatch is presumably due to the delay between running the two commands). However, the 0x4000 DC errors still persist. I'll try the full power cycle of FEs and FB, which has fixed this kind of error in the past, but apart from that, I'm out of ideas.
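For future checks, the comparison above can be scripted rather than eyeballed. This is only a rough sketch: it assumes /proc/gps contains a single floating-point GPS time (which is what cat /proc/gps appeared to print), that the system clock is NTP-disciplined, and that the GPS-UTC offset is still 18 leap seconds. The function names are made up for illustration.

```python
# Sketch: compare the front-end GPS time with GPS time derived from the system clock.
# Assumptions (not verified against the RTS code): /proc/gps holds one float,
# and the GPS-UTC leap second offset is 18 s (true since 2017-01-01).

GPS_EPOCH_UNIX = 315964800  # 1980-01-06 00:00:00 UTC as a Unix timestamp
LEAP_SECONDS = 18           # assumed current GPS-UTC offset


def unix_to_gps(unix_seconds, leap_seconds=LEAP_SECONDS):
    """Convert a Unix (UTC) timestamp to GPS seconds."""
    return unix_seconds - GPS_EPOCH_UNIX + leap_seconds


def gps_mismatch(proc_gps_text, unix_seconds):
    """Return (front-end GPS time) - (system-clock GPS time), in seconds."""
    proc_gps = float(proc_gps_text.strip())
    return proc_gps - unix_to_gps(unix_seconds)


# Example use on a front-end machine (not executed here):
#   import time
#   with open("/proc/gps") as f:
#       print(gps_mismatch(f.read(), time.time()))
```

A result well under one second would be consistent with Attachment #1; an offset close to an integer number of seconds would point back at a timing problem.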
Update 1215:
Following the instructions in this elog did not fix the problem. The problem seems to be with the daqd_fw service, which reports the following:
controls@fb1:~ 0$ sudo systemctl status daqd_fw.service
● daqd_fw.service - Advanced LIGO RTS daqd frame writer
Loaded: loaded (/etc/systemd/system/daqd_fw.service; enabled)
Active: failed (Result: start-limit) since Wed 2019-01-09 12:17:12 PST; 2min 0s ago
Process: 2120 ExecStart=/usr/bin/daqd_fw -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.fw (code=killed, signal=ABRT)
Main PID: 2120 (code=killed, signal=ABRT)
Jan 09 12:17:12 fb1 systemd[1]: Unit daqd_fw.service entered failed state.
Jan 09 12:17:12 fb1 systemd[1]: daqd_fw.service holdoff time over, scheduling restart.
Jan 09 12:17:12 fb1 systemd[1]: Stopping Advanced LIGO RTS daqd frame writer...
Jan 09 12:17:12 fb1 systemd[1]: Starting Advanced LIGO RTS daqd frame writer...
Jan 09 12:17:12 fb1 systemd[1]: daqd_fw.service start request repeated too quickly, refusing to start.
Jan 09 12:17:12 fb1 systemd[1]: Failed to start Advanced LIGO RTS daqd frame writer.
Jan 09 12:17:12 fb1 systemd[1]: Unit daqd_fw.service entered failed state.
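Note that the "start-limit" result above is only systemd giving up after repeated restarts; the underlying failure is the ABRT of the daqd_fw process itself. A small sketch of how this status output could be parsed automatically is below, for example in a script that checks the DAQ services. It only relies on the "Active:" and "Main PID:" lines as they appear in the output above; the function names are hypothetical.

```python
import re


def summarize_unit_status(status_text):
    """Extract the state, result, exit code and signal from `systemctl status` text."""
    summary = {"active": None, "result": None, "code": None, "signal": None}

    # e.g. "Active: failed (Result: start-limit) since ..."
    m = re.search(r"Active:\s+(\S+)(?:\s+\(Result:\s+([^)]+)\))?", status_text)
    if m:
        summary["active"] = m.group(1)
        summary["result"] = m.group(2)

    # e.g. "Main PID: 2120 (code=killed, signal=ABRT)"
    m = re.search(r"Main PID:\s+\d+\s+\(code=(\w+)(?:,\s+signal=(\w+))?", status_text)
    if m:
        summary["code"] = m.group(1)
        summary["signal"] = m.group(2)

    return summary


def hit_start_limit(summary):
    """True if systemd stopped restarting the unit because it failed too quickly."""
    return summary["active"] == "failed" and summary["result"] == "start-limit"
```

When hit_start_limit is true, the signal field (ABRT here) is the more useful clue, since it says the daemon aborted on its own rather than being stopped by systemd.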
Update 1530:
The frame-writer error was tracked down to a C0EDCU issue. Jon told me that the Hornet CC1 pressure gauge channel was renamed to C1:Vac-CC1_pressure, and I made the change in the C0EDCU file. However, it returns a value of 9990000000.0, which the frame writer is not happy about... Keeping the old channel name makes the frame-writer run again (although the actual data is bunk).
Update 1755:
J Hanks suggested adding a 1 second offset to the daqdrc config files. This has now fixed the 0x4000 errors, and we are back to the "nominal" RTCDS status screen now - Attachment #2.