40m Log
Entry  Wed Jul 19 08:37:21 2017, Jamie, Update, CDS, Update on front-end/DAQ rebuild  
    Reply  Wed Jul 19 14:26:50 2017, Jamie, Update, CDS, Update on front-end/DAQ rebuild  
    Reply  Fri Jul 21 18:03:17 2017, Jamie, Update, CDS, Update on front-end/DAQ rebuild  
       Reply  Sun Jul 23 22:16:55 2017, Jamie, gautam, Update, CDS, front-end now running with new OS, RCG
          Reply  Mon Jul 24 10:45:23 2017, gautam, Update, CDS, c1iscex models died
             Reply  Mon Jul 24 10:59:08 2017, Jamie, Update, CDS, c1iscex models died 
          Reply  Mon Jul 24 19:28:55 2017, Jamie, Update, CDS, front end MX stream network working, glitches in c1ioo fixed
             Reply  Mon Jul 24 19:57:54 2017, gautam, Update, CDS, IMC locked, Autolocker re-enabled 
             Reply  Wed Jul 26 19:13:07 2017, Jamie, Update, CDS, daqd showing same instability as before 
                Reply  Fri Jul 28 20:22:41 2017, Jamie, Update, CDS, possible stable daqd configuration with separate DC and FW 
                   Reply  Mon Jul 31 15:13:24 2017, gautam, Update, CDS, FB ---> FB1 
                    Reply  Mon Jul 31 18:44:40 2017, Jamie, Update, CDS, CDS system essentially fully recovered
                      Reply  Thu Aug 3 19:46:27 2017, Jamie, Update, CDS, new daqd restart procedure 
                      Reply  Fri Aug 4 09:07:28 2017, rana, Update, CDS, CDS system essentially NOT fully recovered 
                         Reply  Thu Aug 10 14:25:52 2017, gautam, Update, CDS, Slow EPICS channels -> Frames re-enabled 
                            Reply  Fri Aug 11 00:10:03 2017, gautam, Update, CDS, Slow EPICS channels -> Frames re-enabled 
                               Reply  Fri Aug 11 11:14:24 2017, gautam, Update, CDS, Slow EPICS channels -> Frames re-enabled 
                               Reply  Fri Aug 11 18:53:35 2017, gautam, Update, CDS, Slow EPICS channels -> Frames re-enabled 
                      Reply  Fri Aug 11 19:34:49 2017, Jamie, Update, CDS, CDS final bits status update 
Message ID: 13198     Entry time: Fri Aug 11 19:34:49 2017     In reply to: 13153
Author: Jamie 
Type: Update 
Category: CDS 
Subject: CDS final bits status update 

So it appears we now have full frames and second, minute, and minute_raw trends.

We are still not able to raise test points with daqd_rcv (i.e. the NDS1 server), which is why dataviewer and nds2-client can't get test points on their own.

We were not able to add the EDCU (EPICS client) channels without daqd_fw crashing.

We have a new kernel image that's supposed to solve the module unload instability issue.  In order to try it we'll need to restart the entire system, though, so I'll do that on Monday morning.

I've got the CDS guys investigating the test point and EDCU issues, but we won't get any action on that until next week.


Remaining unresolved issues:

  • IFO needs to be fully locked to make sure ALL components of all models are working.
  • The remaining red status lights are from the "FB NET" diagnostics.  They reflect a missing status bit from the front end processes, which were compiled with an earlier RCG version (3.0.3) than the mx_streams (3.3+/trunk).  There will be a new release of the RTS soon, at which point we'll compile everything from the same version, which should get us all green again.
  • The entire system has been fully modernized, to the target CDS reference OS (Debian jessie) and more recent RCG versions.  The management of the various RTS components, both on the front ends and on fb, has as far as possible been updated to use modern management tools (e.g. systemd, udev).  These changes need to be documented.  In particular...
  • The fb daqd process has been split into three separate components, a configuration that mirrors what is done at the sites and appears to be more stable:
    • daqd_dc: data concentrator (receives data from front ends)
    • daqd_fw: receives frames from dc and writes out full frames and second/minute trends
    • daqd_rcv: NDS1 server (raises test points and receives archive data from frames from 'nds' process)
    The "target" directory for all of these new components is:
    • /opt/rtcds/caltech/c1/target/daqd
    All of these processes are now managed under systemd supervision on fb, meaning the daqd restart procedure has changed.  This needs to be simplified and clarified.
  • Second trend frames are being written, but for some reason they're not accessible over NDS.
  • Have not had a chance to verify minute trend and raw minute trend writing yet.  Needs to be confirmed.
  • Get wiper script working on new fb.
  • The front end RTS kernel will occasionally crash when the RTS modules are unloaded.  Keith Thorne apparently has a kernel version with a different set of patches from Gerrit Kuhn that does not have this problem.  Keith's kernel needs to be packaged and installed in the front end diskless root.
  • The models accessing the dolphin shared memory will ALL crash when one of the front end hosts on the dolphin network goes away.  This results in a boot fest of all the dolphin-enabled hosts.  Need to figure out what's going on there.
  • The RCG settings snapshotting has changed significantly in later RCG versions.  We need to make sure that all burt backup type stuff is still working correctly.
  • Restoration of /frames from old fb SCSI RAID?
  • Backup of entirety of fb1, including fb1 root (/) and front end diskless root (/diskless)
  • Full documentation of rebuild procedure from Jamie's notes.
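
As a starting point for documenting the new daqd restart procedure, here is a minimal sketch of restarting the three daqd components under systemd.  The component names (daqd_dc, daqd_fw, daqd_rcv) are from the list above; the ".service" unit names and the restart order (data concentrator first, since daqd_fw receives its frames from it) are assumptions and should be checked against the actual unit files on fb before use.

```python
import subprocess

# Component names are taken from the entry above.  The ".service" unit
# names and the restart order are ASSUMPTIONS, not the confirmed setup on fb.
DAQD_UNITS = ["daqd_dc.service", "daqd_fw.service", "daqd_rcv.service"]


def restart_commands(units=DAQD_UNITS):
    """Return one `systemctl restart` command per unit, in list order."""
    return [["systemctl", "restart", unit] for unit in units]


def restart_daqd(dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in restart_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first unit that fails to restart.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is restarted by accident.
    restart_daqd(dry_run=True)
```

The dry-run default only prints the commands, which makes it easy to review the sequence before running it for real with restart_daqd(dry_run=False) on fb.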