Message ID: 11394     Entry time: Tue Jul 7 23:26:19 2015
Author: Koji 
Type: Update 
Category: CDS 
Subject: Attempt to list CDS issues 

Now that Jamie has succeeded in getting the 40m CDS into a somewhat workable condition, I tried to list the obvious CDS issues so that we can attack them one by one.

  1. c1cal is constantly timing out now (t > 60 usec). c1sus is close to the limit (t = 56~57 usec).
  2. We should check the trends of the CPU meters ("C1:FEC-**_CPU_METER"). In fact this should be listed in the summary pages in a new CDS tab.
  3. Probably this is related to 1): c1lsc is constantly showing an IPC error (bit0 = shmem). C1LSC_IPC_STATUS.adl says that this comes from the IPC between c1lsc and c1cal ("C1:CAL-LSC_SENSMAT_OSC_****"). This information is found by opening the C1LSC_GDS_TP.adl screen and clicking the RT NET STAT button next to the IPC error status.
  4. We don't know yet whether the RFM access was sped up or slowed down by this upgrade.
  5. We need tests to see if the time delays of the models/IPCs are still reasonable.
  6. The LSC Overview screen has a small digest of the CDS status. Now there are many white boxes that correspond to the channels "C1:FEC-**_DIAG1".
  7. All realtime systems have default (0 or 1) EPICS channel values (i.e. gains, FM switches, matrices, etc). Need burtrestores.
  8. I tried to burtrestore the models but burtgooey indicated there are some errors.
  9. Detailed check of the snapshot files, comparing those in /opt/rtcds/caltech/c1/burt/autoburt/snapshots/2015/Jul/7/19:07 and /opt/rtcds/caltech/c1/burt/autoburt/snapshots/2015/Jun/1/19:07 :
    • c1alsepics has a bunch of volatile channels in the snapshot, while all of the static EPICS channels seem to be missing from the snapshot file. Is this related to the current omission of the slow data acquisition? => No, actually this must be due to the modification of the ALS model to accommodate the ALS in the LSC model for the new ALS setup.
    • c1lscepics was checked: the slow channels were indeed properly snapshotted. So what was the problem in burting???
    • I made a simple csh script to restore the snapshots one by one while collecting the error messages.
      This script is located at /users/koji/150707/burtrevert.sh
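    • #!/bin/csh
      # Restore every *epics.snap file found in the directory given as the first argument.
      if ($#argv < 1) then
          echo "usage: $0 <snapshot directory>"
          exit 1
      endif

      echo 'This script restores all of the snapshot files found in' $argv[1] '.'
      echo 'Are you sure? y/n'

      # Read one line from the terminal and lower-case it.
      set ans = $<
      set ANS = `echo "$ans" | tr "[:upper:]" "[:lower:]"`

      # Quote both sides so that an empty answer does not break the expression.
      if ("$ANS" == "y") then
          foreach fname ($argv[1]/*epics.snap)
              echo ''
              echo '#################################'
              echo $fname
              echo '#################################'

              # Write the snapshot values back to the EPICS channels.
              burtwb -f $fname
          end
      else
          echo "exiting..."
      endif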
       
    • Then I ran the command
      ./burtrevert.sh /opt/rtcds/caltech/c1/burt/autoburt/snapshots/2015/Jun/1/19:07 &>burt.log
      This lists the missing channels. The zipped log is attached to this entry.
    • Burting the old snapshot always crashes the RT process "c1sus" (not the c1sus host). If I use the snapshot newly generated today,
      the process does not crash. When it crashes, the process halts at a cycle time of 74us (>60us). I left the process crashed so that we can take a new snapshot with the matrix numbers filled. Once we have the correct snapshot, we don't need to worry about this crash. Let's see.
    • c1sus still crashes with the new burt file. There must be a trigger that freezes the model. We need to split the burt file into pieces
      to figure out which line causes the halt.
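
A minimal sketch of how the burt file could be split into pieces for this search. This is not an existing tool. It assumes that the snapshot file starts with a header closed by a line beginning with "--- End BURT header", followed by one channel per line. The function names, the _partNN.snap naming, and the channel names in the usage note are hypothetical.

```python
from pathlib import Path

# Assumed marker that closes the header of a snapshot file.
HEADER_END = "--- End BURT header"


def split_snapshot(text, n_pieces):
    """Split snapshot text into at most n_pieces texts, each repeating the header."""
    lines = text.splitlines()
    # Index of the header-end line; -1 means "no header found", so header is empty.
    end = next((i for i, line in enumerate(lines) if line.startswith(HEADER_END)), -1)
    header = lines[: end + 1]
    body = [line for line in lines[end + 1:] if line.strip()]
    # Ceiling division so that every channel line lands in exactly one piece.
    size = max(1, -(-len(body) // n_pieces))
    chunks = [body[i:i + size] for i in range(0, len(body), size)]
    return ["\n".join(header + chunk) + "\n" for chunk in chunks]


def write_pieces(snap_path, out_dir, n_pieces):
    """Write the pieces as <stem>_partNN.snap in out_dir and return their paths."""
    snap_path, out_dir = Path(snap_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for k, piece in enumerate(split_snapshot(snap_path.read_text(), n_pieces)):
        path = out_dir / f"{snap_path.stem}_part{k:02d}.snap"
        path.write_text(piece)
        paths.append(path)
    return paths
```

Each piece could then be restored on its own with burtwb -f, as in the script above, while watching the c1sus cycle time. The piece that halts the model can be split again until a single line remains.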
Attachment 1: burt.log.zip  7 kB  Uploaded Wed Jul 8 01:57:45 2015