40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Message ID: 15370     Entry time: Wed Jun 3 11:20:19 2020
Author: gautam 
Type: Update 
Category: DetChar 
Subject: Summary pages 

Summary:

The 40m summary pages have been revived. I've not had to make any manual interventions in the last 5 days, so things seem somewhat stable, but I'm sure there will need to be multiple tweaks made. The primary use of the pages right now are for vacuum, seismic and PSL diagnostics.

Resources:

Caveats:

  • Intermittent failures of cron jobs
    • The status page relies on the condor_q command executing successfully on the cluster end. I have seen this fail a few times, so the status page may say the pages are dead whereas they're actually still running.
    • Similarly, the rsync of the pages to nodus where they're hosted can sometimes fail.
    • Usually, these problems are fixed on the next iteration of the respective cron jobs, so check back in ~half hour.
  • I haven't really looked into it in detail, but I think our existing C1:IFO-STATE word format is not compatible with what gwsumm wants - I think it expects single bits that are either 0 or 1 to indicate a particular state (e.g. MC locked, POX and POY locked etc). So if we want to take advantage of that infrastructure, we may need to add a few soft EPICS channels that take care of some logic checking (several such bits could also be and-ed together) and then assume either 0 or 1 value. Then we can have the nice duty cycle plots for the IMC (for example).
  • I commented out the obsolete channels (e.g. PEM MIC channels). We can add them back later if we so desire.
  • For some reason, the jobs that download the data are pretty memory-heavy: I have to request for machines on the cluster with >100 GB (yes 💯GB) ! of memory for the jobs to execute to completion. The frame-data certainly isn't so large, so I wonder what's going on here - is GWPy/GWsumm so heavy? The site summary pages run on a dedicated cluster, so probably the code isn't built for efficiency...
  • Weather tab in PEM is still in there but none of those channels mean anything right now.
  • The MEDM screenshot stuff is commented out for now too. This should be redone in a better way with some modern screen grab utilities, I'm sure there are plenty of python based ones.
  • There seems to be a problem with the condor .dag lockfile / rescue file not being cleared correctly sometimes - I am looking into this.
ELOG V3.1.3-