40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 131 of 341  Not logged in ELOG logo
ID Date Authorup Type Category Subject
  8055   Mon Feb 11 13:07:17 2013 Max HortonUpdateSummary PagesFixed A Calendar Bug

Understanding the Code:  Documented more functions in summary_pages.py.  Since it is very difficult and slow to understand what is going on, it might be best to just start trying to factor out the code into multiple files, and understand how the code works from there.

Crontab:  Started learning how the program is called by cron / what cron is, so that I can fix the problem that forces data to only be displayed up until 6PM.

Calendars:  One of the problems with the page is that the calendars on the left column didn't have any of the months of 2013 in them.

I identified the incorrect block of code as:

Original Code:
  # loop over months
  while t < e:
     if t.month < startday.month or t >= endday:
      ptable[t.year].append(calendar_link(t, firstweekday, tab=tab, run=run))

    # increment by month
    # Move forward day by day, until a new month is reached.
    m = t.month
    while t.month == m:
      t = t + d

    # Ensure that y still represents the current year.
    if t.year > y:
      y = t.year
      ptable[y] = []

The problem is that the months between the startday and endday aren't being treated properly.

Modified Code:
  # loop over months
  while t < e:
    if (t.month < startday.month and t.year <= startday.year) or t >= endday:
      ptable[t.year].append(calendar_link(t, firstweekday, tab=tab, run=run))

    # increment by month
    # Move forward day by day, until a new month is reached.
    m = t.month
    while t.month == m:
      t = t + d

    # Ensure that y still represents the current year.
    if t.year > y:
      y = t.year
      ptable[y] = []

After this change, the calendars display the year of 2013, as desired.

  8098   Mon Feb 18 11:54:15 2013 Max HortonUpdateSummary PagesTiming Issues and Calendars

Crontab: The bug of data only plotting until 5PM is being investigated.  The crontab's final call to the summary page generator was at 5PM.  This means that the data plots were not being generated after 5PM, so clearly they never contained data from after 5PM.  In fact, the original crontab reads:

0 11,5,11,17 * * * /users/public_html/40m-summary/bin/c1_summary_page.sh 2>&1

I'm not exactly sure what inspired these entries.  The 11,5,11,17 entries are supposed to be the hours at which the program is run.  Why is it run twice at 11?  I assume it was just a typo or something.

The final call time was changed to 11:59PM in an attempt to plot the entire day's data, but this method didn't appear to work because the program would still be running past midnight, which was apparently inhibiting its functionality (most likely, the day change was affecting how the data is fetched).  The best solution is probably to just wait until the next day, then call the summary page generator on the previous day's data.  This will be implemented soon.

Calendars: Although the calendar tabs on the left side of the page were fixed, the calendars displayed at: https://nodus.ligo.caltech.edu:30889/40m-summary/calendar/ appear to still have squished together text.  The calendar is being fetched from https://nodus.ligo.caltech.edu:30889/40m-summary/calendar/calendar.html and displayed in the page.  This error is peculiar because the URL from which the calendar is being fetched does NOT have squished together text, but the resulting calendar at 40m-summary/calendar/ will not display spaces between the text.  This issue is still being investigated.

  8148   Sat Feb 23 16:16:11 2013 Max HortonUpdateSummary PagesMultiprocessing

Calendars:  The calendar issue discussed previously (http://nodus.ligo.caltech.edu:8080/40m/8098) where the numbers are squished together is very difficult for me to find.  I am not going to worry about it for the time being.

Multiprocessing:   Reviewed the implementation of Multiprocessing in python (using Multiprocessing package).  Wrote a simple test function and ran it on megatron, to verify that multiprocessing could successfully take advantage of megatron’s multiple cores – it could.  Now, I will work on implementing multiprocessing in the program.  I began testing at a section in the program where a for loop calls process_data() (which has a long runtime) multiple times.  The megatron terminals I had open began to run very slowly.  Why?  I believe that the process_data() function loads data into global variables to accomplish its task.  The global variables in the original implementation were cleared before the subsequent calls to process_data().  But in the multiprocessing version, the data is not cleared, meaning the memory fills quickly, which drastically reduces performance.  In the short term, I could try generating fewer processes at a time, wait for them to finish, then clearing the data, then generating more processes, etc.  This will probably generate a nominal performance boost.  In the long-term, restructuring of the way the program handles data may help (but not for sure).  In the coming week I will experiment with these techniques and try to decrease the run time of the program.

  8194   Wed Feb 27 22:46:53 2013 Max HortonUpdateSummary PagesMultiprocessing Implementation

Overview: In order to make the code more maintainable, I need to factor it into different well-documented classes.  To do this carefully and rigorously, I need to run tests every time I make changes to the code.  The runtime of the code is currently quite high, so I will work on improving the runtime of the program before factoring it into classes.  This will be more efficient (minimize testing time) and allow me to factor more quickly.  So, my current goal is to improve runtime as much as possible.

Multiprocessing Implementation:

I invented a simple way to implement multiprocessing in the summary_pages.py file.  Here is an example: in the code, there is a process_data() function, which is run 75 times and takes rather long to run.  I created multiple processes to run these calls concurrently, as follows:

Original Code: (around line 7840)

for sec in datasections:
      for run in run_opts:
        run_opt = 'run_%s_time' % run
        if hasattr(tabs[sec], run_opt) and getattr(tabs[sec], run_opt):

          process_data(cp, ifo, start, end, tabs[sec],\
                       cache=datacache, segcache=segcache, run=run,\
                       veto_def_table=veto_table[run], plots=do['plots'],\
                       subplots=do['subplots'], html_only=do['html_only'])

  # free data memory
          keys = globvar.data.keys()
          for ch in keys:
            del globvar.data[ch]


The weakness in this code is that process_data() is called many times, and doesn't take advantage of megatron's multiple threads.  I changed the code to:

Modified Code: (around line 7840)
  import multiprocessing

  if do['dataplot']:
  ... etc... (same as before)       
    if hasattr(tabs[sec], run_opt) and getattr(tabs[sec], run_opt):

          # Create the process
          p = multiprocessing.Process(target=process_data, args=(cp, ifo, start, end, tabs[sec], datacache, segcache, run, veto_table[run], do['plots'], do['subplots'], do['html_only']))
          # Add the process to the list of processes
          plist += [p]


Then, I run the process in groups of size "numconcur", as follows:
    numconcur = 8
    curlist = []
    for i in range(len(plist)):
      curlist += [plist[i]]
      if (i % numconcur == (numconcur - 1)):
        for item in curlist:
        for item in curlist:
        keys = globvar.data.keys()
        for ch in keys:
          del globvar.data[ch]
        curlist = []


The value of numconcur (which defines how many threads megatron will use concurrently to run the program) greatly effects the speed of the program!  With numconcur = 8, the program runs in ~45% of the time of the original code!  This is the optimal value -- megatron has 8 threads.  Several other values were tested - numconcur = 4 and numconcur = 6 had almost the same performance as numconcur = 8, but numconcur = 1 (which is essentially the same as the unmodified code) has a much worse performance.

Improvement Cap:

Why does numcores = 4 have almost the same performance as numcores = 8?  I monitored the available memory of megatron, and it is quickly consumed during these runs.  I believe that once 4 or more cores are being used, the fact that the data can't all fit in megatron's memory (which was entirely filled during these trials) counteracts the usefulness of additional threads.

Summary of Improvements:

Original Runtime of all process_data() statements: (approximate): 8400 sec

Runtime with 8 processes (approximate): 3842 sec

This is about a 55% improvement for speed, in this particular sector (not in the overall performance of the entire program).  It saves about 4600 seconds (~1.3 hours) per run of the program.  Note that these values are approximate (since other processes are running on megatron during my tests.  They might be inflated or deflated by some margin of error).


Next Time:

This same optimization method will be applied to all repetitive processes with reasonably large runtimes.

  8201   Thu Feb 28 14:19:20 2013 Max HortonUpdateSummary PagesMultiprocessing Implementation

Okay, more memory would definitely be good.  I don't think I have been using MEDM (which Jamie tells me is the controls interface) so making a cronjob would probably be a good idea.

  8218   Mon Mar 4 10:41:18 2013 Max HortonUpdateSummary PagesMultiprocessing Implementation


Upon investigation, the other methods besides process_data() take almost no time at all to run, by comparison.  The process_data() method takes roughly 2521 seconds to run using Multiprocessing with eight threads.  After its execution, the rest of the program only takes 120 seconds to run.  So, since I still need to restructure the code, I won't bother adding multiprocessing to these other methods yet, since it won't significantly improve speed (and any improvements might not be compatible with how I restructure the code).  For now, the code is just about as efficient as it can be (in terms of multiprocessing).  Further improvements may or may not be gained when the code is restructured.

  8227   Mon Mar 4 21:05:49 2013 Max HortonUpdateSummary PagesMultiprocessing and Crontab

Multiprocessing:  In its current form, the code uses multiprocessing to the maximal extent possible.  It takes roughly 2600 seconds to run (times may vary depending on what else megatron is running, etc.).  Multiprocessing is only used on the process_data() function calls, because this by far takes the longest.  The other function calls after the process_data() calls take a combined ~120 seconds.  See http://nodus.ligo.caltech.edu:8080/40m/8218 for details on the use of Multiprocessing to call process_data().

Crontab:  I also updated the crontab in an attempt to fix the problem where data is only displayed until 5PM.  Recall that previously (http://nodus.ligo.caltech.edu:8080/40m/8098) I found that the crontab wasn't even calling the summary_pages.py script after 5PM.  I changed it then to be called at 11:59PM, which also didn't work because of the day change after midnight.

I decided it would be easiest to just call the function on the previous day's data at 12:01AM the next day.  So, I changed the crontab.

Previous Crontab:

59 5,11,17,23 * * * /users/public_html/40m-summary/bin/c1_summary_page.sh 2>&1

New Crontab:

0 6,12,18 * * * /users/public_html/40m-summary/bin/c1_summary_page.sh 2>&1
1 0 * * * /users/public_html/40m-summary/bin/c1_summary_page.sh $(date "+%Y/%m/%d" --date="1 days ago") 2>&1

For some reason, as of 9:00PM today (March 4, 2013) I still don't see any data up, even though the change to the crontab was made on February 28.  Even more bizarre is the fact that data is present for March 1-3.  Perhaps some error was introduced into the code somehow, or I don't understand how crontab does its job.  I will look into this now.


Once I fix the above problem, I will begin refactoring the code into different well-documented classes.

  8266   Mon Mar 11 10:20:36 2013 Max HortonSummaryComputersAttempted Smart UPS 2200 Battery Replacement

Attempted Battery Replacement on Backup Power Supply in the Control Room:

I tried to replace the batteries in the Smart UPS 2200 with new batteries purchased by Steve.  However, the power port wasn't compatible with the batteries.  The battery cable's plug was too tall to fit properly into the Smart UPS port.  New batteries must be acquired.  Steve has pictures of the original battery (gray) and the new battery (blue) plugs, which look quite different (even though the company said the battery would fit).

The Correct battery connector is GRAY : APC RBC55

Attachment 1: upsB.jpg
Attachment 2: upsBa.jpg
  8273   Mon Mar 11 22:28:30 2013 Max HortonUpdateSummary PagesFixing Plot Limits

Quick Note on Multiprocessing:  The multiprocessing was plugged into the codebase on March 4. Since then, the various pages that appear when you click on certain tabs (such as the page found here: https://nodus.ligo.caltech.edu:30889/40m-summary/archive_daily/20130304/ifo/dc_mon/ from clicking the 'IFO' tab) don't display graphs.  But, the graphs are being generated (if you click here or here, you will find the two graphs that are supposed to be displayed).  So, for some reason, the multiprocessing is preventing these graphs from appearing, even though they are being generated.  I rolled back the multiprocessing changes temporarily, so that the newly generated pages look correct until I find the cause of this.

Fixing Plot Limits:  The plots generated by the summary_pages.py script have a few problems, one of which is: the graphs don't choose their boundaries in a very useful way.  For example, in these pressure plots, the dropout 0 values 'ruin' the graph in the sense that they cause the plot to be scaled from 0 to 760, instead of a more useful range like 740 to 760 (which would allow us to see details better).

The call to the plotting functions begins in process_data() of summary_pages.py, around line 972, with a call to plot_data().  This function takes in a data list (which represents the x-y data values, as well as a few other fields such as axes labels).  The easiest way to fix the plots would be to "cleanse" the data list before calling plot_data().  In doing so, we would remove dropout values and obtain a more meaningful plot.

To observe the data list that is passed to plot_data(), I added the following code:

      # outfile is a string that represents the name of the .png file that will be generated by the code.
      print_verbose("Saving data into a file.")
      outfile_mch = open(outfile + '.dat', 'w')

      # at this point in process_data(), data is an array that should contain the desired data values.
      if (data == []):
          print_verbose("Empty data!")
      print >> outfile_mch, data

When I ran this in the code midday, it gave a human-readable array of values that appeared to match the plots of pressure (i.e. values between 740 and 760, with a few dropout 0 values).  However, when I let the code run overnight, instead of observing a nice list in 'outfile.dat', I observed:

[('Pressure', array([  1.04667840e+09,   1.04667846e+09,   1.04667852e+09, ...,
         1.04674284e+09,   1.04674290e+09,   1.04674296e+09]), masked_array(data = [ 744.11076965  744.14254761  744.14889221 ...,  742.01931356  742.05930208
             mask = False,
       fill_value = 1e+20)

I.e. there was an ellipsis (...) instead of actual data, for some reason.  Python does this when printing lists in a few specific situations.  The most common of which is that the list is recursively defined.  For example:

a = [5]
print a

[5, [...]]

It doesn't seem possible that the definitions for the data array become recursive (especially since the test worked midday).  Perhaps the list becomes too long, and python doesn't want to print it all because of some setting.

Instead, I will use cPickle to save the data.  The disadvantage is that the output is not human readable.  But cPickle is very simple to use.  I added the lines:

      import cPickle
      cPickle.dump(data, open(outfile + 'pickle.dat', 'w'))

This should save the 'data' array into a file, from which it can be later retrieved by cPickle.load().

There are other modules I can use that will produce human-readable output, but I'll stick with cPickle for now since it's well supported.  Once I verify this works, I will be able to do two things:
1) Cut out the dropout data values to make better plots.
2) When the process_data() function is run in its current form, it reprocesses all the data every time.  Instead, I will be able to draw the existing data out of the cPickle file I create.  So, I can load the existing data, and only add new values.  This will help the program run faster.

  8286   Wed Mar 13 15:30:37 2013 Max HortonUpdateSummary PagesFixing Plot Limits

Jamie has informed me of numpy's numpy.savetxt() method, which is exactly what I want for this situation (human-readable text storage of an array).  So, I will now be using:

      # outfile is the name of the .png graph.  data is the array with our desired data.
      numpy.savetxt(outfile + '.dat', data)

to save the data.  I can later retrieve it with numpy.loadtxt()

  8414   Thu Apr 4 13:39:12 2013 Max HortonUpdateSummary PagesGraph Limits

Graph Limits: The limits on graphs have been problematic.  They often reflect too large of a range of values, usually because of dropouts in data collection.  Thus, they do not provide useful information because the important information is washed out by the large limits on the graph.  For example, the graph below shows data over an unnecessarily large range, because of the dropout in the 300-1000Hz pressure values.

Time series data from frames

The limits on the graphs can be modified using the config file found in /40m-summary/share/c1_summary_page.ini.  At the entry for the appropriate graph, change the amplitude-lim=y1,y2 line by setting y1 to the desired lower limit and y2 to the desired upper limit.  For example, I changed the amplitude limits on the above graph to amplitude-lim=.001,1, and achieved the following graph.

Time series data from frames

The limits could be tightened further to improve clarity - this is easily done by modifying the config file.  I modified the config file for all the 2D plots to improve the bounds.  However, on some plots, I wasn't sure what bounds were appropriate or what range of values we were interested in, so I will have to ask someone to find out.

Next:  I now want to fix all the funny little problems with the site, such as scroll bars appearing where they should not appear, and graphs only plotting until 6PM.  In order to do this most effectively, I need to restructure the code and factor it into several files.  Otherwise, the code will not only be much harder to edit, but will become more and more confusing as I add on to it, compounding the problems that we currently have (i.e. that this code isn't very well documented and nobody knows how it works).  We need lots of specific documentation on what exactly is happening before too many changes are made.  Take the config files, for example.  Someone put a lot of work into them, but we need a README specifying which options are supported for which types of graphs, etc.  So we are slowed down because I have to figure out what is going on before I make small changes.

To fix this, I will divide the code into three main sectors.  The division of labor will be:
- Sector 1: Figure out what the user wants (i.e. read config files, create a ConfigParser, etc...)
- Sector 2: Process the data and generate the plots based on what the user wants
- Sector 3: Generate the HTML

  8476   Tue Apr 23 15:02:19 2013 Max HortonUpdateSummary PagesImporting New Code

Duncan Macleod (original author of summary pages) has an updated version that I would like to import and work on.  The code and installation instructions are found below.

I am not sure where we want to host this.  I could put it in a new folder in /users/public_html/  on megatron, for example.  Duncan appears to have just included the summary page code in the pylal repository.  Should I reimport the whole repository?  I'm not sure if this will mess up other things on megatron that use pylal.  I am working on talking to Rana and Jamie to see what is best.

  8496   Fri Apr 26 15:50:48 2013 Max HortonUpdateSummary PagesImporting New Code


I am following the instructions here:


But there as an error when I run the ./00boot command near the beginning.  I have asked Duncan Macleod about this and am waiting to hear back.

For now, I am putting things into /home/controls on allegra.  My understanding is that this is not shared, so I don't have a chance of messing up anyone else's work.  I have been moving slow and being extra cautious about what I do because I don't want to accidentally nuke anything.

  8504   Mon Apr 29 15:35:31 2013 Max HortonUpdateSummary PagesImporting New Code


 I installed the new version of LAL on allegra.  I don't think it has interfered with the existing version, but if anyone has problems, let me know.  The old version on allegra was 6.9.1, but the new code uses  To use it, add . /opt/lscsoft/lal/etc/lal-user-ench.sh to the end of the .bashrc file (this is the simplest way, since it will automatically pull the new version).

I am having a little trouble getting some other unmet dependencies for the summary pages such as the new lalframe, etc.  But I am working on it.

Once I get it working on allegra and know that I can get it without messing up current versions of lal, I will do this again on megatron so I can test and edit the new version of the summary pages.

  8523   Thu May 2 14:14:10 2013 Max HortonUpdateSummary PagesImporting New Code


 LALFrame was successfully installed.  Allegra had unmet dependencies with some of the library tools.  I tried to install LALMetaIO, but there were unmet dependencies with other LSC software.  After updating the LSC software, the problem has persisted.  I will try some more, and ask Duncan if I'm not successful.

Installing these packages is rather time consuming, it would be nice if there was a way to do it all at once.

  8536   Tue May 7 15:09:38 2013 Max HortonUpdateSummary PagesImporting New Code


 I am now working on megatron, installing in /home/controls/lal.  I am having some unmet dependency issues that I have asked Duncan about.

  8572   Tue May 14 16:14:47 2013 Max HortonUpdateSummary PagesImporting New Code

I have figured out all the issues, and successfully installed the new versions of the LAL software.  I am now going to get the summary pages set up using the new code.

  8604   Tue May 21 14:50:52 2013 Max HortonUpdate Importing New Code

There was an issue with running the new summary pages, because laldetchar was not included (the website I used for instructions doesn't mention that it is needed for the summary pages).  I figured out how to include it with help from Duncan.  There appear to be other needed dependencies, though.  I have emailed Duncan to ask how these are imported into the code base.  I am making a list of all the packages / dependencies that I needed that weren't included on the website, so this will be easier if/when it has to be done again.

  8678   Wed Jun 5 14:39:41 2013 Max HortonUpdate Importing New Code

Most dependencies are met.  The next issue is that matplotlib.basemap is not installed, because it is not available for our version of python.  We need to update python on megatron to fix this.

  8686   Thu Jun 6 15:46:10 2013 Max HortonSummaryGeneral Smart UPS 2200 Batteries Replaced

Replaced the batteries successfully in the control room.  We just had to switch the clips from the old batteries to the new one, which we didn't know was possible until now.

  11375   Thu Jun 25 12:03:42 2015 Max IsiUpdateGeneralSummary page status

The summary pages have been down due to incompatibilities with a software update and problems with the LDAS cluster. I'm working at the moment to fix the former and the LDAS admins are looking into the latter. Overall, we can expect the pages will be fully functional again by Monday.

  11376   Thu Jun 25 14:18:46 2015 Max IsiUpdateGeneralSummary page status

The pages are live again. Please allow some time for the system to catch up and process missed days. If there are any further issues, please let me know.
URL reminder: https://nodus.ligo.caltech.edu:30889/detcharsummary/


The summary pages have been down due to incompatibilities with a software update and problems with the LDAS cluster. I'm working at the moment to fix the former and the LDAS admins are looking into the latter. Overall, we can expect the pages will be fully functional again by Monday.


  11382   Mon Jun 29 17:40:56 2015 Max IsiUpdateGeneralSummary pages "Code status" page fixed

It was brought to my attention that the "Code status" page (https://nodus.ligo.caltech.edu:30889/detcharsummary/status.html) had been stuck showing "Unknown status" for a while.
This was due to a sync error with LDAS and has now been fixed. Let me know if the issue returns.

  11401   Fri Jul 10 17:57:38 2015 Max IsiUpdateGeneralSummary pages down

The summary pages are currently unstable due to priority issues on the cluster*. The plots had been empty ever since the CDS updated started anyway. This issue will (presubmably) disappear once the jobs are moved to the new 40m shared LDAS account by the end of next week.

*namely, the jobs are put on hold (rather, status "idle") because we have low priority in the processing queue, making the usual 30min latency impossible.

  11431   Mon Jul 20 16:45:15 2015 Max IsiConfigurationGeneralSummary page c1sus.ini error corrected

Bad syntax errors in the c1sus.ini config file were causing the summary pages to crash: a plot type had not been indicated for plots 5 and 6, so I've made these "timeseries."
In the future, please remember to always specify a plot type, e.g.:




       C1:SUS-ITMY_SUSPIT_INMON.mean timeseries

By the way, the pages will continue to be unavailable while I transfer them to the new shared account.

  11433   Tue Jul 21 21:25:18 2015 Max IsiUpdateGeneral40m LDAS account

A shared LIGO Data Grid (LDG) account was created for use by the 40m lab. The purpose of this account is to provide access to the LSC computer cluster resources for 40m-specific projects that may benefit from increased computational power and are not linked to any user in particular (e.g. the summary pages).

For further information, please see https://wiki-40m.ligo.caltech.edu/40mLDASaccount

  11434   Tue Jul 21 21:33:22 2015 Max IsiUpdateGeneralSummary pages moved to 40m LDAS account

The summary pages are now generated from the new 40m LDAS account. The nodus URL (https://nodus.ligo.caltech.edu:30889/detcharsummary/) is the same and there are no changes to the way the configuration files work. However, the location on LDAS has changed to https://ldas-jobs.ligo.caltech.edu/~40m/summary/ and the config files are no longer version-controlled on the LDAS side (this was redundant, as they are under VCS in nodus).

I have posted a more detailed description of the summary page workflow, as well as instructions to run your own jobs and other technical minutiae, on the wiki: https://wiki-40m.ligo.caltech.edu/DailySummaryHelp

  11444   Fri Jul 24 18:12:52 2015 Max IsiUpdateGeneralData missing

For the past couple of days, the summary pages have shown minute trend data disappear at 12:00 UTC (05:00 AM local time). This seems to be the case for all channels that we plot, see e.g. https://nodus.ligo.caltech.edu:30889/detcharsummary/day/20150724/ioo/. Using Dataviewer, Koji has checked that indeed the frames seem to have disappeared from disk. The data come back at 24 UTC (5pm local). Any ideas why this might be?

  11788   Thu Nov 19 14:50:34 2015 Max IsiUpdateGeneralNew 2D histogram plot for summary pages

A new type of plot is now available for use in the summary pages, based on EricQ's 2D histogram plots (elog 11210). I have added an example of this to the SandBox tab (https://nodus.ligo.caltech.edu:30889/detcharsummary/day/20151119/sandbox/). The usage is straighforwad: the name to be used in config files is histogram2d; the first channel corresponds to the x-axis and the second one to the y-axis; the options accepted are the same as numpy.histogram2d and pyploy.pcolormesh (besides plot limits, titles, etc.). The default colormap is inferno_r and the shading is flat.

Attachment 1: C1-ALL_AB3834_HISTOGRAM2D-1131926417-86400.png
  11884   Tue Dec 15 18:08:22 2015 Max IsiUpdateGeneralSummary archive cleaning cron job

I have added a new cron job in pcdev1 at CIT using the 40m shared account. This will run the /home/40m/DetectorChar/bin/cleanarchive script one minute past midnight on the first of every month. The script removes GWsumm archive files older than 1 month old.

  12135   Wed May 25 14:21:29 2016 Max IsiUpdateGeneralSummary page configuration

I have modified the c1summary.ini and c1lsc.ini configuration files slightly to avoid overloading the system and remove the errors that were preventing plots from being updated after certain time in the day.

The changes made are the following:
1- all high-resolution spectra from the Summary and LSC tabs are now computed for each state (X-arm locked, Y-arm locked, IFO locked, all);
2- I've removed MICH, PRCL & SRCL from the summary spectrum (those can still be found in the LSC tab);
3- I've split LSC into two subtabs.

The reason for these changes is that having high resolution (raw channels, 16kHz) spectra for multiple (>3) channels on a single tab requires a *lot* of memory to process. As a result, those jobs were failing in a way that blocked the queue, so even other "healthy" tabs could not be updated.

My changes, reflected from May 25 on, should hopefully fix this. As always, feel free to re organize the ini files to make the pages more useful to you, but keep in mind that we cannot support multiple high resolution spectra on a single tab, as explained above.

  12259   Wed Jul 6 21:16:17 2016 Max IsiUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

This should be fixed now—apologies for the spam.


I don't know much about how the cron job runs, I'll forward this to Max.


I started to receive emails from cron every 15min. Is the email related to this? And is it normal? I never received these cron emails before when the sum-page was running.



  12394   Wed Aug 10 17:30:26 2016 Max IsiUpdateGeneralSummary pages status
Summary pages are currently empty due to a problem with the code responsible for locating frame files in the cluster. This should be fixed soon and the
pages should go back to normal automatically at that point. See Dan Kozak's email below for details.

Date: Wed, 10 Aug 2016 13:28:50 -0700
From: Dan Kozak <dkozak@ligo.caltech.edu>

> Dan, maybe it's a gw_data_find problem?

Almost certainly that's the problem. The diskcache program that finds
new data died on Saturday and no one noticed. I couldn't restart it,
but fortunately it's author just returned from several weeks vacation
today. Smile He's working on it and I'll let you know when it's back up.

Dan Kozak
  12399   Thu Aug 11 11:09:52 2016 Max IsiUpdateGeneralSummary pages status
This problem has been fixed.

> Summary pages are currently empty due to a problem with the code responsible for locating frame files in the cluster. This should be fixed soon and the
> pages should go back to normal automatically at that point. See Dan Kozak's email below for details.
> Date: Wed, 10 Aug 2016 13:28:50 -0700
> From: Dan Kozak <dkozak@ligo.caltech.edu>
> > Dan, maybe it's a gw_data_find problem?
> Almost certainly that's the problem. The diskcache program that finds
> new data died on Saturday and no one noticed. I couldn't restart it,
> but fortunately it's author just returned from several weeks vacation
> today. Smile He's working on it and I'll let you know when it's back up.
> --
> Dan Kozak
> dkozak@ligo.caltech.edu
  12432   Tue Aug 23 09:50:17 2016 Max IsiUpdateGeneralSummary pages down due to cluster maintenance

Summary pages down today due to schedulted LDAS cluster maintenance. The pages will be back automatically once the servers are back (by tomorrow).

  12440   Thu Aug 25 08:19:25 2016 Max IsiUpdateGeneralSummary pages down due to cluster maintenance

The system is back from maintenance and the pages for last couple of days will be filled retroactively by the end of the week.


Summary pages down today due to schedulted LDAS cluster maintenance. The pages will be back automatically once the servers are back (by tomorrow).


  12544   Mon Oct 10 17:42:47 2016 Max IsiUpdateDMFsummar pages dead again

I've re-submitted the Condor job; pages should be back within the hour.


Been non-functional for 3 weeks. Anyone else notice this? images missing since ~Sep 21.


  12548   Tue Oct 11 08:09:46 2016 Max IsiUpdateDMFsummar pages dead again

Summary pages will be unavailable today due to LDAS server maintenance. This is unrelated to the issue that Rana reported.


I've re-submitted the Condor job; pages should be back within the hour.


Been non-functional for 3 weeks. Anyone else notice this? images missing since ~Sep 21.



  12703   Wed Jan 11 19:20:23 2017 Max IsiUpdateSummary PagesDecember outage

The summary pages were not successfully generated for a long period of time at the end of 2016 due to syntax errors in the PEM and Weather configuration files.

These errors caused the INI parser to crash and brought down the whole gwsumm system. It seems that changes in the configuration of the Condor daemon at the CIT clusters may have made our infrastructure less robust against these kinds of problems (which would explain why there wasn't a better error message/alert), but this requires further investigation.

In any case, the solution was as simple as correcting the typos in the config side (on the nodus side) and restarting the cron jobs (on the cluster side, by doing `condor_rm 40m && condor_submit DetectorChar/condor/gw_daily_summary.sub`). Producing pages for the missing days will take some time (how to do so for a particular day is explained in the wiki https://wiki-40m.ligo.caltech.edu/DailySummaryHelp).

RXA: later, Max sent us this secret note:

However, I realize it might not be clear from the page which are the key steps. These are just running:

1) ./DetectorChar/bin/gw_daily_summary --day YYYYMMDD --file-tag some_custom_tag To create pages for day YYYYMMDD (the file-tag option is not strictly necessary but will prevent conflict with other instances of the code running simultaneously).

2) sync those days back to nodus by doing, eg: ./DetectorChar/bin/pushnodus 20160701 20160702

This must all be done from the cluster using the 40m shared account.
  12749   Tue Jan 24 07:36:56 2017 Max IsiUpdateSummary PagesCluster maintenance
System-wide CIT LDAS cluster maintenance may cause disruptions to summary pages today. 
  12752   Wed Jan 25 09:00:39 2017 Max IsiUpdateSummary PagesCluster maintenance
LDAS has not recovered from maintenance causing the pages to remain unavailable until further notice.

> System-wide CIT LDAS cluster maintenance may cause disruptions to summary pages today. 
  12787   Thu Feb 2 11:25:45 2017 Max IsiUpdateSummary PagesCluster maintenance
FYI this issue has still not been solved, but the pages are working because I got the software running on an
alternative headnode (pcdev2). This may cause unexpected behavior (or not).

> LDAS has not recovered from maintenance causing the pages to remain unavailable until further notice.
> > System-wide CIT LDAS cluster maintenance may cause disruptions to summary pages today. 
  12831   Wed Feb 15 22:16:05 2017 Max IsiUpdateSummary PagesNew condor_q format

There has been a change in the default format for the output of the condor_q command at CIT clusters. This could be problematic for the summary page status monitor, so I have disabled the default behavior in favor of the old one. Specifically, I ran the following commands from the 40m shared account: mkdir -p ~/.condor echo "CONDOR_Q_DASH_BATCH_IS_DEFAULT=False" >> ~/.condor/user_config This should have no effect on the pages themselves.

  302   Fri Feb 8 17:09:52 2008 Max JonesUpdateComputersChanges to NoiseBudget
Today I altered the following files

In DARM_CTRL case I changed the second channel name to DARM_ERR. Messy but it may be effective.

I commented out lines of code specifically pertaining to non-existent DARM_DAQ channel. Marked in
code around approximately line 60.

Please address all comments or concerns to me (williamj(at symbl)caltech.edu). Thank you.
  318   Thu Feb 14 17:21:53 2008 Max JonesUpdateComputersNoise budget code changes
In cvs/cds/caltech/NB/matlab/utilities/LSCmodel.m at line 146
I have hardwired in changes to struct lsc. Please see code.
  364   Fri Mar 7 17:10:01 2008 Max JonesUpdateComputersNoise Budget work
Noise budget has been moved to the svn system. A checked out copy is in the directory caltech. From now on, I will try to use the work cycle as outlined in the svn manual. Changes made today include the following:

Details of the modifications made may be found on the svn system. Please let me know if anyone has a suggestion or concern. Thank you - Max.
  535   Mon Jun 16 18:26:01 2008 Max JonesUpdateComputer Scripts / ProgramsNoise Budget Changes
In the directory cvs/cds/caltech/NB the following changes were made:

I created temporary files in ReferenceData for the C1 by copying and renaming the corresponding H1 files.
- C1_SRD_Disp.txt
- C1IFOparams.m
- C1_NoiseParams.m

In getmultichan.m I added a C1 case.

In NoiseBudget.m I added a C1 case with modified sources array to include only DARM and Seismic

I appreciate and suggestions. Max.

  542   Wed Jun 18 18:32:09 2008 Max JonesUpdateComputer Scripts / ProgramsNB Update
I am reconfiguring the the noisebudget code currently in use at the sights. To that end, I have done the following things (in addition to the elog I posted earlier)

In get_dtt_dataset.m - I added C1 specific cases for DARM_CTRL, SEIS, and ITMTRX changing the specific channels to make those in use at caltech

In LocalizeSite.m - I changed the NDS_PATH to match caltechs. I left NDS_HOST untouched.

Since I am trying to get SEIS and DARM to work initially I added C1 specific cases to both of these.

Better documentation may be found in /users/mjones/DailyProgressReport/06_18_08. Suggestions are appreciated. Max.
  565   Wed Jun 25 11:36:14 2008 Max JonesUpdateComputer Scripts / ProgramsFirst Week Update
For the first week I have been modifying the noise budget script in caltech/NB to run with 40 m parameters and data. As per Rana's instructions, I have tried to run the script with only seismic and Darm sources. This involves identifying and changing channel names and altering paramter files (such as NB/ReferenceData/C1IFOparams.m). To supply the parameter files, I have copied the H1 files with (as yet only) slight modification. The channel name changes have been made to mirror the sites for the most part. Two figures are attached which show the current noise budget. The Day plot was taken 6/23/08 at ~10:30 am. The night plot was taken 6/22/08 at ~11:00 pm . Note that the SRD curve is for the sites and not for the 40 m (I hope to change that soon). Also in one of the plots the DARM noise signal is visible. Obviously this needs work. A list of current concerns is

1) I am using a seismic transfer function made by previous SURF student Ryan Kinney to operate with channels of the form C1:PEM_ACC-ETMY_Y (should I be using C1:DMF-IX_ACCY?) and the channels I am currently using are the acceleraometers for the mode cleaner with names of the form C1:PEM_ACC-MC1_X. Rana said that he thinks these may be the same but I need to be sure.

2) We don't have a DARM_CTRL channel but the code requires it, currently I am using DARM_ERR as a substitute which is probably partly responsible for the obvious error in DARM noise.

Any suggestions are appreciated. Max.
Attachment 1: C1_NoiseBudgetPlot_Day.eps
Attachment 2: C1_NoiseBudgetPlot_Night.eps
  572   Thu Jun 26 10:56:15 2008 Max JonesUpdatePEMRemoved Magnetometer
I removed the Bartington Magnetometer on the x arm to one of the outside benches. I'll be trying to determine if and how it works today. It makes a horrible high pitched sound which is due to the fact that the battery is probably 16 yrs old. It still works with ac power though and I want to see if it is still operating correctly before I ask to buy a new battery. Sorry for the bother.
ELOG V3.1.3-