40m Log
ID   Date   Author   Type   Category   Subject
  11240   Thu Apr 23 21:05:23 2015   rana   Update   Computer Scripts / Programs   CDSutils upgrade undone

Q: please update this Wiki page with the go-back procedure:

https://wiki-40m.ligo.caltech.edu/CDSutils_Upgrade_Procedure

  11252   Sun Apr 26 00:56:21 2015   rana   Summary   Computer Scripts / Programs   problems with new restart procedures for elogd and apache

Since the nodus upgrade, Eric/Diego changed the old csh restart procedures to be more UNIX standard. The instructions are in the wiki.

After doing some software updates on nodus today, apache and elogd didn't come back OK. Maybe because of some race condition, elog tried to start but didn't get apache. Apache couldn't start because it found that someone was already binding the ELOGD port. So I killed ELOGD several times (because it kept trying to respawn). Once it stopped trying to come back I could restart Apache using the Wiki instructions. But the instructions didn't work for ELOGD, so I had to restart that using the usual .csh script way that we used to use.
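
For illustration, a port check like the one below could confirm whether some daemon is already holding a port before trying to restart Apache. This is only a sketch: the function name port_in_use and the example port are hypothetical, and the real elogd port should be taken from the nodus configuration.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    EXAMPLE_PORT = 8080  # hypothetical placeholder, not the real elogd port
    print("port busy:", port_in_use(EXAMPLE_PORT))
```

If the port is busy, the process holding it has to be stopped (and kept from respawning) before the other daemon can bind it.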

  11263   Wed Apr 29 18:12:42 2015   rana   Update   Computer Scripts / Programs   nodus update

Installed libmotif3 and libmotif4 on nodus so that we can run dataviewer on there.

Also, the lscsoft stuff wasn't installed for apt-get, so I did so following the instructions on the DASWG website:

https://www.lsc-group.phys.uwm.edu/daswg/download/repositories.html#debian

Then I installed libmetaio1 and libfftw3-3. Now, rather than complain about missing libraries, diaggui just silently dies.

Then I noticed that the awggui error message tells us to use 'ssh -Y' instead of 'ssh -X'. Using that I could run DTT on nodus from my office.

  11267   Fri May 1 20:33:31 2015   rana   Summary   Computer Scripts / Programs   problems with new restart procedures for elogd and apache

Same thing again today. So I renamed /etc/init/elog.conf so that it doesn't keep respawning bootlessly. Until this is debugged, restart elog using the start script in /cvs/cds/caltech/elog/ as usual.

I'll let EQ debug when he gets back - probably we need to pause the elog respawn so that it waits until nodus is up for a few minutes before starting.

Quote:

Since the nodus upgrade, Eric/Diego changed the old csh restart procedures to be more UNIX standard. The instructions are in the wiki.

After doing some software updates on nodus today, apache and elogd didn't come back OK. Maybe because of some race condition, elog tried to start but didn't get apache. Apache couldn't start because it found that someone was already binding the ELOGD port. So I killed ELOGD several times (because it kept trying to respawn). Once it stopped trying to come back I could restart Apache using the Wiki instructions. But the instructions didn't work for ELOGD, so I had to restart that using the usual .csh script way that we used to use.

 

  11273   Tue May 5 10:40:05 2015   ericq   HowTo   Computer Scripts / Programs   How to get a web page running on Nodus

How to get your own web page running on Nodus

  1. On any martian machine, put your stuff in /users/public_html/$MYPAGE/
  2. On Nodus, run: ln -s /users/public_html/$MYPAGE /export/home/
  3. Your site is now available at https://nodus.ligo.caltech.edu:30889/$MYPAGE/
  4. If you want to allow straight up directory listing to the entire internet, on Nodus run: sudoedit /etc/sites-available/nodus, and add the following lines towards the bottom:
<Directory /export/home/$MYPAGE>
    Options +Indexes
</Directory>
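
For illustration, step 2 can also be scripted. The sketch below is a Python equivalent of the ln -s command, assuming the same two directories as in the steps above; the function name publish_page is hypothetical.

```python
from pathlib import Path


def publish_page(page, src_root="/users/public_html", dst_root="/export/home"):
    """Symlink src_root/page into dst_root, replacing a stale link if present."""
    src = Path(src_root) / page
    dst = Path(dst_root) / page
    if not src.is_dir():
        raise FileNotFoundError(src)
    if dst.is_symlink():
        dst.unlink()  # behave like `ln -sfn`: replace an existing link
    dst.symlink_to(src, target_is_directory=True)
    return dst
```

Calling publish_page("mypage") on nodus would then correspond to running the ln command by hand for a page called mypage.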
  11277   Sun May 10 13:54:41 2015   rana   HowTo   Computer Scripts / Programs   summary page URL change

Also, EQ gave us a better (and not pwd protected) URL for the summary pages. Please replace your previous links with this new one:

https://nodus.ligo.caltech.edu:30889/detcharsummary/

  11278   Mon May 11 01:28:33 2015   rana   HowTo   Computer Scripts / Programs   summary page URL change

Like Steve pointed out, the summary pages show that the y-arm transmission drifts a lot when locked. The OL summary page shows that this is all due to ITMY yaw.

Could be either that the coil driver / DAC is bad or that the suspension is poorly built. We need to dig into long-term ITMY OL trends to see if this is new or not.

Also, weather station needs a reboot. And does anyone know what the MC_F calibration is?

  11288   Wed May 13 09:17:28 2015   rana   Update   Computer Scripts / Programs   rsync frames to LDAS cluster

Still seems to be running without causing FB issues. One thought is that we could look through the FB status channel trends and see if there is some excess of FB problems at 10 min after the hour to see if it's causing problems.

I also looked into our minute trend situation. Looks like the files are compressed and have checksum enabled. The size changes sometimes, but it's roughly 35 MB per hour, so 840 MB per day.

According to the wiper.pl script, it's trying to keep the minute-trend directory below some fixed fraction of the total /frames disk. The comment in the script says 0.005%, but I'm dubious since that's only 13TB*5e-5 = 650 MB, and that would keep less than a day of trends. Maybe the comment should read 0.5% instead...
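
The arithmetic can be checked with a few lines of Python. This is a sketch using only the rough numbers quoted above (35 MB per hour, a 13 TB disk) and decimal units; the variable names are illustrative.

```python
MB_PER_HOUR = 35   # approximate minute-trend size per hour, from the entry
DISK_TB = 13       # total size of /frames, from the entry

daily_mb = MB_PER_HOUR * 24  # 840 MB per day


def days_kept(fraction):
    """Days of minute trends that fit in the given fraction of the disk."""
    budget_mb = DISK_TB * 1e6 * fraction  # 1 TB = 1e6 MB (decimal)
    return budget_mb / daily_mb


for label, frac in [("0.005%", 5e-5), ("0.5%", 5e-3)]:
    print(f"{label}: {DISK_TB * 1e6 * frac:.0f} MB budget, "
          f"{days_kept(frac):.1f} days of trends")
```

With these numbers the 0.005% budget is about 650 MB, under one day of data, while 0.5% would hold roughly 77 days.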

Quote:

The rsync job to sync our frames over to the cluster has been on a 20 MB/s BW limit for awhile now.

Dan Kozak has now set up a cronjob to do this at 10 min after the hour, every hour. Let's see how this goes.

You can find the script and its logfile name by doing 'crontab -l' on nodus.

 

  11299   Mon May 18 14:22:05 2015   ericq   Update   Computer Scripts / Programs   rsync frames to LDAS cluster
Quote:

Still seems to be running without causing FB issues.

I'm not so sure. I was just experiencing some severe network latency / EPICS channel freezes that were alleviated by killing the rsync job on nodus. It started a few minutes after ten past the hour, when the rsync job started.

Unrelated to this, for some odd reason, there is some weirdness going on with ssh'ing to martian machines from the control room computers. I.e. on pianosa, ssh nodus fails with a failure to resolve hostname message, but ssh nodus.martian succeeds.

  11307   Tue May 19 11:15:09 2015   ericq   Update   Computer Scripts / Programs   Chiara Backup Hiccup

Starting on the 14th (five days ago) the local chiara rsync backup of /cvs/cds to an external HDD has been failing:

caltech/c1/scripts/backup/rsync_chiara.backup.log:

2015-05-13 07:00:01,614 INFO       Updating backup image of /cvs/cds
2015-05-13 07:49:46,266 INFO       Backup rsync job ran successfully, transferred 6504 files.
2015-05-14 07:00:01,826 INFO       Updating backup image of /cvs/cds
2015-05-14 07:50:18,709 ERROR      Backup rysnc job failed with exit code 24!
2015-05-15 07:00:01,385 INFO       Updating backup image of /cvs/cds
2015-05-15 08:09:18,527 ERROR      Backup rysnc job failed with exit code 24!
...
 

Code 24 apparently means "Partial transfer due to vanished source files."

Manually running the backup command on chiara worked fine, returning a code of 0 (success), so we are backed up. For completeness, the command is controls@chiara: sudo rsync -av --delete --stats /home/cds/ /media/40mBackup

Are the summary page jobs moving files around at this time of day? If so, one of the two should be rescheduled to not conflict. 
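
As an illustration, a backup wrapper could treat exit code 24 differently from real failures. The sketch below is not the existing backup script; the function names are hypothetical, and treating code 24 as a warning rather than an error is an assumed policy.

```python
import subprocess

RSYNC_OK = 0
RSYNC_VANISHED = 24  # rsync: "Partial transfer due to vanished source files"

# Command as quoted in the entry above
BACKUP_CMD = ["sudo", "rsync", "-av", "--delete", "--stats",
              "/home/cds/", "/media/40mBackup"]


def classify_rsync_exit(code):
    """Map an rsync exit code to a log severity (assumed policy)."""
    if code == RSYNC_OK:
        return "success"
    if code == RSYNC_VANISHED:
        return "warning"  # files disappeared mid-run; the rest was copied
    return "error"


def run_backup(cmd=BACKUP_CMD):
    """Run the backup and return its severity classification."""
    result = subprocess.run(cmd)
    return classify_rsync_exit(result.returncode)
```

Whether a vanished-file run should count as a valid backup is a judgment call, since the files that disappeared are by definition not in the image.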

  11308   Tue May 19 11:24:44 2015   ericq   Update   Computer Scripts / Programs   Notification Scheme

Given some of the things we've been facing lately, it occurs to me that we could be better served by having some sort of unified human-alerting scheme in place, for things like:

  • Local/offsite backup failures
  • Vacuum system problems
  • HDD status for things like /frames/ and /cvs/cds/, whether the disks are full, or their SMART status indicates imminent mechanical failure

Currently, many of these things are just checked sporadically when it occurs to someone to do so, or when debugging random issues. Smoother IFO operation and peace of mind could be gained if we're confident that the relevant people are notified in a timely manner. 

Thoughts? Suggestions on other things to monitor, like maybe frontend/model crashes?
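
As one possible starting point, the disk-fullness check could be a small script run from cron. This is a sketch only: the function names and the 90% threshold are assumptions, and the notification step (email, etc.) is left out.

```python
import shutil


def disk_fraction_used(path):
    """Fraction of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def paths_over_threshold(paths, threshold=0.9):
    """Return (path, fraction) for each path at or above the threshold.

    Paths that cannot be read (e.g. an NFS mount that is down) are
    reported with fraction None so that they are flagged as well.
    """
    flagged = []
    for p in paths:
        try:
            frac = disk_fraction_used(p)
        except OSError:
            flagged.append((p, None))
            continue
        if frac >= threshold:
            flagged.append((p, frac))
    return flagged


if __name__ == "__main__":
    for path, frac in paths_over_threshold(["/frames", "/cvs/cds"]):
        print("ALERT:", path, "unreadable" if frac is None else f"{frac:.0%} full")
```

SMART status and backup-log checks would need separate tools; only the disk-usage part is sketched here.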

  11321   Fri May 22 18:09:58 2015   ericq   Update   Computer Scripts / Programs   ifoCoupling

I've started working on a general routine to measure noise couplings in our interferometers. Often this is done with swept sine measurements, but this misses the nonlinear part of the coupling, especially if the linear part is already reduced through some compensation or feedforward scheme. Rana suggested using a series of narrow band-limited noise injections.

The structure I'm working on is a python script that uses the AWG interface written by Chris W. to create the excitations. Afterwards, I calculate a series of PSD estimates from the data (i.e. a spectrogram), and apply a two-sample, unequal variance t-test to test for statistically significant increases in the noise spectra to try and evaluate the nonlinear contributions to the noise. I've started a git repository at github.com/e-q/ifoCoupling with the code.
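
For readers unfamiliar with the test, the sketch below shows a two-sample, unequal variance (Welch) t statistic applied per frequency bin. It is not the ifoCoupling code: the function names are hypothetical, and comparing t against a fixed threshold is a simplification of a proper p-value computed from the Student's t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch t statistic and degrees of freedom for mean(a) vs mean(b).

    Both samples need at least two values and nonzero variance.
    """
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean(a)
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, dof


def excess_noise_bins(excited, baseline, t_threshold=3.0):
    """Indices of frequency bins where the excited PSDs exceed the baseline.

    `excited` and `baseline` are lists of PSD estimates (one per time
    segment), each a list with one value per frequency bin.
    """
    flagged = []
    for k in range(len(baseline[0])):
        t, _ = welch_t([psd[k] for psd in excited],
                       [psd[k] for psd in baseline])
        if t > t_threshold:  # one-sided: only increases in noise count
            flagged.append(k)
    return flagged
```

Bins flagged outside the excitation band are the candidates for nonlinear coupling.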

So far, I've tested one such injection of noise coupling from the ETMX oplev error point to the single arm length error signal. It's completely missing the user interface and structure to do a general series of measurements, but this is just organizational; I'm trying to get the math/science down first. 

Here's a result from today:

Median, instead of the usual mean, PSDs are used throughout, to reject outliers/glitches.

The linear part of the coupling can be estimated using the coherence / spectrum height in the excitation band, but I'm not sure what the best way to present/parametrize the nonlinear parts of each individual excitation band's result is.

Also, I anticipate being able to write an excitation auto-leveling routine, gradually increasing the excitation level until the excited spectrum is some amount noisier than the baseline spectrum, up to some maximum amount configurable by the user.
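
One way such an auto-leveling loop could look is sketched below. Everything here is hypothetical: measure_ratio stands in for a routine that injects at a given amplitude and returns the excited/baseline noise ratio, and the geometric step size is an arbitrary choice.

```python
def auto_level(measure_ratio, start_amp, target_ratio=2.0,
               max_amp=100.0, step=1.5):
    """Raise the excitation until the noise ratio reaches target_ratio.

    Returns (amplitude, ratio, reached), where `reached` is False if the
    user-set maximum amplitude would be exceeded before hitting the target.
    """
    amp = start_amp
    while True:
        ratio = measure_ratio(amp)
        if ratio >= target_ratio:
            return amp, ratio, True
        next_amp = amp * step
        if next_amp > max_amp:
            return amp, ratio, False  # give up rather than over-drive
        amp = next_amp


if __name__ == "__main__":
    # Toy stand-in for a real measurement: quadratic excess noise
    print(auto_level(lambda a: 1.0 + a ** 2, start_amp=0.1))
```

Starting low and stepping up keeps the first injections small, which matters when the coupling is nonlinear.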

The excitation shaping could probably be improved, too. It's currently an elliptic + butterworth bandpass for a sharp edge and rolloff.

I'm open to any thoughts and/or suggestions anyone may have!

Attachment 1: ETMX_PIT_L_coupling.png
  11325   Tue May 26 19:57:11 2015   rana   Update   Computer Scripts / Programs   ifoCoupling

Looks like a very handy code, especially with the real statistical tests.

I would make sure to use much smaller excitation amplitudes. Since the coupling is nonlinear, we expect that it's only a good noise budget estimator when the excitation amplitude is less than a factor of 3 above the quiescent excitation.

  11327   Wed May 27 15:20:54 2015   ericq   Update   Computer Scripts / Programs   Chiara Backup Hiccup

The local chiara backups are still failing due to vanished source files. I've emailed Max about the summary page jobs, since I think they're running remotely. 

  11336   Fri May 29 11:28:42 2015   ericq   Update   Computer Scripts / Programs   Chiara Backup Hiccup

I've changed the chiara local backup script to read a folder exclusion file, and excluded /users/public_html/detcharsummary, and things are working again. 

This was necessary because the summary pages are being updated every half hour, which is faster than the time it takes for the backup script to run, so the file index that it builds at the start becomes invalid later on in the process.
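
For illustration, an exclusion file can be passed to rsync with its --exclude-from option. The sketch below only builds the command line; the function name and the exclusion file path are hypothetical, and the actual script on chiara may be organized differently.

```python
def build_backup_command(src="/home/cds/", dst="/media/40mBackup",
                         exclude_file=None):
    """Assemble the rsync command, optionally with an exclusion file."""
    cmd = ["rsync", "-av", "--delete", "--stats"]
    if exclude_file:
        # Each line of the file is a pattern, relative to the source root
        cmd.append(f"--exclude-from={exclude_file}")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Hypothetical location for the exclusion list
    print(" ".join(build_backup_command(exclude_file="/tmp/backup_exclude.txt")))
```

Because rsync matches exclude patterns relative to the transfer root, the entry for the summary pages has to be written relative to the source directory, not as an absolute path.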


Thinking about chiara's disk, it strikes me that when we went from the linux1 RAID to a single HDD on chiara, we may have tightened a bottleneck on our NFS latency, i.e. we are limited to that single hard drive's IO rates. This of course isn't the culprit for the more recent dramatic slowdowns, but in addition to fixing whatever has happened more recently, we may want to consider some kind of setup with higher IO capability for the NFS filesystem. 

  11337   Fri May 29 12:49:53 2015   Koji   Update   Computer Scripts / Programs   Chiara Backup Hiccup

In fact, the file access is supposed to be WAY faster now than in the RAID case.

As noted in ELOG 9511, it was SCSI-2 (or 3?) that had ~6MB/s throughput. Previously the backup took ~2 hours.
This was improved to 30min by the SATA HDD on linux1.

I am looking at /opt/rtcds/caltech/c1/scripts/backup/rsync.backup.cumlog

In fact, this "30-min backup" was true until the end of March. After that the backup is taking 1h~1.5h.

This could be related to the recent NFS issue?

  11338   Fri May 29 15:12:39 2015   Koji   Update   Computer Scripts / Programs   Chiara Backup Hiccup

Actual data

Attachment 1: backup_hours.pdf
  11366   Fri Jun 19 16:54:20 2015   Jenne   Update   Computer Scripts / Programs   Wiener scripts in scripts directory

I have put the Wiener filter scripts into  /opt/rtcds/caltech/c1/scripts/Wiener/  .  They are under version control. 

The idea is that you should copy ParameterFile_Example.m into your own directory, and modify parameters at the top of the file, and then when you run that script, it will output fitted filters ready to go into Foton.  (Obviously you must check before actually implementing them that you're happy with the efficacy and fits of the filters). 

Things to be edited in the ParameterFile include:

  • Channel names for the witness sensors (which should each have a corresponding .txt file with the raw data)
  • Channel name for the target
  • Folder where this raw data is saved
  • Folder to save results
  • 1 or 0 to determine whether to load and downsample the raw data, or to use pre-downsampled data
    • This should probably be changed to check whether the pre-downsampled data already exists and, if not, do the downsampling
  • 1 or 0 to determine whether to use actuator pre-weighting
  • Data folder for measured actuator TFs (only if using actuator pre-weighting)
    • Actuator TFs can be many different exported text files from DTT, and they will be stitched together to make one set of measurements, where all points have coherence above some quantity (that you set in the ParameterFile)
  • Coherence threshold for actuator data (only use data points with coherence above this amount)
  • Fit order for actuator transfer function's vectfit
  • 1 or 0 to decide whether to use a preweighting filter
  • Zeros and poles for the preweighting filters
  • 1 or 0 to decide whether to use a lowpass after the Wiener filters (you will be provided with the corresponding SOS coefficients for this filter if you say yes)
  • Lowpass filter parameters: cutoff frequency, order and ripple for the Cheby filter
  • New sample rate for the data
  • Number of Wiener filter taps
  • Whether to use brute-force matrix inversion or the Levinson method
  • Calibrations for witnesses and target
  • Fit order for each of the Wiener filters

I think that's everything that is required.

  11481   Thu Aug 6 01:38:19 2015   ericq   Update   Computer Scripts / Programs   Chiara gets new Ethernet card

Since Chiara's onboard ethernet card has a reputation for being flaky in Linux, Koji suggested we could just buy a new ethernet card and throw it in there, since they're cheap.

I've installed an Intel EXPI9301CT ethernet card in Chiara, which detected it without problems. I changed over the network settings in /etc/network/interfaces to use eth1 instead of eth0, restarted nfs and bind9, and everything looked fine.

Sadly, EPICS/network slowdowns are still happening. :(

  11498   Wed Aug 12 14:35:46 2015   ericq   Update   Computer Scripts / Programs   PDFs in ELOG

I've tweaked the ELOG code to allow uploading of PDFs by drag-and-drop into the main editor window. Once again we can bask in the glory of embedded PDFs (see Attachment 1).

(You may have to clear your browser's cache to load the new javascript)

Attachment 1: smooth.pdf
  11572   Fri Sep 4 04:12:05 2015   ericq   Update   Computer Scripts / Programs   MATLAB down on all workstations

There seems to be something funny going on with MATLAB's license authentication on the control room workstations. Earlier today, I was able to start MATLAB on pianosa, but now attempting to run /cvs/cds/caltech/apps/linux64/matlab/bin/matlab -desktop results in the message: 

License checkout failed.
License Manager Error -15
MATLAB is unable to connect to the license server. 
Check that the license manager has been started, and that the MATLAB client machine can communicate
with the license server.

Troubleshoot this issue by visiting: 
http://www.mathworks.com/support/lme/R2013a/15

Diagnostic Information:
Feature: MATLAB 
License path: /home/controls/.matlab/R2013a_licenses:/cvs/cds/caltech/apps/linux64/matlab/licenses/license.dat:/cvs/cds/caltech/apps/linux64/matlab/licenses/network.lic 
Licensing error: -15,570. System Error: 115

  11580   Mon Sep 7 16:30:56 2015   rana   HowTo   Computer Scripts / Programs   increase of window border size on Rossa

Frustrated by the single pixel width of the window borders and how hard that makes it to drag things around, I explored StackExchange, which showed that there is an .xml file that can be edited to increase this. I've changed the border size to 4 pixels on Rossa - it's nice.

  11615   Thu Sep 17 19:58:06 2015   gautam   Summary   Computer Scripts / Programs   Frequency counting algorithm

I made some changes to the c1tst model running on c1iscey in order to test my algorithm for frequency counting. I followed the steps listed in elog 8909 to make, install and start the model. 

I need to debug a few things and run some more diagnostics so I am leaving the model in its edited version (Eric had committed it to the svn before I made any changes). 

  11618   Fri Sep 18 09:06:26 2015   rana   Frogs   Computer Scripts / Programs   remote data access: volume 1, Inferno

Trying to download some data using matlab today, I found that my ole mDV stuff doesn't work because its MEX files were built for AMD64...

Trying to rebuild the NDS1 MEX according to 7-year-old instructions didn't work; our GCC is 'too' new.

From the Remote Data Access wiki (https://wiki.ligo.org/RemoteAccess/MatlabTools) I got the new 'get_data.m' and 'GWdata.m'. These didn't run, so I updated the nds2-client and matlab-nds2-client on Donatella.

Still doesn't run to get 40m data. It recognizes that we're C1, but throws some java exception error. Maybe it doesn't work on the NDS1 protocol of our framebuilder?

So then I noticed that our NDS2 server on megatron is no longer running...thought it was supposed to run via init.d. Found that the nds2 binary doesn't run because it can't find libframecpp.so.5; maybe this was blown away in some recent upgrade? We do have versions 3, 4, 6, 7, & 8 of this library installed.

So now, after an hour or two, I'm upgrading the nds2 server on megatron (plus a hundred dependencies) as well as getting a newer version of matlab to see if there's some kind of java version issue there.

Of course python still works to get data, but doesn't have any of the wiener filter calculating code that matlab has...

  11623   Fri Sep 18 19:19:49 2015   rana   Frogs   Computer Scripts / Programs   remote data access: volume 1, Inferno

NDS2 restarted after hours long upgrade process; testing has begun. Let's try to get some long stretches of MC locked with MCL FF ON this weekend so's I can test out the angular FF idea.

  11628   Mon Sep 21 18:31:06 2015   gautam   Summary   Computer Scripts / Programs   Frequency counting algorithm

I have been working on setting up a frequency counting module that can give us a readout of the beat frequency, divided by a factor of 2^14 using the Wenzel frequency dividers as described here. This is a summary of what I have thus far.

The algorithm, and simulink model

The basic idea is to pass the digitized signal through a Schmitt trigger (existing RCG module), which provides some noise immunity, and should in theory output a clean square wave with the same frequency as the input. The output of the Schmitt trigger module is either 0 (for input below the lower threshold value) or 1 (for input above the upper threshold value). By differencing this between successive samples, we can detect a "zero-crossing", and by measuring the time interval between successive zero crossings, we can take the reciprocal to get the frequency. The last bit of this operation (i.e. measuring the interval) is done using a piece of custom C code. Initially, I was trying to use the part "GPS" from CDS_PARTS to get the current GPS time and hence measure intervals between successive zero-crossings, but this didn't work out because the output of GPS is in seconds, and that doesn't give me the required precision to count frequency. I tried implementing some more precise timing using the clock_gettime() function, which is capable of giving nanosecond precision, but this didn't work for me. So I am now using a cruder way of measuring the interval, by using a counter variable that is incremented each time a zero-crossing is NOT detected, and then converting this to time using the FE_RATE macro (=16384). In any case, the ADC sampling rate limits the resolution of frequency counting using zero-crossing detection (more on this later). Attachment 1 shows the SIMULINK block diagram for this entire procedure.
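
The scheme can be mimicked offline in a few lines of Python. This is a sketch, not the RCG model or its C code: the function name is hypothetical, and it assumes that only rising edges of the Schmitt trigger output are counted, so each interval is one full period.

```python
FE_RATE = 16384  # front-end sample rate in Hz, as quoted above


def count_frequency(samples, low, high, fe_rate=FE_RATE):
    """Frequency estimates from Schmitt-triggered rising-edge intervals."""
    state = 0          # Schmitt trigger output, 0 or 1
    counter = 0        # samples elapsed since the last rising edge
    seen_edge = False  # the first edge only starts the clock
    freqs = []
    for x in samples:
        prev = state
        if x > high:
            state = 1
        elif x < low:
            state = 0  # between thresholds the state is held (hysteresis)
        counter += 1
        if state - prev == 1:  # rising edge detected by differencing
            if seen_edge:
                freqs.append(fe_rate / counter)
            seen_edge = True
            counter = 0
    return freqs


if __name__ == "__main__":
    # Ideal 2 kHz square wave, +-1500 counts, thresholds at +-1000 counts
    wave = [1500 if (i * 2000 / FE_RATE) % 1.0 < 0.5 else -1500
            for i in range(1000)]
    print(sorted(set(count_frequency(wave, -1000, 1000))))
```

Even for this noiseless input the readout alternates between two values, because the period of a 2 kHz signal is not a whole number of samples.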

Testing the model

I implemented all of this on c1tst, and followed the steps listed here to get the model up and running. I then used one of the DB37 breakout boards to send a signal to the ADC using the DS345 function generator. Attachment 2 shows some diagnostic plots - input signal was a 2.5Vpp (chosen to match the output from the Wenzel dividers) square wave at 2kHz:

  • Bottom left: digitized version of the input signal - I used this to set the upper and lower thresholds on the Schmitt trigger at +1000 counts and -1000 counts respectively.
  • Top left: Schmitt trigger output (red trace) and the difference between successive samples of the Schmitt trigger output (blue trace - this variable is used to detect a zero crossing)
  • Top right: Counter variable used to measure intervals between successive zero crossings, and hence, the frequency. The frequency output is held until the next zero crossing is detected, at which time counter is reset
  • Bottom right: frequency output in Hz.

The right column pointed me to the limitations of frequency counting using this method - even though the input frequency was constant (2kHz), the counter variable, and hence the frequency readout, was neither accurate nor precise. But this was to be expected given the limitations imposed by ADC sampling? We only get information about the state of the input signal once within each sampling interval, and hence, we cannot know if a zero crossing has occurred until the next sampling interval. Moreover, we can only count frequency in discrete steps. In attachments 3 and 4, I've plotted these discrete frequencies which can be measured - the error bars indicate the error in the frequency readout if the counter variable is 1 more or less than the "true" value - this can (and does) happen if the high and low times of the Schmitt trigger are not equal over time (see top left plot in Attachment 2; it's not very obvious, but all the "low" times are not equal, and so, the interval between detected zero crossings is not equal). This becomes a problem for small values of the counter variable, i.e. at high input frequencies. I was having a look at the elogs Aidan wrote some years ago for a different digital frequency counting approach, and I guess the conclusion there was similar - for high input frequencies, the error is large.
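
The size of these discrete steps follows from f = FE_RATE/n for an integer counter value n, so an off-by-one count changes the readout by roughly f^2/FE_RATE. The sketch below evaluates this; the function name is hypothetical and it assumes one counter value per full period.

```python
FE_RATE = 16384  # front-end sample rate in Hz


def step_error(f, fe_rate=FE_RATE):
    """Readout change (Hz) if the counter is one too high or one too low.

    Returns (drop_if_one_more, rise_if_one_less) around the counter value
    nearest to the true period; requires f well below fe_rate / 2.
    """
    n = round(fe_rate / f)
    nominal = fe_rate / n
    return nominal - fe_rate / (n + 1), fe_rate / (n - 1) - nominal


if __name__ == "__main__":
    for f in (100, 500, 2000):
        down, up = step_error(f)
        print(f"{f} Hz: -{down:.1f} / +{up:.1f} Hz, "
              f"f^2/FE_RATE = {f * f / FE_RATE:.1f} Hz")
```

At 2 kHz a single count is worth more than 200 Hz, while at 100 Hz it is well under 1 Hz, consistent with the jitter getting worse at high input frequencies.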

I further did two frequency sweeps using the DS345, to see if I could recover this in the frequency readout. Attachments 5 and 6 show the results of these sweeps. For low frequencies, i.e. 100-500 Hz, the jitter in the readout is small (though this will be multiplied by a factor of 2^14), but by the time the input frequency gets up to 2kHz, the jitter in the readout is pretty bad (and gets worse for even higher frequencies).

Bottom line

Some refinements can be made to the algorithm, perhaps by introducing some averaging (i.e. not reading out frequency for every pair of zero crossings, but every 5) which may improve the jitter in the readout, but I would think that the current approach is not very useful above 2kHz (corresponding to ~30MHz of pre-divider frequency), because of the limitations shown in attachments 3 and 4. 

Attachment 1: Simulink_model.pdf
Attachment 2: diagnostic_plots.pdf
Attachment 3: Error_high_frequency.pdf
Attachment 4: Error_low_frequency.pdf
Attachment 5: Frequency_sweep_100_500_Hz.pdf
Attachment 6: Frequency_sweep_100_2000_Hz.pdf
  11629   Mon Sep 21 23:18:55 2015   ericq   Summary   Computer Scripts / Programs   Frequency counting algorithm

I definitely think lowpassing the output is the way to go. Since this frequency readback will be used for slow control of the beatnote frequency via auxiliary laser temperature, even lowpassing at tens of Hz is fine. The jitter doesn't mean it's useless, though.

If we lowpass at 16Hz, we're effectively averaging over 1024 samples, bringing, for example, a +-2kHz jitter of a 6kHz signal as you post down to 2kHz/sqrt(1024) ~ 60Hz, which is 1% of the carrier. This seems ok to me. 
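
The estimate above can be reproduced as follows. This is a sketch that assumes the jitter is uncorrelated from sample to sample, so that averaging N samples reduces it by sqrt(N); the function name is illustrative.

```python
from math import sqrt

FE_RATE = 16384  # front-end sample rate in Hz


def averaged_jitter(jitter_hz, lowpass_hz, fe_rate=FE_RATE):
    """Residual jitter after averaging over fe_rate / lowpass_hz samples."""
    n_avg = fe_rate / lowpass_hz  # 1024 samples for a 16 Hz lowpass
    return jitter_hz / sqrt(n_avg)


if __name__ == "__main__":
    residual = averaged_jitter(2000, 16)
    print(f"{residual:.1f} Hz, {100 * residual / 6000:.2f}% of a 6 kHz signal")
```

If successive readouts are correlated, the improvement will be smaller than this sqrt(N) estimate.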

  11631   Tue Sep 22 02:11:17 2015   rana   Summary   Computer Scripts / Programs   Frequency counting algorithm

I was going to suggest using a software PLL, but perhaps averaging gives the same result. The same ADC signal can be fed to multiple blocks with different averaging times and we can just use whichever ones seem the most useful.

  11640   Thu Sep 24 17:01:37 2015   ericq   Update   Computer Scripts / Programs   Freeing up some space on /cvs/cds

I noticed that Chiara's backup HD (which has a capacity of 1.8TB, vs the main drive's 2TB) was close to getting full, meaning that we would soon be without a local backup.

I freed up ~200GB of space by compressing the autoburt snapshots from 2012, 2013, 2014. Nothing is deleted, I've just compressed text files into archives, so we can still dig out the data whenever we want.

  11688   Wed Oct 14 15:59:06 2015   rana   Update   Computer Scripts / Programs   nodus web apache simlinks too soft

None of the links here seem to work. I forgot what the story is with our special apache redirect.

https://wiki-40m.ligo.caltech.edu/Core_Optics

  11694   Thu Oct 15 14:39:58 2015   ericq   Update   Computer Scripts / Programs   nodus web apache simlinks too soft
Quote:

None of the links here seem to work. I forgot what the story is with our special apache redirect.

https://wiki-40m.ligo.caltech.edu/Core_Optics

The story is: we currently don't expose the whole /users/public_html folder. Instead, we are symlinking the folders from public_html to /export/home/ on nodus, which is where apache looks for things.

So, I fixed the links on the Core Optics page by running:

controls@nodus|~ > ln -sfn /users/public_html/40m_phasemap /export/home/

  11784   Wed Nov 18 20:49:05 2015   rana   Update   Computer Scripts / Programs   nodus boot getting full

controls@nodus|~ > df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/nodus2--vg-root  355G   69G  269G  21% /
udev                         5.9G  4.0K  5.9G   1% /dev
tmpfs                        1.2G  308K  1.2G   1% /run
none                         5.0M     0  5.0M   0% /run/lock
none                         5.9G     0  5.9G   0% /run/shm
/dev/sda1                    236M  210M   14M  94% /boot
chiara:/home/cds             2.0T  1.5T  459G  77% /cvs/cds
fb:/frames                    13T   11T  1.6T  88% /frames

  11786   Wed Nov 18 23:18:07 2015   ericq   Update   Computer Scripts / Programs   nodus /boot cleared up

The /boot partition was filling up with old kernels. Nodus has automatic security updates turned on, so new kernels roll in and the old ones don't get removed. 

I ran apt-get autoremove, which removed several old kernels. (apt is configured by default to keep two previous kernels around when autoremoving, so this isn't so risky)

Now: /dev/sda1                    236M   94M  130M  42% /boot

In principle, one should be able to change a setting in /etc/apt/apt.conf.d/50unattended-upgrades that would do this cleanup automatically, but this mechanism has a bug whose fix hasn't propagated out yet (link). So, I've added a line to nodus' root crontab to autoremove once a week, Sunday morning.

  11799   Mon Nov 23 14:45:39 2015   ericq   Update   Computer Scripts / Programs   New software

COMSOL 5.1 has been installed at: /cvs/cds/caltech/apps/linux64/comsol51/bin/comsol

MATLAB 2015b has been installed at: /cvs/cds/caltech/apps/linux64/matlab15b/bin/matlab 

This has not replaced the default matlab on the workstations, which remains at 2013a. If some testing reveals that the upgrade is ok, we can rename the folders to switch. 

  11824   Mon Nov 30 12:19:38 2015   yutaro   Update   Computer Scripts / Programs   image capture

On VIDEO.adl, Image Capture and Video Capture did not seem to work and gave me some errors, so I fixed the following two things:

1. Plugged one end of a USB cable into Pianosa; the other end was already connected to the Sensoray. I don't know why, but this was disconnected.

2. Slightly modified /users/sensoray/sdk_2253_1.2.2_linux/imsub/display-image.py as follows:

L52:       pix[j, i] = R, G, B     ->     pix[j, i] = int(R), int(G), int(B)  

It seems to work, at least for some cameras including ETMYF and ITMYF. 

  11837   Wed Dec 2 15:08:41 2015   ericq   Update   Computer Scripts / Programs   Donatella sudo problem resolved

Somehow, the controls user account on donatella lost its membership to the sudoers group, which meant doing anything that needs root authentication was impossible. 

I fixed this by booting up from a Linux install USB drive, mounting the HD, and running adduser controls sudo

  11855   Mon Dec 7 10:40:09 2015   yutaro   Update   Computer Scripts / Programs   Added 1 line to UNFREEZE_DITHER.py

I added 1 line to one of the ASS scripts, UNFREEZE_DITHER.py like this:

L29>   ez.cawrite('C1:ASS-'+dof+'_GAIN', 0)   

The reason I added this: without this line, C1:ASS-'+dof+'_GAIN becomes larger than 1.0, which is the nominal value, if you UNFREEZE DITHER when the dither is already running or C1:ASS-'+dof+'_GAIN is not 0.0.

  11861   Tue Dec 8 11:24:45 2015   yutaro   Summary   Computer Scripts / Programs   Scripts for loss map measurement

Here I explain the usage of my scripts for loss map measurement. There are 7 script files in the same directory, /opt/rtcds/caltech/c1/scripts/lossmap_scripts. With these scripts, the round trip loss of an arm cavity is measured with the beam spot on one mirror shifted to 5x5 (option: 3x3) points. You can choose which cavity to measure, which mirror's beam spot to shift, and the maximum shift of the beam spot in the vertical and horizontal directions.

 

To start measurement from the beginning

Run the following command in an arbitrary directory and you will get several text files including the result of loss map measurement:

> python /opt/rtcds/caltech/c1/scripts/lossmap_scripts/lossmap.py [maximum shift in mm (PIT)] [maximum shift in mm (YAW)] [arm name (XorY)] [mirror name (E or I)]

Optionally, you can add "AUTO" at the end of the above command. Without "AUTO", you will be asked if the dithering has already settled down or not after each shift of the beam spot and you can let the scripts wait until the dithering settles down sufficiently. If you add "AUTO", it will be judged if the dithering has settled down or not according to some criteria, and the measurement will continue without your response to the terminal.

The files to be created in the current directory by the scripts are:

- lossmapETMX1-1.txt                                # [POX power (locked)] / [POX power (misaligned)]

- lossmapETMX1-2.txt                                # standard deviation of [POX power (locked)] / [POX power (misaligned)]

- lossmapETMX1-3.txt                                # TRX

- lossmapETMX1-1_converted.txt               # round trip loss (ppm) calculated from lossmapETMX1-1.txt

- lossmapETMX1-1_converted_sigma.txt     # standard deviation of round trip loss calculated from 1-1.txt and 1-2.txt

- lossmapETMX_result.txt                           # round trip loss and its error in a clear form.

The file names would be "lossmapITMY1-1.txt" etc., depending on which mirror you chose.
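As a quick check of what a run should leave behind, the file names listed above can be generated from the arm and mirror arguments. This is a minimal sketch; the function name `expected_files` is hypothetical and is not part of lossmap_scripts.

```python
def expected_files(arm, mirror):
    """Return the output file names listed above for one measurement.

    arm:    "X" or "Y"  (same meaning as the [arm name] argument)
    mirror: "E" or "I"  (same meaning as the [mirror name] argument)
    """
    arm, mirror = arm.upper(), mirror.upper()
    if arm not in ("X", "Y") or mirror not in ("E", "I"):
        raise ValueError("arm must be X or Y, mirror must be E or I")
    prefix = "lossmap{}TM{}".format(mirror, arm)  # e.g. lossmapETMX
    return [
        prefix + "1-1.txt",                  # POX or POY power ratio
        prefix + "1-2.txt",                  # its standard deviation
        prefix + "1-3.txt",                  # transmitted power
        prefix + "1-1_converted.txt",        # round trip loss (ppm)
        prefix + "1-1_converted_sigma.txt",  # std. dev. of the loss
        prefix + "_result.txt",              # loss and error together
    ]
```

For example, `expected_files("Y", "I")` starts with "lossmapITMY1-1.txt".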

 

To restart measurement from a certain point

Run the following command in a directory containing "lossmap(mirror name)1-1.txt", "lossmap(mirror name)1-2.txt" and "lossmap(mirror name)1-3.txt", which were created by a previous, incomplete measurement:

> python /opt/rtcds/caltech/c1/scripts/lossmap_scripts/lossmap.py [maximum shift in mm (PIT)] [maximum shift in mm (YAW)] [arm name (X or Y)] [mirror name (E or I)] [restart point (PIT)] [restart point (YAW)]

You can also add "AUTO".

How to designate the restart point:

Each matrix element in the output of this measurement is labeled by a pair of numbers, as shown below.

   (-1,-1)   -> (-1,-0.5)   -> (-1,0)   -> (-1,0.5)    -> (-1,1)
                                                            v
   (-0.5,1)  <- (-0.5,0.5)  <- (-0.5,0) <- (-0.5,-0.5) <- (-0.5,-1)
      v
   (0,-1)    -> (0,-0.5)    -> (0,0)    -> (0,0.5)     -> (0,1)
                                                            v
   (0.5,1)   <- (0.5,0.5)   <- (0.5,0)  <- (0.5,-0.5)  <- (0.5,-1)
      v
   (1,-1)    -> (1,-0.5)    -> (1,0)    -> (1,0.5)     -> (1,1)

Please give the two numbers that correspond to the matrix element you want to restart at. The arrows show the order of the measurements. For the correspondence between the matrix elements and the real positions on ETMY and ETMX, see elog 11818 and 11857, respectively.
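Read along the arrows, the labels in the grid come out row by row, with the second number running from -1 to 1 within each row. The sketch below reproduces that order and lists the points still to be measured from a given restart point. It is only my reading of the diagram, not the code in lossmap.py, and the function names are hypothetical. It assumes n is 3 or 5, so that the labels are exact in floating point.

```python
def grid_labels(n=5):
    """Labels from -1 to 1 in n equal steps (n=5 gives -1, -0.5, 0, 0.5, 1)."""
    step = 2.0 / (n - 1)
    return [-1.0 + i * step for i in range(n)]


def measurement_sequence(n=5):
    """Label pairs in the order obtained by following the arrows."""
    labels = grid_labels(n)
    return [(first, second) for first in labels for second in labels]


def remaining_points(restart_first, restart_second, n=5):
    """Label pairs from the restart point to the end of the sequence."""
    seq = measurement_sequence(n)
    start = seq.index((float(restart_first), float(restart_second)))
    return seq[start:]
```

For the 3x3 grid, `remaining_points(0, 0, n=3)` returns the five points from (0,0) to (1,1).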

This script will overwrite the files (~1-1.txt etc.), so it is safer to back up the files before you run it.
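One way to make that backup is sketched below. It copies the lossmap text files for one mirror into a time-stamped subdirectory. The function name and the backup directory name "lossmap_backup" are my assumptions; nothing like this exists in lossmap_scripts.

```python
import glob
import os
import shutil
import time


def backup_lossmap_files(mirror, src_dir=".", backup_root="lossmap_backup"):
    """Copy lossmap<mirror>1-*.txt from src_dir into a time-stamped folder.

    mirror is a name such as "ETMX" or "ITMY".
    Returns the list of backup file paths (empty if nothing matched).
    """
    pattern = os.path.join(src_dir, "lossmap{}1-*.txt".format(mirror))
    files = sorted(glob.glob(pattern))
    if not files:
        return []
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dest = os.path.join(src_dir, backup_root, stamp)
    os.makedirs(dest, exist_ok=True)
    copied = []
    for path in files:
        shutil.copy2(path, dest)  # copy2 also preserves modification times
        copied.append(os.path.join(dest, os.path.basename(path)))
    return copied
```

Calling `backup_lossmap_files("ETMX")` in the measurement directory before restarting leaves the originals in place and puts copies under lossmap_backup/.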

 

Some notes on the scripts and measurement

- Calibration has been done only for the ETMs. For the ITMs, the unit of [maximum shift] is not mm; instead, the value given as [maximum shift] is the maximum offset added just after demodulation in the ASS loop (e.g. C1:ASS-YARM_ITM_PIT_L_DEMOD_I_OFFSET).

- Before doing a measurement, check that the following parameters are correct.

POXzero (L47 in lossmapx.py and L52 in lossmapx_resume.py: the value of C1:LSC-POXDC_OUTPUT when no light is incident on POXPD)

POYzero (L45 in lossmapy.py and L50 in lossmapy_resume.py: the value of C1:LSC-POYDC_OUTPUT when no light is incident on POYPD)

mmr (L11 in lossmap_convert.py: (mode matching carrier power)/(total power))

Tf (L12 in lossmap_convert.py: transmittivity of ITM)

Tetm (L13 in lossmap_convert.py: transmittivity of ETM in ppm)

- If you change n (L50 in lossmap.py) from 5 to 3, the grid becomes 3x3 instead of the default 5x5. For 3x3, the matrix elements are labeled by

   (-1,-1) -> (-1,0) -> (-1,1)
                           v
   (0,1)   <- (0,0)  <- (0,-1)
      v
   (1,-1)  -> (1,0)  -> (1,1)

similarly to the case of 5x5.

- You can copy the directory lossmap_scripts anywhere in controls and use it. The scripts will work as long as all 7 scripts are in the same directory.

  11862   Tue Dec 8 15:18:29 2015 ericqUpdateComputer Scripts / ProgramsNodus security

I've done a couple things to try and make nodus a little more secure. Some have worried that nodus may be susceptible to being drafted into a botnet, slowing down our operations. 

1. I configured the ssh server settings to disallow logins as root. Ubuntu doesn't enable the root account by default anyways, but it doesn't hurt.

2. I installed fail2ban. Function: If some IP address fails to authenticate an ssh connection 3 times, it is banned from trying to connect for 10 minutes. This is mostly for thwarting mass brute force attacks. Looking at /var/log/auth.log doesn't indicate any of this kind of thing going on in the past week, at least.

3. I set up and enabled ufw (uncomplicated firewall) to only allow incoming traffic for:

  • ssh
  • ELOG
  • Nodus apache stuff (svn, wikis, etc.)

I don't think there are any other ports we need open, but I could be wrong. Let me know if I broke something you need!
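The check of /var/log/auth.log mentioned in item 2 can be sketched as a count of failed ssh password attempts per source address. This assumes the standard OpenSSH "Failed password for ... from ADDRESS port N" message; other failure types are not counted, and unlike fail2ban the sketch ignores the time window in which the failures occur. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2"
FAILED = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+) port \d+")


def failed_logins_by_ip(lines):
    """Count failed password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def offenders(lines, max_retry=3):
    """Addresses with at least max_retry failures (3 is the ban threshold above)."""
    return {ip: n for ip, n in failed_logins_by_ip(lines).items() if n >= max_retry}
```

Typical use would be `with open("/var/log/auth.log") as f: print(offenders(f))`, run as a user allowed to read the log.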

  11869   Wed Dec 9 23:16:13 2015 ranaUpdateComputer Scripts / ProgramsNodus security

We also need NDS2 and the usual ports open so that we can use optimus as a COMSOL server.

Quote:

 

I don't think there are any other ports we need open, but I could be wrong. Let me know if I broke something you need!

 

  11899   Wed Dec 23 03:27:04 2015 ranaUpdateComputer Scripts / ProgramsLHO EPICS slow down

https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=24321

This LHO log indicates that EPICS slow down could be due to NFS activity. Could we make some trend of NFS activity on Chiara and then see if it correlates with EPICS flatlines?

I wonder whether the frequency of our EPICS issues is correlated with the Chiara install.
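A first pass at that comparison could look like the sketch below: find stretches where an EPICS channel does not change (flatlines), then compare the mean NFS activity inside and outside those stretches. It assumes both series have already been fetched and sampled on the same time grid. Which EPICS channel and which NFS metric to use is left open, and the function names are hypothetical.

```python
def flatline_spans(values, min_len=5):
    """Return (start, stop) index pairs where values stay constant
    for at least min_len consecutive samples (stop is exclusive)."""
    spans = []
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            if i - start >= min_len:
                spans.append((start, i))
            start = i
    return spans


def mean_inside_outside(activity, spans):
    """Mean of activity inside the spans and outside them."""
    flagged = {i for start, stop in spans for i in range(start, stop)}
    inside = [a for i, a in enumerate(activity) if i in flagged]
    outside = [a for i, a in enumerate(activity) if i not in flagged]

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(inside), mean(outside)
```

A clearly higher mean inside the flatlines than outside would be a hint worth following up; it would not by itself show that NFS causes the slowdown.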

  11905   Mon Jan 4 14:45:41 2016 rana, eq, kojiConfigurationComputer Scripts / Programsnodus pwd change

We changed the password for controls on nodus this afternoon. We also zeroed out the authorized_keys file and then added back in the couple that we want in there for automatic backups / detchar.

Also did the recommended Ubuntu updates on there. Everything seems to be going OK so far. We think nothing on the interferometer side cares about the nodus password.

We also decided to dis-allow personal laptops on the new Martian router (to be installed soon).

  12244   Tue Jul 5 18:44:39 2016 PrafulUpdateComputer Scripts / ProgramsWorking 40m Summary Pages

After hardware errors prevented me from using optimus, I switched my generation of summary pages back to the clusters. A day's worth of data is still too much to process using one computer, but I have successfully made summary pages for timescales of a couple of hours on this site: https://ldas-jobs.ligo.caltech.edu/~praful.vasireddy/

 

Currently, I'm working on learning the current plot-generation code so that it can eventually be modified to include an interactive component (e.g., hovering over a point on a timeseries would display the GPS time). Also, the 40m summary pages have been down for the past 3 weeks but should be up and working soon as the clusters are now alive.

  12252   Wed Jul 6 11:02:41 2016 PrafulUpdateComputer Scripts / ProgramsVMon Tab on Summary Pages

I've added a new tab for VMon under the SUS parent tab. I'm still working out the scale and units, but let me know if you think this is a useful addition. Here's a link to my summary page that has this tab: https://ldas-jobs.ligo.caltech.edu/~praful.vasireddy/1151193617-1151193917/sus/vmon/


I'll have another tab with VMon BLRMS up soon.

Also, the main summary pages should be back online soon after Max fixed a bug. I'll try to add the SUS/VMon tab to the main pages as well.
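The 1151193617-1151193917 in the link above looks like a GPS start-stop pair. A minimal sketch for turning such a pair into UTC times follows. It hard-codes the GPS-UTC offset of 17 s, which holds only from 2015-07-01 through 2016-12-31; a proper conversion needs a leap second table. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_MINUS_UTC_S = 17  # assumption: valid 2015-07-01 through 2016-12-31 only


def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a UTC datetime using the fixed offset above."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_MINUS_UTC_S)


def parse_gps_span(name):
    """Split a 'START-STOP' directory name into two UTC datetimes."""
    start, stop = (int(part) for part in name.split("-"))
    return gps_to_utc(start), gps_to_utc(stop)
```

With this offset, 1151193617 maps to 2016-06-29 00:00:00 UTC, and the span in the link is 300 s long.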

  12254   Wed Jul 6 17:17:22 2016 PrafulUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

The main C1 summary pages are back online now thanks to Max and Duncan, with a gap in pages from June 8th to July 4th. Also, I've added my new VMon and Sensors tabs to the SUS parent tab on the main pages. These new tabs are now up and running on the July 7th summary page.

Here's a link to the main nodus pages with the new tabs: https://nodus.ligo.caltech.edu:30889/detcharsummary/day/20160707/sus/vmon/

And another to my ldas page with the tabs implemented: https://ldas-jobs.ligo.caltech.edu/~praful.vasireddy/1150848017-1150848317/sus/vmon/

Let me know if you have any suggestions or see anything wrong with these additions. I'm still working on getting the scales right for all graphs.

  12257   Wed Jul 6 21:05:36 2016 KojiUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

I started to receive emails from cron every 15min. Is the email related to this? And is it normal? I never received these cron emails before when the sum-page was running.

Attachment 1: cron_mail.txt.zip
  12258   Wed Jul 6 21:09:09 2016 not KojiUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

I don't know much about how the cron job runs, I'll forward this to Max.

Quote:

I started to receive emails from cron every 15min. Is the email related to this? And is it normal? I never received these cron emails before when the sum-page was running.

Max says it should be fixed now. Have the emails stopped?

  12259   Wed Jul 6 21:16:17 2016 Max IsiUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

This should be fixed now—apologies for the spam.

Quote:

I don't know much about how the cron job runs, I'll forward this to Max.

Quote:

I started to receive emails from cron every 15min. Is the email related to this? And is it normal? I never received these cron emails before when the sum-page was running.

 

 

  12260   Wed Jul 6 21:50:21 2016 KojiUpdateComputer Scripts / ProgramsNew Tabs and Working Summary Pages

It seemed something has been done. And I got cron emails.
Then, it seemed something has been done. And the emails stopped.

  12277   Fri Jul 8 19:33:16 2016 PrafulUpdateComputer Scripts / ProgramsMEDM Tab on Summary Pages

A new MEDM tab has been added to the summary pages (https://nodus.ligo.caltech.edu:30889/detcharsummary/day/20160708/medm/), although some of the screens are not updated when /cvs/cds/projects/statScreen/cronjob.sh is run. In /cvs/cds/projects/statScreen/log.txt, the following error is given for those files: import: unable to read X window image `0x20011f': Resource temporarily unavailable @ error/xwindow.c/XImportImage/5027. If anyone has seen this error before or knows how to fix it, please let me know.

In the meantime, I'll be working on creating an archive of MEDM screens for every hour to be displayed on the summary pages.
