  40m Log, Page 276 of 335
ID   Date   Author   Type   Category   Subject
  13112   Tue Jul 11 15:12:57 2017   Koji   Update   General   All FEs down

If we have a SATA/USB adapter, we can test whether the disk is still responding or not. If it is still responding, we can probably salvage the files.
Chiara used to have a 2.5" disk that was connected via USB3. As far as I know, we have remote and local backup scripts running (TBC), so we can borrow the USB/SATA interface from Chiara.

If the disk is completely gone, we need to rebuild it according to Jamie, and I don't know how to do that. (Don't we have any spare copy?)

  13113   Wed Jul 12 10:21:07 2017   gautam   Update   General   All FEs down

Seems like the connector on this particular disk is of the SAS variety (and not SATA). I'll ask Steve to order a SAS to USB cable. In the meantime I'm going to see if the people at Downs have something we can borrow.

Quote:

If we have a SATA/USB adapter, we can test whether the disk is still responding or not. If it is still responding, we can probably salvage the files.
Chiara used to have a 2.5" disk that was connected via USB3. As far as I know, we have remote and local backup scripts running (TBC), so we can borrow the USB/SATA interface from Chiara.

If the disk is completely gone, we need to rebuild it according to Jamie, and I don't know how to do that. (Don't we have any spare copy?)

 

  13114   Wed Jul 12 14:46:09 2017   gautam   Update   General   All FEs down

I couldn't find an external docking setup for this SAS disk, seems like we need an actual controller in order to interface with it. Mike Pedraza in Downs had such a unit, so I took the disk over to him, but he wasn't able to interface with it in any way that allows us to get the data out. He wants to try switching out the logic board, for which we need an identical disk. We have only one such spare at the 40m that I could locate, but it is not clear to me whether this has any important data on it or not. It has "hda RTLinux" written on its front panel with a sharpie. Mike thinks we can back this up to another disk before trying anything, but he is going to try locating a spare in Downs first. If he is unsuccessful, I will take the spare from the 40m to him tomorrow, first to be backed up, and then for swapping out the logic board.

Chatting with Jamie and Koji, it looks like the options we have are:

  1. Get the data from the old disk, copy it to a working one, and try and revert the original FB machine to its last working state. This assumes we can somehow transfer all the data from the old disk to a working one.
  2. Prepare a fresh boot disk, load the old FB daqd code (which is backed up on Chiara) onto it, and try and get that working. But Jamie isn't very optimistic of this working, because of possible conflicts between the code and any current OS we would install.
  3. Get FB1 working. Jamie is looking into this right now.
Quote:

Seems like the connector on this particular disk is of the SAS variety (and not SATA). I'll ask Steve to order a SAS to USB cable. In the meantime I'm going to see if the people at Downs have something we can borrow.

 

 

  13115   Wed Jul 12 14:52:32 2017   jamie   Update   General   All FEs down

I just want to mention that the situation is actually much more dire than we originally thought.  The diskless NFS root filesystem for all the front-ends was on that fb disk.  If we can't recover it, we'll have to rebuild the front end OS as well.

As of right now none of the front ends are accessible, since obviously their root filesystem has disappeared.

  13117   Fri Jul 14 17:47:03 2017   gautam   Update   General   Disks from LLO have arrived

[jamie, gautam]

Today morning, the disks from LLO arrived. Jamie and I have been trying to get things back up and running, but have not had much success today. Here is a summary of what we tried.

Keith Thorne sent us two disks: one has the daqd code and the second is the boot disk for the FE machines. Since Jamie managed to successfully compile the daqd code on FB1 yesterday, we decided to try the following: mount the boot disk KT sent us (using a SATA/USB adapter) on /mnt on FB1, get the FEs booted up, and restart the RT models. 

Quote:

I just want to mention that the situation is actually much more dire than we originally thought.  The diskless NFS root filesystem for all the front-ends was on that fb disk.  If we can't recover it, we'll have to rebuild the front end OS as well.

As of right now none of the front ends are accessible, since obviously their root filesystem has disappeared.

While working on FB1, Jamie realized he actually had a copy of the /diskless/root directory, which is the NFS root filesystem for the FEs, on that machine. So we decided to try and boot some of the FEs with this (instead of starting from scratch with the disks KT sent us). The way things were set up, the FEs were querying the FB machine as the DHCP server. But today, we followed the instructions here to have the FEs get their IP address from chiara instead. We also added the line 

/diskless/root *(sync,rw,no_root_squash,no_all_squash,no_subtree_check)

to /etc/exports, followed by exportfs -ra, on FB1. At that point the FE machine we were testing (c1lsc) was able to boot up. 

However, it looks like the NFS filesystem isn't being mounted correctly, for reasons unknown. We commented out some of the rtcds related lines in /etc/rc.local because they were causing a whole bunch of errors at boot (the lines that were touched have been tagged with today's date).


So in summary, the status as of now is:

  1. Front-end machines are able to boot
  2. There seems to be some problem during the boot process, leading to the NFS file system not being correctly mounted. The closest related thing I could find from an elog search is this entry, but I think we are facing a different problem.
  3. We wanted to see if we could start the realtime models (but without daqd for now), but we weren't even able to get that far today.

We will resume recovery efforts on Monday.

  13118   Sat Jul 15 01:28:53 2017   jigyasa   Update   Cameras   BRDF Calibrations

This evening, Gautam helped me with setting up the apparatus for calibrating the GigE for BRDF measurements.
The SP table was chosen to set up the experiment and for this reason a few things including a laser and power meter (presumably set up by Steve) had to be moved around.

We initially started by setting up the Crysta laser with its power source (Crysta #2, 150-190 mW 1064 laser) on the SP table. The Ophir power meter was used to measure the laser power. We discovered that the laser was highly unstable as its output on the power meter fluctuated (kind of periodically) between 40 and 150 mW. The beam spot on the beam card also appeared to validate this change in intensity. So we decided to use another 1064 nm laser instead.
Gautam got the LightWave NPRO laser from the PSL table and set it up on the SP table; with this laser, the output as measured by the same power meter was quite stable.

We manually adjusted the power to around 150 mW. This was followed by setting up the half-wave plate (HWP) with the polarizing beam splitter (PBS), which was very gently and precisely done by Gautam while he explained to me how to handle the optics.
On first installing the PBS, we found that the beam was already quite strongly polarized, as there seemed to be zero transmission but a strong reflection.
With the HWP in place, we gain control over the transmitted intensity. The reflected beam is directed to a beam dump.
I have taken down the GigE (+mount) at ETMX and wired up a spare PoE injector.
We tried to interface with the camera wirelessly through the wireless network extenders, but that gives an unstable connection to the GigE: a single shot works okay, but a continuous shot did not succeed.

The GigE was connected to the Martian via Ethernet cable and images were observed using a continuous shot on the Pylon Viewer App on Paola. 

We deliberated over the need for a beam expander, but it has been omitted for now. White printer paper is currently being used to model the Lambertian scatterer. Light scattered off the paper was observed at a distance of about 40 cm from the sample.
While proceeding further with the calibrations tonight, we ran into a few challenges.

While the CCD is able to observe the beam spot perfectly well, measuring the actual power with the power meter seems to be tricky. As the scattered power is quite low, we can't actually see any spot using a beam card, and hence can't really tell whether we are capturing the entire beam spot on the active region of the power meter (placed at a distance of ~40 cm from the paper) or losing some light, all while ensuring that the power meter and the CCD are in the same plane.

We tried to think of some ways around that, the description of which will follow. Any ideas would be greatly appreciated.

Thanks a ton for all your patience and help Gautam! :) 

More to follow.. 

  13119   Sat Jul 15 13:40:59 2017   rana   Update   Cameras   BRDF Calibrations

Power meter only needed to measure power going into the paper not out. We use the BRDF of paper to estimate the power going out given the power going in.

  13120   Sat Jul 15 16:19:00 2017   gautam   Update   Cameras   Makeshift PyPylon

Some days ago, I stumbled upon this github page, by a grad student at KIT who developed this code while working with Basler GigE cameras. Since we are having trouble installing SnapPy, I figured I'd give this package a try. Installation was very easy (it took me ~10 mins), and while there isn't great documentation, basic use is straightforward - for instance, I was able to adjust the exposure time and capture an image, all from Pianosa. The attached is some kind of in-built function rendering of the captured image - it is a piece of paper with some scribbles on it near Jigyasa's BRDF measurement setup on the SP table, but it should be straightforward to export the images in any format we like. I believe the axes are pixel indices.

Of course this is only a temporary solution as I don't know if this package will be amenable to interfacing with EPICS servers etc, but seems like a useful tool to have while we figure out how to get SnapPy working. For instance, the HDR image capture routine can now be written entirely as a Python script, and executed via an MEDM button or something.

A rudimentary example file can be found at /opt/rtcds/caltech/c1/scripts/GigE/PyPylon/examples - some of the dictionary keywords to access various properties of the camera (e.g. Exposure time) are different, but these are easy enough to figure out.
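
For reference, basic use looks roughly like the sketch below. This is a hedged example based on my recollection of the KIT PyPylon README (not the attached example file, and not the later official Basler pypylon package, whose API is different); the property key name is a camera-dependent placeholder, as noted above.

# Rough PyPylon usage sketch -- property keys like 'ExposureTimeRaw' are
# placeholders and differ between camera models.
import pypylon
import numpy as np

available = pypylon.factory.find_devices()        # enumerate GigE cameras
print('Found devices:', available)

cam = pypylon.factory.create_device(available[0])
cam.open()

cam.properties['ExposureTimeRaw'] = 2000          # placeholder exposure setting

image = next(cam.grab_images(1))                  # grab one frame as a numpy array
np.save('pypylon_test.npy', image)
print('Image shape:', image.shape, 'max count:', image.max())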

 

Attachment 1: pyPylon_test.png
  13121   Sun Jul 16 11:58:36 2017   jigyasa   Update   Cameras   BRDF Calibrations

 

From what I understood from my reading [Large-angle scattered light measurements for quantum-noise filter cavity design studies (see https://arxiv.org/abs/1204.2528)], we do the white paper test in order to calibrate for the radiometric response, i.e. the response of the CCD sensor to radiance: "We convert the image counts measured by the CCD camera into a calibrated measure of scatter. To do this we measure the scattered light from a diffusing sample twice, once with the CCD camera and once with a calibrated power meter. We then compare their readings."

But thinking about this further, if we assume that the BRDF remains unscaled and estimate the scattered power from the images, we get a calibration factor for the scattered power and the angle dependence of the scattered power!

Quote:

Power meter only needed to measure power going into the paper not out. We use the BRDF of paper to estimate the power going out given the power going in.

 

  13122   Sun Jul 16 12:09:47 2017   jigyasa   Update   Cameras   BRDF Calibrations

With this idea in mind, we can now take images of the illuminated paper at different scattering angles and assume the BRDF takes the constant Lambertian value of 1/pi per steradian.

The scattered power is then Ps = BRDF * Pi * cosθ * Ω, where Pi is the incident power, Ω is the solid angle subtended by the camera and θ is the scattering angle at which the measurement is taken. This must also equal the sum of pixel counts divided by the exposure time, multiplied by some calibration factor.

From these two equations we can obtain the calibration factor of the CCD. For further BRDF measurements, we then scale the pixel count / exposure time by this calibration factor.  
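
As a sanity check of the arithmetic, here is a minimal sketch of that calibration with made-up numbers; the powers, distances, aperture and counts below are placeholders, not measured values.

# Sketch of the CCD calibration described above (placeholder numbers).
# Equating the power predicted from the assumed Lambertian BRDF with
# (pixel sum / exposure time) * k gives the calibration factor k.
import numpy as np

P_i    = 0.150            # incident power on the paper [W] (placeholder)
theta  = np.deg2rad(30)   # scattering angle of the camera [rad]
r      = 0.40             # paper-to-camera distance [m]
d_lens = 0.025            # camera aperture diameter [m] (placeholder)
Omega  = np.pi * (d_lens / 2)**2 / r**2     # solid angle of the lens [sr]

BRDF = 1 / np.pi          # ideal Lambertian scatterer [1/sr]
P_s  = BRDF * P_i * np.cos(theta) * Omega   # power collected by the camera [W]

pixel_sum = 2.3e7         # summed counts in the image (placeholder)
t_exp     = 0.5e-3        # exposure time [s] (placeholder)

k = P_s / (pixel_sum / t_exp)               # calibration factor [W / (ct/s)]
print('Calibration factor: %.3e W/(ct/s)' % k)

# Later BRDF measurements then invert this:
# BRDF_sample = k * (pixel_sum / t_exp) / (P_i * np.cos(theta) * Omega)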

Quote:

 

From what I understood from my reading [Large-angle scattered light measurements for quantum-noise filter cavity design studies (see https://arxiv.org/abs/1204.2528)], we do the white paper test in order to calibrate for the radiometric response, i.e. the response of the CCD sensor to radiance: "We convert the image counts measured by the CCD camera into a calibrated measure of scatter. To do this we measure the scattered light from a diffusing sample twice, once with the CCD camera and once with a calibrated power meter. We then compare their readings."

But thinking about this further, if we assume that the BRDF remains unscaled and estimate the scattered power from the images, we get a calibration factor for the scattered power and the angle dependence of the scattered power!

Quote:

Power meter only needed to measure power going into the paper not out. We use the BRDF of paper to estimate the power going out given the power going in.

 

 

  13123   Mon Jul 17 16:22:01 2017   Steve   Update   SUS   ruby wire standoff pictures

Bluebean Optical Tech Limited of Shanghai delivered 50 red ruby prisms with a radiused edge.  The first prism pictures were taken on June 5

and the pictures were retaken later, with that prism labeled as BB#1

More samples were selected randomly, one from each bag of 5, and labeled BB#2 through BB#6.

The R10 mm radius can be seen against the ruler edge.  The V-groove edge was labeled with a blue marker and pictures were taken

from both sides of this ridge. The top view is shown with the wire lying across it.

SOS suspension wire of 43 micron OD was used for scale calibration; it was placed close to the side that was in focus.

The V-groove ridge surface quality was evaluated on a scale of 1 – 10, with 10 being the most positive.

 BB# Edge quality score
1 4
2 8
3 3
4 9.5
5 2
6 9

Remaining thing to examine: take a picture, from the side, of the ridge that contacts the SOS wire.

Attachment 1: contacting_ridge.bmp
Attachment 2: contacting_ridge.png
  13124   Wed Jul 19 00:59:47 2017   gautam   Update   General   FINESSE model of DRMI (no arms)

Summary:

I've been working on improving the 40m FINESSE model I set up sometime last year (where the goal was to model various RC folding mirror scenarios). Specifically, I wanted to get the locking feature of FINESSE working, and also simulate the DRMI (no arms) configuration, which is what I have been working on locking the real IFO to. This elog is a summary of what I have from the last few days of working on this.

Model details:

  • No IMC included for now.
  • Core optics R and T from the 40m wiki page.
  • Cavity lengths are the "ideal" ones - see the attached ipynb for the values used.
  • RF modulation depths from here. But for now, the relative phase between f1 and f2 at the EOM is set to 0.
  • I've not included flipped folding mirrors - instead, I put a loss of 0.5% on PR3 and SR3 in the model to account for the AR surface of these optics being inside the RCs. 
  • I've made the AR surfaces of all optics FINESSE "beamsplitters" - there was some discussion on the FINESSE mattermost channel about how not doing this can lead to slightly inaccurate results, so I've tried to be more careful in this respect.
  • I'm using "maxtem 1" in my FINESSE file, which means TEM_mn modes up to (m+n=1) are taken into account - setting this to 0 makes it a plane wave model. This parameter can significantly increase the computational time. 

Model validation:

  • As a first check, I made the PRM and SRM transparent, and used the in-built routines in FINESSE to mode-match the input beam to the arm cavities.
  • I then scanned one arm cavity about a resonance, and compared the transmission profile to the analytical FP cavity expression - agreement was good.
  • Next, I wanted to get a sensing matrix for the DRMI (no arms) configuration (see attached ipynb notebook).
    • First, I make the ETMs in the model transparent
    • I started with the phases for the BS, PRM and SRM set to their "naive" values of 0, 0 and 90 (for the standard DRMI configuration)
    • I then scanned these optics around, used various PDs to look at the points where appropriate circulating fields reached their maximum values, and updated the phase of the optic with these values.
    • Next, I set the demod phase of various RFPDs such that the PDH error signal is entirely in one quadrature. I use the RFPDs in pairs, with demod phases separated by 90 degrees. I arbitrarily set the demod phase of the Q phase PD as 90 + phase of I phase PD. I also tried to mimic the RFPD-IFO DoF pairing that we use for the actual IFO - so for example, PRCL is controlled by REFL11_I.
    • Confident that I was close enough to the ideal operating point, I then fed the error signals from these RFPDs to the "lock" routine in FINESSE. The manual recommends setting the locking loop gain to 1/optical gain, which is what I did.
    • The tunings for the BS and RMs in the attached kat file are the result of this tuning.
    • For the actual sensing matrix, I moved each of PRM, BS and SRM +/-5 degrees (~15nm) around each resonance. I then computed the numerical derivative around the zero crossing of each RFPD signal, and then plotted all of this in some RADAR plots - see Attachment #1.

Explanation of Attachments and Discussion:

  • Attachment #1 - Computed sensing matrix from this model. Compare to an actual measurement, for example here - the relative angles between the sensing matrix elements don't exactly line up with what is measured. EQ suggested today that I should look into tuning the relative phase between the RF frequencies at the EOM. Nevertheless, I tried comparing the magnitudes of the MICH sensing element in AS55 Q - the model tells me that it should be ~7.8*10^5 W/m. In this elog, I measured it to be 2.37*10^5 W/m. On the AS table, there is a 50-50 BS splitting the light between the AS55 and AS110 photodiodes which is not accounted for in the model. Factoring this in, along with the fact that there are 6 in-vacuum steering mirrors (assume 98% reflectivity for these), 3 in-air steering mirrors, and the window, the sensing matrix element from the model starts to be in the same ballpark as the measurement, at ~3*10^5 W/m. So the model isn't giving completely crazy results.
  • Attachment #2 - Example of the signals at various RFPDs in response to sweeping the PRM around its resonance. To be compared with actual IFO data. Teal lines are the "I" phase, and orange lines are "Q" phase.
  • Attachment #3 - FINESSE kat file and the IPython notebook I used to make these plots. 
  • Next steps
    • More validation against measurements from the actual IFO.
    • Try and resolve differences between modeled and measured sensing matrices.
    • Get locking working with full IFO - there was a discussion on the mattermost thread about sequential/parallel locking some time ago, I need to dig that up to see what is the right way to get this going. Probably the DRMI operating point will also change, because of the complex reflectivities of the arm cavities seen by the RF sidebands (this effect is not present in the current configuration where I've made the ETMs transparent).

GV Edit: EQ pointed out that my method of taking the slope of the error signal to compute the sensing element isn't the most robust - it relies on choosing points to compute the slope that are close enough to the zero crossing and also well within the linear region of the error signal. Instead, FINESSE allows this computation to be done as we do in the real IFO - apply an excitation at a given frequency to an optic and look at the twice-demodulated output of the relevant RFPD (e.g. for PRCL sensing element in the 1f DRMI configuration, drive PRM and demodulate REFL11 at 11MHz and the drive frequency). Attachment #4 is the sensing matrix recomputed in this way - in this case, it produces almost identical results as the slope method, but I think the double-demod technique is better in that you don't have to worry about selecting points for computing the slope etc. 
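
For what it's worth, the double-demod measurement looks schematically like the following pykat fragment. This is a rough sketch and not the attached notebook; the component/node names (PRM, nREFL), the kat file name, and the fsig/pd2 argument conventions are assumptions that should be checked against the Finesse manual.

# Hypothetical pykat sketch of the dithered sensing-element measurement --
# all names and normalizations are placeholders.
from pykat import finesse

kat = finesse.kat()
kat.parseCommands(open('40m_DRMI.kat').read())   # placeholder base model

f_dither = 10.0                                  # drive frequency in Hz
kat.parseCommands("""
fsig drive PRM {f} 0
pd2 PRCL_sens 11M 0 {f} nREFL
xaxis drive f lin {f} {f} 1
yaxis abs:deg
""".format(f=f_dither))

out = kat.run()
# The pd2 output is the REFL11_I response to PRM motion at the drive
# frequency, i.e. (up to normalization) the PRCL sensing element.
print('PRCL element in REFL11_I: %.3e' % abs(out['PRCL_sens'][0]))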

 

Attachment 1: DRMI_sensingMat.pdf
Attachment 2: DRMI_errSigs.pdf
Attachment 3: 40m_DRMI_FINESSE.zip
Attachment 4: DRMI_sensingMat_19Jul.pdf
  13125   Wed Jul 19 08:37:21 2017   Jamie   Update   CDS   Update on front-end/DAQ rebuild

After the catastrophic fb disk failure last week we lost essentially the entire front end system (not any of the userapp code, but the front end boot server, operating system, and DAQ).  The fb disk was entirely unrecoverable, so we've been trying to rebuild everything from the bits and pieces lying around, and some disks that Keith Thorne sent from LLO.  We're trying to get the front ends working first, and will work on recovering daqd after.

Luckily, fb1, which was being configured as an fb replacement, is mostly fully configured, including having a copy of the front end diskless root image.  We set up fb1 as the new boot server, and were able to get front ends booting again.  Unfortunately, we've been having trouble running and building models, so something is still amiss.  We've been taking a three-pronged approach to getting the front ends running:

  • /diskless/root.fb: This involves booting the front ends from the backup of the diskless root from fb.  Runs gentoo kernel 2.6.34.1.  This should correspond to the environment that all models were built and running against.  But something is missing in the configuration.  The front ends were also mounting /opt from fb, which included the dolphin drivers, and we don't have a copy of that, so models aren't loading or recompiling.
  • /diskless/root.x1boot: Keith sent a disk image of the entire x1boot server from LLO.  It uses gentoo kernel 3.0.8.  This ostensibly includes everything we should need to run the front ends, but it's unfortunately configured with newer versions of some of the software and also isn't loading our existing models or building new ones.  This also seems to be having issues with the dolphin drivers.
  • /diskless/root.jessie: This is an entirely new boot image built from scratch with Debian jessie, using an RTS-patched 3.2 kernel.  This would use the latest versions of everything.  It's mostly working; we just need to rebuild the dolphin driver and source.

It seems that in all cases we need to rebuild the dolphin drivers from source.

  13127   Wed Jul 19 14:26:50 2017   Jamie   Update   CDS   Update on front-end/DAQ rebuild

 

Quote:

After the catastrophic fb disk failure last week we lost essentially the entire front end system (not any of the userapp code, but the front end boot server, operating system, and DAQ).  The fb disk was entirely unrecoverable, so we've been trying to rebuild everything from the bits and pieces lying around, and some disks that Keith Thorne sent from LLO.  We're trying to get the front ends working first, and will work on recovering daqd after.

Luckily, fb1, which was being configured as an fb replacement, is mostly fully configured, including having a copy of the front end diskless root image.  We set up fb1 as the new boot server, and were able to get front ends booting again.  Unfortunately, we've been having trouble running and building models, so something is still amiss.  We've been taking a three-pronged approach to getting the front ends running:

  • /diskless/root.fb: This involves booting the front ends from the backup of the diskless root from fb.  Runs gentoo kernel 2.6.34.1.  This should correspond to the environment that all models were built and running against.  But something is missing in the configuration.  The front ends were also mounting /opt from fb, which included the dolphin drivers, and we don't have a copy of that, so models aren't loading or recompiling.
  • /diskless/root.x1boot: Keith sent a disk image of the entire x1boot server from LLO.  It uses gentoo kernel 3.0.8.  This ostensibly includes everything we should need to run the front ends, but it's unfortunately configured with newer versions of some of the software and also isn't loading our existing models or building new ones.  This also seems to be having issues with the dolphin drivers.
  • /diskless/root.jessie: This is an entirely new boot image built from scratch with Debian jessie, using an RTS-patched 3.2 kernel.  This would use the latest versions of everything.  It's mostly working; we just need to rebuild the dolphin driver and source.

It seems that in all cases we need to rebuild the dolphin drivers from source.

To clarify, we're able to boot the x1boot image with the existing 2.6.25 kernel that we have from fb.  The issue with the root.x1boot image is not the kernel version but some of the other support libraries, such as dolphin.

  13130   Fri Jul 21 18:03:17 2017   Jamie   Update   CDS   Update on front-end/DAQ rebuild

Update:

  • front ends booting with the new Debian jessie diskless root image and a linux 3.2 version of the RTS-patched kernel
  • dolphin is configured correctly and running on c1lsc and c1sus
  • models building and running with RCG 3.0.3

Up next:

  • add c1ioo to the dolphin network
  • recompile/restart all front end models
  • daqd

I'll try to get the first two of those done tomorrow, although it's unclear what model updates we'll have to do to get things working with the newer RCG.

 

  13133   Sun Jul 23 22:16:55 2017   Jamie, gautam   Update   CDS   front-end now running with new OS, RCG

All front ends and models are (mostly) running now

All suspensions are damped:

It should be possible at this point to do more recovery, like locking the MC.

Some details on the restore process:

  • all models were recompiled with the new RCG version 3.0.3
  • the new RCG does stricter simulink drawing checks, and was complaining about unterminated outputs in some of the SUS models.  Terminated all outputs it was concerned about and saved.
  • RCG 3.0 requires a new directory for doing better filter module diagnostics: /opt/rtcds/caltech/c1/chans/tmp
  • had to reset the slow machines c1susaux, c1auxex, c1auxey

The daqd is not yet running.  This is the next task.

I have been taking copious notes and will fully document the restore process once complete.

c1ioo issues

c1ioo has been giving us a little bit of trouble.  The c1ioo model kept crashing and taking down the whole c1ioo host.  We found a red light on one of the ADCs (ADC1).  We pulled the card and replaced it with a spare from the CDS cabinet.  That seemed to fix the problem and c1ioo became more stable.

We've still been seeing a lot of glitching in c1ioo, though, with CPU cycle times frequently (every couple of seconds) running above threshold for all models, up to 200 us.  I tried unloading every kernel module I could and shutting down every non-critical process, but nothing seemed to help.

We eventually tried stopping the c1ioo model altogether and that seemed to help quite a bit, dropping the long cycle rate down to something like one every 30 seconds or so.  Not sure what that means.  We should look into the BIOS again, to see if there could be something interacting with the newer kernel.

So currently the c1ioo model is not running (which is why it's all white in the CDS overview snapshot above).  The fact that c1ioo is not running and the remaining models are still occasionally glitching is also causing various IPC errors on auxiliary models (see c1mcs, c1rfm, c1ass, c1asx). 

RCG compile warnings

the new RCG tries to do more checks on custom c code, but it seems to be having trouble finding our custom "ccodeio.h" files that live with the c definitions in USERAPPS/*/common/src/.  Unclear why yet.  This is causing the RCG to spit out warnings like the following:

Cannot verify the number of ins/outs for C function BLRMS.
    File is /opt/rtcds/userapps/release/cds/c1/src/BLRMSFILTER.c
    Please add file and function to CDS_SRC or CDS_IFO_SRC ccodeio.h file.

These are just warnings and will not prevent the model from compiling or running.  We'll figure out what the problem is to make these go away, but they can be ignored for the time being.

model unload instability

Probably the worst problem we're facing right now is an instability that will occasionally, but not always, cause the entire front end host to freeze up upon unloading an RTS kernel module.  This is a known issue with the newer linux kernels (we're using kernel version 3.2.35), and is being looked into.

This is particularly annoying with the machines on the dolphin network, since if one of the dolphin hosts goes down it manages to crash all the models reading from the dolphin network.  Since half the time they can't be cleanly restarted, this tends to cause a boot fest with c1sus, c1lsc, and c1ioo.  If this happens, just restart those machines, wait till they've all fully booted, then restart all the models on all hosts with "rtcds start all".

  13135   Mon Jul 24 10:45:23 2017   gautam   Update   CDS   c1iscex models died

This morning, all the c1iscex models were dead. Attachment #1 shows the state of the cds overview screen when I came in. The machine itself was ssh-able, so I just restarted all the models and they came back online without fuss.

Quote:

All front ends and models are (mostly) running now

Attachment 1: c1iscexFailure.png
  13136   Mon Jul 24 10:59:08 2017   Jamie   Update   CDS   c1iscex models died
Quote:

This morning, all the c1iscex models were dead. Attachment #1 shows the state of the cds overview screen when I came in. The machine itself was ssh-able, so I just restarted all the models and they came back online without fuss.

This was me.  I had rebooted that machine and hadn't restarted the models.  Sorry for the confusion.

  13137   Mon Jul 24 12:00:21 2017   gautam   Update   PSL   PSL NPRO mysteriously shut off

Summary:

At around 10:30 AM this morning, the PSL mysteriously shut off. Steve and I confirmed that the NPRO controller had the RED "OFF" LED lit up. It is unknown why this happened. We manually turned the NPRO back on and the PMC has been stably locked for the last hour or so.

Details:

There are so many changes to lab hardware/software that have been happening recently, it's not entirely clear to me what exactly was the problem here. But here are the observations:

  1. Yesterday, when I came into the lab, the MC REFL trace on the wall StripTool was 0 for the full 8 hour history - since we don't have data records, I can't go back further than this. I remember the PMC TRANS and REFL cameras looked normal, but there was no MC REFL spot on the CCD monitors. This is consistent with the PSL operating normally, the PMC being locked, and the PSL shutter being closed. Isn't the emergency vacuum interlock also responsible for automatically closing the PSL shutter? Perhaps if the turbo controller failure happened prior to Jamie/me coming in yesterday, maybe this was just the interlock doing its job. On Friday evening, the PSL shutter was certainly open and the MC REFL spot was visible on the camera. I also confirmed with Jamie that he didn't close the shutter.
  2. Attachment #1 shows the wall StripTool traces from earlier this morning. It looks like at ~7:40AM, the MC REFL level went back up. Steve says he didn't manually open the shutter, and in any case, this was before the turbo pump controller failure was diagnosed. So why did the shutter open again?
  3. When I came in at ~10AM, the CCD monitor showed that the PMC was locked, and the MC REFL spot was visible. 
  4. Also on attachment #1, there is a ~10min dip in the MC REFL level. This corresponds to ~10:30AM this morning. Both Steve and I were sitting in the control room at this time. We noticed that the PMC TRANS and REFL CCDs were dark. When we went in to check on the laser, we saw that it was indeed off. There was no one inside the lab area at this time to our knowledge, and as far as I know, the only direct emergency shutoff for the PSL is on the North-West corner of the PSL enclosure. So it is unclear why the laser just suddenly went off.

Steve says that this kind of behaviour is characteristic of a power glitch/surge, but nothing else seems to have been affected (I confirmed that the X and Y end lasers are ON). 

Attachment 1: IMG_7454.JPG
  13138   Mon Jul 24 19:28:55 2017   Jamie   Update   CDS   front end MX stream network working, glitches in c1ioo fixed

MX/OpenMX network running

Today I got the mx/open-mx networking working for the front ends.  This required some tweaking to the network interface configuration for the diskless front ends, and recompiling mx and open-mx for the newer kernel.  Again, this will all be documented.

controls@fb1:~ 0$ /opt/mx/bin/mx_info
MX Version: 1.2.16
MX Build: root@fb1:/opt/src/mx-1.2.16 Mon Jul 24 11:33:57 PDT 2017
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
    8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0:  364.4 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
    Status:        Running, P0: Link Up
    Network:    Ethernet 10G

    MAC Address:    00:60:dd:43:74:62
    Product code:    10G-PCIE-8B-S
    Part number:    09-04228
    Serial number:    485052
    Mapper:        00:60:dd:43:74:62, version = 0x00000000, configured
    Mapped hosts:    6

                                                        ROUTE COUNT
INDEX    MAC ADDRESS     HOST NAME                        P0
-----    -----------     ---------                        ---
   0) 00:60:dd:43:74:62 fb1:0                             1,0
   1) 00:30:48:be:11:5d c1iscex:0                         1,0
   2) 00:30:48:bf:69:4f c1lsc:0                           1,0
   3) 00:25:90:0d:75:bb c1sus:0                           1,0
   4) 00:30:48:d6:11:17 c1iscey:0                         1,0
   5) 00:14:4f:40:64:25 c1ioo:0                           1,0
controls@fb1:~ 0$

c1ioo timing glitches fixed

I also checked the BIOS on c1ioo and found that the serial port was enabled, which is known to cause timing glitches.  I turned off the serial port (and some power management stuff), and rebooted, and all the c1ioo timing glitches seem to have gone away.

It's unclear why this is a problem that's just showing up now.  Serial ports have always been a problem, so it seems unlikely this is just a problem with the newer kernel.  Could the BIOS have somehow been reset during the power glitch?

In any event, all the front ends are now booting cleanly, with all dolphin and mx networking coming up automatically, and all models running stably:

Now for daqd...

  13139   Mon Jul 24 19:57:54 2017   gautam   Update   CDS   IMC locked, Autolocker re-enabled

Now that all the front end models are running, I re-aligned the IMC, locked it manually, and then tweaked the alignment some more. The IMC transmission now is hovering around 15300 counts. I re-enabled the Autolocker and FSS Slow loops on Megatron as well.

Quote:

MX/OpenMX network running

Today I got the mx/open-mx networking working for the front ends.  This required some tweaking to the network interface configuration for the diskless front ends, and recompiling mx and open-mx for the newer kernel.  Again, this will all be documented.

 

  13141   Tue Jul 25 02:03:59 2017   gautam   Update   Optical Levers   Optical lever tuning thoughts

Summary:

Currently, I am unable to engage the coil-dewhitening filters without destroying cavity locks. One reason for this is that the present Oplev servos have a roll-off at high frequencies that is not steep enough - engaging the digital whitening + analog de-whitening just causes the DAC output to saturate. Today, Rana and I discussed some ideas about how to approach this problem. This elog collects these thoughts. As I flesh out these ideas, I will update them in a more complete writeup in T1700363 (placeholder for now). Past relevant elogs: 5376, 9680

  1. Why do we need optical levers?
    • ​​To stabilize the low-frequency seismic driven angular motion of the optics.
  2.  In what frequency range can we / do we need to stabilize the angular motion of the optics? How much error signal suppression do we need in the control band? How much is achievable given the current Oplev setup?
    • ​​To answer these questions, we need to build a detailed Oplev noise budget.
    • Ultimately, the Oplev error signal is sensing the differential motion between the suspended optic and the incident laser beam.
    • What frequency range does laser beam jitter dominate the actual optic motion? What about mechanical drifts of the optical tables the HeNes sit on? And for many of the vertex optics, the Oplev beam has multiple bounces on steering mirrors on the stack. What is the contribution of the stack motion to the error signal?
    • The answers to the above will tell us what lower and upper UGFs we should and can pick. It will also be instructive to investigate if we can come up with a telescope design near the Oplev QPD that significantly reduces beam jitter effects (see elog 10732). Also, can we launch/extract the beam into/from the vacuum chamber in such a way that we aren't so susceptible to motion of the stack?
  3. What are some noises that have to be measured and quantified?
    • Seismic noise
    • ​Shot noise
    • Electronics noise of the QPD readout chain
    • HeNe intensity noise (does this matter since we are normalizing by QPD sum?)
    • HeNe beam pointing / jitter noise (How? N-corner hat method?)
    • Stack motion contribution to the Oplev error signal
  4. How do we design the Oplev controller?
    • ​The main problem is to frame the right cost function for this problem. Once this cost function is made, we can use MATLAB's PSO tool (which is what was used for the PR3 coating design optimization, and also successfully for this kind of loop shaping problems by Rana for aLIGO) to find a minimum by moving the controller poles and zeros around within bounds we define.
  5. What terms should enter the cost function?

    • ​In addition to those listed in elog 5376
    • We need the >10Hz roll-off to be steep enough that turning on the digital whitening will not significantly increase the DAC output RMS or drive it to saturation.
    • We'd like for the controller to be insensitive to 5% (?) errors in the assumed optical plant and noise models i.e. the closed loop shouldn't become unstable if we made a small error in some assumed parameters.
    • Some penalty for using excessive numbers of poles/zeros? Penalty for having too many high-frequency features.
  6. Other things to verify / look into
    • Verify if the counts -> urad calibration is still valid for all the Oplevs. We have the arm-cavity power quadratic dependence method, and the geometry method to do this.
    •  Check if the Oplev error signals are normalized by the quadrant sum.
    • How important is it to balance the individual quadrant gains?
    • Check with Koji / Rich about new QPDs. If we can get some, perhaps we can use these in the setup that Steve is going to prepare, as part of the temperature vs HeNe noise investigations.

Before the CDS went down, I had taken error signal spectra for the ITMs. I will update this elog tomorrow with these measurements, as well as some noise estimates, to get started.

  13142   Tue Jul 25 08:48:57 2017   Steve   Update   VAC   RGA scan at d278

The RGA did not shut down when the turbo pump controller failed.

Quote:

Ifo pressure was 5.5 mTorr this morning. The PSL shutter was still open. TP2 controller failed. Interlock closed V1, V4 and VM1

Turbo pump 2 is the fore pump of the Maglev. The pressure here was 3.9 Torr, so the Maglev got warm (~38C), but it was still rotating at 560 Hz (normal) with V1 closed

What I did:

Looked at the pressures on the Hornet and Super Bee gauges (InstruTech, Inc.)

Closed all annulus valves and VA6, disconnected V4 and VA6, and turned on an external fan to cool the Maglev

Opened V7 to pump the Maglev fore line with TP3

V1 opened manually when foreline pressure dropped to <2mTorr at P2 and the body temp of the Maglev cooled down to  25-27 C

VM1 opened at 1e-5 Torr

Valve configuration: vacuum normal, with the annulus volumes not pumped

Ifo pressure 8.5e-6 Torr -IT at 10am,  P2 foreline pressure 64 mTorr, TP3 controller 0.17A   22C  50Krpm

note: all valves open manually, interlock can only close them

 

Quote:

While walking down to the X end to reset c1iscex I heard what I would call a "rhythmic squnching" sound coming from under the turbo pump.  I would have said the sound was coming from a roughing pump, but none of them are on (as far as I can tell).

Steve maybe look into this??

PS: please call me next time you see the vacuum is not Vacuum Normal

 

Attachment 1: RGA278d.png
  13143   Tue Jul 25 14:04:06 2017   Steve   Update   VAC   turbo controller installed and we are running at vac normal

Gautam and Steve,

Spare Varian turbo-V 70 controller, Model 969-9505, sn 21612, was swapped in. It is running the turbo fine @ 50Krpm, but it does not allow its V4 valve to be opened.

It turns out that TP2 @ 75Krpm will allow V4 to open and close. This must be a software issue.

So Vacuum Normal is operational if TP2 is running 75,000 rpm

We want to run at 50,000 rpm in the long term.

Note: the RS232 D-sub connector on the back of this controller is mounted 180 degrees opposite to that on the TP3 controller and the old failed TP2 controller

 

PS: controller is shipping out for repair 7-28-2017

 

Attachment 1: TP2@75Krpm.png
  13144   Tue Jul 25 14:27:19 2017   Steve   Update   safety   safety training

Kira Dubrovina and Naomi Wharton received 40m specific basic safety training.

Attachment 1: safety.jpg
  13145   Wed Jul 26 19:13:07 2017   Jamie   Update   CDS   daqd showing same instability as before

I recompiled daqd on the updated fb1, similar to how I had before, and we're seeing the same instability: the process crashes when it tries to write out the second trend (technically it looks like it crashes while it's trying to write out the full frame while the second trend is also being written out).  Jonathan Hanks and I are actively looking into it and I'll provide a further report soon.

  13146   Thu Jul 27 22:42:24 2017   gautam   Update   SUS   Seismic noise, DAC noise, and Coil Driver electronics noise

Summary:

Yesterday at the meeting, we talked about how the analog de-whitening filters in the coil driver path may be more aggressive than necessary. I think Attachment #1 shows that this is indeed the case.

Details:

I had done some modeling and measurement of some of these noises while I was putting together the initial DRMI noise budget, but I had never put things together in one plot. In Attachment #1, I've plotted the following:

  1. Quadrature sum of seismic noise (from GWINC calculations) for 3 suspended optics (I'm sticking to the case of 3 optics since I've been doing all the noise-budgeting for MICH - for DARM, it will be 4 suspended optics).
  2. The unfiltered DAC noise estimate. The voltage noise was measured in this elog. To convert this to displacement noise for 3 suspended optics, I've used the value of 1.55e-9/f^2 m/ct as the actuator coefficient. This number should be accurate under the assumption that the series resistance on the coil driver board output is 400 ohms (we could increase this - by how much depends on how much actuation range is needed).  
  3. Coil driver board and de-whitening board electronics noises (added in quadrature). I've used the LISO model noises, which line up well with the measured noises in elogs 13010 and 13015.
  4. The DAC noise filtered by the de-whitening transfer function, separately for the cases of using one or both of the available biquad stages. This cannot be lower than the preceding trace (electronics noise of de-whitening and coil driver boards), so should be disregarded where it dips below it. 

It would seem that the coil driver + de-whitening board electronic noises dominate above ~150Hz. The electronics noise is ~10nV/rtHz at the output of the coil driver board, which is only a factor of 100 below the DAC noise - so the stopband attenuation of ~70dB on the de-whitening boards seems excessive.

We can lower this noise by a factor of 2.5 if we up the series resistance on the coil driver boards from 400ohm to 1kohm, but even so, the displacement noise is ~1e-18 m/rtHz. I need to investigate the electronics noises a little more carefully - I only measured it for the case when both biquad stages were engaged, I will need to do the model for all permutations - to be updated. 

Attachment #2 has an iPython notebook used to generate this plot along with all the data.
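
For the record, the filtered-DAC-noise traces are assembled roughly as in the sketch below (the attached notebook and the LISO models are the authoritative versions). The flat DAC noise level and the single pole/zero de-whitening stage are placeholders; only the 1.55e-9/f^2 m/ct actuator coefficient is taken from the text above.

# Sketch of the filtered DAC displacement noise estimate for 3 suspended optics.
import numpy as np
from scipy import signal

f = np.logspace(1, 3, 500)                      # 10 Hz - 1 kHz

dac_noise = 1e-3 * np.ones_like(f)              # [cts/rtHz], placeholder level
act_coeff = 1.55e-9 / f**2                      # [m/ct] per optic

# Toy de-whitening stage: pole at 30 Hz, zero at 300 Hz, unity DC gain
# (the real boards have two much steeper biquad stages).
_, dewhite = signal.freqs_zpk([-2*np.pi*300], [-2*np.pi*30], 0.1,
                              worN=2*np.pi*f)

disp_unfiltered = dac_noise * act_coeff * np.sqrt(3)   # 3 optics in quadrature
disp_filtered   = disp_unfiltered * np.abs(dewhite)

print('Filtered DAC noise at 150 Hz: %.2e m/rtHz'
      % np.interp(150.0, f, disp_filtered))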


Edit 28 Jul 2.30pm: I've added Attachment #3 with traces for different assumed values of the series resistance on the coil driver board - although I have not re-computed the Johnson noise contribution for the various resistances. If we can afford to reduce the actuation range by a factor of 25, then it looks like we get to within a factor of ~5 of the seismic noise at ~150Hz. 

Attachment 1: noiseComparison.pdf
Attachment 2: deWhiteConfigs.zip
Attachment 3: noiseComparison_resistances.pdf
  13147   Fri Jul 28 15:36:32 2017   gautam   Update   Optical Levers   Optical lever tuning thoughts

Attachment #1 - Measured error signal spectrum with the Oplev loop disabled, measured at the IN1 input for ITMY. The y-axis calibration into urad/rtHz may not be exact (I don't know when this was last calibrated).

From this measurement, I've attempted to disentangle what is the seismic noise contribution to the measured plant output.

  • To do so, I first modelled the plant as a pair of complex poles @0.95 Hz, Q=3. This gave the best agreement with the measurement by eye; I didn't try to optimize this too carefully. 
  • Next, I assumed all the noise between DC-10Hz comes from only seismic disturbance. So dividing the measured PSD by the plant transfer function gives the spectrum of the seismic disturbance. I further assumed this to be flat, and so I averaged it between DC-10Hz.
  • This will be a first seismic noise model to the loop shape optimizer. I can probably get a better model using the GWINC calculations but for a start, this should be good enough.

It remains to characterize various other noise sources.
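
The plant-inversion estimate described above is compact enough to write down; here is a minimal numpy sketch, with a hypothetical file name standing in for the DTT export of the error signal spectrum.

# Sketch of the seismic-disturbance estimate: divide the measured open-loop
# error signal ASD by |P(f)| below 10 Hz and average to get a flat level.
import numpy as np

freq, asd_meas = np.loadtxt('ITMY_oplev_errsig_asd.txt', unpack=True)  # urad/rtHz

f0, Q = 0.95, 3.0                        # plant model used above
w0 = 2 * np.pi * f0
s = 2j * np.pi * freq
plant = w0**2 / (s**2 + (w0 / Q) * s + w0**2)   # unity-DC-gain pendulum TF

band = freq <= 10.0                      # attribute DC-10 Hz to seismic disturbance
seis_asd = asd_meas[band] / np.abs(plant[band])
seis_flat = np.mean(seis_asd)            # flat disturbance level for the optimizer
print('Flat seismic disturbance estimate: %.2e urad/rtHz' % seis_flat)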

Quote:

Before the CDS went down, I had taken error signal spectra for the ITMs. I will update this elog tomorrow with these measurements, as well as some noise estimates, to get started.


I have also confirmed that the "QPD" Simulink block, which is what is used for Oplevs, does indeed have the PIT and YAW outputs normalized by the SUM (see Attachment #2). This was not clear to me from the MEDM screen.


GV 30 Jul 5pm: I've included in Attachment #3 the block diagram of the general linear feedback topology, along with the specific "disturbances" and "noises" w.r.t. the Oplev loop. The measured (open loop) error signal spectrum of Attachment #1 (call it y) is given by:

y_{meas}(s) = P(s)\sum_{i=1}^{3}d_{i}(s) + \sum_{k=1}^{4}n_{k}(s)

If it turns out that one (or more) term(s) in each of the summations above dominates in all frequency bands of interest, then I guess we can drop the others. An elog with a first pass at a mathematical formulation of the cost-function for controller optimization to follow shortly.

Attachment 1: errSig.pdf
Attachment 2: QPD_simulink.png
Attachment 3: feedbackTopology.pdf
  13148   Fri Jul 28 16:47:16 2017   gautam   Update   General   PSL StripTool flatlined

About 3.5 hours ago, all the PSL wall StripTool traces "flatlined", as happened when we had the EPICS freezes in the past - except that this time all the traces were flat for more than 3 hours. I checked that the c1psl slow machine responded to ping, and I could also telnet into it. I tried opening the StripTool on pianosa and all the traces were responsive. So I simply re-started the PSL StripTool on zita. All traces look responsive now.

  13149   Fri Jul 28 20:22:41 2017   Jamie   Update   CDS   possible stable daqd configuration with separate DC and FW

This week Jonathan Hanks and I have been trying to diagnose why the daqd has been unstable in the configuration used by the 40m, with data concentrator (dc) and frame writer (fw) in the same process (referred to generically as 'fb').  Jonathan has been digging into the core dumps and source to try to figure out what's going on, but he hasn't come up with anything concrete yet.

As an alternative, we've started experimenting with a daqd configuration with the dc and fw components running in separate processes, with communication over the local loopback interface.  The separate dc/fw process model more closely matches the configuration at the sites, although the sites put the dc and fw processes on different physical machines.  Our experimentation thus far seems to indicate that this configuration is stable, although we haven't yet tested it with the full configuration, which is what I'm attempting to do now.

Unfortunately I'm having trouble with the mx_stream communication between the front ends and the dc process.  The dc does not appear to be receiving the streams from the front ends and is producing a '0xbad' status message for each.  I'm investigating.

  13150   Sat Jul 29 14:05:19 2017   gautam   Update   General   PSL StripTool flatlined

The PMC was unlocked when I came in ~10mins ago. The wall StripTool traces suggest it has been this way for > 8hours. I was unable to get the PMC to re-lock by using the PMC MEDM screen. The c1psl slow machine responded to ping, and I could also telnet into it. But despite burt-restoring c1psl, I could not get the PMC to lock. So I re-started c1psl by keying the crate, and then burt-restored the EPICS values again. This seems to have done the trick. Both the PMC and IMC are now locked.


Unrelated to this work: It looks like some/all of the FE models were re-started. The x3 gains on the coil outputs of the 2 ITMs and BS, which I had manually engaged when I re-aligned the IFO on Monday, were off, and in general, the IMC and IFO alignment seems much worse now than it was yesterday. I will do the re-alignment later as I'm not planning to use the IFO today.

  13151   Sat Jul 29 16:24:55 2017   jamie   Update   General   PSL StripTool flatlined
Quote:
Unrelated to this work: It looks like some/all of the FE models were re-started. The x3 gains on the coil outputs of the 2 ITMs and BS, which I had manually engaged when I re-aligned the IFO on Monday, were off, and in general, the IMC and IFO alignment seems much worse now than it was yesterday. I will do the re-alignment later as I'm not planning to use the IFO today.

This was me.  I restarted the front ends when I was getting the MX streams working yesterday.  I'll try to be more conscientious about logging front end restarts.

  13152   Mon Jul 31 15:13:24 2017   gautam   Update   CDS   FB ---> FB1

[jamie, gautam]

In order to test the new daqd config that Jamie has been working on, we felt it would be most convenient for the host name "fb" (martian network IP 192.168.113.202) to point to the physical machine "fb1" (martian network IP 192.168.113.201).

I made this change in /var/lib/bind/martian.hosts on chiara, and then ran sudo service bind9 restart. It seems to have done the job. So as things stand, both hostnames "fb" and "fb1" point to 192.168.113.201.

Now, when starting up DTT or dataviewer, the NDS server is automatically found.

More details to follow.

  13153   Mon Jul 31 18:44:40 2017   Jamie   Update   CDS   CDS system essentially fully recovered

The CDS system is mostly fully recovered at this point.  The mx_streams are all flowing from all front ends, and from all models, and the daqd processes are receiving them and writing the data to frames:

Remaining unresolved issues:

  • IFO needs to be fully locked to make sure ALL components of all models are working.
  • The remaining red status lights are from the "FB NET" diagnostics, which are reflecting a missing status bit from the front end processes due to the fact that they were compiled with an earlier RCG version (3.0.3) than the mx_streams were (3.3+/trunk).  There will be a new release of the RTS soon, at which point we'll compile everything from the same version, which should get us all green again.
  • The entire system has been fully modernized, to the target CDS reference OS (Debian jessie) and more recent RCG versions.  The management of the various RTS components, both on the front ends and on fb, have as much as possible been updated to use the modern management tools (e.g. systemd, udev, etc.).  These changes need to be documented.  In particular...
  • The fb daqd process has been split into three separate components, a configuration that mirrors what is done at the sites and appears to be more stable:
    • daqd_dc: data concentrator (receives data from front ends)
    • daqd_fw: receives frames from dc and writes out full frames and second/minute trends
    • daqd_rcv: NDS1 server (raises test points and receives archive data from frames from 'nds' process)
    The "target" directory for all of these new components is:
    • /opt/rtcds/caltech/c1/target/daqd
    All of these processes are now managed under systemd supervision on fb, meaning the daqd restart procedure has changed.  This needs to be simplified and clarified.
  • Second trend frames are being written, but for some reason they're not accessible over NDS.
  • Have not had a chance to verify minute trend and raw minute trend writing yet.  Needs to be confirmed.
  • Get wiper script working on new fb.
  • Front end RTS kernel will occasionally crash when the RTS modules are unloaded.  Keith Thorne apparently has a kernel version with a different set of patches from Gerrit Kuhn that does not have this problem.  Keith's kernel needs to be packaged and installed in the front end diskless root.
  • The models accessing the dolphin shared memory will ALL crash when one of the front end hosts on the dolphin network goes away.  This results in a boot fest of all the dolphin-enabled hosts.  Need to figure out what's going on there.
  • The RCG settings snapshotting has changed significantly in later RCG versions.  We need to make sure that all burt backup type stuff is still working correctly.
  • Restoration of /frames from old fb SCSI RAID?
  • Backup of entirety of fb1, including fb1 root (/) and front end diskless root (/diskless)
  • Full documentation of rebuild procedure from Jamie's notes.
  13155   Mon Jul 31 23:39:02 2017   rana   Update   COC   Cavity Scan Simulation Code

Hiro Yamamoto has updated SIS (Static Interferometer Simulation) to allow us to do the MCMC based inference of the 40m arm cavity mirror maps. 

The latest version is in git.ligo.org: IFOsim/SIS/

In the examples directory I have put 3 files:

  1. mcmcCavityScans.m - runs many cavity scans using parfor and saves the data
  2. plotCavityScans.m - loads the .mat file with the data and plots it
  3. plotCavityScans.py - python file which also loads & plots, but nicer since python has a transparency option for the traces.

Attached are the plots and the data. The first attached plot is a low resolution one: 200 scans of 100 frequency points each. The second plot is 200 scans of 300 points each.

The run was done assuming perfect LIGO arm params with a random set of Zernike perturbations for each run. The amplitude of each Zernike was chosen from a Normal distribution with a standard deviation of 10 nm.

We need to come up with a better guess for the initial distribution from which to sample, and also to use the smarter sampling that one does using the MCMC Hammer.
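
For the sampling step, one option is the emcee package (the "MCMC Hammer"). Below is a rough sketch of how that might be wired up, with a placeholder Gaussian likelihood; run_sis_scan() stands in for a call into SIS and is not part of SIS itself, and the data file name is hypothetical.

# Hypothetical emcee setup: Gaussian prior of 10 nm on each Zernike amplitude,
# placeholder likelihood comparing a modeled scan to the measured one.
import numpy as np
import emcee

n_zern  = 10            # number of Zernike coefficients to infer (placeholder)
sigma_z = 10e-9         # 10 nm prior width, as in the runs above
sigma_d = 0.01          # placeholder noise level on the normalized scan

def run_sis_scan(zernike_amps):
    """Placeholder for a call into SIS returning the modeled cavity scan."""
    raise NotImplementedError

def log_prob(theta, data):
    log_prior = -0.5 * np.sum((theta / sigma_z)**2)
    model = run_sis_scan(theta)
    return log_prior - 0.5 * np.sum(((data - model) / sigma_d)**2)

measured_scan = np.loadtxt('cavity_scan.txt')     # hypothetical data file
nwalkers = 4 * n_zern
p0 = sigma_z * np.random.randn(nwalkers, n_zern)  # start walkers from the prior
sampler = emcee.EnsembleSampler(nwalkers, n_zern, log_prob, args=(measured_scan,))
sampler.run_mcmc(p0, 2000)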

Attachment 1: manyCavityScans-SIS.pdf
Attachment 2: manyCavityScans-SIS.pdf
Attachment 3: MonteCarlo_CavityScans.mat
  13156   Tue Aug 1 16:05:01 2017   gautam   Update   Optical Levers   Optical lever tuning - cost function construction

Summary:

I've been trying to put together the cost-function that will be used to optimize the Oplev loop shape. Here is what I have so far.

Details:

All of the terms that we want to include in the cost function can be derived from:

  1. A measurement of the open-loop error signal [using DTT, calibrated to urad/rtHz]. We may want a breakdown of this in terms of "sensing noises" and "disturbances" (see the previous elog in this thread), but just a spectrum will suffice for the optimal controller given the current noises.
  2. A model of the optical plant, P(s) [validated with a DTT swept-sine measurement]. 
  3. A model of the controller, C(s). Some/all of the poles and zeros of this transfer function are what the optimization algorithm will tune to satisfy the design objectives.

From these, we can derive, for a given controller, C(s):

  1. Closed-loop stability (i.e. all poles should be in the left-half of the complex plane), and exactly 2 UGFs. We can use MATLAB's allmargin function for this. An unstable controller can be rejected by assigning it an extremely high cost.
  2. RMS error signal suppression in the frequency band (0.5 Hz - 2 Hz). We can require this to be >= 15 dB (say).
  3. Minimize gain peaking and noise injection - this information will be in the sensitivity function, \left | \frac{1}{1+P(s)C(s)} \right |. We can require this to be <= 10dB (say).
  4. RMS of the control signal between 10 Hz and 200 Hz, multiplied by the digital suspension whitening filter, should be <10% of the DAC range (so that we don't have problems engaging the coil de-whitening).
  5. Smallest gain margin (there will be multiple because of the various notches we have) should be > 10dB (say). Phase margin at both UGFs should be >30 degrees.
  6. Terms 1-5 should not change by more than 10% for perturbations in the plant model parameters (f0 and Q of the pendulum) at the 10% (?) level. 

We can add more terms to the cost function if necessary, but I want to get some minimal set working first. All the "requirements" I've quoted above are just numbers out of my head at the moment, I will refine them once I get some feeling for how feasible a solution is for these requirements.
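
To make this concrete, below is a minimal sketch of how Terms 1-3 could be evaluated for a candidate controller and folded into a scalar cost. The pendulum parameters, the controller shape, the error spectrum, and the thresholds in it are all placeholders rather than the measured 40m values - only the structure of the evaluation is meant to carry over.

import numpy as np

def freq_resp(num, den, f):
    # evaluate a transfer function (numpy polynomial coeffs in s) at f [Hz]
    s = 2j * np.pi * f
    return np.polyval(num, s) / np.polyval(den, s)

def band_rms(asd, f):
    # RMS from an amplitude spectral density, by trapezoidal integration
    return np.sqrt(np.sum(0.5 * (asd[1:]**2 + asd[:-1]**2) * np.diff(f)))

# plant: pendulum P(s) = w0^2 / (s^2 + (w0/Q)s + w0^2), placeholder f0 and Q
f0, Q = 0.75, 5.0
w0 = 2 * np.pi * f0
numP, denP = [w0**2], [1.0, w0 / Q, w0**2]

# controller: illustrative velocity damping g*s with a 15 Hz double-pole roll-off
g, wlp = 1.0, 2 * np.pi * 15.0
numC = [g, 0.0]
denC = [1.0 / wlp**2, 2.0 / wlp, 1.0]

# open loop L = P*C and sensitivity S = 1/(1+PC)
numL, denL = np.polymul(numP, numC), np.polymul(denP, denC)
f = np.logspace(-2, 3, 2000)
S = 1.0 / (1.0 + freq_resp(numL, denL, f))

# Term 1: closed-loop stability -- roots of den(L)+num(L) all in the left half plane
stable = np.all(np.roots(np.polyadd(denL, numL)).real < 0)

# Term 2: RMS error suppression in 0.5-2 Hz, using a toy open-loop error ASD
err_asd = 1.0 / (1.0 + (f / 1.0)**2)   # placeholder shape, urad/rtHz
band = (f > 0.5) & (f < 2.0)
suppression_db = 20 * np.log10(band_rms(err_asd[band], f[band]) /
                               band_rms(np.abs(S[band]) * err_asd[band], f[band]))

# Term 3: gain peaking of the sensitivity function
gain_peaking_db = 20 * np.log10(np.max(np.abs(S)))

# one way of folding these into a scalar cost (weights/thresholds are arbitrary here)
cost = (1e6 if not stable else 0.0) \
       + max(0.0, gain_peaking_db - 10.0) \
       + max(0.0, 15.0 - suppression_db)
print(stable, suppression_db, gain_peaking_db, cost)

Term 4 would be evaluated in the same way, by propagating the error spectrum through C(s) and the digital whitening filter and comparing the band-limited RMS to the DAC range; Term 6 amounts to re-running the same evaluation with f0 and Q perturbed.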

Quote:

An elog with a first pass at a mathematical formulation of the cost-function for controller optimization to follow shortly.


For a start, I attempted to model the current Oplev loop. The modeling of the plant and open-loop error signal spectrum have been described in the previous elogs in this thread.

I am, however, confused by the controller - the MEDM screen (see Attachment #2) would have me believe that the digital transfer function is FM2*FM5*FM7*FM8*gain(10). However, I get much better agreement between the measured and modelled in-loop error signal if I exclude the overall gain of 10 (see Attachments #1 for the models and #3 for measurements).

What am I missing? Getting this right will be important in specifying Term #4 in the cost function...

GV Edit 2 Aug 0030: As another sanity check, I computed the whitened Oplev control signal given the current loop shape (with sub-optimal high-frequency roll-off). In Attachment #4, I converted the y-axis from urad/rtHz to cts/rtHz using the approximate calibration of 240urad/ct (and the fact that the Oplev error signal is normalized by the QPD sum of ~13000 cts), and divided by 4 to account for the fact that the control signal is sent to 4 coils. It is clear that attempting to whiten the coil driver signals with the present Oplev loop shapes causes DAC saturation. I'm going to use this formulation for Term #4 in the cost function, and to solve a simpler optimization problem first - given the existing loop shape, what is the optimal elliptic low-pass filter to implement such that the cost function is minimized? 


There is also the question of how to go about doing the optimization, given that our cost function is a vector rather than a scalar. In the coating optimization code, we converted the vector cost function to a scalar one by taking a weighted sum of the individual components. This worked adequately well.

But there are techniques for vector cost-function optimization as well, which may work better. Specifically, the question is whether we can find the (generally infinite) solution set for which no one term in the cost function can be made better without making another worse (the so-called Pareto front). Then we still have to choose which point along this curve we want to operate at.
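
For reference, here is a toy sketch of both options, assuming each candidate controller has already been reduced to a vector of cost terms (the weights and the random 'costs' below are obviously just placeholders):

import numpy as np

def scalarize(cost_vec, weights):
    # weighted-sum scalarization, as was done for the coating optimization
    return float(np.dot(weights, cost_vec))

def pareto_front(cost_vecs):
    # indices of candidates not dominated by any other candidate, i.e. no other
    # point is <= in every term and strictly < in at least one
    c = np.asarray(cost_vecs)
    keep = []
    for i, ci in enumerate(c):
        dominated = np.any(np.all(c <= ci, axis=1) & np.any(c < ci, axis=1))
        if not dominated:
            keep.append(i)
    return keep

costs = np.random.rand(100, 3)   # e.g. 3 cost terms for 100 candidate controllers
print(scalarize(costs[0], weights=[1.0, 0.5, 2.0]))
print(len(pareto_front(costs)), "non-dominated candidates")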

Attachment 1: loopPerformance.pdf
loopPerformance.pdf
Attachment 2: OplevLoop.png
OplevLoop.png
Attachment 3: OL_errSigs.pdf
OL_errSigs.pdf
Attachment 4: DAC_saturation.pdf
DAC_saturation.pdf
  13157   Tue Aug 1 19:23:06 2017 ranaUpdateALSX - arm alignment

Rana, Naomi

We dither locked the X arm and then aligned the green beam to it using the PZTs. Everything looks ready for us to do a mode scan tomorrow.

We got buildup for Red and Green, but saw no beat in the control room. Quick glance at the PSL seems OK, but needs more investigation. We did not try moving around the X-NPRO temperature.

Tomorrow: get the beat, scan the PhaseTracker, and get data using pyNDS.

  13158   Wed Aug 2 09:40:55 2017 SteveUpdateElectronicsspare ILIGO electronics

Spare iLIGO electronics are temporarily stored in the east arm. We need cabinet space.

Attachment 1: iLIGOspares.jpg
iLIGOspares.jpg
Attachment 2: spareIligo.jpg
spareIligo.jpg
  13161   Thu Aug 3 00:59:33 2017 gautamUpdateCDSNDS2 server restarted, /frames mounted on megatron

[Koji, Nikhil, Gautam]

We couldn't get data using python nds2. There seems to have been many problems.

  1. /frames wasn't mounted on megatron, which was the nds2 server. Solution: added /frames 192.168.113.209(sync,ro,no_root_squash,no_all_squash,no_subtree_check) to /etc/exports on fb1, followed by sudo exportfs -ra. Using showmount -e, we confirmed that /frames was being exported.
  2. Edited the /frames line in /etc/fstab on megatron to read fb1:/frames/ /frames nfs ro,bg,soft 0 0. Tried to run mount -a, but the console stalled.
  3. Used nfsstat -m on megatron. Found out that megatron was trying to mount /frames from old FB (192.168.113.202). Used sudo umount -f /frames to force unmount /frames/ (force was required).
  4. Re-ran mount -a on megatron.
  5. Tried to stop nds2 using /etc/init.d/nds2 stop - this didn't work, so we manually kill -9'ed it.
  6. Restarted nds2 server using /etc/init.d/nds2 start.
  7. Waited ~10 minutes before everything started working again. The usual nds2 data-fetching methods now work.

I have yet to check on getting trend data via nds2 - I can't find the syntax. EDIT: As Jamie mentioned in his elog, the second trend data is being written but is inaccessible over NDS (either with dataviewer, which uses fb as the NDS server, or with python nds2, which uses megatron as the NDS server). So as of now, we cannot read any kind of trends directly, although full-rate data from the past can be downloaded with either dataviewer or python nds2. On the control room workstations, this can also be done with cds.getdata.
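
For the record, here is what I think the python nds2 calls should look like once trends are being served again. The port number, channel name, and the '.mean,m-trend' suffix convention below are from memory and should be checked against the nds2-client documentation.

import nds2

conn = nds2.connection('megatron', 31200)
gps_start, gps_stop = 1185900000, 1185900600   # some arbitrary 10 min stretch

# full-rate data
bufs = conn.fetch(gps_start, gps_stop, ['C1:SUS-MC1_SUSPIT_IN1_DQ'])
data = bufs[0].data

# minute trends: append the statistic and the trend type to the channel name
tbufs = conn.fetch(gps_start, gps_stop,
                   ['C1:SUS-MC1_SUSPIT_IN1_DQ.mean,m-trend',
                    'C1:SUS-MC1_SUSPIT_IN1_DQ.max,m-trend'])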

  13162   Thu Aug 3 10:51:32 2017 ranaUpdateCDSNDS2 server restarted, /frames mounted on megatron

same issue on NODUS; I edited the /etc/fstab and tried mount -a, but it gives this error:

controls@nodus|~ 1> sudo mount -a
mount.nfs: access denied by server while mounting fb1:/frames

needs more debugging - this is the machine that allows us to have backed up frames in LDAS. Permissions issues from fb1 ?

  13163   Thu Aug 3 11:11:29 2017 gautamUpdateCDSNDS2 server restarted, /frames mounted on nodus

I added nodus' eth0 IP (192.168.113.200) to the list of allowed NFS clients in /etc/exports on fb1, and then ran sudo mount -a on nodus. Now /frames is mounted.

Quote:

needs more debugging - this is the machine that allows us to have backed up frames in LDAS. Permissions issues from fb1 ?

 

  13164   Thu Aug 3 19:46:27 2017 JamieUpdateCDSnew daqd restart procedure

This is the daqd restart procedure:

$ ssh fb1 sudo systemctl restart daqd_*

That will restart all of the daqd services (daqd_dc, daqd_fw, daqd_rcv).

The front end mx_stream processes should all auto-restart after the daqd_dc comes back up.  If they don't (models show "0x2bad" on DC0_*_STATUS) then you can execute the following to restart the mx_stream process on the front end:

$ ssh c1<host> sudo systemctl restart mx_stream

 

 

  13165   Thu Aug 3 20:15:11 2017 JamieUpdateCDSdataviewer can not raise test points

For some reason dataviewer is not able to raise test points with the new daqd setup, even though dtt can.  If you raise a test point with dtt then dataviewer can show the data fine.

It's unclear to me why this would be the case.  It might be that all the versions of dataviewer on the workstations are too old??  I'll look into it tomorrow to see if I can figure out what's going on.

  13166   Fri Aug 4 09:07:28 2017 ranaUpdateCDSCDS system essentially NOT fully recovered

Tried getting trends with dataviewer just now since Jamie re-enabled the minute_raw frame writing yesterday. Unable to get trends still:

Connecting to NDS Server fb1 (TCP port 8088)
Connecting.... done
Server error 18: trend data is not available
datasrv: DataWriteTrend failed in daq_send().
unknown error returned from daq_send()
T0=17-08-04-08-02-22; Length=28800 (s)
No data output.

  13167   Fri Aug 4 18:25:15 2017 gautamUpdateGeneralBilinear noise coupling

[Nikhil, gautam]

Today we repeated the test that EricQ detailed here. We downloaded ~10 min of data (between GPS times 1185925523 and 1185926117), and Nikhil will analyze it.

Attachment 1: bilinearTest.pdf
bilinearTest.pdf
  13168   Sat Aug 5 11:04:07 2017 gautamUpdateSUSMC1 glitches return

See Attachment #1, which shows the full-rate (2048 Hz) data for a 3-minute stretch around the time I saw the MC1 glitch. At the time of the glitch, the WFS loops were disabled, so the only actuation on MC1 was via the local damping loops. The oscillations in the MC2 channels are due to the autolocker turning on the MC2 length tickle.

Nikhil and I tried the usual techniques of squishing cables at the satellite box, and also at 1X4/1X5, but the glitching persists. I will try to localize the problem this weekend. This thread details the investigations from the last time something like this happened. In the past, I was able to fix this kind of glitching by replacing the (high speed) current buffer IC LM6321M. These are present in two places: the satellite box (for the shadow sensor LED current drive), and the coil driver boards. I think we can rule out the slow machine ADCs that supply the static PIT and YAW bias voltages to the optic, as that path is low-passed with a 4th order filter @ 1 Hz, while the glitches that show up in the OSEM sensor channels do not appear to be low-passed, as seen in the zoomed-in view of the glitch in Attachment #2 (but there is an LM6321 in this path as well).

Attachment 1: MC1_glitch_Aug42017.png
MC1_glitch_Aug42017.png
Attachment 2: MC1_glitch_zoomed.png
MC1_glitch_zoomed.png
  13169   Mon Aug 7 16:00:41 2017 ranaUpdateGeneralBilinear noise coupling

These are not the angular test parameters that we're looking for:

 Recall that what we want is low-frequency beam spot variation, with the injected noise in the feedback limited to a small high-frequency band.

e.g. only inject noise at 40-50 Hz, and not so loud that it can be found at 2x the injected frequency.

The high-frequency injected noise should NOT dominate the RMS.

The coupling should be ~1e-3; some combination of beam spot mis-centering and beam spot motion.

  13170   Mon Aug 7 22:50:57 2017 KojiUpdateGeneralNew wifi router for the GC network installed

I have replaced the old 11n wifi router (CISCO / Linksys) for the GC network with a new one with 11ac technology.

The new one is a tri-band wifi router: it has one 2.4 GHz (11n) SSID and two 5 GHz (11ac) SSIDs. All of these have been set to be hidden. Just come to the 40m to find the necessary info for connecting.

Note that the user id / password for the admin tool have been changed from the default values.

  13171   Tue Aug 8 17:04:26 2017 SteveUpdateVACunintended pump down

IFO pressure is 2 Torr, and the PSL shutter is closed. I'm pumping down with 2 roughing pumps, with the ion pump gate valves open and the annuli at atmosphere.

The vacuum envelope was vented up to 17 Torr while I was replacing the UPS battery stack. More about this later...

Do not plan on using the interferometer tonight. I will complete the pumpdown tomorrow morning.

 

Attachment 1: pumping_down.png
pumping_down.png
  13172   Tue Aug 8 17:44:11 2017 SteveUpdateVACunintended pump down

Pumpdown stopped overnight at ~1 Torr.

The roughing line is disconnected. The valve condition indicator "moving" means that the valve is closed and its cable is disconnected, so it cannot move.

The RGA is off and VM1 is stuck.

Quote:

IFO pressure is 2 Torr, and the PSL shutter is closed. I'm pumping down with 2 roughing pumps, with the ion pump gate valves open and the annuli at atmosphere.

The vacuum envelope was vented up to 17 Torr while I was replacing the UPS battery stack. More about this later...

Do not plan on using the interferometer tonight. I will complete the pumpdown tomorrow morning.

 

 

Attachment 1: stopped_pumping.png
stopped_pumping.png
ELOG V3.1.3-