I first initialized the drives by hooking them up to my computer and running the setup.app file. After this, plugging the drive into the respective machine and running lsblk, I was able to see the mount point of the external drive. To actually initialize the backup, I ran the following command from a tmux session called ddBackupLaCie:
sudo dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync
Here, /dev/sda is the disk with the root filesystem, and /dev/sdb is the external hard-drive. The installed version of dd is 8.13, and from version 8.21 onwards, there is a progress flag available, but I didn't want to go through the exercise of upgrading coreutils on multiple machines, so we just have to wait till the backup finishes.
We also wanted to do a backup of the root of FB1 - but I'm not sure if dd will work with the external hard drive, because I think it requires the backup disk size (for us, 1TB) to be >= origin disk size (which on FB1, according to df -h, is 2TB). Unsure why the root filesystem of FB is so big, I'm checking with Jamie what we expect it to be. Anyways we have also acquired 2TB HGST SATA drives, which I will use if the LaCie disks aren't an option.
The Fiber ALS box has been installed on the existing shelf on the PSL table. We had to re-arrange some existing cabling to make this possible, but the end result seems okay (to me). The box lid was also re-installed.
Some stuff that still needs to be fixed:
Beat spectrum post changes to follow.
Is it better to mount the box in the PSL under the existing shelf, or in a nearby PSL rack?
Further characterization needs to be done, but the results of this test are encouraging. If we are able to get this kind of out of loop ALS noise with the IR beat, perhaps we can avoid having to frequently fine-tune the green beat alignment on the PSL table. It would also be ideal to mount this whole 1U setup in an electronics rack instead of leaving it on the PSL table
Attachment #1: Result of AM sweeps with EX laser crystal at nominal operating temperature ~ 31.75 C.
Attachment #2: Tarball of data for Attachment #1.
Attachment #3: Result of AM sweeps with EX laser crystal at higher operating temperature ~ 40.95 C.
Attachment #4: Tarball of data for Attachment #2.
Attachment #1: Summary of results of measurements made on Friday. There is a lot in this plot, here is a breakdown:
Data + code for this plot will be attached later.
Attachment #1 is a sketch of the proposed setup to measure the PM response of the EX NPRO. Previously, this measurement was done via PLL. In this approach, we will need to calibrate the DFD output into units of phase, in order to calibrate the transfer function measurement into rad/V. The idea is to repeat the same measurement technique used for the AM - take ~50 1 average measurements with the AG4395, and look at the statistics.
Some more notes:
After consulting with Jamie, we reached the conclusion that the reason why the root of FB1 is so huge is because of the way the RAID for /frames is setup. Based on my googling, I couldn't find a way to exclude the nfs stuff while doing a backup using dd, which isn't all that surprising because dd is supposed to make an exact replica of the disk being cloned, including any empty space. So we don't have that flexibility with dd. The advantage of using dd is that if it works, we have a plug-and-play clone of the boot disk and root filesystem which we can use in the event of a hard-disk failure.
I am trying option 3 now. dd however does requrie that the destination drive size be >= source drive size - I'm not sure if this is true for the HGST drives. lsblk suggests that the drive size is 1.8TB, while the boot disk, /dev/sda, is 2TB. Let's see if it works.
Backup of chiara is done. I checked that I could mount the external drive at /mnt and access the files. We should still do a check of trying to boot from the LaCie backup disk, need another computer for that.
nodus backup is still not complete according to the console - there is no progress indicator so we just have to wait I guess.
I've modified the __init.py__ file located at /ligo/apps/linux-x86_64/cdsutils-480/lib/python2.7/site-packages/cdsutils/__init__.py so that you can now simply import pyawg from cdsutils. On the control room workstations, iPython is set up such that cdsutils is automatically imported as "cds". Now this import also includes the pyawg stuff. So to use some pyawg function, you would just do (for example):
One could also explicitly do the import if cdsutils isn't automatically imported:
from cdsutils import awg
Linking this useful instructional elog from Chris here: https://nodus.ligo.caltech.edu:8081/Cryo_Lab/1748
The nodus backup too is now complete - however, I am unable to mount the backup disk anywhere. I tried on a couple of different machines (optimus, chiara and pianosa), but always get the same error:
mount: unknown filesystem type 'LVM2_member'
The disk itself is being recognized, and I can see the partitions when I run lsblk, but I can't get the disk to actually mount.
Doing a web-search, I came across a few blog posts that look like the problem can be resolved using the vgchange utility - but I am not sure what exactly this does so I am holding off on trying.
To clarify, I performed the cloning by running
sudo dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync
in a tmux session on nodus (as I did for chiara and FB1, latter backup is still running).
As Rana pointed out to me, one important fact to keep in mind w.r.t. DAC noise is that it can be non-linear. So the RMS of the DAC noise in a higher frequency band (say 60-100Hz) can be affected by the RMS of the requested DAC signal in some lower frequency band (say 10-20Hz). One test to see if this hypothesis can explain the difference @100Hz between the ITMX channels and BS channels I observed a couple of days ago is to see if the noise around 100Hz becomes lower when I enable a 20-40Hz bandstop in the digital signal chain.
The FB1 dd backup process seems to have finished too - but I got the following message:
dd: error writing ‘/dev/sdc’: No space left on device
30523666+0 records in
30523665+0 records out
2000398934016 bytes (2.0 TB) copied, 50865.1 s, 39.3 MB/s
Running lsblk shows the following:
controls@fb1:~ 32$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 23.5T 0 disk
└─sdb1 8:17 0 23.5T 0 part /frames
sda 8:0 0 2T 0 disk
├─sda1 8:1 0 476M 0 part /boot
├─sda2 8:2 0 18.6G 0 part /var
├─sda3 8:3 0 8.4G 0 part [SWAP]
└─sda4 8:4 0 2T 0 part /
sdc 8:32 0 1.8T 0 disk
├─sdc1 8:33 0 476M 0 part
├─sdc2 8:34 0 18.6G 0 part
├─sdc3 8:35 0 8.4G 0 part
└─sdc4 8:36 0 1.8T 0 part
While I am able to mount /dev/sdc1, I can't mount /dev/sdc4, for which I get the error message
controls@fb1:~ 0$ sudo mount /dev/sdc4 /mnt/HGSTbackup/
mount: wrong fs type, bad option, bad superblock on /dev/sdc4,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
Looking at dmesg, it looks like this error is related to the fact that we are trying to clone a 2TB disk onto a 1.8TB disk - it complains about block size exceeding device size.
BS connections and damping restored.
I was trying to set up a DAC channel to interface with the AOM driver on the PSL table.
[4072817.132040] c1als: Failed to allocate DAC channel.
[4072817.132040] c1als: DAC local 0 global 16 channel 4 is already allocated.
[4072817.132040] c1als: Failed to allocate DAC channel.
[4072817.132040] c1als: DAC local 0 global 16 channel 5 is already allocated.
[4072817.132040] c1als: Failed to allocate DAC channel.
[4072817.132040] c1als: DAC local 0 global 16 channel 6 is already allocated.
[4072817.132040] c1als: Failed to allocate DAC channel.
[4072817.132040] c1als: DAC local 0 global 16 channel 7 is already allocated.
[4073325.317369] c1als: Setting stop_working_threads to 1
controls@c1ioo:~ 130$ sudo systemctl status mx
● open-mx.service - LSB: starts Open-MX driver
Loaded: loaded (/etc/init.d/open-mx)
Active: failed (Result: exit-code) since Tue 2017-10-03 00:27:32 UTC; 34min ago
Process: 29572 ExecStop=/etc/init.d/open-mx stop (code=exited, status=1/FAILURE)
Process: 32507 ExecStart=/etc/init.d/open-mx start (code=exited, status=1/FAILURE)
Oct 03 00:27:32 c1ioo systemd: Starting LSB: starts Open-MX driver...
Oct 03 00:27:32 c1ioo open-mx: Loading Open-MX driver (with ifnames=eth1 )
Oct 03 00:27:32 c1ioo open-mx: insmod: ERROR: could not insert module /opt/3.2.88-csp/open-mx-1.5.4/modules/3.2.88-csp/open-mx.ko: File exists
Oct 03 00:27:32 c1ioo systemd: open-mx.service: control process exited, code=exited status=1
Oct 03 00:27:32 c1ioo systemd: Failed to start LSB: starts Open-MX driver.
Oct 03 00:27:32 c1ioo systemd: Unit open-mx.service entered failed state.
This did the trick - I simply ran
sudo systemctl restart daqd_*
on FB1, and now all the CDS overview lights are green again.
I thought I had done this already, but I realize that I was supposed to restart the daqd processes on FB1 (which is where they are running) and not on c1ioo .
Thanks Jamie for the speedy resolution!
From page 21 of T1100625, DAQ status "0x2000" means that the channel list is out of sync between the front end and the daqd. This usually happens when you add channels to the model and don't restart the daqd processes, which sounds like it might be applicable here.
It looks like open-mx is loaded fine (via "rtcds lsmod"), even though the systemd unit is complaining. I think this is because the open-mx service is old style and is not intended for module loading/unloading with the new style systemd stuff.
Going through some astronomy CCD calibration resources (-), I gather that there are in general 3 distinct types of correction that are applied:
The flat-field calibration seems to be the most complicated - the idea is to use a source of known radiance, and capture an image of this known radiance with the CCD. Then assuming we know the source radiance well enough, we can use some math to back out what the actual response function of individual pixels are. Then, for an actual image, we would divide by this response-map to get the actual image. There are a number of assumptions that go into this, such as:
I am not sure what error is incurred by ignoring 2 and 3 in the list at the beginning of this elog, perhaps this won't affect our ability to estimate the scattered power from the test-masses to within a factor of 2. But it may be worth it to do these additional calibration steps.
I also wonder what the uncertainty in the 1.5V/A number for the photodiode is (i.e. how much do we trust the Ophir power meter at low power levels?). The datasheet for the PDA100A says the transimpedance gain at 60dB gain is 1.5 MV/A (into high impedance load), and the Si responsivity at 1064nm is ~0.25A/W, so naively I would expect 0.375 V/uW which is ~factor of 4 lower. Is there a reason to trust one method over the other?
Also, are the calibration factor units correct? Jigyasa reported something like 0.5nW s / ct in her report.
The incident power can be calculated as Pin =CF*Total(Counts-DarkCounts)/ExposureTime.
GV Oct 6: This coupling is probably not correct - Finesse outputs TF magnitude in units of W/W, and not W/RIN.
Since I was foiled (by lack of DAC) in my attempt to measure the coupling of laser intensity noise to MICH in the DRMI (no arms) configuration, I decided to try understanding the effect with a simulation.
For this purpose, I used my DRMI Finesse model - this had mirror positions tuned for locking and photodiode demod phases tuned to give a sensing matrix model that wasn't too far from an actual measurement (within factor of a few). So the model seems okay for a first pass at estimating this coupling.
Measuring transfer functions in Finesse is straightforward - use the fsig command to modulate some quantity (in this case the input beam intensity), and use the pd2 detector to demodulate the effect of this modulation at the port of interest (in this case AS55_Q).
**Note that to apply a modulation to an input beam (i.e. Laser) in Finesse, the keyword for the "type" argument given to fsig is "amp" and not "amplitude" as the manual would had me believe. In fact, there seem to be quite a few such caveats. The best way to figure this out is to go to the pykat installation directory, find the file components.py, and look for the fsig_name for the component of interest. It is also indicated in the same file, via the canFsig argument, if that property of the component can be modulated for transfer function measurements.
Attachment #1 shows the result of such a sweep.
To estimate what the actual contribution to the displacement noise is, I used the DQ-ed MC transmission (recorded at 1024Hz) from the DRMI lock, computed the ASD using scipy.signal.welch, divided by the nominal MC transmission of ~15,000 counts to convert to RIN/rtHz. The RIN was then multiplied by the above calculated coupling function, and divided by the sensing matrix element for AS55_Q (in units of W/m) to give the curve shown in Attachment #2. If we believe the simulation, then Laser Intensity Noise shouldn't be the limiting noise between 10Hz-1kHz.
I will of course measure the actual coupling and see how it lines up with Attachment #1 - would be a nice additional validation of the Finesse model. I will also try using the Finesse model to estimate some other coupling transfer functions (e.g. Laser Frequency Noise, Oscillator Noise).
The absence of evidence is not evidence of absence.
Eurocrate key turning reboots for c1susaux, c1auxex,c1auxey, c1iscaux, c1iscaux2 and c1aux. Usual precautions were taken for ITMX. Did burtrestore for c1iscaux andc1iscaux2 in order to restore the LSC PD whitening gains.
Un-related to this work: input pointing into PMC was tweaked as the PMC_REFL spot was pretty bright.
There are at least 5 free DAC channels (4 if you discount the one channel from these that I am hijacking) available in the 1Y2 electronics rack.
Jamie's nice wiring diagram shows the topology - the actual DAC card sits in 1Y3 inside the c1lsc expansion chassis (while the c1lsc frontend itself is in 1X4). The output of the DAC goes via SCSI to an interface box (D080303) and then to some dewhitening/AI boards (D000316). There are a total of 16 DAC channels available, out of which 8 are used for the TTs, 2 are used for the DAFI model, and one is labeleld "From c1ioo 1X2" (I don't know what this one is for). So I'm going to use some of these channels for measuring the coupling of oscillator noise and intensity noise to MICH in the DRMI lock.
The de-whitening/AI board seems to be old - it has 2x 800Hz Butterworth LPFs and no notch for the clock frequency, but maybe this doesn't matter for the tests I have in mind. The AI board available on 1X2 is more modern but routing the DAC channels from 1Y2 to it is going to be some work.
I'm going to add my testpoint to c1daf given that it seems to be the least critical model on c1lsc.
EDIT: testpoints added to c1daf don't show up in the list of available channels - there was some issue with this model while we were getting the new RTCDS going. So I'm moving my temporary testpoint to c1cal instead.
I've located the Stanford Research FS725 Rb reference unit. The question is where to put it. This afternoon Steve and I put it inside the little electronics rack next to 1X3, but in hindsight, this probably isn't such a great place for a timing reference as there are a bunch of Sorensen power supplies in there (and presumably the accompanying harmonics from these switching supplies).
The unit itself was repaired in 2015, and powering it on, it locked to the internal reference within a few minutes as prescribed in the manual.
In the end I decided to access the available spare DAC channels via the C1ASS model - for this purpose, I added a namespace block "TEST" in the C1ASS simulink model, which is a SISO block. Inside is just a single CDS filter module. My idea is to use the EXC of this filter module to inject excitations for measuring various couplings. Rather than have a simple testpoint, we also have the option of adding in some filter shapes in the filter module which could possibly allow a more direct read-off of some coupling TF. Recompiling the model went smooth - there was a crash earlier in the day which required me to hard-reboot c1lsc (and also restart all models on c1sus and c1ioo but no reboots necessary for those machines).
Note that to get the newly added channels to show up in the channel lists in DTT/AWGGUI etc, you need to ssh into fb1 and restart the daqd processes via sudo systemctl restart daqd_*. If I remember right, it used to be enough to do telnet fb 8088 followed by shutdown. This is no longer sufficient.
It took me a while to get the DRMI locking going again. The model restarts earlier in the evening had changed a bunch of EPICS channel settings (and out config scripts don't catch all of these settings). In particular, I forgot to re-enable the x3 digital gain for the ITMs, BS and SRM (necessitated by removing an analog x3 gain on the de-whitening boards). I was hesitant to spend time re-adjusting all damping / oplev loop gains because if we change the series resistor on the coil driver board, we will have to do this again. I also didn't want this arbitrary FM to be enabled in the SDF safe.snap. But maybe it's worth doing it anyways - if nothing it'll be good practise.
Once I hunted down all the setting diffs and tweaked alignment, the DRMI locks were pretty robust.
I had hoped to make some of these TF measurements tonight. But I realized I needed to look up a bunch of stuff in manuals/datasheets, and think about these measurements a little. I wasn't sure if the DW/AI board could drive a signal over 40m of BNC cabling so I added an SR560 (DC coupled, gain=1, low noise mode, 50ohm output used) to buffer the output. The Marconi's external modulation input is high impedance (100k) but for the AOM driver we want 50ohm. For the Marconi, the external input accepts 1Vrms max, while for the AOM driver, we want to drive a signal between 0V and 1V at most.
The general measurement setup is schematically shown in Fig 1. Questions to address:
Am I missing something?
MC Autolocker was umnhappy because c1iool0 was unresponsive and hence it couldn't write to the "C1:IOO-MC_AUTOLOCK_BEAT" channel. I keyed the crate and IMC locked almost immediately. I'm moving this channel into the RTCDS model as we did for the IFO_STATE EPICS channel so that the autolocker isn't dependant on c1iool0 (which was the whole point of migrating the IFO-STATE variable anyways). I also commented out all of these channels in /cvs/cds/caltech/target/c1iool0/autolocker.db so that there aren't duplicate channels.
The 4TB HGST drives have arrived. I've started the FB1 dd backup process. Should take a day or so.
Looks to have worked this time around.
controls@fb1:~ 0$ sudo dd if=/dev/sda of=/dev/sdc bs=64K conv=noerror,sync
33554416+0 records in
33554416+0 records out
2199022206976 bytes (2.2 TB) copied, 55910.3 s, 39.3 MB/s
You have new mail in /var/mail/controls
I was able to mount all the partitions on the cloned disk. Will now try booting from this disk on the spare machine I am testing in the office area now. That'd be a "real" test of if this backup is useful in the event of a disk failure.
Gabriele had prepared a C code implementation of his NN for MICH/PRCL state estimation. He had been trying to get it going on some of the machines in WB, but was unsuccessful. The version of RCG he was trying to compile and run the code on was rather dated so we decided to give it a whirl on our new RCG3.4 here at the 40m. Just noting down stuff we tried here:
All models have been reverted to their state prior to this test, and everything on the CDS_OVERVIEW MEDM screen is green now.
I spent this weekend doing a more careful investigation of the DRMI noise. I think I have some new information/insights. Attachment #1 is the noise budget (png attached because pdf takes forever to upload, probably some ImageMagick problem. The last attachment is a tarball of the PDF). Long elog, so here are the Highlights:
The DAC noise is not limiting us anywhere when the coil de-whitening is switched on.
So I don't understand the measured Dark Noise level, and it is limiting us at frequencies > 200Hz. Some busted electronics in the input signal chain? Or can the LSC demod daughter board gain of ~5 explain the observed noise?
Edit 1730 9 Oct: I had missed out the factor of 5 gain in the demod board in calculating the shot noise curve. Attachment #7 shows the corrected shot noise level. Explicitly:
, where is to convert shot noise in W to displacement units.
This seems to be limiting us from saturating the dark noise once the coil de-whitening is engaged. But lack of coherence means the mechanism is not re-injection of SRCL/PRCL sensing noise? Need to think about what this means / how we can mitigate it.
I've also made several changes to the NB code - will push to git once I finish cleaning stuff up, but it is now much faster to make these plots and see what's what.
I measured the output voltage noise of the Q output of the AS55 Demod Board with the PSL shutter closed, using the SR785 (see Attachment #1). The measured noise is consistent with the expected number of ~120nV/rtHz around 100Hz. I had measured the gain of this board from RFPD input to Q output to be ~5.1: so if the PD dark noise is 16nV/rtHz, this would be amplified to ~80nV/rtHz. Still a discrepancy of ~50%. I didn't measure the noise with the PD input terminated. Added the noise of the demod board output with the RFPD input terminated. The level of ~100nV/rtHz seems consistent with the actual PD dark noise being ~80nV/rtHz, as their quadrature sum is around 130nV/rtHz. Need to dig up the schematics for the demod board + daughter board, and check against LISO, to see if this is consistent with what is expected.
Also - I think I was using the wrong value of the DC power on the AS55 photodiode for shot noise calculations - 13mW was for REFL55, not AS55. I did a crude measurement of the power by sticking the Ophir power meter (filter removed) in front of the AS55 PD with the Michelson flashing around, and noticed the maximum value registered was ~1.2mW. So in the DRMI lock, there would be ~2.4mW, which is 10x lower than the value I was assuming. I've made the correction in the NB code, for the next time the plot is generated. A more rigorous measurement would involve sticking the Ophir in front of the AS110 PD during a DRMI lock. The light from the AS port is split by a 50-50 BS to the AS55 and AS110 PDs (so measuring at AS110 is a reasonable proxy for power at AS55), and the AS110 signals are not used for triggering in the DRMI lock, so this is feasible.
I keep adding traces to this plot, here is the most complete one I have now. Looks like the input noise to the D040179 (measured at "Q out" SMA jack of D990511 with RFPD input terminated) is ~10nV/rtHz. This supports the hypothesis that something is wonky on the daughter board, because the purple trace should only be the quad sum of the orange and green traces. I will pull it out and have a look.
Some other follow-ups on the questions raised at the meeting:
I tried replacing the AD797s on the daughter board with OP27s, and saw no significant improvement in the electronics noise of the demod board. Note that according to LISO, in this configuration, the voltage noise of the Op27 is expected to dominate the total noise of the daughter board. Measurement condition was that the RFPD input was terminated, but the LO input was still being driven (SR785 input range is -50dBVpk for all traces, and the input ranging was set to "UpOnly"). Need to do a more systematic investigation to figure out where this excess noise is coming from. I will upload photos of the board later.
This supports the hypothesis that something is wonky on the daughter board, because the purple trace should only be the quad sum of the orange and green traces. I will pull it out and have a look.
I worked on the daughter board a little more in the evening. I have somehow managed to make the dark noise ~25% worse [Attachment #1].
After making all these changes, I re-installed the card in the eurocrate and repeated the measurement. The Q channel noise was close to the expected value (~50nV/rtHz), but the I channel is twice as noisy. I will continue this investigation tomorrow.
Here is the marked up schematic with the board as it is stuffed. Annoyingly, there is a capacitor (C1) which according to the schematic is supposed to be open, but is stuffed in our board. I can't find any elog about this, and its a pain to measure the value of this capacitance. I will upload all of this + LISO + noise model/measurements to a 40m AS55 daughter board DCC page.
Steve reported problems getting the X arm locked. Alignment sliders were inaccessible. Eurocrate key turning reboots for c1susaux, c1auxex,c1auxey, c1iscaux and c1aux. Usual precautions were taken for ITMX.
This is becoming a once-a-week thing .
Attachment #1 - Measured / modelled noises for AS55 demod board. I've plotted quadrature sum of the LISO trace with the SR785 noise floor with input terminated to ground via 50ohm. Note that these measurements were made after all the changes in the marked up schematic in the previous elog were implemented.
Both channels should be identical - I don't understand why the I channels are noisier than their Q counterparts. This is almost certainly a problem on the daughter board, as the orange traces are pretty much identical for both channels.
The dark red curves were measured by shorting the inputs to D040179 to ground via 50ohms using some Pomona minigrabbers - I wanted to avoid ripping the daughter board out, but this probably explains the excess noise compared to the green trace at low frequencies. All other measurements were made with the board installed in the LSC rack eurocrate, with the LO input driven at the nominal level (I didn't measure this yesterday but a measurement from ~6months ago says that this level is 1.5dBm).
Wall StripTool traces showed that IMC has not been locked for at least 8 hours when I came in this morning. Going to the IMC autolocker log, it looks like the last timestamp was at ~6pm yesterday. Megatron was responding to ping, but I couldn't ssh into it. So I went over to the machine and did a hard-reboot via front panel power switch. The computer took ~10mins to come back online and respond to ping. Once it did, I was able to ssh into it. However, trying the usual commands to restart the IMC autolocker and FSS Slow loops didn't work. Specifically, monitoring the logfile with tail -f Autolocker.log, I would see that the autolocker seemed to get stuck after starting the "blinky" script. Trying to restart the process using sudo initctl restart MCautolocker, init would print to shell that the restart had worked, and reported the PID, but the logfile wouldn't update "live" as it should when tail is used with the -f option. All very strange .
Anyways, as a last resort, I kill -9'ed the PID for the init instance, and init automatically restarted the Autolocker - this did the trick, IMC is locked now and logfile seems to be getting updated normally.
I also cleared a bunch of matlab crash dump files in the home directory.
Koji suggested looking at the output of the AS55 demod board on a fast oscilloscope to look for differences in the two channel outputs (if there is some high-frequency oscillations, for example, we could miss this information in the SR785 spectra). Besides, I was only looking at spectra out to a few kHz on the SR785. I grabbed this data with a 300MHz BW Tektronix oscilloscope (battery mode) today. Input impedance of both channels were set to 1Mohm, and the measurement was made with the RFPD input terminated, output of the daughter board is what is measured. The vertical scaling of the channels was set to the minimum allowed, 1mV/div.
Attachment #1 shows that there is indeed a visible difference between the two channels - the (noisier) I channel has a much larger DC offset of ~5mV compared to the Q channel (I tried switching channels on the O'scope and the larger DC offset remained on the I channel, so seems real). There is also some kind of oscillation going on in the I channel, although the frequency is pretty low, with the peaks spaced ~50us apart. Indeed, in the ASD of the acquired data, the excess power in the I channel at 20kHz and higher harmonics are evident (see Attachment #2). Anyway all of this points to something being anomalous on the daughter board I channel signal path - I will pull it out and monitor the outputs at various points along the signal path with the fast scope to see if I can narrow down what's going on where.
We took a closer look at the AS55 demod board today. The procedure was to just be as thorough as possible, and check the behaviour of the circuit (both Transfer Function and Noise) stage by stage. Checking the transfer function was the key.
During this process, we found that the reason why the Q channels had lower noise than the I channels was because of the gain of the AD829 stage of the circuit was 0dB rather than 4dB (which is what it should be according to the component values used). Specifically, resistor R12, which is supposed to be 1.30kohm, was measured to be 1.03kohm. Replacing this resistor, the transfer functions (see Attachment #1) and noise levels (see Attachment #2) match the expectations from LISO. Some notes:
Assming the whitened ADC noise level is much below this (should only be ~10nV/rtHz), and given the measured sensing element of 6.2e8 V/m, this means that the dark noise sets a maximum achievable sensitivity of 2e-16m/rtHz.
To figure out what (if anything) is to be done next, we need to first figure out what is the goal. In the end, we care about DARM and not MICH. The optical gain for the former is ~300x the latter, so the dark noise contribution gets scaled by this factor (giving us a number of 7e-19 m/rtHz). There are certainly many noises above that level which have to be handled first. Indeed, looking at the DARM spectrum from DRFPMI lock back in March 2016, it looks like the current 1f DRMI (with coils de-whitened) Michelson sensitivity is within a factor of 2 of DARM in the full lock (albeit with vertex DoFs on 3f signals, and no coil de-whitening). Koji pointed out that we need to consider the photodiode resonant circuit itself too.
TODO: Upload all this onto the DCC
While working on the IFO tonight, I noticed that the blinky status lights on c1iscex and c1iscey were frozen (but those on the other 3 FEs seemed fine). But all other lights on the CDS overview screen were green I couldn't access testpoints from these machines, and the EPICS readbacks for models on these FEs (e.g. Oplev servo inputs outputs etc) were frozen at some fixed value. This lasted for a good 5 minutes at least. But the blinky lights started blinking again without me doing anything. Not sure what to make of this. I am also not sure how to diagnose this problem, as trending the slow EPICS records of the CPU execution cycle time (for example) doesn't show any irregularity.
I was looking at the ASDC channel on dataviewer, and toggling various settings like whitening gain. At some point, the signal just froze. So I quit dataviewer and tried restarting it, at which point it complained about not being able to connect to FB. This is when I brought up the CDS_OVERVIEW medm screen, and noticed the frozen 1pps indicator lights. There was certainly something going on with the end FEs, because I was able to ping the machine, but not ssh into it. Once the 1pps lights came back, I was able to ssh into c1iscex and c1iscey, no problems.
Could it be that some of the mx processes stalled, but the systemctl routine automatically restarted them after some time?
So this wasn't just an EPICS freeze? I don't see how this had anything to do with any of the work I did earlier today. I didn't modify any of the running front ends, didn't touch either of the end station machines or the DAQ, and didn't modify the network in any way. I didn't leave anything running.
If you couldn't access test points then it sounds like it was more than just EPICS. It sounds like maybe the end machines somehow fell of the network momentarily. Was there anything else going on at the time?
The signal path for the ASDC signal is AS55 PD --> D990543 (interface board) --> D990694 (whitening board) --> D000076 (AA board) --> ADC Ch 31. Everything in this signal chain should be able to handle signals in the range +/- 10V, which should correspond to the full range of our +/-10V, 16bit ADCs. But the ASDC signal seems to saturate at ~2000 counts (i.e. turning up the analog whitening gain doesn't make the signal get any bigger than this). I investigated this a little more today.
So the problem lies somewhere downstream of the D990694. There are other anomalous behaviours of this channel - e.g. engaging the analog whitening filters changes the DC offset of the signal. I am going to pull out this board to check it out.
Why does this matter? I want to calibrate the ASDC level (and eventually the other PD DC signals as well) into Watts. This is useful for IFO diagnostics, noise budgeting the shot noise level etc.
According to the AS55 schematic, the DC transimpedance is 66.7 ohms. I claim that the DC power on the AS55 photodiode during a DRMI (no arms) lock is ~1mW. The C30642 photodiode (InGaAs) responsivity is ~0.8 A/W. So I'd expect ~50mV to be the signal level into the ADC (assuming gain of all the other electronics in the signal chain at the start of this elog is unity). This corresponds to ~163 counts (since the ADC conversion factor is 2^16 counts over 20volts). The DC signal level I observed is ~200 counts. So things seem roughly consistent.
*Note: Despite my above statement, I don't think it is true that the AS110 PD has more light on it - the BS splitting the light between
AS55 and AS110 PDs is a 50-50 BS, and using the crude method of putting an Ophir power meter in front of both PDs and
monitoring the power while the Michelson was swinging around freely showed roughly the same maximum value.
Last night, I collected ~30mins of data for the vertex seismometer channels and the POP QPD PIT/YAW signals with the PRMI locked on carrier (angular FF OFF). The ITM Oplev loops weren't DC coupled, as they are in the full IFO locking sequence, but I feel like the angular FF filters can be improved - there are frequent sharp dives in the AS110 signal level which are correlated with large amplitude motion of the POP spot on the control room CCD monitor.
Repeating the frequency domain multicoherence analysis using BS_X and BS_Y seismometer channels as witnesses suggest that we can win significantly (see Attachment #1).
I've never really implemented feedforward filters - I was planning on using ericq's latest entry on this subject as a guide. From what I gather, the procedure is as follows:
Some notes from Rana from some years ago: https://nodus.ligo.caltech.edu:8081/40m/11519
If anyone has pointers / other considerations I should take into account, please post here.
This happened again just now - it was roughly this time when this happened last night as well.
There was certainly an EPICS freeze of the kind we were used to seeing prior to replacing the martian wireless router sometime in late 2015 (or early 2016?). I was trying to run the dither alignment servos on the Y-arm at this time, and all the StripTool traces flatlined.
I took the opportunity to try accessing testpoints from the iscey ADCs - specifically C1:SUS-TRY_OUT, and it seemed to work just fine. However, I couldn't ssh into c1iscey.
Looking at the dmesg once I was able to ssh in eventually (~2 minutes deadtime tonight, I feel like it was longer yesterday but can't quantify), I see the following: not sure if there are any clues in here, or whether this is the correct log to check. But there are many instances of the nfs server related message in the log. Note that the system time-stamp corresponds to when this freeze happened.
[5461308.784018] nfs: server 192.168.113.201 not responding, still trying
[5461412.936284] nfs: server 192.168.113.201 OK
[5461412.937130] systemd: Starting Journal Service...
[5461412.947947] systemd-journald: Received SIGTERM from PID 1 (systemd).
[5461412.996063] systemd: Unit systemd-journald.service entered failed state.
[5461413.002627] systemd: systemd-journald.service has no holdoff time, scheduling restart.
[5461413.008983] systemd: Stopping Journal Service...
[5461413.014664] systemd: Starting Journal Service...
[5461413.044262] systemd: Started Journal Service.
[5461413.694838] systemd-journald: Received request to flush runtime journal from PID 1
[steve, jamie, gautam]
The machine that now serves as out Frame Builder, FB1, was sitting on top of megatron. I decided that this wasn't ideal, and asked Steve to get some alternative mounting solution. Today, he procured some shelves to put FB1 on. Jamie suggested looking for the slider-rail that came with the machine, and using that instead, as it will allow us to slide FB1 out of the rack as we do megatron and the old FB. But as luck would have it, the distance between the rack vertical posts is 26 inches, but the rail is 27 inches. So we had to accept using the less ideal solution of putting FB1 on two shelves, with no sliding option. Photo to be uploaded shortly.
For this work, I had to shutdown FB1 for about 1 hour between 3pm and 4pm. It seems to have come back up fine now.
Alex is going to have an undergrad work on a calibration optimization project on the 40m RTCDS system. For this purpose, we wanted to setup a "Simulated DARM loop". Today, Alex and I set this up. I figured we can use the c1tst model for this purpose. We basically copied the topology from Figure 2 of the h(t) paper. Attached are screenshots of the MEDM screens of the system we setup, and the simulink block diagram - the main screen can be accessed from the "SIM PLANT" tab in the sitemp.
It remains to setup the appropriate filters in the filter banks, and an EPICS channel monitor for monitoring the single excitation testpoint in the model. We also did not set up any DQ channels for the time being, as it is not even clear to me what channels need to be DQ-ed.
None of the 3 dd backups I made were bootable - at boot, selecting the drive put me into grub rescue mode, which seemed to suggest that the /boot partition did not exist on the backed up disk, despite the fact that I was able to mount this partition on a booted computer. Perhaps for the same reason, but maybe not.
After going through various StackOverflow posts / blogs / other googling, I decided to try cloning the drives using ddrescue instead of dd.
This seems to have worked for nodus - I was able to boot to console on the machine called rosalba which was lying around under my desk. I deliberately did not have this machine connected to the martian network during the boot process for fear of some issues because of having multiple "nodus"-es on the network, so it complained a bit about starting the elog and other network related issues, but seems like we have a plug-and-play version of the nodus root filesystem now.
chiara and fb1 rootfs backups (made using ddrescue) are still not bootable - I'm working on it.
Nov 6 2017: I am now able to boot the chiara backup as well - although mysteriously, I cannot boot it from the machine called rosalba, but can boot it from ottavia. Anyways, seems like we have usable backups of the rootfs of nodus and chiara now. FB1 is still a no-go, working on it.
controls@fb1:~ 0$ sudo dd if=/dev/sda of=/dev/sdc bs=64K conv=noerror,sync
33554416+0 records in
33554416+0 records out
2199022206976 bytes (2.2 TB) copied, 55910.3 s, 39.3 MB/s
You have new mail in /var/mail/controls
Eurocrate key turning reboots today morning for c1psl and c1aux.c1auxex and c1auxey are also down but I didn't bother keying them for now. PSL FSS slow loop is now active again (its inactivity was what prompted me to check status of the slow machines).
Note that the EPCIS channels for PSL shutter are hosted on c1aux.But looks like the slow machine became unresponsive at some point during the weekend, so plotting the trend data for the PSL shutter channel would have you believe that the PSL shutter was open all the time. But the MC_REFL DC channel tells a different story - it suggests that the PSL shutter was closed at ~4AM on Sunday, presumably by the vacuum interlock system. I wonder:
Eurocrate key turning reboots today morning for and c1susaux, c1auxex and c1auxey. Usual precautions were taken to minimize risk of ITMX getting stuck.
The IFO hasn't been aligned in ~1week, so I recovered arm and PRM alignment by locking individual arms and also PRMI on carrier. I will try recovering DRMI locking in the evening.
As far as MC1 glitching is concerned, there hasn't been any major one (i.e. one in which MC1 is kicked by such a large amount that the autolocker can't lock the IMC) for the past 2 months - but the MC WFS offsets are an indication of when smaller glitches have taken place, and there were large DC offsets on the MC WFS servo outputs, which I offloaded to the DC MC suspension sliders using the MC WFS relief script.
I'd like for the save-restore routine that runs when the slow machines reboot to set the watchdog state default to OFF (currently, after a key-turning reboot, the watchdogs are enabled by default), but I'm not really sure how this whole system works. The relevant files seem to be in the directory /cvs/cds/caltech/target/c1susaux. There is a script in there called startup.cmd, which seems to be the initialization script that runs when the slow machine is rebooted. But looking at this file, it is not clear to me where the default values are loaded from? There are a few "saverestore" files in this directory as well:
Are the "default" channel values loaded from one of these?
Some days ago, I had tried to measure the SRCL->MICH and PRCL->MICH cross couplings using broadband noise injected between 120-180 Hz, a frequency band chosen arbitrarily, in hindsight, I could have done a more broadband test. I've spent some time including the infrastructure to calculate "White-Noise TFs" in the noise budgeting code, where a transfer function is estimated by injecting a "broadband" excitation into a channel of interest, and looking at the resulting response in MICH. I figured this would be useful to estimate other couplings as well, e.g. laser intensity nosie, oscillator noise etc.
I estimate the transfer function of the coupling using the relation (MICH is the median ASD of the MICH error signal in the below expression, and similarly for PRCL)
Attachments #1 and #2 show the spectra of the MICH, PRCL and SRCL signals during 'quiet' times and during the injection, while Attachment #3 shows the calculated coupling TFs using the above relation. These are significantly different (more than 10dB lower) than the numbers I reported in elog 13367, where the measurement was made using swept sine. As can be seen in the attached plots, the injected broadband excitation is visible above the nominal noise level, and I calculated the white noise TFs using ~5mins of data which should be plenty, so I'm not sure atm what to make of the answers from swept-sine and broadband injections being so different.
Attachment #4 shows the noise budget from the October 8 DRMI lock with the updated SRCL->MICH and PRCL->MICH couplings (assumed flat, extrapolated from Attachment #2 in the 120-180Hz band). If these updated coupling numbers are to be believed, then there is still some unexplained noise around 100Hz before we hit the PD dark noise. To be investigated. But if Attachment #4 is to be believed, it is not surprising that there isn't significant coherence between SRCL/PRCL and MICH around 100Hz.
Nov 8 1600: Updating NB to inculde estimated Oplev A2L.
I hadn't re-locked the DRMI after the work on the AS55 demod board. Tonight, I was able to recover the DRMI locking with the old settings.
The feature in the PRCL spectrum (uncalibrated, y-axis is cts/rtHz) at ~1.6kHz is mysterious, I wonder what that's about.
Wasted some time tonight futzing around with various settings because I couldn't catch a DRMI lock, thinking I may have to re-tune demod phases etc given that I've been mucking around the LSC rack a fair bit. But fortunately, the problem turned out to be that the correct feedforward filters were not enabled in the angular feedforward path (seems like these are not SDF monitored). Clue was that there was more angular motion of the POP spot on the CCD than I'm used to seeing, even in the PRMI carrier lock.
After fixing this, lock was acquired within seconds, and the locks are as robust as I remember them - I just broke one after ~20mins locked because I went into the lab. I've been putting off looking at this angular feedforward stuff and trying out some ideas rana suggested, seems like it can be really useful.
As part of the pre-lock work, I dither aligned arms, and then ran the PRCL/MICH dithers as well, following which I re-centered the ITM, PRM and BS Oplevspots onto their respective QPDs - they have not been centered for a couple of months now.
I'm now going to try and measure some other couplings like PSL RIN->MICH, Marconi phase noise->MICH etc.
I tried measuring the coupling of PSL intensity noise by driving some broadband noise bandpassed between 80-300Hz using the spare DAC channel at 1Y3 that I had set up for this purpose a couple of weeks ago (via a battery powered SR560 buffer set to low-noise operation mode because I'm not sure if the DAC output can drive a ~20m long cable). I was monitoring the MC2 TRANS QPD Sum channel spectrum while driving this broadband noise - the "nominal" spectrum isn't very clean, there are a bunch of notches from a 60Hz comb and a forest of peaks over a broad hump from 300Hz-1kHz (see Attachment #1).
I was able to increase the drive to the AOM till the RIN in the band being driven increased by ~10x, and saw no change in the MICH error signal spectrum [see Attachment #1] - during this measurement, the RFPD whitening was turned on for REFL11, REFL55 and AS55, and the ITM coil drivers were de-whitened, so as to get a MICH spectrum that is about as "low-noise" as I've gotten it so far.
I tried increasing the drive further, but at this point, started seeing frequent MC locklosses - I'm not convinced this is entirely correlated to my AOM activities, so I will try some more, but at the very least, this places an upper bound on the coupling from intensity noise to MICH.
The Oplev trace is missing for now, as I have not re-measured the A2L coupling since modifying the Oplev loop shape (specifically the low pass filter and overall gain) to allow engageing the coil de-whitening.
The averaging for the white noise TFs plotted is computed using median averaging - I have used a python transcription of Sujan's matlab code. I use scipy.signal.spectrogram to compute the fft bins (I've set some defaults like 8s fft length and a tukey window), and then take the median average using np.median(). I've also incorporated the ln(2) correction factor.
It seems like GwPy has some in-built capability to compute median (and indeed various other percentile) averages, but since we aren't using it, I just coded this up.
why no oplev trace in the NB ?
also, this method would work better if we had a median averaging python PSD instead of mean averaging as in Welch's method.
We've been talking about increasing the series resistance for the coil driver path for the test masses. One consequence of this will be that we have reduced actuation range.
This may not be a big deal since for almost all of the LSC loops, we currently operate with a limiter on the output of the control filter bank. The value of the limit varies, but to get an idea of what sort of "threshold" velocities we are looking at, I calculated this for our Finesse 400 arm cavities. The calculation is rather simplistic (see Attachment #1), but I think we can still draw some useful conclusions from it:
So, from this rough calculation, it seems like we would lose ~25% efficiency in locking the arm cavity if we up the series resistance from 400ohm to 1kohm. Doesn't seem like a big deal, becuase currently, the single arm locking