A bit more digging on the diagnostics page of the RAID array reveals that the two power supplies actually failed on Jun 2 2017 at 10:21:00. Not surprisingly, this was the date and approximate time of the last major power glitch we experienced. Apart from this, the only other error listed on the diagnostics page is "Reading Error" on "IDE CHANNEL 2", but these errors precede the power supply failure.
Perhaps the power supplies are not really damaged, and are just in some funky state since the power glitch. After discussing with Jamie, I think it should be safe to power cycle the Jetstor RAID array once the FB machine has been powered down. Perhaps this will bring back one or both of the faulty power supplies. If not, we may have to get new ones.
The problem with FB may or may not be related to the state of the Jetstor RAID array. It is unclear to me at what point during the boot process we are getting stuck. It may be that because the RAID disk is in some funky state, the boot process is getting disrupted.
I am unable to get FB to reboot to a working state. A hard reboot throws it into a loop of "Media Test Failure. Check Cable".
Jetstor RAID array is complaining about some power issues: the LCD display on the front reads "H/W Monitor", with the lower line cycling through "Power#1 Failed", "Power#2 Failed", and "UPS error". Going to 192.168.113.119 on a martian machine browser and looking at the "Hardware information" confirms that System Power #1 and #2 are "Failed", and that the UPS status is "AC power loss". So far I've been unable to find anything on the elog about how to handle this problem; I'll keep looking.
In fact, looks like this sort of problem has happened in the past. It seems one power supply failed back then, but now somehow two are down (but there is a third which is why the unit functions at all). The linked elog thread strongly advises against any sort of power cycling.
Over the day, I have been working on a C++ program to interface with Pylon to capture images and reduce dependence on the Pylon GUI. The program uses the Pylon header files along with opencv headers. While ultimately a wrapper in python may be developed for the program, the current C++ program at,
/users/jigyasa/GigEcode/Grab/Grab.cpp when compiled as
g++ -Wl,--enable-new-dtags -Wl,-rpath,/opt/pylon5/lib64 -o Grab Grab.o -L/opt/pylon5/lib64 -Wl,-E -lpylonbase -lpylonutility -lGenApi_gcc_v3_0_Basler_pylon_v5_0 -lGCBase_gcc_v3_0_Basler_pylon_v5_0 `pkg-config opencv --cflags --libs`
returns an executable file named Grab which can be executed as ./Grab
This captures one image from the camera and displays it; it also displays the gray value of the first pixel.
I am working on adding more utility to the program such as manually adjusting exposure, gain and also on the python wrapper (Cython has been installed locally on Ottavia for the purpose)!
Attachment #1: State of CDS overview screen as of 9:30 AM this morning when I came in.
Looks like there may have been a power glitch, although judging by the wall StripTool traces, if there was one, it happened more than 8 hours ago. FB is down atm so can't trend to find out when this happened.
All FEs and FB are unreachable from the control room workstations, but Megatron, Optimus and Chiara are all ssh-able. The latter reports an uptime of 704 days, so all seems okay with its UPS. Slow machines are all responding to ping as well as telnet.
Recovery process to begin now. Hopefully it isn't as complicated as the most recent effort [FAMOUS LAST WORDS]
Indeed, the whole point of the high/low gain setup is to never use the QPDs for the single arm work. Only use the high gain Thorlabs PD and then the switchover code uses the QPD once the arm powers are >5.
I don't know how the operation procedure went so higgledy piggledy.
About 2 weeks ago, I noticed some odd behaviour of the LSC TRY data stream. Its DC value seems to be drifting ~10x more than TRX. Both signals come from the transmission QPDs. At the time, we were dealing with various CDS FE issues but things have been stable on that end for the last two weeks, so I looked into this a bit more today. It seems like one particular channel is bad - Quadrant 4 of the ETMY TRANS QPD. Furthermore, there is a bump around 150Hz, and some features above 2kHz, that are only present for the ETMY channels and not the ETMX ones.
Since these spectra were taken with the PSL shutter closed and all the lab room lights off, it would suggest something is wrong in the electronics - to be investigated.
The drift in TRY can be as large as 0.3 (with 1.0 being the transmitted power in the single arm lock). This seems unusually large, indeed we trigger the arm LSC loops when TRY > 0.3. Attachment #2 shows the second trend of the TRX and TRY 16Hz EPICS channels for 1 day. In the last 12 hours or so, I had left the LSC master switch OFF, but the large drift of the DC value of TRY is clearly visible.
In the short term, we can use the high-gain THORLABS PD for TRY monitoring.
i wonder how 'HDR' these images really are. is there a quantitative way to check that we are really getting more bits? also, how many bits does the PNG format allow for monochrome images? i worry that these elog images are already lossy.
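On the PNG question: the format allows grayscale images with bit depths of 1, 2, 4, 8 or 16, so whether an elog image carries more than 8 bits can be read straight from its header. Below is a minimal sketch of such a check, not the script used for these images. The function names are made up for illustration, and `used_bits` is only a crude measure (it counts distinct gray levels, which says nothing about noise).

```python
import math
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_bit_depth(data):
    """Return (bit_depth, color_type) read from the IHDR chunk of PNG bytes.

    Only the first 26 bytes of the file are needed.
    color_type 0 means grayscale.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-15 hold the length and type of the first chunk, which must be IHDR.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    # IHDR starts with width, height, bit depth, color type.
    _width, _height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return bit_depth, color_type

def used_bits(pixels):
    """Crude estimate: bits needed to index the distinct gray levels present."""
    levels = len(set(pixels))
    return math.ceil(math.log2(levels)) if levels > 1 else 0
```

Usage would be something like `png_bit_depth(open("image.png", "rb").read(26))`, where `image.png` is a placeholder file name.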
I captured a few images of the beam spot on ETMX at 5ms, 10ms, 14ms, 50ms, 100ms, 500ms, 1000ms exposure and ran them through my python script for HDR images. Here's what I obtained.
The resulting image is an improvement over the highly saturated images at say, 500ms and 1 second exposures.
Additionally, I also included a colormapped version of the image.
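For readers who want to see what merging exposures involves, here is a minimal sketch of one common approach. It is not necessarily what the script used for these images does. It assumes a linear sensor response and 8-bit frames, and it weights each pixel with a simple "hat" function so that clipped (very dark or saturated) pixels contribute little. All names are illustrative.

```python
def hat_weight(z, z_max=255):
    """Weight that is zero at the clipping limits and largest mid-range."""
    return min(z, z_max - z)

def merge_exposures(frames, z_max=255):
    """Estimate relative radiance per pixel from several exposures.

    frames: list of (exposure_time_s, pixels), all pixel lists equal length.
    Assumes pixel value is proportional to radiance * exposure time.
    """
    n_pixels = len(frames[0][1])
    radiance = []
    for i in range(n_pixels):
        num = 0.0
        den = 0.0
        for t, pixels in frames:
            w = hat_weight(pixels[i], z_max)
            num += w * pixels[i] / t
            den += w
        if den > 0:
            radiance.append(num / den)
        else:
            # Every frame is clipped at this pixel: fall back to the
            # shortest exposure, which is the least likely to be saturated.
            t, pixels = min(frames, key=lambda f: f[0])
            radiance.append(pixels[i] / t)
    return radiance
```

With the exposures listed above, `frames` would hold seven entries with times from 0.005 s to 1 s.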
I've been making NBs on my laptop, thought I would get the copy under version control up-to-date since I've been negligent in doing so.
The code resides in /ligo/svncommon/NoiseBudget, which as a whole is a git directory. For neatness, most of Evan's original code has been put into the sub-directory /ligo/svncommon/NoiseBudget/H1NB/, while my 40m NB specific adaptations of them are in the sub-directory /ligo/svncommon/NoiseBudget/NB40. So to make a 40m noise budget, you would have to clone and edit the parameter file accordingly, and run python C1NB.py C1NB_2017_04_30.py for example. I've tested that it works in its current form. I had to install a font package in order to make the code run (with sudo apt-get install tex-gyre ), and also had to comment out calls to GwPy (it kept throwing up an error related to the package "lal", I opted against trying to debug this problem as I am using nds2 instead of GwPy to get the time series data anyways).
There are a few things I'd like to implement in the NB like sub-budgets, I will make a tagged commit once it is in a slightly neater state. But the existing infrastructure should allow making of NBs from the control room workstations now.
We spent some time trying to get the noise-budgeting code running today. I guess eventually we want this to be usable on the workstations so we cloned the git repo into /ligo/svncommon. The main objective was to see if we had all the dependencies for getting this code running already installed. The way Evan has set the code up is with a bunch of dictionaries for each of the noise curves we are interested in - so we just commented out everything that required real IFO data. We also commented out all the gwpy stuff, since (if I remember right) we want to be using nds2 to get the data.
Running the code with just the gwinc curves produces the plots it is supposed to, so it looks like we have all the dependencies required. It now remains to integrate actual IFO data; I will try and set up the infrastructure for this using the archived frame data from the 2016 DRFPMI locks.
Reboots for c1susaux, c1iscaux today.
The liquid nitrogen container has a pressure relief valve set to 35 PSI. This valve will open periodically when the container holds LN2.
The exiting gas is very cold and can cause burns, so it should not directly hit your eyes or skin. Point this valve into the corner.
Leave entry door open so nitrogen concentration can not build up.
Basically we use the arm cavities as the reference of the beam alignment. The incident beam is aligned such that the ITMY angle dither is minimized (at least at the dither freq).
This means that we have no capability to adjust the spot positions on the PRM, SRM, BS, ITMX optics.
We are still able to minimize A2L by adding intentional asymmetry to the coil actuators.
I am not attempting a full characterization tonight, but the important changes since the May locks are in the de-whitening boards and coil driver boards. I did not attempt to engage the coil-dewhitening, but the PD whitening works fine.
As a quick check, I tested the hypothesis that the BS OL loop A2L coupling dominates between ~10-50Hz. The attached control signal spectra [Attachment #2] support this hypothesis. Now to actually change the loop shape.
I've centered Oplevs of all vertex optics, and also the beams on the REFL and AS PDs. The ITMs and BS have been repeatedly aligned since re-installing their respective coil driver electronics, but the SRM alignment needed some adjustment of the bias sliders.
Full characterization to follow. Some things to check:
Lesson learnt: Don't try and change too many things at once!
GV July 5 1130am: Looks like the MICH loop gain wasn't set correctly when I took the attached spectra, seems like the bump around 300Hz was caused by this. On later locks, this feature wasn't present.
All thanks to Steve, we cleaned the view port on the ETMX on which the camera is installed, and with a little fine tuning of the focus of the camera, here's a really good image of the beam spot at 6 and 14 ms.
I'm going to go squish cables and the usual sat. box voodoo, hopefully that settles it.
I attempted to re-lock the DRMI and try and realize some of the noise improvements we have identified. Summary elog, details to follow.
Basically after this point, I was unable to repeat stuff I did earlier in the evening just a couple of hours ago. The single arm locks catch quickly, and seem stable over the hour timescale, but when I run the X arm dither, the BS PITCH loop starts to oscillate at ~0.1 Hz. Moreover, I am unable to acquire PRMI carrier lock. I must have changed a setting somewhere that I am not catching right now (although I've scripted most of these things for repeatability, so I am at a loss what I'm missing ). The only change I can think of is that I changed the BS Oplev loop shape. But I went back into the filter file archives and restored these to their original configuration. Hopefully I'll have better luck figuring this out tomorrow.
Also the GigE has been wired and connected to the Martian. Image acquisition is possible with Pylon.
The script is being executed again, now.
I worked on the code today and have left a script (MC2rerun.py) running on Ottavia which should run overnight.
In continuation of my previous posts, I have been working on evaluating the transfer function data. Recently, I have calculated the correlation values between the real and imaginary parts of the transfer function. I have also written code for plotting the transfer function data stream at each frequency in the Argand plane, just for reference. In addition, I have done a few calculations and found the errors in magnitude and phase using those in the real and imaginary parts of the transfer function. More details on the process are in this git repository.
The following attachments have been added:
Seeing the correlation values, it seems reasonable that the approximation of Gaussian-distributed real and imaginary parts actually holds. This is because the correlation values are mostly quite small. This can be seen by studying the distribution of the transfer function on the Argand plane. The entire distribution can be seen to be somewhat, if not entirely, circular. Even when the ellipticity of the curve seems to be high, the ellipse is still oriented along the real and imaginary axes, i.e., the correlation between them is still low.
In order to test the above again, with an even larger data set, I am leaving a script running on Ottavia. It should take more than just the night (I estimate around 10-11 hours) if there are no problems.
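The correlation referred to here can be computed per frequency bin as a Pearson coefficient between the real and imaginary parts of the repeated measurements. The following is a minimal standard-library sketch under that assumption. It is not the code from the repository, and the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def real_imag_correlation(samples):
    """Correlation between Re and Im of repeated complex measurements
    taken at one frequency."""
    return pearson_r([z.real for z in samples], [z.imag for z in samples])
```

A value near zero corresponds to a distribution on the Argand plane that is circular or elongated along one of the axes. A value near +1 or -1 corresponds to an ellipse tilted with respect to the axes.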
There were a few more flaky things in the Expansion chassis - the IDE connectors don't have "keys" that fix the orientation they should go in, and the whole timing card assembly is kind of difficult and not exactly secure. But for now, things are back to normal it seems.
The values generated from the script were analyzed and a 3D scatter plot in addition to a 2D map were plotted.
Yesterday, Rana pointed me to another method of collecting and analyzing the data. So I worked on the code today and have left a script (MC2rerun.py) running on Ottavia which should run overnight.
The script didn't run properly last night, due to an oversight of variable names! It's been started again and has been running for half an hour now.
The 50mm lens has arrived. (Delivered yesterday).
GigE can be connected to ethernet. AR coated 1064 f50 can arrive any day now.
To re-cap, every time I tried to do this in the last month or so, the optic would get kicked around. I suspected that the main cause was the insufficient low-pass filtering on the Oplev loops, which was causing the DAC rms to rail when the whitening was turned on.
I had tried some loop-tweaking by hand of the OL loops without much success last week - today I had a little more success. The existing OL loops are comprised of the following:
The elliptic low pass was too shallow. For a first pass at loop shaping today, I checked if the resonant gain filter had any effect on the transmitted power RMS profile - turns out it had negligible effect. So I disabled this filter, replaced the elliptic low pass with a 5th order ELP with 2dB passband ripple and 80dB stopband attenuation. I also adjusted the overall loop gain to have an upper UGF for the OL loops around 2Hz. Looking at the spectrum of one coil output in this configuration (ITMY UL), I determined that the DAC rms was no longer in danger of railing.
However, I was still unable to smoothly engage the de-whitening. The optic again kept getting kicked around each time I tried. So I tried engaging the de-whitening on the ITM with just the local damping loop on, but with the arm locked. This transition was successful, but not smooth. Looking at the transmon spot on the camera, every time I engage the whitening, the spot gets a sizeable kick (I will post a video shortly). In my ~10 trials this afternoon, the arm is able to stay locked when turning the whitening on, but always loses lock when turning the whitening off.
The issue here is certainly not the DAC rms railing. I had a brief discussion with Gabriele just now about this, and he suggested checking for some electronic voltage offset between the two paths (de-whitening engaged and bypassed). I also wonder if this has something to do with some latency between the actual analog switching of paths (done by a slow machine) and the fast computation by the real time model? To be investigated.
GV 170628 11pm: I guess this isn't a viable explanation as the de-whitening switching is handled by one of the BIO cards, which is also handled by the fast FEs, so there isn't any question of latency.
With the Oplev loops disengaged, the initial kick given to the optic when engaging the whitening settles down in about a second. Once the ITM was stable again, I was able to turn on both Oplev loops without any problems. I did not investigate the new Oplev loop shape in detail, but compared to the original loop shape, there wasn't a significant difference in the TRY spectrum in this configuration (plot to follow). This remains to be done in a systematic manner.
Plots to support all of this to follow later in the evening.
Attachment #1: Video of ETMY transmission CCD while engaging whitening. I confirmed that this "glitch" happens while engaging the whitening on the UL channel. This is reminiscent of the Satellite Box glitches seen recently. In that case, the problem was resolved by replacing the high-current buffer in the offending channel. Perhaps something similar is the problem here?
Attachment #2: Summary of the ITMY UL coil output spectra under various conditions.
I tried a couple of things, but no fundamental improvement of the missing LED light on the timing board.
- The power supply cable to the timing board at c1iscex indicated +12.3V
- I swapped the timing fiber to the new one (orange) in the digital cabinet. It didn't help.
- I swapped the opto-electronic I/F for the timing fiber with the Y-end one. The X-end one worked at Y-end, and Y-end one didn't work at X-end.
- I suspected the timing board itself -> I brought a "spare" timing board from the digital cabinet and tried to swap the board. This didn't help.
- Bring the X-end fiber to C1SUS or C1IOO to see if the fiber is OK or not.
- We checked the opto-electronic I/F is OK
- Try to swap the IO chassis with the Y-end one.
- If this helps, swap the timing board only to see this is the problem or not.
I have spent my first few days as a SURF getting experience working with the Network/Spectrum Analyzer (AG 4395A). After an introduction to the 40m by Koji, I was tasked with using the AG4395A to measure the transfer function of several filters (for example, Mini-Circuits Low Pass Filter SLP-30). I am now familiar with configuring the AG 4395A, taking a single set of data using a command from one of the control computers, and plotting the dataset as a Bode plot (separate plots for magnitude and phase) using Python.
To experiment with plotting multiple datasets on a single Bode plot, I used a single dataset from the Network Analyzer using the SLP-30 filter and added random noise to create ten datasets to plot. I am attaching the resulting Bode plot, which has the ten generated sets of data plotted along with their average.
We discussed with Rana and Koji how to interpret this type of dataset from the Network Analyzer. Instead of considering the magnitude and phase as separate quantities, we should consider them together as a single complex number in the form H(f) = M exp(iπP/180), where M is the magnitude and P is the phase in degrees. We can then find the average value of the measured quantity in its complex number form (x + iy), as opposed to just taking the average of the magnitude and phase separately.
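To make the difference concrete, here is a small sketch of averaging in the complex plane using H(f) = M exp(iπP/180). The function names are illustrative, and the inputs are plain lists of magnitudes (linear, not dB) and phases in degrees at a single frequency.

```python
import cmath
import math

def to_complex(mag, phase_deg):
    """Build H = M * exp(i * pi * P / 180) from magnitude and phase in degrees."""
    return cmath.rect(mag, math.radians(phase_deg))

def complex_mean(mags, phases_deg):
    """Average repeated measurements as complex numbers.

    Returns (magnitude, phase in degrees) of the mean.
    """
    zs = [to_complex(m, p) for m, p in zip(mags, phases_deg)]
    mean = sum(zs) / len(zs)
    return abs(mean), math.degrees(cmath.phase(mean))
```

The benefit shows up near a phase wrap. Two measurements with unit magnitude and phases of 179 and -179 degrees average to a phase of 0 degrees if the phases are averaged on their own. The complex mean instead has a phase of magnitude 180 degrees and a magnitude slightly below 1.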
I tried all versions of power cycling and debugging this problem known to me, including those suggested in this thread and from a more recent time. I am leaving things as is for the night, will look into this more tomorrow. I've also shut down the ETMX watchdog for the time being. Looks like this has been down since 24 Jun 8am UTC.
I am leaving a script running on Pianosa for the night. For this purpose, the AG4395A is also being left on. I'll look at the result of the script in the morning (it should be complete by then). Please check that it has finished before fiddling with the Analyzer.
I have written some code (basic and in need of a lot of improvement, but it does the job) for taking multiple measurements from the AG4395A. I have also written separate code for plotting the data taken by the previous code along with error bars up to 1 standard deviation.
Details on How To Operate AG4395A:
Brief Details on How the 'AGmeasure' command works:
AGmeasure is a python script developed by some of the people who work at 40m. It is set as a global command and can be used from within any directory. The source code is in the scripts folder on the network, or else it can also be found in Eric Quintero's git repository. This command accepts at the very least a parameter file. This is supposed to be a .yml file. A template (TFAG4395Atemplate.yml) can be found in the scripts folder or in Eric's repo. There are some other options that can be passed to this command, see the help for more details.
The Multi_Measurement Script:
This script calls the 'AGmeasure' command repeatedly and keeps storing the data files in a folder. Right now, the script needs to be fed the template file manually at the prompt.
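A minimal sketch of such a loop is given below. It is not the actual Multi_Measurement script. It relies only on the fact that AGmeasure takes a parameter file, and it assumes (this may not be true) that AGmeasure writes its data files into the current working directory. The `runner` argument is a hypothetical hook so the loop can be tried without the analyzer attached.

```python
import os
import subprocess

def run_measurements(template, n_runs, out_dir, runner=subprocess.run):
    """Call AGmeasure n_runs times with the same parameter file.

    Assumption: AGmeasure saves its output in the working directory,
    so each call is run with cwd=out_dir.
    """
    os.makedirs(out_dir, exist_ok=True)
    template_path = os.path.abspath(template)  # stays valid after the cwd change
    commands = []
    for _ in range(n_runs):
        cmd = ["AGmeasure", template_path]
        runner(cmd, cwd=out_dir, check=True)  # check=True stops on the first failure
        commands.append(cmd)
    return commands
```

For example, `run_measurements("TFAG4395Atemplate.yml", 10, "data")` would take ten sweeps, with `data` as a placeholder folder name.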
The Test_Plotting Script:
This script plots a set of data files obtained from the above-mentioned script and produces a plot along with error bands up to 1 standard deviation of the data. The format (names) and total number of text files need to be explicitly known, for now at least.
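The averaging step behind such a plot can be sketched as follows. This is a sketch and not the Test_Plotting script itself. It assumes every data file has been read into a list of values on the same frequency grid, and it uses the sample standard deviation for the band. The function name is illustrative.

```python
import statistics

def mean_and_band(datasets):
    """Point-by-point mean and 1-sigma band across repeated measurements.

    datasets: list of equal-length lists, one per measurement
    (at least two measurements are needed for a standard deviation).
    Returns (means, lower, upper).
    """
    means, lower, upper = [], [], []
    for values in zip(*datasets):  # one tuple per frequency point
        m = statistics.mean(values)
        s = statistics.stdev(values)
        means.append(m)
        lower.append(m - s)
        upper.append(m + s)
    return means, lower, upper
```

The three returned lists can be passed to any plotting routine, with `lower` and `upper` drawn as the error band around `means`.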
Update: Increased the font size in the plot. Added a few comments to the two scripts
To Do: Need to consider the transfer function as a single physical quantity (both the magnitude and phase) and then take the averages and calculate the standard deviation and then plot these results.
The attachment with the test files and the code now also contains a pdf with all the relations/equations I have used to calculate the averages and errors.
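For reference, standard first-order error propagation from the real and imaginary parts (x, y) to magnitude and phase can be sketched as below. These may or may not be the same relations as in the attached pdf. The function name is illustrative, and `sxy` is the covariance between x and y (zero if they are uncorrelated).

```python
import math

def mag_phase_errors(x, y, sx, sy, sxy=0.0):
    """First-order propagation of errors in H = x + iy.

    sx, sy: standard deviations of x and y; sxy: their covariance.
    Returns (M, sigma_M, phase_rad, sigma_phase_rad). Requires M > 0.
    """
    m2 = x * x + y * y
    m = math.sqrt(m2)
    # M = sqrt(x^2 + y^2): dM/dx = x/M, dM/dy = y/M
    var_m = (x * x * sx * sx + y * y * sy * sy + 2 * x * y * sxy) / m2
    # phase = atan2(y, x): dphi/dx = -y/M^2, dphi/dy = x/M^2
    var_p = (y * y * sx * sx + x * x * sy * sy - 2 * x * y * sxy) / (m2 * m2)
    # Guard against tiny negative values from rounding.
    return m, math.sqrt(max(var_m, 0.0)), math.atan2(y, x), math.sqrt(max(var_p, 0.0))
```

When sx = sy and the covariance is zero, this reduces to sigma_M = sx and sigma_phase = sx / M. The approximation is only good when the errors are small compared to M.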
I am starting it on Donatella and it should run for a couple of hours.
Apologies for the inconvenience.
A python script to randomly vary the MC2 pitch and yaw offset and correspondingly record the value of MC transmission has been started on Donatella in the control room and should run for a couple of hours overnight.
The script is named MC_TRANS_1.py and is located in my user directory at /users/jigyasa
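The logic of such a scan is roughly as sketched below. This is not the contents of MC_TRANS_1.py. The two callables `set_offsets` and `read_transmission` are hypothetical stand-ins for whatever writes the MC2 pitch/yaw offsets and reads the MC transmission, and the uniform distribution, averaging count and settle time are all assumptions.

```python
import random
import time

def scan_offsets(set_offsets, read_transmission, n_points, span,
                 n_avg=10, settle_s=1.0, seed=None):
    """Apply random pitch/yaw offsets and record the mean transmission.

    set_offsets(pitch, yaw) and read_transmission() are supplied by the
    caller (hypothetical hooks, e.g. wrappers around channel access).
    Offsets are drawn uniformly from [-span, span].
    Returns a list of (pitch, yaw, mean_transmission).
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n_points):
        pitch = rng.uniform(-span, span)
        yaw = rng.uniform(-span, span)
        set_offsets(pitch, yaw)
        time.sleep(settle_s)  # let the optic settle before reading
        trans = sum(read_transmission() for _ in range(n_avg)) / n_avg
        records.append((pitch, yaw, trans))
    return records
```

Passing a fixed `seed` makes the same set of offsets reproducible, which is handy if a later run needs to revisit the same points.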
One of the additional GigE cameras has been IP configured for use and installation.
Static IP assigned to the camera- 192.168.113.152
Subnet mask- 255.255.255.0
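As a quick sanity check on these settings, the standard `ipaddress` module can confirm which network the address and mask put the camera on, and whether another host shares it. The sketch below compares against 192.168.113.119, the RAID array address quoted earlier in these entries. The helper name is illustrative.

```python
import ipaddress

def same_subnet(ip, netmask, other_ip):
    """Return (network, True/False) for whether other_ip lies in ip's subnet."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return iface.network, ipaddress.ip_address(other_ip) in iface.network

# Camera settings from this entry, checked against the RAID array address.
network, shared = same_subnet("192.168.113.152", "255.255.255.0", "192.168.113.119")
```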
The previous run of the script had produced some dubious results!
The script has been modified and now scans the transmission sum for a longer duration to provide a better estimate on the average transmission. The pitch and yaw offsets have been set to the values that were randomly generated in the previous run as this would enable comparison with the current data.
The IRAF software from the National Optical Astronomy Observatory has been installed locally on Donatella (for testing) following the instructions listed here at http://www.astronomy.ohio-state.edu/~khan/iraf/iraf_step_by_step_installation_64bit
This is a step towards "aperture photometry" and would help identify point scatterers in the images of the test masses.
I will be testing this software, in particular, the use of DAOPHOT and if it seems to work out, we may install it on the shared directory.
Hope this isn't an inconvenience.
I just connected Ottavia to the Netgear box and it's working just fine. It'll remain switched on over the weekend.
Kaustubh and I are going to enable the ethernet connection to Ottavia and secure the wiring now.
Reboots for c1psl, c1iool0, c1iscaux today. MC autolocker log was complaining that the C1:IOO-MC_AUTOLOCK_BEAT EPICS channel did not exist, and running the usual slow machine check script revealed that these three machines required reboots. PMC was relocked, IMC Autolocker was restarted on Megatron and everything seems fine now.
Ottavia had been left running overnight and it seems to work fine. There has been no smell or any other noticeable problem. This morning Gautam, Kaustubh and I connected Ottavia to the Martian Network through the Netgear switch in the 40m lab area. We were able to SSH into Ottavia through Pianosa and access directories. On Ottavia itself we were able to run ipython and access the internet. Since it seems to work out fine, Kaustubh and I are going to enable the ethernet connection to Ottavia and secure the wiring now.
It has been working fine the whole day (we didn't do much testing on it though). We are leaving it on for the night.
Today, Jigyasa and I connected Ottavia to one of the unused monitor screens of Donatella. The Ottavia CPU had a label saying 'SMOKED'. One of the past elogs, 11091, from March 2015, by Jenne had an update regarding Ottavia smelling 'burny'. It has been working fine for about 2 hours now. Once it is connected to the Martian Network we can test it further. The Donatella screen we used seems to have a graphics problem, some damage to the display screen. It's a minor issue and does not affect the display that much, but perhaps it'll be better to use another screen if we plan to use Ottavia in the future. We will power it down if there is an issue with it.
Apologies for any inconvenience.
Data analysis will follow.
I tried playing around with the Oplev loop shape on ITMY, in order to see if I could successfully engage the Coil Driver whitening. Unfortunately, I had no success tonight.
I was trying to guess a loop shape that would work - I guess this will need some more careful thought about loop shape optimization. I was basically trying to keep all the existing filters, and modify the low-passing that minimizes control noise injection. Adding a 4th order elliptic low pass with corner at 50Hz and stopband attenuation of 60dB yielded a stable loop with upper UGF of ~6Hz and ~25deg of phase margin (which is on the low side). I was able to successfully engage this loop, and as seen in Attachment #1, the noise performance above 50Hz is vastly improved. But it also seems that there is some injection of noise around 6Hz. In any case, as soon as I tried to engage the dewhitening, the DAC output quickly saturated. The whitening filter for the ITMs has ~40dB of gain at ~40Hz already, so it looks like the high frequency roll-off has to be more severe.
I am not even sure if the Elliptic filter is the right choice here - it does have the steepest roll off for a given filter order, but I need to look up how to achieve good roll off without compromising on the phase margin of the overall loop. I am going to try and do the optimization in a more systematic way, and perhaps play around with some of the other filters' poles and zeros as well to get a stable controller that minimizes control noise injection everywhere.
Happy MC after last glitch at 10:28 so the credit goes to Rana
GV edit 11:30am: I think the stuff at 10:28 is not a glitch but just the WFS servos coming on - the IMC was only hand aligned before this.
It happened again. MC2 UL seems to have gotten the biggest glitch. It's a rather small jump in the signal level compared to what I have seen in the recent past in connection with suspect Satellite boxes, and LL and UR sensors barely see it.
I will squish Sat box cables and check the cabling at the coil driver board end as well, given that these are two areas where there has been some work recently. WFS loops will remain off till I figure this out. At least the (newly centered) DC spot positions on the WFS and MC2 TRANS QPD should serve as some kind of reference for good MC alignment.
GV edit 9pm: I tightened up all the cables, but it doesn't seem to have helped. There was another, larger glitch just now. UR and LL basically don't see it at all (see Attachment #2). It also seems to be a much slower process than the glitches seen on MC1, with the misalignment happening over a few seconds. I have to see if this is consistent with a glitch in the bias voltage to one of the coils which gets low passed by a 4xpole@1Hz filter.
Once we had the beam approximately centered for all of the above 3 PDs, we turned on the locking for IMC, and it seems to work just fine. We are waiting another hour before switching on the angular alignment for the mirrors, to make sure the alignment holds with WFS turned off.
wonder if it's possible that the slow glitches in MC are just glitches in MC2 trans QPD? Steve sometimes dances on top of the MC2 chamber when he adjusts the MC2 camera.
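One way to check the timescale is to compute the step response of the low pass directly. The sketch below assumes that "4xpole@1Hz" means four coincident real poles at 1 Hz with unity DC gain, which is an assumption about the hardware. For that case the unit step response has the closed form 1 - exp(-a)(1 + a + a^2/2 + a^3/6) with a = 2π f0 t.

```python
import math

def step_response_4pole(t, f0=1.0):
    """Unit step response of four coincident real poles at f0 (Hz), DC gain 1."""
    a = 2 * math.pi * f0 * t
    return 1.0 - math.exp(-a) * (1 + a + a ** 2 / 2 + a ** 3 / 6)

def settle_time(fraction=0.9, f0=1.0, dt=1e-3):
    """Time (s) for the step response to first reach the given fraction,
    found by stepping forward in increments of dt."""
    t = 0.0
    while step_response_4pole(t, f0) < fraction:
        t += dt
    return t
```

Under this assumption the response to an instantaneous bias step reaches 90% in roughly one second, which can be compared against the duration of the misalignment seen in the sensor signals.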
I've re-enabled the WFS at 22:25 (I think Gautam had them off as part of the MC2 glitch investigation). WFS1 spot position seems way off in pitch & yaw.
From the turn on transient, it seems that the cross-coupled loops have a time constant of ~3 minutes for the MC2 spot, so maybe that's not consistent with the ~30 second long steps seen earlier.
Reboots for c1susaux, c1iscaux, c1auxex today. I took this opportunity to squish the Sat. Box cabling for MC2 (both on the Sat. Box end and also the vacuum feedthrough), as some work has been recently ongoing there; maybe something got accidentally jiggled during the process and was causing MC2 alignment to jump around.
Relocked PMC to offload some of the DC offset, and re-aligned IMC after c1susaux reboot. PMC and IMC transmission back to nominal levels now. Let's see if MC2 is better behaved after this sat. box. voodoo.
Interestingly, since Feb 6, there were no slow machine reboots for almost 3 months, while there have been three reboots in the last three weeks. Not sure what (if anything) to make of that.
In order to switch on the angular alignment for the IMC mirrors, we needed to center the laser onto the quad-photodiodes at the IMC and the AS Table (WFS1 and WFS2).
Gautam and I went to the IMC table and did the DC centering for the quad-photodiode by varying the beamsplitter angles. After this, we turned the WFS loops off and performed beam centering for the Quad PDs at the AS Table, the WFS1 and WFS2.