40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 280 of 344  Not logged in ELOG logo
ID Date Author Typeup Category Subject
  13196   Fri Aug 11 17:36:47 2017 gautamUpdateSUSMC1 <--> MC3

About 30mins ago, I saw another glitch on MC1 - this happened while the Watchdog was shutdown.

In order to further narrow down the cause of the glitch, we switched the Coil Driver Board --> Satellite box DB(15?) connectors on the coil drivers between MC1 and MC3 coil driver boards. I also changed the static PIT/YAW bias voltages to MC1 and MC3 such that MC-REFL is now approximately back to the center of the CCD monitor.

 

Attachment 1: MC1_glitch_watchdog_shutdown.png
MC1_glitch_watchdog_shutdown.png
  13197   Fri Aug 11 18:53:35 2017 gautamUpdateCDSSlow EPICS channels -> Frames re-enabled
Quote:

Seems like something has failed after I did this - full frames are no longer on Aug 10 being written since ~2.30pm PDT. I found out when I tried to download some of the free-swinging MC1 data.

To clarify, I logged into fb1, and ran sudo systemctl restart daqd_*. The only change I made was to uncomment the line quoted below in the master file.

Looking at the log using systemctl, I see the following (I just tried restarting the daqd processes again):

Aug 11 00:00:31 fb1 daqd_fw[16149]: LDASUnexpected::unexpected: Caught unexpected exception      "This is a bug. Please log an LDAS problem report including this message.
Aug 11 00:00:31 fb1 daqd_fw[16149]: daqd_fw: LDASUnexpected.cc:131: static void LDASTools::Error::LDASUnexpected::unexpected(): Assertion `false' failed.
Aug 11 00:00:32 fb1 systemd[1]: daqd_fw.service: main process exited, code=killed, status=6/ABRT
Aug 11 00:00:32 fb1 systemd[1]: Unit daqd_fw.service entered failed state.
Aug 11 00:00:32 fb1 systemd[1]: daqd_fw.service holdoff time over, scheduling restart.
Aug 11 00:00:32 fb1 systemd[1]: Stopping Advanced LIGO RTS daqd frame writer...
Aug 11 00:00:32 fb1 systemd[1]: Starting Advanced LIGO RTS daqd frame writer...
Aug 11 00:00:32 fb1 systemd[1]: daqd_fw.service start request repeated too quickly, refusing to start.
Aug 11 00:00:32 fb1 systemd[1]: Failed to start Advanced LIGO RTS daqd frame writer.
Aug 11 00:00:32 fb1 systemd[1]: Unit daqd_fw.service entered failed state.

Oddly, I am able to access second trends for the same channels from the past which will be useful for the MC1 debugging). Not sure whats going on.


The live data grabbing using cdsutils still seems to be working though - so I've kicked MC1 again, and am grabbing 2 hours of data live on Pianosa.

So we tried this again with a fresh build of daqd_fw, and it still fails.  The error message is pointing to an underlying bug in the framecpp library ("LDASTools"), which may be tricky to solve.  I'm rustling the appropriate bushes...

  13198   Fri Aug 11 19:34:49 2017 JamieUpdateCDSCDS final bits status update

So it appears we now have full frames and second, minute, and minute_raw trends.

We are still not able to raise test points with daqd_rcv (e.g. the NDS1 server), which is why dataviewer and nds2-client can't get test points on their own.

We were not able to add the EDCU (EPICS client) channels without daqd_fw crashing.

We have a new kernel image that's supposed to solve the module unload instability issue.  In order to try it we'll need to restart the entire system, though, so I'll do that on Monday morning.

I've got the CDS guys investigating the test point and EDCU issues, but we won't get any action on that until next week.

Quote:

Remaining unresolved issues:

  • IFO needs to be fully locked to make sure ALL components of all models are working.
  • The remaining red status lights are from the "FB NET" diagnostics, which are reflecting a missing status bit from the front end processes due to the fact that they were compiled with an earlier RCG version (3.0.3) than the mx_streams were (3.3+/trunk).  There will be a new release of the RTS soon, at which point we'll compile everything from the same version, which should get us all green again.
  • The entire system has been fully modernized, to the target CDS reference OS (Debian jessie) and more recent RCG versions.  The management of the various RTS components, both on the front ends and on fb, have as much as possible been updated to use the modern management tools (e.g. systemd, udev, etc.).  These changes need to be documented.  In particular...
  • The fb daqd process has been split into three separate components, a configuration that mirrors what is done at the sites and appears to be more stable: The "target" directory for all of these components is now:
    • daqd_dc: data concentrator (receives data from front ends)
    • daqd_fw: receives frames from dc and writes out full frames and second/minute trends
    • daqd_rcv: NDS1 server (raises test points and receives archive data from frames from 'nds' process)
    The "target" directory for all of these new components is:
    • /opt/rtcds/caltech/c1/target/daqd
    All of these processes are now managed under systemd supervision on fb, meaning the daqd restart procedure has changed.  This needs to be simplified and clarified.
  • Second trend frames are being written, but for some reason they're not accessible over NDS.
  • Have not had a chance to verify minute trend and raw minute trend writing yet.  Needs to be confirmed.
  • Get wiper script working on new fb.
  • Front end RTS kernel will occaissionally crash when the RTS modules are unloaded.  Keith Thorne apparently has a kernel version with a different set of patches from Gerrit Kuhn that does not have this problem.  Keith's kernel needs to be packaged and installed in the front end diskless root.
  • The models accessing the dolphin shared memory will ALL crash when one of the front end hosts on the dolphin network goes away.  This results in a boot fest of all the dolphin-enabled hosts.  Need to figure out what's going on there.
  • The RCG settings snapshotting has changed significantly in later RCG versions.  We need to make sure that all burt backup type stuff is still working correctly.
  • Restoration of /frames from old fb SCSI RAID?
  • Backup of entirety of fb1, including fb1 root (/) and front end diskless root (/diskless)
  • Full documentation of rebuild procedure from Jamie's notes.
  13199   Sat Aug 12 14:09:36 2017 gautamUpdateSUSGlitches stay on MC1

Even in the switched state, the glitches stayed on MC1.

The coil driver electronics for MC1, upstream of the Satellite box, was what was previously MC3 electronics.

Attachment #1 shows that there were no glitches in MC3 sensor channels (which are now physically connected to what was previously MC1 coil driver electronics).

Attachment #2 shows the second trends for a 12 hour period for MC1 and MC3 sensor channels. The MC3 channels look well behaved, but there are frequent glitches (at least 9 in the last 12 hours indecision) visible in the MC1 channels.

So to recap:

  • We switched MC1 satellite box - but glitch stayed on MC1, so it would seem the Satellite box is not to blame.
  • We shutdown the watchdog and the glitches persisted.
  • We switched the coil driver electronics for MC1, but glitches remained on MC1, and MC3 doesn't show any evidence of glitching. This and the previous bullet point suggest the coil driver electronics are not to blame.
  • For the glitch posted in Attachment #1, I could see the MC-REFL spot moving around on the CCD monitors, so the glitches aren't just a feature in the shadow sensor readout. 

I need to confirm that the output of the coil driver board goes straight to the Sat. Box, but if there are no intermediate elements, the problem is either in the cable from coil driver to sat. box, or downstream of the Satellite box - i.e. vacuum feedthroughs or the suspension itself? The size of the glitches is roughly the same in all 4 face channels (~60-80cts pp).

Quote:

About 30mins ago, I saw another glitch on MC1 - this happened while the Watchdog was shutdown.

In order to further narrow down the cause of the glitch, we switched the Coil Driver Board --> Satellite box DB(15?) connectors on the coil drivers between MC1 and MC3 coil driver boards. I also changed the static PIT/YAW bias voltages to MC1 and MC3 such that MC-REFL is now approximately back to the center of the CCD monitor.

 


GV addendum 14 Aug 2017, 10.30am: Attachment #3 shows the second trend for the MC sensor channels over the weekend. While there were many on Saturday, it seems that Sunday was quieter.

Attachment 1: MC1_glitch.png
MC1_glitch.png
Attachment 2: MC_12hr_trend.png
MC_12hr_trend.png
Attachment 3: MC1_glitches_intermittent.png
MC1_glitches_intermittent.png
  13200   Sat Aug 12 22:35:22 2017 ranaUpdateSUSGlitches stay on MC1

To add to Gautam's entry: we swapped the cables at the coil driver side (these are the ones that go from coil driver to sat box). In this state, damping is not useable since the MC1 servos would drive MC3.

~70 counts in the sensor means ~70 microns of motion. Since the watchdogs are off and the coil drivers are swapped, this can't be laser beam getting in to the sensors.

WE have to consider that these are some real strain release type events happening in the suspension wire or wire standoff, so may require a vent to inspect and possible repair MC1.

  13201   Sun Aug 13 01:35:09 2017 KojiUpdateSUSGlitches stay on MC1

We used to have similar suspension excursion at ETMX. This was the motivation to replace the stand-offs from Al ones to ruby ones. Did the replacement solve the issue at ETMX?

  13202   Mon Aug 14 09:49:18 2017 KiraUpdatePEMtemperature sensor

Decided to try adding in an OP amp just to see if it would work. Added LT1012 and a 100k resistor to the circuit (I originally wanted to do AD743 as it seems to be the best choice according to Zach's elog here, but it said that they are very precious so I went with LT1012 for testing purposes). When heating it with a heating gun, the output voltage went down by a few 0.01V. The maximum voltage was 0.686V. Similar thing happened when I switched to a 10k resistor, where the maximum was 0.705V and it also went down by a few 0.01V upon heating.

I've attached a few pictures showing the circuit.

Attachment 1: IMG_20170814_092452.jpg
IMG_20170814_092452.jpg
Attachment 2: IMG_20170814_092513.jpg
IMG_20170814_092513.jpg
  13203   Mon Aug 14 12:52:33 2017 KiraUpdatePEMtemperature sensor

I didn't realize that the LT1012 needed an additional input to function. I added in +15V and -15V to pins 7 and 4, respectively and placed a 10k resistor and the numbers make more sense now. The voltage showed a negative value, but it became more negative as I heated it up (it's negative due to how a transimpedance amplifier works).

I have attached the new setup and the value it shows (~-3V). It became more negative by about 0.4V, which translates to about a 40K increase in temperature, which makes sense.

In addition, I have attached an updated sketch of the circuit. I will need to do more testing to determine how accurate this is. The next step would be to calculate how much noise there is currently and figure out how to remove this circuit from the breadboard and use a PCB or something like that for final testing in an insulated container.

The reason I chose AD743 initially for the OP amp is because at low frequencies (which is what we are working with), a FET amp such as AD743 will have a low current noise at high impedance, which is what we have in this case. While a FET amp has high voltage noise compared to other OP amps, the current noise becomes more important at high impedance, so it will work better. According to Zach's graphs, the AD743 is best at high impedances, followed by LT1012.

Quote:

Decided to try adding in an OP amp just to see if it would work. Added LT1012 and a 100k resistor to the circuit (I originally wanted to do AD743 as it seems to be the best choice according to Zach's elog here, but it said that they are very precious so I went with LT1012 for testing purposes). When heating it with a heating gun, the output voltage went down by a few 0.01V. The maximum voltage was 0.686V. Similar thing happened when I switched to a 10k resistor, where the maximum was 0.705V and it also went down by a few 0.01V upon heating.

I've attached a few pictures showing the circuit.

 

Attachment 1: IMG_20170814_121131.jpg
IMG_20170814_121131.jpg
Attachment 2: IMG_20170814_121139.jpg
IMG_20170814_121139.jpg
Attachment 3: IMG_20170814_121758~2.jpg
IMG_20170814_121758~2.jpg
  13204   Mon Aug 14 16:24:09 2017 gautamUpdateALSFiber ALS

Today, I borrowed the fiber microscope from Johannes and took a look at the fibers coupled to the PDs. The PD labelled "BEAT PD AUX Y" has an end that seems scratched (Attachments #1 and #2). The scratch seems to be on (or at least very close to) the core. The other PD (Attachments #3 and #4) doesn't look very clean either, but at least the area near the core seems undamaged. The two attachments for each PD corresponds to the two available lighting settings on the fiber microscope.

I have not attempted to clean them yet, though I have also borrowed the cleaning supplies to facilitate this from Johannes. I also plan to inspect the ends of all other fiber connections before re-installing them.

Quote:

Last week, we were talking about reviving the Fiber ALS box. Right now, it's not in great shape. Some changes to be made:

  1. Supply power to the PDs (Menlo FPD310) via a power regulator board. The datasheet says the current consumption per PD is 250 mA. So we need 500mA. We have the D1000217 power regulator board available in the lab. It uses the LM2941 and LM2991 power regulator ICs, both of which are rated for 1A output current, so this seems suitable for our purposes. Thoughts?
  2. Install power decoupling capacitors on the PDs.
  3. Clean up the fiber arrangement inside the box.
  4. Install better switches, plus LED indicators.
  5. Cover the box.
  6. Install it in a better way on the PSL table. Thoughts? e.g. can we mount the unit in some electronics rack and route the fibers to the rack? Perhaps the PSL IR and one of the arm fibers are long enough, but the other arm might be tricky.

Previous elog thread about work done on this box: elog11650

 

Attachment 1: IMG_7471.JPG
IMG_7471.JPG
Attachment 2: IMG_7472.JPG
IMG_7472.JPG
Attachment 3: IMG_7473.JPG
IMG_7473.JPG
Attachment 4: IMG_7474.JPG
IMG_7474.JPG
  13205   Mon Aug 14 19:41:46 2017 JamieUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors

I'm upgrading the linux kernel for all the front ends to one that is supposedly more stable and won't freeze when we unload RTS models (linux-image-3.2.88-csp).  Since it's a different kernel version it requires rebuilds of all kernel-related support stuff (mbuf, symmetricom, mx, open-mx, dolphin) and all the front end models.  All the support stuff has been upgraded, but we're now waiting on the front end rebuilds, which takes a while.

Initial testing indicates that the kernel is more stable; we're mostly able to unload/reload RTS modules without the kernel freezing.  However, the c1iscey host seems to be oddly problematic and has frozen twice so far on module unloads.  None of the other hosts have frozen on unload (yet), though, so still not clear.

We're now seeing some timing errors between the front ends and daqd, resulting in a "0x4000" status message in the 'C1:DAQ-DC0_*_STATUS' channels.  Part of the problem was an issue with the IRIG-B/GPS receiver timing unit, which I'll log in a separate post.  Another part of the problem was a bug in the symmetricom driver, which has been resolved.  That wasn't the whole problem, though, since we're still seeing timing errors.  Working with Jonathan to resolve.

  13206   Mon Aug 14 20:01:38 2017 gautamUpdateSUSGlitches stay on MC1

I don't think we can say for sure. I was just talking to EricQ about this, he said the glitches were often seen when changing the alignment offsets when aligning the arm. I am pretty sure I have seen the ETMX alignment change abruptly since the Ruby Standoff replacement (the Oplev spot just slides across the MEDM display rapidly), but I can't find an elog where I've put in details. I also haven't done a whole lot of work with the arm cavities where I would have noticed this problem. There is this test that Eric did, and it didn't throw up any red flags. But the suspension can be well behaved for weeks at a time before this problem pops up again.

There was also the flaky power connection to the timing card on the ETMX expansion chassis which was fixed only recently, after which there has been no systematic investigation of the status of ETMX.

If it is true that these events are caused by strain building up in the suspension wire, I wonder how we can take systematic steps to avoid it. From what I remember of the SOS assembly procedure, the (unglued) standoff is slid along the optic with the wire under slight tension until the wire slips into the groove on the standoff. Then the tension in the wire is adjusted till the optic is pitch balanced and at the desired height. But it is easy to imagine imprinting some torsional stresses in the (40 um?) wire during this process of looping it around under the optic and placing it in the groove. But perhaps this mechanism makes a negligible contribution to the effect we are seeing, and some other mechanism is responsible in this case.

Quote:

We used to have similar suspension excursion at ETMX. This was the motivation to replace the stand-offs from Al ones to ruby ones. Did the replacement solve the issue at ETMX?

 

  13207   Mon Aug 14 20:12:09 2017 Jamie, GautumUpdateCDSWeird problem with GPS receiver

Today we saw a weird issue with the GPS receiver (EndRun Technologies Tempus LX).  GPS timing on fb1 (which is handled via IRIG-B connection to the receiver with a spectracom card) was off by +18 seconds.  We tried resetting the GPS receiver and it still came up with +18 second offset.  To be clear, the GPS receiver unit itself was showing a time on it's front panel that looked close enough to 24-hour UTC, but was off by +18s.  The time also said "GPS" vertically to the right of the time.

We started exploring the settings on the GPS receiver and found this menu item:

Clock -> "Time Mode" -> "UTC"/"GPS"/"Local"

The setting when we found it was "GPS", which seems logical enough.  However, when we switched it to "UTC" the time as shown on the front panel was correct, now with "UTC" vertically to the right of the time, and fb1 was then showing the correct GPS time.

From the manual:

Time Mode
Time mode defines the time format used for the front-panel time display and, if installed, the optional
time code or Serial Time output. The time mode does not affect the NTP output, which is always
UTC. Possible values for the time mode are GPS, UTC, and local time. GPS time is derived from
the GPS satellite system. UTC is GPS time minus the current leap second correction. Local time is
UTC plus local offset and Daylight Savings Time. The local offset and daylight savings time displays
are described below.

The fact that moving to "UTC" fixed the problem, even though that is supposed to remove the leap second correction, might indicate that there's another bug in the symmetricom driver...

  13208   Tue Aug 15 08:22:16 2017 SteveUpdateGeneralprojector light bulb replaced

BL-FS300C-PH-LE was replaced after 2,904 hrs  It did not explode this time.  The 4 monts life period is actually pritty good because this is a $73.  cheap bulb. The best-high priced warranty is 5 months.

 

PS: future option_bulbless laser projector

HITACHI LP-WU3500 PROJECTOR	$2,549.00	
DLP, WUXGA, 3500 LUMENS, HDMI, 20K HR LASER, 5YR WARRANTY, 1.36-2.34:1 LENS, 24/7 DUTY CYCLE
45x72 IMAGE FROM 8' TO 13'10" LENS TO SCREEN, AND AT 10' APPROX. 154FL OF BRIGHTNESS ON A 1.0 GAIN SCREEN
http://www.projectorcentral.com/pdf/projector_spec_9722.pdf

This would give you everything you are requesting, plus a lamp-less design and 5yr warranty.  Ground shipping would be free anywhere in the lower 48, and we would not charge sales tax on orders billing/shipping outside of AZ.  If you have any questions or if you would like to order... just let me know!
  13210   Tue Aug 15 13:32:38 2017 KiraUpdatePEMtemperature sensor

Tested to make sure that even when only the AD586 was heated that there was no change in the reading. I did so by placing the AD586 away from the rest of the circuit and blowing hot air only on it. There was, in fact, no change. 

Quote:

I didn't realize that the LT1012 needed an additional input to function. I added in +15V and -15V to pins 7 and 4, respectively and placed a 10k resistor and the numbers make more sense now. The voltage showed a negative value, but it became more negative as I heated it up (it's negative due to how a transimpedance amplifier works).

I have attached the new setup and the value it shows (~-3V). It became more negative by about 0.4V, which translates to about a 40K increase in temperature, which makes sense.

In addition, I have attached an updated sketch of the circuit. I will need to do more testing to determine how accurate this is. The next step would be to calculate how much noise there is currently and figure out how to remove this circuit from the breadboard and use a PCB or something like that for final testing in an insulated container.

The reason I chose AD743 initially for the OP amp is because at low frequencies (which is what we are working with), a FET amp such as AD743 will have a low current noise at high impedance, which is what we have in this case. While a FET amp has high voltage noise compared to other OP amps, the current noise becomes more important at high impedance, so it will work better. According to Zach's graphs, the AD743 is best at high impedances, followed by LT1012.

Quote:

Decided to try adding in an OP amp just to see if it would work. Added LT1012 and a 100k resistor to the circuit (I originally wanted to do AD743 as it seems to be the best choice according to Zach's elog here, but it said that they are very precious so I went with LT1012 for testing purposes). When heating it with a heating gun, the output voltage went down by a few 0.01V. The maximum voltage was 0.686V. Similar thing happened when I switched to a 10k resistor, where the maximum was 0.705V and it also went down by a few 0.01V upon heating.

I've attached a few pictures showing the circuit.

 

 

  13211   Tue Aug 15 16:32:42 2017 Jamie, GautumUpdateCDSGPS receiver apparently set to correct mode as "UTC"
Quote:

The setting when we found it was "GPS", which seems logical enough.  However, when we switched it to "UTC" the time as shown on the front panel was correct, now with "UTC" vertically to the right of the time, and fb1 was then showing the correct GPS time.

From Keith Thorne:

In the GPS receiver, you are trying to match the IRIG-B output format that is created by the aLIGO IRIG-B Fanout.  Since we have to prep the aLIGO IRIG-B Fanout every time there is a leap second coming, I would suspect that we are sending UTC to the IRIG-B receivers.  Thus, the GPS receiver needs to be set to that mode.

Soooo, "UTC" is the correct mode for the GPS receiver.

  13212   Wed Aug 16 14:54:13 2017 gautamUpdateCDSPSL monitoring Acromag EPICS server restarted

[johannes, gautam, jamie]

  • Made a directory /opt/rtcds/caltech/c1/scripts/Acromag/PSL where I copied over the files needed my modbusApp to start the server from Lydia's user directory
  • Edited /ligo/apps/ubuntu12/ligoapps-user-env.sh to export a couple of EPICS variables to facilitate easy startup of the EPICS server
  • Started a tmux session on (soon to be re-christened?) megatron called "acroEPICS"
  • Ran the following command to start up the EPICS server:
${EPICS_MODULES}/modbus/bin/${EPICS_HOST_ARCH}/modbusApp npro_config.cmd

To do:

  1. Make a startup script that runs the above command - eventually this can contain the initialization instructions for all the Acromags
  2. Figure out the initctl/systemctl stuff to make the server automatically restart if it drops for some reason (e.g. power failure)
  13213   Wed Aug 16 14:57:01 2017 SteveUpdatePSLref Cavity heating blanket power supply

The last entry I found relating to ref cavity was 2011 Aug 19

Attachment 1: refCavPSheat.jpg
refCavPSheat.jpg
Attachment 2: refCavTempSens.jpg
refCavTempSens.jpg
Attachment 3: refTempSen_1X1.jpg
refTempSen_1X1.jpg
  13214   Wed Aug 16 16:05:53 2017 KiraUpdatePEMtemp sensor PCB

Tried taking the circuit from the breadboard to the PCB. I attached all the components to adapters that would allow them to be connected to the PCB. From the first picture, the first component is AD586, the second is AD590, and the third is LT1012, along with a resistor across it. I then soldered the connections between the components, as can be seen in the second picture. When I tested out this version of the circuit by hooking it up to the DC source, I got a reading of ~-15V. I will have to check all the connections to make sure there is contact where there should be one, and no contact where there shouldn't be. I had issues attaching the tiny AD590 and LT1012 to its adaptor, so the issue may lie there as well. I'll also check that each component is in working order as well.

Once I figure out where my error is, my plan is to build two more of these and place a metal object such that it contacts only the surface of the AD590s. This would allow me to compare the three values to the actual temperature of the metal, which would then tell me how accurate this setup is.

Note on the resistor: I measured all the resistors and chose three that had exactly 10.00k Ohm. The voltage detected is dependent on the resistor, so if we are to take three identical copies, I ensured that there would be no error due to the resistors being a little different.

Attachment 1: IMG_20170816_154514.jpg
IMG_20170816_154514.jpg
Attachment 2: IMG_20170816_154541.jpg
IMG_20170816_154541.jpg
  13215   Wed Aug 16 17:05:53 2017 JamieUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors

The CDS system has now been up moved to a supposedly more stable real-time-patched linux kernel (3.2.88-csp) and RCG r4447 (roughly the head of trunk, intended to be release 3.4).  With one major and one minor exception, everything seems to be working:

The remaining issues are:

  • RFM network down.  The IOP models on all hosts on the RFM network are not detecting their RFM cards.  Keith Thorne thinks that this is because of changes in trunk to support the new long-range PCIe that will be used at the sites, and that we just need to add a new parameter to the cdsParameters block in models that use RFM.  Him and Rolf are looking into for us.
  • The 3.2.88-csp kernel is still not totally stable.  On most hosts (c1sus, c1ioo, c1iscex) it seems totally fine and we're able to load/unload models without issue.  c1iscey is definitely problematic, frequently freezing on module unload.  There must be a hardware/bios issue involved here.  c1lsc has also shown some problems.  A better kernel is supposedly in the works.
  • NDS clients other than DTT are still unable to raise test points.  This appears to be an issue with the daqd_rcv component (i.e. NDS server) not properly resolving the front ends in the GDS network.  Still looking into this with Keith, Rolf, and Jonathan.

Issues that have been fixed:

  • "EDCU" channels, i.e. non-front-end EPICS channels, are now being acquired properly by the DAQ.  The front-ends now send all slow channels to the daq over the MX network stream.  This means that front end channels should no longer be specified in the EDCU ini file.  There were a couple in there that I removed, and that seemed to fix that issue.
  • Data should now be recorded in all formats: full frames, as well as second, minute, and raw_minute trends
  • All FE and DAQD diagnostics are green (other than the ones indicating the problems with the RFM network).  This was fixed by getting the front ends models, mx_stream processes, and daqd processes all compiled against the same version of the advLigoRTS, and adding the appropriate command line parameters to the mx_stream processes.
  13216   Wed Aug 16 17:14:02 2017 KojiUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors

What's the current backup situation?

  13217   Wed Aug 16 18:01:28 2017 JamieUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors
Quote:

What's the current backup situation?

Good question.  We need to figure something out.  fb1 root is on a RAID1, so there is one layer of safety.  But we absolutely need a full backup of the fb1 root filesystem.  I don't have any great suggestions, other than just getting an external disk, 1T or so, and just copying all of root (minus NFS mounts).

  13218   Wed Aug 16 18:06:01 2017 KojiUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors

We also need to copy chiara's root. What is the best way to get the full image of the root FS?
We may need to restore these root images to a different disk with a different capacity.

Is the dump command good for this?

  13219   Wed Aug 16 18:50:58 2017 JamieUpdateCDSfront-end/DAQ network down for kernel upgrade, and timing errors
Quote:

The remaining issues are:

  • RFM network down.  The IOP models on all hosts on the RFM network are not detecting their RFM cards.  Keith Thorne thinks that this is because of changes in trunk to support the new long-range PCIe that will be used at the sites, and that we just need to add a new parameter to the cdsParameters block in models that use RFM.  Him and Rolf are looking into for us.

RFM network is back!  Everything green again.

Use of RFM has been turned off in adLigoRTS trunk in favor of the new long-range PCIe networking being developed for the sites.  Rolf provided a single-line patch that re-enables it:

controls@c1sus:/opt/rtcds/rtscore/trunk 0$ svn diff
Index: src/epics/util/feCodeGen.pl
===================================================================
--- src/epics/util/feCodeGen.pl    (revision 4447)
+++ src/epics/util/feCodeGen.pl    (working copy)
@@ -122,7 +122,7 @@
 $diagTest = -1;
 $flipSignals = 0;
 $virtualiop = 0;
-$rfm_via_pcie = 1;
+$rfm_via_pcie = 0;
 $edcu = 0;
 $casdf = 0;
 $globalsdf = 0;
controls@c1sus:/opt/rtcds/rtscore/trunk 0$

This patched was applied to RTS source checkout we're using for the FE builds (/opt/rtcds/rtscore/trunk, which is r4447, and is linked to /opt/rtcds/rtscore/release).  The following models that use RFM were re-compiled, re-installed, and re-started:

  • c1x02
  • c1rfm
  • c1x03
  • c1als
  • c1x01
  • c1scx
  • c1asx
  • c1x05
  • c1scy
  • c1tst

The re-compiled models now see the RFM cards (dmesg log from c1ioo):

[24052.203469] c1x03: Total of 4 I/O modules found and mapped
[24052.203471] c1x03: ***************************************************************************
[24052.203473] c1x03: 1 RFM cards found
[24052.203474] c1x03:     RFM 0 is a VMIC_5565 module with Node ID 180
[24052.203476] c1x03: address is 0xffffc90021000000
[24052.203478] c1x03: ***************************************************************************

This cleared up all RFM transmission error messages.

CDS upstream are working to make this RFM usage switchable in a reasonable way.

  13220   Wed Aug 16 19:50:17 2017 gautamUpdateSUSMC1 <--> MC3 switched back

Now that all the CDS overview lights are green, I decided to switch back the coil driver outputs to their original state so that the MC optics could be damped and the IMC relocked. I also restored the static PIT/YAW bias values to their original values.

MC1 has been quiet over the last couple of days, lets see how it behaves in the next few days. In all the glitches I have observed, if the IMC is locked and WFS loops are enabled, the loops are able to correct for the DC misalignment caused by the glitch. But the mcwfs off script is currently set up in such a way that the output history is cleared between IMC locks. I made two copies of the mcwfson/mcwfsoff scripts, called mcwfsunhold/mcwfshold respectively. They live in /opt/rtcds/caltech/c1/scripts/MC/WFS. I've also modified the autolocker script to call these modified scripts, such that when the IMC loses lock, the WFS servo outputs are held, while the input is turned off. The hope is that in this configuration, the autolocker can catch a lock even if there is a glitch on MC1.

I haven't tried locking the arms yet, but I think other IFO work discussed at the meeting (like arm loss estimation / cavity scans etc) can proceed.

Quote:

In order to further narrow down the cause of the glitch, we switched the Coil Driver Board --> Satellite box DB(15?) connectors on the coil drivers between MC1 and MC3 coil driver boards. I also changed the static PIT/YAW bias voltages to MC1 and MC3 such that MC-REFL is now approximately back to the center of the CCD monitor.

 

 

  13221   Wed Aug 16 20:01:03 2017 gautamUpdateGeneralSUS model ASC input weirdness

I'm not sure if this has something to do with the model restarts / new RCG, but while I was re-enabling the MC watchdogs, I noticed the RMS sensor voltage channels on ITMX hovering around ~100mV, even though local damping was on (in which configuration I would expect <1mV if everything is working normally).  I was confused by this behaviour, and after staring at the ITMX suspension screen for a while, I noticed that the input to the "ASCP" and "ASCY" servos were "-nan", and the outputs were 10^20 cts frown (see Attachment #1).

Digging a little deeper, I found that the same problem existed on ITMY, ETMX, ETMY, PRM (but not BS or SRM) - reasons unknown for now.

I have to check where this signal is coming from, but for now I just turned the "ASC Input" switch off. More investigation to be done, but in the meantime, ASS dither alignment may not be possible.

After consulting with Jamie, I have just disabled all outputs to the suspensions other than local damping loop outputs. I need to figure out how to get this configuration into the safe.snap file such that until we are sure of what is going on, the models start up in this safer configuration.

gedit 28 Oct 0026: Seems like this problem is seen at the sites as well. I wonder if the problem is related.

Attachment 1: ITMX_ASC.png
ITMX_ASC.png
  13222   Wed Aug 16 20:24:23 2017 gautamUpdateALSFiber ALS

Today, with Johannes' help, I cleaned the fiber tips of the photodiodes. The effect of the cleaning was dramatic - see Attachments #1-4, which are X Beat PD, axial illumination, X Beat PD, oblique illumination, Y beat PD, axial illumination, Y beat PD, oblique illumination. They look much cleaner now, and the feature that looked like a scratch has vanished.

The cleaning procedure followed was:

  • Blow clean air over the fiber tip
  • First, we tried cleaning with the Q-tip like tool, but the results weren't great. The way to use it is to dip the tip in the cleaning solvent for a few seconds, hold the tip to the fiber taking into account the angled cut, and apply 10 gentle quarter turns.
  • Next, we tried cleaning with the wipes. We peeled out an approximately 5" section of the wipe, and laid it out on the table. We then applied cleaning solvent liberally on the central area where we were sure we hadn't touched the wipe. Then you just drag the fiber tip along the soaked part of the wipe. If you get the angle exactly right, the fiber glides smoothly along the surface, but if you are a little misaligned, you get a scratchy sensation. 
  • Blow dry and inspect.

I will repeat this procedure for all fiber connections once I start putting the box back together - I'm almost done with the new box, just waiting on some hardware to arrive.

 

Quote:

Today, I borrowed the fiber microscope from Johannes and took a look at the fibers coupled to the PDs. The PD labelled "BEAT PD AUX Y" has an end that seems scratched (Attachments #1 and #2). The scratch seems to be on (or at least very close to) the core. The other PD (Attachments #3 and #4) doesn't look very clean either, but at least the area near the core seems undamaged. The two attachments for each PD corresponds to the two available lighting settings on the fiber microscope.

I have not attempted to clean them yet, though I have also borrowed the cleaning supplies to facilitate this from Johannes. I also plan to inspect the ends of all other fiber connections before re-installing them.

 

Attachment 1: IMG_7476.JPG
IMG_7476.JPG
Attachment 2: IMG_7477.JPG
IMG_7477.JPG
Attachment 3: IMG_7478.JPG
IMG_7478.JPG
Attachment 4: IMG_7479.JPG
IMG_7479.JPG
  13223   Thu Aug 17 08:42:27 2017 SteveUpdatePSLPSL HEPA

The PSL HEPA was running noisy at 100V   The bearing is wearing out. I turned it down to 30V It is quiet there.

  13224   Thu Aug 17 10:41:58 2017 KiraUpdatePEMtemp sensor PCB

Got it to work. One of the connections was faulty. I decided to check the temperature measured against a thermometer. The sensor showed 26.1 C, but the thermometer showed 25.8 C after I let them both cool down after heating them up. The temperature of the thermometer was dropping at the time of measurement, but the temperature of the sensor was not. This is still a rough version of the final sensor, so I'm not sure what exactly causes this discrepancy.

Quote:

Tried taking the circuit from the breadboard to the PCB. I attached all the components to adapters that would allow them to be connected to the PCB. From the first picture, the first component is AD586, the second is AD590, and the third is LT1012, along with a resistor across it. I then soldered the connections between the components, as can be seen in the second picture. When I tested out this version of the circuit by hooking it up to the DC source, I got a reading of ~-15V. I will have to check all the connections to make sure there is contact where there should be one, and no contact where there shouldn't be. I had issues attaching the tiny AD590 and LT1012 to its adaptor, so the issue may lie there as well. I'll also check that each component is in working order as well.

Once I figure out where my error is, my plan is to build two more of these and place a metal object such that it contacts only the surface of the AD590s. This would allow me to compare the three values to the actual temperature of the metal, which would then tell me how accurate this setup is.

Note on the resistor: I measured all the resistors and chose three that had exactly 10.00k Ohm. The voltage detected is dependent on the resistor, so if we are to take three identical copies, I ensured that there would be no error due to the resistors being a little different.

 

Attachment 1: IMG_20170817_095917.jpg
IMG_20170817_095917.jpg
  13225   Thu Aug 17 11:17:49 2017 gautamUpdateSUSMC1 <--> MC3 switched back

Seems like this modification didn't really work. There were several large MC1 glitches, and one of them misaligned MC1 so much that the IMC didn't relock for the last ~6 hours. I re-aligned MC1 manually, and now it is locked fine.

Quote:

Now that all the CDS overview lights are green, I decided to switch back the coil driver outputs to their original state so that the MC optics could be damped and the IMC relocked. I also restored the static PIT/YAW bias values to their original values.

MC1 has been quiet over the last couple of days, lets see how it behaves in the next few days. In all the glitches I have observed, if the IMC is locked and WFS loops are enabled, the loops are able to correct for the DC misalignment caused by the glitch. But the mcwfs off script is currently set up in such a way that the output history is cleared between IMC locks. I made two copies of the mcwfson/mcwfsoff scripts, called mcwfsunhold/mcwfshold respectively. They live in /opt/rtcds/caltech/c1/scripts/MC/WFS. I've also modified the autolocker script to call these modified scripts, such that when the IMC loses lock, the WFS servo outputs are held, while the input is turned off. The hope is that in this configuration, the autolocker can catch a lock even if there is a glitch on MC1.

I haven't tried locking the arms yet, but I think other IFO work discussed at the meeting (like arm loss estimation / cavity scans etc) can proceed.

 

 

Attachment 1: MC1_misaligned.png
MC1_misaligned.png
Attachment 2: MC1_glitch.png
MC1_glitch.png
  13226   Thu Aug 17 17:33:01 2017 gautamUpdateSUSMC1 <--> MC3 switched back

that's why the Autolocker clears the outputs; we don't want to be holding the offsets from the last ms of lock when it was all messed up; instead it would be best to have a slow (~mHz) relief script that takes the WFS controls and puts them onto the MC SUS sliders. This would then re-align the MC to the input beam rather than the input to the MC. Which is not the best idea.

Quote:

Seems like this modification didn't really work.

 

  13227   Thu Aug 17 22:54:49 2017 ericqUpdateComputersTrying to access JetStor RAID files

The JetStor RAID unit that we had been using for frame writing before the fb meltdown has some archived frames from DRFPMI locks that I want to get at. I spent some time today trying to mount it on optimus with no success crying

The unit was connected to fb via a SCSI cable to a SCSI-to-PCI card inside of fb. I moved the card to optimus, and attached the cable. However, no mountable device corresponding to the RAID seems to show up anywhere.

The RAID unit can tell that it's hooked up to a computer, because when optimus restarts, the RAID event log says "Host Channel 0 - SCSI Bus Reset."

The computer is able to get some sort of signals from the RAID unit, because when I change the SCSI ID, the syslog will say 'detected non-optimal RAID status'.

The PCI card is ID'd fine in lspci as "06:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev c1)"

'lsssci' does not list anything related to the unit

Using 'mpt-status -p', which is somehow associated with this kind of thing returns the disheartening output:

Checking for SCSI ID:0
Checking for SCSI ID:1
Checking for SCSI ID:2
Checking for SCSI ID:3
Checking for SCSI ID:4
Checking for SCSI ID:5
Checking for SCSI ID:6
Checking for SCSI ID:7
Checking for SCSI ID:8
Checking for SCSI ID:9
Checking for SCSI ID:10
Checking for SCSI ID:11
Checking for SCSI ID:12
Checking for SCSI ID:13
Checking for SCSI ID:14
Checking for SCSI ID:15
Nothing found, contact the author
 
I don't know what to try at this point.
  13228   Fri Aug 18 21:58:35 2017 gautamUpdateGeneralSUS model ASC input weirdness

I spent some time today trying to debug this issue.

Jamie and I had opened up the c1sus frontend to try and replace the RFM card before we realized that the problem was in the RCG code generator. During this process, we had disconnected all of the back-panel cabling to this machine (2 ethernet cables, dolphin cable, and RFM cables/fibers). I thought I may have accidentally returned the cables to the wrong positions - but all the status indicator lights indicate that everything is working as it should, and I also confirmed that the cabling is as it is in the pictures of the rack on the wiki page.

Looking at the SimuLink model diagram (see Attachment #1 for example), it looks like (at least some of) these channels are actually on the dolphin network, and not the RFM network (with which we were experiencing problems). This suggests that the problem is something deeper. Although I did see nans in some of the ETMX ASC channels as well, for which the channels are piped over the RFM network. Even more puzzling is that the ASC MEDM screen (Attachment #3) and the SimuLink diagram (Attachment #2) suggest that there is an output matrix in between the input signals and the output angular control signals to the suspensions. As Attachment #4 shows, the rows corresponding to ITMX PIT and YAW are zero (I confirmed using z read <matrixElement>). Attachment #3 shows that the output of all the servo banks except CARM_YAW is zero, but CARM_YAW has no matrix element going to the ITMs (also confirmed with z read <servoOutputChannel>). So 0 x 0 should be 0, but for some reason the model doesn't give this output?

GV Edit: As EricQ just pointed out to me, nan x 0 is still nan, which probably explains the whole issue. Poking a little further, it seems like this is an SDF issue - the SDF table isn't able to catch differences for this hold output channel.


As I was writing this elog, I noticed that, as mentioned above, the CARM_YAW output was "nan". When I restart the model (thankfully this didn't crash c1lsc!), it seems to default to this state. Opening up the filter module, I saw that the "hold output" was enabled.

Toggling that switch made the nans in all the SUS ASC channels disappear. Mysterious indecision.

All the points above stand - CARM_YAW output shouldn't have been going anywhere as per the output matrix, but it seems to have been responsible? Seems like a bug in any case if a model restarts with a field as "nan".

Anyways the problem seems to have been resolved so I'm going to try locking and dither aligning the arms now.

Rolf mentioned that a simple update could fix several of the CDS issues we are facing (e.g. inability to open up testpoints), but he didn't seem to have any insight into this particular issue. Jamie will try and recompile all the models and then we have to see if that fixes the remaining problems.

Quote:
 

I have to check where this signal is coming from, but for now I just turned the "ASC Input" switch off. More investigation to be done, but in the meantime, ASS dither alignment may not be possible.

After consulting with Jamie, I have just disabled all outputs to the suspensions other than local damping loop outputs. I need to figure out how to get this configuration into the safe.snap file such that until we are sure of what is going on, the models start up in this safer configuration.

 

Attachment 1: ITMXP.png
ITMXP.png
Attachment 2: ASC_model_outmatrix.png
ASC_model_outmatrix.png
Attachment 3: ASC_medm.png
ASC_medm.png
Attachment 4: ASC_outMat.png
ASC_outMat.png
  13229   Fri Aug 18 23:59:53 2017 gautamUpdateALSX Arm ALS lock

[ericq, gautam]

  • I was just getting the IFO aligned, and single arm lock going, when EricQ came in and asked if we could get some ALS data.
  • ALS beats seemed fine, in particular the X-Arm. The broad hump around ~70Hz that was present in my previous ALS update was nowhere to be seen - reasons unknown.
  • Copied over /opt/rtcds/caltech/c1/scripts/YARM/Lock_ALS_YARM.py to /opt/rtcds/caltech/c1/scripts/XARM/Lock_ALS_XARM.py. Could be useful when we want to do arm cavity scans.
  • Made appropriate changes to allow ALS locking of Xarm - the testpoint inaccessibility makes things a little annoying but for tonight we just used DQ channels in place (or slow channels when DQ chans were not available)
  • Calibration of X arm error signal seemed off - so we fixed it by driving a line in ETMX and matching up the peaks in the ALS error signal and POX11. We then updated the gain of the filter in the CINV filter bank accordingly.
  • Got some decent data - X arm stayed locked on ALS for >60mins, during which time the Y arm stayed locked on POY11, and the Y green also reained locked yes. There was no evidence of the X arm 00 mode randomly dropping out of lock tonight.
  • EQ will update with a sick comparison plot - today we looked at the ALS noise from the perspective of the Green Locking Izumi et. al. paper.
  • Y arm ALS noise didn't look so hot tonight - to be investigated...

Leaving LSC mode OFF for now while CDS is still under investigation


Not really related to this work: We saw that the safe.snap file for c1oaf seems to have gotten overwritten at some point. I restored the EPICS values from a known good time, and over-wrote the safe.snap file.

  13230   Sat Aug 19 01:35:08 2017 ericqUpdateALSX Arm ALS lock

My motivation tonight was to get an up-to-date spectrum of a calibrated measurement of the out-of-loop displacement of an arm locked on ALS (using the PDH signal as the out-of-loop sensor) to compare the performance of ALS control noise with the Izumi et al green locking paper. 

I was able to fish out the PSD from the paper from the 40m svn, but the comparison as plotted looks kind of fishy. I don't see why the noise from 10-60Hz should be so different/worse. We updated the POX counts to meters conversion by looking at the Hz-calibrated ALSX signal and a ~800Hz line injected on ETMX.

Attachment 1: ALS_comparison.pdf
ALS_comparison.pdf
  13231   Mon Aug 21 09:08:54 2017 SteveUpdateSUStiny glitches

They are synchronised tiny glitches. They are not mechanical.
 

Attachment 1: glitches.png
glitches.png
  13232   Mon Aug 21 13:07:08 2017 KiraUpdatePEMtemp sensor PCB

On Friday, I cleaned up the circuit so that there are only three connections needed (+15V, -15V, GND) and a BNC connector for reading the output. Today, I added in bypass capacitors. The small yellow ones are 0.1 microF ceramic, and the large ones are 100 microF electrolytic. They are used to stabilize the +15V and -15V inputs to the OP amp and minimize fluctuations, since it doesn't have a regulator for stability. I have also attached the circuit diagram for the OP amp only, where 1 are the electrolytic and 2 are the ceramic. The temperature is still about 2 degrees off, but if that difference is constant for all temperatures in our range we can just calibrate it later.

Here is a helpful link on bypass capacitors (thanks to Kevin for sending it to me).

As a note, the electrolytic capacitors do have a polarity, so it is important to place them correctly (the negative side is towards the lower voltage potential, and not always towards ground).

Quote:

Got it to work. One of the connections was faulty. I decided to check the temperature measured against a thermometer. The sensor showed 26.1 C, but the thermometer showed 25.8 C after I let them both cool down after heating them up. The temperature of the thermometer was dropping at the time of measurement, but the temperature of the sensor was not. This is still a rough version of the final sensor, so I'm not sure what exactly causes this discrepancy.

Quote:

Tried taking the circuit from the breadboard to the PCB. I attached all the components to adapters that would allow them to be connected to the PCB. From the first picture, the first component is AD586, the second is AD590, and the third is LT1012, along with a resistor across it. I then soldered the connections between the components, as can be seen in the second picture. When I tested out this version of the circuit by hooking it up to the DC source, I got a reading of ~-15V. I will have to check all the connections to make sure there is contact where there should be one, and no contact where there shouldn't be. I had issues attaching the tiny AD590 and LT1012 to its adaptor, so the issue may lie there as well. I'll also check that each component is in working order as well.

Once I figure out where my error is, my plan is to build two more of these and place a metal object such that it contacts only the surface of the AD590s. This would allow me to compare the three values to the actual temperature of the metal, which would then tell me how accurate this setup is.

Note on the resistor: I measured all the resistors and chose three that had exactly 10.00k Ohm. The voltage detected is dependent on the resistor, so if we are to take three identical copies, I ensured that there would be no error due to the resistors being a little different.

 

 

Attachment 1: IMG_20170821_124121.jpg
IMG_20170821_124121.jpg
Attachment 2: IMG_20170821_124429~2.jpg
IMG_20170821_124429~2.jpg
Attachment 3: IMG_20170821_124108.jpg
IMG_20170821_124108.jpg
  13233   Mon Aug 21 14:53:32 2017 gautamUpdateVACRGA reset

[gautam, steve]

In the aftermath of the accidental vent, it looks like the RGA was shutdown.

We followed the instructions in this elog to restart the RGA.

Seems to be working now, Steve says we just need to wait for it to warm up before we can collect a reliable scan.

Quote:

We have good RGA scan now. There was no scan for 3 months.

 

  13234   Mon Aug 21 16:35:48 2017 gautamUpdateVACUPS checkup

[steve, gautam]

At Rolf/Rich Abbott's request, we performed a check of the UPS today.

Steve believed that the UPS was functioning as it should, and the recent accidental vent was because the UPS batteries were insufficiently charged when the test was performed. Today, we decided to try testing the UPS.

We first closed V1, VM1 and VA6 using the MEDM screen. We prepared to pull power on all these valves by loosening the power connections (but not detaching them). [During this process, I lost the screw holding the power cord fixed to the gate valve V1 - we are looking for a replacement right now but it seems to be an odd size. It is cable tied for now.]

The battery charge indicator LEDs on the UPS indicated that the batteries were fully charged.

Next, we hit the "Test" button on the UPS - it has to be held down for ~3 seconds for the test to be actually initiated, seems to be a safety feature of the UPS. Once the test is underway, the LED indicators on the UPS will indicate that the loading is on the UPS batteries. The test itself lasts for ~5seconds, after which the UPS automatically reverts to the nominal configuration of supplying power from the main line (no additional user input is required).

In this test, one of the five battery charge indicator LEDs went off (5 ON LEDs indicate full charge).

So on the basis of this test, it would seem that the UPS is functioning as expected. It remains to be investigated if the various hardware/software interlocks in place will initiate the right sequence of valve closures when required.


Quote:
 

Never hit O on the Vacuum UPS !

Note: the " all off " configuration should be all valves closed ! This should be fixed now.

In case of  emergency  you can close V1 with disconnecting it's actuating power as shown on Atm3 if you have peumatic pressure 60 PSI 

 

  13237   Mon Aug 21 23:38:55 2017 gautamUpdateALSALS out-of-loop noise

I worked a little bit on the Y arm ALS today. 

  • Started by locking the Y arm to IR with POY, and then ran the dither alignment script to maximize Y arm transmission.
  • Green TRY DC monitor was around 0.16, whereas I have seen ~0.45 when we were doing DRFPMI locking.
  • So I went to the Y end table and tweaked the steering mirrors a little. I was able to get GTRY to ~0.42. I think this can be tweaked a little further but I decided to push on for tonight.
  • The beat amplitude on the network analyzer in the control room is comparable to the X arm beat now.
  • Adjusted the gain of the phase tracker servos, cleared phase history.
  • Looking at the ALS beat noise with the arms locked to IR and the slow ALS temperature control loops ON (see Attachment #1), the current measurements line up quite well with the reference traces.

I am now going to measure the OLTFs of both green PDH loops to check that the overall loop gain is okay, and also check the measurement against EricQ's LISO model of the (modified) AUX green PDH servos. Results to follow.


Some weeks ago, I had moved some of the Green steering optics on the PSL table around, in order to flip some mirror mounts and try and get angles of incidence closer to ~45deg on some of the steering mirrors. As a result of this work, I can see some light on the GTRY CCD when the X green shutter is open. It is unclear if there is also some scattered light on the RFPDs. I will post pictures + a more detailed investigation of the situation on the PSL table later, there are multiple stray green beams on the PSL table which should probably be dumped.


As I was writing this elog, I saw the X green lock drop abruptly. During this time, the X arm stayed locked to the IR, and the Y arm beat on the control room network analyzer did not jump (at least not by an amount visible to the eye). Toggling the X end shutter a few times, the green TEM00 lock was re-acquired, but the beatnote has moved on the control room analyzer by ~40MHz. On Friday evening however, the X green lock held for >1 hour. Need to keep an eye on this.

Attachment 1: ALS_21082017.pdf
ALS_21082017.pdf
  13238   Tue Aug 22 02:19:11 2017 gautamUpdateALSALS OLTFs

Attachment #1 shows the results of my measurements tonight (SR785 data in Attachment #2). Both loops have a UGF of ~10kHz, with ~55 degrees of phase margin.

Excitation was injected via SR560 at the PDH error point, amplitude was 35mV. According to the LED indicators on these boxes, the low frequency boost stages were ON. Gain knob of the X end PDH box was at 6.5, that of the Y end PDH box was at 4.9. I need to check the schematics to interpret these numbers. GV Edit: According to this elog, these numbers mean that the overall gain of the X end PDH box is approx. 25dB, while that of the Y end PDH box is approx. 15dB. I believe the Y end Lightwave NPRO has an actuator discriminant ~5MHz/V, while the X end Innolight is more like 1MHz/V.

Not sure what to make of the X PDH loop measurement being so much noisier than the Y end, I need to think about this.

More detailed analysis to follow.

Quote:

 

I am now going to measure the OLTFs of both green PDH loops to check that the overall loop gain is okay, and also check the measurement against EricQ's LISO model of the (modified) AUX green PDH servos. Results to follow.

 

Attachment 1: ALS_OLTFs.pdf
ALS_OLTFs.pdf
Attachment 2: ALS_OLTF_Aug2017.zip
  13239   Tue Aug 22 15:17:19 2017 ericqUpdateComputersOld frames accessible again

It turns out the problem was just a bent pin on the SCSI cable, likely from having to stretch things a bit to reach optimus from the RAID unit.frown

I hooked it up to megatron, and it was automatically recognized and mounted. yes

I had to turn off the new FB machine and remove it from the rack to be able to access megatron though, since it was just sitting on top. FB needs a rail to sit on!

At a cursory glance, the filesystem appears intact. I have copied over the achived DRFPMI frame files to my user directory for now, and Gautam is going to look into getting those permanently stored on the LDAS copy of 40m frames, so that we can have some redundancy.

Also, during this time, one of the HDDs in the RAID unit failed its SMART tests, so the RAID unit wanted it replaced. There were some spare drives in a little box directly under the unit, so I've installed one and am currently incorporating it back into the RAID.

There are two more backup drives in the box. We're running a RAID 5 configuration, so we can only lose one drive at a time before data is lost.

  13240   Tue Aug 22 15:40:06 2017 gautamUpdateComputersOld frames accessible again

[jamie, gautam]

I had some trouble getting the daqd processes up and running again using Jamie's instructions.

With Jamie's help however, they are back up and running now. The problem was that the mx infrastructure didn't come back up on its own. So prior to running sudo systemctl restart daqd_*, Jamie ran sudo systemctl start mx. This seems to have done the trick.

c1iscey was still showing red fields on the CDS overview screen so Jamie did a soft reboot. The machine came back up cleanly, so I restarted all the models. But the indicator lights were still red. Apparently the mx processes weren't running on c1iscey. The way to fix this is to run sudo systemctl start mx_stream. Now everything is green.

Now we are going to work on trying the fix Rolf suggested on c1iscex.

Quote:

It turns out the problem was just a bent pin on the SCSI cable, likely from having to stretch things a bit to reach optimus from the RAID unit.frown

I hooked it up to megatron, and it was automatically recognized and mounted. yes

I had to turn off the new FB machine and remove it from the rack to be able to access megatron though, since it was just sitting on top. FB needs a rail to sit on!

At a cursory glance, the filesystem appears intact. I have copied over the achived DRFPMI frame files to my user directory for now, and Gautam is going to look into getting those permanently stored on the LDAS copy of 40m frames, so that we can have some redundancy.

Also, during this time, one of the HDDs in the RAID unit failed its SMART tests, so the RAID unit wanted it replaced. There were some spare drives in a little box directly under the unit, so I've installed one and am currently incorporating it back into the RAID.

There are two more backup drives in the box. We're running a RAID 5 configuration, so we can only lose one drive at a time before data is lost.

 

  13242   Tue Aug 22 17:11:15 2017 gautamUpdateComputersc1iscex model restarts

[jamie, gautam]

We tried to implement the fix that Rolf suggested in order to solve (perhaps among other things) the inability of some utilities like dataviewer to open testpoints. The problem isn't wholly solved yet - we can access actual testpoint data (not just zeros, as was the case) using DTT, and if DTT is used to open a testpoint first, then dataviewer, but DV itself can't seem to open testpoints.

Here is what was done (Jamie will correct me if I am mistaken).

  1. Jamie checked out branch 3.4 of the RCG from the SVN.
  2. Jamie recompiled all the models on c1iscex against this version of RCG.
  3. I shutdown ETMX watchdog, then ran rtcds stop all on c1iscex to stop all the models, and then restarted them using rtcds start <model> in the order c1x01, c1scx and c1asx. 
  4. Models came back up cleanly. I then restarted the daqd_dc process on FB1. At this point all indicators on the CDS overview screen were green.
  5. Tried getting testpoint data with DTT and DV for ETMX Oplev Pitch and Yaw IN1 testpoints. Conclusion as above.

So while we are in a better state now, the problem isn't fully solved. 

Comment: seems like there is an in-built timeout for testpoints opened with DTT - if the measurement is inactive for some time (unsure how much exactly but something like 5mins), the testpoint is automatically closed.

  13243   Tue Aug 22 18:36:46 2017 gautamUpdateComputersAll FE models compiled against RCG3.4

After getting the go ahead from Jamie, I recompiled all the FE models against the same version of RCG that we tested on the c1iscex models.

To do so:

  • I did rtcds make and rtcds install for all the models.
  • Then I ssh-ed into the FEs and did rtcds stop all, followed by rtcds start <model> in the order they are listed on the CDS overview MEDM screen (top to bottom).
  • During the compilation process (i.e. rtcds make), for some of the models, I got some compilation warnings. I believe these are related to models that have custom C code blocks in them. Jamie tells me that it is okay to ignore these warnings at that they will be fixed at some point.
  • c1lsc FE crashed when I ran rtcds stop all - had to go and do a manual reboot.
  • Doing so took down the models on c1sus and c1ioo that were running - but these FEs themselves did not have to be robooted.
  • Once c1lsc came back up, I restarted all the models on the vertex FEs. They all came back online fine.
  • Then I ssh-ed into FB1, and restarted the daqd processes - but c1lsc and c1ioo CDS indicators were still red.
  • Looks like the mx_stream processes weren't started automatically on these two machines. Reasons unknown. Earlier today, the same was observed for c1iscey.
  • I manually restarted the mx_stream processes, at which point all CDS indicator lights became green (see Attachment #1).

IFO alignment needs to be redone, but at least we now have a (admittedly rounabout way) of getting testpoints. Did a quick check for "nan-s" on the ASC screen, saw none. So I am re-enabling watchdogs for all optics.

GV 23 August 9am: Last night, I re-aligned the TMs for single arm locks. Before the model restarts, I had saved the good alignment on the EPICs sliders, but the gain of x3 on the coil driver filter banks have to be manually turned on at the moment (i.e. the safe.snap file has them off). ALS noise looked good for both arms, so just for fun, I tried transitioning control of both arms to ALS (in the CARM/DARM basis as we do when we lock DRFPMI, using the Transition_IR_ALS.py script), and was successful.

Quote:

[jamie, gautam]

We tried to implement the fix that Rolf suggested in order to solve (perhaps among other things) the inability of some utilities like dataviewer to open testpoints. The problem isn't wholly solved yet - we can access actual testpoint data (not just zeros, as was the case) using DTT, and if DTT is used to open a testpoint first, then dataviewer, but DV itself can't seem to open testpoints.

Here is what was done (Jamie will correct me if I am mistaken).

  1. Jamie checked out branch 3.4 of the RCG from the SVN.
  2. Jamie recompiled all the models on c1iscex against this version of RCG.
  3. I shutdown ETMX watchdog, then ran rtcds stop all on c1iscex to stop all the models, and then restarted them using rtcds start <model> in the order c1x01, c1scx and c1asx. 
  4. Models came back up cleanly. I then restarted the daqd_dc process on FB1. At this point all indicators on the CDS overview screen were green.
  5. Tried getting testpoint data with DTT and DV for ETMX Oplev Pitch and Yaw IN1 testpoints. Conclusion as above.

So while we are in a better state now, the problem isn't fully solved. 

Comment: seems like there is an in-built timeout for testpoints opened with DTT - if the measurement is inactive for some time (unsure how much exactly but something like 5mins), the testpoint is automatically closed.

 

Attachment 1: CDS_Aug22.png
CDS_Aug22.png
  13244   Tue Aug 22 23:27:14 2017 ranaUpdateALSALS OLTFs

Didn't someone look at what the OLG req. should be for these servos at some point? I wonder if we can make a parallel digital path that we switch on after green lock. Then we could make this a simple 1/f box and just add in the digital path (take analog control signal into ADC, filter, and then sum into the control point further down the path to the laser) for the low frequency boost.

  13245   Wed Aug 23 10:11:46 2017 SteveUpdateVACvacuum valve specifications

The V1 gate valve specs installed at 40m wiki page. VAT model number 10846-UE44-0007 Our main volume pumping goes through this 8" id gate valve V1 to Maglev turbo  or Cryo pump to  VC1

The ion pumps have 6" id gate valves:VAT 10844-UE44-AAY1,  Pneumatic actuator with position indicator and double acting solenoid valve 115V 60Hz  Purchased 1999 Dec 22

UHV gate valves 2.5" id. VAT  10836-UE44    Pneumatic actuator with position indicator and double acting solenoid valve 115V 60 hz, IFO to RGA  VM1 &  RGA to Maglev  VM2

mini UHV gate valve 1.5" id.   VAT  01032-UE01      2016 cataloge page 14,   manual - no position indicator, VM4  next to manual adjustable fine leak valve to RGA

UHV angle valve 1.5" id, model VAT 28432-GE41, Viton plate seal, pneumatic actuator with position indicator & solenoid valve 115V & single acting closing spring  MEDM screen: VM3,VC2, V3,V4,V5,V6,VA6,V7 & annuloses Each chamber annulos has 2 valves.

UHV angle valve 1.5" id, model VAT 57132-GE05   go page 208,   Metal tip seal, manual actuating only with position indicator,   MEDM screen: roughing RV1 and venting VV1 hand wheel needed to close to torque spec

UHV angle valve  1.5" id. model VAT 28432-GE01           Viton plate seal, manual operation only at IT gauges Hornet & Super Bee and  ion pumps roughing  ports. These are not labeled.

                                                      

The Cryo pump interlock wiring was added too

Note: all moving valve plate seals are single.

  13246   Wed Aug 23 17:22:36 2017 gautamUpdateALSFiber ALS - reinstalled

I completed the revamp of the box, and re-installed the box on the PSL table today. I think it would be ideal to install this on one of the electronic racks, perhaps 1X2 would be best. We would have to re-route the fibers from the PSL table to 1X2, but I think they have sufficient length, and this way, the whole arrangement is much cleaner.

Did a quick check to make sure I could see beat notes for both arms. I will now attempt to measure the ALS noise with this revamped box, to see if the improved power supply and grounding arrangement, as well as fiber cleaning, has had any effect.

Photos + power budget + plan of action for using this box to characterize the green PDH locking to follow. 

For quick reference: here is the AM/PM measurement done when we re-installed the repaired Innolight NPRO on the new X endtable.

  13248   Thu Aug 24 00:39:47 2017 gautamUpdateLSCDRMI locking attempt

Since the single arm locking and dither alignment seemed to work alright after the CDS overhaul, I decided to try some recycling cavity locking tonight.

  • First, I locked single arms, ran dither alignment servos, and centered all test mass Oplevs. Note: the X arm dither alignment doesn't seem to work if we use the High-Gain Thorlabs PD as the Transmission PD. The BS loops just seem to pick up large offsets and the alignment actually degrades over a couple of minutes. This needs to be investigated.
  • Next, to get good PRM alignment, I manually moved the EPICS sliders till the REFL spot became roughly centered on the CCD screen.
  • Then I tried locking PRMI on carrier using the usual C1IFOConfigure script - the lock was caught within ~30 seconds.
  • The PRCL and MICH dither servo scripts also ran fine.
    • Centered PRM Oplev.
  • Next, I tried enabling the PRC angular feedforward.
    • OAF model does not automatically revert to its safe.snap configuration on model reboot, so I first manually did this such that the correct filter banks were enabled.
    • I was able to turn on the angular feedforward without disturbing the PRMI carrier lock. The angular motion of the POP spot on the CCD monitor was visibly reduced.
  • At this point I decided to try DRMI locking.
    • I centered the beam on the AS PDs with the simple Michelson.
    • Centered the beam on the REFL PDs with PRM aligned and PRC flashing through resonances.
    • Restored SRM alignment by eye with EPICS sliders.
    • Cavity alignment seemed alright - so I tried to lock DRMI with the old settings (i.e. from DRMI 1f locking a couple of months ago). But I had no success.
    • The behaviour of REFL55 (used for SRCL control) has changed dramatically - the analog whitening gain for this PD used to be +18dB, but at this setting, there are frequent ADC overflows. I had to reduce the whitening gain to +6dB to stop the ADC overflows. I also checked to make sure that the whitening setting was "manual" and not triggered.

Why should this have changed? I was just on the AS table and did re-center the beam onto the REFL 55 RFPD, but I had also done this in April/May when I was last doing DRMI locking. But I can't explain the apparent factor of ~4 increase in light level. I think I have some measurements of the light levels at various PDs from April 2017, I will see how the present levels line up.

Of course dataviever won't cooperate when I am trying to monitor testpoints.

I may be missing something obvious, but I am quitting for tonight, will look into this more tomorrow.


Unrelated to this work: looking at the GTRY spot on the CCD monitor, there seems to be some excess angular motion. Not sure where this is coming from. In the past, this sort of problem has been symptomatic of something going wonky with the Oplev loops. But I took loop measurements for ITMY and ETMY PIT and YAW, they look normal. I will investigate further when I am doing some more ALS work.

  13249   Thu Aug 24 17:36:11 2017 gautamUpdateCDSFSS Slow Python maintenance

A couple of weeks ago, I was trying to modernize the python version of the FSS Slow temperature control loops, when I accidentally ended up deleting it frown. There was no svn backup. So the old Perl PID script has been running for the last few days.

Today, I checked out the latest version that Andrew and co. have running in the PSL lab. I had to make some important modifications for the script to work for the 40m setup.

  1. The script is conveniently setup in a way that the channels it needs to read from / write to are read in from an .ini file. I renamed all the channels to match the appropriate 40m ones.
  2. We don't have a soft epics channel in which to define the setpoint for our PID servo (which is 0). Rather than poke around with slow machine EPICS records, I simply commented out this line in the script and included the hard-coded value of 0. When we modernize to the Acromag era, we can setup an EPICS channel + MEDM slider for the setpoint.
  3. The way the Perl script was setup, the error signal was pre-scaled by a factor of 0.01, supposedly to make the PID gains be of order 1. For consistency, I re-inserted this scaling, which awade and co. had removed.
  4. Modified the FSSslowPy.init file to call the script in accordance with the new syntax:
python FSSSlow.py -i FSSSlowPy.ini

Then I stopped the Perl process on megatron by running

sudo initctl stop FSSslow

and started the Python process by running

sudo initctl start FSSslowPy

I have now committed the files FSSSlow.py and FSSSlowPy.ini to the 40m svn.  Things seem to be stable for the last 20 mins or so, let's keep an eye on this though - although we had been running the Python PID loop for some months, this version is a slightly modified one. 

The initctl stuff still isn't very robust - I think both the Autolocker and the FSS slow servos have to be manually restarted if megatron is shutdown/restarted for whatever reason. It doesn't seem to be a problem with the initctl routine itself - looking at the logs, I can see that init is trying to start both processes, but is failing to do so each time. To be investigated. The wiki procedure to restart this process is up to date.

GV Edit 0000 25 Aug 2017: I had to add a line to the script that checks MC transmission before enabling the PID loop. Change has been committed to svn. Now, when the MC loses lock or if the PSL shutter is kept closed for an extended period of time, the temperature loop doesn't rail.

  13252   Fri Aug 25 01:20:52 2017 gautamUpdateLSCDRMI locking attempt

I tried some DRMI locking again tonight, but had no success. Here is the story.

  • I started out by going to the AS table and measuring the light level on the REFL55 photodiode (with PRM aligned and the PRC flashing, but LSC disabled).
    • The Ophir power meter reads 13mW
    • The DC output of the photodiode shows ~500mV on an oscilloscope.
    • Both of these numbers line up well with measurements I made in April/May.
  • Returned to the control room and aligned the IFO for DRMI locking - but LSC servos remained disabled.
    • At the nominal REFL55 whitening level of +18dB, the REFL 55 signals saturated the ADC (confirmed by looking at the traces on dataviewer).
    • But the signals still looked like PDH error signals.
    • Lowering the whitening gain to 6dB makes the PDH error signal horns peak around 20,000 counts.
    • Could this be indicative of problems with either the analog whitening gain switching or the LSC Demod Boards? To be investigated.
  • Tried enabling LSC servos with same settings with which I had success right up till a couple of months ago, but had no success.
    • If it is true that the REFL55 signal is getting amplified because of some gain stage not being switched correctly, I should still have been able to lock the SRC with a lowered loop gain - but even lowering the gain by a factor of 10 had no effect on the locking success rate.

Looks like I will have to embark on the REFL55 LSC electronics investigation. I was able to successfully lock the PRC on carrier and sideband, and the Michelson lock also seems to work fine, all of which seem to point to a hardware problem with the REFL55 signal chain.

I did a quick check by switching the output of the REFL55 demod board to the inputs normally used by AS55 signals on the whitening board. Setting the whitening gain to +18dB for these channels had the same effect - ADC overflow galore. So looks like the whitening board isn't to blame. I will have to check the demod board out.

 

ELOG V3.1.3-