The vacuum safety policy and design are not clear to me, and I don't know what the first and second defense is. Since we had limited time and bandwidth during the remotely-supported recovery work today, we wanted to work step by step.
The pressure rising rate is 20mtorr/day, and turning on TP3 early next week will resume the main-volume pumping without too much hustle. If you need the IFO time now, contact with Jon and use backing with TP3.
Didn't mean to sound whiny. I will wait until the vacuum team tells me it is okay.
I will be at the 40m today at 10am to deliver optics to Downs and to replace the TP2 controller.
ITMU01 / ITMU02 as well as the five E1800089 mirrors came back to the 40m. Instead, the two ETM spares (ETMU06 / ETMU08) were delivered to GariLynn.
Jordan worked on transportation.
Note that the E1800089 mirrors are together with the ITM container in the precious optics cabinet.
[Jon, Jordan, Koji]
Today Jordan reconfigured the vac system to allow pumping of the main volume resume, with Jon and Koji remotely advising. All clear to resume normal IFO activities. However, the vac system is operating in a temporary configuration that will have to be reverted as we locate replacement components. Details below.
Since serial readback of the TP2 controller seems to be failing, we reconfigured the system with TP3 now backing for TP1. TP2 was valved off (at V4) and shut down until we can replace its controller.
TP3 has its own problems, however. It was valved off in January after its temperature readback began glitching and spuriously triggering the interlocks [ELOG 15140]. However the problem appears to be limited only one readback (rotation speed, current, voltage are fine) and there is enough redundancy in the pump-dependent interlock conditions to safely connect it to the main volume.
We also discovered that sometime since January, the TP3 dry pump has failed. The foreline pressure had risen to 165 torr. Since the TP2 and TP3 dry pumps are not interchangeable (Agilent vs. Varian), we instead valved in the auxiliary dry pump and disconnected the failed dry pump using a KF blank. This is a temporary arrangement until the permanent dry pump can be repaired. Jordan removed it to replace the tip seals and will test it in the bake lab before reinstalling.
With this configuration in place, we proceeded to pump down the main volume without issue (attachment 1). We monitored the pumpdown for about 45 min., until the pressure had reached ~1E-5 torr and TP3 had been transitioned to standby (low-speed) mode.
I replaced an empty N2 cylinder, there are now two empty tanks in the outside rack.
I missed the vacuum discussion on the call today, but I have some questions/comments:
At the very least, I think we should consider making the interlock code have levels (like interrupts on a micro controller). So if the pressure gauges are communicating and are reporting acceptable pressure readings, we should be able to reject unphysical readbacks from the TP controllers.
I still don’t understand why TP2 can’t back TP1, but we just disable all the software interlock conditions contingent on TP2 readbacks. This pump is far newer than TP3, and unless I’ve misunderstood something major about the vacuum infrastructure, I don’t really see why we should trust this flaky serial readbacks for any actionable interlocks, at least without some AND logic (since temperature, current and speed aren’t really independent variables).
I also think we should finally implement the email alert in the event the vacuum interlock is tripped. I can implement this if no one else volunteers.
This might also be a good reminder to get the documentation in order about the new vacuum system.
I will be at the 40m today from 9:30am to 4pm.
Looking at images of the old vac screens, the TP2/3 rotation speed and status string were digitally monitored. However I don't know if there were software interlocks predicated on those.
The temperature and current interlocks are implemented precisely because the pumps can shut themselves off. The concern is not about damaging the pumps (their internal logic protects against that); it's that a pump could automatically shut down and back-vent the IFO to atmosphere. Another interlock (e.g., the pressure differentials) might catch it, but it would depend on the back-vent rate and the scenario has never been tested. The temperature and current interlocks are set to trip just before the pump reaches its internal shut-down threshold.
One way we might be able to reduce our reliance on the flaky serial readbacks is to implement rotation-speed hardware interlocks. The old vac documentation alludes to these, but as far as Chub and I could determine in 2018, they never actually existed. The older turbo controllers, at least, had an analog output proportional to speed which could be used to control a relay to interrupt the V4/5 control signals. I'll look into this for the new controllers. If it could be done, we could likely eliminate the layer of serial-readback interlocks altogether.
That would be awesome if you're willing to volunteer. I agree this would be great to have.
I agree there were MEDM fields, but I can't find any record of these channels being recorded till 2018 December, so I don't agree that they were being digitally monitored. You can also look back in the elog (e.g. here and here) and see that the display fields are just blank. I would then assume that no interlocks were dependent on these channels, because otherwise the vacuum interlocks would be perpetually tripped.
Sorry but I'm having trouble imagining a scenario how the pressure gauges wouldn't register this before the IFO volume is compromised. Is there some back of the envelope calculations I can do to understand this? Since both the pressure gauges and the TP diagnostic channels are being monitored via EPICS, the refresh rate is similar, so I don't see how we can have a pump temperature / speed / current threshold tripped but NOT have this be registered on all the pressure gauges, seems like a bit of a contrived scenario to me. Our thresholds currently seem to be arbitrary numbers anyway, or are they based on some expected backstreaming rate? Isn't this scenario degenerate with a leak elsewhere in the vacuum envelope that would be caught by the differential pressure interlocks?
For the email alert, can you expose a soft channel that is a flag - if this flag is not 1, then the service will send out an email.
Right, I doubt they were ever recorded or used for interlocks. But the readbacks did work at one point in the past. There's a photo of the old vac monitor screen on p. 19 of E1500239 (last updated 2017) which shows the fields once alive.
I don't disagree that the pressure gauges would register the change. What I'm not sure about is whether the change would violate any of the existing interlock conditions, triggering a shutdown. Looking at what we have now, the only non-pump-related conditions I see that might catch it are the diffpres conditions:
abs(P2 - PTP2) > 1 torr (for a TP2 failure)
abs(P3 - PTP3) > 1 torr (for a TP3 failure)
abs(P1a - P2) > 1 torr (for either a TP2 or TP3 failure)
For the P1a-P2 differential, the threshold of 1 torr is the smallest value that in practice still allows us to pump down the IFO without having to disable the interlocks (P1a-P2 is the TP1 intake/exhaust differential). The purpose of the P2-PTP2/P3-PTP3 differentials is to prevent V4/5 from opening and suddenly exposing the spinning turbo to high pressure. I'm not aware of a real damage threshold calculation that any one has done; I think < 1 torr is lore passed down by Steve.
If a turbo pump fails, the rate it would backstream is unknown (to me, at least) and likely depends on the failure mode. The scenario I'm concerned about is if the backstream rate is slower than the conduction time through the pumspool and into the main volume. In that case, the pressure gauges will rise more or less together all the way up to atmosphere, likely never crossing the 1 torr differential pressure thresholds.
There's already a channel C1:Vac-error_status, where if the value is anything other than an empty string, there is an interlock tripped. Does that work?
I removed the backing pumps for TP2 and TP3 today to test ultimate pressure and determine if they need a tip seal replacement. This was done with Jon backing me on Zoom. We closed off TP3 and powered down TP3 and the auxilliary pump, in order to remove the forepumps from the exhaust line.
Once pumps were removed I connected a Pirani gauge to the pump directly and pumped down, results as follows:
TP2 Forepump (Agilent IDP 7):
TP3 Forepump (Varian SH 110):
TP3 forepump defintely needs a new tip seal, and while the pressure on TP2 Forepump was good there was a significant amount of particulate that came out of the exhaust line, so a new tip seal might not be needed but it is recommeded.
So why not just have a special mode for the interlock code during pumpdown and venting, and during normal operation we expect the main volume pressure to be <100uTorr so the interlock trips if this condition is violated? These can just be EPICS buttons on the Vac control MEDM screen. Both of these procedures are not "business as usual", and even if we script them in the future, it's likely to have some operator supervising, so I don't think it's unreasonable to have to switch between these modes. I just think the pressure gauges have demonstrated themselves to be much more reliable than these TP serial readbacks (as you say, they worked once upon a time, but that is already evidence of its flakiness?). The Pirani gauges are not ultra-reliable, they have failed in the past, but at least less frequently than this serial comm glitching. In fact, if these readbacks are so flaky, it's not impossible that they don't signal a TP shutdown? I just think the real power of having these multi-channel diagnostics is lost without some AND logic - a turbopump failure is likely to result in an increase in pump current and temperature increase and pump speed decrease, so it's not the individual channel values that should be determining if an interlock is tripped.
I definitely think that protecting the vacuum envelope is a priority - but I don't think it should be at the expense of commissioning time. But if you think these extra interlocks are essential to the safety of the vacuum system, I withdraw my request.
It would be better to have a flag channel, might be useful for the summary pages too. I will make it if it is too much trouble.
I agree with your assessment, Jordan. If I'm not mistaken the scroll pump for TP2 is new; we had a very early failure with the last new scroll pump (the forepump for TP3) tip seals at just over 5000 hours. Glad to see my replacement seals held up for over 60K hours. If this is the trend with these pumps, we can simply run them to around 60000 hours and replace the seals at that time, rather than waiting for failure! - Chub
I think we should discuss interlock possibilities at a 40m meeting. I'm reluctant to make the system more complicated, but perhaps we can find ways to reduce the reliance on the turbo pump readbacks. I agree they've proven to be the least reliable.
While we may be able to improve the tolerance to certain kinds of hardware malfunctions (and if so, we should), I don't see interlocks triggering on abnormal behavior of critical equipment as the root problem. As I see it, our bigger problem is with all the malfunctioning, mostly end-of-lifetime pieces of vacuum equipment still in use. If we can address the hardware problems, as I'm trying to do with replacements [ELOG 15412], I think that in itself will make the interlocking much less of an issue.
Ok, this can be added pretty easily. Its value will just be toggled between 1 and 0 every time the interlock server raises/clears the existing string channel. Adding the channel will require restarting the whole vac IOC, so I'll do it at a time when Jordan is on hand in case something fails to come back up.
I will be at the 40m today from 9am to 3pm.
For this particular email service, ideally the email should be sent out as soon as the interlock is tripped, so this would require a line of code to be added to the main interlock code. Which I guess would require a restart of the interlock service. So let me know when you guys plan to do the dry-pump tip seal replacement operation (when I presume valves will be closed anyways) so that we can do this in a minimally invasive way.
The four 4x25DSUB and single 8x25DSUB feedthrough flanges have arrived and will be picked up from the dock and brought to the 40M lab.
Tip Seals were replaced on the forepumps for TP2 and TP3, and both are ready to be installed back onto the forelines.
TP2 Forepump Ultimate Pressure: 180 mtorr
TP3 Forepump Ultimate Pressure: 120 mtorr
In ELOG 15368, I had claimed that the POP QPD based feedback servo actuating on the PRM stabilized the lock. I now believe this scheme of sensing using the POP QPD and feeding back to the PRM is not a good topology for stabilizing the PRC angular motion.
I would also like to bring up the topic of implementing some WFS for the interferometer fields again, there doesn't seem to be any mention of this in the procurement/planning for the BHD. It is not obvious to me yet that we need WFS and not just DC QPDs from a noise point of view, but at least we should discuss this.
I want some input about what the short-term (next two weeks) commissioning goals should be.
Before the vacuum fracas, the locking was pretty robust. With some human servoing of the input beam, I could maintain locks for ~1 hour. My primary goals were:
I didn't succeed in either so far.
I guess apart from this, we want to run the ALS scan to try and infer something about the absorption-induced thermal lens. I guess at this point, the costs outweigh the benefits in trying to bring in the SRC as well, since we will be changing the SRC config?
The PSL shutter was closed from the vacuum interlock trip. Today, I did the following:
All looks good for now. I will probably get back to PRFPMI locking Monday.
I will be at the 40m today from 11am to 4pm.
The machine needed a hard reboot as it was un-ssh-able.
The exact time that the machine went down is unknown because the blinkys were not DQ-ed. I've now added these to the EDCU to make these channels actually useful, and we may look back on the reliability (or otherwise) of the Acromag system. To my memory, this is the ~5th time one of the new Acromag servers has needed a hard reboot. While this may be less frequent (?) than the VME machines, perhaps there is some other reason for these dropouts. Maybe something to do with the martian network?
Anyway the machine is back up and running now.
I will be in the Clean and Bake lab today from 10am to 4pm.
Per the discussion at the meeting today, the plan of action is:
If I missed something, please add here.
This earthquake tripped all suspensions and ITMX got stuck. The watchdogs were restored and the stuck optic was released. The IFO was re-aligned, POX/POY and PRMI on carrier locking all work okay.
I will be in the Clean and Bake lab from 11pm to 4pm
I implemented this change today. We only had 100 ohm, 3W resistors in stock (no 200 ohm with adequate power rating). Assuming 10 V is dropped across this resistor, the power dissipation is V^2/R ~ 1 W, so we should have sufficient margin. DCC entry has been updated with new schematic and photo of the component side of the board. Note that the series resistance of the fast actuation path was untouched.
As expected, the requested voltage no longer exceeds the Acromag DAC range, it is now more like 2.5 V. However, I still notice that the MC REFL spot moves somewhat diagonally on the camera image - so maybe the coil gains are seriously imbalanced? Anyway, the WFS control signals can once again be safely offloaded to the slow bias voltages once again, preserving the fast ADC range for other actuation.
The Johnson noise of the series resistor has now increased by a factor of 2, from ~6.4 pA/rtHz to 12.8 pA/rtHz. Assuming a current to force coefficient of 1.6 mN/A per coil, the length noise of the cavity is expected to be 12.8e-12 * 0.064/0.25/(2*pi*100)^2 ~ 8e-18 m/rtHz at 100 Hz. In frequency units, this is 80 uHz/rtHz. I think our IMC noise is at least 10 times higher than this at 100 Hz (in any case, the noise of the coil driver is NOT dominated by the series resistance). Attachment #1 confirms that there isn't any significant MCF noise increase, and I will check with the arm cavity too. Nevertheless, we should, if possible, align the optic better and use as high a series resistance as possible.
The watchdog for MC1 was disabled and the board was pulled out for this work. After it was replaced, the IMC re-locks readily.
But this does not solve the MC1 issue. Only we can do right now is to make the output resister half, for example.
I will be in the Clean and Bake lab today from 11am to 4pm.
While the vacuum system was knocked out, I measured the RF transimpedance (using the AM laser setup, didn't do the shot noise intercept current measurement for now) of all the RFPDs (except PMC REFL). At the very least, the following photodiodes are suspect:
For the remaining photodiodes, I measure a transimpedance that is within ~20% of what is on the wiki page. The notches may benefit from some retuning. While I have the data, I will fit this and post a more complete report on the wiki.
Update July 6 1145am: WFS response plots now have legends mapping quadrants, and I've also added the response of a spare PDA10CF (which is now the new POP22/POP110 photodiode).
Judging by the summary pages, some 18 hours after this change was made and the board re-installed, the MC1 shadow sensors began to report frequent glitches. I can't think of a plausible causal connection, especially given the 18 hour time lag, but also hard to believe there isn't one? As a result, the IMC is no longer able to stay locked for extended periods of time. I did the usual cable squishing, and also took off the lid to see if that helps the situation.
While the reduced series resistance means there is more current flowing through the slow path,
The attached FLIR camera image re-inforces what we already know, that the thermal environment inside the satellite box is horrible. The absolute temperature calibration may be off, but it was difficult to touch the components with a bare finger, so I'd say its definitely > 70 C.
does the FLIR have an option to export image with a colorbar?
How about just leave the lid open? or more open? I don't know what else can be done in the near term. Maybe swap with the SRM sat box to see if that helps?
Hmm I can't seem to export with the colorbar, might be just my phone though. I tried to add some "cursors" with the temperature at a few spots, but the font color contrast is poor so you have to squint really hard to see the temperatures in the photo I attached.
I'll leave the MC1 box open overnight and see if that improves the situation, and if not, I'll switch in the SRM satellite box tomorrow.
I will be in the Clean and Bake lab today from 11:30am to 4pm
There was no improvement to the situation overnight. So, I did the following today:
IMC is now locked again, I will monitor for glitching/stability.
Update 6pm PDT: as shown in Attachment #1, there is a huge difference in the stability of the lock after the sat box swap. Let's hope it stays this way for a while...
A more comprehensive report has been uploaded here. I'll zip the data files and add them there too. In summary:
I'll upload the data and analysis notebook + liso fit files to the wiki as well shortly. The data, a Jupyter notebook making the plots, and the LISO fit files have been uploaded here.
I didn't do it this time but it'd be nice to also do the noise measurement and get an estimate for the shot-noise intercept current.
While I have the data, I will fit this and post a more complete report on the wiki.
Sigh. Do we have a spare sat box?
I will be in the clean and bake lab today from 9am to 4pm.
I injected some sensing lines and measured their responses in the various photodiodes, with the interferometer in a few different configurations. The results are summarized in Attachments #1 - #3. Even with the PRMI (no arm cavities) locked on 1f error signals, the MICH and PRCL signals show up in nearly the same quadrature in the REFL port photodiodes, except REFL165. I am now thinking if the output (actuation) matrix has something to do with this - part of the MICH control signal is fed back to the PRM in order to minimize the appearance of the MICH dither in the PRCL error signal, but maybe this matrix element is somehow horribly mistuned?
Some other mysteries that I will investigate further:
I blew the long lock last night because I forgot to not clear the ASS offsets when trying to find the right settings for running the ASS system at high power. Will try again tonight...
Lock the PRMI on carrier and measure the sensing matrix, see if the MICH and PRCL signals look sensible in 1f and 3f photodiodes.
This problem reared its ugly head again. I am inclined to believe the problem is electronic and not on the light, since the POY channels seem immune to this issue (see Attachment #1). I will investigate in the daytime tomorrow. Note that while the POX photodiode head has ~twice the transimpedance than POY (per measurement), the POY signal gets amplified by a ZHL-500-HLN amplifier before heading to the demod electronics (nominal gain is 19dB = x9). There is also some imbalance in the light level at the photodiodes I guess, because overall, the PDH fringe is ~twice as large for the Y arm as the X arm. Basically, the y-axes of the attached plot cannot be directly compared between POX and POY.
Mostly this is an annoyance - right now, the POX signal is only used for locking and dither aligning the X arm cavity, and so once that is done, the locking can proceed (as long as the other channels, e.g. REFL11, aren't glitching as well...)
I will be in the Clean and Bake lab from 9am to 4pm today.
I re-connected the 3 accelerometers located near the MC1/MC3 chamber. It was a bit tedious to get the cabling sorted - I estimate the cable is ~80m long, and the excess length had to be wound around a spool (see Attachment #1), which wasn't really a 1 person job. It's neat-ish for now, but I'm not entirely satisfied. I think we should get shorter cables (~20m), and also mount the pre-amp/power units in a rack instead of leaving it on the floor. The pre-amp settings are x100 for all three channels. The MC2 channels are powered, but are unconnected to the seismometers - it was too tedious to unroll the other spool yesterday. Apart from this, the cable for the "Z" channel had to be re-seated in the strain relief clamp.
I did not enable any of the CDS filters that convert the raw signal into physical units, so for now, these channels are just recording raw counts.
Update 7pm: the spectra in the current config are here - not sure what to make of the MC2_Z channel appearing to show lower noise?
Update July 13 2020 430pm: This afternoon, I hooked up the MC2 accelerometer channels too...
In an effort to make a second usable workstation, I did the following (remotely) on rossa today (not necessarily in this order, I wasn't maintaining a live log so I forgot):
So, in summary, rossa is now all set up for use during lock acquisition. However, until this machine has undergone a few months of testing, we should freeze the pianosa config and not mess with it.
Note that this version of the "crtools" is rather new. Please, use them and if there is an issue, report the errors! I am going to occassionally try lock acquisition using rossa.
wiped and install Debian 10 on rossa today
still to be done: config it as CDS workstation
please don't try to "fix" it in the meantime
As part of an ongoing effort to improve airflow in workspaces/bathrooms on campus, I have installed an air scrubber unit in each of the bathrooms at the 40m lab.
maybe we should make a "dd" copy of pianosa in case rossa has issues and someone destroys pianosa by accidentally spilling coffee on it.
in the lab, checkin on the WFS
Sun Jul 5 18:25:50 2020
I redid Gautam's measurements to get a baseline before changing the head, and my results are very different: To me it looks like the WFS2 quadrants are all OK.
I've left the setup as is in case either me or Gautam want to double check. If we're agreed on this response, I'll remove the notches and disable the RF attenuators.
Sun Jul 5 21:42:45 2020
sudo usermod -a -G lpadmin controls
and then was able to add Grazia to the list of printers for Rossa by following the instructions on the 40m Wiki.
I installed color syntax highlighting on Rossa using the internet (https://superuser.com/questions/71588/how-to-syntax-highlight-via-less). Now if you do 'less genius_code.py', it will be highlighting the python syntax.
when I try 'sitemap' on rossa I get:
medm: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
This is strange - I was definitely able to launch medm when I was working on this machine remotely on Friday. But now, there does seem to be a problem with this shared library being missing.
First of all, I installed mlocate to find where the shared library files are installed. Then I made the symlink, and now sitemap seems to work again.
Weirdly, my changes to /etc/resolv.conf got overwritten somehow. Was this machine rebooted? Uptime suggests it's only been running for ~6 hours at the time of writing of this elog.
sudo apt install mlocate
sudo ln -s /usr/lib/x86_64-linux-gnu/libreadline.so.7 /usr/lib/x86_64-linux-gnu/libreadline.so.6
medm: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory