The second big glitch tripped the ETMX sus. There were small earthquakes around the time of the glitches. Its damping was recovered.
Glitch, small amplitude, 350 counts & no trip.
Here is another big one.
A brief follow-up on this since we discussed this at the meeting yesterday: the attached DV screenshot shows the full 2k data for a period of 2 seconds starting just before the watchdog tripped. It is clear that the timescale of the glitch in the UL channel is much faster (~50 ms) compared to the (presumably mechanical) timescale seen in the other channels of ~250 ms, with the step also being much smaller (a few counts as opposed to the few thousand counts seen in the UL channel, and I guess 1 OSEM count ~ 1 um). All this supports the hypothesis that the problem is electrical and not mechanical (i.e. I think we can rule out the Acromag sending a glitchy signal to the coil and kicking the optic). The watchdog itself gets tripped because the tripping condition is the RMS of the shadow sensor outputs, which presumably exceeds the set threshold when UL glitches by a few thousand counts.
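To illustrate why a fast glitch in a single channel is enough to trip the watchdog, here is a minimal sketch of an RMS-based trip condition. It assumes the watchdog compares the RMS (about the mean) of each shadow sensor output against one threshold; the function names, the 150 count threshold and the sample windows are hypothetical and are not taken from the actual watchdog code.

```python
import math


def rms_about_mean(samples):
    """RMS of the samples after subtracting their mean (needs >= 1 sample)."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))


def watchdog_tripped(sensor_windows, threshold_counts):
    """Return True if any sensor's RMS exceeds the threshold.

    sensor_windows: dict mapping a sensor name (e.g. "UL") to a list of counts.
    threshold_counts: hypothetical trip level, in counts.
    """
    return any(
        rms_about_mean(window) > threshold_counts
        for window in sensor_windows.values()
    )


# Hypothetical windows: UL steps by a few thousand counts,
# the other channels only move by a few counts.
windows = {
    "UL": [0, 0, 3000, 3000],
    "UR": [0, 0, 3, 3],
    "LL": [0, 1, 2, 3],
}
print(watchdog_tripped(windows, threshold_counts=150))
```

With these made-up numbers the UL step alone dominates the RMS, so the trip does not require any real motion of the optic.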
Small earthquakes and suspensions. Which one is the most free and most sensitive: ITMX
I found c1lsc unresponsive again today. Following the procedure in elog #13935, I ran the rebootC1LSC.sh script to perform a soft reboot of c1lsc and restart the epics processes on c1lsc, c1sus, and c1ioo. It worked. I also manually restarted one unresponsive slow machine, c1aux.
After the restarts, the CDS overview page shows the first three models on c1lsc are online (image attached). The above elog references c1oaf having to be restarted manually, so I attempted to do that. I connected via ssh to c1lsc and ran the script startc1oaf. This failed as well, however.
In this state I was able to lock the MICH configuration, which is sufficient for my purposes for now, but I was not able to lock either of the arm cavities. Are some of the still-dead models necessary to lock in resonant configurations?
All suspensions tripped. Their damping was restored. The MC is locked.
ITMX-UL & side magnets are stuck.
TP-1 Osaka maglev controller [ model TCO10M, ser V3F04J07 ] needs maintenance. The alarm LED is on, indicating that we need Lv2 service.
The turbo and the controller are in good working order.
Our maintenance level 2 service price is $...... It consists of a complete disassembly of the controller for internal cleaning of all ICB’s, replacement of all main board capacitors, replacement of all internal cooling units, ROM battery replacement, re-assembly, and mandatory final testing to make sure it meets our factory specifications. Turnaround time is approximately 3 weeks.
RMA 5686 has been assigned to Caltech’s returning TC010M controller. Attached please find our RMA forms. Complete and return them to us via email, along with your PO, prior to shipping the cont
Osaka Vacuum USA, Inc.
510-770-0100 x 109
Our TP-1 TG390MCAB is 9 years old. What is the life expectancy of this turbo?
The Osaka maglev turbopumps are designed with a 100,000 hours (or ~10 operating years) life span, but as you know most of our end-users are running their Osaka maglev turbopumps in excess of 10+, 15+ years continuously. The 100,000 hours design value is based upon the AL material being rotated at the given speed. But the design fudge factor has somehow elongated the practical life span.
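As a quick sanity check on the quoted design life, the conversion from operating hours to years can be sketched as below. It assumes continuous (24/7) operation and a 365.25 day year; the helper name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumes the pump runs continuously


def hours_to_years(hours):
    """Convert continuous operating hours to years."""
    return hours / HOURS_PER_YEAR


design_life_years = hours_to_years(100_000)  # vendor's quoted design value
pump_age_years = 9                            # age of TP-1 from the question above

print(f"design life: {design_life_years:.1f} years")
print(f"margin left on design life: {design_life_years - pump_age_years:.1f} years")
```

Under these assumptions the design value is a bit over 11 years, so the pump is approaching, but has not yet reached, the quoted design life.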
We should have the cost of a new maglev & controller in next year's budget. I put the quote into the wiki.
I freed ITMX and coarsely realigned the IFO using the OPLEVs. All the alignments were a bit off from overnight.
The IFO is still only able to lock in MICH mode currently, which was the situation before the earthquake. This morning I additionally tried restoring the burt state of the four machines that had been rebooted in the last week (c1iscaux, c1aux, c1psl, c1lsc) but that did not solve it.
An electrician is coming to fix one of the fluorescent light fixture holders in the east arm tomorrow morning at 8am. He will be out by 9am.
The job did not get done. There was no scaffolding or ladder to reach troubled areas.
c1lsc crashed again. I've contacted Rolf/JHanks for help since I'm out of ideas on what can be done to fix this problem.
Starting c1cal now, let's see if the other c1lsc FE models are affected at all... Moreover, since MC1 seems to be well-behaved, I'm going to restore the nominal eurocrate configuration (sans extender board) tomorrow.
Rolf came by this morning. For now, we've restarted the FE machine and the expansion chassis (note that the correct order in which to do this is: turn off computer--->turn off expansion chassis--->turn on expansion chassis--->turn on computer). The debugging measures Rolf suggested are (i) to replace the old generation ADC card in the expansion chassis which has a red indicator light always on and (ii) to replace the PCIe fiber (2010 make) running from the c1lsc front-end machine in 1X6 to the expansion chassis in 1Y3, as the manufacturer has suggested that pre-2012 versions of the fiber are prone to failure. We will do these opportunistically and see if there is any improvement in the situation.
Another tip from Rolf: if the c1lsc FE is responsive but the models have crashed, then doing sudo reboot by ssh-ing into c1lsc should suffice* (i.e. it shouldn't take down the models on the other vertex FEs, although if the FE is unresponsive and you hard reboot it, this may still be a problem). I've modified the c1lsc reboot script accordingly.
* Seems like this can still lead to the other vertex FEs crashing, so I'm leaving the reboot script as is (so all vertex machines are softly rebooted when c1lsc models crash).
Todd E. came by this morning and gave us (i) 1x new ADC card and (ii) 1x roll of 100m (2017 vintage) PCIe fiber. This afternoon, I replaced the old ADC card in the c1lsc expansion chassis, and have returned the old card to Todd. The PCIe fiber replacement is a more involved project (Steve is acquiring some protective tubing to route it from the FE in 1X6 to the expansion chassis in 1Y3), but hopefully the problem was the ADC card with red indicator light, and replacing it has solved the issue. CDS is back to what is now the nominal state (Attachment #1) and Yarm is locked for Jon to work on his IFO coupling study. We will monitor the stability in the coming days.
Looks like the ADC was not to blame, same symptoms persist.
Gautam and I restarted the models on c1lsc, c1ioo, and c1sus. The LSC system is functioning again. We found that only restarting c1lsc as Rolf had recommended did actually kill the models running on the other two machines. We simply reverted the rebootC1LSC.sh script to its previous form, since that does work. I'll keep using that as required until the ongoing investigations find the source of the problem.
LIGO GC notified us that nodus had SSL2.0 and SSL3.0 enabled. This has been disabled now.
The details are described on the 40m wiki.
The PMC and IMC were unlocked. Both were re-locked, and the alignment of both cavities was adjusted so as to maximize MC2 trans (by hand, input alignment to PMC tweaked on PSL table, IMC alignment tweaked using slow bias voltages). I disabled the inputs to the WFS loops, as it looks like they are not able to deal with the glitching IMC suspensions. c1lsc models have crashed again but I am not worrying about that for now.
9pm: The alignment is wandering all over the place so I'm just closing the PSL shutter for now.
Yuki Miyazaki received 40m specific basic safety training.
I restarted the LSC models in the usual way via the c1lsc reboot script. After doing this I was able to lock the YARM configuration for more noise coupling scripting.
M3.4 Colton shake did not trip sus.
[steve, yuki, gautam]
The plastic tubing/housing for the fiber arrived a couple of days ago. We routed ~40m of fiber through roughly that length of the tubing this morning, using some custom implements Steve sourced. To make sure we didn't damage the fiber during this process, I'm now testing the vertex models with the plastic tubing just routed casually (= illegally) along the floor from 1X4 to 1Y3 (NOTE THAT THE WIKI PAGE DIAGRAM IS OUT OF DATE AND NEEDS TO BE UPDATED), and have plugged in the new fiber to the expansion chassis and the c1lsc front end machine. But I'm seeing a DC error (0x4000), which is indicative of some sort of timing error (Attachment #1) **. Needs more investigation...
Pictures + more procedural details + proper routing of the protected fiber along cable trays after lunch. If this doesn't help the stability problem, we are out of ideas again, so fingers crossed...
** In the past, I have been able to fix the 0x4000 error by manually rebooting fb (simply restarting the daqd processes on fb using sudo systemctl restart daqd_* doesn't seem to fix the problem). Sure enough, this seems to have done the job this time as well (Attachment #2). So my initial impression is that the new fiber is functioning alright.
This didn't go as smoothly as planned. While there were no issues with the new fiber over the ~3 hours that I left it plugged in, I didn't realize the fiber has distinct ends for the "HOST" and "TARGET" (-5 points to me I guess). So while we had plugged in the ends correctly (by accident) for the pre-lunch test, while routing the fiber on the overhead cable tray, we switched the ends (because the "HOST" end of the cable is close to the reel and we felt it would be easier to do the routing the other way).
Anyway, we will fix this tomorrow. For now, the old fiber was re-connected, and the models are running. IMC is locked.
[steve, koji, gautam]
We took another pass at this today, and it seems to have worked - see Attachment #1. I'm leaving CDS in this configuration so that we can investigate stability. IMC could be locked. However, due to the vacuum slow machine having failed, we are going to leave the PSL shutter closed over the weekend.
Steve pointed out that some of the vacuum MEDM screen fields were reporting "NO COMM". Koji confirmed that this is a c1vac1 problem, likely the same as reported here and can be fixed using the same procedure.
However, Steve is worried that the interlock won't kick in in case of a vacuum emergency, so we are leaving the PSL shutter closed over the weekend. The problem will be revisited on Monday.
Multiple realtime processes on c1sus are suffering from frequent time outs. It eventually knocks out c1sus (process).
Obviously this has started since the fiber swap this afternoon.
gautam 10pm: there are no clues as to the origin of this problem on the c1sus frontend dmesg logs. The only clue (see Attachment #3) is that the "ADC" error bit in the CDS status word is red - but opening up the individual ADC error log MEDM screens show no errors or overflows. Not sure what to make of this. The IOP model on this machine (c1x02) reports an error in the "Timing" bit of the CDS status word, but from the previous exchange with Rolf / J Hanks, this is down to a misuse of ADC0 Ch31 which is supposed to be reserved for a DuoTone diagnostic signal, but which we use for some other signal (one of the MC suspension shadow sensors iirc). The response is also not consistent with this CDS manual - which suggests that an "ADC" error should just kill the models. There are no obvious red indicator lights in the c1sus expansion chassis either.
We had another crash of c1sus and Gautam did a full power cycle of c1sus. It was a struggle to recover all the frontends, but this solved the timing issue.
We went through a full reset of c1sus, and rebooted all the other RT hosts, as well as daqd and fb1.
[ Yuki, Koji, Gautam ]
The alignment of the AUX Y end green beam was bad. With Koji and Gautam's advice, it was recovered on Friday. The maximum value of TRY was about 0.5.
Following the procedure in this elog, we effected a reset of the vacuum slow machines. Usually, I just turn the key on these crates to do a power cycle, but Steve pointed out that for the vacuum machines, we should only push the "reset" button.
While TP1 was spun down, we took the opportunity to replace the TP1 controller with a spare unit the company has sent us for use while our unit is sent to them for maintenance. The procedure was in principle simple (I only list the additional ones, for the various valve closures, see the slow machine reset procedure elog):
However, we were foiled by a Phillips screw on the DB37 connector labelled "MAG BRG", whose head was completely worn out. We had to make a cut in this screw using a saw blade, and use a "-" screwdriver to get this troublesome screw out. Steve suspects this is a metric gauge screw, and will request the company to send us a new one; we will replace it when re-installing the maintained controller.
Attachments #1 and #2 show the Vacuum MEDM screen before and after the reboot respectively - evidently, the fields that were reading "NO COMM" now read numbers. Attachment #3 shows the main volume pressure during this work.
Precondition: c1vac1 & c1vac2 all LED warning lights green [ atm3 ], the only error message is in the gauge readings NO COMM, dataviewer will plot zero [ atm1 ], valves are operational
When our vacuum gauges read " NO COMM " then our INTERLOCKS do NOT communicate either.
So the V1 gate valve and the PSL output shutter can not be triggered to close if the IFO pressure goes up.
[ only the CC1_HORNET_PRESSURE reading is working in this condition because it goes to a different computer ]
I've been plugging away at Altium prototyping the high-voltage bias idea, this is meant to be a progress update.
I need to get footprints for some of the more uncommon parts (e.g. PA95) from Rich before actually laying this out on a PCB, but in the meantime, I'd like feedback on (but not restricted to) the following:
I also don't have a good idea of what the PCB layer structure (2 layers? 3 layers? or more?) should be for this kind of circuit, I'll try and get some input from Rich.
*Updated with current noise (Attachment #2) at the output for this topology of series resistance of 25 kohm in this path. Modeling was done (in LTspice) with a noiseless 25kohm resistor, and then I included the Johnson noise contribution of the 25k in quadrature. For this choice, we are below 1pA/rtHz from this path in the band we care about. I've also tried to estimate (Attachment #3) the contribution due to (assumed flat in ASD) ripple in the HV power supply (i.e. voltage rails of the PA95) to the output current noise, seems totally negligible for any reasonable power supply spec I've seen, switching or linear.
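The Johnson noise contribution of the series resistor mentioned above can be checked with a short calculation. This is only a sketch: it assumes T = 300 K and that the load impedance is negligible compared to the 25 kohm series resistor, so the current noise is sqrt(4 k_B T / R). The amplifier contribution used in the quadrature sum is a made-up placeholder, not an LTspice result.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def johnson_current_noise(resistance_ohm, temperature_k=300.0):
    """Johnson current noise ASD [A/rtHz] of a resistor: sqrt(4 k_B T / R)."""
    return math.sqrt(4.0 * K_B * temperature_k / resistance_ohm)


def quadrature_sum(*terms):
    """Combine uncorrelated noise ASDs in quadrature."""
    return math.sqrt(sum(t ** 2 for t in terms))


i_johnson = johnson_current_noise(25e3)  # 25 kohm series resistance
i_amp = 0.3e-12                          # hypothetical amplifier term [A/rtHz]
i_total = quadrature_sum(i_johnson, i_amp)

print(f"Johnson noise of 25k: {i_johnson * 1e12:.2f} pA/rtHz")
print(f"total (with placeholder amp term): {i_total * 1e12:.2f} pA/rtHz")
```

Under these assumptions the resistor alone contributes roughly 0.8 pA/rtHz, which is in line with the statement that this path stays below 1 pA/rtHz.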
We have been working on double checking the noise budget calculations. We wanted to evaluate the amount of squeezing for a few different scenarios that vary in cost and time. Here are the findings:
All calculations done with
Main unbudgeted noises:
Threat matrix has been updated.
This is the procedure I follow when I take these measurements for the XARM (symmetric under XARM <-> YARM):
Information for the armloss measurement:
Note: The scripts use the httplib2 module. You have to install it if you don't have it.
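A quick way to check whether httplib2 is available before running the scripts is sketched below. The helper name is hypothetical; if the check fails, the module can usually be installed with `pip install httplib2`.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


if not module_available("httplib2"):
    print("httplib2 is missing; install it before running the armloss scripts")
```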
Locked arms are needed to calculate the arm loss, but the PMC alignment is very bad now. So first I will align it. (Gautam aligned it and the PMC is locked now.)
gautam: The PMC alignment was fine, the problem was that the c1psl slow machine had become unresponsive, which prevented the PMC length servo from functioning correctly. I rebooted the machine and undid the alignment changes Yuki had made on the PSL table.
Gautam and Steve,
Our TP3 drypump seal is at 360 mT [0.25A load on small turbo] after one year. We tried to swap in the old spare drypump with a new tip seal. It was blowing its fuse, so we could not do it.
The noisy aux drypump was turned on and opened to the TP3 foreline [ two drypumps are in the foreline now ]. The pressure is 48 mT and 0.17A load on small turbo.
With Gautam's help, Y-arm was locked. Then I ran the script "armloss_dcrefl_asdcpd_scope.py" which gets the signals from oscilloscope. It ran and got data, but I found some problems.
Anyway, I got the data needed so I will calculate the loss after converting the format.
We want to measure the pressure gradient in the 40m IFO
Our old MKS cold cathodes are out of order. The existing working gauge at the pumpspool is InstruTech CCM501
The plan is to purchase 3 new gauges for ETMY, BS and MC2 location.
Basic cold cathode or Bayard-Alpert Pirani
I ran the script for measuring arm loss and calculated a rough, provisional Y-arm round trip loss. The result was 89.6ppm. (The error should be considered later.)
The measurement was done as follows:
('AS_DARK =', '0.0019517200000000003') #dark noise at ASDC
('MC_DARK =', '0.02792') #dark noise at MC2 trans
('AS_LOCKED =', '2.04293') #beam power at ASDC when the cavity was locked
('MC_LOCKED =', '2.6951620000000003')
('AS_MISALIGNED =', '2.0445439999999997') #beam power at ASDC when the cavity was misaligned
('MC_MISALIGNED =', '2.665312')
#normalized beam power
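A sketch of the normalization step, using the values printed above. It assumes that "normalized beam power" means the dark-subtracted ASDC level divided by the dark-subtracted MC2 transmission level; this is my reading of the method, not a copy of the actual script, and the function name is hypothetical.

```python
def normalized_power(as_level, mc_level, as_dark, mc_dark):
    """Dark-subtracted ASDC level divided by dark-subtracted MC2 trans level."""
    return (as_level - as_dark) / (mc_level - mc_dark)


# Values copied from the printout above
AS_DARK = 0.0019517200000000003
MC_DARK = 0.02792
AS_LOCKED = 2.04293
MC_LOCKED = 2.6951620000000003
AS_MISALIGNED = 2.0445439999999997
MC_MISALIGNED = 2.665312

p_locked = normalized_power(AS_LOCKED, MC_LOCKED, AS_DARK, MC_DARK)
p_misaligned = normalized_power(AS_MISALIGNED, MC_MISALIGNED, AS_DARK, MC_DARK)

# Ratio P(Locked)/P(misaligned), the quantity the loss estimate is built on
ratio = p_locked / p_misaligned
print(f"P_locked = {p_locked:.5f}, P_misaligned = {p_misaligned:.5f}")
print(f"P_locked / P_misaligned = {ratio:.5f}")
```

Only the ratio is computed here; turning it into a loss number needs the mode matching, modulation depths and mirror transmissions, which are not reproduced in this sketch.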
the script "armloss_AS_calc.py",
Some changes were made in the script for getting the signals of beam power:
In yesterday's measurement the ASDC beam power was higher when locked than when misaligned, and I wrote that this may be caused by an over-coupled cavity. I then did the following calculation to explain this:
DASWG is not what we want to use for config; we should use the K. Thorne LLO instructions, like I did for ROSSA.
pianosa has been upgraded to SL7. I've made a controls user account, added it to sudoers, did the network config, and mounted /cvs/cds using /etc/fstab. Other capabilities are being slowly added, but it may be a while before this workstation has all the kinks ironed out. For now, I'm going to follow the instructions on this wiki to try and get the usual LSC stuff working.
I used these values for measuring armloss:
then the uncertainties reported by the individual measurements are on the order of 6 ppm (~6.2 for the XARM, ~6.3 for the YARM). This accounts for fluctuations of the data read from the scope and uncertainties in mode-matching and modulation depths in the EOM. I made histograms for the 20 datapoints taken for each arm: the standard deviation of the spread is over 6ppm. We end up with something like:
XARM: 123 +/- 50 ppm
YARM: 152 +/- 50 ppm
This result has about 40% uncertainty in XARM and 33% in YARM (so big... ).
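The quoted percentages follow directly from the numbers above, as this small check shows (the helper name is hypothetical):

```python
def relative_uncertainty(value, sigma):
    """Fractional uncertainty sigma / value."""
    return sigma / value


results_ppm = {"XARM": (123.0, 50.0), "YARM": (152.0, 50.0)}

for arm, (loss, sigma) in results_ppm.items():
    frac = relative_uncertainty(loss, sigma)
    print(f"{arm}: {loss:.0f} +/- {sigma:.0f} ppm ({100 * frac:.0f}%)")
```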
In the previous measurement, the fluctuation of each power was 0.1% and the fluctuation of P(Locked)/P(misaligned) was also 0.1%. Then the uncertainty was small. On the other hand in my measurement, the fluctuation of power is about 2% and the fluctuation of P(Locked)/P(misaligned) is 2%. That's why the uncertainty became big.
We want to measure a tiny value of loss (~100ppm). So the fluctuation of P(Locked)/P(misaligned) must be smaller than 1.6%.
(Edit on 10/23)
I think the error is dominated by systematic error in the scope. The beam power data had only 3 digits. If P(Locked) and P(misaligned) have 2% error, then
You have to check the configuration of the scope.
but there's one weirdness: It gets the channel offset wrong. However this doesn't matter in our measurement because we're subtracting the dark level, which sees the same (wrong) offset.
When you do this measurement with an oscilloscope, take care of two things:
Steve & Bob,
Bob removed the head cover from the housing to inspect the condition of the tip seal. The tip seal was fine but the viton cover seal had a bad hump. This misaligned the tip seal and did not allow it to rotate.
It was repositioned and carefully tightened. It worked. Its starting current transient measured 28 A, and 3.5 A in operational mode.
This load is normal for an old pump. For comparison, the brand new spare DIP7 drypump drew 25 A at start and 3.1 A in operational mode. It is amazing how much punishment a slow blow ceramic 10A fuse can take [ 0215010.HXP ]
In the future one should measure the current pick up [ transient <100ms ] after the seal change with a Fluke 330 Series Current Clamp.
It was swapped in and the foreline pressure dropped to 24 mTorr after 4 hours. It is very good. TP3 rotational drive current 0.15 A at 50K rpm 24C
The scripts for measuring armloss are in the directory "/opt/rtcds/caltech/c1/scripts/lossmap_scripts/armloss_scope".
The main laser went off when PSL doors were opened-closed. It was turned back on and the PSL is locked.
As part of the preparation for the replacement of c1susaux with Acromag, I inspected the coil-osem transfer function measurements for the vertex SUSs.
The TFs showed typical f^-2 with the whitening on except for ITMY UL (Attachment 1). Gautam told me that this is a known issue for ~5 years.
We made a thorough inspection/replacement of the components and identified the mechanism of the problem.
It turned out that the inputs to MAX333s are as listed below.
The switching voltage for UL is obviously incorrect. We thought this comes from the broken BIO board and thus swapped the corresponding board. But the issue remained. There are 4 BIO boards in total on c1sus, so maybe we have replaced a wrong board?
Initially, we thought that the BIO can't drive the pull-up resistor of 5KOhm from 15V to 0V (=3mA of current). So I replaced the pull-up resistor with a 30KOhm one. But this did not help. These 30Ks are left on the board.
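The current the BIO output has to sink to pull the line to 0V is just Ohm's law. A short check of the two resistor values mentioned above (the function name is hypothetical):

```python
def pullup_sink_current(supply_v, pullup_ohm):
    """Current [A] sunk when the output pulls the line from supply_v to 0 V."""
    return supply_v / pullup_ohm


for r_ohm in (5e3, 30e3):
    i_ma = pullup_sink_current(15.0, r_ohm) * 1e3
    print(f"{r_ohm / 1e3:.0f} kOhm pull-up: {i_ma:.1f} mA")
```

Going from 5KOhm to 30KOhm reduces the required sink current by a factor of 6, so if drive strength were the limitation the swap should have helped.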
Gautam & Steve,
Our controller is back with Osaka maintenance completed. We swapped it in this morning.
Chub Osthelder received 40m specific basic safety training today.
Steve reported to me that the CC1 Hornet gauge was not reporting the IFO pressure after some cable tracing at EX. I found that the power to the unit had been accidentally disconnected. I re-connected the power and manually turned on the HV on the CC gauge (perhaps this can be automated in the new vacuum paradigm). IFO pressure of 8e-6 torr is being reported now.
Physical Plant is cleaning our roof and gutters today.
Let's install Jamie's new Data Viewer
I'm continuing the arm loss measurements Yuki was making. I'm first familiarizing myself with the procedures for the measurement Johannes describes.
I'm not very familiar with the medm screens, so I'm just kind of poking around and checking with Gautam. I do the following:
I've left the script running.
Some facts which should be considered when doing this measurement and the associated uncertainty: