40m Log, Page 1 of 325
ID Date Author Type Category Subject
  16350   Mon Sep 20 21:56:07 2021   Koji   Update   Computers   Wifi internet fixed

Ugh, factory resets... Caltech IMSS announced intermittent network service due to maintenance between Sept 19 and 20, and there seems to have been some aftermath from it. Check out "Caltech IMSS".

 

  16349   Mon Sep 20 20:43:38 2021   Tega   Update   Electronics   Sat Amp modifications

Running update of Sat Amp modification work, which involves the following procedure (x8) per unit:

  1. Replace R20 & R24 with 4.99K ohms, R23 with 499 ohms, and remove C16.
  2. (Testing) Connect LEDDrive output to GND and check that
    • TP4 is ~ 5V
    •  TP5-8 ~ 0V. 
  3. Install 40m Satellite to Flange Adapter (D2100148-v1)

 

Unit Serial Number   Issues                                               Status
S1200740             NONE                                                 DONE
S1200742             NONE                                                 DONE
S1200743             NONE                                                 DONE
S1200744             TP4 @ LED1,2 on PCB S2100568 is 13V instead of 5V;   DONE
                     TP4 @ LED4 on PCB S2100559 is 13V instead of 5V
S1200752             NONE                                                 DONE

 

 

 

  Draft   Mon Sep 20 15:42:44 2021   Ian MacMillan   Summary   Computers   Quantization Code Summary

This post summarizes and describes the code used to test the impact of quantization noise on a state-space implementation of the suspension model.

Purpose: We want to use a state-space model in our suspension plant code. Before we can do this, we want to test whether the state-space model is prone to problems with quantization noise. We will build two models, one using a standard Direct Form II filter and one using a state-space model, and then compare the noise from both.

Signal Generation:

First I built a basic signal generator that can produce a sine wave for a specified amount of time then can produce a zero signal for a specified amount of time. This will let the model ring up with the sine wave then decay away with the zero signal. This input signal is generated at a sample rate of 2^16 samples per second then stored in a numpy array. I later feed this back into both models and record their results.
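The generator described above can be sketched as follows. This is a minimal illustration, not the code from the post: the function name `make_excitation`, its parameters, and the 1 Hz example frequency are assumptions; only the 2^16 samples-per-second rate and the sine-then-zero shape come from the text.

```python
import numpy as np

def make_excitation(freq_hz, t_on, t_off, fs=2**16, amplitude=1.0):
    """Return a sine wave lasting t_on seconds followed by t_off seconds of zeros."""
    n_on = int(round(t_on * fs))    # samples during the ring-up
    n_off = int(round(t_off * fs))  # samples during the ring-down
    t = np.arange(n_on) / fs
    sine = amplitude * np.sin(2 * np.pi * freq_hz * t)
    return np.concatenate([sine, np.zeros(n_off)])

# Hypothetical example: 2 s of a 1 Hz sine, then 1 s of zero input.
excitation = make_excitation(freq_hz=1.0, t_on=2.0, t_off=1.0)
```

The same array can then be fed, sample by sample, to both models so that they see identical input.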

State-space Model:

The code can be seen here

The state-space model takes in the list of excitation values and feeds them through a loop that calculates the next value in the output.

Given that the state-space model follows the form

  \dot{x}(t)=\textbf{A}x(t)+ \textbf{B}u(t)   and  y(t)=\textbf{C}x(t)+ \textbf{D}u(t) ,

the model has three parts: the first equation, an integration, and the second equation.

  1. The first equation takes the state x and the excitation u and generates the x dot vector shown on the left-hand side of the first state-space equation.
  2. The second part integrates x dot to obtain the x that is used in the next equation. This uses the velocity and acceleration to step forward to the next x, which is then plugged into the second equation.
  3. The second equation in the state-space representation takes the x vector we just calculated and multiplies it by the sensing matrix C. We don't have a D matrix, so this gives us the next output of our system.

This system is the coded form of the block diagram of the state space representation shown in attachment 1
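The three-part loop above can be sketched as follows. The post does not say which integration scheme is used, so the forward-Euler step here is an assumption, as are the function name `simulate_state_space` and the restriction to a single input and single output (B and C passed as vectors).

```python
import numpy as np

def simulate_state_space(A, B, C, u, dt):
    """Step x_dot = A x + B u, y = C x through the samples of u (no D matrix).

    Uses a forward-Euler integration step, which is an assumption of this sketch.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float).reshape(-1)  # single-input: B is a vector
    C = np.asarray(C, dtype=float).reshape(-1)  # single-output: C is a vector
    x = np.zeros(A.shape[0])
    y = np.empty(len(u))
    for n, u_n in enumerate(u):
        x_dot = A @ x + B * u_n  # part 1: first state-space equation
        x = x + dt * x_dot       # part 2: integrate to the next state
        y[n] = C @ x             # part 3: second equation, without D
    return y
```

For a quantization study, the arithmetic inside the loop would be carried out at the bit length under test; this sketch uses ordinary double precision throughout.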

Direct-II Model:

The Direct Form II filter works in a much simpler way. Because it involves no integration and follows the block diagram shown in Attachment 2, we can use a single difference equation to find the next output. The only complication is that we also have to keep track of the w[n] seen in the middle of the block diagram. We use these two equations to calculate the output value:

y[n]=b_0 w[n]+b_1 w[n-1]+b_2 w[n-2],  where  w[n]=x[n]-a_1 w[n-1]-a_2 w[n-2]
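A minimal sketch of the two difference equations for a single second-order section follows. The function name `direct_form_2` and the coefficient packing are assumptions for illustration; the code in the post may be organized differently.

```python
def direct_form_2(x, b, a):
    """Filter the sequence x with one Direct Form II second-order section.

    b = (b0, b1, b2) are the feed-forward coefficients,
    a = (a1, a2) are the feedback coefficients (a0 is taken to be 1).
    """
    b0, b1, b2 = b
    a1, a2 = a
    w1 = w2 = 0.0  # w[n-1] and w[n-2], starting from rest
    y = []
    for x_n in x:
        w0 = x_n - a1 * w1 - a2 * w2           # w[n]
        y.append(b0 * w0 + b1 * w1 + b2 * w2)  # y[n]
        w2, w1 = w1, w0                        # shift the delay line
    return y
```

Only the two delayed values of w are stored between samples, which is why this form needs less state than keeping separate input and output histories.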

Bit length Control:

 

  Draft   Mon Sep 20 15:25:01 2021   Paco   Summary   Calibration
  16346   Mon Sep 20 15:23:08 2021   Yehonathan   Update   Computers   Wifi internet fixed

Over the weekend and today, the wifi was acting up, with frequent disconnections and no internet access. I tried to log into the web interface of the ASUS wifi, but with no success.

I pushed the reset button for several seconds to restore factory settings. After that, I was able to log in. I did the automatic setup and defined the wifi passwords to be what they used to be.

Internet access was restored. I also unplugged and plugged back all the wifi extenders in the lab and moved the extender from the vertex inner wall to the outer wall of the lab close to the 1X3.

Now, there seems to be wifi reception both in X and Y arms (according to my android phone).

 

  16345   Mon Sep 20 14:22:00 2021   rana   Summary   SUS   PRM and BS Angular Actuation transfer function magnitude measurements

I suggest plotting all the traces in the same plot so we can see their differences. Also remove the 1/f^2 slope so that we can see small differences. Since the optlev servos all have low pass filters around 15-20 Hz, it's not necessary to turn off the optlev servos for this measurement.

I think that, based on the coherence and the number of averages, you should also be able to use Bendat and Piersol to estimate the uncertainty as a function of frequency. And we want to see the comparison coil-by-coil, not in the DoF basis.

4 sweeps for BS and 4 sweeps for PRM.
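As a sketch of the uncertainty estimate suggested above: for a transfer function measured with n independent averages, Bendat and Piersol give the normalized random error of the magnitude as approximately sqrt(1 - γ²) / (|γ| sqrt(2n)), where γ² is the magnitude-squared coherence. The function name below is hypothetical, and it is an assumption that the exported coherence is the magnitude-squared coherence.

```python
import math

def tf_magnitude_rel_error(coherence_sq, n_avg):
    """Approximate fractional random error of |H| at one frequency.

    coherence_sq: magnitude-squared coherence, in the range (0, 1].
    n_avg: number of independent averages.
    """
    if not 0.0 < coherence_sq <= 1.0:
        raise ValueError("coherence_sq must be in (0, 1]")
    if n_avg < 1:
        raise ValueError("n_avg must be at least 1")
    return math.sqrt(1.0 - coherence_sq) / (
        math.sqrt(coherence_sq) * math.sqrt(2.0 * n_avg)
    )
```

Evaluating this at each frequency bin gives an error bar for each trace; the same expression approximates the standard deviation of the phase in radians.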

  16344   Mon Sep 20 14:11:40 2021   Koji   Update   BHD   End DAC Adapter Unit D2100647

I've uploaded the schematic and PCB PDF for End DAC Adapter Unit D2100647.

Please review the design.

  • CH1-8 SUS actuation channels.
    • 5CHs out of 8CHs are going to be used, but for future extensions, all the 8CHs are going to be filled.
    • It involves diff-SE conversion / dewhitening / SE-diff conversion. Does this make sense?
  • CH9-12 PZT actuation channels. It is designed to send out 4x SE channels for compatibility. The channels have the jumpers to convert it to pass through the diff signals.
  • CH13-16 are general purpose DIFF/SE channels. CH13 is going to be used for ALS Laser Slow control. The other 3CHs are spares.

The internal assembly drawing & BOM are still coming.

  16343   Mon Sep 20 12:20:31 2021   Paco   Summary   SUS   PRM and BS Angular Actuation transfer function magnitude measurements

[yehonathan, paco, anchal]

We attempted to find any symptoms for actuation problems in the PRMI configuration when actuated through BS and PRM.

Our logic was to check the angular (PIT and YAW) actuation transfer functions in the 30 to 200 Hz range by injecting appropriately (f^2) enveloped excitations at the SUS-ASC EXC points and reading back using the SUS_OL (oplev) channels.

From the controls, we first restored the PRMI Carrier to bring the PRM and BS to their nominal alignment, then disabled the LSC output (we don't need PRMI to be locked), and then turned off the damping from the oplev control loops to avoid suppressing the excitations.

We used diaggui to measure the 4 transfer function magnitudes PRM_PIT, PRM_YAW, BS_PIT, BS_YAW, as shown below in Attachments #1 through #4. We used the oplev calibrations to plot the magnitude of the TFs in units of urad / counts, and verified the nominal 1/f^2 scaling for all of them. The coherence was made as close to 1 as possible by adjusting the amplitude to 1000 counts, and is also shown below. A dip at 120 Hz is probably due to line noise. We are also assuming that the oplev QPDs have a relatively flat response over the measured frequency range.

  16342   Fri Sep 17 20:22:55 2021   Koji   Update   SUS   EQ M4.3 Long beach

EQ  M4.3 @longbeach
2021-09-18 02:58:34 (UTC) / 07:58:34 (PDT)
https://earthquake.usgs.gov/earthquakes/eventpage/ci39812319/executive

  • All SUS Watchdogs tripped, but the SUSs looked OK except for the stuck ITMX.
  • Damped the SUSs (except ITMX)
  • IMC automatically locked
  • Turned off the damping of ITMX and shook it only with the pitch bias -> Easily unstuck -> damping recovered -> realignment of the ITMX probably necessary.
  • Done.
  16341   Fri Sep 17 00:56:49 2021   Koji   Update   General   Awesome

The Incredible Melting Man!

 

  16340   Thu Sep 16 20:18:13 2021   Anchal   Update   General   Reset

Fridge brought back inside.

Quote:

Put outside.

Quote:

It happened again. Defrosting required.

 

 

  16339   Thu Sep 16 14:08:14 2021   Ian MacMillan   Frogs   Tour

I gave some of the data analysts a look around because they asked and nothing was currently going on in the 40m. Nothing was changed.

  16338   Thu Sep 16 12:06:17 2021   Tega   Update   Computer Scripts / Programs   Temperature sensors added to the summary pages

We can now view the minute trend of the temperature sensors under the PEM tab of the summary pages. See attachment 1 for an example of today's temperature readings. 

  16337   Thu Sep 16 10:07:25 2021   Anchal   Update   General   Melting 2

Put outside.

Quote:

It happened again. Defrosting required.

 

  16336   Thu Sep 16 01:16:48 2021   Koji   Update   General   Frozen 2

It happened again. Defrosting required.

  16335   Thu Sep 16 00:00:20 2021   Koji   Update   General   RIO Planex 1064 Lasers in the south cabinet

RIO Planex 1064 Lasers in the south cabinet

Property Number C30684/C30685/C30686/C30687

  16334   Wed Sep 15 23:53:54 2021   Koji   Summary   General   Towards the end upgrade

The ordered components are in.

- Made 36 more Sat Amp internal boards (Attachment 1). Now we can install the adapters to all the 19 sat amp units.

- Gave Tega the components for the sat amp adapter units. (Attachment 2)

- Gave Tega the components for the sat amp / coil driver modifications.

- Made 5 PCBs for the 16bit DAC AI rear panel interface (Attachment 3)

  16333   Wed Sep 15 23:38:32 2021   Koji   Update   ALS   ALS ASX PZT HV was off -> restored

It was known that the Y end ALS PZTs are not working. But Anchal reported in the meeting that the X end PZTs are not working either.

We went down to the X arm in the afternoon and checked the status. The HV supply (KEPCO) had been turned off at its mechanical switch. I don't know whether this KEPCO can switch itself off after a power glitch.
In any case, we turned the power switch back on. We also saw a large misalignment of the X end green beam, which we adjusted manually. Anchal was able to reach ~0.4 Green TRX, but no more. He claimed that it was ~0.8.

We tried to tweak the SHG temp from 36.4. We found that the TRX had the (local) maximum of ~0.48 at 37.1 degC. This is the new setpoint right now.

  16332   Wed Sep 15 11:27:50 2021   Yehonathan   Update   CDS   c1auxey assembly

{Yehonathan, Paco}

We turned off the ETMX watchdogs and OpLevs. We went to the X end and shut down the Acromag chassis. We labeled the chassis feedthroughs and disconnected all the cables from it.

We took it out and tied the common wire of the power supplies (the commons of the 20V and 15V power supplies were shorted, so it makes no difference which one we connect) to the RTNs of the analog inputs.

The chassis was put back in place. All the cables were reconnected and the power was turned back on.

We rebooted c1auxex and the channels went back online. We turned on the watchdogs and watched the ETMX motion get damped. We turned on the OpLev. We waited until the beam position got centered on the ETMX.

Attachment shows a comparison between the OSEM spectra before and after the grounding work. Seems like there is no change.

We were able to lock the arms with no issues.

 

  16331   Tue Sep 14 19:12:03 2021   Koji   Summary   PEM   Excess seismic noise in 0.1 - 0.3 Hz band

Looks like this increase is correlated for BS/EX/EY. So it is likely to be real.

Comparison between 9/15 (UTC) (Attachment 1) and 9/10 (UTC) (Attachment 2)

  16330   Tue Sep 14 17:22:21 2021   Anchal   Update   CDS   Added temp sensor channels to DAQ list

[Tega, Paco, Anchal]

We attempted to reboot fb1 daqd today to get the new temperature sensor channels recording. However, the FE models got stuck, apparently for the reasons explained in 40m/16325. Jamie cleared the large log files under /var/log on fb1 so that the FEs could reboot. After this, we were able to reboot the FE machines successfully and get the models running too. During the day, the FE machines were shut down and brought back up manually, a couple of times in the case of the c1iscex machine. The only changes on fb1 are in /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini, where the new channels were added, and some hacking done by Jamie in the gpstime module (see 40m/16327).

  16329   Tue Sep 14 17:19:38 2021   Paco   Summary   PEM   Excess seismic noise in 0.1 - 0.3 Hz band

For the past couple of days the 0.1 to 0.3 Hz RMS seismic noise along BS-X has increased. Attachment 1 shows the hour trend in the last ~ 10 days. We'll keep monitoring it, but one thing to note is how uncorrelated it seems to be from other frequency bands. The vertical axis in the plot is in um / s

  16328   Tue Sep 14 17:14:46 2021   Koji   Update   SUS   SOS Tower Hardware

Yup this is OK. No problem.

 

  16327   Tue Sep 14 16:44:54 2021   jamie   Frogs   CDS   fb1 /var full after reboot, caused all sorts of problems

Jonathan Hanks pointed me to this fix to the gpstime kernel module that was unfortunately put in after the 3.4 release that we're currently using:

https://git.ligo.org/cds/advligorts/-/commit/6f6d6e2eb1d3355d0cbfe9fe31ea3b59af1e7348

I hacked the source in place (/usr/src/gpstime-3.4/drv/gpstime/gpstime.c) to get the fix, and then rebuilt the kernel module with dkms :

sudo dkms uninstall gpstime/3.4
sudo dkms install gpstime/3.4

I then stopped daqd_dc, unloaded gpstime, reloaded it, restarted daqd_dc.  The messages are no longer showing up in /var/log/messages, so I think we're ok for the moment.

NOTE: the fix will be undone if we for some reason reinstall the advligorts-gpstime-dkms package.  There shouldn't be a need to do that, but we should be aware.  I'm discussing with Jonathan if we want to try to push out a new debian package to fix this issue...

  16326   Tue Sep 14 16:12:03 2021   Jordan   Update   SUS   SOS Tower Hardware

Yehonathan noticed today that the silver plated hardware on the assembled SOS towers had some pretty severe discoloration on it. See attached picture.

These were all brand new screws from UC components, and have been sitting on the flow bench for a couple of months now. I believe this is just oxidation and is not an issue. I also spoke to Calum and showed him the attached picture, and he agreed that it was likely oxidation and should not be a problem once installed.

He did mention if there is any concern from anyone, we could take an FTIR sample and send it to JPL for analysis, but this would cost a few hundred dollars.

I don't believe this to be an issue, but it is odd that they oxidized so quickly. Just wanted to relay this to everyone else to see if there was any concern.

  16325   Tue Sep 14 15:57:05 2021   jamie   Frogs   CDS   fb1 /var full after reboot, caused all sorts of problems

/var on fb1 filled up today, which caused all sorts of CDS issues.  I found out about the problem by reading the logs of the services that were having trouble running, in which they complained about not being able to write to disk.  I looked at the filesystem status with 'df' and noticed that /var was full, which is where applications write temporary data, and will always cause problems if it's full.

I tracked the issue down to multiple multi-gigabyte log files: /var/log/messages and /var/log/messages.1.  They were full of lines like this one:

Aug 29 06:25:21 fb1 kernel: l called cmd = 1gpstime iotcl called cmd = 1gpstime iotcl called cmd = 1gpstime iotcl called cmd = 1 [... the same "gpstime iotcl called cmd = 1" message repeats hundreds more times on this line ...]

Seems like something related to the gpstime kernel module?

Anyway, I deleted the log files for now, which cleared up the space on /var.  Things should be back to normal now, until the logs fill up again...
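The check described above (look at how full the filesystem is, then find the files responsible) can be sketched in Python as below. This is only an illustration of the procedure, not what was run on fb1, where 'df' was used; the function names are hypothetical.

```python
import os
import shutil

def fill_fraction(path):
    """Fraction of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0

def largest_files(root, count=5):
    """Return up to count (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:count]
```

Calling `fill_fraction("/var")` and `largest_files("/var/log")` periodically would flag the problem before the partition fills up again.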

  16324   Mon Sep 13 18:19:25 2021   Tega   Update   Computer Scripts / Programs   Moved modbus service from chiara to c1susaux

[Tega, Anchal, Paco]

After talking to Anchal, it was made clear that chiara is not the place to host the modbus service for the temperature sensors. The obvious machine is c1pem, but its startup cmd script loads C object files and it is not clear how easily the modbus functionality would integrate, since we can only log in via telnet, so we decided to host the service on c1susaux instead. We also modified the /etc/motd file on c1susaux, which displays the welcome message during login, to inform the user that this machine hosts the modbus service for the temperature sensors. Anchal also plans to document this information on the temperature sensor wiki at some point in the future, when the page is updated to include what has been learnt so far.

We might also consider updating the database file to a more modern way of reading the temperature sensor data, using FLOAT32_LE, which is available in EPICS version 3.14 and above. The current method works but leaves the reader bemused by the bitwise operations that convert the two 16-bit words (A and B) to an IEEE-754 32-bit float, via

field(CALC, "(A&D?(A&C?-1:1):0)*((G|A&E)*J+B)*2^((A&D)/G-F)")

where 

   field(INPA, "$HiWord")
   field(INPB, "$LoWord")
   field(INPC, "0x8000")   # Hi word, sign bit
   field(INPD, "0x7F80")   # Hi word, exponent mask
   field(INPE, "0x00FF")   # Hi word, mantissa mask (incl hidden bit)
   field(INPF, "150")      # Exponent offset plus 23-bit mantissa shift
   field(INPG, "0x0080")   # Mantissa hidden bit
   field(INPJ, "65536")    # Hi/Lo mantissa ratio
   field(CALC, "(A&D?(A&C?-1:1):0)*((G|A&E)*J+B)*2^((A&D)/G-F)")
   field(PREC, "4")

as opposed to the more modern form

field(INP,"@asyn($(PORT) $(OFFSET))FLOAT32_LE")
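To show what the CALC expression above computes, here is a Python sketch that mirrors it term by term and compares it with a direct IEEE-754 reinterpretation of the two words. The function names are hypothetical, and which Modbus register supplies the high word and which the low word depends on the device's word order.

```python
import struct

def calc_record_value(hi, lo):
    """Mirror of (A&D?(A&C?-1:1):0)*((G|A&E)*J+B)*2^((A&D)/G-F)."""
    C, D, E = 0x8000, 0x7F80, 0x00FF  # sign bit, exponent mask, mantissa mask
    F, G, J = 150, 0x0080, 65536      # exponent offset, hidden bit, Hi/Lo ratio
    if not hi & D:
        return 0.0                    # zero exponent is treated as zero
    sign = -1 if hi & C else 1
    mantissa = (G | (hi & E)) * J + lo  # 24-bit mantissa including hidden bit
    exponent = (hi & D) // G - F        # biased exponent minus (127 + 23)
    return sign * mantissa * 2.0 ** exponent

def words_to_float(hi, lo):
    """Reinterpret two 16-bit words (high word first) as an IEEE-754 float32."""
    return struct.unpack(">f", struct.pack(">HH", hi, lo))[0]
```

For normalized numbers the two functions agree; the CALC form returns zero for denormals, infinities aside, which is harmless for temperature readings.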
  16323   Mon Sep 13 17:05:04 2021   Tega   Summary   PEM   Infrasensing temperature sensor modbus configuration

Anchal mentioned it would be good to give more details about how I arrived at the values needed to configure the modbus driver for the temperature sensor, since this information is not in the manual and is hard to find on the internet, so here is a breakdown.

So the generic format is:

drvAsynIPPortConfigure("<TCP_PORT_NAME>","<UNIT_IP_ADDRESS>:502",0,0,1)
modbusInterposeConfig("<TCP_PORT_NAME>",0,5000,0)
drvModbusAsynConfigure("<PORT_NAME>","<TCP_PORT_NAME>",<slaveAddress>,<modbusFunction>,<modbusStartAddress>,<modbusLength>,<dataType>,<pollMsec>,<plcType>)

which in our case become:

drvAsynIPPortConfigure("c1pemxendtemp","192.168.113.240:502",0,0,1)
modbusInterposeConfig("c1pemxendtemp",0,5000,0)
drvModbusAsynConfigure("C1PEMXENDTEMP","c1pemxendtemp",0,4,199,2,0,1000,"ServerCheck")

As can be seen, the parameters of the first two functions, "drvAsynIPPortConfigure" and "modbusInterposeConfig", are straightforward, so we restrict our discussion to the third function, "drvModbusAsynConfigure". After hours of trawling the internet, I was able to piece together a coherent picture of what needs doing, and I have summarised it in the table below.

 

drvModbusAsynConfigure

Once the asyn IP or serial port driver has been created, and the modbusInterpose driver has been configured, a modbus port driver is created with the following command:

drvModbusAsynConfigure(portName,                # used by channel definitions in .db file to reference this unit
                       tcpPortName,             # reference to portName created with drvAsynIPPortConfigure command
                       slaveAddress,            # 
                       modbusFunction,          # 
                       modbusStartAddress,      # 
                       modbusLength,            # length in dataType units
                       dataType,                # 
                       pollMsec,                # how frequently to request a value in [ms]
                       plcType);                #

drvModbusAsynConfigure command
Parameter Data type Description
portName string Name of the modbus port to be created.
 
tcpPortName string Name of the asyn IP or serial port previously created.

tcpPortName = { 192.168.113.240:502, 192.168.113.241:502, 192.168.113.242:502 }
 
slaveAddress int The address of the Modbus slave. This must match the configuration of the Modbus slave (PLC) for RTU and ASCII. For TCP the slave address is used for the "unit identifier", the last field in the MBAP header. The "unit identifier" is ignored by most PLCs, but may be required by some.

ServersCheck API ignores this value, as confirmed with pymodbus query, so set to default value: 
slaveAddress = 0
 
modbusFunction int

modbus supports the following 8 Modbus function codes:

Modbus Function Codes
Access               Function description        Function code
Bit access           Read Coils                  1
Bit access           Read Discrete Inputs        2
Bit access           Write Single Coil           5
Bit access           Write Multiple Coils        15
16-bit word access   Read Input Registers        4
16-bit word access   Read Holding Registers      3
16-bit word access   Write Single Register       6
16-bit word access   Write Multiple Registers    16
modbusStartAddress int Start address for the Modbus data segment to be accessed.
(0-65535 decimal, 0-0177777 octal).

Modbus addresses are specified by a 16-bit integer address. The location of inputs and outputs within the 16-bit address space is not defined by the Modbus protocol, it is vendor-specific. Note that 16-bit Modbus addresses are commonly specified with an offset of 400001 (or 300001). This offset is not used by the modbus driver, it uses only the 16-bit address, not the offset.

For ServersCheck, the offset is "30001", so that

modbusStartAddress = 30200 - 30001 = 199

modbusLength int The length of the Modbus data segment to be accessed.
This is specified in bits for Modbus functions 1, 2, 5 and 15.
It is specified in 16-bit words for Modbus functions 3, 4, 6 and 16.
Length limit is 2000 for functions 1 and 2, 1968 for functions 5 and 15,
125 for functions 3 and 4, and 123 for functions 6 and 16.

ServersCheck uses two's complement 32-bits word (with big-endian byte order & little-endian word order) format to store floating-point data, as confirmed with pymodbus query, so that:

modbusLength = 2
 
modbusDataType int The modbusDataType is used to tell the driver the format of the Modbus data. The driver uses this information to convert the number between EPICS and Modbus. Data is transferred to and from EPICS as epicsUInt32, epicsInt32, and epicsFloat64 numbers.

Modbus data type:
0 = binary, twos-complement format
1 = binary, sign and magnitude format
2 = BCD, unsigned
3 = BCD, signed

Some Modbus devices (including ServersCheck) use floating point numbers, typically by storing a 32-bit float in two consecutive 16-bit registers. This is not supported by the Modbus specification, which only supports 16-bit registers and single-bit data. The modbus driver does not directly support reading such values, because the word order and floating point format is not specified.

Note that if it is desired to transmit BCD numbers untranslated to EPICS over the asynInt32 interface, then data type 0 should be used, because no translation is done in this case. 

For ServersCheck, we wish to transmit the untranslated data, so:

modbusDataType = 0
 
pollMsec int Polling delay time in msec for the polling thread for read functions.
For write functions, a non-zero value means that the Modbus data should
be read once when the port driver is first created.

ServersCheck recommends setting sensor polling interval between 1-5 seconds, so we can try:

pollMsec = 1000
 
plcType string Type of PLC (e.g. Koyo, Modicon, etc.).
This parameter is currently used only to print information in asynReport.
In the future it could be used to modify the driver behavior for a specific PLC.

plcType = "ServersCheck"
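Putting the values from the table together, the three startup commands for one sensor can be generated with a small helper like the one below. This is a convenience sketch, not part of the EPICS modbus module: the function name and its keyword arguments are hypothetical, and deriving the TCP port name by lowercasing the port name simply follows the example given earlier in this entry.

```python
def modbus_startup_lines(port_name, ip, register, offset=30001, slave=0,
                         function=4, length=2, data_type=0,
                         poll_msec=1000, plc_type="ServersCheck"):
    """Build the three startup-script lines for one temperature sensor unit."""
    tcp_port = port_name.lower()  # naming convention taken from the example
    start = register - offset     # e.g. 30200 - 30001 = 199
    return [
        f'drvAsynIPPortConfigure("{tcp_port}","{ip}:502",0,0,1)',
        f'modbusInterposeConfig("{tcp_port}",0,5000,0)',
        f'drvModbusAsynConfigure("{port_name}","{tcp_port}",{slave},'
        f'{function},{start},{length},{data_type},{poll_msec},"{plc_type}")',
    ]

# Hypothetical usage for the X end unit described above.
for line in modbus_startup_lines("C1PEMXENDTEMP", "192.168.113.240", 30200):
    print(line)
```

Keeping the register number and the vendor offset as separate arguments makes the address arithmetic explicit instead of hiding it in a bare 199.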
 

 

Useful links

https://nodus.ligo.caltech.edu:8081/40m/16214

https://nodus.ligo.caltech.edu:8081/40m/16269

https://nodus.ligo.caltech.edu:8081/40m/16270

https://nodus.ligo.caltech.edu:8081/40m/16274

 

http://manuals.serverscheck.com/InfraSensing_Sensors_Platform.pdf

http://manuals.serverscheck.com/InfraSensing_Modbus_manualv5.pdf

https://community.serverscheck.com/discussion/comment/7419#Comment_7419

 

https://wiki-40m.ligo.caltech.edu/CDS/SlowControls

https://www.slac.stanford.edu/grp/ssrl/spear/epics/site/modbus/modbusDoc.html#Creating_a_modbus_port_driver

 

https://github.com/riptideio/pymodbus

 

https://en.wikipedia.org/wiki/Modbus

https://deltamotion.com/support/webhelp/rmctools/Communications/Ethernet/Supported_Protocols/Ethernet_Modbus_TCP.htm

  16322   Mon Sep 13 15:14:36 2021 AnchalUpdateLSCXend Green laser injection mirrors M1 and M2 not responsive

While showing some green laser locking to Tega, I noticed that changing the PZT sliders for the M1/M2 angular position at the Xend had no effect on the locked TEM01 or TEM00 mode. This is odd, as changing these sliders should increase or decrease the mode-matching of these modes. I suspect that the controls are not working correctly and the PZTs are either not powered up or not connected. We'll investigate this in the near future, as priorities allow.

  16321   Mon Sep 13 14:32:25 2021 YehonathanUpdateCDSc1auxey assembly

So we agreed that the RTN points on the c1auxex Acromag chassis should just be grounded to the local Acromag ground, as they just need a stable reference. Normally, the RTNs are not connected to any ground, so there should be no danger of forming ground loops by doing that. It is probably best to use the common wire from the 15V power supplies since it also powers the VME crate. I took the spectra of the ETMX OSEMs (attachment) for reference and am proceeding with the grounding work.

 

  16320   Mon Sep 13 09:15:15 2021 PacoUpdateLSCMC unlocked?

Came in at ~ 9 PT this morning to find the IFO "down". The IMC had lost its lock ~ 6 hours before, so at about 03:00 AM. Nothing seemed like the obvious cause; there was no record of increased seismic activity, all suspensions were damped and no watchdog had tripped, and the pressure trends similar to those in recent pressure incidents show nominal behavior (Attachment #1). What happened?

Anyway, I simply tried reopening the PSL shutter, and the IMC caught its lock almost immediately. I then locked the arms and everything seems fine for now.

  16319   Mon Sep 13 04:12:01 2021 TegaUpdateGeneralAdded temperature sensors at Yend and Vertex too

I finally got the modbus part working on chiara, so we can now view the temperature data on any machine on the martian network, see Attachment 1. 

I also updated the entries on /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini, as suggested by Koji, to include the SensorGatway temperature channels, but I still don't see their EPICs channels on https://ldvw.ligo.caltech.edu/ldvw/view. This means the channels are not available via nds, so I think the temperature data is not being written to frame files on framebuilder. I am not sure what is missing, since I assumed C0EDCU.ini is the framebuilder daq channel list.

When the EPICs channels are available via nds, we should be able to display the temperature data on the summary pages.

Quote:

I've added the other two temperature sensor modules at the Y end (on 1Y4, IP: 192.168.113.241) and in the vertex (on 1X2, IP: 192.168.113.242). I've updated the martian host table accordingly. From inside the martian network, one can point a browser at the IP address to see the temperature sensor status. These sensors can be set to trigger an alarm and send emails/sms etc. if the temperature goes out of a defined range.

I feel something is off though. The vertex sensor shows temperature of ~28 degrees C, Xend says 20 degrees C and Yend says 26 degrees C. I believe these sensors might need calibration.

Remaining tasks are following:

  • Modbus TCP solution:
    • If we get it right, this will be easiest solution.
    • We just need to add these sensors as streaming devices in some slow EPICS machine in their .cmd file and add the temperature sensing channels in a corresponding database file.
  • Python workaround:
    • Might be faster but dirty.
    • We run a python script on megatron which requests temperature values every second or so from the IP addresses and writes them to a soft EPICs channel.
    • We would still need to create a soft EPICs channel for this and add it to the framebuilder data acquisition list.
    • An even shorter workaround for the near future could be to just write the temperature every 30 min to a log file in some location.

[anchal, paco]

We made a script under scripts/PEM/temp_logger.py and ran it on megatron. The script uses the requests package to query the latest sensor data from the three sensors every 10 minutes as a json file and outputs accordingly. This is not a permanent solution.

 

  16318   Thu Sep 9 09:54:41 2021 StephenSummaryBHDBHD OMC invacuum wiring - cable lengths

[Stephen]

Cable lengths task - in vacuum cabling for the green section (new, custom for 40m) and yellow section (per aLIGO, except likely with cheaper FEP ribbon cable material) from QIL/16198. These are the many cables extending from the in vacuum flange to the aLIGO-style on-table Cable Stand (think, for example, D1001347), then from the cable stand to the OMCs.

a) select a position for the cable stand.

 - Koji and I discussed and elected to place it in the (-X, -Y) corner of the table (Northwest in the typical diagram) and near the table edge. This is adjacent to the intended exit flange for the last cable.

b) measure distances (point to point) and cable routing approximations for all items.

 +X OMC (long edge aligned with +Y beam axis) (overview image in Attachment 1)

- QPDs to the cable stand, point to point = 12, routing estimate = 20.
- DCPDs to the cable stand, point to point = 25, routing estimate = 32.
- PZTs to the cable stand, point to point = 21, routing estimate = 32.

+Y OMC (long edge aligned with +Y beam axis) (overview image in Attachment 1)

- QPDs to the cable stand, point to point = 16, routing estimate = 23.
- DCPDs to the cable stand, point to point = 26, routing estimate = 38.
- PZTs to the cable stand, point to point = 24, routing estimate = 33.

Cable stand to flange (Attachment 2) (specific image in Attachment 2)

 - point to point = 35, routing estimate = 42

  16317   Wed Sep 8 19:06:14 2021 KojiUpdateGeneralBackup situation

Tega mentioned in the meeting that it could be safer to separate some of nodus's functions from the martian file system.
That's an interesting thought. The summary pages and other web services are linked to the user dir. This has high traffic and can cause problems on the internal network if the disk crashes.
Or if the internal system is crashed, we still want to use elogs as the source of the recovery info. Also currently we have no backup of the elog. This is dangerous.

We can reduce some of the risks by adding two identical 2TB disks to nodus to accommodate svn/elog/web and their daily backup.

host   | file system or contents    | condition             | note
nodus  | root                       | none or unknown       |
nodus  | home (svn, elog)           | none                  |
nodus  | web (incl summary pages)   | backed up             | linked to /cvs/cds
chiara | root                       | maybe                 | need to check with Jon/Anchal
chiara | /home/cds                  | local copy            | The backup disk is smaller than the main disk.
chiara | /home/cds                  | remote copy - stalled | we used to have, but stalled since 2017/11/17
fb1    | root                       | maybe                 | need to check with Jon/Anchal
fb1    | frame                      | rsync                 | pulled from LDAS according to Tega

 

  16316   Wed Sep 8 18:00:01 2021 KojiUpdateVACcronjobs & N2 pressure alert

In the weekly meeting, Jordan pointed out that we didn't receive the alert for the low N2 pressure.

To check the situation, I went around the machines and summarized the cronjob situation.
[40m wiki: cronjob summary]
Note that this list does not include the vacuum watchdog and mailer as it is not on cronjob.

Now, I found that there are two N2 scripts running:

1. /opt/rtcds/caltech/c1/scripts/Admin/n2Check.sh on megatron and is running every minute (!)
2. /opt/rtcds/caltech/c1/scripts/Admin/N2check/pyN2check.sh on c1vac and is running every 3 hours.

Then, the N2 log file was checked: /opt/rtcds/caltech/c1/scripts/Admin/n2Check.log

Wed Sep 1 12:38:01 PDT 2021 : N2 Pressure: 76.3621
Wed Sep 1 12:38:01 PDT 2021 : T1 Pressure: 112.4
Wed Sep 1 12:38:01 PDT 2021 : T2 Pressure: 349.2
Wed Sep 1 12:39:02 PDT 2021 : N2 Pressure: 76.0241
Wed Sep 1 12:39:02 PDT 2021 : N2 pressure has fallen to 76.0241 PSI !

Tank pressures are 94.6 and 98.6 PSI!

This email was sent from Nodus.  The script is at /opt/rtcds/caltech/c1/scripts/Admin/n2Check.sh

Wed Sep 1 12:40:02 PDT 2021 : N2 Pressure: 75.5322
Wed Sep 1 12:40:02 PDT 2021 : N2 pressure has fallen to 75.5322 PSI !

Tank pressures are 93.6 and 97.6 PSI!

This email was sent from Nodus.  The script is at /opt/rtcds/caltech/c1/scripts/Admin/n2Check.sh

...

The error started at 12:39 (per the log above) and recurred every minute until 13:01. So this was coming from the script on megatron. We were supposed to have ~20 alerting emails (but received none).
So what happened to the mails? I tested the script with my mail address and the test mail came to me. Then I sent the test mail to the 40m mailing list. It did not arrive.
-> Decided to put the mail address (specified in /etc/mailname , I believe) on the whitelist so that the mailing list can accept it.
I ran the test again and it was successful. So I suppose the system can now send us the alert again.
Alerting every minute is excessive, so I changed the check frequency to every ten minutes.
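
An alternative to slowing down the check itself is to keep checking every minute but limit how often mail is sent. The sketch below illustrates that idea only; it is not what n2Check.sh does, and the class name, threshold value, and interval are all hypothetical.

```python
class AlertThrottle:
    """Decide whether a low-pressure reading should trigger a mail.

    Sends at most one alert per `min_interval_s` while the pressure
    stays below `threshold_psi`. Both values are illustrative.
    """

    def __init__(self, threshold_psi, min_interval_s):
        self.threshold_psi = threshold_psi
        self.min_interval_s = min_interval_s
        self.last_sent_s = None  # time of the most recent alert, if any

    def check(self, pressure_psi, now_s):
        if pressure_psi >= self.threshold_psi:
            return False  # pressure is fine, nothing to send
        recently_sent = (
            self.last_sent_s is not None
            and now_s - self.last_sent_s < self.min_interval_s
        )
        if recently_sent:
            return False  # still low, but we already mailed recently
        self.last_sent_s = now_s
        return True

# One reading per minute for 22 minutes, all below the (assumed) threshold.
throttle = AlertThrottle(threshold_psi=76.1, min_interval_s=600)
sent = [t for t in range(0, 22 * 60, 60) if throttle.check(75.5, t)]
print(sent)  # times (in seconds) at which a mail would go out
```

This keeps the one-minute time resolution for catching a pressure drop while cutting the mail volume to one message per ten minutes.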

What's happened to the python version running on c1vac?
1) The script is running, and it reports some errors in the cron report (email on c1vac). But it seems to be working.
2) This script checks the pressures of the bottles rather than the N2 pressure downstream. So it's complementary.
3) During the incident on Sept 1, the checker did not trip as the pressure drop happened between the cronjob runs and the script didn't notice it.
4) On top of that, the alert was set to send the mails only to one of our former grad students. I changed it to deliver to the 40m mailing list. As the "From" address is set to be some ligox...@gmail.com, which is a member of the mailing list (why?), we are supposed to receive the alert. (And we do for other vacuum alerts from this address).

 

 

 

 

  16315   Tue Sep 7 18:00:54 2021 TegaSummaryCalibrationSystem Identification via line injection

[paco]

This morning, I spent some time restoring the jupyter notebook server running on allegra. This server was first set up by Anchal to be able to use the latest nds python API tools, which is handy for the calibration work. The process to restore the environment was to run "source ~/bashrc.d/*" to restore some of the aliases, variables, paths, etc. that made the nds server work. I then ran ssh -N -f -L localhost:8888:localhost:8888 controls@allegra from pianosa and carried on with the experiment.


[paco, hang, tega]

We started a notebook under /users/paco/20210906_XARM_Cal/XARM_Cal.ipynb, whose first part does the following:

  • Set up list of excitations for C1:LSC-XARM_EXC (for example three sine waveforms) using awg.py
  • Make sure the arm is locked
  • Read a reference time trace of the C1:LSC-XARM_IN2 channel for some duration
  • Start excitations (one by one at the moment, ramptime ~ 3 seconds, same duration as above)
  • Get data for C1:LSC-XARM_IN2 for an equal duration (raw data in Attachment #1)
  • Generate the excitation sine and cosine waveforms using numpy and demodulate the raw timeseries using a 4th order lowpass filter with fc ~ 10 Hz
  • Estimate the correct demod phase by computing arctan(Q / I) and rerunning the demodulation to dump the information into the I quadrature (Attachment #2).
  • Plot the estimated ASD of all the quadratures (Attachment #3)
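
The demodulation and phase-rotation steps above can be sketched as follows. This is not the notebook code: it runs on a synthetic time series instead of C1:LSC-XARM_IN2 data, the sample rate and signal parameters are assumed, and it replaces the 4th order lowpass with a plain average over the whole record (which contains an integer number of cycles).

```python
import numpy as np

fs = 16384.0       # sample rate in Hz (assumed for this sketch)
f_line = 53.0      # line frequency in Hz
duration = 4.0     # seconds; holds an integer number of 53 Hz cycles
t = np.arange(int(fs * duration)) / fs

# Synthetic stand-in for the measured data: one line with unknown phase plus noise.
rng = np.random.default_rng(0)
true_amp, true_phase = 2.0, 0.7
x = true_amp * np.cos(2 * np.pi * f_line * t + true_phase)
x = x + 0.1 * rng.standard_normal(t.size)

def demodulate(x, t, f, phase=0.0):
    """Mix with cos/sin at frequency f and average; returns (I, Q)."""
    arg = 2 * np.pi * f * t + phase
    i = 2.0 * np.mean(x * np.cos(arg))
    q = -2.0 * np.mean(x * np.sin(arg))
    return i, q

# First pass: arbitrary demod phase, signal is split between I and Q.
i0, q0 = demodulate(x, t, f_line)
phase_est = np.arctan2(q0, i0)

# Second pass: rotate by the estimated phase so the signal lands in I.
amp_est, q_resid = demodulate(x, t, f_line, phase=phase_est)
print(phase_est, amp_est, q_resid)
```

With a real lowpass filter instead of the average, I and Q become time series, which is what allows an ASD of each quadrature to be plotted.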

[paco, hang, tega]

Estimation of open loop gain:

  • Grab data from the C1:LSC-XARM_IN1 and C1:LSC-XARM_IN2 test points
  • Infer excitation from their difference, i.e. C1:LSC-XARM_EXC = C1:LSC-XARM_IN2 - C1:LSC-XARM_IN1
  • Compute the open loop gain as follows : G(f) = csd(EXC,IN1)/csd(EXC,IN2), where csd computes the cross spectra density of the input arguments
  • For the uncertainty in G, dG, we repeat steps (1) to (3) with & without signal injection in the C1:LSC-XARM_EXC channel. In the absence of signal injection, the signal in C1:LSC-XARM_IN2 is of the form: Y_ref = Noise/(1-G), whereas with nonzero signal injection, the signal in C1:LSC-XARM_IN2 has the form: Y_cal = EXC/(1-G) + Noise/(1-G), so their ratio, Y_cal/Y_ref = EXC/Noise, gives the SNR, which we can then invert to give the uncertainty in our estimation of G, i.e dG = Y_ref/Y_cal.
  • For the excitation at 53 Hz, our measurement for the open loop gain comes out to about 5 dB, which is consistent with previous measurements.
  • We seem to have an SNR in excess of 100 at measurement time of 35 seconds and 1 count of amplitude which gives a relative uncertainty of G of 0.1%
  • The analysis details are ongoing. Feedback is welcome.
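
As a cross-check of the estimator G = csd(EXC,IN1)/csd(EXC,IN2), here is a minimal sketch on synthetic data at a single line frequency, where the cross spectral density reduces to a product of line phasors. The sample rate, noise level, and the phase of G are assumptions; only the 53 Hz line, the 35 s duration, the ~5 dB magnitude, and the convention IN2 = EXC/(1-G) come from the notes above.

```python
import numpy as np

fs, f_line, duration = 2048.0, 53.0, 35.0   # fs is assumed
t = np.arange(int(fs * duration)) / fs
rng = np.random.default_rng(1)

# "True" loop gain used to build the fake data: 5 dB, arbitrary phase.
G_true = 10 ** (5 / 20) * np.exp(1j * 2.0)
exc_phasor = 1.0                          # 1 count of excitation
in2_phasor = exc_phasor / (1 - G_true)    # IN2 = EXC / (1 - G)
in1_phasor = G_true * in2_phasor          # IN1 = G * IN2, so IN2 - IN1 = EXC

def synth(phasor):
    """Real time series for a line phasor, plus a little sensor noise."""
    clean = np.real(phasor * np.exp(2j * np.pi * f_line * t))
    return clean + 0.01 * rng.standard_normal(t.size)

in1, in2 = synth(in1_phasor), synth(in2_phasor)
exc = in2 - in1                           # inferred excitation

def line_phasor(x):
    """Complex amplitude of x at f_line (single-bin Fourier estimate)."""
    return 2 * np.mean(x * np.exp(-2j * np.pi * f_line * t))

E, P1, P2 = line_phasor(exc), line_phasor(in1), line_phasor(in2)
G_est = (np.conj(E) * P1) / (np.conj(E) * P2)   # csd ratio at the line
print(20 * np.log10(abs(G_est)), np.angle(G_est))
```

For broadband excitations the same ratio would be formed bin by bin, for example with scipy.signal.csd.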
  16314   Fri Sep 3 02:03:15 2021 TegaSummaryComputersStrip down large error files

Also deleted the ~50GB error files from ldas to prevent rsync from copying them to nodus again. With the new update to GWsumm, there are new error messages that initially didn't seem to affect the summary pages functionality, but in the extreme case can populate the error files with repeated warnings of the form "Loading: FrSerData", "Loading: FrSerData::n4294967295", "Loading: FrSummary", "Loading: FrSerDataLoading: FrSerData" and many more combinations until we get file sizes of the order of ~50GB. So I have updated the checkstatus script to parse the error files and strip out the majority of these error messages. Work is ongoing to get them all.

In light of these large files, I decided to look in the summary pages folder to see if there are other large files that we need to keep track of, and it turns out there is indeed a collection of files in the archive folder that bloats the summary pages on ldas to ~1TB. Luckily these are not synced to nodus, so no problem here. However, since the beginning of the year, the archive folders that hold data used for each day's computation have not been cleared. We have a script for doing this, but it has not been run for a while now and it only deletes archive files for a specific month, which is hardcoded to two months before the date the script is run. I have modified the code to allow archive deletion for a range of months so we can clear data from Jan to July.
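
For illustration, stripping the repeated "Loading: Fr..." warnings can be done line by line, so a ~50GB file never has to fit in memory. This is a minimal sketch and not the actual checkstatus code; the regular expression only covers the message forms quoted above, and the function names are hypothetical.

```python
import re

# Matches lines made up only of one or more "Loading: Fr<Name>[::<suffix>]" chunks.
NOISE = re.compile(r"^(?:Loading: Fr\w+(?:::\w+)?)+\s*$")

def keep_lines(lines):
    """Yield only the lines that are not pure 'Loading: Fr...' warnings."""
    for line in lines:
        if not NOISE.match(line):
            yield line

def strip_file(src_path, dst_path):
    """Stream src_path to dst_path, dropping the noise lines."""
    with open(src_path, "r", errors="replace") as src, open(dst_path, "w") as dst:
        dst.writelines(keep_lines(src))

sample = [
    "Loading: FrSerData\n",
    "Loading: FrSerData::n4294967295\n",
    "Loading: FrSummary\n",
    "Loading: FrSerDataLoading: FrSerData\n",
    "Traceback (most recent call last):\n",
]
print(list(keep_lines(sample)))
```

Writing to a separate destination file and renaming afterwards avoids truncating the original if the job is interrupted.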

Quote:

[tega, paco]

We found the files that took excess space in the chiara filesystem (see Attachment 1). They were error files from the summary pages, ~50 GB or so in size, located under /home/cds/caltech/users/public_html/detcharsummary/logs/. We manually removed them and then copied the rest of the summary page contents into the main file system drive (this is to preserve the information backup before it gets deleted by the cron job at the end of today) and checked carefully to identify why these files were so large in the first place.

We then copied the /detcharsummary directory from /media/40mBackup into /home/cds to match the two disks.

 

  16313   Thu Sep 2 21:49:03 2021 PacoSummaryComputerschiara down, vac interlock tripped

[tega, paco]

We found the files that took excess space in the chiara filesystem (see Attachment 1). They were error files from the summary pages, ~50 GB or so in size, located under /home/cds/caltech/users/public_html/detcharsummary/logs/. We manually removed them and then copied the rest of the summary page contents into the main file system drive (this is to preserve the information backup before it gets deleted by the cron job at the end of today) and checked carefully to identify why these files were so large in the first place.

We then copied the /detcharsummary directory from /media/40mBackup into /home/cds to match the two disks.

  16312   Thu Sep 2 21:21:14 2021 KojiSummaryComputersVacuum recovery 2

Attachment 1:
We are pumping the main volume with TP2. Once P1a reached the pressure ~2.2mtorr, we could open the PSL shutter. The TP2 voltage went up once but came down to ~20V. It's close to nominal now.
We wondered if we should use TP3 or not. I checked the vacuum pressure trends and found that the annulus pressures were going up. So we decided to open the annulus valves.

Attachment 2:
The current vacuum status is as shown in the MEDM screenshot.

There is no trend data of the valve status (sad)

  16311   Thu Sep 2 20:47:19 2021 KojiUpdateCDSChiara DHCP restarted

[Paco, Tega, Koji]

Once chiara's DHCP was back, things got much more straightforward.
c1iscex and c1iscey were rebooted and the IOPs were launched without any hesitation.

Paco ran rebootC1LSC.sh and, for the first time this year, the processes launched without any issue.

  16310   Thu Sep 2 20:44:18 2021 KojiUpdateCDSChiara DHCP restarted

We had an issue with the RT machines rebooting. Once we hooked up a display to c1iscex, it turned out that it was not being given an IP address at boot.

I went to chiara and confirmed that the DHCP service was not running

~>sudo service isc-dhcp-server status
[sudo] password for controls:
isc-dhcp-server stop/waiting

So the DHCP service was manually restarted

~>sudo service isc-dhcp-server start
isc-dhcp-server start/running, process 24502
~>sudo service isc-dhcp-server status
isc-dhcp-server start/running, process 24502

 

 

  16309   Thu Sep 2 19:47:38 2021 KojiUpdateCDSThis week's FB1 GPS Timing Issue Solved

After the reboot daqd_dc was not working, but manual starting of open-mx / mx services solved the issue.

sudo systemctl start open-mx.service
sudo systemctl start mx.service
sudo systemctl start daqd_*

 

  16308   Thu Sep 2 19:28:02 2021 KojiUpdate This week's FB1 GPS Timing Issue Solved

After the disk system trouble, we could not get the RTS running in its nominal state. As part of the troubleshooting, FB1 was rebooted, but we then found that the GPS time was a year off from the current time.

controls@fb1:/diskless/root/etc 0$ cat /proc/gps 
1283046156.91
controls@fb1:/diskless/root/etc 0$ date
Thu Sep  2 18:43:02 PDT 2021
controls@fb1:/diskless/root/etc 0$ timedatectl 
      Local time: Thu 2021-09-02 18:43:08 PDT
  Universal time: Fri 2021-09-03 01:43:08 UTC
        RTC time: Fri 2021-09-03 01:43:08
       Time zone: America/Los_Angeles (PDT, -0700)
     NTP enabled: no
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2021-03-14 01:59:59 PST
                  Sun 2021-03-14 03:00:00 PDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2021-11-07 01:59:59 PDT
                  Sun 2021-11-07 01:00:00 PST


Paco went through the process described in Jamie's elog [40m ELOG 16299] (except for the installation part) and it actually made the GPS time even stranger.

controls@fb1:~ 0$ cat /proc/gps
967861610.89

I decided to remove the gpstime module and then load it again. This brought the GPS time back to normal.

controls@fb1:~ 0$ sudo modprobe -r gpstime
controls@fb1:~ 0$ cat /proc/gps
cat: /proc/gps: No such file or directory
controls@fb1:~ 1$ sudo modprobe gpstime
controls@fb1:~ 0$ cat /proc/gps
1314671254.11
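
To judge whether a /proc/gps value is sane, it helps to convert it to UTC. Here is a minimal sketch, assuming the fixed GPS-UTC offset of 18 leap seconds that has applied since 2017-01-01; it will be off by a few seconds for earlier dates. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_OFFSET_S = 18   # leap seconds; valid for dates after 2017-01-01

def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a UTC datetime (fixed leap-second offset)."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_UTC_OFFSET_S)

good = gps_to_utc(1314671254)   # after reloading the gpstime module
bad = gps_to_utc(1283046156)    # first reading after the fb1 reboot
print(good)
print(bad)
print(good - bad)
```

The first value lands on the evening of Sep 2, 2021 PDT (early Sep 3 UTC), while the second is 366 days earlier, matching the "a year off" symptom.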

 

  16307   Thu Sep 2 17:53:15 2021 PacoSummaryComputerschiara down, vac interlock tripped

[paco, koji, tega, ian]

This morning the name server / network file system running on chiara failed. This caused the donatella/pianosa/rossa shell prompts to hang forever. It also made sitemap crash, and even dropping into a bash shell and just listing files from some directory in the file system froze the computer. Remote ssh sessions on nodus also had the same symptoms.

A little after 1 pm, we started debugging this issue with help from Koji. He suggested we hook a monitor, keyboard, and mouse onto chiara as it should still work locally even if something with the NFS (network file system) failed. We did this and then we tried for a while to unmount the /dev/sdc1/ from /home/cds/ (main file system) and mount /dev/sdb1/ from /media/40mBackup (backup copy) such that they swap places. We had no trouble unmounting the backup drive, but only succeeded in unmounting the main drive with the "lazy" unmount, or running "umount -l". Running "df" we could see that the disk space was 100% used, with only ~ 1 GB of free space which may have been the cause for the issue. After swapping these disks by editing the /etc/fstab file to implement the aforementioned swapping, we rebooted chiara and we recovered the shell prompts in all workstations, sitemap, etc... due to the backup drive mounting. We then started investigating what caused the main drive to fill up that quickly, and noted that weirdly now the capacity was at 85% or about 500GB less than before (after reboot and remount), so some large file was probably dumped into chiara that froze the NFS causing the issue.
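
A small check like the one below, run periodically, could warn before the disk fills up and point at the largest files. It is a sketch only, using the Python standard library; the function names and the 90% limit are hypothetical, and no such script exists on chiara as far as this entry states.

```python
import heapq
import os
import shutil

def disk_nearly_full(path, limit=0.9):
    """True if the file system holding `path` is at or above `limit` usage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total >= limit

def largest_files(root, n=10):
    """Return the n largest files under root as (size_bytes, path), biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return heapq.nlargest(n, sizes)

# Example use on chiara (paths taken from this entry):
#   if disk_nearly_full("/home/cds"):
#       for size, path in largest_files("/home/cds", n=5):
#           print(size, path)
```

Note that walking a multi-terabyte NFS export is slow, so the file listing is best run only when the usage check trips.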

At this point we tried opening the PSL shutter to recover the IMC. The shutter would not open and we suspected the vacuum interlock was still tripped... and indeed there was an uncleared error in the VAC screen. So with Koji's guidance we walked to the c1vac near the HV station and did the following at ~ 5:13 PM -->

  1. Open V4; apart from a brief pressure spike in PTP2, everything looked ok so we proceeded to
  2. Open V1; P2 spiked briefly and then started to drop. Then, Koji suggested that we could
  3. Close V4; but we saw P2 increasing by a factor of ~10 in a few seconds, so we
  4. Reopened V4;

We made sure that P1a (main vacuum pressure) was dropping and before continuing we decided to look back to see what the nominal vacuum state was that we should try to restore.

We are currently searching the two systems for differences to see if we can narrow down the culprit of the failure.

 

  16306   Wed Sep 1 21:55:14 2021 KojiSummaryGeneralTowards the end upgrade

- Sat amp mod and test: on going (Tega)
- Coil driver mod and test: on going (Tega)

- Acromag: almost ready (Yehonathan)

- IDC10-DB9 cable / D2100641 / IDC10F for ribbon in hand / Dsub9M ribbon brought from Downs / QTY 2 for two ends -> Made 2 (stored in the DSUB connector plastic box)
- IDC40-DB9 cable / D2100640 / IDC40F for ribbon in hand / DB9F solder brought from Downs  / QTY 4 for two ends -> Made 4 0.5m cables (stored in the DSUB connector plastic box)

- DB15-DB9 reducer cable / ETMX2+ETMY2+VERTEX16+NewSOS14 = 34 / to be ordered

- End DAC signal adapter with Dewhitening (with DIFF/SE converter) / to be designed & built
- End ADC adapter (with SE/DIFF converter) / to be designed & built


MISC Ordering

  • 3.5 x Sat Amp Adapter made (order more DSUB25 conns)
    • -> Gave 2 to Tega, 1.5 in the DSUB box
    • 5747842-4 A32100-ND -> 5747842-3 A32099-ND Qty 40
    • 5747846-3 A32125-ND -> 747846-3 A23311-ND Qty 40
  • Tega's sat amp components
    • 499Ω P499BCCT-ND 78 -> Backorder -> RG32P499BCT-ND Qty 100
    • 4.99KΩ TNPW12064K99BEEA 56 -> Qty 100
    • 75Ω YAG5096CT-ND 180 -> Qty 200
    • 1.82KΩ P18391CT-ND 103 -> Qty 120
    • 68 nF P10965-ND 209
  • Order more DB9s for Tega's sat amp adapter 4 units (look at the AA IO BOM) 
    • 4x 8x 5747840-4 DB9M PCB A32092-ND -> 6-747840-9 A123182-ND Qty 35
    • 4x 5x 5747844-4 A32117-ND -> Qty 25
    • 4x 5x DB9M ribbon MMR09K-ND -> 8209-8000 8209-8000-ND Qty 25
    • 4x 5x 5746861-4 DB9F ribbon 5746861-4-ND -> 400F0-09-1-00 LFR09H-ND Qty 35
  • Order 18bit DAC AI -> 16bit DAC AI components 4 units
    • 4x 4x 5747150-8 DSUB9F PCB A34072-ND -> D09S24A4PX00LF 609-6357-ND Qty 20
    • 4x 1x 787082-7 CONN D-TYPE RCPT 68POS R/A SLDR (SCSI Female) A3321-ND -> 5787082-7 A31814-ND Qty 5
    • 4x 1x 22-23-2021 Connector Header Through Hole 2 position 0.100" (2.54mm) WM4200-ND -> Qty 5

 

 

  16305   Wed Sep 1 14:16:21 2021 JordanUpdateVACEmpty N2 Tanks

The right N2 tank had a bad/loose valve and did not fully open. This morning the left tank was just about empty and the right tank showed 2000+ psi on the gauge. Once the changeover happened the copper line emptied but the valve to the N2 tank was not fully opened. I noticed the gauges were both reading zero at ~1pm just before the meeting. I swapped the left tank, but not in time. The vacuum interlocks tripped at 1:04 pm today when the N2 pressure to the vacuum valves fell below 65psi. After the meeting, Chub tightened the valve, fully opened it and refilled the lines. I will monitor the tank pressures today and make sure all is ok.

There used to be a mailer that was sent out when the sum pressure of the two tanks fell below 600 psi, telling you to swap tanks. Does this no longer exist?

  16304   Tue Aug 31 14:55:24 2021 ranaSummaryLSCXARM POX OLTF

this model doesn't seem to include the analog AA, analog AI, digital AA, digital AI, or data transfer delays in the system. I think if you include those you will get more accuracy at high frequencies. Probably Anchal has those included in his DARM loop model?

 

  16303   Mon Aug 30 17:49:43 2021 PacoSummaryLSCXARM POX OLTF

Used diaggui to get the OLTF in preparation for optimal system identification / calibration. The excitation was injected at the control point of the XARM loop, C1:LSC-XARM_EXC. Attachment 1 shows the TF (red scatter) taken from 35 Hz to 2.3 kHz with 201 points. The swept sine excitation had an envelope amplitude of 50 counts at 35 Hz, 0.2 counts at 100 Hz, and 0.2 at 200 Hz. The model for the OLTF, using all the digital control filters as well as a simple 1 degree of freedom plant (single pole at 0.99 Hz), is overlaid as a continuous purple line. Note the disagreement of the OLTF "model" at higher frequencies, which we may be able to improve upon using vector fitting.

Attachment 2 shows the coherence (part of this initial measurement was to identify an appropriately large frequency range where the coherence is good before we script it).

  16302   Thu Aug 26 10:30:14 2021 JamieConfigurationCDSfront end time synchronization fixed?

I've been looking at why the front end NTP time synchronization did not seem to be working.  I think it might not have been working because the NTP server the front ends were pointing to, fb1, was not actually responding to synchronization requests.

I cleaned up some things on fb1 and the front ends, which I think unstuck things.

On fb1:

  • stopped/disabled the default client (systemd-timesyncd), and properly installed the full NTP server (ntp)
  • the ntp server package for debian jessie is old-style sysVinit, not systemd.  In order to make it more integrated I copied the auto-generated service file to /etc/systemd/system/ntp.service, and added an "[Install]" section that specifies that it should be available during the default "multi-user.target".
  • "enabled" the new service to auto-start at boot ("sudo systemctl enable ntp.service") 
  • made sure ntp was configured to serve the front end network ('broadcast 192.168.123.255') and then restarted the server ("sudo systemctl restart ntp.service")

For the front ends:

  • on fb1 I chroot'd into the front-end diskless root (/diskless/root) and manually specified that systemd-timesyncd should start on boot by creating a symlink to the timesyncd service in the multi-user.target directory:
$ sudo chroot /diskless/root
$ cd /etc/systemd/system/multi-user.target.wants
$ ln -s /lib/systemd/system/systemd-timesyncd.service
  • on the front end itself (c1iscex as a test) I did a "systemctl daemon-reload" to force it to reload the systemd config, and then restarted the client ("systemctl restart systemd-timesyncd")
  • checked the NTP synchronization with timedatectl:
controls@c1iscex:~ 0$ timedatectl 
      Local time: Thu 2021-08-26 11:35:10 PDT
  Universal time: Thu 2021-08-26 18:35:10 UTC
        RTC time: Thu 2021-08-26 18:35:10
       Time zone: America/Los_Angeles (PDT, -0700)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2021-03-14 01:59:59 PST
                  Sun 2021-03-14 03:00:00 PDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2021-11-07 01:59:59 PDT
                  Sun 2021-11-07 01:00:00 PST
controls@c1iscex:~ 0$ 

Note that it is now reporting "NTP enabled: yes" (the service is enabled to start at boot) and "NTP synchronized: yes" (synchronization is happening), neither of which it was reporting previously.  I also note that the systemd-timesyncd client service is now loaded and enabled, is no longer reporting that it is in an "Idle" state and is in fact reporting that it synchronized to the proper server, and it is logging updates:

controls@c1iscex:~ 0$ sudo systemctl status systemd-timesyncd
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Thu 2021-08-26 10:20:11 PDT; 1h 22min ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 2918 (systemd-timesyn)
   Status: "Using Time Server 192.168.113.201:123 (ntpserver)."
   CGroup: /system.slice/systemd-timesyncd.service
           └─2918 /lib/systemd/systemd-timesyncd

Aug 26 10:20:11 c1iscex systemd[1]: Started Network Time Synchronization.
Aug 26 10:20:11 c1iscex systemd-timesyncd[2918]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 26 10:20:11 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+26ppm
Aug 26 10:21:15 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 128s/-0.000s/0.000s/0.000s/+25ppm
Aug 26 10:23:23 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 256s/+0.001s/0.000s/0.000s/+26ppm
Aug 26 10:27:40 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 512s/+0.003s/0.000s/0.001s/+29ppm
Aug 26 10:36:12 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 1024s/+0.008s/0.000s/0.003s/+33ppm
Aug 26 10:53:16 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 2048s/-0.026s/0.000s/0.010s/+27ppm
Aug 26 11:27:24 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 2048s/+0.009s/0.000s/0.011s/+29ppm
controls@c1iscex:~ 0$ 

So I think this means everything is working.
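
Since the same check has to be repeated on every front end, the two relevant timedatectl fields can be pulled out programmatically. Below is a minimal sketch that parses the output format shown above; newer systemd versions label these fields differently, so the field names are an assumption tied to this Debian jessie setup, and the function names are hypothetical.

```python
def parse_timedatectl(text):
    """Parse 'Key: value' lines of timedatectl output into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            status[key.strip()] = value.strip()
    return status

def ntp_ok(text):
    """True only if NTP is both enabled and synchronized."""
    status = parse_timedatectl(text)
    return (
        status.get("NTP enabled") == "yes"
        and status.get("NTP synchronized") == "yes"
    )

sample = """\
      Local time: Thu 2021-08-26 11:35:10 PDT
  Universal time: Thu 2021-08-26 18:35:10 UTC
     NTP enabled: yes
NTP synchronized: yes
"""
print(ntp_ok(sample))
```

The text could come from running timedatectl over ssh on each front end, for example with subprocess.run, and any host returning False would be flagged for a closer look.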

I then went ahead and reloaded and restarted the timesyncd services on the rest of the front ends.

We still need to confirm that everything comes up properly the next time we have an opportunity to reboot fb1 and the front ends (or the opportunity is forced upon us).

There was speculation that the NTP clients on the front ends (systemd-timesyncd) would not work on a read-only filesystem, but this doesn't seem to be true.  You can't trust everything you read on the internet.

  16300   Thu Aug 26 10:10:44 2021 PacoUpdateCDSFB is writing the frames with a year old date

[paco, ]

We went over to the X end to check what was going on with the TRX signal. We spotted that the ground terminal coming from the QPD was loosely touching the handle of one of the computers on the rack. When we detached it completely from the rack the noise was gone (attachment 1).

We taped this terminal so it doesn't touch anything accidentally. We don't know if this is the best solution, since it probably needs a stable voltage reference. At the Y end those ground terminals are connected to the same point on the rack. The other ground terminals at the X end are just cut.

We also took the PSD of these channels (attachment 2). The noise seems to be gone, but TRX is still a bit noisier than TRY. Maybe we should set up a proper ground for the X arm QPD?


We saw that the X end station ALS laser was off. We turned it on, along with the crystal oven, and reenabled the temperature controller. Green light immediately appeared. We are now working to restore the ALS lock. After running XARM ASS we were unable to lock the green laser, so we went to the XEND and moved the piezo X ALS alignment mirrors until we maximized the transmission in the right mode. We then locked the ALS beams on both arms successfully. It very well could be that the PZT offsets were reset by the power glitch. The XARM ALS still needs some tweaking; its level is ~25% of what it was before the power glitch.
