40m QIL Cryo_Lab CTN SUS_Lab CAML OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 325 of 349  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
IDup Date Author Type Category Subject
  16342   Fri Sep 17 20:22:55 2021 KojiUpdateSUSEQ M4.3 Long beach

EQ  M4.3 @longbeach
2021-09-18 02:58:34 (UTC) / 07:58:34 (PDT)
https://earthquake.usgs.gov/earthquakes/eventpage/ci39812319/executive

  • All SUS Watchdogs tripped, but the SUSs looked OK except for the stuck ITMX.
  • Damped the SUSs (except ITMX)
  • IMC automatically locked
  • Turned off the damping of ITMX and shook it only with the pitch bias -> Easily unstuck -> damping recovered -> realignment of the ITMX probably necessary.
  • Done.
  16343   Mon Sep 20 12:20:31 2021 PacoSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

[yehonathan, paco, anchal]

We attempted to find any symptoms for actuation problems in the PRMI configuration when actuated through BS and PRM.

Our logic was to check angular (PIT and YAW) actuation transfer function in the 30 to 200 Hz range by injecting appropriately (f^2) enveloped excitations in the SUS-ASC EXC points and reading back using the SUS_OL (oplev) channels.

From the controls, we first restored the PRMI Carrier to bring the PRM and BS to their nominal alignment, then disabled the LSC output (we don't need PRMI to be locked), and then turned off the damping from the oplev control loops to avoid supressing the excitations.

We used diaggui to measure the 4 transfer functions magnitudes PRM_PIT, PRM_YAW, BS_PIT, BS_YAW, as shown below in Attachments #1 through #4. We used the Oplev calibrations to plot the magnitude of the TFs in units of urad / counts, and verified the nominal 1/f^2 scaling for all of them. The coherence was made as close to 1 as possible by adjusting the amplitude to 1000 counts, and is also shown below. A dip at 120 Hz is probably due to line noise. We are also assuming that the oplev QPDs have a relatively flat response over the frequency range below.

  16344   Mon Sep 20 14:11:40 2021 KojiUpdateBHDEnd DAC Adapter Unit D2100647

I've uploaded the schematic and PCB PDF for End DAC Adapter Unit D2100647.

Please review the design.

  • CH1-8 SUS actuation channels.
    • 5CHs out of 8CHs are going to be used, but for future extensions, all the 8CHs are going to be filled.
    • It involves diff-SE conversion / dewhitening / SE-diff conversion. Does this make sense?
  • CH9-12 PZT actuation channels. It is designed to send out 4x SE channels for compatibility. The channels have the jumpers to convert it to pass through the diff signals.
  • CH13-16 are general purpose DIFF/SE channels. CH13 is going to be used for ALS Laser Slow control. The other 3CHs are spares.

The internal assembly drawing & BOM are still coming.

  16345   Mon Sep 20 14:22:00 2021 ranaSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

I suggest plotting all the traces in the plot so we can see their differences. Also remove the 1/f^2 slope so that we can see small differences. Since the optlev servos all have low pass filters around 15-20 Hz, its not necessary to turn off the optlev servos for this measurement.

I think that based on the coherence and the number of averages, you should also be able to use Bendat and Piersol so estimate the uncertainy as a function of frequency. And we want to see the comparison coil-by-coil, not in the DoF basis.

4 sweeps for BS and 4 sweeps for PRM.

  16346   Mon Sep 20 15:23:08 2021 YehonathanUpdateComputersWifi internet fixed

Over the weekend and today, the wifi was acting bad with frequent disconnections and no internet access. I tried to log into the web interface of the ASUS wifi but with no success.

I pushed the reset button for several seconds to restore factory settings. After that, I was able to log in. I did the automatic setup and defined the wifi passwords to be what they used to be.

Internet access was restored. I also unplugged and plugged back all the wifi extenders in the lab and moved the extender from the vertex inner wall to the outer wall of the lab close to the 1X3.

Now, there seems to be wifi reception both in X and Y arms (according to my android phone).

 

  16348   Mon Sep 20 15:42:44 2021 Ian MacMillanSummaryComputersQuantization Code Summary

This post serves as a summary and description of code to run to test the impacts of quantization noise on a state-space implementation of the suspension model.

Purpose: We want to use a state-space model in our suspension plant code. Before we can do this we want to test to see if the state-space model is prone to problems with quantization noise. We will compare two models one for a standard direct-ii filter and one with a state-space model and then compare the noise from both. 

Signal Generation:

First I built a basic signal generator that can produce a sine wave for a specified amount of time then can produce a zero signal for a specified amount of time. This will let the model ring up with the sine wave then decay away with the zero signal. This input signal is generated at a sample rate of 2^16 samples per second then stored in a numpy array. I later feed this back into both models and record their results.

State-space Model:

The code can be seen here

The state-space model takes in the list of excitation values and feeds them through a loop that calculates the next value in the output.

Given that the state-space model follows the form

  \dot{x}(t)=\textbf{A}x(t)+ \textbf{B}u(t)   and  y(t)=\textbf{C}x(t)+ \textbf{D}u(t) ,

the model has three parts the first equation, an integration, and the second equation.

  1. The first equation takes the input x and the excitation u and generates the x dot vector shown on the left-hand side of the first state-space equation.
  2. The second part must integrate x to obtain the x that is found in the next equation. This uses the velocity and acceleration to integrate to the next x that will be plugged into the second equation
  3. The second equation in the state space representation takes the x vector we just calculated and then multiplies it with the sensing matrix C. we don't have a D matrix so this gives us the next output in our system

This system is the coded form of the block diagram of the state space representation shown in attachment 1

Direct-II Model:

The direct form 2 filter works in a much simpler way. because it involves no integration and follows the block diagram shown in Attachment 2, we can use a single difference equation to find the next output. However, the only complication that comes into play is that we also have to keep track of the w(n) seen in the middle of the block diagram. We use these two equations to calculate the output value

y[n]=b_0 \omega [n]+b_1 \omega [n-1] +b_2 \omega [n-2],  where w[n] is  \omega[n]=x[n] - a_1 \omega [n-1] -a_2 \omega[n-2]

Bit length Control:

To control the bit length of each of the models I typecast all the inputs using np.float then the bit length that I want. This simulates the computer using only a specified bit length. I have to go through the code and force the code to use 128 bit by default. Currently, the default is 64 bit which so at the moment I am limited to 64 bit for highest bit length. I also need to go in and examine how numpy truncates floats to make sure it isn't doing anything unexpected.

Bode Plot: 

The bode plot at the bottom shows the transfer function for both the IIR model and the state-space model. I generated about 100 seconds of white noise then computed the transfer function as 

G(f) = \frac{P_{csd}(f)}{P_{psd}(f)}

which is the cross-spectral density divided by the power spectral density. We can see that they match pretty closely at 64 bits. The IIR direct II model seems to have more noise on the surface but we are going to examine that in the next elog

 

  16349   Mon Sep 20 20:43:38 2021 TegaUpdateElectronicsSat Amp modifications

Running update of Sat Amp modification work, which involves the following procedure (x8) per unit:

  1. Replace R20 & R24 with 4.99K ohms, R23 with 499 ohms, and remove C16.
  2. (Testing) Connect LEDDrive output to GND and check that
    • TP4 is ~ 5V
    •  TP5-8 ~ 0V. 
  3. Install 40m Satellite to Flange Adapter (D2100148-v1)

 

Unit Serial Number Issues Status
S1200740 NONE DONE
S1200742 NONE DONE
S1200743 NONE DONE
S1200744

TP4 @ LED1,2 on PCB S2100568 is 13V instead of 5V

TP4 @ LED4 on PCB S2100559 is 13V instead of 5V

DONE
S1200752 NONE DONE

 

 

 

  16350   Mon Sep 20 21:56:07 2021 KojiUpdateComputersWifi internet fixed

Ug, factory resets... Caltech IMSS announced that there was an intermittent network service due to maintenance between Sept 19 and 20. And there seemed some aftermath of it. Check out "Caltech IMSS"

 

  16351   Tue Sep 21 11:09:34 2021 AnchalSummaryCDSXARM YARM UGF Servo and Oscillators added

I've updated the c1LSC simulink model to add the so-called UGF servos in the XARM and YARM single arm loops as well. These were earlier present in DARM, CARM, MICH and PRCL loops only. The UGF servo themselves serves a larger purpose but we won't be using that. What we have access to now is to add an oscillator in the single arm and get realtime demodulated signal before and after the addition of the oscillator. This would allow us to get the open loop transfer function and its uncertaintiy at particular frequencies (set by the oscillator) and would allow us to create a noise budget on the calibration error of these transfer functions.

 

The new model has been committed locally in the 40m/RTCDSmodels git repo. I do not have rights to push to the remote in git.ligo. The model builds, installs and starts correctly.

  16352   Tue Sep 21 11:13:01 2021 PacoSummaryCalibrationXARM calibration noise

Here are some plots from analyzing the C1:LSC-XARM calibration. The experiment is done with the XARM (POX) locked, a single line is injected at C1:LSC-XARM_EXC at f0 with some amplitude determined empirically using diaggui and awggui tools. For the analysis detailed in this post, f0 = 19 Hz, amp = 1 count, and gain = 300 (anything larger in amplitude would break the lock, and anything lower in frequency would not show up because of loop supression). Clearly, from Attachment #3 below, the calibration line can be detected with SNR > 1.

We read the test point right after the excitation C1:LSC-XARM_IN2 which, in a simplified loop will carry the excitation suppressed by 1 - OLTF, the open loop transfer function. The line is on for 5 minutes, and then we read for another 5 minutes but with the excitation off to have a reference. Both the calibration and reference signal time series are shown in Attachment #1 (decimated by 8). The corresponding ASDs are shown in Attachment #2. Then, we demodulate at 19 Hz and a 30 Hz, 4th-order butterworth LPF, and get an I and Q timeseries (shown in Attachment #3). Even though they look similar, the Q is centered about 0.2 counts, while the I is centered about 0.0. From this time series, we can of course show the noise ASDs in Attachment #3.


The ASD uncertainty bands in the last plot are statistical estimates and depend on the number of segments used in estimating the PSD. A thing to note is that the noise features surrounding the signal ASD around f0 are translated into the ASD in the demodulated signals, but now around dc. I guess from Attachment #3 there is no difference in the noise spectra around the calibration line with and without the excitation. This is what I would have expected from a linear system. If there was a systematic contribution, I would expect it to show at very low frequencies.

  16353   Wed Sep 22 11:43:04 2021 ranaSummaryCalibrationXARM calibration noise

I would expect to see some lower frequency effects. i.e. we should look at the timeseries of the demod with the excitation on and off.

I would guess tat the exc on should show us the variations in the optical gain below 3 Hz, whereas the exc off would not show it.

Maybe you should do some low pass filtering on the time series you have to see the ~DC effects? Also, reconsider your AA filter design: how do you quantitatively choose the cutoff frequency and stopband depth?

  16354   Wed Sep 22 12:40:04 2021 AnchalSummaryCDSXARM YARM UGF Servo and Oscillators shifted to OAF

To reduce burden on c1lsc, I've shifted the added UGF block to to c1oaf model. c1lsc had to be modified to allow addition of an oscillator in the XARm and YARM control loops and take out test points before and after the addition to c1oaf through shared memory IPC to do realtime demodulation in c1oaf model.

The new models built and installed successfully and I've been able to recover both single arm locks after restarting the computers.

 

  16355   Wed Sep 22 14:22:35 2021 Ian MacMillanSummaryComputersQuantization Noise Calculation Summary

Now that we have a model of how the SS and IIR filters work we can get to the problem of how to measure the quantization noise in each of the systems. Den Martynov's thesis talks a little about this. from my understanding: He measured quantization noise by having two filters using two types of variables with different numbers of bits. He had one filter with many more bits than the second one. He fed the same input signal to both filters then recorded their outputs x_1 and x_2, where x_2 had the higher number of bits. He then took the difference x_1-x_2. Since the CDS system uses double format, he assumes that quantization noise scales with mantissa length. He can therefore extrapolate the quantization noise for any mantissa length.

Here is the Code that follows the following procedure (as of today at least)

This problem is a little harder than I had originally thought. I took Rana's advice and asked Aaron about how he had tackled a similar problem. We came up with a procedure explained below (though any mistakes are my own):

  1. Feed different white noise data into three of the same filter this should yield the following equation: \textbf{S}_i^2 =\textbf{S}_{ni}^2+\textbf{S}_x^2, where \textbf{S}_i^2 is the power spectrum of the output for the ith filter,  \textbf{S}_{ni}^2 is the noise filtered through an "ideal" filter with no quantization noise, and  \textbf{S}_x^2 is the power spectrum of the quantization noise. Since we are feeding random noise into the input the power of the quantization noise should be the same for all three of our runs.
  2. Next, we have our three outputs:  \textbf{S}_1^2,  \textbf{S}_2^2, and  \textbf{S}_3^2 that follow the equations: 

\textbf{S}_1^2 =\textbf{S}_{n1}^2+\textbf{S}_x^2

\textbf{S}_2^2 =\textbf{S}_{n2}^2+\textbf{S}_x^2

\textbf{S}_3^2 =\textbf{S}_{n3}^2+\textbf{S}_x^2

From these three equations, we calculate the three quantities: \textbf{S}_{12}^2\textbf{S}_{23}^2, and \textbf{S}_{13}^2 which are calculated by:

\textbf{S}_{12}^2 =\textbf{S}_{1}^2-\textbf{S}_2^2\approx \textbf{S}_{n1}^2 -\textbf{S}_{n2}^2

\textbf{S}_{23}^2 =\textbf{S}_{2}^2-\textbf{S}_3^2\approx \textbf{S}_{n2}^2 -\textbf{S}_{n3}^2

\textbf{S}_{13}^2 =\textbf{S}_{1}^2-\textbf{S}_3^2\approx \textbf{S}_{n1}^2 -\textbf{S}_{n3}^2

from these quantities, we can calculate three values: \bar{\textbf{S}}_{n1}^2\bar{\textbf{S}}_{n2}^2, and \bar{\textbf{S}}_{n3}^2 since these are just estimates we are using a bar on top. These are calculated using:

\bar{\textbf{S}}_{n1}^2\approx\frac{1}{2}(\textbf{S}_{12}^2+\textbf{S}_{13}^2+\textbf{S}_{23}^2)

\bar{\textbf{S}}_{n2}^2\approx\frac{1}{2}(-\textbf{S}_{12}^2+\textbf{S}_{13}^2+\textbf{S}_{23}^2)

\bar{\textbf{S}}_{n3}^2\approx\frac{1}{2}(\textbf{S}_{12}^2+\textbf{S}_{13}^2-\textbf{S}_{23}^2)

using these estimates we can then estimate  \textbf{S}_{x}^2  using the formula:

\textbf{S}_{x}^2 \approx \textbf{S}_{1}^2 - \bar{\textbf{S}}_{n1}^2 \approx \textbf{S}_{2}^2 - \bar{\textbf{S}}_{2}^2 \approx \textbf{S}_{3}^2 - \bar{\textbf{S}}_{n3}^2

we can average the three estimates for  \textbf{S}_{x}^2  to come up with one estimate.

This procedure should be able to give us a good estimate of the quantization noise. However, in the graph shown in the attachments below show that the noise follows the transfer function of the model to begin with. I would not expect this to be true so I believe that there is an error in the above procedure or in my code that I am working on finding. I may have to rework this three-corner hat approach. I may have a mistake in my code that I will have to go through.

I would expect the quantization noise to be flatter and not follow the shape of the transfer function of the model. Instead, we have what looks like just the result of random noise being filtered through the model. 

Next steps:

The first real step is being able to quantify the quantization noise but after I fix the issues in my code I will be able to start liking at optimal model design for both the state-space model and the direct form II model. I have been looking through the book "Quantization noise" by Bernard Widrow and Istvan Kollar which offers some good insights on how to minimize quantization noise. 

  16356   Wed Sep 22 17:22:59 2021 TegaUpdateElectronicsSat Amp modifications

[Koji, Tega]

 

Decided to do a quick check of the remaining Sat Amp units before component replacement to identify any unit with defective LED circuits. Managed to examine 5 out of 10 units, so still have 5 units remaining. Also installed the photodiode bias voltage jumper (JP1) on all the units processed so far.

Unit Serial Number Issues Debugging Status
S1200738

TP4 @ LED3 on chan 1-4 PCB was ~0.7 V instead of 5V

Koji checked the solder connections of the various components, then swapped out the IC OPAMP. Removed DB9 connections to the front panel to get access to the bottom of the board. Upon close inspection, it looked like an issue of a short connection between the Emitter & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Emitter & Base legs of the Q1 transistor legs.

DONE
S1200748 TP4 @ LED2 on chan 1-4 PCB was ~0.7 V instead of 5V

This issue was caused by a short connection between the Emitter & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Emitter & Base legs of the Q1 transistor legs.

DONE
S1200749 NONE N/A DONE
S1200750 NONE N/A DONE
S1200751 NONE N/A DONE

 

Defective unit with updated resistors and capacitors in the previous elog

Unit Serial Number Issues Debugging Status
S1200744

TP4 @ LED1,2 on PCB S2100568 is 13V instead of 5V

TP4 @ LED4 on PCB S2100559 is 13V instead of 5V

This issue was caused by a short between the Collector & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Collector & Base legs of the Q1 transistor legs

 

Complications - During the process of flipping the board to get access to the bottom of the board, a connector holding the two middle black wires, on P1, came loose. I resecured the wires to the connector and checked all TP4s on the board afterwards to make sure things are as expected.

DONE

 

 

 

Quote:

Running update of Sat Amp modification work, which involves the following procedure (x8) per unit:

  1. Replace R20 & R24 with 4.99K ohms, R23 with 499 ohms, and remove C16.
  2. (Testing) Connect LEDDrive output to GND and check that
    • TP4 is ~ 5V
    •  TP5-8 ~ 0V. 
  3. Install 40m Satellite to Flange Adapter (D2100148-v1)

 

Unit Serial Number Issues Status
S1200740 NONE DONE
S1200742 NONE DONE
S1200743 NONE DONE
S1200744

TP4 @ LED1,2 on PCB S2100568 is 13V instead of 5V

TP4 @ LED4 on PCB S2100559 is 13V instead of 5V

DONE
S1200752 NONE DONE

 

 

 

 

  16357   Thu Sep 23 14:17:44 2021 TegaUpdateElectronicsSat Amp modifications debugging update

Debugging complete.

All units now have the correct TP4 voltage reading needed to drive a nominal current of 35 mA through to OSEM LED. The next step is to go ahead and replace the components and test afterward that everything is OK.

 

Unit Serial Number Issues Debugging Status
S1200736 TP4 @ LED4 on chan 1-4 PCB reads 13V instead of 5V

This issue was caused by a short between the Collector & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Collector & Base legs of the Q1 transistor legs

DONE
S1200737 NONE N/A DONE
S1200739 NONE N/A DONE
S1200746 TP4 @ LED3 on chan 5-8 PCB reads 0.765 V instead of 5V

This issue was caused by a short between the Emitter & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Emitter & Base legs of the Q1 transistor legs

 

Complications - I was extra careful this time because of the problem of loose cable from the last flip-over of the right PCB containing chan 5-8. Anyways, after I was done I noticed one of the pink wires (it carries the +14V to the left PCB) had come off on P1. At least this time I could also see that the corresponding front panel green LED turn off as a result. So I resecured the wire to the connector (using solder as my last attempt yesterday to reattach the via crimping didn't work after a long time trying. I hope this is not a problem.) and checked the front panel LED turns on when the unit is powered before closing the unit. These connectors are quite flimsy.

DONE
S1200747 TP4 @ LED2 on chan 1-4 PCB reads 13V instead of 5V

This issue was caused by a short between the Collector & Base legs of the Q1 transistor.

Solution - Remove the short connection between the Collector & Base legs of the Q1 transistor legs

DONE

 

 

 

  16358   Thu Sep 23 15:29:11 2021 PacoSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

[Anchal, Paco]

We had a second go at this with an increased number of averages (from 10 to 100) and higher excitation amplitudes (from 1000 to 10000). We did this to try to reduce the relative uncertainty a-la-Bendat-and-Pearsol

\delta G / G = \frac{1}{\gamma \sqrt{n_{\rm avg}}}

where \gamma, n_{\rm avg} are the coherence and number of averages respectively. Before, this estimate had given us a ~30% relative uncertainty and now it has been improved to ~ 10%. The re-measured TFs are in Attachment #1. We did 4 sweeps for each optic (BS, PRM) and removed the 1/f^2 slope for clarity. We note a factor of ~ 4 difference in the magnitude of the coil to angle TFs from BS to PRM (the actuation strength in BS is smaller).


For future reference:

With complex G, we get complex error in G using the formula above. To get uncertainity in magnitude and phase from real-imaginary uncertainties, we do following (assuming the noise in real and imaginary parts of the measured transfer function are incoherent with each other):
G = \alpha + i\beta

\delta G = \delta\alpha + i\delta \beta

\delta |G| = \frac{1}{|G|}\sqrt{\alpha^2 \delta\alpha^2 + \beta^2 \delta \beta^2}

\delta(\angle G) = \frac{1}{|G|^2}\sqrt{\alpha^2 \delta\alpha^2 + \beta^2 \delta\beta^2} = \frac{\delta |G|}{|G|}

  16359   Thu Sep 23 18:18:07 2021 YehonathanUpdateBHDSOS assembly

I have noticed that the dumbells coming back from C&B had glue residues on them. An example is shown in attachment 1: it can be seen that half of the dumbell's surface is covered with glue.

Jordan gave me a P800 sandpaper to remove the glue. I picked the dumbells with the dirty face down and slid them over the sandpaper in 8 figures several times to try and keep the surface untilted. Attachment 2 shows the surface from attachment 1 after this process.

Next, the dumbells will be sent to another C&B.

  16360   Mon Sep 27 12:12:15 2021 Ian MacMillanSummaryComputersQuantization Noise Calculation Summary

I have not been able to figure out a way to make the system that Aaron and I talked about. I'm not even sure it is possible to pull the information out of the information I have in this way. Even the book uses a comparison to a high precision filter as a way to calculate the quantization noise:

"Quantization noise in digital filters can be studied in simulation by comparing the behavior of the actual quantized digital filter with that of a refrence digital filter having the same structure but whose numerical calculations are done extremely accurately."
-Quantization Noise by Bernard Widrow and Istvan Kollar (pg. 416)

Thus I will use a technique closer to that used in Den Martynov's thesis (see appendix B starting on page 171). A summary of my understanding of his method is given here:

A filter is given raw unfiltered gaussian data f(t) then it is filtered and the result is the filtered data x(t) thus we get the result: f(t)\rightarrow x(t)=x_N(t)+x_q(t)  where x_N(t) is the raw noise filtered through an ideal filter and x_q(t) is the difference which in this case is the quantization noise. Thus I will input about 100-1000 seconds of the same white noise into a 32-bit and a 64-bit filter. (hopefully, I can increase the more precise one to 128 bit in the future) then I record their outputs and subtract the from each other. this should give us the Quantization error e(t):
e(t)=x_{32}(t)-x_{64}(t)=x_{N_{32}}(t)+x_{q_{32}}(t) - x_{N_{64}}(t)-x_{q_{64}}(t)
and since x_{N_{32}}(t)=x_{N_{64}}(t) because they are both running through ideal filters:
e(t)=x_{N}(t)+x_{q_{32}}(t) - x_{N}(t)-x_{q_{64}}(t)
e(t)=x_{q_{32}}(t) -x_{q_{64}}(t)
and since in this case, we are assuming that the higher bit-rate process is essentially noiseless we get the Quantization noise x_{q_{32}}(t).

If we make some assumptions, then we can actually calculate a more precise version of the quantization noise:

"Since aLIGO CDS system uses double precision format, quantization noise is extrapolated assuming that it scales with mantissa length"
-Denis Martynov's Thesis (pg. 173)

From this assumption, we can say that the noise difference between the 32-bit and 64-bit filter outputs:  x_{q_{32}}(t)-x_{q_{64}}(t)  is proportional to the difference between their mantissa length. by averaging over many different bit lengths, we can estimate a better quantization noise number.

I am building the code to do this in this file

  16361   Mon Sep 27 16:03:15 2021 Ian MacMillanSummaryComputersQuantization Noise Calculation Summary

I have coded up the procedure in the previous post: The result does not look like what I would expect. 

As shown in Attachment1 I have the power spectrum of the 32-bit output and the 64-bit output as well as the power spectrum of the two subtracted time series as well as the subtracted power spectra of both. unfortunately, all of them follow the same general shape of the raw output of the filter. 

I would not expect quantization noise to follow the shape of the filter. I would instead expect it to be more uniform. If anything I would expect the quantization noise to increase with frequency. If a high-frequency signal is run through a filter that has high quantization noise then it will severely degrade: i.e. higher quantization noise. 

This is one reason why I am confused by what I am seeing here. In all cases including feeding the same and different white noise into both filters, I have found that the calculated quantization noise is proportional to the response of the filter. this seems wrong to me so I will continue to play around with it to see if I can gain any intuition about what might be happening.

  16362   Mon Sep 27 17:04:43 2021 ranaSummaryComputersQuantization Noise Calculation Summary

I suggest that you

  1. change the corner frequency to 10 Hz as I suggested last week. This filter, as it is, is going to give you trouble.
  2. Put in a sine wave at 3.4283 Hz with an amplitude of 1, rather than white noise. In this way, its not necessary to do any subtraction. Just make PSD of the output of each filter.
  3. Be careful about window length and window function. If you don't do this carefully, your PSD will be polluted by window bleeding.

 

  16363   Tue Sep 28 16:31:52 2021 PacoSummaryCalibrationXARM OLTF (calibration) at 55.511 Hz

[anchal, paco]

Here is a demonstration of the methods leading to the single (X)arm calibration with its budget uncertainty. The steps towards this measurement are the following:

  1. We put a single line excitation through the C1:SUS-ETMX_LSC_EXC at 55.511 Hz, amp = 1 counts, gain = 300 (ramptime=10 s).
  2. With the arm locked, we grab a long timeseries of the C1:LSC-XARM_IN1_DQ (error point) and C1:SUS-ETMX_LSC_OUT_DQ (control point) channels.
  3. We assume the single arm loop to have the four blocks shown in Attachment #1, A (actuator + sus), plant (mainly the cavity pole), D (detection + electronics), and K (digital control).
    1. At this point, Anchal made a model of the single arm loop including the appropriate filter coefficients and other parameters. See Attachments #2-3 for the split and total model TFs.
    2. Our line would actually probe a TF from point b (error point) to point d (control point). We multiplied our measurement with open loop TF from b to d from model to get complete OLTF.
    3. Our initial estimate from documents and elog made overall loop shape correct but it was off by an overall gain factor. This could be due to wrong assumption on RFPD transimpedance or analog gains of AA or whitening filters. We have corrected for this factor in the RFPD transimpedance, but this needs to be checked (if we really care).
  4. We demodulate decimated timeseries (final sampling rate ~ 2.048 kHz) and I & Q for both the b and d signals. From this and our model for K, we estimate the OLTF. Attachment #4 shows timeseries for magnitude and phase.
  5. Finally, we compute the ASD for the OLTF magnitude. We plot it in Attachment #5 together with the ASD of the XARM transmission (C1:LSC-TRX_OUT_DQ) times the OLTF to estimate the optical gain noise ASD (this last step was a quick attempt at budgeting the calibration noise).
    1. For each ASD we used N = 24 averages, from which we estimate rms (statistical) uncertainties which are depicted by error bands (\pm \sigma) around the lines.

** Note: We ran the same procedure using dtt (diaggui) to validate our estimates at every point, as well as check our SNR in b and d before taking the ~3.5 hours of data.

  16364   Wed Sep 29 09:36:26 2021 JordanUpdateSUS2" Adapter Ring Parts for SOS Arrived 9/28/21

The remaining machined parts for the SOS adapter ring have arrived. I will inspect these today and get them ready for C&B.

  16365   Wed Sep 29 17:10:09 2021 AnchalSummaryCDSc1teststand problems summary

[anchal, ian]

We went and collected some information for the overlords to fix the c1teststand DAQ network issue.


  • from c1teststand, c1bhd and c1sus2 computers were not accessible through ssh. (No route to host). So we restarted both the computers (the I/O chassis were ON).
  • After the computers restarted, we were able to ssh into c1bhd and c1sus, ad we ran rtcds start c1x06 and rtcds start c1x07.
  • The first page in attachment shows the screenshot of GDS_TP screens of the IOP models after this step.
  • Then we started teh user models by running rtcds start c1bhd and rtcds start c1su2.
  • The second page shows the screenshot of GDS_TP screens. You can notice that DAQ status is red in all the screens and the DC statuses are blank.
  • So we checked if daqd_ services are running in the fb computer. They were not. So we started them all by sudo systemctl start daqd_*.
  • Third page shows the status of all services after this step. the daqd_dc.service remained at failed state.
  • open-mx_stream.service was not even loaded in fb. We started it by running sudo systemctl start open-mx_stream.service.
  • The fourth page shows the status of this service. It started without any errors.
  • However, when we went to check the status of mx_stream.service in c1bhd and c1sus2, they were not loaded and we we tried to start them, they showed failed state and kept trying to start every 3 seconds without success. (See page 5 and 6).
  • Finally, we also took a screenshot of timedatectl command output on the three computers fb, c1bhd, and c1sus2 to show that their times were not synced at all.
  • The ntp service is running on fb but it probably does not have access to any of the servers it is following.
  • The timesyncd on c1bhd and c1sus2 (FE machines) is also running but showing status 'Idle' which suggested they are unable to find the ntp signal from fb.
  • I believe this issue is similar to what jamie ficed in the fb1 on martian network in 40m/16302. Since the fb on c1teststand network was cloned before this fix, it might have this dysfunctional ntp as well.

We would try to get internet access to c1teststand soon. Meanwhile, someone with more experience and knowledge should look into this situation and try to fix it. We need to test the c1teststand within few weeks now.

  16366   Thu Sep 30 11:46:33 2021 Ian MacMillanSummaryComputersQuantization Noise Calculation Summary

First and foremost I have the updated bode plot with the mode moved to 10 Hz. See Attachment 1. Note that the comparison measurement is a % difference whereas in the previous bode plot it was just the difference. I also wrapped the phase so that jumps from -180 to 180 are moved down. This eliminates massive jumps in the % difference. 

Next, I have two comparison plots: 32 bit and 64bit. As mentioned above I moved the mode to 10 Hz and just excited both systems at 3.4283Hz with an amplitude of 1. As we can see on the plot the two models are practically the same when using 64bits. With the 32bit system, we can see that the noise in the IIR filter is much greater than in the State-space model at frequencies greater than our mode.

Note about windowing and averaging: I used a Hanning window with averaging over 4 neighbor points. I came to this number after looking at the results with less averaging and more averaging. In the code, this can be seen as nperseg=num_samples/4 which is then fed into signal.welch

  16367   Thu Sep 30 14:09:37 2021 AnchalSummaryCDSNew way to ssh into c1teststand

Late elog, original time Wed Sep 29 14:09:59 2021

We opened a new port (22220) in the router to the martian subnetwork which is forwarded to port 22 on c1teststand (192.168.113.245) allowing direct ssh access to c1teststand computer from the outside world using:

                                                                       

                                                                                    
 

Checkout this wiki page for unredadcted info.

  16368   Thu Sep 30 14:13:18 2021 AnchalUpdateLSCHV supply to Xend Green laser injection mirrors M1 and M2 PZT restored

Late elog, original date Sep 15th

We found that the power switch of HV supply that powers the PZT drivers for M1 and M2 on Xend green laser injection alignment was tripped off. We could not find any log of someone doing it, it is a physical switch. Our only explanation is that this supply might have a solenoid mechansm to shut off during power glitches and it probably did so on Aug 23 (see 40m/16287). We were able to align the green laser using PZT again, however, the maximum power at green transmission from X arm cavity is now about half of what it used to be before the glitch. Maybe the seed laser on the X end died a little.

  16369   Thu Sep 30 18:04:31 2021 PacoSummaryCalibrationXARM OLTF (calibration) with three lines

[anchal, paco]

We repeated the same procedure as before, but with 3 different lines at 55.511, 154.11, and 1071.11 Hz. We overlay the OLTF magnitudes and phases with our latest model (which we have updated with Koji's help) and include the rms uncertainties as errorbars in Attachment #1.

We also plot the noise ASDs of calibrated OLTF magnitudes at the line frequencies in Attachment #2. These curves are created by calculating power spectral density of timeseries of OLTF values at the line frequencies generated by demodulated XARM_IN and ETMX_LSC_OUT signals. We have overlayed the TRX noise spectrum here as an attempt to see if we can budget the noise measured in values of G to the fluctuation in optical gain due to changing power in the arms. We multiplied the the transmission ASD with the value of OLTF at those frequencies as the transfger function from normalized optical gain to the total transfer function value.

It is weird that the fluctuations in transmission power at 1 mHz always crosses the total noise in the OLTF value in all calibration lines. This could be an artificat of our data analysis though.

Even if the contribution of the fluctuating power is correct, there is remaining excess noise in the OLTF to be budgeted.

  16370   Fri Oct 1 12:12:54 2021 StephenUpdateBHDITMY (3002) CAD layout pushed to Box

Koji requested current state of BHD 3D model. I pushed this to Box after adding the additional SOSs and creating an EASM representation (also posted, Attachment 1). I also post the PDF used to dimension this model (Attachment 2). This process raised some points that I'll jot down here:

1) Because the 40m CAD files are not 100% confirmed to be clean of any student license efforts, we cannot post these files to the PDM Vault or transmit them this way. When working on BHD layout efforts, these assemblies which integrate new design work therefore must be checked for most current revisions of vault-managed files - this Frankenstein approach is not ideal but can be managed for this effort. 

2) Because the current files reflect the 40m as built state (as far as I can tell), I shared the files in a zip directory without increasing the revisions. It is unclear whether revision control is adequate to separate [current 40m state as reflected in CAD] from [planned 40m state after BHD upgrade]. Typically a CAD user would trust that we could find the version N assembly referenced in the drawing from year Y, so we wouldn't hesitate to create future design work in a version N+1 assembly file pending a current drawing. However, this form of revision control is not implemented. Perhaps we want to use configurations to separate design states (in other words, create a parallel model of every changed component, without creating paralle files - these configurations can be selected internal to the assembly without a need to replace files)? Or more simply (and perhaps more tenuously), we could snapshot the Box revisions and create a DCC page which notes the point of departure for BHD efforts?

Anyway, the cold hard facts:

 - Box location: 40m/40m_cad_models/Solidworks_40m (LINK)

 - Filenames: 3002.zip and 3002 20211001 ITMY BHD for Koji presentation images.easm (healthy disregard for concerns about spaces in filenames)

  16371   Fri Oct 1 14:25:27 2021 yehonathanSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

{Paco, Yehonathan, Hang}

We measured the sensing PRMI sensing matrix. Attachment 1 shows the results, the magnitude of the response is not calibrated. The orthogonality between PRCL and MICH is still bad (see previous measurement for reference).

Hang suggested that since MICH actuation with BS and PRM is not trivial (0.5*BS - 0.34*PRM) and since PRCL is so sensitive to PRM movement there might be a leakage to PRCL when we are actuating on MICH. So there may be a room to tune the PRM coefficient in the MICH output matrix.

Attachment 2 shows the sensing matrix after we changed the MICH->PRM coefficient in the OSC output matrix to -0.1.

It seems like it made things a little bit better but not much and also there is a huge uncertainty in the MICH sensing.

  16372   Mon Oct 4 11:05:44 2021 AnchalSummaryCDSc1teststand problems summary

[Anchal, Paco]

We tried to fix the ntp synchronization in c1teststand today by repeating the steps listed in 40m/16302. Even though teh cloned fb1 now has the exact same package version, conf & service files, and status, the FE machines (c1bhd and c1sus2) fail to sync to the time. the timedatectl shows the same stauts 'Idle'. We also, dug bit deeper into the error messages of daq_dc on cloned fb1 and mx_stream on FE machines and have some error messages to report here.


Attempt on fixing the ntp

  • We copied the ntp package version 1:4.2.6 deb file from /var/cache/apt/archives/ntp_1%3a4.2.6.p5+dfsg-7+deb8u3_amd64.deb on the martian fb1 to the cloned fb1 and ran.
    controls@fb1:~ 0$ sudo dbpg -i ntp_1%3a4.2.6.p5+dfsg-7+deb8u3_amd64.deb
  • We got error messages about missing dependencies of libopts25 and libssl1.1. We downloaded oldoldstable jessie versions of these packages from here and here. We ensured that these versions are higher than the required versions for ntp. We installed them with:
    controls@fb1:~ 0$ sudo dbpg -i libopts25_5.18.12-3_amd64.deb 
    controls@fb1:~ 0$ sudo dbpg -i libssl1.1_1.1.0l-1~deb9u4_amd64.deb
  • Then we installed the ntp package as described above. It asked us if we want to keep the configuration file, we pressed Y.
  • However, we decided to make the configuration and service files exactly same as martian fb1 to make it same in cloned fb1. We copied /etc/ntp.conf and /etc/systemd/system/ntp.service files from martian fb1 to cloned fb1 in the same positions. Then we enabled ntp, reloaded the daemon, and restarted ntp service:
    controls@fb1:~ 0$ sudo systemctl enable ntp
    controls@fb1:~ 0$ sudo systemctl daemon-reload
    controls@fb1:~ 0$ sudo systemctl restart ntp
  • But ofcourse, since fb1 doesn't have internet access, we got some errors in status of the ntp.service:
    controls@fb1:~ 0$ sudo systemctl status ntp
    ● ntp.service - NTP daemon (custom service)
       Loaded: loaded (/etc/systemd/system/ntp.service; enabled)
       Active: active (running) since Mon 2021-10-04 17:12:58 UTC; 1h 15min ago
     Main PID: 26807 (code=exited, status=0/SUCCESS)
       CGroup: /system.slice/ntp.service
               ├─30408 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 105:107
               └─30525 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 105:107
    
    Oct 04 17:48:42 fb1 ntpd_intres[30525]: host name not found: 2.debian.pool.ntp.org
    Oct 04 17:48:52 fb1 ntpd_intres[30525]: host name not found: 3.debian.pool.ntp.org
    Oct 04 18:05:05 fb1 ntpd_intres[30525]: host name not found: 0.debian.pool.ntp.org
    Oct 04 18:05:15 fb1 ntpd_intres[30525]: host name not found: 1.debian.pool.ntp.org
    Oct 04 18:05:25 fb1 ntpd_intres[30525]: host name not found: 2.debian.pool.ntp.org
    Oct 04 18:05:35 fb1 ntpd_intres[30525]: host name not found: 3.debian.pool.ntp.org
    Oct 04 18:21:48 fb1 ntpd_intres[30525]: host name not found: 0.debian.pool.ntp.org
    Oct 04 18:21:58 fb1 ntpd_intres[30525]: host name not found: 1.debian.pool.ntp.org
    Oct 04 18:22:08 fb1 ntpd_intres[30525]: host name not found: 2.debian.pool.ntp.org
    Oct 04 18:22:18 fb1 ntpd_intres[30525]: host name not found: 3.debian.pool.ntp.org
  • But the ntpq command is giving the saem output as given by ntpq comman in martian fb1 (except for the source servers), that the broadcasting is happening in the same manner:
    controls@fb1:~ 0$ ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     192.168.123.255 .BCST.          16 u    -   64    0    0.000    0.000   0.000
    
  • On the FE machines side though, the systemd-timesyncd are still unable to read the time signal from fb1 and show the status as idle:
    controls@c1bhd:~ 3$ timedatectl
          Local time: Mon 2021-10-04 18:34:38 UTC
      Universal time: Mon 2021-10-04 18:34:38 UTC
            RTC time: Mon 2021-10-04 18:34:38
           Time zone: Etc/UTC (UTC, +0000)
         NTP enabled: yes
    NTP synchronized: no
     RTC in local TZ: no
          DST active: n/a
    controls@c1bhd:~ 0$ systemctl status systemd-timesyncd -l
    ● systemd-timesyncd.service - Network Time Synchronization
       Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
       Active: active (running) since Mon 2021-10-04 17:21:29 UTC; 1h 13min ago
         Docs: man:systemd-timesyncd.service(8)
     Main PID: 244 (systemd-timesyn)
       Status: "Idle."
       CGroup: /system.slice/systemd-timesyncd.service
               └─244 /lib/systemd/systemd-timesyncd
  • So the time synchronization is still not working. We expected the FE machined to just synchronize to fb1 even though it doesn't have any upstream ntp server to synchronize to. But that didn't happen.
  • I'm (Anchal) working on getting internet access to c1teststand computers.

Digging into mx_stream/daqd_dc errors:

  • We went and changed the Restart fileld in /etc/systemd/system/daqd_dc.service on cloned fb1 to 2. This allows the service to fail and stop restarting after two attempts. This allows us to see the real error message instead of the systemd error message that the service is restarting too often. We got following:
    controls@fb1:~ 3$ sudo systemctl status daqd_dc -l
    ● daqd_dc.service - Advanced LIGO RTS daqd data concentrator
       Loaded: loaded (/etc/systemd/system/daqd_dc.service; enabled)
       Active: failed (Result: exit-code) since Mon 2021-10-04 17:50:25 UTC; 22s ago
      Process: 715 ExecStart=/usr/bin/daqd_dc_mx -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.dc (code=exited, status=1/FAILURE)
     Main PID: 715 (code=exited, status=1/FAILURE)
    
    Oct 04 17:50:24 fb1 systemd[1]: Started Advanced LIGO RTS daqd data concentrator.
    Oct 04 17:50:25 fb1 daqd_dc_mx[715]: [Mon Oct  4 17:50:25 2021] Unable to set to nice = -20 -error Unknown error -1
    Oct 04 17:50:25 fb1 daqd_dc_mx[715]: Failed to do mx_get_info: MX not initialized.
    Oct 04 17:50:25 fb1 daqd_dc_mx[715]: 263596
    Oct 04 17:50:25 fb1 systemd[1]: daqd_dc.service: main process exited, code=exited, status=1/FAILURE
    Oct 04 17:50:25 fb1 systemd[1]: Unit daqd_dc.service entered failed state.
    
  • It seemed like the only thing daqd_dc process doesn't like is that mx_stream services are in failed state in teh FE computers. So we did the same process on FE machines to get the real error messages:
    controls@fb1:~ 0$ sudo chroot /diskless/root
    fb1:/ 0#
    fb1:/ 0# sudo nano /etc/systemd/system/mx_stream.service
    fb1:/ 0#
    fb1:/ 0# exit
  • Then I ssh'ed into c1bhd to see the error message on mx_stream service properly.
    controls@c1bhd:~ 0$ sudo systemctl daemon-reload
    controls@c1bhd:~ 0$ sudo systemctl restart mx_stream
    controls@c1bhd:~ 0$ sudo systemctl status mx_stream -l
    ● mx_stream.service - Advanced LIGO RTS front end mx stream
       Loaded: loaded (/etc/systemd/system/mx_stream.service; enabled)
       Active: failed (Result: exit-code) since Mon 2021-10-04 17:57:20 UTC; 24s ago
      Process: 11832 ExecStart=/etc/mx_stream_exec (code=exited, status=1/FAILURE)
     Main PID: 11832 (code=exited, status=1/FAILURE)
    
    Oct 04 17:57:20 c1bhd systemd[1]: Starting Advanced LIGO RTS front end mx stream...
    Oct 04 17:57:20 c1bhd systemd[1]: Started Advanced LIGO RTS front end mx stream.
    Oct 04 17:57:20 c1bhd mx_stream_exec[11832]: send len = 263596
    Oct 04 17:57:20 c1bhd mx_stream_exec[11832]: OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
    Oct 04 17:57:20 c1bhd mx_stream_exec[11832]: mx_connect failed Nic ID not Found in Peer Table
    Oct 04 17:57:20 c1bhd mx_stream_exec[11832]: c1x06_daq mmapped address is 0x7f516a97a000
    Oct 04 17:57:20 c1bhd mx_stream_exec[11832]: c1bhd_daq mmapped address is 0x7f516697a000
    Oct 04 17:57:20 c1bhd systemd[1]: mx_stream.service: main process exited, code=exited, status=1/FAILURE
    Oct 04 17:57:20 c1bhd systemd[1]: Unit mx_stream.service entered failed state.
    
  • c1sus2 shows the same error. I'm not sure I understand these errors at all. But they seem to have nothing to do with timing issuessurprise!

As usual, some help would be helpful

  16373   Mon Oct 4 15:50:31 2021 HangUpdateCalibrationFisher matrix estimation on XARM parameters

[Anchal, Hang]

What: Anchal and I measured the XARM OLTF last Thursday.

Goal: 1. measure the 2 zeros and 2 poles in the analog whitening filter, and potentially constrain the cavity pole and an overall gain. 

          2. Compare the parameter distribution obtained from measurements and that estimated analytically from the Fisher matrix calculation.

          3. Obtain the optimized excitation spectrum for future measurements.   

How: we inject at C1:SUS-ETMX_LSC_EXC so that each digital count should be directly proportional to the force applied to the suspension. We read out the signal at C1:SUS-ETMX_LSC_OUT_DQ. We use an approximately white excitation in the 50-300 Hz band, and intentionally choose the coherence to be only slightly above 0.9 so that we can get some statistical error to be compared with the Fisher matrix's prediction. For each measurement, we use a bandwidth of 0.25 Hz and 10 averages (no overlapping between adjacent segments). 

The 2 zeros and 2 poles in the analog whitening filter and an overall gain are treated as free parameters to be fitted, while the rest are taken from the model by Anchal and Paco (elog:16363). The optical response of the arm cavity seems missing in that model, and thus we additionally include a real pole (for the cavity pole) in the model we fit. Thus in total, our model has 6 free parameters, 2 zeros, 3 poles, and 1 overall gain. 

The analysis codes are pushed to the 40m/sysID repo. 

===========================================================

Results:

Fig. 1 shows one measurement. The gray trace is the data and the olive one is the maximum likelihood estimation. The uncertainty for each frequency bin is shown in the shaded region. Note that the SNR is related to the coherence as 

        SNR^2 = [coherence / (1-coherence)] * (# of average), 

and for a complex TF written as G = A * exp[1j*Phi], one can show the uncertainty is given by 

        \Delta A / A = 1/SNR,  \Delta \Phi = 1/SNR [rad]. 

Fig. 2. The gray contours show the 1- and 2-sigma levels of the model parameters using the Fisher matrix calculation. We repeated the measurement shown in Fig. 1 three times, and the best-fit parameters for each measurement are indicated in the red-crosses. Although we only did a small number of experiments, the amount of scattering is consistent with the Fisher matrix's prediction, giving us some confidence in our analytical calculation. 

One thing to note though is that in order to fit the measured data, we would need an additional pole at around 1,500 Hz. This seems a bit low for the cavity pole frequency. For aLIGO w/ 4km arms, the single-arm pole is about 40-50 Hz. The arm is 100 times shorter here and I would naively expect the cavity pole to be at 3k-4k Hz if the test masses are similar. 

Fig. 3. We then follow the algorithm outlined in Pintelon & Schoukens, sec. 5.4.2.2, to calculate how we should change the excitation spectrum. Note that here we are fixing the rms of the force applied to the suspension constant. 

Fig. 4 then shows how the expected error changes as we optimize the excitation. It seems in this case a white-ish excitation is already decent (as the TF itself is quite flat in the range of interest), and we only get some mild improvement as we iterate the excitation spectra (note we use the color gray, olive, and purple for the results after the 0th, 1st, and 2nd iteration; same color-coding as in Fig. 3).   

 

 

 

  16374   Mon Oct 4 16:00:57 2021 YehonathanSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

{Yehonathan, Anchel}

In an attempt to fix the actuation of the PRMI DOFs we set to modify the output matrix of the BS and PRM such that the response of the coils will be similar to each other as much as possible.

To do so, we used the responses at a single frequency from the previous measurement to infer the output matrix coefficients that will equilize the OpLev responses (arbitrarily making the LL coil as a reference). This corrected the imbalance in BS almost completely while it didn't really work for PRM (see attachment 1).

The new output matrices are shown in attachment 2-3.

  16375   Mon Oct 4 16:10:09 2021 ranaSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

not sure that this is necessary. If you look at teh previous entries Gautam made on this topic, it is clear that the BS/PRM PRMI matrix is snafu, whereas the ITM PRMI matrix is not.

Is it possible that the ~5% coil imbalance of the BS/PRM can explain the observed sensing matrix? If not, then there is no need to balance these coils.

  16376   Mon Oct 4 18:00:16 2021 KojiSummaryCDSc1teststand problems summary

I don't know anything about mx/open-mx, but you also need open-mx,don't you?


controls@c1ioo:~ 0$ systemctl status *mx*
● open-mx.service - LSB: starts Open-MX driver
   Loaded: loaded (/etc/init.d/open-mx)
   Active: active (running) since Wed 2021-09-22 11:54:39 PDT; 1 weeks 5 days ago
  Process: 470 ExecStart=/etc/init.d/open-mx start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/open-mx.service
           └─620 /opt/3.2.88-csp/open-mx-1.5.4/bin/fma -d

● mx_stream.service - Advanced LIGO RTS front end mx stream
   Loaded: loaded (/etc/systemd/system/mx_stream.service; enabled)
   Active: active (running) since Wed 2021-09-22 12:08:00 PDT; 1 weeks 5 days ago
 Main PID: 5785 (mx_stream)
   CGroup: /system.slice/mx_stream.service
           └─5785 /usr/bin/mx_stream -e 0 -r 0 -w 0 -W 0 -s c1x03 c1ioo c1als c1omc -d fb1:0

 

  16377   Mon Oct 4 18:35:12 2021 PacoUpdateElectronicsSatellite amp box adapters

[Paco]

I have finished assembling the 1U adapters from 8 to 5 DB9 conn. for the satellite amp boxes. One thing I had to "hack" was the corners of the front panel end of the PCB. Because the PCB was a bit too wide, it wasn't really flush against the front panel (see Attachment #1), so I just filed the corners by ~ 3 mm and covered with kapton tape to prevent contact between ground planes and the chassis. After this, I made DB9 cables, connected everything in place and attached to the rear panel (Attachment #2). Four units are resting near the CAD machine (next to the bench area), see Attachment #3.

  16378   Mon Oct 4 20:46:08 2021 KojiUpdateElectronicsSatellite amp box adapters

Thanks. You should be able to find the chassis-related hardware on the left side of the benchtop drawers at the middle workbench.

Hardware: The special low profile 4-40 standoff screw / 1U handles / screws and washers for the chassis / flat-top screws for chassis panels and lids

  16379   Mon Oct 4 21:58:17 2021 TegaUpdateElectronicsSat Amp modifications

Trying to finish 2 more Sat Amp units so that we have the 7 units needed for the X-arm install. 

S2100736 - All good

S2100737 - This unit presented with an issue on the PD1 circuit of channel 1-4 PCB where the voltage reading on TP6, TP7 and TP8 are -15.1V,  -14.2V, and +14.7V respectively, instead of ~0V.  The unit also has an issue on the PD2 circuit of channel 1-4 PCB because the voltage reading on TP7 and TP8 are  -14.2V, and +14.25V respectively, instead of ~0V.

 

  16380   Tue Oct 5 17:01:20 2021 KojiUpdateElectronicsSat Amp modifications

Make sure the inputs for the PD amps are open. This is the current amplifier and we want to leave the input pins open for the test of this circuit.

TP6 is the first stage of the amps (TIA). So this stage has the issue. Usual check if the power is properly supplied / if the pins are properly connected/isolated / If the opamp is alive or not.

For TP8, if TP8 get railed. TP5 and TP7 are going to be railed too. Is that the case, if so, check this whitening stage in the same way as above.
If the problem is only in the TP5 and/or TP7 it is the differential driver issue. Check the final stage as above. Replacing the opamp could help.

 

  16381   Tue Oct 5 17:58:52 2021 AnchalSummaryCDSc1teststand problems summary

open-mx service is running successfully on the fb1(clone), c1bhd and c1sus.

Quote:

I don't know anything about mx/open-mx, but you also need open-mx,don't you?


  16382   Tue Oct 5 18:00:53 2021 AnchalSummaryCDSc1teststand time synchronization working now

Today I got a new router that I used to connect the c1teststand, fb1 and chiara. I was able to see internet access in c1teststand and fb1, but not in chiara. I'm not sure why that is the case.

The good news is that the ntp server on fb1(clone) is working fine now and both FE computers, c1bhd and c1sus2 are succesfully synchronized to the fb1(clone) ntpserver. This resolves any possible timing issues in this DAQ network.

On running the IOP and user models however, I see the same errors are mentioned in 40m/16372. Something to do with:

Oct 06 00:47:56 c1sus2 mx_stream_exec[21796]: OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
Oct 06 00:47:56 c1sus2 mx_stream_exec[21796]: mx_connect failed Nic ID not Found in Peer Table
Oct 06 00:47:56 c1sus2 mx_stream_exec[21796]: c1x07_daq mmapped address is 0x7fa4819cc000
Oct 06 00:47:56 c1sus2 mx_stream_exec[21796]: c1su2_daq mmapped address is 0x7fa47d9cc000


Thu Oct 7 17:04:31 2021

I fixed the issue of chiara not getting internet. Now c1teststand, fb1 and chiara, all have internet connections. It was the issue of default gateway and interface and findiing the DNS. I have found the correct settings now.

  16383   Tue Oct 5 20:04:22 2021 PacoSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

[Paco, Rana]

We had a look at the BS actuation. Along the way we created a couple of issues that we fixed. A summary is below.

  1. First, we locked MICH. While doing this, we used the /users/Templates/ndscope/LSC/MICH.yml ndscope template to monitor some channels. I edited the yaml file to look at C1:LSC-ASDC_OUT_DQ instead of the REFL_DC. Rana pointed out that the C1:LSC-MICH_OUT_DQ (MICH control point) had a big range (~ 5000 counts rms) and this should not be like that.
  2. We tried to investigate the aforementioned thing by looking at the whitening / uwhitening filters but all the slow epics channels where "white" on the medm screen. Looking under CDS/slow channel monitors, we realized that both c1iscaux and c1auxey were weird, so we tried telnet to c1iscaux without success. Therefore, we followed the recommended wiki procedure of hard rebooting this machine. While inside the lab and looking for this machine, we touched things around the 'rfpd' rack and once we were back in the control room, we couldn't see any light on the AS port camera. But the whitening filter medm screens were back up.
  3. While rana ssh'd into c1auxey to investigate about its status, and burtrestored the c1iscaux channels, we looked at trends to figure out if anything had changed (for example TT1 or TT2) but this wasn't the case. We decided to go back inside to check the actual REFL beams and noticed it was grossly misaligned (clipping)... so we blamed it on the TTs and again, went around and moved some stuff around the 'rfpd' rack. We didn't really connect or disconnect anything, but once we were back in the control room, light was coming from the AS port again. This is a weird mystery and we should systematically try to repeat this and fix the actual issue.
  4. We restored the MICH, and returned to BS actuation problems. Here, we essentially devised a scheme to inject noise at 310.97 Hz and 313.74. The choice is twofold, first it lies outside the MICH loop UGF (~150 Hz), and second, it matches the sensing matrix OSC frequencies, so it's more appropriate for a comparison.
  5. We injected two lines using the BS SUS LOCKIN1 and LOCKIN2 oscilators so we can probe two coils at once, with the LSC loop closed, and read back using the C1:LSC-MICH_IN1_DQ channel. We excited with an amplitude of 1234.0 counts and 1254 counts respectively (to match the ~ 2 % difference in frequency) and noted that the magnitude response in UR was 10% larger than UL, LL, and LR which were close to each other at the 2% level.

[Paco]

After rana left, I did a second pass at the BS actuation. I took TF measurements at the oscilator frequencies noted above using diaggui, and summarize the results below:

TF UL (310.97 Hz) UR (313.74 Hz) LL (310.97 Hz) LR (313.74 Hz)
Magnitude (dB) 93.20 92.20 94.27 93.85
Phase (deg) -128.3 -127.9 -128.4 -127.5

This procedure should be done with PRM as well and using the PRCL instead of MICH.

  16384   Wed Oct 6 15:04:36 2021 HangUpdateSUSPRM L2P TF measurement & Fisher matrix analysis

[Paco, Hang]

Yesterday afternoon Paco and I measured the PRM L2P transfer function. We drove C1:SUS-PRM_LSC_EXC with a white noise in the 0-10 Hz band (effectively a white, longitudinal force applied to the suspension) and read out the pitch response in C1:SUS-PRM_OL_PIT_OUT. The local damping was left on during the measurement. Each FFT segment in our measurement is 32 sec and we used 8 non-overlapping segments for each measurement. The empirically determined results are also compared with the Fisher matrix estimation (similar to elog:16373).

Results:

Fig. 1 shows one example of the measured L2P transfer function. The gray traces are measurement data and shaded region the corresponding uncertainty. The olive trace is the best fit model. 

Note that for a single-stage suspension, the ideal L2P TF should have two zeros at DC and two pairs of complex poles for the length and pitch resonances, respectively. We found the two resonances at around 1 Hz from the fitting as expected. However, the zeros were not at DC as the ideal, theoretical model suggested. Instead, we found a pair of right-half plane zeros in order to explain the measurement results. If we cast such a pair of right-half plane zeros into (f, Q) pair, it means a negative value of Q. This means the system does not have the minimum phase delay and suggests some dirty cross-coupling exists, which might not be surprising. 

Fig. 2 compares the distribution of the fitting results for 4 different measurements (4 red crosses) and the analytical error estimation obtained using the Fisher matrix (the gray contours; the inner one is the 1-sigma region and the outer one the 3-sigma region). The Fisher matrix appears to underestimate the scattering from this experiment, yet it does capture the correlation between different parameters (the frequencies and quality factors of the two resonances).

One caveat though is that the fitting routine is not especially robust. We used the vectfit routine w/ human intervening to get some initial guesses of the model. We then used a standard scipy least-sq routine to find the maximal likelihood estimator of the restricted model (with fixed number of zeros and poles; here 2 complex zeros and 4 complex poles). The initial guess for the scipy routine was obtained from the vectfit model.  

Fig. 3 shows how we may shape our excitation PSD to maximize the Fisher information while keeping the RMS force applied to the PRM suspension fixed. In this case the result is very intuitive. We simply concentrate our drive around the resonance at ~ 1 Hz, focusing on locations where we initially have good SNR. So at least code is not suggesting something crazy... 

Fig. 4 then shows how the new uncertainty (3-sigma contours) should change as we optimize our excitation. Basically one iteration (from gray to olive) is sufficient here. 

We will find a time very recently to repeat the measurement with the optimized injection spectrum.

  16385   Wed Oct 6 15:39:29 2021 AnchalSummarySUSPRM and BS Angular Actuation transfer function magnitude measurements

Note that your tests were done with the output matrix for BS and PRM in the compensated state as done in 40m/16374. The changes made there were supposed to clear out any coil actuation imbalance in the angular degrees of freedom.

  16386   Wed Oct 6 16:31:02 2021 TegaUpdateElectronicsSat Amp modifications

[Tega, Koji]

(S2100737) - Debugging showed that the opamp, AD822ARZ, for PD2 circuit was not working as expected so we replaced with a spare and this fixed the problem. Somehow, the PD1 circuit no longer presents any issues, so everything is now fine with the unit.

(S2100741) - All good.

Quote:

Trying to finish 2 more Sat Amp units so that we have the 7 units needed for the X-arm install. 

S2100736 - All good

S2100737 - This unit presented with an issue on the PD1 circuit of channel 1-4 PCB where the voltage reading on TP6, TP7 and TP8 are -15.1V,  -14.2V, and +14.7V respectively, instead of ~0V.  The unit also has an issue on the PD2 circuit of channel 1-4 PCB because the voltage reading on TP7 and TP8 are  -14.2V, and +14.25V respectively, instead of ~0V.

 

 

  16387   Thu Oct 7 02:04:19 2021 KojiUpdateElectronicsSatellite amp adapter chassis

The 4 units of Satellite Amp Adapter were done:
- The ears were fixed with the screws
- The handles were attached (The stock of the handles is low)
- The boards are now supported by plastic stand-offs. (The chassis were drilled)
- The front and rear panels were fixed to the chassis
- The front and rear connectors were fixed with the low profile 4-40 stand-off screws (3M 3341-1S)
 

  16388   Fri Oct 8 17:33:13 2021 HangUpdateSUSMore PRM L2P measurements

[Raj, Hang]

We did some more measurements on the PRM L2P TF. 

We tried to compare the parameter estimation uncertainties of white vs. optimal excitation. We drove C1:SUS-PRM_LSC_EXC with "Normal" excitation and digital gain of 700.

For the white noise exciation, we simply put a butter("LowPass",4,10) filter to select out the <10 Hz band.

For the optimal exciation, we use butter("BandPass",6,0.3,1.6) gain(3) notch(1,20,8) to approximate the spectral shape reported in elog:16384. We tried to use awg.ArbitraryLoop yet this function seems to have some bugs and didn't run correctly; an issue has been submitted to the gitlab repo with more details. We also noticed that in elog:16384, the pitch motion should be read out from C1:SUS-PRM_OL_PIT_IN1 instead of the OUT channel, as there are some extra filters between IN1 and OUT. Consequently, the exact optimal exciation should be revisited, yet we think the main result should not be altered significantly.

While a more detail analysis will be done later offline, we post in the attached plot a comparison between the white (blue) vs optimal (red) excitation. Note in this case, we kept the total force applied to the PRM the same (as the RMS level matches).

Under this simple case, the optimal excitation appears reasonable in two folds.

First, the optimization tries to concentrate the power around the resonance. We would naturally expect that near the resonance, we would get more Fisher information, as the phase changes the fastest there (i.e., large derivatives in the TF).

Second, while we move the power in the >2 Hz band to the 0.3-2 Hz band, from the coherence plot we see that we don't lose any information in the > 2 Hz region. Indeed, even with the original white excitation, the coherence is low and the > 2 Hz region would not be informative. Therefore, it seems reasonable to give up this band so that we can gain more information from locations where we have meaningful coherence.

  16389   Mon Oct 11 11:13:04 2021 ranaUpdateSUSMore PRM L2P measurements

For the oplev, there are DQ channels you can use so that its possible to look back in the past for long measurements. They have names like PERROR

  16390   Mon Oct 11 13:59:47 2021 HangUpdateSUSMore PRM L2P measurements

We report here the analysis results for the measurements done in elog:16388

Figs. 1 & 2 are respectively measurements of the white noise excitation and the optimized excitation. The shaded region corresponds to the 1-sigma uncertainty at each frequency bin. By eyes, one can already see that the constraints on the phase in the 0.6-1 Hz band are much tighter in the optimized case than in the white noise case. 

We found the transfer function was best described by two real poles + one pair of complex poles (i.e., resonance) + one pair of complex zeros in the right-half plane (non-minimum phase delay). The measurement in fact suggested a right-hand pole somewhere between 0.05-0.1 Hz which cannot be right. For now, I just manually flipped the sign of this lowest frequency pole to the left-hand side. However, this introduced some systematic deviation in the phase in the 0.3-0.5 Hz band where our coherence was still good. Therefore, a caveat is that our model with 7 free parameters (4 poles + 2 zeros + 1 gain as one would expect for an ideal signal-stage L2P TF) might not sufficiently capture the entire physics. 

In Fig. 3 we showed the comparison of the two sets of measurements together with the predictions based on the Fisher matrix. Here the color gray is for the white-noise excitation and olive is for the optimized excitation. The solid and dotted contours are respectively the 1-sigma and 3-sigma regions from the Fisher calculation, and crosses are maximum likelihood estimations of each measurement (though the scipy optimizer might not find the true maximum).

Note that the mean values don't match in the two sets of measurements, suggesting potential bias or other systematics exists in the current measurement. Moreover, there could be multiple local maxima in the likelihood in this high-D parameter space (not surprising). For example, one could reduce the resonant Q but enhance the overall gain to keep the shoulder of a resonance having the same amplitude. However, this correlation is not explicit in the Fisher matrix (first-order derivatives of the TF, i.e., local gradients) as it does not show up in the error ellipse. 

In Fig. 4 we show the further optimized excitation for the next round of measurements. Here the cyan and olive traces are obtained assuming different values of the "true" physical parameter, yet the overall shapes of the two are quite similar, and are close to the optimized excitation spectrum we already used in elog:16388

 

  16391   Mon Oct 11 17:31:25 2021 AnchalSummaryCDSFixed mounting of mx devices in fb. daqd_dc is running now.
 
 

However, lspci | grep 'Myri' shows following output on both computers:

controls@fb1:/dev 0$ lspci | grep 'Myri'
02:00.0 Ethernet controller: MYRICOM Inc. Myri-10G Dual-Protocol NIC (rev 01)

Which means that the computer detects the card on PCie slot.

 

I tried to add this to /etc/rc.local to run this script at every boot, but it did not work. So for now, I'll just manually do this step everytime. Once the devices are loaded, we get:

controls@fb1:/etc 0$ ls /dev/*mx*
/dev/mx0  /dev/mx4  /dev/mxctl   /dev/mxp2  /dev/mxp6         /dev/ptmx
/dev/mx1  /dev/mx5  /dev/mxctlp  /dev/mxp3  /dev/mxp7
/dev/mx2  /dev/mx6  /dev/mxp0    /dev/mxp4  /dev/open-mx
/dev/mx3  /dev/mx7  /dev/mxp1    /dev/mxp5  /dev/open-mx-raw

The, restarting all daqd_ processes, I found that daqd_dc was running succesfully now. Here is the status:

controls@fb1:/etc 0$ sudo systemctl status daqd_* -l
● daqd_dc.service - Advanced LIGO RTS daqd data concentrator
   Loaded: loaded (/etc/systemd/system/daqd_dc.service; enabled)
   Active: active (running) since Mon 2021-10-11 17:48:00 PDT; 23min ago
 Main PID: 2308 (daqd_dc_mx)
   CGroup: /daqd.slice/daqd_dc.service
           ├─2308 /usr/bin/daqd_dc_mx -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.dc
           └─2370 caRepeater

Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: mx receiver 006 thread priority error Operation not permitted[Mon Oct 11 17:48:06 2021]
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: mx receiver 005 thread put on CPU 0
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:48:06 2021] [Mon Oct 11 17:48:06 2021] mx receiver 006 thread put on CPU 0
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: mx receiver 007 thread put on CPU 0
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:48:06 2021] mx receiver 003 thread - label dqmx003 pid=2362
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:48:06 2021] mx receiver 003 thread priority error Operation not permitted
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:48:06 2021] mx receiver 003 thread put on CPU 0
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: warning:regcache incompatible with malloc
Oct 11 17:48:07 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:48:06 2021] EDCU has 410 channels configured; first=0
Oct 11 17:49:06 fb1 daqd_dc_mx[2308]: [Mon Oct 11 17:49:06 2021] ->4: clear crc

● daqd_fw.service - Advanced LIGO RTS daqd frame writer
   Loaded: loaded (/etc/systemd/system/daqd_fw.service; enabled)
   Active: active (running) since Mon 2021-10-11 17:48:01 PDT; 23min ago
 Main PID: 2318 (daqd_fw)
   CGroup: /daqd.slice/daqd_fw.service
           └─2318 /usr/bin/daqd_fw -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.fw

Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] [Mon Oct 11 17:48:09 2021] Producer thread - label dqproddbg pid=2440
Oct 11 17:48:09 fb1 daqd_fw[2318]: Producer crc thread priority error Operation not permitted
Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] [Mon Oct 11 17:48:09 2021] Producer crc thread put on CPU 0
Oct 11 17:48:09 fb1 daqd_fw[2318]: Producer thread priority error Operation not permitted
Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] Producer thread put on CPU 0
Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] Producer thread - label dqprod pid=2434
Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] Producer thread priority error Operation not permitted
Oct 11 17:48:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:09 2021] Producer thread put on CPU 0
Oct 11 17:48:10 fb1 daqd_fw[2318]: [Mon Oct 11 17:48:10 2021] Minute trender made GPS time correction; gps=1318034906; gps%60=26
Oct 11 17:49:09 fb1 daqd_fw[2318]: [Mon Oct 11 17:49:09 2021] ->3: clear crc

● daqd_rcv.service - Advanced LIGO RTS daqd testpoint receiver
   Loaded: loaded (/etc/systemd/system/daqd_rcv.service; enabled)
   Active: active (running) since Mon 2021-10-11 17:48:00 PDT; 23min ago
 Main PID: 2311 (daqd_rcv)
   CGroup: /daqd.slice/daqd_rcv.service
           └─2311 /usr/bin/daqd_rcv -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.rcv

Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1X07_CRC_SUM
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1BHD_STATUS
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1BHD_CRC_CPS
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1BHD_CRC_SUM
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1SU2_STATUS
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1SU2_CRC_CPS
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1SU2_CRC_SUM
Oct 11 17:50:21 fb1 daqd_rcv[2311]: Creating C1:DAQ-NDS0_C1OM[Mon Oct 11 17:50:21 2021] Epics server started
Oct 11 17:50:24 fb1 daqd_rcv[2311]: [Mon Oct 11 17:50:24 2021] Minute trender made GPS time correction; gps=1318035040; gps%120=40
Oct 11 17:51:21 fb1 daqd_rcv[2311]: [Mon Oct 11 17:51:21 2021] ->3: clear crc

Now, even before starting teh FE models, I see DC status as ox2bad in the CDS screens of the IOP and user models. The mx_stream service remains in a failed state at teh FE machines and remain the same even after restarting the service.

controls@c1sus2:~ 0$ sudo systemctl status mx_stream -l
● mx_stream.service - Advanced LIGO RTS front end mx stream
   Loaded: loaded (/etc/systemd/system/mx_stream.service; enabled)
   Active: failed (Result: exit-code) since Mon 2021-10-11 17:50:26 PDT; 15min ago
  Process: 382 ExecStart=/etc/mx_stream_exec (code=exited, status=1/FAILURE)
 Main PID: 382 (code=exited, status=1/FAILURE)

Oct 11 17:50:25 c1sus2 systemd[1]: Starting Advanced LIGO RTS front end mx stream...
Oct 11 17:50:25 c1sus2 systemd[1]: Started Advanced LIGO RTS front end mx stream.
Oct 11 17:50:25 c1sus2 mx_stream_exec[382]: Failed to open endpoint Not initialized
Oct 11 17:50:26 c1sus2 systemd[1]: mx_stream.service: main process exited, code=exited, status=1/FAILURE
Oct 11 17:50:26 c1sus2 systemd[1]: Unit mx_stream.service entered failed state.

But  if I restart the mx_stream service before starting the rtcds models, the mx-stream service starts succesfully:

controls@c1sus2:~ 0$ sudo systemctl restart mx_stream
controls@c1sus2:~ 0$ sudo systemctl status mx_stream -l
● mx_stream.service - Advanced LIGO RTS front end mx stream
   Loaded: loaded (/etc/systemd/system/mx_stream.service; enabled)
   Active: active (running) since Mon 2021-10-11 18:14:13 PDT; 25s ago
 Main PID: 1337 (mx_stream)
   CGroup: /system.slice/mx_stream.service
           └─1337 /usr/bin/mx_stream -e 0 -r 0 -w 0 -W 0 -s c1x07 c1su2 -d fb1:0

Oct 11 18:14:13 c1sus2 systemd[1]: Starting Advanced LIGO RTS front end mx stream...
Oct 11 18:14:13 c1sus2 systemd[1]: Started Advanced LIGO RTS front end mx stream.
Oct 11 18:14:13 c1sus2 mx_stream_exec[1337]: send len = 263596
Oct 11 18:14:13 c1sus2 mx_stream_exec[1337]: Connection Made

However, the DC status on CDS screens still show 0x2bad. As soon as I start the rtcds model c1x07 (the IOP model for c1sus2), the mx_stream service fails:

controls@c1sus2:~ 0$ sudo systemctl status mx_stream -l
● mx_stream.service - Advanced LIGO RTS front end mx stream
   Loaded: loaded (/etc/systemd/system/mx_stream.service; enabled)
   Active: failed (Result: exit-code) since Mon 2021-10-11 18:18:03 PDT; 27s ago
  Process: 1337 ExecStart=/etc/mx_stream_exec (code=exited, status=1/FAILURE)
 Main PID: 1337 (code=exited, status=1/FAILURE)

Oct 11 18:14:13 c1sus2 systemd[1]: Starting Advanced LIGO RTS front end mx stream...
Oct 11 18:14:13 c1sus2 systemd[1]: Started Advanced LIGO RTS front end mx stream.
Oct 11 18:14:13 c1sus2 mx_stream_exec[1337]: send len = 263596
Oct 11 18:14:13 c1sus2 mx_stream_exec[1337]: Connection Made
Oct 11 18:18:03 c1sus2 mx_stream_exec[1337]: isendxxx failed with status Remote Endpoint Unreachable
Oct 11 18:18:03 c1sus2 mx_stream_exec[1337]: disconnected from the sender
Oct 11 18:18:03 c1sus2 mx_stream_exec[1337]: c1x07_daq mmapped address is 0x7fe3620c3000
Oct 11 18:18:03 c1sus2 mx_stream_exec[1337]: c1su2_daq mmapped address is 0x7fe35e0c3000
Oct 11 18:18:03 c1sus2 systemd[1]: mx_stream.service: main process exited, code=exited, status=1/FAILURE
Oct 11 18:18:03 c1sus2 systemd[1]: Unit mx_stream.service entered failed state.

This shows that the start of rtcds model, causes the fail in mx_stream, possibly due to inability of finding the endpoint on fb1. I've again reached to the edge of my knowledge here. Maybe the fiber optic connection between fb and the network switch that connects to FE is bad, or the connection between switch and FEs is bad.

But we are just one step away from making this work.

 

 

  16392   Mon Oct 11 18:29:35 2021 AnchalSummaryCDSMoving forward?

The teststand has some non-trivial issue with Myrinet card (either software or hardware) which even teh experts are saying they don't remember how to fix it. CDS with mx was iin use more than a decade ago, so it is hard to find support for issues with it now and will be the same in future. We need to wrap up this test procedure one way or another now, so I have following two options moving forward:


Direct integration with main CDS and testing

  • We can just connect the c1sus2 and c1bhd FE computers to martian network directly.
  • We'll have to connect c1sus2 and c1bhd to the optical fiber subnetwork as well.
  • On booting, they would get booted through the exisitng fb1 boot server which seems to work fine for the other 5 FE machines.
  • We can update teh DHCP in chiara and reload it so that we can ssh into these FEs with host names.
  • Hopefully, presence of these computers won't tank the existing CDS even if they  themselves have any issues, as they have no shared memory with other models.
  • If this works, we can do the loop back testing of I/O chassis using the main DAQ network and move on with our upgrade.
  • If this does not work and causes any harm to exisitng CDS network, we can disconnect these computers and go back to existing CDS. Recently, our confidence on rebooting the CDS has increased with the robust performance as some legacy issues were fixed.
  • We'll however, continue to use a CDS which is no more supported by the current LIGO CDS group.

Testing CDS upgrade on teststand

  • From what I could gather, most of the hardware in I/O chassis that I could find, is still used in CDS of LLO and LHO, with their recent tests and documents using the same cards and PCBs.
  • There might be some difference in the DAQ network setup that I need to confirm.
  • I've summarised the current c1teststand hardware on this wiki page.
  • If the latest CDS is backwards compatible with our hardware, we can test the new CDS in teh c1teststand setup without disrupting our main CDS. We'll have ample help and support for this upgrade from the current LIGO CDS group.
  • We can do the loop back testing of the I/O chassis as well.
  • If the upgrade is succesfull in the teststand without many hardware changes, we can upgrade the main CDS of 40m as well, as it has the same hardware as our teststand.
  • Biggest plus point would be that out CDS will be up-to-date and we will be able to take help from CDS group if any trouble occurs.

So these are the two options we have. We should discuss which one to take in the mattermost chat or in upcoming meeting.

ELOG V3.1.3-