ID   Date   Author   Type   Category   Subject
  16360   Mon Sep 27 12:12:15 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

I have not been able to figure out a way to make the system that Aaron and I talked about. I'm not even sure it is possible to extract what we want from the information I have in this way. Even the book uses a comparison to a high-precision filter as the way to calculate the quantization noise:

"Quantization noise in digital filters can be studied in simulation by comparing the behavior of the actual quantized digital filter with that of a refrence digital filter having the same structure but whose numerical calculations are done extremely accurately."
-Quantization Noise by Bernard Widrow and Istvan Kollar (pg. 416)

Thus I will use a technique closer to that used in Denis Martynov's thesis (see Appendix B starting on page 171). A summary of my understanding of his method is given here:

A filter is given raw unfiltered Gaussian data f(t); it is filtered, and the result is the filtered data x(t). Thus we get f(t)\rightarrow x(t)=x_N(t)+x_q(t), where x_N(t) is the raw noise filtered through an ideal filter and x_q(t) is the difference, which in this case is the quantization noise. I will input about 100-1000 seconds of the same white noise into a 32-bit and a 64-bit filter (hopefully I can increase the more precise one to 128-bit in the future), record their outputs, and subtract them from each other. This should give us the quantization error e(t):
e(t)=x_{32}(t)-x_{64}(t)=x_{N_{32}}(t)+x_{q_{32}}(t) - x_{N_{64}}(t)-x_{q_{64}}(t)
and since x_{N_{32}}(t)=x_{N_{64}}(t) because they are both running through ideal filters:
e(t)=x_{N}(t)+x_{q_{32}}(t) - x_{N}(t)-x_{q_{64}}(t)
e(t)=x_{q_{32}}(t) -x_{q_{64}}(t)
and since in this case we are assuming that the higher-precision process is essentially noiseless, we are left with the quantization noise x_{q_{32}}(t).

If we make some assumptions, then we can actually calculate a more precise version of the quantization noise:

"Since aLIGO CDS system uses double precision format, quantization noise is extrapolated assuming that it scales with mantissa length"
-Denis Martynov's Thesis (pg. 173)

From this assumption, we can say that the noise difference between the 32-bit and 64-bit filter outputs, x_{q_{32}}(t)-x_{q_{64}}(t), is proportional to the difference between their mantissa lengths. By averaging over many different bit lengths, we can obtain a better estimate of the quantization noise.

I am building the code to do this in this file
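For reference, here is a minimal sketch (not the actual code linked above) of the kind of comparison described. The sample rate, filter, and white-noise input are all placeholders, and scipy's float32 path only approximates how the real 32-bit arithmetic behaves:

    import numpy as np
    from scipy import signal

    fs = 16384                                  # placeholder sample rate
    t = np.arange(0, 100, 1/fs)                 # ~100 s of data
    x64 = np.random.default_rng(0).normal(size=t.size)
    x32 = x64.astype(np.float32)

    # placeholder filter; the real coefficients would come from Foton
    sos64 = signal.butter(4, 10, 'low', fs=fs, output='sos')
    sos32 = sos64.astype(np.float32)

    y64 = signal.sosfilt(sos64, x64)            # treated as the "ideal" filter
    y32 = signal.sosfilt(sos32, x32)            # quantized (single precision) filter

    e = y32.astype(np.float64) - y64            # e(t), the quantization-noise estimate
    f, Pe = signal.welch(e, fs=fs, nperseg=len(e)//4)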

  16361   Mon Sep 27 16:03:15 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

I have coded up the procedure from the previous post. The result does not look like what I would expect.

As shown in Attachment 1, I have the power spectra of the 32-bit and 64-bit outputs, the power spectrum of the difference of the two time series, and the difference of the two power spectra. Unfortunately, all of them follow the same general shape as the raw output of the filter.

I would not expect quantization noise to follow the shape of the filter. I would instead expect it to be more uniform; if anything, I would expect the quantization noise to increase with frequency, since a high-frequency signal run through a filter with high quantization noise should be degraded more severely.

This is one reason why I am confused by what I am seeing here. In all cases, including feeding the same or different white noise into both filters, I have found that the calculated quantization noise is proportional to the response of the filter. This seems wrong to me, so I will continue to play around with it to see if I can gain any intuition about what might be happening.

Attachment 1: DeltaNoiseSpectrum.pdf
  16362   Mon Sep 27 17:04:43 2021   rana   Summary   Computers   Quantization Noise Calculation Summary

I suggest that you

  1. change the corner frequency to 10 Hz as I suggested last week. This filter, as it is, is going to give you trouble.
  2. Put in a sine wave at 3.4283 Hz with an amplitude of 1, rather than white noise. In this way, it's not necessary to do any subtraction. Just make a PSD of the output of each filter.
  3. Be careful about window length and window function. If you don't do this carefully, your PSD will be polluted by window bleeding (see the sketch after this list).
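As a minimal illustration of point 3 (the frequencies and sample rate here are assumed, not prescribed), comparing a boxcar and a Hann window on the same sine wave shows how much power leaks away from the drive frequency:

    import numpy as np
    from scipy import signal

    fs, dur, f0 = 16384, 64, 3.4283             # assumed sample rate (Hz), duration (s), drive frequency (Hz)
    t = np.arange(0, dur, 1/fs)
    x = np.sin(2*np.pi*f0*t)

    for win in ('boxcar', 'hann'):
        f, P = signal.welch(x, fs=fs, window=win, nperseg=16*fs)
        print(win, P[np.argmin(abs(f - 10*f0))])   # leaked power at ~34 Hz, well away from the peak

The boxcar case should show orders of magnitude more leaked power off the peak than the Hann case; that leakage is the "window bleeding" to avoid.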

 

  16366   Thu Sep 30 11:46:33 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

First and foremost, I have the updated Bode plot with the mode moved to 10 Hz; see Attachment 1. Note that the comparison trace is now a % difference, whereas in the previous Bode plot it was just the difference. I also wrapped the phase so that jumps from -180 to 180 are moved down, which eliminates massive jumps in the % difference.

Next, I have two comparison plots: 32-bit and 64-bit. As mentioned above, I moved the mode to 10 Hz and excited both systems at 3.4283 Hz with an amplitude of 1. As we can see on the plot, the two models are practically the same when using 64 bits. With the 32-bit system, we can see that the noise in the IIR filter is much greater than in the state-space model at frequencies above our mode.

Note about windowing and averaging: I used a Hann window and averaged over 4 segments. I came to this number after looking at the results with less and more averaging. In the code, this corresponds to nperseg=num_samples/4, which is then fed into signal.welch.

Attachment 1: SS-IIR-Bode.pdf
Attachment 2: PSD_32bit.pdf
Attachment 3: PSD_64bit.pdf
  16481   Wed Nov 24 11:02:23 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

I added mpmath to the quantization noise code. mpmath allows me to specify the precision that I am using in calculations. I added this to both the IIR filters and the State-space models although I am only looking at the IIR filters here. I hope to look at the state-space model soon. 

Notebook Summary:

I also added a new notebook which you can find HERE. This notebook creates a signal by summing two sine waves and windowing them.

Then that signal is passed through our filter that has been limited to a specific precision. In our case, we pass the same signal through a number of filters at different precisions.

Next, we take the output from the filter with the highest precision (which should have the lowest quantization noise by a significant margin) and subtract the outputs of the lower-precision filters from it. In summary, we are subtracting a clean signal from a noisy one; because the underlying signal is the same, the only thing left after the subtraction should be noise, and since this system is purely digital and theoretical, the limiting noise should be quantization noise.

Now we have a time series of the noise for each precision level (except for our highest precision level but that is because we are defining it as noiseless). From here we take a power spectrum of the result and plot it.

After this, we can calculate a frequency-dependent SNR and plot it. I also calculated values for the SNR at the frequencies of our two inputs. 

This is the procedure taken in the notebook and the results are shown below.
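A minimal sketch of how this procedure can be implemented with mpmath is below. It uses a Direct Form II biquad with a placeholder filter and input, so it only shows where the precision limiting, the subtraction, and the SNR calculation happen; it is not the notebook code itself.

    import numpy as np
    from mpmath import mp, mpf
    from scipy import signal

    def df2_at_precision(sos, x, bits):
        """Run a Direct Form II biquad cascade with all arithmetic at `bits` of binary precision."""
        mp.prec = bits
        y = [mpf(float(v)) for v in x]
        for b0, b1, b2, a0, a1, a2 in sos:
            b0, b1, b2, a1, a2 = [mpf(float(c)) / mpf(float(a0)) for c in (b0, b1, b2, a1, a2)]
            w1 = w2 = mpf(0)
            out = []
            for xn in y:
                w0 = xn - a1*w1 - a2*w2          # DF2 state update
                out.append(b0*w0 + b1*w1 + b2*w2)
                w1, w2 = w0, w1
            y = out
        # the cast back to float64 here is exactly where very small (e.g. 128-bit) noise can round to zero
        return np.array([float(v) for v in y])

    fs = 16384                                    # placeholder sample rate
    t = np.arange(0, 4, 1/fs)
    x = (np.sin(2*np.pi*3*t) + np.sin(2*np.pi*70*t)) * np.hanning(t.size)   # two summed, windowed sines
    sos = signal.butter(2, 10, 'low', fs=fs, output='sos')                  # placeholder filter

    y_ref = df2_at_precision(sos, x, 256)         # highest precision, defined as the noiseless reference
    y_16  = df2_at_precision(sos, x, 16)
    f, Pn = signal.welch(y_16 - y_ref, fs=fs, nperseg=t.size//4)            # quantization-noise PSD
    f, Ps = signal.welch(y_ref, fs=fs, nperseg=t.size//4)
    snr = Ps / Pn                                 # frequency-dependent SNR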

Analysis of Results:

The first thing to note is that the 256-bit and 128-bit precision levels are not shown on the graph. The 256-bit signal was our clean signal, so it is defined to have no noise and can't be plotted. The 128-bit signal should have some quantization noise, but I checked the output array and it contained all zeros. After further investigation, I found that the quantization noise was so small that when the result was handed over from mpmath to the general Python code it was rounded to zero. To overcome this issue I would have to keep the array as an mpmath object the entire time. I don't think this is worthwhile because matplotlib probably couldn't handle it, and it would be easier to just rewrite the code in C.

The next thing to notice is a sanity check: in general, low-precision filters yield higher noise than high-precision ones. However, this does not hold at the low end; we can see that 16-bit actually has the highest noise over most of the range. Chris pointed out that at low precision the quantization noise can become so large that it is no longer a linearly coupled noise source. He also noted that this is prone to happen for low-precision coefficients with features far below the Nyquist frequency, like I have here. This explanation seems to fit the data, especially because the ambiguity appears at 16 bits and below, as he points out.

Another thing I must mention, even if it is just a note to future readers, is that quantization noise is input dependent: by changing the input signal I see different amounts of quantization noise.

Analysis of SNR:

One of the things we hoped to accomplish in the original plan was to play around with the input and see how the results changed. I mainly looked at how the amplitude of the input signal scaled the SNR of the output. Below is a table of the results. These results were taken from the SNR calculated at the first peak (see the last code block in the notebook), with the input amplitude given at the top of each column. This amplitude was applied to both sine waves even though only the first one is reported. As an example, the notebook is currently set up for the measurement with input amplitude 10.

Input amplitude   0.1        1          100        1000
4-bit SNR         5.06e5     5.07e5     5.07e5     5.07e5
8-bit SNR         5.08e5     5.08e5     5.08e5     5.08e5
16-bit SNR        2.57e6     8.39e6     3.94e6     1.27e6
32-bit SNR        7.20e17    6.31e17    1.311e18   1.86e18
64-bit SNR        6.0e32     1.28e32    1.06e32    2.42e32
128-bit SNR       unknown    unknown    unknown    unknown

As we can see from the table above, the SNR does not seem to be related to the amplitude of the input: in multiple instances the SNR dips or peaks in the middle of our amplitude range.

 

Attachment 1: PSD_IIR_all.pdf
  16482   Wed Nov 24 13:44:19 2021   rana   Summary   Computers   Quantization Noise Calculation Summary

This looks great. I think what we want to see mainly is just the noise in the 32 bit IIR filtering subtracted from the 64 bit one.

It would be good if Tega can look through your code to make sure there are NO sneaky places where python is doing some funny casting of the numbers. I didn't see anything obvious, but as Chris points out, these things can be really sneaky, so you have to be next-level paranoid to really be sure. Fox Mulder level paranoia.

And we want to see a comparison between what you get and what Denis Martynov put in an appendix of his thesis when comparing the Direct Form II with the low-noise form (there are also some slides from Matt Evans on this from about a decade ago). You should be able to reproduce his results. He used Matlab + C, so I am curious to see if it can be done all in Python, or if we really need to do it in C.

And then...we can make this a part of the IFOtest suite, so that we point it at any filter module anywhere in LIGO, and it downloads the data and gives us an estimate of the digital noise being generated.

  16492   Tue Dec 7 10:55:25 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

[Ian, Tega]

Tega and I have gone through the IIR Filter code and optimized it to make sure there aren't any areas that force high precision to be down-converted to low precision.
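As a minimal illustration of the kind of silent down-conversion we were checking for (not taken from the actual code), a single stray float() in an mpmath chain quietly truncates everything downstream to double precision:

    from mpmath import mp, mpf

    mp.prec = 128
    x = mpf(1) / mpf(3)

    good = x * mpf('1e-20')            # stays at the 128-bit working precision
    bad  = float(x) * mpf('1e-20')     # float() silently truncates x to a 53-bit mantissa first

    print(mp.nstr(good - bad, 10))     # a nonzero residual exposes the hidden down-conversion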

For the new biquad filter we have run into an issue where the gain of the filter is much higher than it should be. Looking at Attachments 1 and 2, which are time-series comparisons of the inputs and outputs of the different filters, we see that the scale of the output of the Direct Form II filter (Attachment 1, right) is on the order of 10^-5, whereas the magnitude of the response of the biquad filter is on the order of 10^2. Other than this gain, the responses look to be the same.

I am not entirely sure how this gain came into the system, because we copied the C code that actually runs on the CDS system into Python. There is a gain that affects the input of the biquad filter, shown on one of Matt Evans' slides. This gain, shown there as g, could scale the input signal and thus explain the discrepancy. However, I have not found any way to calculate this g.

 

 

With this gain problem we are left with the quantization noise shown in Attachment 4.

State Space:

I have constrained the state-space filter to act at a given precision level. However, my code is not optimized. It works by putting the input through the first state-space equation, then integrating the result, which finally gets fed through the second state-space equation.

This is not optimized and gives us the resulting quantization noise shown in attachment 5.
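For reference, a minimal sketch of the plain (unoptimized) state-space update described above, using a made-up discretized second-order resonance rather than the actual filter:

    import numpy as np
    from scipy import signal

    fs = 16384
    w0 = 2*np.pi*10                                   # hypothetical 10 Hz resonance, Q ~ 50
    A, B, C, D, _ = signal.cont2discrete(signal.tf2ss([w0**2], [1, w0/50, w0**2]), dt=1/fs)

    def ss_filter(u):
        """x[n+1] = A x[n] + B u[n],  y[n] = C x[n] + D u[n], applied sample by sample."""
        x = np.zeros((A.shape[0], 1))
        y = np.empty(len(u))
        for n, un in enumerate(u):
            y[n] = (C @ x + D * un).item()
            x = A @ x + B * un
        return y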

However, the state-space filter also has a gain problem: its output is about 85 times the amplitude of the DF2 filter's. Also, since the state-space code is not operating in the most efficient way possible, I decided to port the code Chris made to run the state-space model to Python. This code has a problem where it seems to be unstable; I will see if I can fix it.

 

 

Attachment 1: DF2_TS.pdf
Attachment 2: BIQ_TS.pdf
Attachment 4: PSD_COMP_BIQ_DF2.pdf
Attachment 5: PSD_COMP_SS_DF2.pdf
  16498   Fri Dec 10 13:02:47 2021   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

I am trying to replicate the simulation done by Matt Evans in his presentation  (see Attachment 1 for the slide in particular). 

He defines his input as x_{\mathrm{in}}=\sin(2\pi t)+10^{-9}\sin(2\pi t f_s/4), i.e. two inputs: one of amplitude 1 at 1 Hz and one of amplitude 10^{-9} at a quarter of the sampling frequency, in this case 4096 Hz.

For his filter, he uses a fourth-order notch filter. To achieve this I cascaded two second-order notch filters (signal.iirnotch), both located at 1 Hz, with quality factors of 1 and 1e6, as specified in slide 13 of his presentation.

I used the same procedure outlined here. My results are posted below in attachment 2.
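For concreteness, the input and cascaded notch can be set up roughly like this (assuming f_s = 16384 Hz so that f_s/4 = 4096 Hz; this is only a sketch, not the exact code included in the zip file below):

    import numpy as np
    from scipy import signal

    fs = 16384                                                  # assumed sampling frequency f_s
    t = np.arange(0, 64, 1/fs)
    x_in = np.sin(2*np.pi*t) + 1e-9*np.sin(2*np.pi*(fs/4)*t)    # amplitude 1 at 1 Hz + 1e-9 at f_s/4

    # fourth-order notch at 1 Hz from two cascaded second-order notches with Q = 1 and Q = 1e6
    sos = np.vstack([signal.tf2sos(*signal.iirnotch(1.0, Q, fs=fs)) for Q in (1.0, 1e6)])
    y = signal.sosfilt(sos, x_in)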

Analysis of results:

As we can see from the results posted below, they don't match. There are a few problems I noticed that may give us some idea of what went wrong.

First, there is a peak in the noise around 35 Hz. This peak is not shown at all in Matt's results and may indicate that something is inconsistent.

Second, there is no peak at 4096 Hz. This is clearly shown in Matt's slides, and it is present in the input spectrum, so it is strange that it does not appear in the output.

My first thought was that the 4 kHz signal was somehow showing up at about 35 Hz, but even when the 4 kHz signal is removed from the input the 35 Hz feature is still there. The spectrum of the input, shown in Attachment 3, shows no features at ~35 Hz.

Attachment 4 shows the input filter, which also has no features at ~35 Hz. Given that neither the input nor the filter has features at ~35 Hz, there must be either some quantization-noise feature there or, more likely, some sampling or calculation effect.

To figure out what is causing this I will continue to change things in the model until I find what is controlling it. 

I have included a Zip file that includes all the necessary files to recreate these plots and results.

Attachment 1: G0900928-v1_(dragged).pdf
Attachment 2: PSD_COMP_BIQ_DF2.pdf
Attachment 3: Input_PSD.pdf
Attachment 4: Input_Filter.pdf
Attachment 5: QuantizationN.zip
  16507   Wed Dec 15 13:57:59 2021   Paco   Update   Computers   upgraded ubuntu on zita

[Paco]

Upgraded zita's Ubuntu and restarted the StripTool script.

  16650   Mon Feb 7 16:14:37 2022   Tega   Update   Computers   realtime system reboot problem

I was looking into plotting the temperature sensor data trend and into why we currently have no frame data written to file (on /frames) since Friday, and noticed that the FE models were not running. I spoke to Anchal about it and he mentioned that we are currently unable to ssh into the FE machines, so we have been unable to start the models. I recalled that the last time we encountered this problem Koji resolved it on chiara, so I searched the elog for Koji's fix and found it here: https://nodus.ligo.caltech.edu:8081/40m/16310. I followed the procedure and restarted the c1sus and c1lsc machines, and we are now able to ssh into them. I also restarted the remaining FE machines and confirmed that we can ssh into them. Then, to start the models, I ssh'd into each FE machine (c1lsc, c1sus, c1ioo, c1iscex, c1iscey, c1sus2) and ran the command

rtcds start --all

to start all models on the FE machine. This procedure worked for all the FE machines but failed for c1lsc. For some reason, after starting the first few models (the IOP model c1x04, then c1lsc and c1ass), the ssh connection to the machine drops. When we try to ssh into c1lsc after this event, we get the following error: "ssh: connect to host c1lsc port 22: No route to host". I reset the c1lsc machine and decided to just start the IOP model for now. I'll wait for Anchal or Paco to resolve this issue.


[Anchal, Tega]

I informed Anchal of the problem and asked if he could take a look. It turns out 9 FE models across 3 FE machines (c1lsc, c1sus, c1ioo) have a certain interdependence that requires careful consideration when starting the FE models. In a nutshell, we need to first start the IOP models on all three FE machines before we start the other models on those machines. So we turned off all the models and shut down the FE machines, mainly because of a DAQ issue: the DC (data concentrator) indicator was not initialized. Anchal looked around on fb1 to figure out why this was happening and eventually discovered that it was the same mx_stream issue encountered earlier on the fb1 clone (https://nodus.ligo.caltech.edu:8081/40m/16372). So we restarted fb1 to see if things would clear up, given that the chiara DHCP server is now working fine. Upon restart of fb1, we used the info in a previous elog that shows whether the DAQ network is working or not, i.e. we ran the command

$ /opt/mx/bin/mx_info
MX:fb1:mx_init:querying driver:error 5(errno=2):No MX device entry in /dev.

The output shows that the MX device was not initialized during the reboot, as can also be seen below.

$ sudo systemctl status daqd_dc -l

● daqd_dc.service - Advanced LIGO RTS daqd data concentrator
   Loaded: loaded (/etc/systemd/system/daqd_dc.service; enabled)
   Active: failed (Result: exit-code) since Mon 2022-02-07 18:02:02 PST; 12min ago
  Process: 606 ExecStart=/usr/bin/daqd_dc_mx -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.dc (code=exited, status=1/FAILURE)
 Main PID: 606 (code=exited, status=1/FAILURE)

Feb 07 18:01:56 fb1 systemd[1]: Starting Advanced LIGO RTS daqd data concentrator...
Feb 07 18:01:56 fb1 systemd[1]: Started Advanced LIGO RTS daqd data concentrator.
Feb 07 18:02:00 fb1 daqd_dc_mx[606]: [Mon Feb  7 18:01:57 2022] Unable to set to nice = -20 -error Unknown error -1
Feb 07 18:02:00 fb1 daqd_dc_mx[606]: Failed to do mx_get_info: MX not initialized.
Feb 07 18:02:00 fb1 daqd_dc_mx[606]: 263596
Feb 07 18:02:02 fb1 systemd[1]: daqd_dc.service: main process exited, code=exited, status=1/FAILURE
Feb 07 18:02:02 fb1 systemd[1]: Unit daqd_dc.service entered failed state.


NOTE: We commented out the line

Restart=always

in the file "/etc/systemd/system/daqd_dc.service" in order to see the error, BUT MUST UNDO THIS AFTER THE PROBLEM IS FIXED!

  16656   Thu Feb 10 14:39:31 2022   Koji   Summary   Computers   Network security issue resolved

[Mike P / Koji / Tega / Anchal]

IMSS/LIGO IT notified us that the "ILOM ports" of one of our hosts on the "114" network are open. We tried to shut down obvious machines but could not identify the host in question, so we decided to do a more systematic search for the host.

[@Network Rack]
- First of all, we disconnected the optical cables coming to the GC router while a ping was running on the AIRLIGO-connected laptop (i.e. outside of the 40m network). This made the ping stop, which means that the issue was definitely in the 40m.
- Secondly, we started to disconnect (and reconnect) the ethernet cables from the GC router one by one. We found that the ping response stops when the cable named "NODUS" was disconnected.

[@40m IFO lab]
- So we tracked the cable down in the 40m lab. After a while, we identified that the cable was really connected to nodus.

- Nodus was supposed to have only one network connection, to the martian network, since the introduction of the bidirectional NAT router (rather than the old configuration with a single-direction NAT router).

- In fact, the cable was connected to the "non-networking" port of nodus (Attachment 1). I guess the cable had been connected like this for a long time, but somehow the ILOM (IPMI) port was activated along with the recent power cycling.

- The cable was disconnected at nodus too (Attachment 2), and tape was put over the port so that nothing gets connected to it anymore.

Attachment 1: PXL_20220210_220816955.jpg
Attachment 2: PXL_20220210_220827167.jpg
  16836   Mon May 9 15:32:14 2022   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

I made a first pass at a tool to measure the quantization noise of specific filters in the 40m system. The code for it can be found here. It takes the input to the filter bank and the filter coefficients for all of the filters in the filter bank, runs the input through all the filters, and measures the quantization noise for each one. It does this by subtracting the 64-bit output from the 32-bit output. Note: the actual system is 64-bit, so I need to update it to subtract the 64-bit output from the 128-bit output using the long double format. This means it must be run on a computer that supports the long double format, which I checked that Rossa does. The code outputs a number of plots that look like the one in Attachment 1. Koji suggested automatically generating a page for each of the filters that shows the filter and the results as well as an SNR for the noise source. The code is structured as a class so that it can easily be added to the IFOtest repo when it is ready.
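One caveat worth flagging, easy to check in numpy: on typical x86-64 Linux machines, the C long double that numpy exposes as np.longdouble is 80-bit extended precision (64-bit mantissa), not true IEEE quad, so the "128-bit" reference is really about 19 significant digits rather than 34.

    import numpy as np

    info = np.finfo(np.longdouble)
    print(info.nmant, info.eps)        # expect 63 and ~1.08e-19 for x86 extended precision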

I tracked down a filter realization that I thought might have lower quantization noise than the one currently used. The specifics will be in version 2 of the DCC document that I am updating, but a diagram of it is in Attachment 2. Preliminary calculations seemed to show that it had lower quantization noise than the current filter realization. I added this realization to the C code and ran a simple comparison between all of them. The results in Attachment 3 are not as good as I had hoped. The input was a two-tone sine wave. The low-level broadband signal between 10 Hz and 4 kHz is the quantization noise. The blue trace shows the current filter realization, another trace shows the generic, most basic Direct Form II, and the orange one is the new filter, which I personally call the Aircraft Biquad because I found it in this paper by the Hughes Aircraft Company (see Fig. 2 in the paper). They call it the "modified canonic form realization", but there are about 20 filters in the paper that share that name, so in the DCC doc I have just given them numbers because it is easier.

Whats next:

1) I need to review the qnoisetool code to make sure it computes the correct 64-bit noise.

        a) I also want to add the new filter to the simulation to see how it does

2) Make the output into a summary page the way Koji suggested. 

3) complete the updated DCC document. I need to reconcile the differences between the calculation I made and the actual result of the simulation.

Attachment 1: SUS-ETMX_SUSYAW3_0.0.pdf
Attachment 2: LowNoiseBiquad2.pdf
Attachment 3: quant_noise_floor.pdf
  16881   Fri May 27 17:46:48 2022   Paco   Summary   Computers   CDS upgrade visit, downfall and rise of c1lsc models

[Paco, Anchal-remote, Yuta, JC]

Sometime around noon today, right after the CDS upgrade planning tour, the c1lsc FE fell over. We thought this was OK because c1sus was still up, but somehow the IFO alignment was compromised (this is in fact how we first noticed the problem). Yuta couldn't see REFL on the camera, nor on the AP table (!!), so somehow some or all of TT1, TT2, and PRM were affected by this model stopping. We even tried kicking PRM slightly to see if the beam was nearby, with no success.

We decided to restart the models. To do this we first ssh'd into c1lsc, c1ioo and c1sus and stopped all models. During this step, c1ioo and c1sus dropped their connections, so we had to physically restart them. We then noticed a DC 0x4000 error in c1x04 (the c1lsc IOP) and, after checking, found that the gpstimes differed by 1 second. We then stopped the model again and, from fb1, restarted all daqd_* services, ran modprobe -r gpstime; modprobe gpstime, restarted c1lsc, and started the c1x04 model. This fixed the issue, so we finished restarting all FE models and burt-restored all the relevant snap files to today 02:19 AM PDT.

This made the IFO recover its nominal alignment, minus the usual drift.

* The OAF model failed to start but we left it like so for now.

  16991   Tue Jul 12 13:59:12 2022   rana   Summary   Computers   process monitoring: Monit

I've installed Monit on megatron and nodus just now, and will set it up to monitor some of our common processes. I'm hoping that it can give us a nice web view of what's running where in the Martian network.

  17058   Thu Aug 4 19:01:59 2022   Tega   Update   Computers   Front-end machine in supermicro boxes

Koji and JC looked around the lab today and found some supermicro boxes which I was told to look into to see if they have any useful computers.

 

Boxes next to Y-arm cabinets (3 boxes: one empty)

We were expecting to see a smaller machine in the first box (like the top machine in Attachment 1), but it turns out to actually contain the front-end we need; see the bottom machine in Attachment 1. This is the same machine as c1bhd, currently on the teststand. Attachment 2 is an image of the machine in the second box (maybe a new machine for the framebuilder?). The third box is empty.

 

Boxes next to X-arm cabinets (3 boxes)

Attachment 3 shows the 3 boxes, each of which contains the same FE machine we saw earlier at the bottom of Attachment 1. The middle box contains the note shown in Attachment 4.

 

Box opposite Y-arm cabinets (1 empty box)

 

In summary, it looks like we have 3 new front-ends, 1 new front-end with a networking issue, and 1 new tower machine (possibly a framebuilder replacement).

Attachment 1: IMG_20220804_184444473.jpg
Attachment 2: IMG_20220804_191658206.jpg
Attachment 3: IMG_20220804_185336240.jpg
Attachment 4: IMG_20220804_185023002.jpg
  17066   Mon Aug 8 17:16:51 2022   Tega   Update   Computers   Front-end machine setup

Added 3 FE machines - c1ioo, c1lsc, c1sus - to the teststand following the instructions in elog 15947. Note that we also updated /etc/hosts on chiara by adding the names and IPs of the new FEs, since we wish to ssh from there given that chiara is where we land when we connect to c1teststand.

Two of the FE machines - c1lsc & c1ioo - have the 6-core X5680 @ 3.3 GHz processor, and their BIOS was already mostly configured because they came from LLO, I believe. The third machine - c1sus - has the 6-core X5650 @ 2.67 GHz processor and required a complete BIOS configuration according to the doc.

Next Step:  I think the next step is to get the latest RTS working on the new fb1 (tower machine), then boot the frontends from there.


KVM switch note:

All current front-ends have PS/2 keyboard and mouse connectors except for fb1, which only has USB ports. So we may not be able to connect to fb1 using a PS/2 KVM switch that works for all the current front-ends. The new tower machine does have a PS/2 connector, so if we decide to use it as the boot server and framebuilder, then we should be fine.

Attachment 1: IMG_20220808_170349717.jpg
  17074   Wed Aug 10 20:51:14 2022   Tega   Update   Computers   CDS upgrade Front-end machine setup

Here is a summary of what needs doing following the chat with Jamie today.

 

Jamie brought over the KVM switch shown in the attachment and I tested all 16 ports and 7 cables and can confirm that they all work as expected.

 

TODO

1. Do a rack space budget to get a clear picture of how many front-ends we can fit into the new rack

2. Look into what needs doing and how much effort would be needed to clear rack 1X7 and use that instead of the new rack. The power down on Friday would present a good opportunity to do this work on Monday, so get the info ready before then. 

3. Start mounting front-ends, KVM and dolphin network switch

4. Add the BOX rack layout to the CDS upgrade page.

Attachment 1: IMG_20220810_171002928.jpg
Attachment 2: IMG_20220810_171019633.jpg
  17083   Tue Aug 16 18:22:59 2022   Tega   Update   Computers   c1teststand rack mounting for CDS upgrade

[Tega, Yuta]

I keep getting confused about the purpose of the teststand. The view I am adopting going forward is its use as a platform for testing the compatibility of new hardware upgrade, instead of thinking of it as an independent system that works with old hardware.

The initial idea of clearing 1X7 cannot be done for now, because I missed the deadline for providing a detailed enough plan before Monday power up of the lab, so we are just going to go ahead and use the new rack as was initially intended and get the latest hardware and software tested here.

We mounted the DAQ, subnet, and Dolphin IX switches; see Attachment 1. The mounting ears that came with the Dolphin switch did not fit and so could not be used. We looked around the lab and decided to use one of the NavePoint mounting brackets that we found next to the teststand; see Attachment 2.

We plan to move the new rack to the current location of the teststand and use the power connection from there. It is also closer to 1X7, so moving the front-ends and switches to 1X7 should be straightforward after we complete all CDS upgrade testing.

Attachment 1: IMG_20220816_180157132.jpg
Attachment 2: IMG_20220816_175125874.jpg
  17088   Wed Aug 17 11:10:51 2022   rana   Update   Computers   c1teststand rack mounting for CDS upgrade

we want to be able to run SimPlant on the teststand, test our new controls algorithms, test watchdogs, and any other software upgrades. Ideally in the steady state it will run some plants with suspensions and cavities and we will develop our measurement scripts on there also (e.g. IFOtest).

Quote:

[Tega, Yuta]

I keep getting confused about the purpose of the teststand. The view I am adopting going forward is its use as a platform for testing the compatibility of new hardware upgrade, instead of thinking of it as an independent system that works with old hardware.

  17098   Mon Aug 22 19:02:15 2022   Tega   Update   Computers   c1teststand rack mounting for CDS upgrade II

[Tega, JC]

Moved the rack to the location of the teststand just behind 1X7; we plan to remove the other two small teststand racks to create some space there. We then mounted the c1bhd I/O chassis and 4 front-end machines in the teststand rack (see Attachment 1).

Installed the dolphin IX cards on all 4 front-end machines: c1bhd, c1ioo, c1sus, c1lsc. I also removed the dolphin DX card that was previously installed on c1bhd.

Found a single OneStop host card with a mini PCI slot mounting plate in a storage box (see attachment 2). Since this only fits into the dual PCI riser card slot on c1bhd, I swapped out the full-length PCI slot OneStop host card on c1bhd and installed it on c1lsc, (see attachments 3 & 4).

 

Attachment 1: IMG_20220822_185437763.jpg
Attachment 2: IMG_20220822_131340214.jpg
Attachment 3: c1bhd.jpeg
Attachment 4: c1lsc.jpeg
  17100   Tue Aug 23 22:30:24 2022   Tega   Update   Computers   c1teststand OS upgrade - I

[JC, Tega, Chris]

After moving the teststand front-ends, chiara (name server), and fb1 (boot server) to the new rack behind 1X7, we powered everything up and checked that we can reach c1teststand via pianosa and that the front-ends are still able to boot from fb1. After confirming these tests, we decided to start the software upgrade to Debian 10 (buster). We installed buster on fb1 and are now in the process of setting up diskless boot. I looked around for CDS instructions on how to do this and found the CdsFrontEndDebian10 page, which contains most of the info we need. The page suggests that it may be cleaner to start the Debian 10 installation on a front-end that is connected to an I/O chassis with at least 1 ADC and 1 DAC card, then move the installation disk to the boot server and continue from there. So I moved the disk from fb1 to one of the front-ends, but I had trouble getting it to boot. I then decided to do a clean install on another disk on the c1lsc front-end, which has a host adapter card that can be connected to the c1bhd I/O chassis. We can then mount this disk on fb1 and use it to set up the diskless boot OS.

  17108   Fri Aug 26 14:05:09 2022   Tega   Update   Computers   rack reshuffle proposal for CDS upgrade

[Tega, Jamie]

Here is a proposal for what we would like to do in terms of reshuffling a few pieces of rack-mounted equipment for the CDS upgrade.

  • Frequency Distribution Amp - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Then disconnect power and signal cables one at a time to enable optimum rerouting for strain relief and declutter.  

 

  • GPS Receiver Tempus LX - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Then disconnect power and signal cables one at a time to enable optimum rerouting for strain relief and declutter.

 

  • PEM & ADC Adapter - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Disconnect the single signal cable from the rear of the ADC adapter to allow for optimum rerouting for strain relief.

 

  • Martian Network Switch - Make a note of all connections, disconnect them, move the switch to 1X7 and reconnect ethernet cables. 

 

  • MARTIAN NETWORK SWITCH CONNECTIONS

    Port  Label                              Port  Label
    1     Tempus LX (yellow, unlabeled)      13    FB1
    2     1Y6 HUB                            14    FB
    3     C0DCU1                             15    NODUS
    4     C1PEM1                             16    (empty)
    5     RFM-BYPASS                         17    CHIARA
    6     MEGATRON/PROCYON                   18    (empty)
    7     MEGATRON                           19    CISC/C1SOSVME
    8     BR40M                              20    C1TESTSTAND [blue/unlabelled]
    9     C1DSCL1EPICS0                      21    JetStar [blue/unlabelled]
    10    OP340M                             22    C1SUS [purple]
    11    C1DCUEPICS                         23    unknown [88/purple/goes to top-back rail]
    12    C1ASS                              24    unknown [stonewall/yellow/goes to top-front rail]

     

I believe all of this can be done in one go followed by CDS validation. Please comment so we can improve the plan. Should we move FB1 to 1X7 and remove old FB & JetStor during this work?

Attachment 1: Reshuffling proposal

Attachment 2: Front of 1X7 Rack

Attachment 3: Rear of 1X7 Rack

Attachment 4: Front of 1X6 Rack

Attachment 5: Rear of 1X6 Rack

Attachment 6: Martian switch connections

Attachment 1: rack_change_proposal.pdf
Attachment 2: IMG_20220826_131331042.jpg
Attachment 3: IMG_20220826_131153172.jpg
Attachment 4: IMG_20220826_131428125.jpg
Attachment 5: IMG_20220826_131543818.jpg
Attachment 6: IMG_20220826_142620810.jpg
  17109   Sun Aug 28 23:14:22 2022   Jamie   Update   Computers   rack reshuffle proposal for CDS upgrade

@tega This looks great, thank you for putting this together.  The rack drawing in particular is great.  Two notes:

  1. In "1X6 - proposed" I would move the "PEM AA + ADC Adapter" down lower in the rack, maybe where "Old FB + JetStor" are, after removing those units since they're no longer needed.  That would keep all the timing stuff together at the top without any other random stuff in between them.  If we can't yet remove Old FB and the JetStor then I would move the VME GPS/Timing chassis up a couple units to make room for the PEM module between the VME chassis and FB1.
  2. We'll eventually want to move FB1 and Megatron into 1X7, since it seems like there will be room there.  That will put all the computers into one rack, which will be very nice.  FB1 should also be on the KVM switch as well.

I think most of this work can be done with very little downtime.

  17111   Mon Aug 29 15:15:46 2022   Tega   Update   Computers   3 FEs from LLO got delivered today

[JC, Tega]

We got the 3 front-ends from LLO today. The contents of each box are:

  1. FE machine
  2. OSS adapter card for connecting to I/O chassis
  3. PCI riser cards (x2)
  4. Timing Card and cable
  5. Power cables, mounting brackets and accompanying screws
Attachment 1: IMG_20220829_145533452.jpg
Attachment 2: IMG_20220829_144801365.jpg
  17113   Tue Aug 30 15:21:27 2022   Tega   Update   Computers   3 FEs from LHO got delivered today

[Tega, JC]

We received the remaining 3 front-ends from LHO today. They each have a timing card and an OSS host adapter card installed. We also received 3 Dolphin DX cards. As with the previous packages from LLO, each box contains a rack mounting kit for the Supermicro machine.

Attachment 1: IMG_20220830_144925325.jpg
Attachment 2: IMG_20220830_142307495.jpg
Attachment 3: IMG_20220830_143059443.jpg
  17127   Fri Sep 2 13:30:25 2022   Ian MacMillan   Summary   Computers   Quantization Noise Calculation Summary

P. P. Vaidyanathan wrote a chapter in the book "Handbook of Digital Signal Processing: Engineering Applications" called "Low-Noise and Low-Sensitivity Digital Filters" (Chapter 5, pg. 359). I took a quick look at it and wanted to give some thoughts in case they are useful. The experts in the field would be Leland B. Jackson, P. P. Vaidyanathan, Bernard Widrow, and István Kollár. Widrow and Kollár wrote the book "Quantization Noise: Roundoff Error in Digital Computation, Signal Processing, Control, and Communications" (a copy of which is at the 40m). It is good that P. P. Vaidyanathan is at Caltech.

Vaidyanathan's chapter serves as a good introduction to the topic of quantization noise. He starts with the basic theory, similar to my own document on the topic. From there, there are two main topics relevant to our goals.

The first interesting thing is Error-Spectrum Shaping (pg. 387). I have never investigated this idea, but the general gist is that as poles and zeros move closer to the unit circle the SNR deteriorates, and this is a way of implementing error feedback that should alleviate that problem. See Fig. 5.20 for a full realization of a second-order section with error feedback.
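For intuition, here is a toy first-order error-feedback quantizer (not the second-order section of Fig. 5.20): the error made on each sample is fed back into the next one, so the quantization error gets spectrally shaped (pushed toward high frequencies) instead of lying flat across the band.

    import numpy as np

    def quantize(x, step):
        """Plain rounding quantizer (roughly flat error spectrum)."""
        return step * np.round(x / step)

    def quantize_esfb(x, step):
        """First-order error-spectrum shaping: feed back the previous sample's quantization error."""
        y = np.empty_like(x)
        e = 0.0
        for n, xn in enumerate(x):
            v = xn + e                     # add back the error made on the previous sample
            y[n] = step * np.round(v / step)
            e = v - y[n]                   # error to feed back; shapes the noise roughly as (1 - z^-1)
        return y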

The second starts on page 402 and is an overview of state-space filters, giving an example of a state-space realization (Fig. 5.26). I tested this exact realization a while ago and found that it was better than the Direct Form II filter but not as good as the current low-noise implementation that LIGO uses. This realization is very close to the current one, except it uses one fewer addition block.

Overall I think it is a useful chapter. I like the idea of using some sort of error correction and I'm sure his other work will talk more about this stuff. It would be useful to look into.

One thought I had recently is that if the quantization noise is uncorrelated between two different realizations, then connecting them in parallel and averaging their outputs (as shown in Attachment 1) may actually yield lower quantization noise. It would require double the computation for filtering, but it may work. For example, using the current LIGO realization together with the realization given in this book might yield lower quantization noise. This would only work with two similarly low-noise realizations: since we would be averaging two independent samples of the noise, the variance would be cut in half and the ASD would show a 1/√2 reduction when the two realizations have the same level of quantization noise. The averaging is only beneficial if the noisier realization has less than about 1.7 times (√3) the quantization noise of the quieter one. I included a simple simulation of this in the zip file in Attachment 2 for my own reference.
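A quick numerical check of the averaging argument (with made-up noise levels, modeling the two realizations as the same signal plus independent white quantization noise):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 2**20
    sig = np.sin(2*np.pi*0.01*np.arange(N))

    q1, q2 = 1.0, 1.3                      # hypothetical noise levels of the two realizations
    y1 = sig + q1*rng.normal(size=N)
    y2 = sig + q2*rng.normal(size=N)
    y_avg = 0.5*(y1 + y2)

    print(np.std(y1 - sig), np.std(y_avg - sig))
    # averaged noise ~ sqrt(q1**2 + q2**2)/2: a 1/sqrt(2) reduction when q1 = q2,
    # and a net win only while q2 < sqrt(3)*q1 ~ 1.73*q1, matching the ~1.7 figure above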

Another thought I had is that the transpose of this low-noise state-space filter (Fig. 5.26), or even of LIGO's current filter realization, would yield even lower quantization noise, because both transposes require one fewer calculation.

Attachment 1: averagefiltering.pdf
Attachment 2: AveragingFilter.py.zip
  17144   Mon Sep 19 20:21:06 2022   Tega   Update   Computers   1X7 and 1X6 work

[Tega, Paco, JC]


We moved the GPS network time server and the frequency distribution amplifier from 1X7 to 1X6, and the PEM AA, ADC adapter and Martian network switch from 1X6 to 1X7. We also mounted the Dolphin IX switch at the rear of 1X7 together with the DAQ and Martian switches. This cleared up enough space to mount all the front-ends; however, we found that the mounting brackets for the front-ends do not fit in the 1X7 rack, so I have decided to mount them on the upper part of the teststand for now while we come up with a fix for this problem. Attachments 1 to 3 show the current state of racks 1X6, 1X7 and the teststand.

 

Attachment 1: Front of racks 1X6 and 1X7

Attachment 2: Rear of rack 1X7

Attachment 3: Front of teststand rack


Plan for the remainder of the week

Tuesday

  • Setup the 6 new front-ends to boot off the FB1 clone.
  • Test PCIe I/O cables by connecting them btw the front-ends and teststand I/O chassis one at a time to ensure they work
  • Then lay the fiber cables to the various I/O chassis.

Wednesday

  • Migrate the current models on the 6 front-ends to the new system.
  • Replace RFM IPC parts with Dolphin IPC parts in the c1rfm model running on the c1sus machine
  • Replace the RFM parts in the c1iscex and c1iscey models
  • Drop the c1daf and c1oaf models from the c1lsc machine, since the new front-ends only have 6 cores
  • Build and install models

Thursday [CAN I GET THE IFO ON THIS DAY PLEASE?]

  • Complete any remaining model work
  • Connect all I/O chassis to their respective (new) front-ends and see if we can start the models (need to think of a safe way to do this; should we disconnect the coil drivers before starting the models?)

Friday

  • Tie-up any loose ends
Attachment 1: IMG_20220919_204013819.jpg
Attachment 2: IMG_20220919_203541114.jpg
Attachment 3: IMG_20220919_203458952.jpg
  17148   Tue Sep 20 23:06:23 2022   Tega   Update   Computers   Setup the 6 new front-ends to boot off the FB1 clone

[Tega, Radhika, JC]

We wired the front-ends for power, DAQ and Martian network connections, then moved the I/O chassis from the bottom of the rack to the middle, just above the KVM switch, so that the top of the I/O chassis stays open for access to the ports of the OSS target adapter card when testing the extension fiber cables.

Attachment 1 (top -> bottom)

c1sus2

c1iscey

c1iscex

c1ioo

c1sus

c1lsc


When I turned on the teststand with the new front-ends, after a few minutes the power to 1X7 was cut off, due to overloading I assume. This brought down nodus, chiara and fb1. After Paco reset the tripped breaker, everything came back without us actually doing anything, which is an interesting observation.


After this event, I moved the test stand power plug to the side wall rail socket. This seems fine so far. I then brought chiara (clone) and FB1 (clone) online. Here are some changes I made to get things going:

Chiara (clone)

  • Edited '/etc/dhcp/dhcpd.conf' to update the MAC address of the front-ends to match the new machines, then run
  • sudo service isc-dhcp-server restart
  • then restart front-ends
  • Edited '/etc/hosts' on chiara to include c1iscex and c1iscey as these were missing

 

FB1 (clone)

Getting the new front-ends booting off FB1 clone:

1. I found that the KVM screen was flooded with setup info about the Dolphin cards on the LLO machines. This actually prevented login via the KVM switch on two of these machines. Strangely, one of them, c1sus, seemed to be fine (see Attachment 2), so I guessed this was because the Dolphin network had already been configured earlier when we were testing the Dolphin communications. So I decided to configure the remaining Dolphin cards. To do so, we do the following:

Dolphin Configuration:

1a. Ideally running

sudo /opt/DIS/sbin/dis_mkconf -fabrics 1 -sclw 8 -stt 1 -nodes c1lsc c1sus c1ioo c1iscex c1iscey c1sus2 -nosessions

should set up all the nodes, but this did not happen. In fact, I could no longer use the '/opt/DIS/sbin/dis_admin' GUI after running this operation and restarting the 'dis_networkmgr.service' via

sudo systemctl restart dis_networkmgr.service

so  I logged into each front-end and configured the dolphin adapter there using

sudo /opt/DIS/sbin/dis_config

After this I shut down FB1 (clone), because restarting it earlier didn't work, waited a few minutes, and then started it again. Everything was fine afterward, although I am not quite sure what solved the issue as I tried a few things, and I was glad to see the problem go!

1b. After configuring all the Dolphin nodes, I found that 2 of them failed the '/opt/DIS/sbin/dis_diag' test with an error message suggesting three possible issues, one of which was 'faulty cable'. I looked at the units in question and found that swapping both cables with the remaining spares solved the problem, so it seems these cables are faulty (need to double-check this). Attachment 3 shows the current state of the Dolphin nodes on the front-ends and the Dolphin switch.


2. I noticed that the NFS mount service for the mount points '/opt/rtcds' and '/opt/rtapps' in /etc/fstab exited with an error, so I ran 

sudo mount -a

3. Edited '/etc/hosts' to include c1iscex and c1iscey, as these were missing

 

Front-ends

To test the PCIe extension fiber cables that connect the front-ends to their respective I/O chassis, we run the following command (after booting the machine with the cable connected): 

controls@c1lsc:~$ lspci -vn | grep 10b5:3
    Subsystem: 10b5:3120
    Subsystem: 10b5:3101

If we see the output above, then both the cable and the OSS card are fine (we know from previous tests that the OSS card in the I/O chassis is good). Since we only have one I/O chassis, we repeat the step above 8 times, also cycling through the six new front-ends as we go, so that we also test the installed OSS host adapter cards. I was able to test 4 cables and 4 OSS host cards (c1lsc, c1sus, c1ioo, c1sus2), but the remaining results were inconclusive: they seem to suggest that 3 out of the remaining 5 fiber cables are faulty, which in itself would be unfortunate, but I started questioning the reliability of the test when I went back to check the 2 remaining OSS host cards using a cable that had passed the test earlier and it didn't pass. After a few retries, I decided to call it a day before I lose my mind. These tests need to be redone tomorrow.

 

Note: We were unable to lay the cables today because these tests were not complete, so we are a bit behind the plan. We'll see if we can catch up tomorrow.

 

Quote:

Plan for the remainder of the week

Tuesday

  • Setup the 6 new front-ends to boot off the FB1 clone.
  • Test PCIe I/O cables by connecting them btw the front-ends and teststand I/O chassis one at a time to ensure they work
  • Then lay the fiber cables to the various I/O chassis.

 

Attachment 1: IMG_20220921_084220465.jpg
Attachment 2: dolphin_err_init_state.png
Attachment 3: dolphin_final_state.png
  17151   Wed Sep 21 17:16:14 2022   Tega   Update   Computers   Setup the 6 new front-ends to boot off the FB1 clone

[Tega, JC]

We laid 4 out of 6 fiber cables today. The remaining 2 cables are for the I/O chassis at the vertex, so we will test and lay those tomorrow. We were also able to identify the problems with the 2 supposedly faulty cables, which turned out not to be faulty: one had a small bend in the connector that I was able to straighten out with a small plier, and the other had a loose connection at the switch end. So there was no faulty cable, which is great! Chris wrote a MATLAB script that migrates all the model files. I am going through them, i.e. looking at the CDS parameter block to check that all is well. The next task is to build and install the updated models. We also need to update the '/opt/rtcds' and '/opt/rtapps' directories to the latest from the 40m chiara system.

 

  17153   Thu Sep 22 20:57:16 2022   Tega   Update   Computers   build, install and start 40m models on teststand

[Tega, Chris]

We built, installed and started all the 40m models on the teststand today. The configuration we employ is to connect c1sus to the teststand I/O chassis and use Dolphin to send the timing to the other front-ends. To get this far, we encountered a few problems that were solved by doing the following:

0. Fixed frontend timing sync to FB1 via ntp

1. Set the rtcds enviroment variable `CDS_SRC=/opt/rtcds/userapps/trunk/cds/common/src` in the file '/etc/advligorts/env'

2. Resolved chgrp error during models installation using sticky bits on chiara, i.e. `sudo chmod g+s -R /opt/rtcds/caltech/c1/target`

3. Replaced `sqrt` with `lsqrt` in `RMSeval.c` to eliminate compilation error for c1ioo

4. Created a symlink for 'activateDQ.py' and 'generate_KisselButton.py' in '/opt/rtcds/caltech/c1/post_build'

5. Installed and configured dolphin for new frontend 'c1shimmer'

6. Replaced 'RFM0' with 'PCIE' in the ipc file, '/opt/rtcds/caltech/c1/chans/ipc/C1.ipc'

 

We still have a few issues namely:

1. The user models are not running smoothly: the CPU usage jumps to its maximum value every second or so.

2. c1omc seemed to be unable to get its timing from its IOP model (issue resolved by changing the CDS block parameter 'specific_cpu' from 6 to 4, because the new FEs only have 6 cores, 0-5).

3. The need to load the `dolphin-proxy-km` library and start the `rts-dolphin_daemon` service whenever we reboot the front-end

Attachment 1: dolphin_state_plus_c1shimmer.png
Attachment 2: FE_status_overview.png
  17158   Fri Sep 23 19:07:03 2022   Tega   Update   Computers   Work to improve stability of 40m models running on teststand

[Chris, Tega]

Timing glitch investigation:

  • Moved the Dolphin transmit node from c1sus to c1lsc because we suspect that the glitch might be coming from the c1sus machine (earlier, c1pem on c1sus was running faster than realtime).
  • Installed and started c1oaf to remove the shared memory IPC error to/from c1lsc model
  • /opt/DIS/sbin/dis_diag gives two warnings on c1sus2
    • [WARN] IXH Adapter 0 - PCIe slot link speed is only Gen1
    • [WARN] Node 28 not reachable, but is an entry in the dishosts.conf file - c1shimmer is currently off, so this is fine.

DAQ network setup:

  • Added the DAQ ethernet MAC address  and fixed IPV4 address for the front-ends to '/etc/dhcp/dhcpd.conf'
  • Added the fixed DAQ IPV4 address and port for all the front-ends to '/etc/advligorts/subscriptions.txt' for `cps_recv` service
  • Edited '/etc/advligorts/master' by including all the iop and user models '.ini' files in '/opt/rtcds/caltech/c1/chans/daq/' containing channel info and the corresponding tespoint files in '/opt/rtcds/caltech/c1/target/gds/param/'
  • Created a systemd environment file for each front-end in '/diskless/root/etc/advligorts/' containing the arguments for the local data concentrator and DAQ data transmitter (`local_dc_args` and `cps_xmit_args`). While we were facing timing glitch issues we staggered the delay (-D waitValue) of each front-end by setting it to the last number in its DAQ IP address, but we should probably set it back to zero to see if it has any effect.

Other:

  • Edited /etc/resolv.conf on fb1 and 'diskless/root' to enable name resolution via for example `host c1shimmer` but the file gets overwritten on chiara for some reason

Issues:

  1. Frame writing is not working at the moment. It did at some point in the past for a couple of days but stopped working earlier today and we can't quite figure out why. 
  2. We can't get data via diaggui or ndscope either. Again, we recall this working in the past, but we are not sure why it has stopped working now.
  3. The CPU load on c1sus2 is too high, so we should split it into two models
  4. We still get the occasional IPC glitch, both for shared memory and Dolphin; see attachments
Attachment 1: dolphin_state_all_green.png
Attachment 2: dolphin_state_IPC_glitch.png
  17164   Thu Sep 29 15:12:02 2022   JC   Update   Computers   Setup the 6 new front-ends to boot off the FB1 clone

[Jamie, Christopher, JC]

This morning we decided to label the fiber optic cables. While doing this, we noticed that the two ends had different labels, 'Host' and 'Target'. It turns out the fiber optic cables are directional. Four out of six of the cables were reversed. Luckily, one cable for the 1Y3 I/O chassis already has a spare laid (the cable we are currently using). Chris, Jamie, and I have begun reversing these cables to their correct orientation.

Quote:

[Tega, JC]

We laid 4 out of 6 fiber cables today. The remaining 2 cables are for the I/O chassis at the vertex, so we will test and lay those tomorrow. We were also able to identify the problems with the 2 supposedly faulty cables, which turned out not to be faulty: one had a small bend in the connector that I was able to straighten out with a small plier, and the other had a loose connection at the switch end. So there was no faulty cable, which is great! Chris wrote a MATLAB script that migrates all the model files. I am going through them, i.e. looking at the CDS parameter block to check that all is well. The next task is to build and install the updated models. We also need to update the '/opt/rtcds' and '/opt/rtapps' directories to the latest from the 40m chiara system.

 

 

  17172   Tue Oct 4 21:00:49 2022   Chris   Update   Computers   Failed takeover attempt with the new front ends

[Jamie, JC, Chris]

Today we made a failed attempt to take over the 40m hardware with the new front ends on the test stand.

As an initial test, we connected the new c1iscey to its I/O chassis using the OneStop fiber link. This went smoothly, so we tried to proceed with the rest of the system, which uncovered several problems. Consequently, we’ve reverted control back to the old front ends tonight, and will regroup and make another attempt tomorrow.

Status summary:

  • c1iscey worked on the first try
  • c1lsc worked, after we sorted out which of the two OneStop cables that run to its rack we needed to use
  • c1sus2 sort of worked (its models have been crashing sporadically)
  • c1ioo had a busted OneStop cable, and worked after that was replaced
  • c1sus refused to work with the fiber OneStop cables (we tried several, including the known working one from c1ioo), but we jury-rigged it to run over a copper cable, after nudging the teststand rack a bit closer to the chassis
  • c1iscex refused to work with the fiber OneStop cables, and substituting copper was not an option, so we were stuck

There are various pathologies that we've seen with the OneStop interface cards in the I/O chassis. We don't seem to have the documentation for these cards, but our interpretive guesses are as follows:

  • When working, it is supposed to illuminate all the green LEDs along the top of the card, and the four next to the connector. In this state, you can run lspci -vt on the host, and see the various PLX/Contec/etc devices that populate the chassis.
  • When the cable is unplugged or bad, only four green LEDs illuminate on the card, and none by the connector. No devices from the chassis can be seen from the host.
  • On c1iscex and c1sus, when a fiber link is plugged in, it turns on all the LEDs along the top of the card, but the four next to the connector remain dark. We’re not sure yet what this is trying to tell us, but lspci finds no devices from the chassis, same as if it is unplugged.
  • Also, sometimes on c1iscex, no LEDs would illuminate at all (possibly the card was not seated properly).

Tomorrow, we plan to swap out the c1iscex I/O chassis for the one in the test stand, and see if that lets us get the full system up and running.
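
For quick checks at each front end, the lspci test above could be scripted; here is a minimal sketch (assuming, as noted above, that a healthy OneStop link exposes PLX/Contec devices in the lspci output):

#!/usr/bin/env python3
# Sketch: is the I/O chassis visible over the OneStop link?
# Assumption: a working link exposes PLX/Contec devices in lspci.
import subprocess

out = subprocess.check_output(["lspci"], text=True)
devices = [line for line in out.splitlines()
           if "PLX" in line or "Contec" in line]

if devices:
    print(f"I/O chassis looks alive ({len(devices)} PLX/Contec devices):")
    print("\n".join(devices))
else:
    print("No PLX/Contec devices found -- chassis link is probably down.")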

  17173   Thu Oct 6 07:29:30 2022 ChrisUpdateComputersSuccessful takeover attempt with the new front ends

[JC, Chris]

Last night’s CDS upgrade attempt succeeded in taking over the IFO. If the IFO users are willing, let’s try to run with it today.

The new system was left in control of the IFO hardware overnight, to check its stability. All looks OK so far.

The next step will be to connect the new FEs, fb1, and chiara to the martian network, so they’re directly accessible from the control room workstations (currently the system remains quarantined on the teststand network). We’ll also need to bring over the changes to models, scripts, etc that have been made since Tega’s last sync of the filesystem on chiara.

The previous elog noted a mysterious broken state of the OneStop link between FE and IO chassis, where all green LEDs light up on the OneStop board in the IO chassis, except the four next to the fiber link connector. This was seen on c1sus and c1iscex. It was recoverable last night on c1iscex, by fully powering down both FE computer and chassis, waiting a bit, and then powering up chassis and computer again. Currently c1sus is running with a copper OneStop cable because of the fiber link troubles we had, but this procedure should be tried to see if one of the fiber links can be made to work after all.

In order to string the short copper OneStop cable for c1sus, we had to move the teststand rack closer to the IO chassis, up against the back of 1X6/1X7. This is a temporary state while we prepare to move the FEs to their new rack. It hopefully also allows sufficient clearance to the exit door to pass the upcoming fire inspection.

At first, we connected the teststand rack’s power cables to the receptacle in 1X7, but this eventually tripped 1X7’s circuit breaker in the wall panel. Now, half of the teststand rack is on the receptacle in 1X6, and the other half is on 1X7 (these are separate circuits).

After the breaker trip, daqd couldn’t start. It turned out that no data was flowing to it, because the power cycle caused the DAQ network switch to forget a setting I had applied to enable jumbo frames on the network. The configuration has now been saved so that it should apply automatically on future restarts. For future reference, the web interface of this switch is available by running firefox on fb1 and navigating to 10.0.113.254.

When the FE machines are restarted, a GPS timing offset in /sys/kernel/gpstime/offset sometimes fails to initialize. It shows up as an incorrect GPS time in /proc/gps and on the GDS_TP MEDM screens, and prevents the data from getting timestamped properly for the DAQ. This needs to be looked at and fixed soon. In the meantime, it can be worked around by setting the offset manually: look at the value on one of the FEs that got it right, and apply it using sudo sh -c "echo CORRECT_OFFSET >/sys/kernel/gpstime/offset".
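
As a convenience, that manual workaround could be wrapped in a quick check. The following is only a sketch (the host names are the FEs mentioned in this entry, and passwordless ssh from the workstation is assumed), not a supported tool:

#!/usr/bin/env python3
# Sketch: compare the GPS offset loaded on each front end.
import subprocess

FRONT_ENDS = ["c1lsc", "c1sus", "c1ioo", "c1iscex", "c1iscey", "c1sus2"]

for fe in FRONT_ENDS:
    try:
        out = subprocess.check_output(
            ["ssh", fe, "cat /sys/kernel/gpstime/offset"], timeout=5, text=True)
        print(f"{fe}: offset = {out.strip()}")
    except Exception as err:
        print(f"{fe}: could not read offset ({err})")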

In the first ~30 minutes after the system came up last night, there were transient IPC errors, caused by drifting timestamps while the GPS cards in the FEs got themselves resynced to the satellites. Since then, timing has remained stable, and no further errors occurred overnight. However, the timing status is still reported as red in the IOP state vectors. This doesn’t seem to be an operational problem and perhaps can be ignored, but we should check it out later to make sure.

Also, the DAC cards in c1ioo and c1iscey reported FIFO EMPTY errors, triggering their DACKILL watchdogs. This situation may have existed in the old system and gone undetected. To bypass the watchdog, I’ve added the optimizeIO=1 flag to the IOP models on those systems, which makes them skip the empty FIFO check. This too should be further investigated when we get a chance.

  17473   Mon Feb 20 15:27:32 2023 ranaUpdateComputerspianosa software issues

notes 20-Feb-2023

* emacs doesn't run in base conda path (cds). LD_LIBRARY_PATH errors

* works fine if I do 'conda deactivate'. Something weird with our conda env?

* did apt install emacs, xemacs, terminator,

* these should be in the default workstation install

* arrgg!
(cds) ~>dataviewer
bash: dataviewer: command not found

* looks like the apt installs made the dataviewer executable disappear?
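
* a quick sanity check (just a sketch; the program names are only examples) is to run the snippet below both inside and outside the cds env and compare:

import os
import shutil

# which executables does this env resolve, and what LD_LIBRARY_PATH is set?
for prog in ("emacs", "dataviewer", "diaggui"):
    print(f"{prog:12s} -> {shutil.which(prog)}")
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))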

  309   Mon Feb 11 22:44:29 2008 robConfigurationDAQChange in channel trending on C1SUS2

I removed the MC2 optical lever related channels from trends, and the SRM POUT and YOUT (as these are redundant if we're also trending PERROR and YERROR). I did this because the c1susvme2 processor was having bursts of un-syncy lateness every ~15 seconds or so, and I suspected this might interfere with locking activities. This behaviour appears to have been happening for a month or so, and has been getting steadily worse. Rebooting did not fix the issue, but it appears that removing some trends has actually helped. Attached is a 50 day trend of the c1susvme2 sync-fe monitor.
Attachment 1: srm_sync.png
srm_sync.png
  422   Wed Apr 16 21:11:12 2008 ranaSummaryDAQAA/AI Filters for the DAQ & FE systems
I used Foton to make up some new filters which will be used all over the project in order to downsample/upsample.

There will be 2 flavors:

  • The first one will be a downsampling filter for use in the DAQ system.
    Whenever you specify a sampling rate in the .ini files below the natural rate of the ADC,
    the data will be downsampled using this filter (called ULYAW_0 in the plot). This one was
    designed for a flat passband and a 'good' stopband, with no particular care given to the phase shift.

  • The second one will be used in the FE systems to downsample the ADC signal which is often
    sampled at 64 kHz down to something manageable like 2k or 16k. This one was tweaked for
    getting less phase lag in the 'control' band (usually 3x or so below Nyquist).

Here is the associated filter file:
# SAMPLING ULYAW 16384
# DESIGN   ULYAW 0 zpk([0.512+i*1024;0.512-i*1024;2.048+i*2048;2.048-i*2048], \
#                      [515.838+i*403.653;515.838-i*403.653;318.182+i*623.506;318.182-i*623.506;59.2857+i*827.88; \
#                      59.2857-i*827.88],0.988553,"n")
# DESIGN   ULYAW 1 zpk([0.512513+i*1024;0.512513-i*1024;1.53754+i*2048;1.53754-i*2048], \
#                      [200+i*346.41;200-i*346.41;45+i*718.592;45-i*718.592],1,"n")
# DESIGN   ULYAW 2 zpk([0.768769+i*1024;0.768769-i*1024;1.53754+i*2048;1.53754-i*2048], \
#                      [194.913-i*331.349;194.913+i*331.349;53.1611+i*682.119;53.1611-i*682.119],1,"n")
###                                                                          ###
ULYAW    0 21 3      0      0 DAQAA         0.00091455950698073    -1.62010355523604     0.67259370084279    -1.84740554170818     0.99961738977942
                                                                   -1.72089534598832     0.78482029284220    -1.41321371411946     0.99858678588255
                                                                   -1.85800352005967     0.95626992044093     2.00000000000000     1.00000000000000
ULYAW    1 21 2      0      0 FEAA            0.018236566955641    -1.83622978049494     0.85804776530302    -1.84740518752455     0.99961700649533
                                                                   -1.89200532023258     0.96649324616546    -1.41346289594856     0.99893883979950
ULYAW    2 21 2      0      0 ELP             0.015203943102927    -1.84117829296043     0.86136943504058    -1.84722827171918     0.99942556512240
                                                                   -1.89339022414279     0.96048849609619    -1.41346289594856     0.99893883979950
Attachment 1: DAQ_filters_080416.pdf
DAQ_filters_080416.pdf
  691   Thu Jul 17 16:39:58 2008 Max JonesUpdateDAQMagnetometer Installed
Today I installed the magnetometer near the beam splitter chamber. It is located on the BSC chamber at head height, on the inner part of the interferometer (meaning I had to crawl under the arms to install it). I don't think I disturbed anything during installation, but I think it's probably prudent to tell everyone that I was back there just in case. I plan to run 3 BNC cables (one for each axis) from the magnetometer to the DAQ input either tonight or tomorrow. Suggestions are appreciated. - Max.
  857   Tue Aug 19 19:14:17 2008 YoichiConfigurationDAQFixed C1:IOO-MC_RFAMPDDC
Yoichi, Rob

C1:IOO-MC_RFAMPDDC, which is a PD at the transmission port of the MC, was not recording sensible values.
So I tracked down the problem starting from the centering of the beam on the PD.
The beam was hitting the PD properly. The DC output BNC on the PD provided +1.25V output when the light was
falling on the PD. The PD is fine.
The flat cable from the PD runs to the IOO rack and fed into the LSC PD interface card.
The output from the interface card is connected to a VMIC3113A DAQ card, through cross connects.
The voltages on the cross connects were ok.
The VMIC3113A was controlled by an EPICS machine (c1iool0). So it provides only a slow channel.
By looking at C1IOOF.ini and tpchn_C1.par, I figured that C1:IOO-MC_RFAMPDDC is using chnnum=13639 in the RFM
network and it is named C1:IOO-ICS_CHAN_15 in the .par file. So it is reading values from the ICS DAQ board.
Actually nothing was connected to the channel 15 of the ICS board and that was why C1:IOO-MC_RFAMPDDC was reading
nothing. So I took the PD signal from the cross connect and hooked it up to the Ch15 of the ICS DAQ through
the large black break out box with 4-pin LEMOs. Now C1:IOO-MC_RFAMPDDC reads the DC output of the PD.
I also put an ND filter in front of the RFAMPD to avoid the saturation of the ADC. The attenuation should have been done
electronically, but I was too lazy. Since the ND filter changes the Stochmon values, someone should remove it and reduce the
gain of the LSC PD interface accordingly.
  1184   Sun Dec 7 16:12:53 2008 ranaUpdateDAQbooted awg
because it was red
  1292   Wed Feb 11 10:52:22 2009 YoichiConfigurationDAQC1:PEM-OSA_APTEMP and C1:PEM-OSA_SPTEMP disconnected

During the cleanup of the lab, Steve found a box with two BNCs going to the ICS DAQ interface and an unconnected D-SUB on the floor under the AP table.  It seemed like a temperature sensor.

The BNCs were connected to C1:PEM-OSA_APTEMP and C1:PEM-OSA_SPTEMP.

Steve removed the box from the floor. These channels can now be used as spare DAQ channels. I labeled those cables.

  1769   Tue Jul 21 17:01:18 2009 peteDAQDAQtemp channel PEM-PETER_FE

I added a temporary channel, connected to input 9 on the PEM ADCU. Beware of inputs 30, 31, and 32: I tried 32 and it only gave noise.

 

 

  1973   Tue Sep 8 15:14:26 2009 rana, alexConfigurationDAQRAID update to Framebuilder: directories added + lookback increased

 Alex logged in around 10:30 this morning and, at our request, adjusted the configuration of fb40m to have 20 days of lookback.

I wasn't able to get him to elog, but he did email the procedure to us:


1) create a bunch of new "Data???" directories in /frames/full
2) change the setting in /usr/controls/daqdrc file
       set num_dirs=480;

my guess is that the next step is:

3) telnet fb0 8087

    daqd>  shutdown

I checked and we do, in fact, now have 480 directories in /frames/full and are so far using up 11% of our 13TB capacity. Let's try to remember to check up on this so that it doesn't get overfull and crash the framebuilder.
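
A quick way to keep an eye on this (just a sketch; the path and directory count are taken from this entry) could be:

import os
import shutil

FRAMES = "/frames/full"
NUM_DIRS = 480

# count the Data??? directories and report how full the RAID is
existing = [d for d in os.listdir(FRAMES) if d.startswith("Data")]
print(f"{len(existing)} of {NUM_DIRS} Data??? directories present")

total, used, _ = shutil.disk_usage(FRAMES)
print(f"/frames usage: {100.0 * used / total:.1f}% of {total / 1e12:.1f} TB")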

  2073   Fri Oct 9 01:31:56 2009 ranaConfigurationDAQtpchn mystery

Does anyone know if this master file is the real thing that's in use now? Are we really using a file called tpchn_C1_new.par? If anyone sees Alex, please get to the bottom of this.

allegra:daq>pwd
/cvs/cds/caltech/chans/daq
allegra:daq>more master
/cvs/cds/caltech/chans/daq/C1ADCU_PEM.ini
#/cvs/cds/caltech/chans/daq/C1ADCU_SUS.ini
/cvs/cds/caltech/chans/daq/C1LSC.ini
/cvs/cds/caltech/chans/daq/C1ASC.ini
/cvs/cds/caltech/chans/daq/C1SOS.ini
/cvs/cds/caltech/chans/daq/C1SUS_EX.ini
/cvs/cds/caltech/chans/daq/C1SUS_EY.ini
/cvs/cds/caltech/chans/daq/C1SUS1.ini
/cvs/cds/caltech/chans/daq/C1SUS2.ini
#/cvs/cds/caltech/chans/daq/C1SUS4.ini
/cvs/cds/caltech/chans/daq/C1IOOF.ini
/cvs/cds/caltech/chans/daq/C1IOO.ini
/cvs/cds/caltech/chans/daq/C0GDS.ini
/cvs/cds/caltech/chans/daq/C0EDCU.ini
/cvs/cds/caltech/chans/daq/C1OMC.ini
/cvs/cds/caltech/chans/daq/C1ASS.ini
/cvs/cds/gds/param/tpchn_C1_new.par
/cvs/cds/gds/param/tpchn_C2.par
/cvs/cds/gds/param/tpchn_C3.par

  2075   Fri Oct 9 14:23:53 2009 Alex IvanovConfigurationDAQtpchn mystery

"Yes. This master file is used."

Quote:

Does anyone know if this master file is the real thing that's in use now? Are we really using a file called tpchn_C1_new.par? If anyone sees Alex, please get to the bottom of this.

allegra:daq>pwd
/cvs/cds/caltech/chans/daq
allegra:daq>more master
/cvs/cds/caltech/chans/daq/C1ADCU_PEM.ini
#/cvs/cds/caltech/chans/daq/C1ADCU_SUS.ini
/cvs/cds/caltech/chans/daq/C1LSC.ini
/cvs/cds/caltech/chans/daq/C1ASC.ini
/cvs/cds/caltech/chans/daq/C1SOS.ini
/cvs/cds/caltech/chans/daq/C1SUS_EX.ini
/cvs/cds/caltech/chans/daq/C1SUS_EY.ini
/cvs/cds/caltech/chans/daq/C1SUS1.ini
/cvs/cds/caltech/chans/daq/C1SUS2.ini
#/cvs/cds/caltech/chans/daq/C1SUS4.ini
/cvs/cds/caltech/chans/daq/C1IOOF.ini
/cvs/cds/caltech/chans/daq/C1IOO.ini
/cvs/cds/caltech/chans/daq/C0GDS.ini
/cvs/cds/caltech/chans/daq/C0EDCU.ini
/cvs/cds/caltech/chans/daq/C1OMC.ini
/cvs/cds/caltech/chans/daq/C1ASS.ini
/cvs/cds/gds/param/tpchn_C1_new.par
/cvs/cds/gds/param/tpchn_C2.par
/cvs/cds/gds/param/tpchn_C3.par

 

  3216   Wed Jul 14 11:54:33 2010 josephbUpdateDAQDebugging Guralp and reboots

This is in regard to the zero signal being reported by the channels C1:PEM-SEIS_GUR1_X, C1:PEM-SEIS_GUR1_Y, and C1:PEM-SEIS_GUR1_Z.

I briefly swapped Guralp 1 EW and Guralp 2 EW to confirm to myself that the problem was not on the Guralp end (although the fact that it is a digital zero is already highly indicative of a problem in the digital realm).  I then unplugged the 17-32, and then the 1-16, channel connections to the 110B.  I saw floating noise on the GUR2 channels, but still digital zero on the GUR1 channels, which means it is not the BNC breakout box.

There was a spare 110B, unconnected in the crate, so to do a quick test of the 110B, I turned off the crate and swapped the 110Bs, after copying the switch configuration of the first 110B to the second one.  The original 110B was labeled ADC 1, while the second 110B was labeled ADC 0.  The switches were identical except for the ones closest to the Dsub connectors on the front.  All those switches in that set were to the right, when looking down at the switches and the Dsub connectors pointing towards yourself.

Unfortunately, c0dcu1 never seemed to come up with the new 110B (ADC 0).  So we put the original 110B back and turned the crate back on.

The fb then didn't seem to come back quite right.  We tried rebooting fb40m, but it's still red with status 1.  c0daqctrl is green, but c0dcu1 is red, although I'm not positive whether that's due to fb40m being in a strange state.  Jenne tried a telnet into port 8087 and a shutdown, but that didn't seem to help.  At this point, we're going to contact Alex when he gets in around 12:30.

 

  3220   Wed Jul 14 16:39:06 2010 JenneUpdateDAQDebugging Guralp and reboots

[Joe, Jenne]

Joe got on the phone with Alex, and Alex's magic intuition told him to ask about the RFM switch.  The C0DAQ_CTRL overload light was orange.  Alex suggested hitting the reset button on that RFM switch, which we did. That fixed everything -> c0dcu1 came back, as did the frame builder.  Rana had pointed out earlier that we could have brought back all of the other front ends and enabled the damping of the optics even though the FB was still down.  It's okay to leave the front ends & watchdogs on, and just reboot the FB, AWG, and DAQ_CTRL computers if that is necessary.

Anyhow, once the FB was back online, we got around to bringing back all of the front ends (as usual, except for the ones which are unplugged because they're in the middle of being upgraded).  Everything is back online now.

After all of this craziness, all of the Guralp channels are working happily again. It is still unknown why they started reading digital zero, but they're back again. Maybe I should have rebooted the frame builder in addition to c0dcu1 last night?

 

Quote:

This is in regard to the zero signal being reported by the channels C1:PEM-SEIS_GUR1_X, C1:PEM-SEIS_GUR1_Y, and C1:PEM-SEIS_GUR1_Z.

I briefly swapped Guralp 1 EW and Guralp 2 EW to confirm to myself that the problem was not on the Guralp end (although the fact that it is a digital zero is already highly indicative of a problem in the digital realm).  I then unplugged the 17-32, and then the 1-16, channel connections to the 110B.  I saw floating noise on the GUR2 channels, but still digital zero on the GUR1 channels, which means it is not the BNC breakout box.

There was a spare 110B, unconnected in the crate, so to do a quick test of the 110B, I turned off the crate and swapped the 110Bs, after copying the switch configuration of the first 110B to the second one.  The original 110B was labeled ADC 1, while the second 110B was labeled ADC 0.  The switches were identical except for the ones closest to the Dsub connectors on the front.  All those switches in that set were to the right, when looking down at the switches and the Dsub connectors pointing towards yourself.

Unfortunately, c0dcu1 never seemed to come up with the new 110B (ADC 0).  So we put the original 110B back and turned the crate back on.

The fb then didn't seem to come back quite right.  We tried rebooting fb40m, but it's still red with status 1.  c0daqctrl is green, but c0dcu1 is red, although I'm not positive whether that's due to fb40m being in a strange state.  Jenne tried a telnet into port 8087 and a shutdown, but that didn't seem to help.  At this point, we're going to contact Alex when he gets in around 12:30.

 

 

  3247   Mon Jul 19 21:47:36 2010 ranaSummaryDAQDAQ timing test

Since we now have a good measurement of the phase noise of the Marconi locked to the Rb clock, I wanted to use that to check out the old DAQ system:

I used Megan's phase noise setup - Marconi #2 is putting out 11000013 Hz at 13 dBm into the ZP-3MH mixer. Marconi #1 is putting out 3 dBm at 11000000 Hz into the RF input.

The output goes through a 50 Ohm load and then a Mini-Circuits BNC LP filter (either 2 or 5 MHz). Then an SR560 set for low noise, G = 5, AC coupling, 1-pole LP @ 1 kHz.

This SR560 output goes into the channel C1:IOO-MC_DRUM1 (which is sampled at 16384 Hz with ICS-110B after the usual Sander Liu AA chassis containing the INA134s).

  3299   Tue Jul 27 16:03:36 2010 ranaSummaryDAQDAQ timing test

Quote:

Since we now have a good measurement of the phase noise of the Marconi locked to the Rb clock, I wanted to use that to check out the old DAQ system:

I used Megan's phase noise setup - Marconi #2 is putting out 11000013 Hz at 13 dBm into the ZP-3MH mixer. Marconi #1 is putting out 3 dBm at 11000000 Hz into the RF input.

The output goes through a 50 Ohm load and then a Mini-Circuits BNC LP filter (either 2 or 5 MHz). Then an SR560 set for low noise, G = 5, AC coupling, 1-pole LP @ 1 kHz.

This SR560 output goes into the channel C1:IOO-MC_DRUM1 (which is sampled at 16384 Hz with ICS-110B after the usual Sander Liu AA chassis containing the INA134s).

 This is the 0.3 mHz BW spectrum of this test. As you can see, the apparent linewidth (assuming the width is entirely caused by DAQ jitter) is comparable to the BW and is therefore not resolved.

Basically, the Hanning window function is not sharp enough to do this test and so I will do it offline in Matlab.
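
One way to do the offline analysis (sketched here in Python rather than Matlab; the 13 Hz beat and 16384 Hz rate come from the entry quoted above, and the synthetic x below is only a stand-in for the recorded C1:IOO-MC_DRUM1 data) is to demodulate the beat digitally and look at its residual phase, instead of trying to resolve the linewidth with a windowed spectrum:

import numpy as np
from scipy import signal

fs, f_beat = 16384.0, 13.0
t = np.arange(int(600 * fs)) / fs
x = np.sin(2 * np.pi * f_beat * t)  # stand-in: replace with the recorded MC_DRUM1 data

# complex demodulation at the nominal beat frequency, then ~1 Hz low pass
z = x * np.exp(-2j * np.pi * f_beat * t)
b, a = signal.butter(4, 1.0, fs=fs)
z_lp = signal.filtfilt(b, a, z.real) + 1j * signal.filtfilt(b, a, z.imag)
phase = np.unwrap(np.angle(z_lp))  # residual beat phase [rad]

# PSD of the residual phase; if attributed entirely to the DAQ sample clock,
# the equivalent timing jitter is phase / (2*pi*f_beat).
f, pxx = signal.welch(phase, fs=fs, nperseg=2**18)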

Attachment 1: Untitled.png
Untitled.png
  3657   Wed Oct 6 00:32:01 2010 ranaSummaryDAQNDS2

This is the link to the NDS2 webpage:

https://www.lsc-group.phys.uwm.edu/daswg/wiki/NetworkDataServer2

We should install this so that we can use this modern interface to get 40m data from outside and inside of the 40m.
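
Once it is installed, fetching 40m data should look roughly like this minimal sketch using the nds2-client Python bindings (the server name, port, and GPS times below are placeholders, not the real 40m settings):

import nds2

# placeholder server/port -- substitute the real 40m NDS2 server once it is set up
conn = nds2.connection("nds.example.caltech.edu", 31200)

start, stop = 1000000000, 1000000064  # GPS seconds (placeholders)
bufs = conn.fetch(start, stop, ["C1:IOO-MC_DRUM1"])

for buf in bufs:
    print(buf.channel.name, buf.channel.sample_rate, len(buf.data))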
