ID   Date   Author   Type   Category   Subject
  16154   Sun May 23 18:28:54 2021   Jon   Update   CDS   Opto-isolator for c1auxey

The new HAM-A coil drivers have a single DB9 connector for all the binary inputs. This requires that the dewhitening switching signals from the fast system be spliced with the coil enable signals from c1auxey. There is a common return for all the binary inputs. To avoid directly connecting the grounds of the two systems, I have looked for a suitable opto-isolator for the c1auxey signals.

The best option I found is the Ocean Controls KTD-258, a 4-channel, DIN-rail-mounted opto-isolator supporting input/output voltages of up to 30 V DC. It is an active device and can be powered from the same 15 V supply that currently powers both the Acromags and the excitation. I ordered one unit to be trialed in c1auxey. If this is found to be a good solution, we will order more for the upgrades of c1auxex and c1susaux, as required for compatibility with the new suspension electronics.

  16166   Fri May 28 10:54:59 2021   Jon   Update   CDS   Opto-isolator for c1auxey

I have received the opto-isolator needed to complete the new c1auxey system. I left it sitting on the electronics bench next to the Acromag chassis.

Here is the manufacturer's wiring manual. It should be wired to the +15V chassis power and to the common return from the coil driver, following the instructions therein for NPN-style signals. Note that there are two sets of DIP switches (one on the input side and one on the output side) for selecting the mode of operation. These should all be set to "NPN" mode.

Attachment 1: optoisolator.jpeg
optoisolator.jpeg
  16167   Fri May 28 11:16:21 2021   Jon   Update   CDS   Front-End Assembly and Testing

An update on recent progress in the lab towards building and testing the new FEs.

1. Timing problems resolved / FE BIOS changes

The previously reported problem with the IOPs losing sync after a few minutes (16130) was resolved through a change in BIOS settings. However, there are many required settings and it is not trivial to get these right, so I document the procedure here for future reference.

The CDS group has a document (T1300430) listing the correct settings for each type of motherboard used in aLIGO. All of the machines received from LLO contain the oldest motherboards: the Supermicro X8DTU. Quoting from the document, the BIOS must be configured to enforce the following:

• Remove hyper-threading so the CPU doesn’t try to run stuff on the idle core, as hyper-threading simulates two cores for every physical core.
• Minimize any system interrupts from hardware, such as USB and Serial Ports, that might get through to the ‘idled’ core. This is needed on the older machines.
• Prevent the computer from reducing the clock speed on any cores to ‘save power’, etc. We need to have a constant clock speed on every ‘idled’ CPU core.

I generally followed the T1300430 instructions but found a few adjustments were necessary for diskless and deterministic operation, as noted below. The procedure for configuring the FE BIOS is as follows:

  1. At boot-up, hit the delete key to enter the BIOS setup screen.
  2. Before changing anything, I recommend photographing or otherwise documenting the current working settings on all the subscreens, in case for some reason it is necessary to revert.
  3. T1300430 assumes the process is started from a known state and lists only the non-default settings that must be changed. To put the BIOS into this known state, first navigate to Exit > Load Failsafe Defaults > Enter.
  4. Configure the non-default settings following T1300430 (Sec. 5 for the X8DTU motherboard). On the IPMI screen, set the static IP address and netmask to their specific assigned values, but do set the gateway address to all zeros as the document indicates. This is to prevent the IPMI from trying to initiate outgoing connections.
  5. For diskless booting to continue to work, it is also necessary to set Advanced > PCI/PnP Configuration > Load Onboard LAN 1 Option Rom > Enabled.
  6. I also found it was necessary to re-enable IDE direct memory access and WHEA (Windows Hardware Error Architecture) support. Since these machines have neither hard disks nor Windows, I have no idea why these are needed, but I found that without them, one of the FEs would hang during boot about 50% of the time.
    • Advanced > PCI/PnP configuration > PCI IDE BusMaster  > Enabled.
    • Advanced > ACPI Configuration > WHEA Support > Enabled.

After completing the BIOS setup, I rebooted the new FEs about six times each to make sure the configuration was stable (i.e., would never hang during boot).

2. User models created for FE testing

With the timing issue resolved, I proceeded to build basic user models for c1bhd and c1sus2 for testing purposes. Each one has a simple structure where M ADC inputs are routed through IIR filters to an output matrix, which forms linear signal combinations that are routed to N DAC outputs. This is shown in Attachment 1 for the c1bhd case, where the signals from a single ADC are conditioned and routed to a single 18-bit DAC. The c1sus2 case is similar; however, the Contec BO modules still need to be added to that model.
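For readers less familiar with the RTS model structure, here is a minimal conceptual sketch in plain Python/NumPy (not actual RCG code; the channel counts and filter coefficients are made-up placeholders) of the signal path described above:

# Conceptual sketch of the user-model signal path:
# M ADC inputs -> per-channel IIR filters -> output matrix -> N DAC outputs.
import numpy as np
from scipy.signal import lfilter

M, N = 32, 16                       # illustrative channel counts
b, a = [0.1, 0.1], [1.0, -0.8]      # placeholder IIR filter coefficients
out_matrix = np.zeros((N, M))       # element (i, j) routes filtered ADC input j to DAC output i
out_matrix[0, 0] = 1.0              # e.g. route ADC channel 0 straight to DAC channel 0

adc_block = np.random.randn(M, 1024)                          # stand-in for one block of ADC samples
filtered = np.array([lfilter(b, a, ch) for ch in adc_block])  # per-channel IIR filtering
dac_block = out_matrix @ filtered                             # linear combinations sent to the DACs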

The FEs are now running two models each: the IOP model and one user model. The assigned parameters of each model are documented below.

Model Host CPU DCUID Path
c1x06 c1bhd 1 23 /opt/rtcds/userapps/release/cds/c1/models/c1x06.mdl
c1x07 c1sus2 1 24 /opt/rtcds/userapps/release/cds/c1/models/c1x07.mdl
c1bhd c1bhd 2 25 /opt/rtcds/userapps/release/isc/c1/models/c1bhd.mdl
c1sus2 c1sus2 2 26 /opt/rtcds/userapps/release/sus/c1/models/c1sus2.mdl

The user models were compiled and installed following the previously documented procedure (15979). As shown in Attachment 2, all the RTS processes are now working, with the exception of the DAQ server (for which we're still awaiting hardware). Note that these models currently exist only on the cloned copy of the /opt/rtcds disk running on the test stand. The plan is to copy these models to the main 40m disk later, once the new FEs are ready to be installed.

3. AA and AI chassis installed

I installed several new AA and AI chassis in the test stand to interface with the ADC and DAC cards. This includes three 16-bit AA chassis, one 16-bit AI chassis, and one 18-bit AI chassis, as pictured in Attachment 3. All of the AA/AI chassis are powered by one of the new 15V DC power strips connected to a bench supply, which is housed underneath the computers as pictured in Attachment 4.

These chassis have not yet been tested, beyond verifying that the LEDs all illuminate to indicate that power is present.

Attachment 1: c1bhd.png
c1bhd.png
Attachment 2: gds_tp.png
gds_tp.png
Attachment 3: teststand.jpeg
teststand.jpeg
Attachment 4: bench_supply.jpeg
bench_supply.jpeg
  16177   Thu Jun 3 13:06:47 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I was able to measure the transfer function of the plant filter module from the channel X1:SUP-C1_SUS_SINGLE_PLANT_Plant_POS_Mod_EXC to X1:SUP-C1_SUS_SINGLE_PLANT_Plant_POS_Mod_OUT. The resulting transfer function is shown below. I have also attached the raw data for making the graph.

Next, I will make a script that will make the Foton filters for all the degrees of freedom and start working on the matrix version of the filter module so that there can be multiple degrees of freedom.

Attachment 1: SingleSusPlantTF.pdf
SingleSusPlantTF.pdf
Attachment 2: SUS_PLANT_TF.zip
  16178   Thu Jun 3 17:15:17 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

As Jon wrote we need to use the NPN configuration (see attachments). I tested the isolator channels in the following way:

1. I connected +15V from the power supply to the input(+) contact.

2. The signal wire from one of the digital outputs was connected to I1-4.

3. When I set the digital output to HIGH, the LED on the isolator turned on.

4. I measured the resistance between O1-4 and output(-) and found it to be ~100 ohm in the HIGH state and an open circuit in the LOW state, as expected for an open-collector output.

Unlike the Acromag output, the isolator output is not pulled up in the LOW state. To do so we need to connect +15V to the output channel through a pull-up resistor. For now, I leave it with no pull-up. According to the schematics of the HAM-A Coil Driver, the digital output channels drive an electromagnetic relay (I think) so it might not need to be pulled up to switch back. I'm not sure. We will need to check the operation of these outputs at the installation.
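To put some rough numbers on the pull-up consideration, here is a small worked example (a sketch only; the +15 V pull-up and the 10 kohm resistor value are illustrative assumptions, while the ~100 ohm ON resistance is the value measured above):

# Rough numbers for a hypothetical pull-up on an open-collector isolator output.
V_SUPPLY = 15.0      # V, assumed pull-up supply
R_ON = 100.0         # ohm, measured output resistance when the output transistor conducts
R_PULLUP = 10e3      # ohm, illustrative pull-up resistor

i_sink = V_SUPPLY / (R_PULLUP + R_ON)          # current the output must sink when conducting
v_low = V_SUPPLY * R_ON / (R_PULLUP + R_ON)    # voltage at the output when conducting

print(f"sink current ~ {i_sink*1e3:.2f} mA, low-state voltage ~ {v_low:.2f} V")
# ~1.49 mA and ~0.15 V: the pull-up also limits how much current the output has to sink.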

While testing the pull-up on the isolator outputs, I accidentally ran a high current through O2, frying it. It is now permanently shorted between the + and - outputs, rendering it unusable. In any case, we need another isolator since we have 5 channels to isolate.

I mounted the isolator on the DIN rail and started wiring the digital outputs into it. I connected the GND from the RTS to output(-) so that when the digital outputs are HIGH, the coil driver channels are sunk into the RTS GND rather than the slow GND, avoiding ground contamination.

Attachment 1: Optical_Isolator_NPN_Input.png
Optical_Isolator_NPN_Input.png
Attachment 2: Optical_Isolator_NPN_Output.png
Optical_Isolator_NPN_Output.png
  16181   Thu Jun 3 22:08:00 2021   Koji   Update   CDS   Opto-isolator for c1auxey

- Could you explain what is the blue thing in Attachment 1?

- To check the validity of the signal chain, can you make a diagram summarizing the path from the fast BO - BO I/F - Acromag - This opto-isolator - the coil driver relay? (Cut-and-paste of the existing schematics is fine)

 

  16182   Fri Jun 4 14:49:23 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I made a diagram (Attached). I think it explains the blue thing in the previous post.

I don't know what the grounding situation in the RTS is, so I put a ground on both the coil driver and the RTS in the diagram. Hopefully, only one of them is connected in reality.

Quote:

- Could you explain what is the blue thing in Attachment 1?

- To check the validity of the signal chain, can you make a diagram summarizing the path from the fast BO - BO I/F - Acromag - This opto-isolator - the coil driver relay? (Cut-and-paste of the existing schematics is fine)

 

 

Attachment 1: Optical_isolator_Wiring.pdf
Optical_isolator_Wiring.pdf
  16183   Fri Jun 4 17:46:25 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I mounted the optoisolator on the DIN rail and connected the first 3 channels

C1:SUS-ETMY_UL_ENABLE
C1:SUS-ETMY_UR_ENABLE

C1:SUS-ETMY_LL_ENABLE

to the optoisolator inputs 1,3,4 respectively. I connected the +15V input voltage into the input(+) of the optoisolator.

The outputs were connected to DB9F-2 where those channels were connected before.

I added DB9F-1 to the front panel to accept channels from the RTS. I connected the fast channels to connectors 1,2,3 from DB9F-1 to DB9F-2 according to the wiring diagram. The GND from DB9F-1 was connected to both connector 5 of DB9F-2 and the output (-).

I tested the channels: I connected a DB9 breakout board to DB9F-2. I measured the resistance between the RTS GND and the isolated channels while switching them on and off. In the beginning, when I turned on the binary channels, the resistance behaved strangely, oscillating between a low resistance and an open circuit. I pulled up the channels through a 100 kohm resistor to check whether the voltage behavior was reasonable. Indeed, I observed that the voltage between the isolated channel and the slow GND was 15V in the LOW state and 0.03V in the HIGH state. Then I disconnected the pull-up from the channels and measured the resistance again. It showed a stable ~170 ohm in the HIGH state and an open circuit in the LOW state. I was not able to reproduce the strange initial behavior. Maybe the optoisolator needs some warmup of some sort.

 

We still need to wire the rest of the fast channels to DB9F-3 and isolate the channels in DB9F-4. For that, we need another optoisolator.

 

There is still an open issue with the BI channels not read by EPICS. They can still be read by the Windows machine though.

Attachment 1: 20210604_173420.jpg
20210604_173420.jpg
  16184   Sun Jun 6 03:02:14 2021   Koji   Update   CDS   Opto-isolator for c1auxey

This RTS also uses the BO interface with an opto-isolator. https://dcc.ligo.org/LIGO-D1002593

Could you also include the pull up/pull down situations?

  16185   Sun Jun 6 08:42:05 2021   Jon   Update   CDS   Front-End Assembly and Testing

Here is an update and status report on the new BHD front-ends (FEs).

Timing

The changes to the FE BIOS settings documented in [16167] do seem to have solved the timing issues. The RTS models ran for one week with no more timing failures. The IOP model on c1sus2 did die due to an unrelated "Channel hopping detected" error. This was traced back to a bug in the Simulink model, where two identical CDS parts were both mapped to ADC_0 instead of ADC_0/1. I made this correction and recompiled the model following the procedure in [15979].

Model naming standardization

For lack of a better name, I had originally set up the user model on c1sus2 as "c1sus2.mdl". This week I standardized the name to follow the three-letter subsystem convention, as a four-letter name leads to some inconsistency in the naming of the auto-generated MEDM screens. I renamed the model c1sus2.mdl -> c1su2.mdl. The updated table of models is below.

Model Host CPU DCUID Path
c1x06 c1bhd 1 23 /opt/rtcds/userapps/release/cds/c1/models/c1x06.mdl
c1x07 c1sus2 1 24 /opt/rtcds/userapps/release/cds/c1/models/c1x07.mdl
c1bhd c1bhd 2 25 /opt/rtcds/userapps/release/isc/c1/models/c1bhd.mdl
c1su2 c1sus2 2 26 /opt/rtcds/userapps/release/sus/c1/models/c1su2.mdl

Renaming an RTS model requires several steps to fully propagate the change, so I've documented the procedure below for future reference.

On the target FE, first stop the model to be renamed:

controls@c1sus2$ rtcds stop c1sus2

Then, navigate to the build directory and run the uninstall and cleanup scripts:

controls@c1sus2$ cd /opt/rtcds/caltech/c1/rtbuild/release
controls@c1sus2$ make uninstall-c1sus2
controls@c1sus2$ make clean-c1sus2

Unfortunately, the uninstall script does not remove every vestige of the old model, so some manual cleanup is required. First, open the file /opt/rtcds/caltech/c1/target/gds/param/testpoint.par and manually delete the three-line entry corresponding to the old model:

[C-node26]
hostname=c1sus2
system=c1sus2

If this is not removed, reinstallation of the renamed model will fail because its assigned DCUID will appear to already be in use. Next, find all relics of the old model using:

controls@c1sus2$ find /opt/rtcds/caltech/c1 -iname "*sus2*"

and manually delete each file and subdirectory containing the "sus2" name. Finally, rename, recompile, reinstall, and relaunch the model:

controls@c1sus2$ cd /opt/rtcds/userapps/release/sus/c1/models
controls@c1sus2$ mv c1sus2.mdl c1su2.mdl
controls@c1sus2$ cd /opt/rtcds/caltech/c1/rtbuild/release
controls@c1sus2$ make c1su2
controls@c1sus2$ make install-c1su2
controls@c1sus2$ rtcds start c1su2

Sitemap screens

I used a tool developed by Chris, mdl2adl, to auto-generate a set of temporary sitemap/model MEDM screens. This package parses each Simulink file and generates an MEDM screen whose background is an .svg image of the Simulink model. Each object in the image is overlaid with a clickable button linked to the auto-generated RTS screens. An example of the screen for the C1BHD model is shown in Attachment 1. Having these screens will make the testing much faster and less user-error prone.

I generated these screens following the instructions in Chris' README. However, I ran this script on the c1sim machine, where all the dependencies including Matlab 2021 are already set up. I simply copied the target .mdl files to the root level of the mdl2adl repo, ran the script (./mdl2adl.sh c1x06 c1x07 c1bhd c1su2), and then copied the output to /opt/rtcds/caltech/c1/medm/medm_teststand. Then I redefined the "sitemap" environment variable on the chiara clone to point to this new location, so that they can be launched in the teststand via the usual "sitemap" command.

Current status and plans

Is it possible to convert 18-bit AO channels to 16-bit?

Currently, we are missing five 18-bit DACs needed to complete the c1sus2 system (the c1bhd system is complete). Since the first shipment, we have had no luck getting additional 18-bit DACs from the sites, and I don't know when more will become available. So, this week I took an inventory of all the 16-bit DACs available at the 40m. I located four 16-bit DACs, pictured in Attachment 2. Their operational states are unknown, but none were labeled as known not to work.

The original CDS design calls for 40 more 18-bit DAC channels. Between the four 16-bit DACs there are 64 channels, so if only 3/4 of these DACs work, we would have enough AO channels. However, my search turned up zero additional 16-bit DAC adapter boards. We could first check whether Rolf or Todd have any spares. If not, I think it would be relatively cheap and fast to have four new adapters fabricated.

DAQ network limitations and plan

To get deeper into the signal-integrity aspect of the testing, it is going to be critical to get the secondary DAQ network running in the teststand. Of all the CDS tools (Ndscope, Diaggui, DataViewer, StripTool), only StripTool can be used without a functioning NDS server (which, in turn, requires a functioning DAQ server). StripTool connects directly to the EPICS server run by the RTS process. As such, StripTool is useful for basic DC tests of the fast channels, but it can only access the downsampled monitor channels. Ian and Anchal are going to carry out some simple DAC-to-ADC loopback tests to the furthest extent possible using StripTool (using DC signals) and will document their findings separately.
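As a sketch of what such a DC loopback check can look like without an NDS server, one can write a DAC offset and read back the downsampled ADC monitor directly over EPICS, e.g. with pyepics (the channel names below are hypothetical placeholders, not the actual test-stand records):

# Hedged sketch of a DC DAC->ADC loopback check over EPICS (no NDS/DAQ needed).
import time
from epics import caput, caget   # pyepics

DAC_OFFSET = "C1:TST-DAC_OUT_OFFSET"   # hypothetical DAC offset record
ADC_MON    = "C1:TST-ADC_IN_OUT16"     # hypothetical downsampled ADC monitor record

for offset in range(-3000, 3001, 500):
    caput(DAC_OFFSET, offset)
    time.sleep(1.0)                    # let the slow monitor settle
    readback = caget(ADC_MON)
    print(f"offset {offset:6d} -> readback {readback}")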

We don't yet have a working DAQ network because we are still missing one piece of critical hardware: a 10G switch compatible with the older Myricom network cards. In the older RCG version 3.x used by the 40m, the DAQ code is hardwired to interface with a Myricom 10G PCIe card. I was able to locate a spare Myricom card, pictured in Attachment 3, in the old fb machine. Since it looks like it is going to take some time to get an old 10G switch from the sites, I went ahead and ordered one this week. I have not been able to find documentation on our particular Myricom card, so it might be compatible with the latest 10G switches but I just don't know. So instead I bought exactly the same older (discontinued) model as is used in the 40m DAQ network, the Netgear GSM7352S. This way we'll also have a spare. The unit I bought is in "like-new" condition and will unfortunately take about a week to arrive.

Attachment 1: c1bhd.png
c1bhd.png
Attachment 2: 16bit_dacs.png
16bit_dacs.png
Attachment 3: myricom.png
myricom.png
  16186   Sun Jun 6 12:15:16 2021   Jon   Update   CDS   Opto-isolator for c1auxey

Since this Ocean Controls optoisolator has been shown to be compatible, I've gone ahead and ordered 10 more:

  • (1) to complete c1auxey
  • (2) for the upgrade of c1auxex
  • (7) for the upgrade of c1susaux

They are expected to arrive by Wednesday.

  16187   Sun Jun 6 15:59:51 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

According to the BO interface circuit board https://dcc.ligo.org/D1001266, PCIN wires are connected to the coil driver and they are not pulled either way.

That means that they're either grounded or floating. I updated the drawing.

Quote:

This RTS also uses the BO interface with an opto-isolator. https://dcc.ligo.org/LIGO-D1002593

Could you also include the pull up/pull down situations?

 

Attachment 1: Optical_isolator_Wiring.pdf
Optical_isolator_Wiring.pdf
  16188   Sun Jun 6 16:33:47 2021   Jon   Update   CDS   BI channels on c1auxey

There is still an open issue with the BI channels not read by EPICS. They can still be read by the Windows machine though.

I looked into the issue that Yehonathan reported with the BI channels. I found the problem was with the .cmd file which sets up the Modbus interfacing of the Acromags to EPICS (/cvs/cds/caltech/target/c1auxey1/ETMYaux.cmd).

The problem is that all the channels on the XT1111 unit are being configured in Modbus as output channels. While it is possible to break up the address space of a single unit, so that some subset of channels are configured as inputs and another as outputs, I think this is likely to lead to mass confusion if the setup ever has to be modified. A simpler solution (and the convention we adopted for previous systems) is just to use separate Acromag units for BI and BO signals.

Accordingly, I updated the wiring plan to include the following changes:

  • The five EnableMon BI channels are moved to a new Acromag XT1111 unit (BIO01), whose channels are configured in Modbus as inputs.
  • One new DB37M connector is added for the 11 spare BI channels on BIO01.
  • The five channels freed up on the existing XT1111 (BIO00) are wired to the existing connector for spare BO channels.

So, one more Acromag XT1111 needs to be added to the c1auxey chassis, with the wiring changes as noted above. I have already updated the .cmd and EPICS database files in /cvs/cds/caltech/target/c1auxey1 to reflect these changes.

  16189   Mon Jun 7 13:14:20 2021   Yehonathan   Update   CDS   BI channels on c1auxey

I added a new XT1111 Acromag module to the c1auxey chassis. I sanitized and configured it according to the slow machines wiki instructions.

Since all the spare BIOs fit on one DB37 connector, I didn't add another feedthrough and combined them all on the same DB37 connector. This was possible because all the RTNs of the BIOs are tied to the chassis ground and therefore need only one connection. I changed the wiring spreadsheet accordingly.

I did a lot of rewiring and also cut short several long wires that were protruding from the chassis. I tested all the wires from the feedthroughs to the Acromag channels and fixed some wiring mistakes.

Tomorrow I will test the BIs using EPICS.

  16191   Mon Jun 7 17:49:19 2021   Ian MacMillan   Update   CDS   SUS simPlant model

Added difference to the graph. I included the code so that others could see what it looks like and use it for easy use.

Attachment 1: SingleSusPlantTF.pdf
SingleSusPlantTF.pdf
Attachment 2: TF_Graph_Code.zip
  16193   Tue Jun 8 11:54:39 2021   Yehonathan   Update   CDS   BI channels on c1auxey

I tested the digital inputs the following way: I connected a DB9 breakout to DB9M-5 and DB9M-6 where digital inputs are hosted. I shorted the channel under test to GND to turn it on.

I observed the channels turn from Disabled to Enabled using caget when I shorted the channel to GND and from Enabled to Disabled when I disconnected them.

I did this for all the digital inputs and they all passed the test.

I am still waiting for the other isolator to wire the rest of the digital outputs.

Next, I believe we should take some noise spectra of the Y end before we do the installation.

Quote:

Tomorrow I will test the BIs using EPICS.

 

  16195   Wed Jun 9 13:50:48 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I have attached an updated transfer function graph with the residual easier to see. I thought here I would include a better explanation of what this transfer function was measuring.

This transfer function was mainly about learning how to use DTT and Foton to make and measure transfer functions. Therefore it is just measuring across a single CDS filter block, the X1SUP_ISOLATED_C1_SUS_SINGLE_PLANT_Plant_POS_Mod block to be specific. This measurement shows that the block is doing what I programmed it to do with Foton. The residual is probably just because the measured TF had fewer points than the calculated one.
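As an illustration of this kind of cross-check (a sketch only; the coefficients below are placeholders standing in for the Foton design, not the actual filter), the designed filter's response can be evaluated at the measured frequencies with scipy and compared point by point:

# Sketch: compare a designed digital filter's response to measured TF points.
import numpy as np
from scipy.signal import freqz

fs = 16384.0                         # Hz, assumed model rate
b, a = [0.02, 0.02], [1.0, -0.96]    # placeholder filter coefficients
f_meas = np.logspace(-1, 2, 120)     # frequencies of the measured TF points (Hz)

w, h = freqz(b, a, worN=2 * np.pi * f_meas / fs)   # evaluate exactly at the measured points
mag_db = 20 * np.log10(np.abs(h))
phase_deg = np.angle(h, deg=True)
# Subtracting the measured magnitude/phase at the same f_meas points then gives a
# residual free of interpolation error.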

The next step is to take a closed-loop TF of the system and the control module.

After that, I want to add more degrees of freedom to the model, both in the plant and in the controls.

Attachment 1: SingleSusPlantTF.pdf
SingleSusPlantTF.pdf
  16199   Mon Jun 14 15:31:30 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I checked the BI situation on the HAM-A coil driver. It seems like these are sinking BIs and indeed need to be isolated from the Acromag unit GND to avoid contamination.

The BIs will have to be isolated on a different isolator. Now, the wires coming from the field (red) are connected to the second isolator's input and the outputs are connected to the Acromag BI module and the Acromag's RTN.

I updated the wiring diagram (attached) and the wiring spreadsheet.

In the diagram, you can notice that the BI isolator (the right one) is powered by the Acromag's +15V and switched when the coil driver's GND is supplied. I am not sure if it makes sense or not. In this configuration, there is a path between the coil driver's GND and the Acromag's GND but its resistance is at least 10KOhm. The extra careful option is to power the isolator by the coil driver's +V but there is no +V on any of the connectors going out of the coil driver.

I installed an additional isolator on the DIN rail and wired the remaining BOs (C1:SUS-ETMY_SD_ENABLE, C1:SUS-ETMY_LR_ENABLE) through it to the DB9F-4 feedthrough. I also added DB9F-3 for incoming wires from the RTS and made the required connection from it to DB9F-4.

I tested the new isolated BOs using the Windows machine (after stopping Modbus). As before, I measure the resistance between pin 5 (coil driver GND) and the channel under test. When I turn on the BO I see the resistance drops from inf to 166ohm and back to inf when I turn it off. Both channels passed the test.

 

Attachment 1: Optical_isolator_Wiring.pdf
Optical_isolator_Wiring.pdf
  16201   Tue Jun 15 11:46:40 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I have added more degrees of freedom. The model includes x, y, z, pitch, yaw, roll and is controlled by a matrix of transfer functions (See Attachment 2). I have added 5 control filters to individually control UL, UR, LL, LR, and side. Eventually, this should become a matrix too but for the moment this is fine.

Note the Unit delay blocks in the control in Attachment 1. The model will not compile without these blocks.

Attachment 1: x1sup_isolated-6-15-v1.pdf
x1sup_isolated-6-15-v1.pdf
Attachment 2: C1_SUS_SINGLE_PLANT-6-15-v1.pdf
C1_SUS_SINGLE_PLANT-6-15-v1.pdf
  16203   Tue Jun 15 21:48:55 2021   Koji   Update   CDS   Opto-isolator for c1auxey

If my understanding is correct, the (photo-receiving) NPN transistor of the optocoupler is energized through the Acromag. The LED side should be driven by the coil driver circuit. This is properly done for the "enable mon" through 750 Ohm and +V. However, "Run/Acquire" is a relay switch and there is nothing to drive the line. I propose to add the pull-up network to the run/acquire outputs. This way all 8 outputs become identical and symmetric.

We should test whether this configuration works properly. This can be done with just a manual switch, R = 750 Ohm, and a +V supply (+18V I guess).

Attachment 1: Acromag_RTS_BI_config.jpg
Acromag_RTS_BI_config.jpg
  16205   Wed Jun 16 17:24:29 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I updated the wiring diagram according to Koji's suggestion. According to the isolator manual, this configuration requires that the isolator input be configured as PNP.

Additionally, when the switch in the coil driver is open, the LED in the isolator signals an on-state. Therefore, we might need to configure the Acromag to invert the input.

There are also the Run/Acquire channels that we might need to add to the wiring diagram. If we do need to read them using slow channels, we will have to pull them up like the EnableMon channels, as in the wiring diagram.

Attachment 1: Optical_isolator_Wiring.pdf
Optical_isolator_Wiring.pdf
  16207   Wed Jun 16 20:32:39 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I installed 2 additional isolators in the Acromag chassis. I set all the input channels to PNP. I ran the digital inputs (EnableMon channels) through these isolators according to the previous post.

I tested the digital inputs in the following way:

I connected an 18V voltage source to the signal wire under test through a 1 kohm resistor. I connected the GND of the voltage source to the RTN wire of the feedthrough. When the voltage source was connected, the LED on the isolator turned on and the EPICS channel under test read Enabled. When I disconnected the voltage source or shorted the signal wire to GND, the LED on the isolator turned off and the EPICS channel showed a Disabled state.

  16208   Thu Jun 17 11:19:37 2021   Ian MacMillan   Update   CDS   CDS Upgrade

Jon and I tested the ADC and DAC cards in both of the systems on the test stand. We had to swap out an 18-bit DAC for a 16-bit one that worked but now both machines have at least one working ADC and DAC.

[Still working on this post. I need to look at what is in the machines to say everything ]

  16217   Mon Jun 21 17:15:49 2021   Ian MacMillan   Update   CDS   CDS Upgrade

Anchal and I wrote a script (Attachment 1) that will test the ADC and DAC connections with inputs on the INMON from -3000 to 3000. We could not run it because some of the channels seemed to be frozen. 

Attachment 1: DAC2ADC_Test.py
import os
import time
import numpy as np
import subprocess
from traceback import print_exc
import argparse


def grabInputArgs():
    parser = argparse.ArgumentParser(
... 75 more lines ...
  16220   Tue Jun 22 16:53:01 2021   Ian MacMillan   Update   CDS   Front-End Assembly and Testing

The channels on both the C1BHD and C1SUS2 seem to be frozen: they aren't updating and are holding one value. To fix this, Anchal and I tried:

  • restarting the computers 
    • restarting basically everything including the models
  • Changing the matrix values
  • adding filters
  • messing with the offset 
  • restarting the network ports (Paco suggested this; apparently it worked for him at some point)
  • checking to make sure everything was still connected inside the case (DAC, ADC, etc.)

I wonder if Jon has any ideas. 

  16224   Thu Jun 24 17:32:52 2021   Ian MacMillan   Update   CDS   Front-End Assembly and Testing

Anchal and I ran tests on the two systems (C1-SUS2 and C1-BHD). Attached are the results and the code and data to recreate them.

We connected one DAC channel to one ADC channel and thus all of the results represent a DAC/ADC pair. We then set the offset to different values from -3000 to 3000 and recorded the measured signal. I then plotted the response curve of every DAC/ADC pair so each was tested at least once.

There are two types of plots included in the attachments

1) A summary plot, found on the last pages of the pdf files. This is a quick and dirty way to see if all of the channels are working. It is NOT a replacement for the other plots. It shows all the data quickly but sacrifices precision.

2) An in-depth look at an ADC/DAC pair. Here I show the measured value for a defined DC offset. The gain of the system should be 0.5 (put in an offset of 100 and measure 50). I included a line to show where this should be. I also plotted the difference between the 0.5 gain line and the measured data.

As seen in the provided plots, the channels get saturated beyond about the -2000 to 2000 mark, which is why the difference graph only covers the -2000 to 2000 range.

Summary: all the channels look to be working; they all report very little deviation from the theoretical gain.

Note: ADC channel 31 is the timing signal so it is the only channel that is wildly off. It is not a measurement channel and we just measured it by mistake.
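A minimal sketch of the per-pair analysis described above (the offsets and readbacks here are made-up stand-ins, not the data in Attachment 3):

# Sketch: check one DAC->ADC pair against the expected 0.5 gain,
# ignoring the saturated region beyond roughly +/-2000 counts.
import numpy as np

offsets = np.arange(-3000, 3001, 100)             # hypothetical commanded offsets
readbacks = np.clip(0.5 * offsets, -1000, 1000)   # stand-in for measured values

mask = np.abs(offsets) <= 2000                    # restrict to the linear region
gain = np.polyfit(offsets[mask], readbacks[mask], 1)[0]
residual = readbacks[mask] - 0.5 * offsets[mask]

print(f"fitted gain = {gain:.3f} (expected 0.5)")
print(f"max |residual| = {np.max(np.abs(residual)):.2f} counts")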

Attachment 1: C1-SU2_Channel_Responses.pdf
C1-SU2_Channel_Responses.pdf
Attachment 2: C1-BHD_Channel_Responses.pdf
C1-BHD_Channel_Responses.pdf
Attachment 3: CDS_Channel_Test.zip
  16225   Fri Jun 25 14:06:10 2021   Jon   Update   CDS   Front-End Assembly and Testing

Summary

Here is the final summary (from me) of where things stand with the new front-end systems. With Anchal and Ian's recent scripted loopback testing [16224], all the testing that can be performed in isolation with the hardware on hand has been completed. We currently have no indication of any problem with the new hardware. However, the high-frequency signal integrity and noise testing remains to be done.

I detail those tests and link some DTT templates for performing them below. We have not yet received the Myricom 10G network card being sent from LHO, which is required to complete the standalone DAQ network. Thus we do not have a working NDS server in the test stand, so cannot yet run any of the usual CDS tools such as Diaggui. Another option would be to just connect the new front-ends to the 40m Martian/DAQ networks and test them there.

Final Hardware Configuration

Due to the unavailability of the 18-bit DACs that were expected from the sites, we elected to convert all the new 18-bit AO channels to 16-bit. I was able to locate four unused 16-bit DACs around the 40m [16185], with three of the four found to be working. I was also able to obtain three spare 16-bit DAC adapter boards from Todd Etzel. With the addition of the three working DACs, we ended up with just enough hardware to complete both systems.

The final configuration of each I/O chassis is as follows. The full setup is pictured in Attachment 1.

Component            C1BHD (qty installed)   C1SUS2 (qty installed)
16-bit ADC                    1                        2
16-bit ADC adapter            1                        2
16-bit DAC                    1                        3
16-bit DAC adapter            1                        3
16-channel BIO                1                        1
32-channel BO                 0                        6

This hardware provides the following breakdown of channels available to user models:

Channel Type     C1BHD (channel count)   C1SUS2 (channel count)
16-bit AI*                31                      63
16-bit AO                 16                      48
BO                         0                     192

*The last channel of the first ADC is reserved for timing diagnostics.

The chassis have been closed up and their permanent signal cabling installed. They do not need to be reopened, unless future testing finds a problem.

RCG Model Configuration

An IOP model has been created for each system reflecting its final hardware configuration. The IOP models are permanent and system-specific. When ready to install the new systems, the IOP models should be copied to the 40m network drive and installed following the RCG-compilation procedure in [15979]. Each system also has one temporary user model which was set up for testing purposes. These user models will be replaced with the actual SUS, OMC, and BHD models when the new systems are installed.

The current RCG models and the action to take with each one are listed below:

Model Name Host CPU DCUID Path (all paths local to chiara clone machine) Action
c1x06 c1bhd 1 23 /opt/rtcds/userapps/release/cds/c1/models/c1x06.mdl Copy to same location on 40m network drive; compile and install
c1x07 c1sus2 1 24 /opt/rtcds/userapps/release/cds/c1/models/c1x07.mdl Copy to same location on 40m network drive; compile and install
c1bhd c1bhd 2 25 /opt/rtcds/userapps/release/isc/c1/models/c1bhd.mdl Do not copy; replace with permanent OMC/BHD model(s)
c1su2 c1sus2 2 26 /opt/rtcds/userapps/release/sus/c1/models/c1su2.mdl Do not copy; replace with permanent SUS model(s)

Each front-end can support up to four user models.

Future Signal-Integrity Testing

Recently, the CDS group has released a well-documented procedure for testing General Standards ADCs and DACs: T2000188. They've also automated the tests using a related set of shell scripts (T2000203). Unfortunately, I don't believe these scripts will work at the 40m, as they require the latest v4.x RCG.

However, there is an accompanying set of DTT templates that could be very useful for accelerating the testing. They are available from the LIGO SVN (log in with username: "first.last@LIGO.ORG"). I believe these can be used almost directly, with only minor updates to channel names, etc. There are two classes of DTT-templated tests:

  1. DAC -> ADC loopback transfer functions
  2. Voltage noise floor PSD measurements of individual cards

The T2000188 document contains images of normal/passing DTT measurements, as well as known abnormalities and failure modes. More sophisticated tests could also be configured, using these templates as a guiding example.
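For the second class of test (noise-floor PSDs), when Diaggui is unavailable the same estimate can be made offline from a recorded time series; a minimal sketch, with the sample rate, calibration, and data as placeholder assumptions:

# Sketch: voltage-noise PSD estimate from a recorded ADC time series.
import numpy as np
from scipy.signal import welch

fs = 65536.0                                 # Hz, illustrative ADC sample rate
adc_counts = np.random.randn(16 * int(fs))   # stand-in for 16 s of recorded data
volts = adc_counts * (20.0 / 2**16)          # rough counts->volts for a +/-10 V, 16-bit ADC

f, psd = welch(volts, fs=fs, nperseg=int(fs))   # V^2/Hz with ~1 Hz resolution
asd = np.sqrt(psd)                              # V/sqrt(Hz), the usual noise-floor quantity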

Hardware Reordering

Due to the unexpected change from 18- to 16-bit AO, we are now short on several pieces of hardware:

  • 16-bit AI chassis. We originally ordered five of these chassis, and all are obligated as replacements within the existing system. Four of them are now (temporarily) in use in the front-end test stand. Thus four of the new 18-bit AI chassis will need to be retrofitted with 16-bit hardware.
  • 16-bit DACs. We currently have exactly enough DACs. I have requested a quote from General Standards for two additional units to have as spares.
  • 16-bit DAC adapters. I have asked Todd Etzel for two additional adapter boards to also have as spares. If no more are available, a few more should be fabricated.
Attachment 1: test_stand.JPG
test_stand.JPG
  16230   Wed Jun 30 14:09:26 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I have looked at my code from the previous plot of the transfer function and realized that there is a slight error that must be fixed before we can analyze the difference between the theoretical transfer function and the measured transfer function.

The theoretical transfer function, which was generated with Foton, has approximately 1000 data points, while the measured one has about 120. There are no points between the two datasets that have the same frequency values, so they are not directly comparable. In order to compare them I must infer the data between the points. In the previous post [16195] I expanded the measured dataset. In other words: I filled in the space between points by interpolation so that I could compare the two data sets, using this code:

#make values for the comparison
from scipy.interpolate import splrep, splev  # B-spline interpolation routines

tck_mag = splrep(tst_f, tst_mag) # get bspline representation given (x,y) values
gen_mag = splev(sim_f, tck_mag) # generate intermediate values
dif_mag=[]
for x in range(len(gen_mag)):
    dif_mag.append(gen_mag[x]-sim_mag[x]) # measured minus predicted

tck_ph = splrep(tst_f, tst_ph) # get bspline representation given (x,y) values
gen_ph = splev(sim_f, tck_ph) # generate intermediate values
dif_ph=[]
for x in range(len(gen_ph)):
    dif_ph.append(gen_ph[x]-sim_ph[x])

At features like a sharp peak, where the measured data set was sparse, the difference was taken between the interpolated “measured” values and the theoretical ones, which made the apparent difference much larger than it really was.

To fix this I changed the code to generate the intermediate values for the theoretical data set. Using the code here:

tck_mag = splrep(sim_f, sim_mag) # get bspline representation given (x,y) values
gen_mag = splev(tst_f, tck_mag) # generate intermediate values
dif_mag=[]
for x in range(len(tst_mag)):
    dif_mag.append(tst_mag[x]-gen_mag[x])#measured minus predicted

tck_ph = splrep(sim_f, sim_ph) # get bspline representation given (x,y) values
gen_ph = splev(tst_f, tck_ph) # generate intermediate values
dif_ph=[]
for x in range(len(tst_ph)):
    dif_ph.append(tst_ph[x]-gen_ph[x])

Because this dataset has far more values (about 10 times more), the previous problem is much less of an issue. In addition, no interpolated measured value is ever used. That makes it more representative of the true accuracy of the real transfer function.

This is an update to a previous plot, so I am still using the same data just changing the way it is coded. This plot/data does not have a Q of 1000. That plot will be in a later post along with the error estimation that we talked about in this week’s meeting.

The new plot is shown below in attachment 1. Data and code are contained in attachment 2

Attachment 1: SingleSusPlantTF.pdf
SingleSusPlantTF.pdf
Attachment 2: Plant_TF_Test.zip
  16243   Fri Jul 9 18:35:32 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

Following Koji's channel list review, we made changes to the wiring spreadsheet.

Today, I implemented those changes in the Acromag chassis. I went through the channel list one by one and made sure everything is wired correctly. Additionally, since we now need all the channels the existing isolators have, I replaced the isolator with the defective channel with a new one.

The things to do next:

1. Create entries for the spare coil driver and satellite box channels in the EPICS DB.

2. Test the spare channels.

  16244   Mon Jul 12 18:06:25 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I edited /cvs/cds/caltech/target/c1auxey1/ETMYaux.db (after creating a backup) and added the spare coil driver channels.

I tested those channels using caget while fixing wiring issues. The tests were all successful. The digital output channels were tested using the Windows machine since they are locked by some EPICS mechanism I don't yet understand.

One worrying point: I found the differential analog inputs to be unstable unless I connected the reference to some stable voltage source, unlike what previous tests showed. It was unstable (though less so) even when I connected the ref to the ground connectors on the power supplies on the workbench. This is really puzzling.

When I say unstable I mean that most of the time the voltage reading shows the right value, but occasionally there is a transient sharp voltage drop of the order of 0.5V. I will do a more quantitative analysis tomorrow.

 

  16263   Wed Jul 28 12:47:52 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

To simulate a differential output I used two power supplies connected in series. The outer connectors were used as the outputs and the common connector was connected to the ground and used as a reference. I hooked these outputs to one of the differential analog channels and measured it over time using Striptool. The setup is shown in attachment 3.

I tested two cases: with the reference disconnected (attachment 1), and connected (attachment 2). Clearly, the non-referenced case is way too noisy.

Attachment 1: SUS-ETMY_SparePDMon0_NoRef.png
SUS-ETMY_SparePDMon0_NoRef.png
Attachment 2: SUS-ETMY_SparePDMon0_Ref_WithGND.png
SUS-ETMY_SparePDMon0_Ref_WithGND.png
Attachment 3: DifferentialOutputTest.png
DifferentialOutputTest.png
  16276   Wed Aug 11 12:06:40 2021   Yehonathan   Update   CDS   Opto-isolator for c1auxey

I redid the differential input experiment using the DS360 function generator we recently got. I generated a low-frequency (0.1Hz) sine wave signal with an amplitude of 0.5V and connected the + and - outputs to a differential input on the new c1auxey Acromag chassis. I recorded a time series of the corresponding EPICS channel with and without the common on the DS360 connected to the Ref connector on the Acromag unit. The common connector on the DS360 is not normally grounded (there are a few tens of kohms between the ground and common connectors). The attachment shows that, indeed, the analog input readout is extremely noisy with the Ref disconnected. The point where the Ref was connected to common is marked in the picture.

Conclusion: Ref connector on the analog input Acromag units must be connected to some stable voltage source for normal operation.

Attachment 1: SUS-ETMY_SparePDMon0_2.png
SUS-ETMY_SparePDMon0_2.png
  16280   Mon Aug 16 23:30:34 2021   Paco   Update   CDS   AS WFS commissioning; restarting models

[koji, ian, tega, paco]

With the remote/local assistance of Tega/Ian last Friday, I made changes to the c1sus model by connecting the C1:ASC model outputs (found within a block in c1ioo) to the BS and PRM suspension inputs (pitch and yaw). Then, Koji reviewed these changes today and made me notice that no changes are actually needed since the blocks were already in place, connected to the right ports, but the model probably just wasn't rebuilt...

So, today we ran "rtcds make" and "rtcds install" on the c1ioo and c1sus models (in that order), but the whole system crashed. We spent a great deal of time restarting the machines and their processes, and we struggled quite a lot with setting up the right dates to match the GPS times. What seemed to work in the end was to follow the format of the date on the fb1 machine and try to match the timing to the sub-second level. This is especially tricky when performed by hand, so the whole task is tedious. We nevertheless completed the reboot for almost all the models except c1oaf (which tends to make things crashy), since we won't need it right away for the tasks ahead. One potentially annoying issue we found was in manually rebooting c1iscey, because one of its network ports is loose (the ethernet cable won't click in place) and it appears to use this link to boot (!!), so for a while this machine just wasn't coming back up.

Finally, as we restored the suspension controls and reopened the shutters, we noticed a great deal of misalignment, to the point that no reflected beam was coming back to the RFPD table. So we spent some time verifying the PRM alignment and TT1 and TT2 (tip-tilts), and it turned out to be mostly the latter pair that was responsible for it. We used the green beams to help optimize the XARM and YARM transmissions and were able to relock the arms. We ran ASS on them, and then aligned the PRM OpLevs, which also seemed off. This was done by giving a pitch offset to the input PRM oplev beam path and then correcting for it downstream (before the QPD). We also adjusted the BS OpLev in the end.


Summary: the ASC BS and PRM outputs are now built into the SUS models. Let the AS WFS loops be closed soon!


Addenda by KA
- Upon restarting the RTS:

  • Date/Time adjustment
    sudo date --set='xxxxxx'
  • If the time on the CDS status medm screen for each IOP matches the FB local time, we ran
    rtcds start c1x01
    (or c1x02, etc)
  • Every time we restart the IOPs, fb was restarted by
    telnet fb1 8083
    > shutdown

    and restarted mx_stream from the CDS screen because these actions change the "DC" status.

- Today we once succeeded in restarting the vertex machines. However, the RFM signal transmission failed. So the two end machines were power cycled as well as c1rfm, but this made all the machines go RED again. Hell...

- We checked the PRM oplev. The spot was around the center but was clipped. This made us so confused. Our conclusion was that the oplev was like that before the RTS reboot.

  16283   Thu Aug 19 03:23:00 2021   Anchal   Update   CDS   Time synchronization not running

I tried to read a bit and understand the NTP synchronization implementation on the FE computers. I'm quite sure that 'NTP synchronized' should read 'yes' in the output of timedatectl on these computers if timesyncd is running correctly. As Koji reported in 15791, this is not the case. I logged into c1lsc, c1sus and c1ioo and saw that the RTC has drifted from the software clocks too, which would not happen if NTP synchronization were active. This would mean that almost certainly, if the computers are rebooted, the synchronization will be lost and the models will fail to come online.

My current findings are the following (this should be documented in wiki once we setup everything):

  • nodus is running an NTP server using chronyd. One can check the configuration of this NTP server in /etc/chronyd.conf
  • fb1 is running an NTP server using ntpd that follows nodus and an IP address 131.215.239.14. This can be seen in /etc/ntp.conf.
  • There are no comments to describe what this other server (131.215.239.14) is. Does the GC network have an NTP server too?
  • c1lsc, c1sus and c1ioo all have systemd-timesyncd.service running with configuration file in /etc/systemd/timesyncd.conf.
  • The configuration file sets Servers=ntpserver, but echo $ntpserver produces nothing (blank) on these computers and I've been unable to find anywhere that ntpserver is defined.
  • In chiara (our name server), the name server file /etc/hosts does not have any entry for ntpserver either.
  • I think the problem might be that these computers are unable to find the ntpserver as it is not defined anywhere.

The solution to this issue could be as simple as just defining ntpserver in the name server list. But I'm not sure if my understanding of this issue is correct. Comments/suggestions are welcome for future steps.

 

  16284   Thu Aug 19 14:14:49 2021   Koji   Update   CDS   Time synchronization not running

131.215.239.14 looks like Caltech's NTP server (ntp-02.caltech.edu)
https://webmagellan.com/explore/caltech.edu/28415b58-837f-4b46-a134-54f4b81bee53

I can't say whether it is correct or not, as I did not make a survey at your level. I think you need a few tests of reconfiguring and restarting the NTP clients to see if time synchronization starts. Because the local time is not regulated right now anyway, this operation is safe, I think.

 

  16285   Fri Aug 20 00:28:55 2021   Anchal   Update   CDS   Time synchronization not running

I added ntpserver as a known host name for address 192.168.113.201 (fb1's address, where the NTP server is running) in the martian host list in the following files on chiara:

/var/lib/bind/martian.hosts
/var/lib/bind/rev.113.168.192.in-addr.arpa

Note: a host name called ntp was already defined at 192.168.113.11 but I don't know what computer this is.

Then, I restarted the DNS on chiara by doing:

sudo service bind9 restart

Then I logged into c1lsc and c1ioo and ran following:

controls@c1ioo:~ 0$ sudo systemctl restart systemd-timesyncd.service

controls@c1ioo:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Fri 2021-08-20 07:24:03 UTC; 53s ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 23965 (systemd-timesyn)
   Status: "Idle."
   CGroup: /system.slice/systemd-timesyncd.service
           └─23965 /lib/systemd/systemd-timesyncd

Aug 20 07:24:03 c1ioo systemd[1]: Starting Network Time Synchronization...
Aug 20 07:24:03 c1ioo systemd[1]: Started Network Time Synchronization.
Aug 20 07:24:03 c1ioo systemd-timesyncd[23965]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:24:35 c1ioo systemd-timesyncd[23965]: Using NTP server 192.168.113.201:123 (ntpserver).
controls@c1ioo:~ 0$ timedatectl
      Local time: Fri 2021-08-20 07:25:28 UTC
  Universal time: Fri 2021-08-20 07:25:28 UTC
        RTC time: Fri 2021-08-20 07:25:31
       Time zone: Etc/UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

The same output is shown on c1lsc too. The 'NTP synchronized' flag in the output of the timedatectl command did not change to yes, and the RTC is still 3 seconds ahead of the local clock.

Then I went to c1sus to see what the status output was before restarting the timesyncd service. I got the following output:

controls@c1sus:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Tue 2021-08-17 04:38:03 UTC; 3 days ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 243 (systemd-timesyn)
   Status: "Idle."
   CGroup: /system.slice/systemd-timesyncd.service
           └─243 /lib/systemd/systemd-timesyncd

Aug 20 02:02:18 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 02:36:27 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:10:35 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:44:43 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:18:51 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:53:00 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 05:27:08 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:01:16 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:35:24 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:09:33 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).

This actually shows that the service was able to find ntpserver correctly at 192.168.113.201 even before I changed the name server file on chiara. So I'm retracting the changes made to the name server. They are probably not required.

The timesyncd.conf configuration files are read-only even with sudo. I tried changing permissions but that did not work either. Maybe these files are not correctly configured. The man page of timesyncd says to use the field 'NTP' to give the NTP servers. Our files are using the field 'Servers'. But since we are not getting any error message, I don't think this is the issue here.
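As a quick way to confirm which key the files actually define (a hypothetical check, not a fix; per the systemd-timesyncd man page the server list belongs in an NTP= key of the [Time] section, and an unrecognized Servers= key would most likely just be ignored):

# Sketch: print which server keys /etc/systemd/timesyncd.conf defines.
import configparser

cfg = configparser.ConfigParser()
cfg.optionxform = str                 # preserve key case (NTP vs ntp)
cfg.read("/etc/systemd/timesyncd.conf")

time_section = cfg["Time"] if cfg.has_section("Time") else {}
print("NTP     =", time_section.get("NTP", "<not set>"))
print("Servers =", time_section.get("Servers", "<not set>"))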

I'll look more into this problem.

  16286   Fri Aug 20 06:24:18 2021   Anchal   Update   CDS   Time synchronization not running

I read on some stack exchange that the 'NTP synchronized' indicator turns 'yes' in the output of the timedatectl command only when the RTC has been adjusted at some point. I also read that timesyncd does not make the change if the time difference is too large, roughly more than 3 seconds.

So I logged into all the FE machines and ran sudo hwclock -w to synchronize them all to the system clocks, and then waited to see if timesyncd would make any correction to the RTC. It did not. A few hours later, I found the RTC clocks drifting away from the system clocks again. So even though the timesyncd service is running as it should, it is not performing the time correction for whatever reason.

Maybe we should try to use some other service?

Quote:
 

The NTP synchronized flag in output of timedatectl command did not change to yes and the RTC is still 3 seconds ahead of the local clock.

 

  16289   Mon Aug 23 15:25:59 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I am adding a state-space block to the SimPlant CDS model using the example Chris gave. I made a new folder in controls called SimPlantStateSpace. I used the code below to make a state-space LTI model of a 1D pendulum, then I converted it to a discrete system using the c2d MATLAB function. Then I used these in the rtss.m file to create the state-space code I need in the SimPlantStateSpace_1D_model.h file.

%sys_model.m

Q = 1000;
phi = 1/Q;
g = 9.806;
m = 0.24; % mass of pendulum
l = 0.248; %length of pendulum
w_0 = sqrt(g/l);

f=16000 %this is the frequency of the channel that will be used

A = [0 1; -w_0^2*(1+1/Q*1i) -w_0/Q]
B = [0; 1/m];
C = [1 0];
D = [0];
sys_dc = ss(A,B,C,D)

sys=c2d(sys_dc, 1/f)

This code outputs the discrete state space that is added to the header file attached.

Attachment 1: SimPlantStateSpace.zip
  16294   Tue Aug 24 18:44:03 2021   Koji   Update   CDS   FB is writing the frames with a year old date

Dan Kozak pointed out that the new 40m frame files are being written with 2020 GPS timestamps rather than 2021.

Current GPS time is 1313890914 (or something like that), but the new files are written as C-R-1282268576-16.gwf

I don't know how this can happen, but it may explain why we can't get agreement between the FB GPS time and the RTS GPS time.

(dataviewer seems to depend on the FB GPS time and indicates a 2020 date. DTT/diaggui does not.)


This is the way to check the gpstime on fb1. It's apparently a year off.

controls@fb1:~ 0$ cat /proc/gps
1282269402.89
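To make the year offset explicit, the GPS seconds can be converted to a calendar date (a rough sanity-check sketch, assuming the fixed 18 s GPS-UTC leap-second offset that applies around 2021; this is not part of any CDS tool):

# Sketch: convert GPS seconds to UTC to show the ~1 year offset.
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_OFFSET = 18   # leap seconds by which GPS is ahead of UTC, as of 2021

def gps_to_utc(gps_seconds):
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_UTC_OFFSET)

print(gps_to_utc(1313890914))  # the "current" GPS time quoted above -> August 2021
print(gps_to_utc(1282269402))  # what /proc/gps reports -> August 2020, a year behind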

Attachment 1: Screen_Shot_2021-08-24_at_18.46.24.png
Screen_Shot_2021-08-24_at_18.46.24.png
  16297   Wed Aug 25 11:48:48 2021   Yehonathan   Update   CDS   c1auxey assembly

After confirming that, indeed, leaving the RTN connection floating can cause reliability issues, we decided to make these connections in the c1auxex analog input units.

According to Johannes' wiring scheme (excluding the anti-image and OPLEV since they are decommissioned), Acromag unit 1221b accepts analog inputs from two modules. All of these channels are single-ended according to their schematics.

One option is to use the Acromag ground and connect it to the RTNs of both 1221b and 1221c. Another is to connect the minus wire of one module, which is tied to the module's ground, to the RTN. We shouldn't tie the grounds of the different modules together by connecting them to the same RTN point.

We should take some OSEM spectra of the X end arm before and after this work to confirm we didn't produce more noise by doing so. Right now, it is impossible due to issues caused by the recent power surge.

Quote:

{Yehonathan, Jon}

We poked (looked in situ with a flashlight, not disturbing any connections) around c1auxex chassis to understand better what is the wiring scheme.

To our surprise, we found that nothing was connected to the RTNs of the analog input Acromag modules. From previous experience and the Acromag manual, there can't be any meaningful voltage measurement without it.

 

  16298   Wed Aug 25 17:31:30 2021 PacoUpdateCDSFB is writing the frames with a year old date

[paco, tega, koji]

After invaluable assistance from Jamie in fixing this yearly offset in the GPS time reported by cat /proc/gps, we managed to restart the real-time system correctly (while still manually synchronizing the front-end machine times). After this, we recovered the mode cleaner and were able to lock the arms without much fuss.

Nevertheless, Tega and I noticed some weird noise in C1:LSC-TRX_OUT which was not present in the YARM transmission, and which is present even in the absence of light (we unlocked the arms and still saw it on the ndscope, as shown in Attachment #1). It seems to affect the XARM and, in general, the lock acquisition...

We took some quick spectra with diaggui (Attachment #2), and they don't look normal; there is broadband excess noise with a prominent 1 kHz component. We will probably look into it in more detail.

Attachment 1: TRX_noise_2021-08-25_17-40-55.png
TRX_noise_2021-08-25_17-40-55.png
Attachment 2: TRX_TRY_power_spectra.pdf
TRX_TRY_power_spectra.pdf
  16299   Wed Aug 25 18:20:21 2021 JamieUpdateCDSGPS time on fb1 fixed, daqd writing correct frames again

I have no idea what happened to the GPS timing on fb1, but it seems like the issue was coincident with the power glitch on Monday.

As was noted by Koji above, the GPS time kernel interface was off by a year, which was causing the frame builder to write out files with the wrong names.  fb1 was using DAQD components from the advligorts 3.3 release, which used the old "symmetricom" kernel module for the GPS time.  This old module was also known to have issues with time offsets.  This issue is reminiscent of previous timing issues with the DAQ on fb1.

I noted that a newer version of advligorts, version 3.4, was available for Debian jessie, the system running on fb1.  advligorts 3.4 includes a newer version of the GPS time module, renamed gpstime.  I checked with Jonathan Hanks that the interfaces did not change between 3.3 and 3.4, and that 3.4 was mostly a bug-fix and packaging release, so I decided to upgrade the DAQ to get the new components.  I therefore did the following:

  • updated the archive info in /etc/apt/sources.list.d/cdssoft.list, and added the "jessie-restricted" archive which includes the mx packages: https://git.ligo.org/cds-packaging/docs/-/wikis/home

  • removed the symmetricom module from the kernel

    sudo rmmod symmetricom

  • upgraded the advligorts-daqd components (NOTE I did not upgrade the rest of the system, although there are outstanding security upgrades needed):

    sudo apt install advligorts-daqd advligorts-daqd-dc-mx

  • loaded the new gpstime module and checked that the GPS time was correct:

    sudo modprobe gpstime

  • restarted all the daqd processes

    sudo systemctl restart daqd_*

Everything came up fine at that point, and I checked that the correct frames were being written out.
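
A quick way to confirm that the frame names are sane again (a sketch; it assumes the usual /frames/full directory layout, which may differ):

ls -t /frames/full/*/*.gwf | head -3   # the GPS stamp in C-R-<gps>-16.gwf should match /proc/gps
cat /proc/gps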

  16300   Thu Aug 26 10:10:44 2021 PacoUpdateCDSFB is writing the frames with a year old date

[paco, ]

We went over to the X end to check what was going on with the TRX signal. We spotted the ground terminal coming from the QPD loosely touching the handle of one of the computers on the rack. When we detached it completely from the rack, the noise was gone (Attachment 1).

We taped this terminal so it doesn't touch anything accidentally. We don't know if this is the best solution since it probably needs a stable voltage reference. In the Y end, those ground terminals are connected to the same point on the rack. The other ground terminals in the X end are just cut.

We also took the PSD of these channels (Attachment 2). The noise seems to be gone, but TRX is still a bit noisier than TRY. Maybe we should set up a proper ground for the X arm QPD?


We saw that the X end station ALS laser was off. We turned it on, along with the crystal oven, and re-enabled the temperature controller. Green light immediately appeared. We are now working to restore the ALS lock. After running XARM ASS we were unable to lock the green laser, so we went to the XEND and moved the piezo X ALS alignment mirrors until we maximized the transmission in the right mode. We then locked the ALS beams on both arms successfully. It could very well be that the PZT offsets were reset by the power glitch. The XARM ALS still needs some tweaking; its level is ~25% of what it was before the power glitch.

Attachment 1: Screenshot_from_2021-08-26_10-09-50.png
Screenshot_from_2021-08-26_10-09-50.png
Attachment 2: TRXTRY_Spectra.pdf
TRXTRY_Spectra.pdf
  16302   Thu Aug 26 10:30:14 2021 JamieConfigurationCDSfront end time synchronization fixed?

I've been looking at why the front end NTP time synchronization did not seem to be working.  I think it was not working because the NTP server the front ends were pointing to, fb1, was not actually responding to synchronization requests.

I cleaned up some things on fb1 and the front ends, which I think unstuck things.

On fb1:

  • stopped/disabled the default client (systemd-timesyncd), and properly installed the full NTP server (ntp)
  • the ntp server package for Debian jessie is old-style sysVinit, not systemd.  To make it better integrated, I copied the auto-generated service file to /etc/systemd/system/ntp.service, and added an "[Install]" section specifying that it should be started with the default "multi-user.target".
  • "enabled" the new service to auto-start at boot ("sudo systemctl enable ntp.service") 
  • made sure ntp was configured to serve the front end network ('broadcast 192.168.123.255'), then restarted the server ("sudo systemctl restart ntp.service"); a sketch of the relevant ntp.conf lines is given just below this list
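
For reference, the relevant part of /etc/ntp.conf looks roughly like this (a sketch; the restrict and server lines are assumptions, only the broadcast line is taken from the note above):

# allow front-end clients to query but not modify the server (assumed)
restrict 192.168.123.0 mask 255.255.255.0 nomodify notrap
# broadcast to the front-end network
broadcast 192.168.123.255
# example upstream source (assumed)
server 0.debian.pool.ntp.org iburst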

For the front ends:

  • on fb1 I chroot'd into the front-end diskless root (/diskless/root) and manually specified that systemd-timesyncd should start on boot by creating a symlink to the timesyncd service in the multi-user.target directory:
$ sudo chroot /diskless/root
$ cd /etc/systemd/system/multi-user.target.wants
$ ln -s /lib/systemd/system/systemd-timesyncd.service
  • on the front end itself (c1iscex as a test) I did a "systemctl daemon-reload" to force it to reload the systemd config, and then restarted the client ("systemctl restart systemd-timesyncd")
  • checked the NTP synchronization with timedatectl:
controls@c1iscex:~ 0$ timedatectl 
      Local time: Thu 2021-08-26 11:35:10 PDT
  Universal time: Thu 2021-08-26 18:35:10 UTC
        RTC time: Thu 2021-08-26 18:35:10
       Time zone: America/Los_Angeles (PDT, -0700)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2021-03-14 01:59:59 PST
                  Sun 2021-03-14 03:00:00 PDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2021-11-07 01:59:59 PDT
                  Sun 2021-11-07 01:00:00 PST
controls@c1iscex:~ 0$ 

Note that it is now reporting "NTP enabled: yes" (the service is enabled to start at boot) and "NTP synchronized: yes" (synchronization is happening), neither of which it was reporting previously.  I also note that the systemd-timesyncd client service is now loaded and enabled, is no longer reporting that it is in an "Idle" state, is in fact reporting that it has synchronized to the proper server, and is logging updates:

controls@c1iscex:~ 0$ sudo systemctl status systemd-timesyncd
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Thu 2021-08-26 10:20:11 PDT; 1h 22min ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 2918 (systemd-timesyn)
   Status: "Using Time Server 192.168.113.201:123 (ntpserver)."
   CGroup: /system.slice/systemd-timesyncd.service
           └─2918 /lib/systemd/systemd-timesyncd

Aug 26 10:20:11 c1iscex systemd[1]: Started Network Time Synchronization.
Aug 26 10:20:11 c1iscex systemd-timesyncd[2918]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 26 10:20:11 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+26ppm
Aug 26 10:21:15 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 128s/-0.000s/0.000s/0.000s/+25ppm
Aug 26 10:23:23 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 256s/+0.001s/0.000s/0.000s/+26ppm
Aug 26 10:27:40 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 512s/+0.003s/0.000s/0.001s/+29ppm
Aug 26 10:36:12 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 1024s/+0.008s/0.000s/0.003s/+33ppm
Aug 26 10:53:16 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 2048s/-0.026s/0.000s/0.010s/+27ppm
Aug 26 11:27:24 c1iscex systemd-timesyncd[2918]: interval/delta/delay/jitter/drift 2048s/+0.009s/0.000s/0.011s/+29ppm
controls@c1iscex:~ 0$ 

So I think this means everything is working.

I then went ahead and reloaded and restarted the timesyncd services on the rest of the front ends.
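
To spot-check all of them at once, something like the loop below does the job (a sketch; the host list is the usual set of front ends and may be incomplete):

for fe in c1lsc c1sus c1ioo c1iscex c1iscey; do
    echo "== $fe =="
    ssh $fe "timedatectl | grep -E 'NTP (enabled|synchronized)'"
done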

We still need to confirm that everything comes up properly the next time we have an opportunity to reboot fb1 and the front ends (or the opportunity is forced upon us).

There was speculation that the NTP clients on the front ends (systemd-timesyncd) would not work on a read-only filesystem, but this doesn't seem to be true.  You can't trust everything you read on the internet.

  16309   Thu Sep 2 19:47:38 2021 KojiUpdateCDSThis week's FB1 GPS Timing Issue Solved

After the reboot, daqd_dc was not working, but manually starting the open-mx / mx services solved the issue:

sudo systemctl start open-mx.service
sudo systemctl start mx.service
sudo systemctl start daqd_*
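
To confirm everything actually came back (a sketch):

sudo systemctl status open-mx.service mx.service
sudo systemctl list-units 'daqd_*'    # all daqd_* units should show active/running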

 

  16310   Thu Sep 2 20:44:18 2021 KojiUpdateCDSChiara DHCP restarted

We had the issue of the RT machines rebooting. Once we hooked up a display on c1iscex, it turned out that it was not being assigned an IP address at boot-up.

I went to chiara and confirmed that the DHCP service was not running

~>sudo service isc-dhcp-server status
[sudo] password for controls:
isc-dhcp-server stop/waiting

So the DHCP service was manually restarted

~>sudo service isc-dhcp-server start
isc-dhcp-server start/running, process 24502
~>sudo service isc-dhcp-server status
isc-dhcp-server start/running, process 24502
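
If this recurs, the syslog should show why dhcpd died and whether it is set to start at boot (a sketch, assuming chiara keeps its logs in the standard /var/log/syslog):

grep -i dhcpd /var/log/syslog | tail -n 20
sudo service isc-dhcp-server status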

 

 

  16311   Thu Sep 2 20:47:19 2021 KojiUpdateCDSChiara DHCP restarted

[Paco, Tega, Koji]

Once chiara's DHCP was back, things became much more straightforward.
c1iscex and c1iscey were rebooted, and the IOPs were launched without any hesitation.

Paco ran rebootC1LSC.sh, and for the first time this year the processes launched without any issue.

  16321   Mon Sep 13 14:32:25 2021 YehonathanUpdateCDSc1auxey assembly

So we agreed that the RTN points on the c1auxex Acromag chassis should just be grounded to the local Acromag ground, since they only need a stable reference. Normally the RTNs are not connected to any ground, so there should be no danger of forming ground loops by doing this. It is probably best to use the common wire from the 15V power supplies since it also powers the VME crate. I took spectra of the ETMX OSEMs (attachment) for reference and am proceeding with the grounding work.

 

Attachment 1: ETMX_OSEMS_Noise.png
ETMX_OSEMS_Noise.png
  16325   Tue Sep 14 15:57:05 2021 jamieFrogsCDSfb1 /var full after reboot, caused all sorts of problems

/var on fb1 filled up today, which caused all sorts of CDS issues.  I found out about the problem by reading the logs of the services that were having trouble running: they complained about not being able to write to disk.  I looked at the filesystem status with 'df' and noticed that /var, where applications write temporary data, was full, which will always cause problems.
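
For future reference, a minimal sketch of how to track down what is filling /var (nothing here is specific to fb1):

df -h /var                                   # how full is it
sudo du -xsh /var/* 2>/dev/null | sort -h    # which directory is to blame
sudo du -sh /var/log/* | sort -h | tail      # largest individual log files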

I tracked the issue down to multiple multi-gigabyte log files: /var/log/messages and /var/log/messages.1.  They were full of lines like this one:

Aug 29 06:25:21 fb1 kernel: l called cmd = 1gpstime iotcl called cmd = 1gpstime iotcl called cmd = 1gpstime iotcl called cmd = 1 [... the same "gpstime iotcl called cmd = 1" string repeated without end, filling the file ...]

Seems like something related to the gpstime kernel module?

Anyway, I deleted the log files for now, which cleared up the space on /var.  Things should be back to normal now, until the logs fill up again...

  16327   Tue Sep 14 16:44:54 2021 jamieFrogsCDSfb1 /var full after reboot, caused all sorts of problems

Jonathan Hanks pointed me to this fix to the gpstime kernel module that was unfortunately put in after the 3.4 release that we're currently using:

https://git.ligo.org/cds/advligorts/-/commit/6f6d6e2eb1d3355d0cbfe9fe31ea3b59af1e7348

I hacked the source in place (/usr/src/gpstime-3.4/drv/gpstime/gpstime.c) to get the fix, and then rebuilt the kernel module with dkms :

sudo dkms uninstall gpstime/3.4
sudo dkms install gpstime/3.4

I then stopped daqd_dc, unloaded gpstime, reloaded it, and restarted daqd_dc.  The messages are no longer showing up in /var/log/messages, so I think we're OK for the moment.
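
The sequence was roughly (a sketch; the unit name follows the daqd_dc service mentioned above):

sudo systemctl stop daqd_dc
sudo rmmod gpstime
sudo modprobe gpstime
sudo systemctl start daqd_dc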

NOTE: the fix will be undone if for some reason we reinstall the advligorts-gpstime-dkms package.  There shouldn't be a need to do that, but we should be aware.  I'm discussing with Jonathan whether we want to push out a new Debian package to fix this issue...
