We finished the installation procedure on the c1auxey1 host machine. There were some adjustments that had to be made for Debian 10. The slow machine wiki page has been updated.
A test database file was made were all the channel names were changed from C1 to C2 in order to not interfere with the existing channels.
We starting testing the channels one by one to check the wiring and the EPICs software. We found some misswirings and fixed them.
Its getting late. I'll continue with the rest of the channels on Monday.
Notice that for all the AI channels the RTN was disconnected while testing.
When the cymac is started it gives me a list of channels shown below.
$ Initialized TP interface node=8, host=98e93ecffcca
$ Creating X1:DAQ-DC0_X1IOP_STATUS
$ Creating X1:DAQ-DC0_X1IOP_CRC_CPS
$ Creating X1:DAQ-DC0_X1IOP_CRC_SUM
$ Creating X1:DAQ-DC0_X1SUP_STATUS
$ Creating X1:DAQ-DC0_X1SUP_CRC_CPS
$ Creating X1:DAQ-DC0_X1SUP_CRC_SUM
But when I enter it into the Diaggui I get an error:
The following channel could not be found:
My guess is that need to connect to the Diaggui to something that can access those channels. I also need to figure out what those channels are.
It seemed like the BIO channels were not working, both the inputs and the outputs. The inputs were working on the windows machine though. That is, when we shorted the BIO channel to the return, or put 0V on it, we could see the LED turn on on the I/O testing screen and when we ramped up the voltage above 3 the LED turned off. This is the expected behavior from a sinking digital input. However, the EPICs caget didn't show any change. All the channels were stuck on Disabled.
We checked the digital outputs by connecting the channels to a fluke. Initially, the fluke showed 13V. We tried to toggle the digital output channels with caput and that didn't work. We checked the outputs with the windows software. For that, we needed to stop the Modbus. To our surprise, the windows software was not able to flip the channels either. We realized that this BIO Acromag unit is probably defective. We replaced it with a different unit and put a warning sticker on the defective unit. Now, the digital outputs were working as expected. When we turned them on the voltage output dropped to 0V. We checked the channels with the EPICs software. We realized that these channels were locked with the closed loop definition. We turned on the channels tied to these output channels (watchdog and toggles) and it worked. The output channels can be flipped with the EPICs software. We checked all the digital output channels and fixed some wiring issues along the way.
The digital input channels were still not working. This is a software issue that we will have to deal with later.
(Yehonathan) Rana noticed that the BNC leads on the chassis front panel didn't have isolation on them so I redid them with shrinking tubes.
With all the PCIe issues now resolved, yesterday I proceeded to build an IOP model for each of new FEs. I assigned them names and DCUIDs consist with the 40m convention, listed below. These models currently exist on only the cloned copy of /opt/rtcds running on the test stand. They will be copied to the main network disk later, once the new systems are fully tested.
The models compile and install successfully. The RCG runtime diagnostics indicate that all is working except for the timing synchronization and DAQD data transmission. This is as expected because neither of these have been set up yet.
The next step is to provide the 65 kHz clock signals from the timing fanout via LC optical fiber. I overlooked the fact that an SPX optical transceiver is required to interface the fiber to the timing slave board. These were not provided with the timing slaves we received. The timing slaves require a particular type of transceiver, 100base-FX/OC-3, which we did not have on hand. (For future reference, there is a handy list of compatible transceivers in E080541, p. 14.) I placed a Digikey order for two Finisar FTLF1217P2BTL, which should arrive within two days.
After a helpful meeting with Jon, we realized that I have somehow corrupted the sitemap file. So I am going to use the code Chris wrote to regenerate it.
Also, I am going to connect the controller using the IPC parts. The error that I was having before had to do with the IPC parts not being connected properly.
I added the IPC parts back to the plant model so that should be done now. It looks like this again here.
I can't seem to find the control model which should look like this. When I open sus_single_control.mdl, it just shows the C1_SUS_SINGLE_PLANT.mdl model. Which should not be the case.
When using mdl2adl I was getting the error:
$ cd /home/controls/mdl2adl
$ ./mdl2adl x1sup.mdl
error: set $site and $ifo environment variables
to set these in the terminal use the following commands:
$ export site=tst
$ export ifo=x1
On most of the systems, there is a script that automatically runs when a terminal is opened that sets these but that hasn't been added here so you must run these commands every time you open the terminal when you are using mdl2adl.
I copied c1scx.mdl to the docker to attach to the plant using the commands:
$ ssh nodus.ligo.caltech.edu
$ cd opt/rtcds/userapps/release/isc/c1/models/simPlant
$ scp c1scx.mdl controls@c1sim:/home/controls/docker-cymac/userapps
Today I brought and installed the new optical transceivers (Finisar FTLF1217P2BTL) for the two timing slaves. The timing slaves appear to phase-lock to the clocking signal from the master fanout. A few seconds after each timing slave is powered on, its status LED begins steadily blinking at 1 Hz, just as in the existing 40m systems.
However, some other timing issue remains unresolved. When the IOP model is started (on either FE), the DACKILL watchdog appears to start in a tripped state. Then after a few minutes of running, the TIM and ADC indicators go down as well. This makes me suspect the sample clocks are not really phase-locked. However, the models do start up with no error messages. Will continue to debug...
Did you match the local PC time with the GPS time?
Working with Chris, we decided that it is probably better to use a simple filter module as a controller before we make the model more complicated. I will use the plant model that I have already made (see attachment 1 of this). then attach a single control filter module to that: as seen in attachment 1. because I only want to work with one degree of freedom (position) I will average the four outputs which should give me the position. Then by feeding the same signal to all four inputs I should isolate one degree of freedom while still using the premade plant model.
The model I made that is shown in attachment 2 is the model I made from the plan. And it complies! yay! I think there is a better way to do the average than the way I showed. And since the model is feeding back on itself I think I need to add a delay which Rana noted a while ago. I think it was a UnitDelay (see page 41 of RTS Developer’s Guide). So I will add that if we run into problems but I think there is enough going on that it might already be delayed.
Since our model (x1sup_isolated.mdl) has compiled we can open the medm screens for it. I provide a procedure below which is based on Jon's post.
$ cd docker-cymac
$ eval $(./env_cymac)
$ medm -x /opt/rtcds/tst/x1/medm/x1sup_isolated/X1SUP_ISOLATED_GDS_TP.adl
To see a list of all medm screens use:
$ cd docker-cymac
# cd /opt/rtcds/tst/x1/medm/x1sup_isolated
Some of the other useful ones are:
See attachment 4. This screen shows the POS plant filter module that will be filled by the filter representing the transfer function of a damped harmonic oscillator:
THIS TF HAS BEEN UPDATED SEE NEXT POST
The first one of these screens that are of interest to us (shown in attachment 3) is the X1SUP_ISOLATED_GDS_TP.adl screen, which is the CDS runtime diagnostics screen. This screen tells us "the success/fail state of the model and all its dependencies." I am still figuring out these screens and the best guide is T1100625.
The next step is taking some data and seeing if I can see the position damp over time. To do this I need to:
The transfer function given in the previous post was slightly incorrect the units did not make sense the new function is:
I have attached a quick derivation below in attachment 1
The plant transfer function of the pendulum in the s domain is:
Using Foton to make a plot of the TF needed and using m=40kg, w0=3Hz, and Q=50 (See attachment 1). It is easiest to enter the above filter using RPoly and saved it as Plant_V1
The new HAM-A coil drivers have a single DB9 connector for all the binary inputs. This requires that the dewhitening switching signals from the fast system be spliced with the coil enable signals from c1auxey. There is a common return for all the binary inputs. To avoid directly connecting the grounds of the two systems, I have looked for a suitable opto-isolator for the c1auxey signals.
I best option I found is the Ocean Controls KTD-258, a 4-channel, DIN-rail-mounted opto-isolator supporting input/output voltages of up to 30 V DC. It is an active device and can be powered using the same 15 V supply as is currently powering both the Acromags and excitation. I ordered one unit to be trialed in c1auxey. If this is found to be good solution, we will order more for the upgrades of c1auxex and c1susaux, as required for compatibility with the new suspension electronics.
I have received the opto-isolator needed to complete the new c1auxey system. I left it sitting on the electronics bench next to the Acromag chassis.
Here is the manufacturer's wiring manual. It should be wired to the +15V chassis power and to the common return from the coil driver, following the instructions herein for NPN-style signals. Note that there are two sets of DIP switches (one on the input side and one on the output side) for selecting the mode of operation. These should all be set to "NPN" mode.
An update on recent progress in the lab towards building and testing the new FEs.
The previously reported problem with the IOPs losing sync after a few minutes (16130) was resolved through a change in BIOS settings. However, there are many required settings and it is not trivial to get these right, so I document the procedure here for future reference.
The CDS group has a document (T1300430) listing the correct settings for each type of motherboard used in aLIGO. All of the machines received from LLO contain the oldest motherboards: the Supermicro X8DTU. Quoting from the document, the BIOS must be configured to enforce the following:
• Remove hyper-threading so the CPU doesn’t try to run stuff on the idle core, as hyperthreading simulate two cores for every physical core.
• Minimize any system interrupts from hardware, such as USB and Serial Ports, that might get through to the ‘idled’ core. This is needed on the older machines.
• Prevent the computer from reducing the clock speed on any cores to ‘save power’, etc. We need to have a constant clock speed on every ‘idled’ CPU core.
I generally followed the T1300430 instructions but found a few adjustments were necessary for diskless and deterministic operation, as noted below. The procedure for configuring the FE BIOS is as follows:
After completing the BIOS setup, I rebooted the new FEs about six times each to make sure the configuration was stable (i.e., would never hang during boot).
With the timing issue resolved, I proceeded to build basic user models for c1bhd and c1sus2 for testing purposes. Each one has a simple structure where M ADC inputs are routed through IIR filters to an M x N output matrix, which forms linear signal combinations that are routed to N DAC outputs. This is shown in Attachment 1 for the c1bhd case, where the signals from a single ADC are conditioned and routed to a single 18-bit DAC. The c1sus2 case is similar; however the Contec BO modules still needed to be added to this model.
The FEs are now running two models each: the IOP model and one user model. The assigned parameters of each model are documented below.
The user models were compiled and installed following the previously documented procedure (15979). As shown in Attachment 2, all the RTS processes are now working, with the exception of the DAQ server (for which we're still awaiting hardware). Note that these models currently exist only on the cloned copy of the /opt/rtcds disk running on the test stand. The plan is to copy these models to the main 40m disk later, once the new FEs are ready to be installed.
I installed several new AA and AI chassis in the test stand to interface with the ADC and DAC cards. This includes three 16-bit AA chassis, one 16-bit AI chassis, and one 18-bit AI chassis, as pictured in Attachment 3. All of the AA/AI chassis are powered by one of the new 15V DC power strips connected to a bench supply, which is housed underneath the computers as pictured in Attachment 4.
These chassis have not yet been tested, beyond verifying that the LEDs all illuminate to indicate that power is present.
I was able to measure the transfer function of the plant filter module from the channel X1:SUP-C1_SUS_SINGLE_PLANT_Plant_POS_Mod_EXC to X1:SUP-C1_SUS_SINGLE_PLANT_Plant_POS_Mod_OUT. The resulting transfer function is shown below. I have also attached the raw data for making the graph.
Next, I will make a script that will make the photon filters for all the degrees of freedom and start working on the matrix version of the filter module so that there can be multiple degrees of freedom.
As Jon wrote we need to use the NPN configuration (see attachments). I tested the isolator channels in the following way:
1. I connected +15V from the power supply to the input(+) contact.
2. Signal wire from one of the digital outputs was connected to I1-4
3. When I set the digital output to HIGH, the LED on the isolator turns on.
4. I measure the resistance between O1-4 to output(-) and find it to be ~ 100ohm in the HIGH state and an open circuit in the LOW state, as expected from an open collector output.
Unlike the Acromag output, the isolator output is not pulled up in the LOW state. To do so we need to connect +15V to the output channel through a pull-up resistor. For now, I leave it with no pull-up. According to the schematics of the HAM-A Coil Driver, the digital output channels drive an electromagnetic relay (I think) so it might not need to be pulled up to switch back. I'm not sure. We will need to check the operation of these outputs at the installation.
During the testing of the isolator outputs pull-up, I accidentally ran a high current through O2, frying it dead. It is now permanently shorted to the + and - outputs rendering it unusable. In any case, we need another isolator since we have 5 channels we need to isolate.
I mounted the isolator on the DIN rail and started wiring the digital outputs into it. I connected the GND from the RTS to output(-) such that when the digital outputs are HIGH the channels in the coil driver will be sunk into the RTS GND and not the slow one avoiding GND contamination.
- Could you explain what is the blue thing in Attachment 1?
- To check the validity of the signal chain, can you make a diagram summarizing the path from the fast BO - BO I/F - Acromag - This opto-isolator - the coil driver relay? (Cut-and-paste of the existing schematics is fine)
I made a diagram (Attached). I think it explains the blue thing in the previous post.
I don't know what is the grounding situation in the RTS so I put a ground in both the coil driver and the RTS. Hopefully, only one of them is connected in reality.
I mounted the optoisolator on the DIN rail and connected the 3 first channels
to the optoisolator inputs 1,3,4 respectively. I connected the +15V input voltage into the input(+) of the optoisolator.
The outputs were connected to DB9F-2 where those channels were connected before.
I added DB9F-1 to the front panel to accept channels from the RTS. I connected the fast channels to connectors 1,2,3 from DB9F-1 to DB9F-2 according to the wiring diagram. The GND from DB9F-1 was connected to both connector 5 of DB9F-2 and the output (-).
I tested the channels: I connected a DB9 breakout board to DB9F-2. I measured the resistance between the RTS GND and the isolated channels while switching them on and off. In the beginning, when I turned on the binary channels the resistance was behaving weird - oscillating between low resistance and open circuit. I pulled up the channels through a 100Kohm resistor to observe whether the voltage behavior is reasonable or not. Indeed I observed that in the LOW state the voltage between the isolated channel and slow GND is 15V and 0.03V in the HIGH state. Then I disconnected the pull up from the channels and measured the resistance again. It showed ~ stable 170ohm in the HIGH state and an open circuit in the LOW state. I was not able to reproduce the weird initial behavior. Maybe the optoisolator needs some warmup of some sort.
We still need to wire the rest of the fast channels to DBF9-3 and isolate the channels in DBF9-4. For that, we need another optoisolator.
There is still an open issue with the BI channels not read by EPICS. They can still be read by the Windows machine though.
This RTS also use the BO interface with an opto isolator. https://dcc.ligo.org/LIGO-D1002593
Could you also include the pull up/pull down situations?
Here is an update and status report on the new BHD front-ends (FEs).
The changes to the FE BIOS settings documented in  do seem to have solved the timing issues. The RTS models ran for one week with no more timing failures. The IOP model on c1sus2 did die due to an unrelated "Channel hopping detected" error. This was traced back to a bug in the Simulink model, where two identical CDS parts were both mapped to ADC_0 instead of ADC_0/1. I made this correction and recompiled the model following the procedure in .
For lack of a better name, I had originally set up the user model on c1sus2 as "c1sus2.mdl" This week I standardized the name to follow the three-letter subsystem convention, as four letters lead to some inconsistency in the naming of the auto-generated MEDM screens. I renamed the model c1sus2.mdl -> c1su2.mdl. The updated table of models is below.
Renaming an RTS model requires several steps to fully propagate the change, so I've documented the procedure below for future reference.
On the target FE, first stop the model to be renamed:
controls@c1sus2$ rtcds stop c1sus2
Then, navigate to the build directory and run the uninstall and cleanup scripts:
controls@c1sus2$ cd /opt/rtcds/caltech/c1/rtbuild/release
controls@c1sus2$ make uninstall-c1sus2
controls@c1sus2$ make clean-c1sus2
Unfortunately, the uninstall script does not remove every vestige of the old model, so some manual cleanup is required. First, open the file /opt/rtcds/caltech/c1/target/gds/param/testpoint.par and manually delete the three-line entry corresponding to the old model:
If this is not removed, reinstallation of the renamed model will fail because its assigned DCUID will appear to already be in use. Next, find all relics of the old model using:
and manually delete each file and subdirectory containing the "sus2" name. Finally, rename, recompile, reinstall, and relaunch the model:
I used a tool developed by Chris, mdl2adl, to auto-generate a set of temporary sitemap/model MEDM screens. This package parses each Simulink file and generates an MEDM screen whose background is an .svg image of the Simulink model. Each object in the image is overlaid with a clickable button linked to the auto-generated RTS screens. An example of the screen for the C1BHD model is shown in Attachment 1. Having these screens will make the testing much faster and less user-error prone.
I generated these screens following the instructions in Chris' README. However, I ran this script on the c1sim machine, where all the dependencies including Matlab 2021 are already set up. I simply copied the target .mdl files to the root level of the mdl2adl repo, ran the script (./mdl2adl.sh c1x06 c1x07 c1bhd c1su2), and then copied the output to /opt/rtcds/caltech/c1/medm/medm_teststand. Then I redefined the "sitemap" environment variable on the chiara clone to point to this new location, so that they can be launched in the teststand via the usual "sitemap" command.
Currently, we are missing five 18-bit DACs needed to complete the c1sus2 system (the c1bhd system is complete). Since the first shipment, we have had no luck getting additional 18-bit DACs from the sites, and I don't know when more will become available. So, this week I took an inventory of all the 16-bit DACs available at the 40m. I located four 16-bit DACs, pictured in Attachment 2. Their operational states are unknown, but none were labeled as known not to work.
The original CDS design would call for 40 more 18-bit DAC channels. Between the four 16-bit DACs there are 64 channels, so if only 3/4 of these DACs work we would have enough AO channels. However, my search turned up zero additional 16-bit DAC adapter boards. We could check if first Rolf or Todd have any spares. If not, I think it would be relatively cheap and fast to have four new adapters fabricated.
DAQ network limitations and plan
To get deeper into the signal-integrity aspect of the testing, it is going to be critical to get the secondary DAQ network running in the teststand. Of all the CDS tools (Ndscope, Diaggui, DataViewer, StripTool), only StripTool can be used without a functioning NDS server (which, in turn, requires a functioning DAQ server). StripTool connects directly to the EPICS server run by the RTS process. As such, StripTool is useful for basic DC tests of the fast channels, but it can only access the downsampled monitor channels. Ian and Anchal are going to carry out some simple DAC-to-ADC loopback tests to the furthest extent possible using StripTool (using DC signals) and will document their findings separately.
We don't yet have a working DAQ network because we are still missing one piece of critical hardware: a 10G switch compatible with the older Myricom network cards. In the older RCG version 3.x used by the 40m, the DAQ code is hardwired to interface with a Myricom 10G PCIe card. I was able to locate a spare Myricom card, pictured in Attachment 3, in the old fb machine. Since it looks like it is going to take some time to get an old 10G switch from the sites, I went ahead and ordered one this week. I have not been able to find documentation on our particular Myricom card, so it might be compatible with the latest 10G switches but I just don't know. So instead I bought exactly the same older (discontinued) model as is used in the 40m DAQ network, the Netgear GSM7352S. This way we'll also have a spare. The unit I bought is in "like-new" condition and will unfortunately take about a week to arrive.
Since this Ocean Controls optoisolator has been shown to be compatible, I've gone ahead and ordered 10 more:
They are expected to arrive by Wednesday.
According to the BO interface circuit board https://dcc.ligo.org/D1001266, PCIN wires are connected to the coil driver and they are not pulled either way.
That means that they're either grounded or floating. I updated the drawing.
I looked into the issue that Yehonathan reported with the BI channels. I found the problem was with the .cmd file which sets up the Modbus interfacing of the Acromags to EPICS (/cvs/cds/caltech/target/c1auxey1/ETMYaux.cmd).
The problem is that all the channels on the XT1111 unit are being configured in Modbus as output channels. While it is possible to break up the address space of a single unit, so that some subset of channels are configured as inputs and another as outputs, I think this is likely to lead to mass confusion if the setup ever has to be modified. A simpler solution (and the convention we adopted for previous systems) is just to use separate Acromag units for BI and BO signals.
Accordingly, I updated the wiring plan to include the following changes:
So, one more Acromag XT1111 needs to be added to the c1auxey chassis, with the wiring changes as noted above. I have already updated the .cmd and EPICS database files in /cvs/cds/caltech/target/c1auxey1 to reflect these changes.
I added a new XT1111 Acromag module to the c1auxey chassis. I sanitized and configured it according to the slow machines wiki instructions.
Since all the spare BIOs fit one DB37 connector I didn't add another feedthrough and combined them all on one and the same DB37 connector. This was possible because all the RTNs of the BIOs are tied to the chassis ground and therefore need only one connection. I changed the wiring spreadsheet accordingly.
I did a lot of rewirings and also cut short several long wires that were protruding from the chassis. I tested all the wires from the feedthroughs to the Acromag channels and fixed some wiring mistakes.
Tomorrow I will test the BIs using EPICs.
Added difference to the graph. I included the code so that others could see what it looks like and use it for easy use.
I tested the digital inputs the following way: I connected a DB9 breakout to DB9M-5 and DB9M-6 where digital inputs are hosted. I shorted the channel under test to GND to turn it on.
I observed the channels turn from Disabled to Enabled using caget when I shorted the channel to GND and from Enabled to Disabled when I disconnected them.
I did this for all the digital inputs and they all passed the test.
I am still waiting for the other isolator to wire the rest of the digital outputs.
Next, I believe we should take some noise spectra of the Y end before we do the installation.
I have attached an updated transfer function graph with the residual easier to see. I thought here I would include a better explanation of what this transfer function was measuring.
This transfer function was mainly about learning how to use DTT and Foton to make and measure transfer functions. Therefore it is just measuring across a single CDS filter block. X1SUP_ISOLATED_C1_SUS_SINGLE_PLANT_Plant_POS_Mod block to be specific. This measurement shows that the block is doing what I programmed it to do with Foton. The residual is probably just because the measured TF had fewer points than the calculated one.
The next step is to take a closed-loop TF of the system and the control module.
After that, I want to add more degrees of freedom to the model. both in the plant and in the controls.
I checked the BI situation on the HAM-A coil driver. It seems like these are sinking BIs and indeed need to be isolated from the Acromag unit GND to avoid contamination.
The BIs will have to be isolated on a different isolator. Now, the wires coming from the field (red) are connected to the second isolator's input and the outputs are connected to the Acromag BI module and the Acromag's RTN.
I updated the wiring diagram (attached) and the wiring spreadsheet.
In the diagram, you can notice that the BI isolator (the right one) is powered by the Acromag's +15V and switched when the coil driver's GND is supplied. I am not sure if it makes sense or not. In this configuration, there is a path between the coil driver's GND and the Acromag's GND but its resistance is at least 10KOhm. The extra careful option is to power the isolator by the coil driver's +V but there is no +V on any of the connectors going out of the coil driver.
I installed an additional isolator on the DIN rail and wired the remaining BOs (C1:SUS-ETMY_SD_ENABLE, C1:SUS-ETMY_LR_ENABLE) through it to the DB9F-4 feedthrough. I also added DB9F-3 for incoming wires from the RTS and made the required connection from it to DB9F-4.
I tested the new isolated BOs using the Windows machine (after stopping Modbus). As before, I measure the resistance between pin 5 (coil driver GND) and the channel under test. When I turn on the BO I see the resistance drops from inf to 166ohm and back to inf when I turn it off. Both channels passed the test.
I have added more degrees of freedom. The model includes x, y, z, pitch, yaw, roll and is controlled by a matrix of transfer functions (See Attachment 2). I have added 5 control filters to individually control UL, UR, LL, LR, and side. Eventually, this should become a matrix too but for the moment this is fine.
Note the Unit delay blocks in the control in Attachment 1. The model will not compile without these blocks.
If my understanding is correct, the (photo receiving) NPN transistor of the optocoupler is energized through the acromag. The LED side should be driven by the coil driver circuit. It is properly done for the "enable mon" through 750Ohm and +V. However, "Run/Acquire" is a relay switch and there is no one to drive the line. I propose to add the pull-up network to the run/acquire outputs. This way all 8 outputs become identical and symmetric.
We should test the configuration if this works properly. This can be done with just a manual switch, R=750Ohm, and a +V supply (+18V I guess).
I updated the wiring diagram according to Koji's suggestion. According to the isolator manual, this configuration requires that the isolator input be configured as PNP.
Additionally, when the switch in the coil driver is open the LED in the isolator is signaling an on-state. Therefore, we might need to configure the Acromag to invert the input.
There are the Run/Aquire channels that we might need to add to the wiring diagram. If we do need to read them using slow channels, we will have to pull them up like the EnableMon channels to use them like in the wiring diagram.
I installed 2 additional isolators in the Acromag chassis. I set all the input channels to PNP. I ran the digital inputs (EnableMon channels) through these isolators according to the previous post.
I tested the digital inputs in the following way:
I connected an 18V voltage source to the signal wire under test through a 1Kohm resistor. I connected the GND of the voltage source to the RTN wire of the feedthrough. When the voltage source was connected, the LED on the isolator turned on and the EPICs channel under test was Enabled. When I disconnected the voltage source or shorted the signal wire to GND the LED on the isolator turned off and the EPICs channel showed a Disabled state.
Jon and I tested the ADC and DAC cards in both of the systems on the test stand. We had to swap out an 18-bit DAC for a 16-bit one that worked but now both machines have at least one working ADC and DAC.
[Still working on this post. I need to look at what is in the machines to say everything ]
Anchal and I wrote a script (Attachment 1) that will test the ADC and DAC connections with inputs on the INMON from -3000 to 3000. We could not run it because some of the channels seemed to be frozen.
import numpy as np
from traceback import print_exc
parser = argparse.ArgumentParser(
The channels on both the C1BHD and C1SUS2 seem to be frozen: they arent updating and are holding one value. To fix this Anchal and I tried:
I wonder if Jon has any ideas.
Anchal and I ran tests on the two systems (C1-SUS2 and C1-BHD). Attached are the results and the code and data to recreate them.
We connected one DAC channel to one ADC channel and thus all of the results represent a DAC/ADC pair. We then set the offset to different values from -3000 to 3000 and recorded the measured signal. I then plotted the response curve of every DAC/ADC pair so each was tested at least once.
There are two types of plots included in the attachments
1) a summary plot found on the last pages of the pdf files. This is a quick and dirty way to see if all of the channels are working. It is NOT a replacement for the other plots. It shows all the data quickly but sacrifices precision.
2) In an in-depth look at an ADC/DAC pair. Here I show the measured value for a defined DC offset. The Gain of the system should be 0.5 (put in an offset of 100 and measure 50). I included a line to show where this should be. I also plotted the difference between the 0.5 gain line and the measured data.
As seen in the provided plots the channels get saturated after about the -2000 to 2000 mark, which is why the difference graph is only concentrated on -2000 to 2000 range.
Summary: all the channels look to be working they all report very little deviation off of the theoretical gain.
Note: ADC channel 31 is the timing signal so it is the only channel that is wildly off. It is not a measurement channel and we just measured it by mistake.
Here is the final summary (from me) of where things stand with the new front-end systems. With Anchal and Ian's recent scripted loopback testing , all the testing that can be performed in isolation with the hardware on hand has been completed. We currently have no indication of any problem with the new hardware. However, the high-frequency signal integrity and noise testing remains to be done.
I detail those tests and link some DTT templates for performing them below. We have not yet received the Myricom 10G network card being sent from LHO, which is required to complete the standalone DAQ network. Thus we do not have a working NDS server in the test stand, so cannot yet run any of the usual CDS tools such as Diaggui. Another option would be to just connect the new front-ends to the 40m Martian/DAQ networks and test them there.
Due to the unavailablity of the 18-bit DACs that were expected from the sites, we elected to convert all the new 18-bit AO channels to 16-bit. I was able to locate four unused 16-bit DACs around the 40m , with three of the four found to be working. I was also able to obtain three spare 16-bit DAC adapter boards from Todd Etzel. With the addition of the three working DACs, we ended up with just enough hardware to complete both systems.
The final configuration of each I/O chassis is as follows. The full setup is pictured in Attachment 1.
This hardware provides the following breakdown of channels available to user models:
*The last channel of the first ADC is reserved for timing diagnostics.
The chassis have been closed up and their permanent signal cabling installed. They do not need to be reopened, unless future testing finds a problem.
An IOP model has been created for each system reflecting its final hardware configuration. The IOP models are permanent and system-specific. When ready to install the new systems, the IOP models should be copied to the 40m network drive and installed following the RCG-compilation procedure in . Each system also has one temporary user model which was set up for testing purposes. These user models will be replaced with the actual SUS, OMC, and BHD models when the new systems are installed.
The current RCG models and the action to take with each one are listed below:
Each front-end can support up to four user models.
Recently, the CDS group has released a well-documented procedure for testing General Standards ADC and DACs: T2000188. They've also automated the tests using a related set of shell scripts (T2000203). Unfortnately I don't believe these scripts will work at the 40m, as they require the latest v4.x RCG.
However, there is an accompanying set of DTT templates that could be very useful for accelerating the testing. They are available from the LIGO SVN (log in with username: "first.last@LIGO.ORG"). I believe these can be used almost directly, with only minor updates to channel names, etc. There are two classes of DTT-templated tests:
The T2000188 document contains images of normal/passing DTT measurements, as well as known abnormalities and failure modes. More sophisticated tests could also be configured, using these templates as a guiding example.
Due to the unexpected change from 18- to 16-bit AO, we are now short on several pieces of hardware:
I have looked at my code from the previous plot of the transfer function and realized that there is a slight error that must be fixed before we can analyze the difference between the theoretical transfer function and the measured transfer function.
The theoretical transfer function, which was generated from Photon has approximately 1000 data points while the measured one has about 120. There are no points between the two datasets that have the same frequency values, so they are not directly comparable. In order to compare them I must infer the data between the points. In the previous post  I expanded the measured dataset. In other words: I filled in the space between points linearly so that I could compare the two data sets. Using this code:
#make values for the comparison
tck_mag = splrep(tst_f, tst_mag) # get bspline representation given (x,y) values
gen_mag = splev(sim_f, tck_mag) # generate intermediate values
for x in range(len(gen_mag)):
dif_mag.append(gen_mag[x]-sim_mag[x]) # measured minus predicted
tck_ph = splrep(tst_f, tst_ph) # get bspline representation given (x,y) values
gen_ph = splev(sim_f, tck_ph) # generate intermediate values
for x in range(len(gen_ph)):
At points like a sharp peak where the measured data set was sparse compared to the peak, the difference would see the difference between the intermediate “measured” values and the theoretical ones, which would make the difference much higher than it really was.
To fix this I changed the code to generate the intermediate values for the theoretical data set. Using the code here:
tck_mag = splrep(sim_f, sim_mag) # get bspline representation given (x,y) values
gen_mag = splev(tst_f, tck_mag) # generate intermediate values
for x in range(len(tst_mag)):
dif_mag.append(tst_mag[x]-gen_mag[x])#measured minus predicted
tck_ph = splrep(sim_f, sim_ph) # get bspline representation given (x,y) values
gen_ph = splev(tst_f, tck_ph) # generate intermediate values
for x in range(len(tst_ph)):
Because this dataset has far more values (about 10 times more) the previous problem is not such an issue. In addition, there is never an inferred measured value used. That makes it more representative of the true accuracy of the real transfer function.
This is an update to a previous plot, so I am still using the same data just changing the way it is coded. This plot/data does not have a Q of 1000. That plot will be in a later post along with the error estimation that we talked about in this week’s meeting.
The new plot is shown below in attachment 1. Data and code are contained in attachment 2
Following Koji's channel list review, we made changes to the wiring spreadsheet.
Today, I made the changes real in the Acromag chassis. I went through the channel list one by one and made sure it is wired correctly. Additionally, since we now need all the channels the existing isolators have, I replaced the isolator with the defective channel with a new one.
The things to do next:
1. Create entries for the spare coil driver and satellite box channels in the EPICs DB.
2. Test the spare channels.
I edited /cvs/cds/caltech/target/c1auxey1/ETMYaux.db (after creating a backup) and added the spare coil driver channels.
I tested those channels using caget while fixing wiring issues. The tests were all succesful. The digital output channel were tested using the Windows machine since they are locked by some EPICs mechanism I don't yet understand.
One worrying point is I found that the differential analog inputs to be unstable unless I connected a reference to some stable voltage source unlike previous tests showed. It was unstable (but less) even when I connected the ref to the ground connectors on the power supplies on the workbench. This is really puzzling.
When I say unstable I mean that most of the time the voltage reading shows the right value, but occasionly there is a transient sharp volage drop of the order of 0.5V. I will do a more quantitative analysis tomorrow.
To simulate a differential output I used two power supplies connected in series. The outer connectors were used as the outputs and the common connector was connected to the ground and used as a reference. I hooked these outputs to one of the differential analog channels and measured it over time using Striptool. The setup is shown in attachment 3.
I tested two cases: With reference disconnected (attachment 1), and connected (attachment 2). Clearly, the non-referred case is way too noisy.
I redid the differential input experiment using the DS360 function generator we recently got. I generated a low frequency (0.1Hz) sine wave signal with an amplitude 0.5V and connected the + and - output to a differential input on the new c1auxcey Acromag chassis. I recorded a time series of the corresponding EPICS channel with and without the common on the DS360 connected to the Ref connector on the Acromag unit. The common connector on the DS360 is not normally grounded (there is a few tens of kohms between the ground and common connectors). The attachment shows that, indeed, the analog input readout is extremely noisy with the Ref being disconnected. The point where the Ref was connected to common is marked in the picture.
Conclusion: Ref connector on the analog input Acromag units must be connected to some stable voltage source for normal operation.
[koji, ian, tega, paco]
With the remote/local assistance of Tega/Ian last friday I made changes on the c1sus model by connecting the C1:ASC model outputs (found within a block in c1ioo) to the BS and PRM suspension inputs (pitch and yaw). Then, Koji reviewed these changes today and made me notice that no changes are actually needed since the blocks were already in place, connected in the right ports, but the model probably just wasn't rebuilt...
So, today we ran "rtcds make", "rtcds install" on the c1ioo and c1sus models (in that order) but the whole system crashed. We spent a great deal of time restarting the machines and their processes but we struggled quite a lot with setting up the right dates to match the GPS times. What seemed to work in the end was to follow the format of the date in the fb1 machine and try to match the timing to the sub-second level. This is especially tricky when performed by a human action so the whole task is tedious. We anyways completed the reboot for almost all the models except the c1oaf (which tends to make things crashy) since we won't need it right away for the tasks ahead. One potential annoying issue we found was in manually rebooting c1iscey because one of its network ports is loose (the ethernet cable won't click in place) and it appears to use this link to boot (!!) so for a while this machine just wasn't coming back up.
Finally, as we restored the suspension controls and reopened the shutters, we noticed a great deal of misalignment to the point no reflected beam was coming back to the RFPD table. So we spent some time verifying the PRM alignment and TT1 and TT2 (tip tilts) and it turned out to be mostly the latter pair that were responsible for it. We used the green beams to help optimize the XARM and YARM transmissions and were able to relock the arms. We ran ASS on them, and then aligned the PRM OpLevs which also seemed off. This was done by giving a pitch offset to the input PRM oplev beam path and then correcting for it downstream (before the qpd). We also adjusted the BS OpLev in the end.
Summary; the ASC BS and PRM outputs are now built into the SUS models. Let the AS WFS loops be closed soon!
Addenda by KA
- Upon the RTS restarting,
sudo date --set='xxxxxx'
rtcds start c1x01
telnet fb1 8083
- Today we once succeeded to restart the vertex machines. However, the RFM signal transmission did fail. So the end two machines were power cycled as well as c1rfm, but this made all the machines in RED again. Hell...
- We checked the PRM oplev. The spot was around the center but was clipped. This made us so confused. Our conclusion was that the oplev was like that before the RTS reboot.
I tried to read a bit and understand the NTP synchronization implementation in FE computers. I'm quite sure that NTP synchronization should be 'yes' if timesyncd are running correctly in the output of timedatectl in these computers. As Koji reported in 15791, this is not the case. I logged into c1lsc, c1sus and c1ioo and saw that RTC has drifted from the software clocks too which does not happen if NTP synchronization was active. This would mean that almost certainly, if the computers are rebooted, the synchronization will be lost and the models will fail to come online.
My current findings are the following (this should be documented in wiki once we setup everything):
The solution to this issue could be as simple as just defining ntpserver in the name server list. But I'm not sure if my understanding of this issue is correct. Comments/suggestions are welcome for future steps.
126.96.36.199 looks like Caltech's NTP server (ntp-02.caltech.edu)
I can't say it is correct or not as I did not make the survey at your level. I think you need a few tests of reconfiguring and restarting the NTP clients to see if time synchronization starts. Because the local time is not regulated right now anyway, this operation is safe I think.
I added ntpserver as a known host name for address 192.168.113.201 (fb1's address where ntp server is running) in the martian host list in the following files in Chiara:
Note: a host name called ntp was already defined at 192.168.113.11 but I don't know what computer this is.
Then, I restarted the DNS on chiara by doing:
sudo service bind9 restart
Then I logged into c1lsc and c1ioo and ran following:
controls@c1ioo:~ 0$ sudo systemctl restart systemd-timesyncd.service
controls@c1ioo:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
Active: active (running) since Fri 2021-08-20 07:24:03 UTC; 53s ago
Main PID: 23965 (systemd-timesyn)
Aug 20 07:24:03 c1ioo systemd: Starting Network Time Synchronization...
Aug 20 07:24:03 c1ioo systemd: Started Network Time Synchronization.
Aug 20 07:24:03 c1ioo systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:24:35 c1ioo systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
controls@c1ioo:~ 0$ timedatectl
Local time: Fri 2021-08-20 07:25:28 UTC
Universal time: Fri 2021-08-20 07:25:28 UTC
RTC time: Fri 2021-08-20 07:25:31
Time zone: Etc/UTC (UTC, +0000)
NTP enabled: yes
NTP synchronized: no
RTC in local TZ: no
DST active: n/a
The same output is shown in c1lsc too. The NTP synchronized flag in output of timedatectl command did not change to yes and the RTC is still 3 seconds ahead of the local clock.
Then I went to c1sus to see what was the status output before rstarting the timesyncd service. I got folloing output:
controls@c1sus:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
Active: active (running) since Tue 2021-08-17 04:38:03 UTC; 3 days ago
Main PID: 243 (systemd-timesyn)
Aug 20 02:02:18 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 02:36:27 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:10:35 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:44:43 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:18:51 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:53:00 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 05:27:08 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:01:16 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:35:24 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:09:33 c1sus systemd-timesyncd: Using NTP server 192.168.113.201:123 (ntpserver).
This actually shows that the service was able to find ntpserver correctly at 192.168.113.201 even before I changed the name server file in chiara. So I'm retracting the changes made to name server. They are probably not required.
The configuration files for timesynd.conf are read only even with sudo. I tried changing permissions but that did not work either. Maybe these files are not correctly configured. The man page of timesyncd says to use field 'NTP' to give the ntp servers. Our files are using field 'Servers'. But since we are not getting any error message, I don't think this is the issue here.
I'll look more into this problem.
I read on some stack exchange that 'NTP synchornized' indicator turns 'yes' in the output of command timedatectl only when RTC clock has been adjusted at some point. I also read that timesyncd does not do the change if the time difference is too much, roughly more than 3 seconds.
So I logged into all FE machines and ran sudo hwclock -w to synchronize them all to the system clocks and then waited if the timesyncd does any correction on RTC. It did not. A few hours later, I found the RTC clocks drifitng again from the system clocks. So even if the timesynd service is running as it should, it si not performing time correction for whatever reason.
Maybe we should try to use some other service?
The NTP synchronized flag in output of timedatectl command did not change to yes and the RTC is still 3 seconds ahead of the local clock.