ID Date Author Type Category Subject
  14939   Fri Oct 4 01:57:09 2019   Koji   Update   CDS   c1iscaux testing

The AA filter for ASDC was fixed.

== Test Status ==

[done] Whitening gain switching test
[done] AA enable/disable switching
[0th order] LO Det Mon channel check
[none] PD I/F board check
[done] QPD I/F board check
[none] CM Board
[none] ALS I/F board


The AA filter for the 4th section of the LSC analog electronics bank (D000076) was pulled out for the test. On the workbench, the questionable CH8 was checked. It turned out that the filter amplifier module for the 8th-order elliptic filter at 7.5kHz was not working properly and exhibited unusual attenuation. This filter module (Frequency Devices Inc D68L8E-7.50kHz) was desoldered and replaced with a module from a spare board. Note that Gautam and I had tried to use this spare board instead of the current one, but it didn't give us any signal for an unknown reason. Since the desoldering required a lot of force and risked damaging the PCB, a socket was made from an IC socket (see Attachment 1). With this change, CH8 functions the same as the other channels.


I took this opportunity to check the performance of the AA filters. For each channel, the input signal was injected from J3 using a pomona clip. The output was taken from pins 1, 5, 9, ... of J2. This is the + side of the differential output. The - side has equivalent performance but the opposite signal polarity. The digital signals for the AA bypass switches were not connected. Fortunately, this was just fine, as it left the anti-aliasing filters engaged.

Attachment 2 shows the transfer functions of all the channels. All the channels showed an identical response (at least visually). The transfer function for CH1 was fitted by LISO. The ZPK values are listed here:

pole 5.2860544577k 503.1473053928m
pole 5.9752193716k 1.0543411596
pole 8.9271953580k 3.5384364788
pole 8.2181747850k 3.4220607928
pole 182.1403534923k 1.1187869426 # This has almost no effect
zero 13.5305051680k 423.6130434049M
zero 15.5318357741k 747.6895990654k
zero 23.1746351749k 1.5412966100M


factor 989.1003181564m
delay 24.4846075283n
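
As a rough cross-check, the fitted parameters above can be turned back into a frequency response. The sketch below is not LISO itself. It assumes the usual reading of these lines (each pole/zero is a frequency in Hz followed by a Q, factor is the overall gain, delay is a pure time delay in seconds), the values are rounded, and the helper names are made up for illustration.

```python
import numpy as np

# (frequency [Hz], Q) pairs transcribed from the fit above, rounded
POLES = [(5.2860545e3, 0.50314731), (5.9752194e3, 1.0543412),
         (8.9271954e3, 3.5384365), (8.2181748e3, 3.4220608),
         (182.14035e3, 1.1187869)]
ZEROS = [(13.530505e3, 423.61304e6), (15.531836e3, 747.68960e3),
         (23.174635e3, 1.5412966e6)]
FACTOR = 0.98910032
DELAY = 24.484608e-9  # seconds


def second_order(f, f0, q):
    """Response of one (f0, Q) pair: 1 + j f/(f0 Q) - (f/f0)^2."""
    x = np.asarray(f, dtype=float) / f0
    return 1.0 + 1j * x / q - x**2


def aa_response(f):
    """Complex response of the assumed model at frequency f [Hz]."""
    f = np.asarray(f, dtype=float)
    h = FACTOR * np.exp(-2j * np.pi * f * DELAY)
    for f0, q in ZEROS:
        h = h * second_order(f, f0, q)
    for f0, q in POLES:
        h = h / second_order(f, f0, q)
    return h


if __name__ == "__main__":
    # Print the magnitude at a few arbitrary frequencies
    for freq in (100.0, 1e3, 5e3, 7.5e3, 13.530505e3, 30e3):
        mag_db = 20 * np.log10(abs(aa_response(freq)))
        print(f"{freq:10.1f} Hz  {mag_db:8.2f} dB")
```

With this reading, the very high Q of the fitted zeros simply means that they act as notches sitting essentially on the frequency axis, as expected for an elliptic filter.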

Attachment 3 shows the ASD of the output voltage noise measurement. Note the input was shorted for this measurement. The nominal output voltage was found to be 0.1 uV/rtHz and the 1/f noise corner freq was about 100Hz. Only CH3 showed a deviation from the typical values. It looks like this is neither an artifact nor transient noise. Fortunately, nothing is connected to this channel right now.

  14941   Fri Oct 4 22:22:03 2019   gautam   Update   CDS   Final incarnation of latch.py

[KA, GV]

This elog is meant to be a summary of some of the many subtleties on the CM board. The latest schematic of the version used at the 40m can be found at D1500308.

Latch logic:

  • There are several Binary Outputs and one Binary Input to the CM board.
  • The outputs control ENABLE/DISABLE switches and gains of amplifier stages, while the input reports whenever the limiter has been reached.
  • The variable gain feature is implemented by enabling/bypassing several cascaded fixed gain stages. So in order to change the gain of a single composite amplifier stage, multiple individual amplifier stages have to be switched.
  • This is implemented by the user interacting with the hardware via a "control word", consisting of a number of bits depending on the number of cascaded stages that have to be switched. 
  • This control word is sent to the device via modbus EPICS, which is an asynchronous communication protocol. Hence, it may be that the individual bits composing the control word get switched asynchronously. This would be disastrous, as there can be transient glitches in the gain of the stage being controlled. 
  • To protect against such problems, there is a latch IC in the hardware between the Binary Inputs to the board (= Binary Outputs from Acromags), and the actual switches (= MAX333) that enable/bypass the cascaded gain stages. The latch IC used is a SN74ALS573. This device acts as a bus, which either transmits changes of multiple bits (= our control word) or blocks them from propagating, depending on the state of a single bit (= the LATCH ENABLE bit). Thus, by controlling a single bit, we can guarantee that multiple bits get switched synchronously.
  • In order to use this latch capability, we need some software logic that sets/disables the LATCH ENABLE bit. For our system, this logic is implemented in the form of a continuously running python 🐍 script, located at /cvs/cds/caltech/target/c1iscaux/latch.py. It is implemented as a systemctl service on the c1iscaux Supermicro. The logic implemented in this script is shown in Attachment #1. While the channels referred to in that attachment are for REFL1_GAIN, the same logic is implemented for REFL2_GAIN, AO_GAIN, and the SuperBoosts.
  • Some FAQ:
    1. Q: Why do we need the soft channels C1:LSC-REFL1_SET_LSB and C1:LSC-REFL1_SET_MSB?
      A: These soft channels are what is physically linked to the Acromag Binary Outputs. In order for our latch logic to be effective, we need to detect when the user asks for a change, and then disable the LATCH ENABLE bit (which is on by default, see FAQ #3) before changing the physical acromag channels. The soft channels form the protective layer between the user and the hardware, allowing latch.py to function.
    2. Q: Why is there an "_MSB" and "_LSB" soft channel? 
      A: This has to do with the mbboDirect EPICS channel type, which is used to control the multiple bits in our control word using a single input (= an MEDM gain slider). The mbboDirect data-type requires the bits it controls to have consecutive hardware addresses. However, the Acromag hardware addressing scheme is not always compatible with this requirement (see pg 33 of the manual for why this is the case). Hence, we have to artificially break up the control word into two separate control words compatible with the Acromag addressing scheme. This functionality is implemented in latch.py.
    3. Q: Why is the default state of LATCH ENABLE set to ON? 
      A: This has to do with the fact that all Binary Inputs, not just the multi-bit ones, to the CM board are propagated to the control hardware via a latch IC. For the single-bit channels, there is no requirement that the switching be synchronous. Hence, rather than setting up ~10 more single-bit soft channels and detecting changes before propagating them, we decided to leave the LATCH ENABLE ON by default, and only disable it when changing the multi-bit gain channels. This is the same way the logic was implemented in the VME state code, and we think that there are no logic reasons why it would fail. But if someone comes up with something, we can change the logic.
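
The sequence described above can be sketched as follows. This is not the actual latch.py: the EPICS layer is replaced by an in-memory stand-in so the example runs anywhere, the split point of the control word is a made-up number, and the latch polarity is kept as two named constants (the values follow a later entry in this log, which reports that 0 lets changes through and 1 holds the bits).

```python
# Stand-in for the EPICS layer; a real script would read and write
# process variables instead of this dictionary.
class FakeChannels:
    def __init__(self):
        self.values = {}
        self.history = []  # ordered record of every write

    def put(self, name, value):
        self.values[name] = value
        self.history.append((name, value))


# Channel names are taken from this log; bit widths and polarity are assumptions.
LATCH = "C1:LSC-CM_LATCH_ENABLE"
LSB_OUT = "C1:LSC-REFL1_SET_LSB"
MSB_OUT = "C1:LSC-REFL1_SET_MSB"
N_LSB_BITS = 2   # hypothetical split point of the control word
LATCH_PASS = 0   # latch passes its inputs through to the switches
LATCH_HOLD = 1   # latch holds its outputs


def split_word(word, n_lsb_bits=N_LSB_BITS):
    """Split a control word into (lsb_part, msb_part)."""
    mask = (1 << n_lsb_bits) - 1
    return word & mask, word >> n_lsb_bits


def apply_gain_word(ch, word):
    """Change a multi-bit control word without exposing intermediate states."""
    lsb, msb = split_word(word)
    ch.put(LATCH, LATCH_HOLD)  # freeze the switches at their present state
    ch.put(LSB_OUT, lsb)       # these two writes may land at different times
    ch.put(MSB_OUT, msb)
    ch.put(LATCH, LATCH_PASS)  # release: all bits reach the switches together


if __name__ == "__main__":
    channels = FakeChannels()
    apply_gain_word(channels, 0b10110)
    for name, value in channels.history:
        print(name, value)
```

Leaving the latch in the pass-through state between updates is what allows the single-bit switches to work without soft channels of their own, as explained in FAQ #3.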

Acromag BIO testing:

During my bench testing of the Acromag chassis, I had not yet figured out mbboDirect and the latch logic, so I did not fully verify the channel mapping (= wiring inside the Acromag box), and whether the switching behavior was consistent with what we expect. Koji and I verified (using the LED tester breakout board) that all the channels have the expected behavior 👏. Note that this is only a certification at the front-panel DB37 connectors of the Acromag chassis; testing of the integrated electronics chain including the CM board is in progress...

  14942   Sat Oct 5 00:03:21 2019   Koji   Update   CDS   c1iscaux testing

[Gautam, Koji]

The input gain part of the CM servo board D1500308 was tested. A couple of problems were detected. One still remains.

== Test Status ==

[done] Whitening gain switching test
[done] AA enable/disable switching
[0th order] LO Det Mon channel check
[none] PD I/F board check
[done] QPD I/F board check
[in progress] CM Board
[none] ALS I/F board


We started to test the CM Servo board from the input stages. Initially, DC offsets were provided to IN1 and IN2 to check the gain on the oscilloscope or a StripTool plot. However, the results were confusing, so AC measurements with the SR785 were carried out in the end. It turned out that both IN1 and IN2 had some issues. IN1 showed a gain increment of 2dB every two gain steps, suggesting that the 1dB gain stage had a problem. IN2 showed a sudden drop of the signal at gains of +8~+15dB and +24~+31dB, suggesting that a particular 8dB stage had a problem. The board was exposed with the extender and we started tracing the signals.

CH1: The digital signal to switch the 1dB stage reached Pin 1A of the DIN96 connector. However, the latch logic (U47 74ALS573) does not spit out the corresponding level for this bit. Note that the next bit was working properly. We concluded that this 74ALS573 had failed and needs to be replaced. We have no spare of this wide SOIC-20 chip, but Downs seems to have some spares (see Todd's spare parts list). We will try to get the chip on Monday.

CH2: The only stage used between +8dB and +15dB and between +24dB and +31dB is the +8dB stage (U9 and U2A). I found that the amplified output signal did not reach the FET switch U2A (MAX333A). Therefore it was concluded that the opamp U9 (AD829) had an issue. In fact, the amp itself was working, but the output pin was not properly soldered to the pad. Resoldering this chip resolved the issue. Note that this particular channel has some OP27s soldered instead of AD829s. Gautam mentioned that there was some action on the board a few years back to deal with the offset issue. Next time the board is pulled out, I'll take photos of the board.


Using the SR785, swept sine measurements between 100Hz and 100kHz were taken for all the gain settings for each channel. Between -31dB and -11dB, an input signal amplitude of 300mV was used. Between -10dB and +10dB, it was reduced to 100mV. For the rest, the amplitude was 10mV. Note that the data for +11dB for CH1 and +2dB for CH2 are missing, presumably due to a data transfer issue.

The results are shown in Attachments 1~4.

Attachments 1 and 3 show the gain at each slider value. The measured gain is represented by the average between 1kHz and 10kHz. The missing 1dB every two slider values is seen for CH1. The phase delay at 100kHz is shown in the lower plot. There is some delay and delay variation, but it is in fact less than 1deg at 10kHz (see later), so its effect on the CM servo (IMC AO path) is minimal. The gain for CH2 tracks the slider value nicely. The phase delay is larger than that of CH1, as expected because of the OP27s.

Attachments 2 and 4 show the transfer functions. The slider value was subtracted from the measured gain magnitude to indicate the deviation between them. The missing 1dB is clearly visible for CH1, in addition to the overall gain offset of ~0.2dB. CH2 also shows a gain offset of 0.1dB~0.2dB. The phase delay comes into play around 20kHz, particularly at higher gains, where the UGF of the AO path is.


gautam: Here is the elog thread for IN2 opAmps going AD829-->OP27. Also, I guess Attachment #1 and #3 x-axes should be "Gain [dB]" rather than "Frequency [Hz]".

  14947   Tue Oct 8 03:19:14 2019   Koji   Update   CDS   Final incarnation of latch.py

Now that the CM board has been tested with an injected signal, it turned out that the latch logic was flipped. As the default state locked the digital levels, the buttons other than the mbbo channels were inactive.

Giving 0 to C1:LSC-CM_LATCH_ENABLE enables modification of the digital state. With the value of 1, the digital bits on the board are locked.

To reflect this, latch.py was modified, and now the controls are all activated.

  14948   Tue Oct 8 03:32:42 2019   Koji   Update   CDS   CM servo board testing

[Koji]

The 74ALS573 logic chips were replaced, and now the gain sliders are working properly.

== Test Status ==

[done] Whitening gain switching test
[done] AA enable/disable switching
[0th order] LO Det Mon channel check
[none] PD I/F board check
[done] QPD I/F board check
[done] CM Board
[none] ALS I/F board


Last week we found that the logic chip for the REFL1 gain switching was not transmitting the input logic. I went to Downs and obtained the chips. After some inspection, some other latch chips also looked suspicious. Therefore U46, U47, and U48 (#1, #3, and #4 from the top) were replaced. After the replacement, the gain measurements were repeated. This time the test for the AO gain was also performed. Now all three sliders show the gain as expected except for the consistent -0.2dB deficit.

Note that the transfer functions for the REFL gains were measured with the input at IN1 or IN2 and the output at TESTA1. The TFs for the AO gain were measured with the excitation at EXC B, the input at TESTB2 and the output at the SERVO output. The gain and phase variations for the AO gain at low frequency are the effect of the AC coupling between the excitation and the servo output.

[Update on Oct 14, 2019]

The measured transfer functions show the phase delay determined by the opamps involved. The phase delay well below the pole frequencies can be represented well by a simple time delay (a phase delay linear in frequency). Attachment 7 shows the time delay estimated by LISO for each gain setting of each gain stage. REFL2 has a particularly large phase delay because of the use of OP27s. The delay is even larger when the gain is high, presumably because of the limited GBW.
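
For reference, the relation between a pure time delay and the phase it produces is phase [deg] = -360 * f * tau. A minimal sketch follows; the 100 ns used in the usage lines is an arbitrary illustrative number, not one of the fitted delays.

```python
def delay_phase_deg(freq_hz, delay_s):
    """Phase in degrees of a pure time delay: -360 * f * tau."""
    return -360.0 * freq_hz * delay_s


def delay_from_phase(freq_hz, phase_deg):
    """Inverse relation: the time delay that gives phase_deg at freq_hz."""
    return -phase_deg / (360.0 * freq_hz)


if __name__ == "__main__":
    tau = 100e-9  # hypothetical delay, for illustration only
    for f in (1e3, 10e3, 100e3):
        print(f"{f:9.0f} Hz  {delay_phase_deg(f, tau):8.3f} deg")
```

Because the phase scales linearly with frequency, a delay that is negligible at 10kHz costs ten times more phase at 100kHz.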

  14953   Tue Oct 8 17:59:29 2019   Koji   Update   CDS   CM servo board testing (portal)

== Test Status ==

[done] Whitening gain switching test
[done] AA enable/disable switching
[0th order] LO Det Mon channel check
[none] PD I/F board check
[done] QPD I/F board check
[done] CM Board
[none] ALS I/F board


The photos of the latest board can be found as Attachments 3/4

With some input signals, the functionalities of the CM servo switches were tested.

  • Latch logic works. But latch alive signal is missing.
  • IN1 enable/disable, IN2 enable/disable are properly working
  • OUT2 toggle switch for REFL1/REFL2 mon is working
  • Boost / Super Boosts are working
  • EXC A enable/disable, EXC B enable/disable switches are working
  • Option 1 and Option 2 now isolate the input when either is enabled (as there is no option board)
  • 79Hz-1.6kHz pole zero pair works fine
  • OUT1 works fine
  • Disable/Enable switch for the fast path works
  • Polarity switch works
  • AO Gain properly changes the gain
  • Limiter switch works (Attachments 4/5). The limiter clips the output at 4~4.5V. The Limiter indicator also works.

After the tests the LSC cables were reconnected (Attachment 6)

  14955   Tue Oct 8 18:42:39 2019   Koji   Update   CDS   CM servo board testing

The boost filters of the CM servo board were tested. Their ZPK models were made.


The transfer functions of the boost filters were measured with the SG output of a SR785 connected to IN1. The IN1 gain was set to be 0dB. The transfer function was taken between the IN1 input and the TEST1A output.
With no boost and normal boost, the input signal amplitude was fixed to 20mVpk. For the other boosts, however, I expected a large gain variation through a single sweep. Therefore automatic SG amplitude tracking was used. The target was to have the output at 1V with a maximum source amplitude of 100mV.

Attachment 1 shows the measured transfer functions.

The pole and zero frequencies of the boosts were estimated using LISO. Here the TFs were normalized by the TF of 'no boost' to cancel the delay of the other stages including that of the monitor channel.

 

ZPK model of Normal Boost:

pole 44.0597566447
zero 4.3927650910k

factor 98.8275377818

 

ZPK model of Super Boost (State1):

pole 878.5368382789
zero 17.5107366335k
factor 20.0840668188

 

ZPK model of Super Boost (State2):

pole 714.8112014271
pole 1.0147609373k
zero 13.2470941080k
zero 22.2259701828k

factor 404.5411036031
 

ZPK model of Super Boost (State3):

pole 886.3650348470
pole 420.4089305781
pole 887.8490768202
zero 8.3635166134k
zero 15.7953592754k
zero 20.5144907279k

factor 8.2051379423k
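
As a quick sanity check of these models, the DC gain and the high-frequency limit of each boost can be computed directly from the fitted numbers. The sketch assumes that lines without a Q are simple real poles/zeros and that factor is the DC gain (the TFs were normalized by 'no boost'); the helper names are made up.

```python
import math

# name: (pole frequencies [Hz], zero frequencies [Hz], factor), from the fits above
BOOSTS = {
    "Normal Boost": ([44.0597566447], [4.3927650910e3], 98.8275377818),
    "Super Boost (State1)": ([878.5368382789], [17.5107366335e3],
                             20.0840668188),
    "Super Boost (State2)": ([714.8112014271, 1.0147609373e3],
                             [13.2470941080e3, 22.2259701828e3],
                             404.5411036031),
    "Super Boost (State3)": ([886.3650348470, 420.4089305781, 887.8490768202],
                             [8.3635166134e3, 15.7953592754e3, 20.5144907279e3],
                             8.2051379423e3),
}


def db(x):
    """Magnitude in dB."""
    return 20.0 * math.log10(abs(x))


def high_freq_gain(poles, zeros, factor):
    """Limit of factor * prod(1 + jf/z) / prod(1 + jf/p) for f far above
    every corner; only meaningful when there are as many zeros as poles."""
    return factor * math.prod(poles) / math.prod(zeros)


if __name__ == "__main__":
    for name, (poles, zeros, factor) in BOOSTS.items():
        hf = high_freq_gain(poles, zeros, factor)
        print(f"{name:22s} DC {db(factor):6.1f} dB   HF {db(hf):6.2f} dB")
```

All four models come back to within about 1% of unity gain well above the zeros, which is what one expects from boosts that only add gain at low frequency.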

 

  14956   Tue Oct 8 20:23:03 2019   gautam   Update   CDS   c1iscaux testing

Looking at the old latch.st code, looks like this is just a heartbeat signal to indicate the code is alive. I'll implement this. Aesthetically, it'd be also nice to have the hex representation of the "*_SET" channels visible on the MEDM screen.

 

Quote:

Latch logic works. But latch alive signal is missing.

  14965   Mon Oct 14 16:06:28 2019   Koji   Update   CDS   CM servo board testing

CM Board Slow out (digital length control) path transfer function / pole-zero filter pair (79Hz/1.6kHz) transfer function

The excitation was given from EXC A. The denominator was TESTA2, and the numerator was OUT1.

Attachment 1 shows the measured transfer function with the PZ filter off and on. The PZ filter provides ~26dB attenuation at high frequency. The output stage has a first-order 100kHz LPF and it is visible in the transfer function.

The transfer function without the PZ filter was modelled by LISO as the following ZPK representation. There appeared to be a small step in the TF, which caused the additional PZ pair (66~67Hz), but it has a very minor effect on the magnitude and phase.

pole 66.2720207366
zero 67.2660731875
pole 93.3044858160k

factor -995.5583556921m

The transfer function of the PZ filter was analyzed separately. The TF with the switch ON was normalized by the one with the switch OFF, which reveals the pure effect of the switch. The ZPK model of the stage was estimated to be

pole 79.7312926438
zero 1.6395485993k

factor 996.2196584165m
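
The ~26dB high-frequency attenuation quoted above can be checked against this model. A minimal sketch, assuming the pole and zero are simple real ones and factor is the DC gain; the function names are made up.

```python
import math

POLE_HZ = 79.7312926438
ZERO_HZ = 1.6395485993e3
FACTOR = 996.2196584165e-3


def pz_response(freq_hz):
    """Complex response of the single pole-zero pair at freq_hz."""
    return FACTOR * (1 + 1j * freq_hz / ZERO_HZ) / (1 + 1j * freq_hz / POLE_HZ)


def mag_db(h):
    """Magnitude in dB."""
    return 20.0 * math.log10(abs(h))


if __name__ == "__main__":
    for f in (1.0, 79.7, 1.64e3, 100e3):
        print(f"{f:10.1f} Hz  {mag_db(pz_response(f)):7.2f} dB")
```

Well above the zero the gain tends to factor * pole / zero, i.e. roughly 20*log10(79.7/1640), which is close to -26dB.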

  14966   Mon Oct 14 16:19:30 2019   Koji   Update   CDS   CM servo board testing

For the CM board modeling purpose, the transfer function from TESTA2 to TESTB2 was needed. (Attachment 1)

The ZPK model of this part is

pole 76.2369881805
zero 77.4655685092
pole 7.0761486105M

factor -993.0593433578m

 

  14967   Mon Oct 14 16:25:03 2019   Koji   Update   CDS   CM servo board testing

The output stage (and AO GAIN stage) of the CM board was modelled. The transfer function was measured with the injection from EXC B. The denominator was TESTB2, and the numerator was SERVO OUT.

This stage is AC coupled by 2x 1st order HPFs. Firstly, this transfer function was measured with AO GAIN set to be 0dB. (Attachment 1)
This TF was used to characterize the cutoffs of the HPF stages, represented as the following ZPK:

zero 1m
zero 1m
pole 6.0502599855
pole 6.0624642854
factor -26.2725046079n

The AO GAIN had already been measured, as seen in [ELOG 14948]. The AO gain TF was then modeled by LISO with the above HPF as the preset. This allows us to characterize the time delay of the AO GAIN part.

  14968   Mon Oct 14 16:34:42 2019   Koji   Update   CDS   CM servo board testing

Input referred offsets on IN1/IN2 were tested with different gain settings. The two inputs were terminated with 50 ohm terminators. The output was monitored at OUT1 (SLOW Length Output). The fast path is AC coupled and has no sensitivity to the offset.

There is an EPICS monitor point for OUT1. With the multimeter, it was confirmed that the EPICS monitor (C1:LSC-CM_REFL1_GAIN) has the right value except for the opposite sign, because the output stage of OUT1 is inverting. The previous stages have no sign inversion. Therefore, the numbers below do not compensate for the sign inversion.

Attachment 1 shows the output offset observed at C1:LSC-CM_REFL1_GAIN. There is some variation with gain, but it stays around a constant offset of ~26mV. This suggests that most of the offset is not from the gain stages but from the later stages (like the boost stages). Note that the boost stages were turned off during the measurements.

Attachment 2 shows the input referred offset naively calculated from the above output offset. Independent of which path was used, the offset at low gain was hugely enhanced.

Since the input referred offset without subtracting the static offset seemed useless, a constant offset of -26mV was subtracted from the calculation (Attachment 2). This shows that the input referred offset can go up to ~+/-20mV when the gain is up to -16dB. Above that, the offset is at the mV level.
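
The calculation described here amounts to dividing the output offset, after removing the static part, by the linear gain of the slider setting. A minimal sketch; the function name is made up, the voltages in the usage lines are illustrative rather than measured values, and the sign inversion of the OUT1 stage is ignored.

```python
def input_referred_offset(v_out, gain_db, v_static=0.0):
    """Refer an output offset [V] to the input of a stage with gain_db [dB].

    v_static is the part of the output offset that does not come from the
    gain stage and is removed before dividing by the gain.
    """
    gain = 10.0 ** (gain_db / 20.0)
    return (v_out - v_static) / gain


if __name__ == "__main__":
    # Naive calculation: a 26 mV output offset at -31 dB looks like ~0.9 V
    print(input_referred_offset(0.026, -31.0))
    # With the static offset removed, only the residual gets scaled up
    print(input_referred_offset(0.028, -16.0, v_static=0.026))
```

This is why the naive numbers blow up at low gain: a fixed output offset is divided by a gain well below unity.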

I don't think this level of offset, whether from the OP27 or the AD829, becomes an issue when the input error signal is of the order of a volt.
This suggests that it is more important to properly set the internal offset cancellation, as well as to keep the gain setting high.

 

  14970   Mon Oct 14 17:32:28 2019   Koji   Update   CDS   Portal Elog entry for the recent CM servo board tests

Updated Circuit Diagram and photos: https://dcc.ligo.org/D1500308-v2

- (1) and (6) of the diagram: TFs with various gain slider values for REFL1/REFL2/AO GAIN [ELOG 14948] (gain values and time delay modeling)
- Switching checks, latest photo of the board, Limiter check  [ELOG 14953]
- (2): Boost transfer functions [ELOG 14955]
- (3): Slow (aka Length) CM output path [ELOG 14965]
- (4): Pole-Zero filter TF [ELOG 14965]
- (5): TF from TESTA2 to TESTB2 [ELOG 14966]
- (6): AC coupling TF of the AO GAIN stage [ELOG 14967]
- (7): AC coupling TF of the IN2 stage on IMC servo board [ELOG 15044]

Slow path = (1)*(2 if necessary)*(3)*(4 if necessary)

Fast path = (1)*(2 if necessary)*(4 if necessary)*(5)*(6)
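
The products above can be evaluated numerically by multiplying the complex responses of the individual stages. The sketch below does this for the slow path without boost, using the ZPK fits of (3) and (4) from ELOG 14965. Stage (1) is idealized as a pure gain (the measured ~0.2dB deficit and the time delay are left out), and all helper names are made up.

```python
import math


def zpk(zeros, poles, factor):
    """Return H(f) for real zeros/poles given as frequencies in Hz."""
    def h(freq_hz):
        out = complex(factor)
        for z in zeros:
            out *= 1 + 1j * freq_hz / z
        for p in poles:
            out /= 1 + 1j * freq_hz / p
        return out
    return h


def gain_stage(gain_db):
    """Idealized slider gain: frequency independent, no delay."""
    return lambda freq_hz: complex(10.0 ** (gain_db / 20.0))


def cascade(*stages):
    """Product of the stage responses."""
    def h(freq_hz):
        out = 1 + 0j
        for stage in stages:
            out *= stage(freq_hz)
        return out
    return h


# (3) slow output path and (4) pole-zero filter, from the fits in ELOG 14965
slow_output = zpk([67.2660731875], [66.2720207366, 93.3044858160e3],
                  -995.5583556921e-3)
pz_filter = zpk([1.6395485993e3], [79.7312926438], 996.2196584165e-3)

# Slow path = (1)*(3)*(4), with the slider at 0dB and no boost
slow_path = cascade(gain_stage(0.0), slow_output, pz_filter)

if __name__ == "__main__":
    for f in (1.0, 100.0, 10e3):
        h = slow_path(f)
        print(f"{f:8.1f} Hz  {20 * math.log10(abs(h)):7.2f} dB"
              f"  {math.degrees(math.atan2(h.imag, h.real)):8.2f} deg")
```

The fast path can be built the same way by adding the models for (5) and (6) to the cascade.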

gautam 20191122: Adding the measured AC coupling of the IN2 input of the IMC servo board for completeness.

  14990   Wed Oct 23 18:40:58 2019   gautam   Update   CDS   another round of vertex FE reboots

I wanted to restart the c1oaf model. As usual, the first time the model was restarted, it came back online with a 0x2bad error. This isn't even listed in the diagnostics manual as one of the recognized error states (unless there is a typo and they mean 0x2bad when they say 0xbad). The fix that has worked for me is to stop and start the model again, but of course, there is some chance of taking all the vertex FEs down in the process. No permutation of mxstream and daqd process restarts have cleared this error. We need some CDS/RCG support to look into this issue and fix it, it is not reasonable to go through reboots of all the vertex FEs every time we want to make a model change.

  15035   Tue Nov 19 15:08:48 2019   gautam   Update   CDS   Vertex models rebooted

Jon and I were surveying the CDS situation so that he can prepare a report for discussion with Rolf/Rich about our upcoming BHD upgrade. In our poking around, we must have bumped something somewhere because the c1ioo machine went offline, and consequently, took all the vertex models out. I rebooted everything with the reboot script, everything seems to have come back smoothly. I took this opportunity to install some saturation counters for the arm servos, as we have for the CARM/DARM loops, because I want to use these for a watch script that catches when the ALS loses lock and shuts stuff off before kicking optics around needlessly. See Attachment #1 for my changes.

  15061   Mon Dec 2 23:01:47 2019   gautam   Update   CDS   Frequent DTT crashes on pianosa

I have been experiencing frequent crashes of DTT on pianosa in the past few weeks. This is pretty annoying to deal with when trying to characterize the interferometer loops. I attach the error log dumped to console. The error has to do with some kind of memory corruption. Recall that we aren't using a GDS version that is packaged with the SL7 lscsoft packages, we are using a pretty ancient (2.15) version that is built from source. I have been unable to build a newer version from source (though I didn't spend much time trying). pianosa is the only usable workstation at the moment, but perhaps someone can make this work on donatella / rossa for general improvement in quality of life.

  15071   Wed Dec 4 09:11:42 2019   Yehonathan   Update   CDS   Reboot script

After the CDSs crashed, we ran the rebootC1LSC.sh script.

The script is a bit annoying in that it requires entering the CDSs' passwords multiple times over the time it runs, which is long.

The resulting CDS screen is a bit different from what was reported before (attached). Also, not all watchdogs were restored.

We restored the remaining watchdogs and did XARM locking. Everything seems to be fine.

  15072   Wed Dec 4 12:13:10 2019   gautam   Update   CDS   Reboot script

It was way more annoying without a script and took longer than the 4 minutes it does now.

You can fix the requirement to enter password by changing the sshd settings on the FEs like I did for pianosa.

After running the script, you should verify that there are no red flags in the output to console. Yesterday, some of the settings the script was supposed to reset weren't correctly reset, possibly due to python/EPICS problems on donatella, and this cost me an hour of searching last night because the locking wasn't working. Anyway, best practise is to not crash the FEs.

Quote:

The script is a bit annoying in that it requires entering the CDSs' passwords multiple times over the time it runs which is long.

  15078   Thu Dec 5 15:09:50 2019   gautam   Update   CDS   c1oaf crashed c1lsc

I tried starting the c1oaf model, but got a DQ error (I want the option of running feedforward during locking even if the filters aren't particularly well tuned yet). Note that this isn't "just a warning light" - some channels are initialized to +/- 1e20, so if you try turning some filters on, you will deliver a massive kick to the optics. Restarting it crashed c1lsc (this is not unexpected behavior - the only way to clear the DQ error is to restart the model, and empirically, the success rate is ~50%). The reboot script brought everything back online smoothly, and the second time, c1oaf started without any issues.

While looking at the CDS overview screen, I noticed that the c1scy model was reporting frequent RFM errors for the C1:SCY-RFM_ETMY_LSC channel (but none of the others). On the sender model (c1rfm), no errors were being reported. The diag reset button / mxstream restart didn't really work either. See Attachment #1. Just restarting the c1scy model didn't fix the error - I had to reboot the machine and restart the models, and now no errors are being reported.

Attachment #2 shows the current nominal CDS status - the red light on c1lsc is due to some missing c1dnn channels (I'll remove these at the next c1lsc model change because I don't want to un-necessarily reboot the vertex FEs), and the c1omc model is obsolete I guess. c1daf isn't running right now but once I get the new fiber (ordered), I'm gonna restart this model as well.

P.S. The ALS temperature sliders are not SDF-ed. So when the model was restarted, I had to change the sliders back to their old values to get the beat back in the usable range.

  15122   Wed Jan 15 08:55:14 2020   gautam   Update   CDS   Yearly DAQD fix

Summary:

Every new year (on Dec 31 or Jan 1), all of the realtime models will report a "0x4000" error. This happens due to an offset to the GPStime driver not being updated. Here is how this can be fixed (slightly modified version of what was done at LASTI).

Steps to fix the DC errors:

  1. ssh into FB machine. 
  2. Edit the file /opt/rtcds/rtscore/release/src/include/drv/spectracomGPS.c:
    • Look for the code block with a text string that reads something like
      /* 2019 had 365 days and no leap seconds */
                   pHardware->gpsOffset += 31536000;
    • Copy and paste the above string for the appropriate number of years of offset you are adding, and edit the comment string appropriately.
  3. Navigate to /opt/rtcds/rtscore/release/src/drv/symmetricom. Run the following commands:
    sudo make
    sudo make install
  4. Stop all the daqd processes and reload symmetricom:
    sudo systemctl stop daqd_*
    sudo modprobe -r symmetricom
    sudo modprobe symmetricom
  5. Re-start the daqd processes:
    sudo service daqd_* start
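
The number added in step 2 is just the length of the elapsed year in seconds. A small sketch to compute it, so that leap years are not missed; the function name is made up, and the leap-second argument is an assumption that has to be checked against the official announcements for the year in question.

```python
import calendar

SECONDS_PER_DAY = 86400


def year_offset_seconds(year, leap_seconds=0):
    """Seconds to add to gpsOffset for one elapsed calendar year."""
    days = 366 if calendar.isleap(year) else 365
    return days * SECONDS_PER_DAY + leap_seconds


if __name__ == "__main__":
    for year in (2019, 2020):
        print(year, year_offset_seconds(year))
```

For 2019 this reproduces the 31536000 in the code block quoted in step 2; a leap year such as 2020 needs one more day.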

Independent of this, there is a 1 second offset between the gpstimes reported by /proc/gps and gpstime. However, this doesn't seem to drift. We had effected a static offset to correct for this in the daqd config files, and it looks like these do not need to be updated on a yearly basis. All the daqd indicators are now green, see Attachment #1.

  15223   Tue Feb 25 16:17:57 2020   Gautam, Hang   Update   CDS

Seems that the GPS is out of sync on donatella. We could not get any data from diaggui...

  15237   Mon Mar 2 16:14:47 2020   gautam   Update   CDS   some target directory cleanup

$TARGET_DIR = /cvs/cds/caltech/target

  • $TARGET_DIR/c1psl and $TARGET_DIR/c1iool0 moved to $TARGET_DIR/preAcromag_oldVME/
  • $TARGET_DIR/c1psl1 moved to $TARGET_DIR/c1psl 
  • $TARGET_DIR/c1psl/*.service and $TARGET_DIR/C1_PSL.cmd modified - I executed :%s/c1psl1/c1psl/g in vim.
  • $TARGET_DIR/preAcromag_oldVME/c1psl/autoBurt.req and $TARGET_DIR/preAcromag_oldVME/c1iool0/autoBurt.req concatenated into $TARGET_DIR/c1psl/autoBurt.req. The first snapshot at 16:19 has been verified.

It remains to (Jon is taking care of these)

  • add a line to modbusIOC.service on the new c1psl machine that restores the latest burt snapshot on startup (this necessitated installation of a debian jessie libXp6 package on our debian buster machine because our shared EPICS is soooooooooooooo oooooooold)
  • change the hostname from c1psl1 to c1psl
  • update martian.hosts

  15239   Mon Mar 2 16:35:12 2020   gautam   Update   CDS   c1psl test status

Channel list with test status
== Test Status ==

[done] Lock PMC and IMC
[done] IMC Servo board test
[done] IMC LO Det Mon channel check
[0th order] WFS quadrant DC mon
[none] WFS I/F monitors
[0th order] WFS attenuators
[none] IOO QPD channels
[done] FSS readbacks 
[done] PMC readbacks


Some more detailed elogs about the individual tests will follow.

Basically, I have characterized the IMC Servo board in detail. The summary finding is that the IN2 (=AO gain) slider needs to be investigated. 

All other channels need to be verified in a more thorough fashion than my basic checks which were just to guarantee the core interferometer functionality, which is important to me.

  15240   Mon Mar 2 19:32:41 2020   gautam   Update   CDS   c1rfm errors

Had to reboot both end machines and the c1rfm model to get the TRX and TRY signals to the LSC models. Now both arms can be locked using POX/POY respectively.

  15248   Wed Mar 4 12:25:11 2020   gautam   Update   CDS   BIO1 on c1psl is dead

There was some work done on the Acro crate this morning. Unclear if this is independent, but I found that the IMC servo board IN1 slider doesn't respond anymore, even though I had tested it and verified it to be working. Patient debugging showed that BIO1 (and only that acromag unit with the static IP 192.168.114.61) doesn't show up on the subnet in c1psl. Hopefully it's just a loose network cable, if not we will switch out the unit in the afternoon. 

Jon is going to make a python script which iteratively pings all devices on the subnet and we will put this info on an MEDM screen to catch this kind of silent failure.
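
A minimal sketch of what such a survey could look like (this is not Jon's script). The ping flags are the Linux ones, the function names are made up, and the probe is passed in as an argument so that the logic can be exercised without touching the network.

```python
import subprocess


def ping(address, timeout_s=1):
    """Return True if address answers a single ICMP echo (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def survey(hosts, probe=ping):
    """Map each device name to True/False depending on whether it responds."""
    return {name: probe(address) for name, address in hosts.items()}


def missing(status):
    """Names of the devices that did not respond, sorted for display."""
    return sorted(name for name, alive in status.items() if not alive)
```

Calling survey({"BIO1": "192.168.114.61"}) from the c1psl machine would then give a dictionary that could be written to EPICS channels for display on an MEDM screen; how that last step is done is left open here.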

  15250   Wed Mar 4 16:54:43 2020   gautam   Update   CDS   c1auxex temporarily disconnected

To debug a problem with the new c1psl (later elog), we needed a Supermicro EPICS server that was using the shared EPICS/modbus/asyn binaries rather than a local install. Of those available in the lab (c1iscaux, c1vac, c1susaux being the others), this was the only one which uses the shared install. So I 

  • turned the slow bias voltages to 0
  • shutdown the watchdog
  • disconnected the Acromag crate in 1X9 from the 192.168.114.xxx subnet at the supermicro end
  • connected a test ADC to the local subnet using a different ethernet cable (leaving the original one dangling)
  • ran some software tests to see if we could open up a communication line to the test ADC using modbus without any errors being thrown
  • removed the test ADC and restored the ethernet connection.

At which point Jon reset the software end, I restored the slow bias voltage and re-enabled the local damping. The optic seems to have damped okay. The Oplev spot is back in ~center of the QPD and the green beam can be locked to a TEM00 mode (so the alignment is okay - the IR beam is unavailable while c1psl issues are being sorted but I judge that things are back to the nominal state now).

  15285   Thu Mar 26 22:31:34 2020 YehonathanUpdateCDSC1AUXEY wiring + channel list

I have made a wiring + channel list that needs to be included in the new C1AUXEY Acromag.

It was mostly copied from C1AUXEX.

I ignored the IPANG channels since IPANG is going to be removed from the table.

  15287   Tue Mar 31 09:39:41 2020 gautamUpdateCDSFoton for shaped noise injections

I'd like to re-measure the transfer function from driving MC2 position to the MC_L_DQ channel (for feedforward purposes). Swept sine would be one option, but I can't get the "Envelope" feature of DTT to work, the excitation amplitude isn't getting scaled as specified in the envelope, and so I'm unable to make the measurement near 1 Hz (which is where the FF is effective). I see some scattered mentions of such an issue in past elogs but no mention of a fix (I also feel like I have gotten the envelope function to work for some other loop measurement templates). So then I thought I'd try broadband noise injection, since that seems to have been the approach followed in the past. Again, the noise injection needs to be shaped around ~1 Hz to avoid knocking the IMC out of lock, but I can't get Foton to do shaped noise injections because it doesn't inherit the sample rate when launched from inside DTT/awggui - this is not a new issue, does anyone know the fix?

Note that we are using the gds2.15 install of foton, but the pre-packaged foton that comes with the SL7 installation doesn't work either.

Update:

The envelope feature for swept-sine wasn't working because I apparently specified the frequency grid in the wrong order. Eric von Reis has been notified to include a sorting algorithm in future DTT so that this can be in arbitrary order. Fixing that allows me to run a swept sine with enveloped excitation amplitude and hence get the TF I want, but still no shaped noise injections via foton 😢 

  15288   Tue Mar 31 23:35:50 2020 ranaUpdateCDSFoton for shaped noise injections

Do you really mean awggui cannot make shaped noise injections via its foton text box? That has always worked for me in the past.

If this is broken, I suspect someone has installed packages into the shared dirs.

  15289   Tue Mar 31 23:54:57 2020 gautamUpdateCDSFoton for shaped noise injections

The problem is that foton does not inherit the model sample rate when launched from DTT/awggui. This is likely some shared/linked/dynamic library issue, the binaries we are running are precompiled presumably for some other OS. I've never gotten this to work since we changed to SL7 (but I did use it successfully in 2017 with the Ubuntu12 install).

Quote:

Do you really mean awggui cannot make shaped noise injections via its foton text box? That has always worked for me in the past.

If this is broken, I suspect someone has installed packages into the shared dirs.

  15292   Thu Apr 2 16:31:33 2020 JonUpdateCDSC1AUXEY wiring + channel list
Quote:

I have made a wiring + channel list that needs to be included in the new C1AUXEY Acromag.

I used Yehonathan's wiring assignments to lay the rest of groundwork for the final slow controls machine upgrade, c1auxey. Actions completed:

  • Created an internal wiring diagram for assembling the Acromag chassis (log in with LIGO.ORG credentials to view/edit)
  • Created a new target directory on the network drive:
/cvs/cds/caltech/target/c1auxey1

The "1" will be dropped after the new system is permanently installed.

  • Populated the target directory with files:
    • modbusIOC.service - wraps the EPICS IOC as a systemd service
    • ETMYaux.env - defines the EPICS environment variables
    • ETMYaux.cmd - command file to set up the EPICS IOC
    • ETMYaux.sh - enables DAC outputs to the suspension (executed last)
  • Created the EPICS channel databases:
    • ETMYaux.db - migration of the existing database
    • c1auxey_state.db - contains logic for loopback monitoring of the IOC "alive" state (visible from Sitemap > CDS > Slow Controls Status)

Hardware-wise, this system will require:

  • 2 Acromag XT-1221 units (ADC)
  • 1 Acromag XT-1541 unit (DAC)
  • 1 Acromag XT-1111 unit (sinking BIO)

I know that we do have these quantities left on hand. The next steps are to set up the Supermicro host and begin assembling the Acromag chassis. Both of these activities require an in-person presence, so I think this is as far as we can advance this project for now.

  15293   Thu Apr 2 22:19:18 2020 KojiUpdateCDSC1AUXEY wiring + channel list

We want to migrate the end shutter controls from c1aux to the end acromags. Could you add them to the list if they are not already included?

This will let us remove c1aux from the rack, I believe.

 

  15294   Fri Apr 3 12:09:53 2020 JonUpdateCDSC1AUXEY wiring + channel list
Quote:

We want to migrate the end shutter controls from c1aux to the end acromags. Could you add them to the list if they are not already included?

This will let us remove c1aux from the rack, I believe.

Yehonathan's list does include C1:AUX-GREEN_Y_Shutter and I copied its definition from /cvs/cds/caltech/target/c1aux/ShutterInterlock.db into the new ETMYaux.db file.

I noticed ShutterInterlock.db still contains about a dozen channels. Some of them appear to be ghosts (like the C1:AUX-PSL_Shutter[...] set, which has since become C1:PSL-PSL_Shutter[...] hosted on c1psl) but others like C1:AUX-GREEN_X_Shutter appear to still be in active use.

  15383   Mon Jun 8 18:14:55 2020 gautamUpdateCDSVertex FEs crashed

Summary:

Around 5pm local time, the three vertex FEs crashed. AFAIK, no one was in the lab or working on anything CDS related, so this is worrying.

Details:

  • Reboot script was used to bring all FEs back - only soft reboots were required.
  • The IMC and arms can now be locked.
  • I think the combination of burt + SDF would have reverted all the settings as they should be, but if something appears off, it could be that some EPICS value didn't get reset correctly.

  15423   Mon Jun 22 17:51:50 2020 gautamUpdateCDSc1iscaux was down

The machine needed a hard reboot as it was un-ssh-able. 

The exact time that the machine went down is unknown because the blinkys were not DQ-ed. I've now added these to the EDCU to make these channels actually useful, and we may look back on the reliability (or otherwise) of the Acromag system. To my memory, this is the ~5th time one of the new Acromag servers has needed a hard reboot. While this may be less frequent (?) than the VME machines, perhaps there is some other reason for these dropouts. Maybe something to do with the martian network?
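Once the blinky (heartbeat) channels are recorded, the time a machine went down can be estimated as the last time the heartbeat value changed. The sketch below assumes the data has already been fetched as a time-ordered list of (timestamp, value) pairs; that format and the function name are hypothetical, and the data retrieval itself is not shown.

```python
def last_toggle_time(samples):
    """Return the timestamp of the last sample whose value differs
    from the sample before it, or None if the value never changes.

    samples: iterable of (timestamp, value) pairs in time order.
    """
    last_change = None
    previous = None
    first = True
    for timestamp, value in samples:
        if not first and value != previous:
            last_change = timestamp
        previous = value
        first = False
    return last_change
```

If the heartbeat normally toggles every sample, a return value well before the end of the record marks roughly when the IOC stopped updating.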

Anyway the machine is back up and running now.

  15462   Thu Jul 9 16:02:33 2020 JonHowToCDSProcedure for setting up BHD front-ends

Here is the procedure for setting up the three new BHD front-ends (c1bhd, c1sus2, c1ioo - replacement). This plan is based on technical advice from Rolf Bork and Keith Thorne.

The overall topology for each machine is shown here. As all our existing front-ends use (obsolete) Dolphin PCIe Gen1 cards for IPC, we have elected to re-use Dolphin Gen1 cards removed from the sites. Different PCIe generations of Dolphin cards cannot be mixed, so the only alternative would be to upgrade every 40m machine. However the drivers for these Gen1 Dolphin cards were last updated in 2016. Consequently, they do not support the latest Linux kernel (4.x) which forces us to install a near-obsolete OS for compatibility (Debian 8).

Hardware

Software

  • OS: Debian 8.11 (Linux kernel 3.16)
  • IPC card driver: Dolphin DX 4.4.5 [works only with Linux kernel 2.6 to 3.x]
  • I/O card driver: None required, per the manual

Install Procedure

  1. Follow Keith Thorne's procedure for setting up Debian 8 front-ends
  2. Apply the real-time kernel patches developed for Debian 9, but modified for kernel 3.16 [these are UNTESTED against Debian 8; Keith thinks they may work, but they weren't discovered until after the Debian 9 upgrade]
  3. Install the PCIe expansion cards and Dolphin DX driver (driver installation procedure)
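Before installing the Dolphin DX driver, it may be worth confirming that the running kernel falls inside the supported range quoted above (2.6 to 3.x). The following is a small sketch of such a check; the function names are hypothetical and the range is taken only from the note in the Software list, not from the Dolphin documentation.

```python
import platform


def kernel_major_minor(release):
    """Parse a release string such as '3.16.0-4-amd64' into (3, 16)."""
    major, minor = release.split(".")[:2]
    digits = ""
    for ch in minor:
        # Stop at the first non-digit, e.g. the '-' in '16-rt'
        if not ch.isdigit():
            break
        digits += ch
    return int(major), int(digits)


def dolphin_gen1_supported(release):
    """True if the kernel is in the 2.6 to 3.x range quoted in this entry."""
    return (2, 6) <= kernel_major_minor(release) < (4, 0)


def running_kernel_supported():
    """Apply the check to the kernel of the machine running this script."""
    return dolphin_gen1_supported(platform.release())
```

For the Debian 8.11 kernel listed above (3.16), `dolphin_gen1_supported` returns True, while any 4.x kernel is rejected.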
  15515   Wed Aug 12 17:36:42 2020 gautamUpdateCDSTiming distribution slot availability

See Attachment #1. J8 was connected to a "LASTI timing slave" sitting in the rack that Chiara lives in - we don't use this for anything and I confirmed that there was no effect on the RTCDS when I pulled that fiber out. The LASTI timing slave also had a blinky that was blinking when the fiber was plugged in - which I take to mean that the slot works. 

Can we get away with just using these two available slots, J8 and J13? Do we really need three new expansion chassis?

  15518   Wed Aug 12 20:14:06 2020 KojiUpdateCDSTiming distribution slot availability

I believe we will use two new chassis at most. We'll replace c1ioo from Sun to Supermicro, but we recycle the existing timing system.

  15521   Thu Aug 13 11:30:19 2020 gautamUpdateCDSTiming distribution slot availability

That's great. I wonder if we can also get away with not adding new Dolphin infrastructure. I'd really like to avoid changing any IPC drivers.

Quote:

I believe we will use two new chassis at most. We'll replace c1ioo from Sun to Supermicro, but we recycle the existing timing system.

  15522   Thu Aug 13 13:35:13 2020 KojiUpdateCDSTiming distribution slot availability

The new dolphin eventually helps us. But the installation is an invasive change to the existing system and should be done at the installation stage of the 40m BHD.

  15524   Fri Aug 14 00:01:55 2020 gautamUpdateCDSBHD / OMC model channels now added to autoburt

I added the EPICS channels for the c1omc model (gains, matrix elements etc) to the autoburt such that we have a record of these, since we expect these models to be running somewhat regularly now, and I also expect many CDS crashes.

  15525   Fri Aug 14 10:03:37 2020 JonUpdateCDSTiming distribution slot availability

It's great news that we won't have to worry about a new timing fanout for the two new machines, c1bhd and c1sus2. And there's no plan to change Dolphin IPC drivers. The plan is only to install the same (older) version of the driver on the two new machines and plug into free slots in the existing switch.

Quote:

The new dolphin eventually helps us. But the installation is an invasive change to the existing system and should be done at the installation stage of the 40m BHD.

  15564   Tue Sep 8 11:49:04 2020 gautamUpdateCDSSome path changes

I edited /diskless/root.jessie/home/controls/.bashrc so that I don't have to keep doing this every time I do a model recompile.

Quote:

Where is this variable set and how can I add the new paths to it? 

export RCG_LIB_PATH=/opt/rtcds/userapps/release/isc/c1/models/isc/:/opt/rtcds/userapps/release/isc/c1/models/cds/:/opt/rtcds/userapps/release/isc/c1/models/sus/:$RCG_LIB_PAT
  15578   Wed Sep 16 17:44:27 2020 gautamUpdateCDSAll vertex FE models restarted

I had to make a CDS change to the c1lsc model in an effort to get a few more signals into the models. Rather than risk requiring hard reboots (typically my experience if I try to restart a model), I opted for the more deterministic scripted reboot, at the expense of spending ~20mins to get everything back up and running.


Update 2230: this was more complicated than expected - a nuclear reboot was necessary but now everything is back online and functioning as expected. While all the CDS indicators were green when I wrote this up at ~1800, the c1sus model was having frequent CPU overflows (execution time > 60 us). Not sure why this happened, or why a hard power reboot of everything fixed it, but I'm not delving into this.

The point of all this was that I can now simultaneously digitize 4 channels - 2 DCPDs, and 2 demodulated quadratures of an RF signal.

  15609   Sat Oct 3 16:51:27 2020 gautamUpdateCDSRFM errors

Attachment #1 shows that the c1rfm model isn't able to receive any signals from the front end machines at EX and EY. Attachment #2 shows that the problem appears to have started at ~4:30 am this morning - I certainly wasn't doing anything with the IFO at that time.

I don't know what kind of error this is - what does it mean that the receiving model shows errors but the sender shows no errors? It is not a new kind of error, and the solution in the past has been a series of model reboots, but it'd be nice if we could fix such issues because it eats up a lot of time to reboot all the vertex machines. There is no diagnostic information available in all the places I looked. I'll ask the CDS group for help, but I'm not sure if they'll have anything useful since this RFM technology has been retired at the sites (?).

In the meantime, arm cavity locking in the usual way isn't possible since we don't have the trigger signals from the arm cavity transmission. 


Update 1500 4 Oct: soft reboots of models didn't do the trick so I had to resort to hard reboots of all FEs/expansion chassis. Now the signals seem to be okay.

  15646   Wed Oct 28 09:35:00 2020 KojiUpdateCDSRFM errors

I'm starting the model restarts from remote. Then later I'll show up in the lab to do more hard resets.
==> It seems that the RFM errors are gone. Here are the steps.

  1. Shut down all the watchdogs
  2. Log in to c1iscex. Shut down all the realtime models: rtcds kill --all
  3. Log in to c1iscey. Shut down all the realtime models: rtcds kill --all
  4. Run scripts/cds/rebootC1LSC.sh on pianosa
  5. Reboot c1iscex
  6. Reboot c1iscey
  7. Wait until the script has brought all the machines/models up
  8. Restart c1iscex models
  9. Restart c1iscey models
  10. Some IPC errors are still visible on the CDS status screen. Launch c1daf and c1oaf
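The sequence above can be captured as a dry-run checklist so that the order is not lost the next time RFM errors appear. This is only a sketch: it builds and returns the plan and executes nothing. The only literal commands in it are `rtcds kill --all` and `scripts/cds/rebootC1LSC.sh`, both taken from the steps above; every other entry is a plain-language description, and the function name is hypothetical.

```python
def rfm_recovery_plan(end_hosts=("c1iscex", "c1iscey")):
    """Return the recovery steps as an ordered list of (where, action)."""
    plan = [("operator", "shut down all watchdogs")]
    # Stop the realtime models on both end machines first
    for host in end_hosts:
        plan.append((host, "rtcds kill --all"))
    # Scripted reboot of the vertex machines, run from a workstation
    plan.append(("pianosa", "scripts/cds/rebootC1LSC.sh"))
    for host in end_hosts:
        plan.append((host, "reboot"))
    plan.append(("operator", "wait until the script has brought everything up"))
    for host in end_hosts:
        plan.append((host, "restart realtime models"))
    plan.append(("operator", "launch c1daf and c1oaf if IPC errors remain"))
    return plan


def format_plan(plan):
    """Render the plan as numbered text lines for printing or logging."""
    return [f"{i}. [{where}] {action}" for i, (where, action) in enumerate(plan, 1)]
```

Printing `format_plan(rfm_recovery_plan())` gives a ten-line checklist in the same order as the list above.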

 

  15659   Wed Nov 4 17:14:49 2020 gautamUpdateCDSc1bhd setup

I am working on the setup of a CDS FE, so please do not attempt any remote login to the IPMI interface of c1bhd until I'm done.

  15663   Fri Nov 6 14:27:16 2020 gautamUpdateCDSc1bhd setup - diskless boot

I was able to boot one of the 3 new Supermicro machines, which I christened c1bhd, in a diskless way (with the boot image hosted on fb, as is the case for all the other realtime FEs in the lab). This is just a first test, but it is reassuring that we can get this custom linux kernel to boot on the new hardware. Some errors about dolphin drivers are thrown at startup but this is to be expected since the server isn't connected to the dolphin network yet. We have the Dolphin adaptor card in hand, but since we have to get another PCIe card (supposedly from LLO according to the BHD spreadsheet), I defer installing this in the server chassis until we have all the necessary hardware on hand.

I also have to figure out the correct BIOS settings for this to really run effectively as a FE (we have to disable all the "un-necessary" system level services) - these machines have BIOS v3.2 as opposed to the older vintages for which there are instructions from K.T. et al.

There may yet be issues with drivers, but this is all the testing that can be done without getting an expansion chassis. After the vent and recovering the IFO, I may try experimenting with the c1ioo chassis, but I'd much prefer if we can do the testing offline on a subnet that doesn't mess with the regular IFO operation (until we need to test the IPC).

Quote:

I am working on the setup of a CDS FE, so please do not attempt any remote login to the IPMI interface of c1bhd until I'm done.

  15695   Wed Dec 2 17:54:03 2020 gautamUpdateCDSFE reboot

As discussed at the meeting, I commenced the recovery of the CDS status at 1750 local time.

  • Started by attempting to just soft-restart the c1rfm model and see if that fixes the issue. It didn't and what's more, took down the c1sus machine.
  • So hard reboots of the vertex machines were required. c1iscey also crashed. I was able to keep the EX machine up, but I soft-stopped all the RT models on it.
  • All systems were recovered by 1815. For anyone checking, the DC light on the c1oaf model is red - this is a "known" issue and requires a model restart, but I don't want to get into that now and it doesn't disrupt normal operation.

Single arm POX/POY locking was checked, but not much more. Our IMC WFS are still out of service so I hand aligned the IMC a bit, IMC REFL DC went from ~0.3 to ~0.12, which is the usual nominal level.

  15719   Wed Dec 9 15:37:48 2020 gautamUpdateCDSRFM switch IP addr reset

I suspect what happened here is that the IP didn't get updated when we went from the 131.215.113.xxx system to 192.168.113.xxx system. I fixed it now and can access the web interface. This system is now ready for remote debugging (from inside the martian network obviously). The IP is 192.168.113.90.
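A stale address of this kind can be caught with a quick membership check against the current subnet. The sketch below uses Python's standard `ipaddress` module; the /24 netmask and the function name are assumptions, and the old-style address in the usage note is only illustrative.

```python
import ipaddress

# Current martian subnet; the /24 prefix length is an assumption.
MARTIAN_NET = ipaddress.ip_network("192.168.113.0/24")


def on_martian_network(addr):
    """True if the address belongs to the 192.168.113.xxx subnet."""
    return ipaddress.ip_address(addr) in MARTIAN_NET
```

With these assumptions, `on_martian_network("192.168.113.90")` is True for the RFM switch, while an address left over from the 131.215.113.xxx scheme would return False.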

Managed to pull this operation off without crashing the RFM network, phew.

BTW, a Windows laptop that used to be in the VEA (I last remember it being on the table near MC2, which was cleared at some point to hold the spare suspensions) is missing. Does anyone know where it is?
