  40m Log, Page 96 of 341
ID   Date   Author   Type   Category   Subject
  1769   Tue Jul 21 17:01:18 2009   pete   DAQ   DAQ   temp channel PEM-PETER_FE

I added a temporary channel to input 9 on the PEM ADCU. Beware the 30, 31, and 32 inputs: I tried 32 and it only gave noise.

 

 

  1973   Tue Sep 8 15:14:26 2009   rana, alex   Configuration   DAQ   RAID update to Framebuilder: directories added + lookback increased

 Alex logged in around 10:30 this morning and, at our request, adjusted the configuration of fb40m to have 20 days of lookback.

I wasn't able to get him to elog, but he did email the procedure to us:


1) create a bunch of new "Data???" directories in /frames/full
2) change the setting in /usr/controls/daqdrc file
       set num_dirs=480;

my guess is that the next step is:

3) telnet fb0 8087

    daqd>  shutdown

I checked and we do, in fact, now have 480 directories in /frames/full and are so far using 11% of our 13 TB capacity. Let's try to remember to check up on this so that it doesn't get overfull and crash the framebuilder.
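For the record, here is that procedure sketched as shell commands (the three-digit Data??? naming and the non-interactive daqd restart are my assumptions, not Alex's verbatim commands):

cd /frames/full
for i in $(seq -w 0 479); do mkdir -p Data$i; done    # create the "Data???" directories (naming assumed)
# edit /usr/controls/daqdrc and set:  num_dirs=480;
printf 'shutdown\n' | nc fb0 8087                     # restart daqd; assumes netcat (telnet works interactively)
df -h /frames                                         # keep an eye on the disk usage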

  2073   Fri Oct 9 01:31:56 2009   rana   Configuration   DAQ   tpchn mystery

Does anyone know if this master file is the real thing that's in use now? Are we really using a file called tpchn_C1_new.par? If anyone sees Alex, please get to the bottom of this.

allegra:daq>pwd
/cvs/cds/caltech/chans/daq
allegra:daq>more master
/cvs/cds/caltech/chans/daq/C1ADCU_PEM.ini
#/cvs/cds/caltech/chans/daq/C1ADCU_SUS.ini
/cvs/cds/caltech/chans/daq/C1LSC.ini
/cvs/cds/caltech/chans/daq/C1ASC.ini
/cvs/cds/caltech/chans/daq/C1SOS.ini
/cvs/cds/caltech/chans/daq/C1SUS_EX.ini
/cvs/cds/caltech/chans/daq/C1SUS_EY.ini
/cvs/cds/caltech/chans/daq/C1SUS1.ini
/cvs/cds/caltech/chans/daq/C1SUS2.ini
#/cvs/cds/caltech/chans/daq/C1SUS4.ini
/cvs/cds/caltech/chans/daq/C1IOOF.ini
/cvs/cds/caltech/chans/daq/C1IOO.ini
/cvs/cds/caltech/chans/daq/C0GDS.ini
/cvs/cds/caltech/chans/daq/C0EDCU.ini
/cvs/cds/caltech/chans/daq/C1OMC.ini
/cvs/cds/caltech/chans/daq/C1ASS.ini
/cvs/cds/gds/param/tpchn_C1_new.par
/cvs/cds/gds/param/tpchn_C2.par
/cvs/cds/gds/param/tpchn_C3.par
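One quick check would be to compare the access times of the candidate files against the last daqd restart (a sketch; 'ls -lu' prints last-access times, assuming atime updates are enabled on that filesystem):

allegra:daq> ls -lu /cvs/cds/gds/param/tpchn_C1*.par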

  2075   Fri Oct 9 14:23:53 2009   Alex Ivanov   Configuration   DAQ   tpchn mystery

"Yes. This master file is used."

Quote:

Does anyone know if this master file is the real thing that's in use now? Are we really using a file called tpchn_C1_new.par? If anyone sees Alex, please get to the bottom of this.

allegra:daq>pwd
/cvs/cds/caltech/chans/daq
allegra:daq>more master
/cvs/cds/caltech/chans/daq/C1ADCU_PEM.ini
#/cvs/cds/caltech/chans/daq/C1ADCU_SUS.ini
/cvs/cds/caltech/chans/daq/C1LSC.ini
/cvs/cds/caltech/chans/daq/C1ASC.ini
/cvs/cds/caltech/chans/daq/C1SOS.ini
/cvs/cds/caltech/chans/daq/C1SUS_EX.ini
/cvs/cds/caltech/chans/daq/C1SUS_EY.ini
/cvs/cds/caltech/chans/daq/C1SUS1.ini
/cvs/cds/caltech/chans/daq/C1SUS2.ini
#/cvs/cds/caltech/chans/daq/C1SUS4.ini
/cvs/cds/caltech/chans/daq/C1IOOF.ini
/cvs/cds/caltech/chans/daq/C1IOO.ini
/cvs/cds/caltech/chans/daq/C0GDS.ini
/cvs/cds/caltech/chans/daq/C0EDCU.ini
/cvs/cds/caltech/chans/daq/C1OMC.ini
/cvs/cds/caltech/chans/daq/C1ASS.ini
/cvs/cds/gds/param/tpchn_C1_new.par
/cvs/cds/gds/param/tpchn_C2.par
/cvs/cds/gds/param/tpchn_C3.par

 

  3216   Wed Jul 14 11:54:33 2010   josephb   Update   DAQ   Debugging Guralp and reboots

This is in regard to the zero signal being reported by the channels C1:PEM-SEIS_GUR1_X, C1:PEM-SEIS_GUR1_Y, and C1:PEM-SEIS_GUR1_Z.

I briefly swapped Guralp 1 EW and Guralp 2 EW to confirm to myself that the problem was not on the Guralp end (although the fact that it's digital zero is highly indicative of a digital-realm problem).  I then unplugged the 17-32, and then the 1-16, channel connections to the 110B.  I saw floating noise on the GUR2 channels, but still digital zero on the GUR1 channels, which means it's not the BNC breakout box.

There was a spare 110B, unconnected in the crate, so to do a quick test of the 110B, I turned off the crate and swapped the 110Bs, after copying the switch configuration of the first 110B to the second one.  The original 110B was labeled ADC 1, while the second 110B was labeled ADC 0.  The switches were identical except for the ones closest to the Dsub connectors on the front.  All those switches in that set were to the right, when looking down at the switches and the Dsub connectors pointing towards yourself.

Unfortunately, c0dcu1 never seemed to come up with the new 110B (ADC 0), so we put the original 110B back and turned the crate back on.

The fb then didn't seem to come back quite right.  We tried rebooting fb40m, but it's still red with status 1.  c0daqctrl is green, but c0dcu1 is red, although I'm not positive whether that's due to fb40m being in a strange state.  Jenne tried telnetting in to port 8087 and issuing a shutdown, but that didn't seem to help.  At this point, we're going to contact Alex when he gets in around 12:30.

 

  3220   Wed Jul 14 16:39:06 2010   Jenne   Update   DAQ   Debugging Guralp and reboots

[Joe, Jenne]

Joe got on the phone with Alex, and Alex's magic intuition told him to ask about the RFM switch.  The C0DAQ_CTRL's overload light was orange.  Alex suggested hitting the reset button on that RFM switch, which we did.  That fixed everything -> c0dcu1 came back, as did the frame builder.  Rana had pointed out earlier that we could have brought back all of the other front ends and enabled the damping of the optics even though the FB was still down.  It's okay to leave the front ends & watchdogs on, and just reboot the FB, AWG, and DAQ_CTRL computers if that is necessary.

Anyhow, once the FB was back online, we got around to bringing back all of the front ends (as usual, except for the ones which are unplugged because they're in the middle of being upgraded).  Everything is back online now.

After all of this craziness, all of the Guralp channels are working happily again. It is still unknown why they started being digital zero, but they're back again. Maybe I should have rebooted the frame builder in addition to c0dcu1 last night?

 

Quote:

This is in regard to the zero signal being reported by the channels C1:PEM-SEIS_GUR1_X, C1:PEM-SEIS_GUR1_Y, and C1:PEM-SEIS_GUR1_Z.

I briefly swapped Guralp 1 EW and Guralp 2 EW to confirm to myself that the problem was not on the Guralp end (although the fact that it's digital zero is highly indicative of a digital-realm problem).  I then unplugged the 17-32, and then the 1-16, channel connections to the 110B.  I saw floating noise on the GUR2 channels, but still digital zero on the GUR1 channels, which means it's not the BNC breakout box.

There was a spare 110B, unconnected in the crate, so to do a quick test of the 110B, I turned off the crate and swapped the 110Bs, after copying the switch configuration of the first 110B to the second one.  The original 110B was labeled ADC 1, while the second 110B was labeled ADC 0.  The switches were identical except for the ones closest to the Dsub connectors on the front.  All those switches in that set were to the right, when looking down at the switches and the Dsub connectors pointing towards yourself.

Unfortunately, c0dcu1 never seemed to come up with the new 110B (ADC 0), so we put the original 110B back and turned the crate back on.

The fb then didn't seem to come back quite right.  We tried rebooting fb40m, but it's still red with status 1.  c0daqctrl is green, but c0dcu1 is red, although I'm not positive whether that's due to fb40m being in a strange state.  Jenne tried telnetting in to port 8087 and issuing a shutdown, but that didn't seem to help.  At this point, we're going to contact Alex when he gets in around 12:30.

 

 

  3247   Mon Jul 19 21:47:36 2010   rana   Summary   DAQ   DAQ timing test

Since we now have a good measurement of the phase noise of the Marconi locked to the Rb clock, I wanted to use that to check out the old DAQ system:

I used Megan's phase noise setup - Marconi #2 is putting out 11000013 Hz at 13 dBm into the ZP-3MH mixer. Marconi #1 is putting out 3 dBm at 11000000 Hz into the RF input.

The output goes through a 50 Ohm load and then a Mini-Circuits BNC LP filter (either 2 or 5 MHz). Then an SR560 set for low noise, G = 5, AC coupling, 1-pole LP @ 1 kHz.

This SR560 output goes into the channel C1:IOO-MC_DRUM1 (which is sampled at 16384 Hz with the ICS-110B, after the usual Sander Liu AA chassis containing the INA134s).

  3299   Tue Jul 27 16:03:36 2010   rana   Summary   DAQ   DAQ timing test

Quote:

Since we now have a good measurement of the phase noise of the Marconi locked to the Rb clock, I wanted to use that to check out the old DAQ system:

I used Megan's phase noise setup - Marconi #2 is putting out 11000013 Hz at 13 dBm into the ZP-3MH mixer. Marconi #1 is putting out 3 dBm at 11000000 Hz into the RF input.

The output goes through a 50 Ohm load and then a Mini-Circuits BNC LP filter (either 2 or 5 MHz). Then an SR560 set for low noise, G = 5, AC coupling, 1-pole LP @ 1 kHz.

This SR560 output goes into the channel C1:IOO-MC_DRUM1 (which is sampled at 16384 Hz with the ICS-110B, after the usual Sander Liu AA chassis containing the INA134s).

 This is the 0.3 mHz BW spectrum of this test - as you can see the apparent linewidth (assuming the width is all caused by the DAQ jitter) is comparable to the BW and therefore not resolved.

Basically, the Hanning window function is not sharp enough to do this test and so I will do it offline in Matlab.

Attachment 1: Untitled.png
  3657   Wed Oct 6 00:32:01 2010   rana   Summary   DAQ   NDS2

This is the link to the NDS2 webpage:

https://www.lsc-group.phys.uwm.edu/daswg/wiki/NetworkDataServer2

We should install this so that we can use this modern interface to get 40m data from outside and inside of the 40m.

  3702   Tue Oct 12 23:45:55 2010   rana   Configuration   DAQ   NDS2

I installed the NDS2 Client onto the workstations today using the instructions that Zach put onto the Wiki with a couple of modifications.

1) Instead of the path-adding stuff in Matlab, I added the LD_LIBRARY_PATH and MATLABPATH variables to the .cshrc, as instructed by JZ's NDS2 Wiki.

2) I installed the stuff into the shared /cvs/cds/caltech/apps/linux64/ partition so that it works now on all the 64-bit CentOS 5.5 workstations.

To run it you do:

> kinit albert.einstein

> matlab -nodesktop -nosplash

> help NDS2_GetData

(set the server to the NDS2 server that you like - the example in the help is fine)

> result = NDS2_GetData({'L1:LSC-DARM_ERR'}, 957313530, 10, server);

> plot(result.data)

Now you can get any of the S6 data super fast.

(** Remember to run kdestroy as soon as you are finished so that no one else in the control room can use your personal credentials. **)

Attachment 1: cerberus.jpg
  3939   Wed Nov 17 15:49:53 2010   rana   Update   DAQ   Ole Channel Names

The following channels should be named as below to keep in line with their names pre-upgrade rather than use _DAQ in the name.

DAQ channel name                           Front-end signal
C1:SUS-{OPT}_{POS,PIT,YAW}                 SUS{POS,PIT,YAW}_IN1
C1:SUS-{OPT}_OPLEV_{P,Y}ERROR              OL{PIT,YAW}_IN1
C1:SUS-{OPT}_SENSOR_{UL,UR,LL,LR,SIDE}     {UL,UR,LL,LR,SD}SEN_OUT
C1:SUS-{OPT}_OPLEV_{P,Y}OUT                OL{PIT,YAW}_OUT
C1:IOO-MC_TRANSPD                          MC2_OLSUM_IN1

 

  4109   Wed Jan 5 00:23:30 2011   rana   Summary   DAQ   FrameBuilder fails in a new way

Since Leo was trying to demo his LIGO Data Listener code, he noticed that there was an NDS2 issue. The NDS2 guy (JZ) noticed that the FrameBuilder had an issue.

We investigated. At 4 PM on Dec 31, the GPS timestamps in the frame file names started to be recorded wrong. In fact, the files started getting names matching the correct time from 1 year in the past.

So that's our version of the Y2011 bug. Here's the 'ls' of /frames/full:

drwxr-xr-x 2 controls controls 252K Dec 26 03:59 9773
drwxr-xr-x 2 controls controls 260K Dec 27 07:46 9774
drwxr-xr-x 2 controls controls 256K Dec 28 11:33 9775
drwxr-xr-x 2 controls controls 252K Dec 29 15:19 9776
drwxr-xr-x 2 controls controls 244K Dec 30 19:06 9777
drwxr-xr-x 2 controls controls 188K Dec 31 16:00 9778
drwxr-xr-x 2 controls controls 148K Jan  1 08:53 9463
drwxr-xr-x 2 controls controls 260K Jan  2 12:39 9464
drwxr-xr-x 2 controls controls 252K Jan  3 16:26 9465
drwxr-xr-x 2 controls controls 248K Jan  4 20:13 9466
drwxr-xr-x 2 controls controls  36K Jan  5 00:22 9467
controls@fb /frames/full $

The culprit is the directory whose name starts out as 9463, whereas it should be 9779.

 

  4112   Wed Jan 5 16:00:11 2011   rana, alex   Summary   DAQ   FrameBuilder fails in a new way

Email from Alex:

It turned out that the lack of current-year information in the IRIG-B signal received by the Symmetricom GPS card in the frame builder machine caused this. I have added a constant in daqdrc to bring the seconds forward:

controls@fb /opt/rtcds/caltech/c1/target/
fb $ grep symm daqdrc
#set symm_gps_offset=-1;
set symm_gps_offset=31536001;

Hopefully we will be upgrading to the newer timing system at the 40M this
year, so this will not happen again next year.
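(For reference: 365 days × 86400 s/day = 31,536,000 s, so the constant is one non-leap year plus one extra second.)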


 

Doing an 'ls -lrt' in /frames/full/ now shows that the names are correct:

drwxr-xr-x 2 controls controls 249856 Dec 30 19:06 9777
drwxr-xr-x 2 controls controls 192512 Dec 31 16:00 9778
drwxr-xr-x 2 controls controls 151552 Jan  1 08:53 9463
drwxr-xr-x 2 controls controls 266240 Jan  2 12:39 9464
drwxr-xr-x 2 controls controls 258048 Jan  3 16:26 9465
drwxr-xr-x 2 controls controls 253952 Jan  4 20:13 9466
drwxr-xr-x 2 controls controls 151552 Jan  5 13:54 9467
drwxr-xr-x 2 controls controls  12288 Jan  5 15:57 9783

  4115   Wed Jan 5 22:14:41 2011   rana   Summary   DAQ   FrameBuilder fails in a new way

Just a proof that the DAQ is working - ran DTT on nodus from 3 hours ago.

Attachment 1: Screen_shot_2011-01-05_at_10.13.21_PM.png
  4185   Fri Jan 21 23:17:54 2011   rana   HowTo   DAQ   DAQ Wiki Failure

The DAQ Wiki pages say to use port 8088 for restarting the Frame Builder. I tried this to no avail.

op440m:daq>telnet fb 8088
Trying 192.168.113.202...
Connected to fb.martian.
Escape character is '^]'.
^]
telnet> quit
Connection to fb.martian closed.
op440m:daq>telnet fb 8087
Trying 192.168.113.202...
Connected to fb.martian.
Escape character is '^]'.
daqd> shutdown
OK
Connection to fb.martian closed by foreign host.

Apparently, 8087 is the right port. Various elog entries from Joe and Kiwamu say 8087 or 8088. Not sure what's going on here.
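For scripting this, a non-interactive equivalent should also work (a sketch; assumes netcat is available on the workstation):

echo shutdown | nc fb 8087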

After figuring this out, I activated the C1:GCV-XARM_COARSE_OUT_DAQ and C1:GCV-XARM_FINE_OUT_DAQ channels and set both of them to be recorded at 2048 Hz. We are loading filters and setting gains into these filter modules such that the OUT signals will be calibrated into Hz (that's why we used the OUT instead of the IN1 that was used last night).

  4194   Mon Jan 24 10:39:16 2011   josephb   HowTo   DAQ   DAQ Wiki Failure

Actually, both ports 8087 and 8088 work to talk to the frame builder.  Don't let the lack of a daqd prompt fool you.

 

Here's putting in the commands:

rosalba:~>telnet fb 8088
Trying 192.168.113.202...
Connected to fb.martian (192.168.113.202).
Escape character is '^]'.
shutdown
0000Connection closed by foreign host.

rosalba:~>date
Mon Jan 24 10:30:59 PST 2011

 

Then looking at the last 3 lines of restart.log in /opt/rtcds/caltech/c1/target/fb/
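(presumably via something like: tail -n 3 /opt/rtcds/caltech/c1/target/fb/restart.log)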

daqd_start Fri Jan 21 15:20:48 PST 2011

daqd_start Fri Jan 21 23:06:38 PST 2011

daqd_start Mon Jan 24 10:30:29 PST 2011

 

So clearly it's talking to the frame builder; it just doesn't have the right formatting for the prompt.  If you try typing "help" at the prompt, you still get all the frame builder commands listed and can try using any of them.

However, I'll edit the DAQ wiki and indicate 8087 should be used because of the better formatting for the prompt.


Quote:
Apparently, 8087 is the right port. Various elog entries from Joe and Kiwamu say 8087 or 8088. Not sure what's going on here.

After figuring this out, I activated the C1:GCV-XARM_COARSE_OUT_DAQ and C1:GCV-XARM_FINE_OUT_DAQ and set both of them to be recorded at 2048 Hz. We are loading filters and setting gains into these filter modules such that the OUT signals will be calibrated into Hz (that's why we used the OUT instead of the IN1 as there was last night).

 

  4319   Thu Feb 17 23:41:46 2011   rana   Frogs   DAQ   Frames Directory got the wrong name: Data unreachable

DTT stopped working for recent data. An 'ls' in the frames/full/ directory reveals:

drwxr-xr-x 2 controls controls 258048 Feb  3 12:26 9807
drwxr-xr-x 2 controls controls 258048 Feb  4 16:13 9808
drwxr-xr-x 2 controls controls 262144 Feb  5 19:59 9809
drwxr-xr-x 2 controls controls 258048 Feb  6 23:46 9810
drwxr-xr-x 2 controls controls 258048 Feb  8 03:33 9811
drwxr-xr-x 2 controls controls 262144 Feb  9 07:19 9812
drwxr-xr-x 2 controls controls 253952 Feb 10 11:06 9813
drwxr-xr-x 2 controls controls 266240 Feb 11 14:53 9814
drwxr-xr-x 2 controls controls 266240 Feb 12 18:39 9815
drwxr-xr-x 2 controls controls 266240 Feb 13 22:26 9816
drwxr-xr-x 2 controls controls 262144 Feb 15 02:13 9817
drwxr-xr-x 2 controls controls 253952 Feb 16 05:59 9818
drwxr-xr-x 2 controls controls 241664 Feb 17 09:46 9819
drwxr-xr-x 2 controls controls  28672 Feb 17 12:22 9820
drwxr-xr-x 2 controls controls  32768 Feb 17 15:06 6663
drwxr-xr-x 2 controls controls  73728 Feb 17 23:39 6664
controls@fb /frames/full $ date
Thu Feb 17 23:39:27 PST 2011
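Since the directory names are just the leading digits of the frame GPS times (note 9820 above corresponds to GPS 9820xxxxx, i.e. Feb 17 2011), a quick sanity check is to compare against the current GPS time (a sketch; assumes a GPS conversion utility such as tconvert is installed):

controls@fb /frames/full $ tconvert now    # should print a GPS time beginning 9820..., not 6664...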

  4407   Sun Mar 13 00:00:58 2011   jzweizig, rana   Configuration   DAQ   NDS2 code change and restart

John has changed the NDS2 code and restarted it on Mafalda. The issue is that it goes off the rails every time the DAQD is restarted on FB, because of a filename-convention war between GDS and CDS.

Until this is resolved, please make sure to restart the NDS2 process on Mafalda every time you restart DAQD, by doing this:

pkill -KILL nds2

/users/jzweizig/nds2-mafalda/start_nds2
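Those two steps can go on one line (a sketch; note ';' rather than '&&', since pkill exits nonzero when no nds2 process was running):

pkill -KILL nds2; /users/jzweizig/nds2-mafalda/start_nds2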

  4705   Thu May 12 22:54:20 2011   rana   Update   DAQ   Input Beam Naming change (no more IP)

We decided to rename the Input Beam channels (while keeping temporary backwards-compatible aliases) as:

C1:ASC-IB_POS_X, C1:ASC-IB_POS_Y, C1:ASC-IB_ANG_SUM, etc.

  4779   Thu Jun 2 10:19:37 2011   Alex Ivanov   Summary   DAQ   installed new daqd (frame builder) program on fb (target/fb/daqd)

I hope that the new daqd code will fix the problem with frame-file GPS times not aligned to 16 seconds.

I have compiled the new daqd program under /opt/rtcds/caltech/c1/core/release/build/mx and installed it under target/fb/daqd, then restarted the daqd process on the "fb" computer. It was installed with root ownership and I did chmod +s on it (set-UID-on-execution bit). This was done in order to turn on some code that renices the daqd process to -20 at startup. Currently it runs at the lowest nice value (highest priority).

 

controls@fb /opt/rtcds/caltech/c1/target/fb $ ls -alt daqd
-rwsr-sr-x 1 root controls 6592694 Jun  2 10:00 daqd

 

Backup daqd is here:

 

controls@fb /opt/rtcds/caltech/c1/target/fb $ ls -alt daqd.02jun11
-rwxr-xr-x 1 controls controls 6768158 Feb 21 11:30 daqd.02jun11

 

 

  4926   Thu Jun 30 21:55:16 2011   rana   Configuration   DAQ   NDS2 conf change

As I recently had trouble getting all of the SUS SENSOR channels at once from NDS2, I asked J.Z. for help. He found that the number of buffers on mafalda was set to only allow a small amount of data to be requested at one time.

He's going to have to figure out a more permanent fix, but for now he's increased the data buffer size to allow somewhat larger chunks to be fetched. I have made a workaround in Matlab, which gets smaller chunks and then cats them together.

It's in SUS/peakFit/.

Attachment 1: Untitled.png
  4992   Tue Jul 19 21:05:55 2011   haixing   Update   DAQ   choose the right relay

Rana and I are working on the AA/AI circuit for Cymac. We need relays to bypass certain paths in the circuit, and we just found a nice website
explaining how to choose the right relay:

http://zone.ni.com/devzone/cda/tut/p/id/2774

This piece of information could be useful for others.

  6381   Wed Mar 7 21:13:30 2012   rana   Update   DAQ   NDS2

 I noticed that NDS2 was not running on mafalda as it should be. Instead, there were a couple of zombie MEDMs using up 99% of the CPU. I killed the zombies and have run the 'build channel list' script. When it finished, I tried to restart the nds server, but got the following error in the log file. Email has been dispatched to JZ.

mafalda:logs>less nds2-mafalda-201203072111.log

Configuring from file: nds2.conf
Allow list: ALL
terminate called after throwing an instance of 'std::runtime_error'
  what():  Insufficient arguments
  8861   Tue Jul 16 19:16:12 2013   rana   Update   DAQ   NDS2 Status

I have modified the settings on the router that connects our Martian network to the outside world so that one can access the NDS2 server running on megatron:31200.

To get at the data, you point your data-getting client (Matlab, ligoDV, DTT, etc.) at our router, and the megatron port will be forwarded to you:

131.215.115.189:31200

is what you should point to. Now, it should be possible to run DetChar jobs (e.g. our 40m Summary pages) from the outside on some remote server. You can also grab 40m data on your laptop directly by using matlab or python NDS software.
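A quick way to verify the port forward from an outside machine (a sketch; assumes netcat):

nc -zv 131.215.115.189 31200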

  10507   Mon Sep 15 18:55:51 2014   rana   Update   DAQ   40m frames onto the cluster

Dan Kozak is rsync transferring /frames from NODUS over to the LDAS grid. He's doing this without a BW limit, but even so it's going to take a couple of weeks. If nodus seems pokey or the net connection to the outside world is too tight, then please let me and him know so that he can throttle the pipe a little.

  10632   Wed Oct 22 21:06:33 2014   Chris   Update   DAQ   40m frames onto the cluster

Quote:

Dan Kozak is rsync transferring /frames from NODUS over to the LDAS grid. He's doing this without a BW limit, but even so it's going to take a couple of weeks. If nodus seems pokey or the net connection to the outside world is too tight, then please let me and him know so that he can throttle the pipe a little.

The recently observed daqd flakiness looks related to this transfer. It appears to still be ongoing:

nodus:~>ps -ef | grep rsync
controls 29089   382  5 13:39:20 pts/1   13:55 rsync -a --inplace --delete --exclude lost+found --exclude .*.gwf /frames/trend
controls 29100   382  2 13:39:43 pts/1    9:15 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10975 131.
controls 29109   382  3 13:39:43 pts/1    9:10 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10978 131.
controls 29103   382  3 13:39:43 pts/1    9:14 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10976 131.
controls 29112   382  3 13:39:43 pts/1    9:18 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10979 131.
controls 29099   382  2 13:39:43 pts/1    9:14 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10974 131.
controls 29106   382  3 13:39:43 pts/1    9:13 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10977 131.
controls 29620 29603  0 20:40:48 pts/3    0:00 grep rsync

Diagnosing the problem:

I logged into fb and ran "top". It said that fb was waiting for disk I/O ~60% of the time (according to the "%wa" number in the header). There were 8 nfsd (network file server) processes running, with several of them listed in status "D" (waiting for disk). The daqd logs were ending with errors like the following, suggesting that it couldn't keep up with the flow of data:

[Wed Oct 22 18:58:35 2014] main profiler warning: 1 empty blocks in the buffer
[Wed Oct 22 18:58:36 2014] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1098064730 to 1098064731

This all pointed to the possibility that the file transfer load was too heavy.

Reducing the load:

The following configuration changes were applied on fb.

Edited /etc/conf.d/nfs to reduce the number of nfsd processes from 8 to 1:

OPTS_RPC_NFSD="1"

(was "8")

Ran "ionice" to raise the priority of the framebuilder process (daqd):

controls@fb /opt/rtcds/rtscore/trunk/src/daqd 0$ sudo ionice -c 1 -p 10964

And to reduce the priority of the nfsd process:

controls@fb /opt/rtcds/rtscore/trunk/src/daqd 0$ sudo ionice -c 2 -p 11198

I also tried punishing nfsd with an even lower priority ("-c 3"), but that was causing the workstations to lag noticeably.

After these changes the %wa value went from ~60% to ~20%, and daqd seems to die less often, but some further throttling may still be in order.
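If further throttling is needed, the simplest knob is probably rsync's own bandwidth cap on the sending side (an illustrative sketch, not the command actually run; the destination is truncated to "131." in the ps listing above, so <ldas-host> is a placeholder, and the 5000 KB/s value is arbitrary):

rsync -a --bwlimit=5000 --delete --exclude lost+found --exclude '.*.gwf' /frames/full/10979 <ldas-host>:/frames/full/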

  11265   Fri May 1 13:22:08 2015   ericq   Update   DAQ   PEM Slow channels added to saved frames

Rana asked me to add the slow outputs (OUT16) of the seismometer BLRMS channels to the frames.

All of the PEM slow channels are already set up in c1/chans/daq/C1EDCU_PEM.ini, but up to this point daqd had no knowledge of this file, since it wasn't included in c1/target/fb/master, which defines all the places to look for files describing channels to be written to disk. This file already includes lines for C1EDCU_LSC.ini and such, which, judging from old elogs, were set up by hand for the subsystems we care about.

Hence, since we now care about slow trends for the PEM subsystem, I have added a line to the daqd master file to tell it to save the PEM slow channels. This looks to have increased the size of the individual 16 second frame files from 57MB to 59MB, which isn't so bad.
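The added line presumably mirrors the other entries in the master file (a sketch; the exact path prefix is an assumption based on paths quoted elsewhere in this log):

/opt/rtcds/caltech/c1/chans/daq/C1EDCU_PEM.ini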

  11266   Fri May 1 16:42:42 2015   rana   Update   DAQ   PEM Slow channels added to saved frames

Still processing, but I think it should work fine once we have a day of data. Until then, here are the summary pages so far, including Vac channels:

http://www.ligo.caltech.edu/~misi/summary/day/20150501/pem/

  11627   Mon Sep 21 15:22:19 2015   jamie   Update   DAQ   working on new fb replacement

I've been putting together a new machine that Rolf got for us as a replacement for fb.

I've installed and configured the OS, and compiled daqd and the necessary supporting software.  I want to try acquiring data with it.  This will require removing the current/old fb from the DAQ network and adding the new machine.  It should be doable relatively non-invasively, such that none of the front end configuration needs to be adjusted and the old fb can be put back in place easily.

If the test is successful, then I'll push ahead with the rest of the replacement (such as either moving or copying the /frames RAID to the new machine).

I will do this work in the early AM tomorrow, September 22, 2015.

  11636   Tue Sep 22 17:30:55 2015   jamie   Update   DAQ   attempts at getting new fb working

Today I've been trying to get the new frame builder, tentatively 'fb1', to work.  It's not fully working yet, so I'm about to revert the system back to using 'fb'.  The switch-over process is annoying, since our one Myrinet card has to be moved between the hosts.

A brief update on the process so far:

I'm being a little bold with this system by trying to build daqd against more system libraries, instead of the manually installed stuff nominally required.  Here's some of the relevant info about the fb1 system:

  • Debian 7 (wheezy)
  • lscsoft ldas-tools-framecpp-dev 2.4.1-1+deb7u0
  • lscsoft gds-dev 2.17.2-2+deb7u0
  • lscsoft libmetaio-dev 8.4.0-1+deb7u0
  • lscsoft libframe-dev 8.20-1+deb7u0
  • /opt/rtapps/epics-1.4.12.2_long
  • /opt/mx-1.2.16
  • advLigoRTS trunk

I finally managed to get daqd to build against the advLigoRTS trunk (post 2.9 branch).  I'll post a detailed build log once I work out all the kinks.  It runs ok, including writing out full frames, as well as second and minute trends and raw minute trends, but there are a couple of show-stopper problems:

  • daqd segfaults if the C1EDCU.ini is specified.  If I comment out that one file from the 'master' channel ini file list then it runs without segfaulting.
  • Something is going on with the mx_streams from the front ends:
    • They appear to look ok from the daqd side, but the FEC-<ID>_FB_NET_STATUS indicators remain red.  The "DAQ" bit in the STATE_WORD is also red.  Again, this is even though data seems to be flowing.
    • The mx_stream processes on the front ends are dying (and restarting via monit) about every 2 minutes.  It's unclear what exactly is happening, but they all die around the same time, so it was possibly initiated by a daqd problem.  Around the time of the mx_stream failures, we see this in the daqd log:
[Tue Sep 22 17:24:07 2015] GPS MISS dcu 91 (TST); dcu_gps=1127003062 gps=1127003063

Aborted 1 send requests due to remote peer Aborted 1 send requests due to remote peer 00:25:90:0d:75:bb (c1sus:0) disconnected
mx_wait failed in rcvr eid=004, reqn=11; wait did not complete; status code is Remote endpoint is closed
00:30:48:d6:11:17 (c1iscey:0) disconnected
mx_wait failed in rcvr eid=002, reqn=235; wait did not complete; status code is Remote endpoint is closed
disconnected from the sender on endpoint 002
mx_wait failed in rcvr eid=005, reqn=253; wait did not complete; status code is Bad session (missing mx_connect?)
disconnected from the sender on endpoint 005
disconnected from the sender on endpoint 004
[Tue Sep 22 17:24:13 2015] GPS MISS dcu 39 (PEM); dcu_gps=1127003062 gps=1127003069
  • Occasionally the daqd process dies when the front-end mx_stream processes die.

I'll keep investigating, hopefully with some feedback from Keith and Rolf tomorrow.

  11645   Fri Sep 25 17:51:11 2015   jamie   Update   DAQ   fb replacement work update

Brief update about the fb replacement status.

The new hardware for fb is in the rack, temporarily sitting on top of megatron, and on the CDS network with the name 'fb1'.  I've installed an OS on it and have re-built daqd.

Earlier this week I swapped it into the network and tried to get it to acquire data from the front ends.  I was ultimately unsuccessful.  The problem seemed to be the mx_stream communication from the front ends to the new host.

The swap is sort of a pain because we only have one Myrinet fiber network adapter card that has to be moved between machines, which of course requires shutting down both machines and opening up their chassis.  I instructed Steve to order us a new Myrinet card for the new machine, which will allow us to swap daqd machines by just moving the fiber connection.  Once that's in place (early next week) I'll go back to trying to figure out what the issue is with the mx_streams.

If all else fails I'll take the repulsive last resort of either swapping or cloning the disk from the old fb.

  11653   Wed Sep 30 13:59:49 2015   jamie   Update   DAQ   attempts at getting new fb working

I got Steve to get us a new Myrinet fiber network adapter card for fb1:

  • Myrinet 10G-PCIE-8B-S

I just finished installing the card in fb1, and it came up fine.  We happened to have a spare fiber, and a spare fiber jack in the DAQ switch, so I went ahead and plugged it in in parallel to the old fb:

controls@fb1:~/rtbuild/trunk 130$ /opt/mx/bin/mx_info
MX Version: 1.2.16
MX Build: controls@fb1:/opt/src/mx-1.2.16 Fri Sep 18 18:32:59 PDT 2015
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
    8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0:  364.4 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
    Status:         Running, P0: Link Up
    Network:        Ethernet 10G

    MAC Address:    00:60:dd:43:74:62
    Product code:   10G-PCIE-8B-S
    Part number:    09-04228
    Serial number:  485052
    Mapper:         00:60:dd:46:ea:ec, version = 0x00000000, configured
    Mapped hosts:   7

                                                        ROUTE COUNT
INDEX    MAC ADDRESS     HOST NAME                        P0
-----    -----------     ---------                        ---
   0) 00:60:dd:43:74:62 fb1:0                             1,0
   1) 00:25:90:0d:75:bb c1sus:0                           1,0
   2) 00:30:48:be:11:5d c1iscex:0                         1,0
   3) 00:30:48:d6:11:17 c1iscey:0                         1,0
   4) 00:30:48:bf:69:4f c1lsc:0                           1,0
   5) 00:14:4f:40:64:25 c1ioo:0                           1,0
   6) 00:60:dd:46:ea:ec fb:0                              1,0

We can now work on fb1 while fb continues to run and collect data from the front ends.

I'm still not getting the mx_stream connections to the new fb1 daq to work.  I'm leaving everything running as is on fb for the moment.

  11655   Thu Oct 1 19:49:52 2015   jamie   Update   DAQ   more failed attempts at getting new fb working

Summary

I've not really been able to make additional progress with the new 'fb1' DAQ.  It's still flaky as hell.  Therefore we're still using old 'fb'.

Issues

mx_stream

The mx_stream processes on the front ends initially run fine, connecting to the daqd and transferring data, with both DAQ-..._STATUS and FE-..._FB_NET_STATUS indicators green.  Then after about two minutes all the mx_stream processes on all the front ends die.  Monit eventually restarts them all, at which point they come up green for a while until they crash again ~2 minutes later.  This is essentially the same situation as reported previously.

In the daqd logs when the mx_streams die:

Aborted 2 send requests due to remote peer 00:30:48:be:11:5d (c1iscex:0) disconnected
Aborted 2 send requests due to remote peer 00:14:4f:40:64:25 (c1ioo:0) disconnected
Aborted 2 send requests due to remote peer 00:30:48:d6:11:17 (c1iscey:0) disconnected
Aborted 2 send requests due to remote peer 00:25:90:0d:75:bb (c1sus:0) disconnected
Aborted 1 send requests due to remote peer 00:30:48:bf:69:4f (c1lsc:0) disconnected
mx_wait failed in rcvr eid=000, reqn=176; wait did not complete; status code is Remote endpoint is closed
disconnected from the sender on endpoint 000
mx_wait failed in rcvr eid=000, reqn=177; wait did not complete; status code is Connectivity is broken between the source and the destination
disconnected from the sender on endpoint 000
mx_wait failed in rcvr eid=000, reqn=178; wait did not complete; status code is Connectivity is broken between the source and the destination
disconnected from the sender on endpoint 000
mx_wait failed in rcvr eid=000, reqn=179; wait did not complete; status code is Connectivity is broken between the source and the destination
disconnected from the sender on endpoint 000
mx_wait failed in rcvr eid=000, reqn=180; wait did not complete; status code is Connectivity is broken between the source and the destination
disconnected from the sender on endpoint 000
[Thu Oct  1 19:00:09 2015] GPS MISS dcu 39 (PEM); dcu_gps=1127786407 gps=1127786425

[Thu Oct  1 19:00:09 2015] GPS MISS dcu 39 (PEM); dcu_gps=1127786408 gps=1127786426

[Thu Oct  1 19:00:09 2015] GPS MISS dcu 39 (PEM); dcu_gps=1127786408 gps=1127786426

In the mx_stream logs:

controls@c1iscey ~ 0$ /opt/rtcds/caltech/c1/target/fb/mx_stream -r 0 -W 0 -w 0 -s 'c1x05 c1scy c1tst' -d fb1:0
mmapped address is 0x7f0df23a6000
mmapped address is 0x7f0dee3a6000
mmapped address is 0x7f0dea3a6000
send len = 263596
Connection Made
isendxxx failed with status Remote Endpoint Unreachable
disconnected from the sender

daqd

While the mx_stream processes are running, daqd seems to write out data just fine, at least for the full frames.  I manually verified that there is indeed data in the frames that are written.

Eventually, though, daqd itself crashes with the same error that we've been seeing:

main profiler warning: 0 empty blocks in the buffer

I'm not exactly sure what the crashes are coincident with, but they look to be coincident with the writing out of the minute and/or second trend files.  It's unclear how this is related to the mx_stream crashes, if at all.  The mx_stream crashes happen every couple of minutes, whereas daqd itself crashes much less frequently.

The new daqd can't handle EDCU files.  If an EDCU file is specified (e.g. C0EDCU.ini in our case), the daqd will segfault very soon after startup.  This was an issue with the current daqd on fb, but was "fixed" by moving where the EDCU file was specified in the master file.

Conclusion

There are a number of differences between the fb1 and fb configurations:

  • newer OS (Debian 7 vs. ancient gentoo)
  • newer advLigoRTS (trunk vs. 2.9.4)
  • newer framecpp library installed from LSCSoft Debian repo (2.4.1-1+deb7u0 vs. 1.19.32-p1)

It's possible those differences could account for the problems (/opt/rtapps/epics incompatible with this Debian install, for instance).  Somehow I doubt it.  I wonder if all the weird network issues we've been seeing are somehow involved.  If the NFS mount of chiara is problematic for some reason that would affect everything that mounts it, which includes all the front ends and fb/fb1.

There are two things to try:

  • Fix the weird network problem.  Try removing EVERYTHING from the network except for chiara, fb/fb1, and the front ends, and see if that helps.
  • Rebuild fb1 with Ubuntu and daqd as prescribed by Keith Thorne.
  11656   Thu Oct 1 20:24:02 2015   jamie   Update   DAQ   more failed attempts at getting new fb working

I just realized that when running fb1, if a single mx_stream dies they all die.

  11657   Thu Oct 1 20:26:21 2015   jamie   Update   DAQ   Swapping between fb and fb1

Swapping between fb and fb1 as DAQ is very straightforward, now that they are both on the DAQ network:

  • stop daqd on fb
  • on fb sudoedit /diskless/root/etc/init.d/mx_stream and set: endpoint=fb1:0
  • start daqd on fb1.  The "new" daqd binary on fb1 is at: ~controls/rtbuild/trunk/build/mx-localtime/daqd

Once daqd starts, the front end mx_stream processes will be restarted by their monits, and be pointing to the new location.

Moving back is just reversing those steps.
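The mx_stream edit can be done as a one-liner (a sketch; assumes the endpoint= assignment appears once at the start of a line in that file):

sudo sed -i 's/^endpoint=.*/endpoint=fb1:0/' /diskless/root/etc/init.d/mx_stream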

  11664   Sun Oct 4 14:28:03 2015   jamie   Update   DAQ   more failed attempts at getting new fb working

I tried to look at fb1 again today, but still haven't made any progress.

The one thing I did notice, though, is that every hour on the hour the fb1 daqd process dies in an identical manner to how the fb daqd dies, with these:

[Sun Oct  4 12:02:56 2015] main profiler warning: 0 empty blocks in the buffer

errors right as/after it tries to write out the minute trend frames.

This makes me think that this new hardware isn't actually going to fix the problem we've been seeing with the fb daqd, even if we do get daqd "working" on fb1 as well as it's currently working on fb.

  12714   Fri Jan 13 21:32:49 2017   rana   HowTo   DAQ   Get 40m data using NDS2 and Python

The attached file is a python notebook that you can use to get data. Minimal syntax.

Attachment 1: get40mData.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get some 40m data using NDS"
   ]
  },
  {
... 137 more lines ...
  12717   Sat Jan 14 00:53:05 2017   rana   HowTo   DAQ   Get 40m data using NDS2 and Python

Minute trend data seems not to be available using the NDS2 server. It's super slow using dataviewer from the control room.

Did some digging into the NDS2 config on megatron. It hasn't been updated in 2 years.

All of the stuff is run by the user 'nds2mgr'. The CronTab for this user was running all the channel name updates and server restarts at 3 AM each day; I've moved it to 5:05 AM. I don't know the password for this user, so I just did 'sudo su nds2mgr' to become him.

On megatron, in /home/nds2mgr/nds2-megatron/ there is a list of channels and configs. The file for the minute trend (C-M-ChanList.txt) hasn't been updated since Nov 2015. ???
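For reference, the moved crontab entries should now look something like this in 'crontab -l' for nds2mgr (illustrative only; the job script name here is hypothetical, not copied from the actual crontab):

5 5 * * * /home/nds2mgr/nds2-megatron/update_channels_and_restart   # hypothetical script name; 5:05 AM daily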

  12718   Sat Jan 14 12:12:03 2017   rana   Update   DAQ   minute trends missing

Did we turn off minute trend writing in one of the recent FrameBuilder debug sessions? It seems we only have second trends in 2016. Maybe this explains why it's so slow to get minute trends? Dataviewer has to rebuild them from the second trends.

controls@nodus|frames > l
total 64
drwx------   2 root     root     16384 Jun  8  2009 lost+found/
drwxr-xr-x   2 controls controls  4096 Jul 14  2015 tmp/
-rw-r--r--   1 controls controls     0 Jul 14  2015 test-file
drwxr-xr-x   5 controls controls  4096 Apr  7  2016 trend/
drwxr-xr-x   4 root     root      4096 Apr 11  2016 archive/
drwxr-xr-x 789 controls controls 36864 Jan 13 19:34 full/
controls@nodus|frames > cd trend
controls@nodus|trend > l
total 3340
drwxr-xr-x 258 controls controls 3342336 Jul  6  2015 minute_raw/
drwxr-xr-x 387 controls controls   36864 Nov  5  2015 minute/
drwxr-xr-x 969 controls controls   36864 Jan 13 19:49 second/

  12719   Sat Jan 14 12:36:57 2017   ericq   Update   DAQ   minute trends missing

Yes, writing minute trends causes hourly FB crashes in the current state of things. The "raw" minute trending is turned on, but I think these are unknown to NDS.

  12829   Wed Feb 15 00:26:44 2017   Johannes   Update   DAQ   panels and pcbs

I finished designing the PCBs for the VME crate back sides (see attached). The project files now live on the DCC at https://dcc.ligo.org/LIGO-D1700058. I ordered a prototype quantity (9) of the PCBs and bought the corresponding connectors; all will arrive within the next two weeks. Also attached are the front panels for the Acromag DAQ chassis and Lydia's RF amplifier unit (the lone +24V slot confuses me: I don't see a ground connector?). On the Acromag panel, six (3x2) of the DB37 connectors are reserved for VME hardware, two are spares, and I filled the remaining space with general-purpose BNC connectors for whatever comes up.

Attachment 1: acromag_chassis_panel.pdf
Attachment 2: vme_backplane_panel.pdf
Attachment 3: rfAmp.pdf
  12830   Wed Feb 15 09:06:13 2017   ericq   Update   DAQ   panels and pcbs

The amplifier unit should use the three-pin D-sub connectors (3W3?) that we use on many of the other units for DC power, and preferably go through the back panel. You can leave out the negative pin, since you just need +24 and ground.

  12832   Wed Feb 15 22:21:12 2017   Lydia   Update   DAQ   panels and pcbs

This is already how it's hooked up. The hole on the front that says +24 V is for an indicator light.

Quote:

The amplifier unit should use the three-pin D-sub connectors (3W3?) that we use on many of the other units for DC power, and preferably go through the back panel. You can leave out the negative pin, since you just need +24 and ground.

 

  12942   Thu Apr 13 19:54:07 2017   rana   Update   DAQ   checkup on minute trends

Our minute trends are still not available through NDS2 from the outside world due to the bad config of the DAQ, but I can confirm that we still have the minute-raw capability. This is 111 days of Seismic BLRMS.

However, it seems we're only able to get ~1 week of lookback on our second trends, and that is a low-down dirty shame. We used to have over a month of second-trend lookback before the last decade of 'upgrades'.

Attachment 1: BRLMS-trend.png
  13478   Thu Dec 14 23:27:46 2017   johannes   Update   DAQ   aux chassis design

Made a front and back panel and slot panels for DSub and IDC breakouts. I want to send this out soon; are there any comments? Preferences for color schemes?

Attachment 1: auxdaq_40m_4U_front.pdf
Attachment 2: auxdaq_40m_4U_rear.pdf
Attachment 3: auxdaq_40m_4U_DSub37x2.pdf
Attachment 4: auxdaq_40m_4U_IDC50.pdf
  13517   Tue Jan 9 00:07:03 2018   johannes   Update   DAQ   etmx slow daq chassis

All parts have been received and assembly is nearly complete. One small problem: the two DSub connectors are too close together for two cables to fit at the same time. Gautam and I will make some additional slot panels tomorrow using a waterjet cutter, so we can spread the breakout boards out and remedy this.

Fast binary channels need to be spliced into DSub connectors; Aaron is on this. All other (slow) connections are already wired from before and have been tested for correct pins on the backplane DIN connectors.

 

The Acromag modules require only a positive supply voltage between +12 and +30 VDC and draw a maximum of 2.8 W at that. This raises the question of whether we want this supply rail to be regulated or whether to take the raw power from the Sorensens. Consulting with Ben Abbott: the Acromags in LIGO do not operate with regulated power. We could easily accommodate the standard regulator boards (D1000217) in the chassis, which is probably a good idea if we want to place any custom electronics inside the chassis in the future, for example for whitening or active lowpass filtering.

  13529   Wed Jan 10 22:24:28 2018   johannes   Update   DAQ   etmx slow daq chassis

This evening I transitioned the slow controls to c1auxex2.

  1. Disconnected satellite box
  2. Turned off c1auxex
  3. Disconnected DIN cables from backplane connectors
  4. Attached purple adapter boards
  5. Labeled DSub cables for use
  6. Connected DSub cables to adapter boards and chassis
  7. Initiated modbus IOC on c1auxex2

Gautam and I then proceeded to test basic functionality

  1. Pitch bias sliders move pitch, yaw moves yaw. [yes]
  2. Coil enable and monitoring channels work. [yes]
  3. Watchdog seems to work. [yes] We set the threshold for tripping low, excited the optic, and the watchdog didn't disappoint and triggered.
  4. All channels initialize to "0" upon machine/server script restart. This means the watchdog comes up OFF, which is good. [yes] It would be great if we could also initialize PIT and YAW to retain their values from before, to avoid kicking the optic. This is not straightforward with EPICS records, but there must be a way.
  5. We got the local damping going. [yes]
  6. There is some problem with the routing of the fast BIO channels through the new chassis: the ANALOG de-whitening filter seems to be always engaged, despite our toggling the software BIO bits. [no] Something must be wrongly wired, as we confirmed by returning only the fast BIO wiring to the pre-Acromag state (with everything else still controlled by Acromag), after which we didn't have the problem anymore. Or some electrical connection is not made (I had to use gender changers on these connectors due to a lack of proper cabling).
  7. The switches for the QPD gain stages did not work. [no] I suspect a wiring problem, since the switching of the coil enables did work.

Arms are locked, and have been for ~1 hour with no hiccups. We will leave it like this overnight to observe, and debug further tomorrow.

  13530   Thu Jan 11 09:57:17 2018   Steve   Update   DAQ   acromag at ETMX

Good going Johannes!

Quote:

This evening I transitioned the slow controls to c1auxex2.

  1. Disconnected satellite box
  2. Turned off c1auxex
  3. Disconnected DIN cables from backplane connectors
  4. Attached purple adapter boards
  5. Labeled DSub cables for use
  6. Connected DSub cables to adapter boards and chassis
  7. Initiated modbus IOC on c1auxex2

Gautam and I then proceeded to test basic functionality

  1. Pitch bias sliders move pitch, yaw moves yaw. [yes]
  2. Coil enable and monitoring channels work. [yes]
  3. Watchdog seems to work. [yes] We set the threshold for tripping low, excited the optic, and the watchdog didn't disappoint and triggered.
  4. All channels initialize to "0" upon machine/server script restart. This means the watchdog comes up OFF, which is good. [yes] It would be great if we could also initialize PIT and YAW to retain their values from before, to avoid kicking the optic. This is not straightforward with EPICS records, but there must be a way.
  5. We got the local damping going. [yes]
  6. There is some problem with the routing of the fast BIO channels through the new chassis: the ANALOG de-whitening filter seems to be always engaged, despite our toggling the software BIO bits. [no] Something must be wrongly wired, as we confirmed by returning only the fast BIO wiring to the pre-Acromag state (with everything else still controlled by Acromag), after which we didn't have the problem anymore. Or some electrical connection is not made (I had to use gender changers on these connectors due to a lack of proper cabling).
  7. The switches for the QPD gain stages did not work. [no] I suspect a wiring problem, since the switching of the coil enables did work.

Arms are locked, and have been for ~1 hour with no hiccups. We will leave it like this overnight to observe, and debug further tomorrow.

 

Attachment 1: Acromg_in_action.png
  13535   Thu Jan 11 20:59:41 2018   gautam   Update   DAQ   etmx slow daq chassis

Some suggestions of checks to run, based on the rightmost column in the wiring diagram here - I guess some of these have been done already, just noting them here so that results can be posted.

  1. Oplev quadrant slow readouts should match their fast DAQ counterparts.
  2. Confirm that EX Transmon QPD whitening/gain switching are working as expected, and that quadrant spectra have the correct shape.
  3. Watchdog tripping under different conditions.
  4. Coil driver slow readbacks make sense - we should also confirm which of the slow readbacks we are monitoring (there are multiple on the SOS coil driver board) and update the MEDM screen accordingly.
  5. Confirm that shadow sensor PD whitening is working by looking at spectra.
  6. Confirm de-whitening switching capability - both to engage and disengage - maybe the procedure here can be repeated.
  7. Monitor DC alignment of ETMX - we've seen the optic wander around (as judged by the Oplev QPD spot position) while sitting in the control room; it would be useful to rule out that this is because of the DC bias voltage stability (it probably isn't).
  8. Confirm that burt snapshot recording is working as expected - this is not just for c1auxex, but for all channels, since, as Johannes pointed out, the 2018 directory was totally missing and hence no snapshots were being made.
  9. Confirm that systemd restarts IOC processes when the machine currently called c1auxex2 gets restarted for whatever reason.

 

  13537   Fri Jan 12 10:02:05 2018   johannes   Update   DAQ   etmx slow daq chassis
Quote:

There is some problem with the routing of the fast BIO channels through the new chassis: the ANALOG de-whitening filter seems to be always engaged, despite our toggling the software BIO bits. [no] Something must be wrongly wired, as we confirmed by returning only the fast BIO wiring to the pre-Acromag state (with everything else still controlled by Acromag), after which we didn't have the problem anymore. Or some electrical connection is not made (I had to use gender changers on these connectors due to a lack of proper cabling).

The switches for the QPD gain stages did not work. [no] I suspect a wiring problem, since the switching of the coil enables did work.

Both issues were fixed. In each case there were two separate causes preventing things from working.

The QPD gain stage switch software channels were assigned to the wrong physical pins of the Acromag, and additionally their DSub cable was swapped with a different one.

The BIO switching had its signal and ground wires swapped on ALL connections, and part of it was also suffering from the cable mix-up.

All backplane signals are now routed through the Acromag chassis.

 

Gautam and I did notice that occasionally the ETMX alignment will start drifting, as evident from the OpLev. I want to set up a diagnostic channel to see if the DAC voltages coming from the Acromag are responsible for this.
