ID Date Author Type Category Subject
11270   Mon May 4 10:21:09 2015   manasa   Summary   General   Delay line frequency discriminator for FOL error signal

Attached are the schematic of the analog DFD and the plot showing the zero-crossing for a delay line length of 27 cm. The bandwidth of the linear output signal roughly matches what is expected from the length difference (370 MHz).

We could use a shorter cable to further increase our bandwidth. I propose we use this analog DFD to determine the range at which the frequency counter needs to be set and then use the frequency counter readout as the error signal for FOL.

11272   Mon May 4 12:42:34 2015   manasa   Summary   General   Delay line frequency discriminator for FOL error signal

Koji suggested that I make a cosine fit for the curve instead of a linear fit.

I fit the data to $V(f) = A + B \cos(2\pi f_{b} L / v)$
where $L$ is the cable length asymmetry (27 cm), $f_{b}$ is the beat frequency and $v$ is the velocity of light in the cable ($2 \times 10^{8}$ m/s).

The plot with the cosine fit is attached.

Fit coefficients (with 95% confidence bounds):
A =      0.4177  (0.3763, 0.4591)
B =       2.941  (2.89, 2.992)
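
As a quick cross-check of the fit model above, here is a minimal Python sketch (not the MATLAB script used for the attached plot). It evaluates $V(f)$ with the quoted coefficients, computes the spacing between adjacent extrema of the cosine, $v/(2L)$, and solves for the first zero crossing. The function names are illustrative only.

```python
import math

V_CABLE = 2.0e8   # m/s, propagation speed in the cable quoted in the entry
L_ASYM = 0.27     # m, cable length asymmetry quoted in the entry


def dfd_output(f_beat_hz, a, b, length_m=L_ASYM, v=V_CABLE):
    """Model V(f) = A + B*cos(2*pi*f*L/v)."""
    return a + b * math.cos(2.0 * math.pi * f_beat_hz * length_m / v)


def half_period_hz(length_m=L_ASYM, v=V_CABLE):
    """Frequency span between adjacent extrema of the cosine: v / (2 L)."""
    return v / (2.0 * length_m)


def first_zero_crossing_hz(a, b, length_m=L_ASYM, v=V_CABLE):
    """Lowest f >= 0 with V(f) = 0, i.e. cos(x) = -A/B (needs |A| <= |B|)."""
    x = math.acos(-a / b)
    return x * v / (2.0 * math.pi * length_m)


A_FIT, B_FIT = 0.4177, 2.941   # fit coefficients quoted in the entry
print("half period [MHz]:", half_period_hz() / 1e6)
print("first zero crossing [MHz]:", first_zero_crossing_hz(A_FIT, B_FIT) / 1e6)
```

The half period for a 27 cm asymmetry comes out near 370 MHz, in line with the bandwidth quoted in entry 11270.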

11300   Mon May 18 14:46:20 2015   manasa   Summary   General   Delay line frequency discriminator for FOL error signal

Measuring the voltage noise and frequency response of the Analog Delay-line Frequency Discriminator (DFD)

The schematic and a photo of the setup are shown below. The setup was checked to be physically sturdy with no loose connections or moving parts.

The voltage noise at the output of the DFD was measured using an SR785 signal analyzer while simultaneously monitoring the signal on an oscilloscope.

The noise at the output of the DFD was measured for no RF input and at several RF input frequencies including the zero crossing frequency and the optimum operating frequency of the DFD (20MHz).

The plot below shows the voltage noise for different RF inputs to the DFD. It can be seen that the noise level is slightly lower at the zero crossing frequency where the amplitude noise is eliminated by the DFD.

I also did measurements to obtain the frequency response of the setup as the cable length difference has changed from the prior setup. The cable length difference is 21cm and the obtained linear signal at the output of the DFD extends over ~ 380MHz which is good enough for our purposes in FOL. A cosine fit to the data was done as before. //edit- Manasa: The gain of SR560 was set to 20 to obtain the data shown below//

Fit Coefficients (with 95% confidence bounds):
a =     -0.8763  (-1.076, -0.6763)
b =       3.771  (3.441, 4.102)

Data and matlab scripts are zipped and attached.
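
For readers without MATLAB, the cosine fit can be sketched in plain Python. Once $L$ and $v$ are fixed, the model $V = a + b\cos(2\pi f L / v)$ is linear in $a$ and $b$, so ordinary least squares via the 2x2 normal equations is enough. This is only an illustration under that assumption: the data below are synthetic (generated from the quoted coefficients), not the attached measurement, and `fit_cosine` is a hypothetical helper name.

```python
import math


def fit_cosine(freqs_hz, volts, length_m, v=2.0e8):
    """Least-squares estimate of (a, b) in V = a + b*cos(2*pi*f*L/v)."""
    c = [math.cos(2.0 * math.pi * f * length_m / v) for f in freqs_hz]
    n = len(c)
    sc = sum(c)
    scc = sum(ci * ci for ci in c)
    sv = sum(volts)
    scv = sum(ci * vi for ci, vi in zip(c, volts))
    det = n * scc - sc * sc
    if det == 0:
        raise ValueError("degenerate data: cosine term is constant")
    # Solve [[n, sc], [sc, scc]] [a, b]^T = [sv, scv]^T
    a = (sv * scc - sc * scv) / det
    b = (n * scv - sc * sv) / det
    return a, b


# Synthetic, noiseless data built from the coefficients quoted above
# (a = -0.8763, b = 3.771, L = 21 cm); NOT the measured data set.
L_NEW = 0.21
freqs = [i * 10e6 for i in range(40)]
volts = [-0.8763 + 3.771 * math.cos(2 * math.pi * f * L_NEW / 2.0e8) for f in freqs]
a_hat, b_hat = fit_cosine(freqs, volts, L_NEW)
print(a_hat, b_hat)
```

With real data the confidence bounds would additionally need the residual variance, which this sketch does not compute.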

11368   Mon Jun 22 12:57:09 2015   ericq   Summary   LSC   X/Y green beat mode overlap measurement redone

I took measurements at the green beat setup on the PSL table, and found that our power / mode overlap situation is still consistent with what Koji and Manasa measured last September [ELOG 10492]. I also measured the powers at the BBPDs with the Ophir power meter.

Both mode overlaps are around 50%, which is fine.

The beatnote amplitudes at the BBPD outputs at a frequency of about 50MHz are -20.0 and -27.5 dBm for the X and Y beats, respectively. This is consistent with the measured optical power levels and a PD response of ~0.25 A/W at 532nm. The main reason for the disparity is that there is much more X green light than Y green light on the table (factor of ~20), and the greater amount of green PSL light on the Y BBPD (factor of ~3) does not quite make up for it.

One way to punch up the Y beat a little might be to adjust the pickoff optics. Of 25uW of Y arm transmitted green light incident on the polarizing beamsplitter that separates the X and Y beams, only 13uW makes it to the Y BBPD, but this would only win us a couple of dB at most.

In any case, with the beat setup as it exists, it looks like we should design the next beatbox iteration to accept RF inputs of around -20 to -30 dBm.

In the style of the referenced ELOG, here are today's numbers.

                 XARM     YARM
o BBPD DC output (mV)
  V_DARK:     +  1.0   +  2.2
  V_PSL:      +  7.1   + 21.3
  V_ARM:      +165.0   +  8.2

o BBPD DC photocurrent (uA)
  I_DC = V_DC / R_DC ... R_DC: DC transimpedance (2kOhm)
  I_PSL:         3.6     10.7
  I_ARM:        82.5      4.1

o Expected beat note amplitude
  I_beat_full = I1 + I2 + 2 sqrt(e I1 I2) cos(w t) ... e: mode overlap (in power)
  I_beat_RF   = 2 sqrt(e I1 I2)
  V_RF        = 2 R sqrt(e I1 I2) ... R: RF transimpedance (2kOhm)
  P_RF = V_RF^2/2/50 [Watt]
       = 10 log10(V_RF^2/2/50*1000) [dBm]
       = 10 log10(e I1 I2) + 82.0412 [dBm]
       = 10 log10(e) + 10 log10(I1 I2) + 82.0412 [dBm]

  for e=1, the expected RF power at the PDs [dBm]
  P_RF:        -13.2    -21.5

o Measured beat note power (no alignment done)
  P_RF:        -20.0    -27.5  [dBm] (53.0MHz and 46.5MHz)
  e:            45.7     50.1  [%]
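
The expected beat power for e=1 can be re-derived from the photocurrents with a few lines of Python. This is a sketch of the formula chain written out above, with the 2 kOhm RF transimpedance and 50 Ohm load taken from the entry. The function name is illustrative.

```python
import math

R_RF = 2000.0    # ohm, RF transimpedance quoted in the entry
Z_LOAD = 50.0    # ohm, load used in the P_RF formula


def expected_beat_dbm(i1_amp, i2_amp, overlap=1.0, r_rf=R_RF, z_load=Z_LOAD):
    """P_RF in dBm from V_RF = 2 R sqrt(e I1 I2) and P = V_RF^2 / 2 / Z."""
    v_rf = 2.0 * r_rf * math.sqrt(overlap * i1_amp * i2_amp)  # peak volts
    p_watt = v_rf ** 2 / 2.0 / z_load
    return 10.0 * math.log10(p_watt * 1000.0)


# Photocurrents from the table above, converted from uA to A
p_x = expected_beat_dbm(3.6e-6, 82.5e-6)    # X arm: I_PSL, I_ARM
p_y = expected_beat_dbm(10.7e-6, 4.1e-6)    # Y arm: I_PSL, I_ARM
print(round(p_x, 1), round(p_y, 1))
```

Both values agree with the "for e=1" row of the table to within rounding.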

11370   Mon Jun 22 14:53:37 2015   rana   Summary   LSC   X/Y green beat mode overlap measurement redone
• Why is there a factor of 20 power difference? Some of it is the IR laser power difference, but I thought that was just a factor of 4 in green.
• Why is the mode overlap only 50% and not more like 75%?
• If we have enough PSL green power, we could do the Y-beat with an 80/20 instead of a 50/50 and get better SNR.
• The FFD-100 response is more like 0.33 A/W at 532 nm, not 0.25 A/W.

In any case, this signal difference is not big, so we should not need a different amplifier chain for the two signals. The 20 dB of amplification in the BeatBox was a fine way, but not great in circuit layout.

The BBPD has an input referred current noise of 10 pA/rHz and a transimpedance of 2 kOhm, so an output voltage noise of 20 nV/rHz (into 50 Ohms). This would be matched by an Amp with NF = 26 dB, which is way worse than anything we could buy from mini-circuits, so we should definitely NOT use anything like the low-noise, low output power amps used currently (e.g. ZFL-1000LN....never, ever use these for anything). We should use a single ZHL-3A-S (G = 25 dB, NF < 6 dB, Max Out = 30 dBm) for each channel (and nothing else) before driving the cables over to the LSC rack into the aLIGO demod board. I just ordered two of these now.
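
The noise budget in the paragraph above can be sanity-checked numerically. The sketch below makes one assumption that the entry does not state: the reference for the "equivalent noise figure" is taken to be the Johnson noise voltage of a 50 Ohm resistor at 290 K, sqrt(4 k T R). Other conventions shift the answer by a few dB. The helper names are illustrative.

```python
import math

K_B = 1.380649e-23  # J/K, Boltzmann constant


def johnson_noise_v(r_ohm, temp_k=290.0):
    """Thermal noise voltage density sqrt(4 k T R) in V/rtHz."""
    return math.sqrt(4.0 * K_B * temp_k * r_ohm)


def pd_output_noise_v(i_noise_a, transimpedance_ohm):
    """Output voltage noise density from input-referred current noise."""
    return i_noise_a * transimpedance_ohm


def equivalent_nf_db(v_noise, r_ohm=50.0, temp_k=290.0):
    """Ratio of v_noise to the assumed 50 Ohm Johnson-noise reference, in dB."""
    return 20.0 * math.log10(v_noise / johnson_noise_v(r_ohm, temp_k))


v_pd = pd_output_noise_v(10e-12, 2000.0)   # 10 pA/rtHz * 2 kOhm
nf_equiv = equivalent_nf_db(v_pd)
amp_out_dbm = -20.0 + 25.0                 # strongest beat (-20 dBm) after G = 25 dB
print(v_pd, nf_equiv, amp_out_dbm)
```

Under this assumption the photodiode noise sits roughly 27 dB above the 50 Ohm reference, close to the 26 dB quoted, and the amplified X beat stays far below the 30 dBm maximum output.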

11384   Tue Jun 30 11:33:00 2015   Jamie   Summary   CDS   prepping for CDS upgrade

This is going to be a big one.  We're at version 2.5 and we're going to go to 2.9.3.

RCG components that need to be updated:

• mbuf kernel module
• mx_stream driver
• iniChk.pl script
• daqd
• nds

Supporting software:

• EPICS 3.14.12.2_long
• ldas-tools (framecpp) 1.19.32-p1
• libframe 8.17.2
• gds 2.16.3.2
• fftw 3.3.2

Things to watch out for:

• RTS 2.6:
  • raw minute trend frame location has changed (CRC-based subdirectory)
  • new kernel patch
• RTS 2.7:
  • supports "commissioning frames", which we will probably not utilize.  need to make sure that we're not writing extra frames somewhere
• RTS 2.8:
  • "slow" (EPICS) data from the front-end processes is acquired via DAQ network, and not through EPICS.  This will increase traffic on the DAQ lan.  Hopefully this will not be an issue, and the existing network infrastructure can handle it, but it should be monitored.

11390   Wed Jul 1 19:16:21 2015   Jamie   Summary   CDS   CDS upgrade in progress

## The CDS upgrade is now underway

Here's what's happened so far:

• Installed and linked in all the RTS supporting software packages in /opt/rtapps (only on front end machines and fb):

controls@c1lsc ~ 2$ find /opt/rtapps/ -mindepth 1 -maxdepth 1 -type l -ls
12582916 0 lrwxrwxrwx 1 controls 1001 12 Jul 1 13:16 /opt/rtapps/gds -> gds-2.16.3.2
12603452 0 lrwxrwxrwx 1 controls 1001 10 Jul 1 13:17 /opt/rtapps/fftw -> fftw-3.3.2
12603451 0 lrwxrwxrwx 1 controls 1001 15 Jul 1 13:16 /opt/rtapps/libframe -> libframe-8.17.2
12603450 0 lrwxrwxrwx 1 controls 1001 13 Jul 1 13:16 /opt/rtapps/libmetaio -> libmetaio-8.2
12582915 0 lrwxrwxrwx 1 controls 1001 34 Jul 1 15:24 /opt/rtapps/framecpp -> ldas-tools-1.19.32-p1/linux-x86_64
12582914 0 lrwxrwxrwx 1 controls 1001 20 Jul 1 13:15 /opt/rtapps/epics -> epics-3.14.12.2_long

• Checked out the RTS source for the version we'll be using: 2.9.4

/opt/rtcds/rtscore/tags/advLigoRTS-2.9.4

• built and installed all of the RTS components:
  • mbuf
  • mx_stream
  • daqd
  • nds
  • awgtpman
• mx_stream is not working. Unknown why. It won't start on the front end machines (only tested on c1lsc so far) with the following error:

controls@c1lsc ~ 1$ /opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x04 c1lsc c1ass c1oaf c1cal -d fb:0
mmapped address is 0x7ff7b71a0000
send len = 263596
mx_connect failed Remote Endpoint is Closed
controls@c1lsc ~ 1$

Have contacted Keith T. and Rolf B. for backup. This is a blocker, since this is what ferries the data from the front ends.

• Rebuilt almost all models. This was good. Initially nothing would compile because of IPC creation errors, so I moved the old chans/ipc/C1.ipc file out of the way and generated a new one and then everything compiled (of course senders have to be compiled before receivers). I only had to fix a couple of things in the models themselves:
  • c1ioo - unterminated FiltCtrl inputs
  • C1_SUS_SINGLE_CONTROL - unterminated FiltCtrl inputs
  • c1oaf - bad part named "STATIC". There is some hacky namespace stuff going on in the RCG. I was able to just explode that part and it now works.
  • c1lsc - unterminated FiltCtrl inputs

Haven't installed or tried to run anything yet, but the fact they compile is good.

Some models are not compiling because they have C code in src blocks that are throwing errors:
  • c1lsc
  • c1cal

It shouldn't be too hard to fix whatever is causing those compile errors.

That's it for today. Will pick up again first thing tomorrow.

11392   Tue Jul 7 17:22:16 2015   Jessica   Summary   Time Delay in ALS Cables

I measured the transfer functions in the delay line cables, and then calculated the time delay from that. The first cable had a time delay of 1272 ns and the second had a time delay of 1264 ns. Below are the plots I created to calculate this.

There does seem to be a pattern in the residual plots however, which was not expected. The R-Square parameter was very close to 1 for both fits, indicating that the fit was good.

11393   Tue Jul 7 18:27:54 2015   Jamie   Summary   CDS   CDS upgrade: progress!

After a couple of days of struggle, I made some progress on the CDS upgrade today:

## Front end status:

• RTS upgraded to 2.9.4, and linked in as "release":
  /opt/rtcds/rtscore/release -> tags/advLigoRTS-2.9.4
• mbuf kernel module built and installed
• All front ends have been rebooted with the latest patched kernel (from 2.6 upgrade)
• All models have been rebuilt, installed, restarted. Only minor model issues had to be corrected (unterminated unused inputs mostly).
• awgtpman rebuilt, and installed/running on all front-ends
• open-mx upgraded to 1.5.2:
  /opt/open-mx -> open-mx-1.5.2
• All front ends running latest version of mx_stream, built against 2.9.4 and open-mx-1.5.2.

We have new GDS overview screens for the front end models:

It's possible that our current lack of IRIG-B GPS distribution means that the 'TIM' status bit will always be red on the IOP models. Will consult with Rolf.

There are other new features in the front ends that I can get into later.

## DAQ (fb) status:

• daqd and nds rebuilt against 2.9.4, both now running on fb

40m daqd compile flags:

cd src/daqd
./configure --enable-debug --disable-broadcast --without-myrinet --with-mx --enable-local-timing --with-epics=/opt/rtapps/epics/base --with-framecpp=/opt/rtapps/framecpp
make
make clean
install daqd /opt/rtcds/caltech/c1/target/fb/

However, daqd has unfortunately been very unstable, and I've been trying to figure out why. I originally thought it was some sort of timing issue, but now I'm not so sure.

I had to make the following changes to the daqdrc:

set gps_leaps = 820108813 914803214 1119744016;

That enumerates some list of leap seconds since some time. Not sure if that actually does anything, but I added the latest leap seconds anyway:

set symm_gps_offset=315964803;

This updates the silly, arbitrary GPS offset, that is required to be correct when not using external GPS reference.

The last thing I did, which finally got it running stably, was to turn off all trend frame writing:

# start trender;
# start trend-frame-saver;
# sync trend-frame-saver;
# start minute-trend-frame-saver;
# sync minute-trend-frame-saver;
# start raw_minute_trend_saver;

For whatever reason, it's the trend frame writing that was causing daqd to fall over after a short amount of time.

I'll continue investigating tomorrow. We still have a lot of cleanup, burt restores, testing, etc. to do, but we're getting there.

11395   Wed Jul 8 17:46:20 2015   Jessica   Summary   General   Updated Time Delay Plots

I re-measured the transfer function for Cable B, because the residuals in my previous post for cable B indicated a bad fit. I also realized I had made a mistake in calculating the time delay, and calculated more reasonable time delays today. Cable A had a delay of 202.43 +- 0.01 ns. Cable B had a delay of 202.44 +- 0.01 ns.

11396   Wed Jul 8 20:37:02 2015   Jamie   Summary   CDS   CDS upgrade: one step forward, two steps back

After determining yesterday that all the daqd issues were coming from the frame writing, I started to dig into it more today. I also spoke to Keith Thorne, and got some good suggestions from Gerrit Kuhn at GEO.

I realized that it probably wasn't the trend writing per se, but that turning on more writing to disk was causing increased load on daqd, and consequently on the system itself. With more frame writing turned on the memory consumption increased to the point of maxing out the physical RAM. The system then probably started swapping, which certainly would have choked daqd.

I noticed that fb only had 4G of RAM, which Keith suggested was just not enough. Even if the memory consumption of daqd has increased significantly, it still seems like 4G would not be enough. I opened up fb only to find that fb actually had 8G of RAM installed! Not sure what happened to the other 4G, but somehow they were not visible to the system.

Koji and I eventually determined, via some frankenstein operations with megatron, that the RAM was just dead. We then pulled 4G of RAM from megatron and replaced the bad RAM in fb, so that fb now has a full 8G of RAM.

Unfortunately, when we got fb fully back up and running we found that fb is not able to see any of the other hosts on the data concentrator network. mx_info, which displays the card and network status for the myricom myrinet fiber card, shows:

MX Version: 1.2.16
MX Build: controls@fb:/opt/src/mx-1.2.16 Tue May 21 10:58:40 PDT 2013
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
        8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0:  299.8 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
        Status:         Running, P0: Wrong Network
        Network:        Myrinet 10G

        MAC Address:    00:60:dd:46:ea:ec
        Product code:   10G-PCIE-8AL-S
        Part number:    09-03916
        Serial number:  352143
        Mapper:         00:60:dd:46:ea:ec, version = 0x63e745ee, configured
        Mapped hosts:   1

                                                        ROUTE COUNT
INDEX    MAC ADDRESS     HOST NAME                        P0
-----    -----------     ---------                        ---
   0) 00:60:dd:46:ea:ec fb:0                              D 0,0

Note that all front end machines should be listed in the table at the bottom, and they're not. Also note the "Wrong Network" note in the Status line above. It appears that the card has maybe been initialized in a bad state? Or Koji and I somehow disturbed the network when we were cleaning up things in the rack. "sudo /etc/init.d/mx restart" on fb doesn't solve the problem. We even rebooted fb and it didn't seem to help.

In any event, we're back to no data flow. I'll pick up again tomorrow.

11397   Wed Jul 8 21:02:02 2015   Jamie   Summary   CDS   CDS upgrade: another step forward, so we're back to where we started (plus a bit?)

Koji did a bit of googling to determine that 'Wrong Network' status message could be explained by the fb myrinet operating in the wrong mode:

(This was the useful link to track down the issue (KA))

        Network:        Myrinet 10G

I didn't notice it before, but we should in fact be operating in "Ethernet" mode, since that's the fabric we're using for the DC network. Digging a bit deeper we found that the new version of mx (1.2.16) had indeed been configured with a different compile option than the 1.2.15 version had:

controls@fb ~ 0$ grep '$ ./configure' /opt/src/mx-1.2.15/config.log
  $ ./configure --enable-ether-mode --prefix=/opt/mx
controls@fb ~ 0$ grep '$ ./configure' /opt/src/mx-1.2.16/config.log
  $ ./configure --enable-mx-wire --prefix=/opt/mx-1.2.16
controls@fb ~ 0$

So that would entirely explain the problem.  I re-linked mx to the older version (1.2.15), reloaded the mx drivers, and everything showed up correctly:

controls@fb ~ 0$ /opt/mx/bin/mx_info
MX Version: 1.2.12
MX Build: root@fb:/root/mx-1.2.12 Mon Nov 1 13:34:38 PDT 2010
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
        8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0:  299.8 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
        Status:         Running, P0: Link Up
        Network:        Ethernet 10G

        MAC Address:    00:60:dd:46:ea:ec
        Product code:   10G-PCIE-8AL-S
        Part number:    09-03916
        Serial number:  352143
        Mapper:         00:60:dd:46:ea:ec, version = 0x00000000, configured
        Mapped hosts:   6

                                                        ROUTE COUNT
INDEX    MAC ADDRESS     HOST NAME                        P0
-----    -----------     ---------                        ---
   0) 00:60:dd:46:ea:ec fb:0                              1,0
   1) 00:25:90:0d:75:bb c1sus:0                           1,0
   2) 00:30:48:be:11:5d c1iscex:0                         1,0
   3) 00:30:48:d6:11:17 c1iscey:0                         1,0
   4) 00:30:48:bf:69:4f c1lsc:0                           1,0
   5) 00:14:4f:40:64:25 c1ioo:0                           1,0
controls@fb ~ 0$

The front end hosts are also showing good omx info (even though they had been previously as well):

controls@c1lsc ~ 0$ /opt/open-mx/bin/omx_info
Open-MX version 1.5.2
 build: controls@fb:/opt/src/open-mx-1.5.2 Tue May 21 11:03:54 PDT 2013

Found 1 boards (32 max) supporting 32 endpoints each:
 c1lsc:0 (board #0 name eth1 addr 00:30:48:bf:69:4f)
   managed by driver 'igb'

Peer table is ready, mapper is 00:30:48:d6:11:17
================================================
  0) 00:30:48:bf:69:4f c1lsc:0
  1) 00:60:dd:46:ea:ec fb:0
  2) 00:25:90:0d:75:bb c1sus:0
  3) 00:30:48:be:11:5d c1iscex:0
  4) 00:30:48:d6:11:17 c1iscey:0
  5) 00:14:4f:40:64:25 c1ioo:0
controls@c1lsc ~ 0$

This got all the mx_stream connections back up and running.

Unfortunately, daqd is back to being a bit flaky.  With all frame writing enabled we saw daqd crash again.  I then shut off all trend frame writing and we're back to a marginally stable state: we have data flowing from all front ends, and full frames are being written, but not trends.

I'll pick up on this again tomorrow, and maybe try to rebuild the new version of mx with the proper flags.

11398   Thu Jul 9 13:26:47 2015   Jamie   Summary   CDS   CDS upgrade: new mx 1.2.16 installed

I rebuilt/installed mx 1.2.16 to use "ether-mode", instead of the default MX-10G:

controls@fb /opt/src/mx-1.2.16 0$ ./configure --enable-ether-mode --prefix=/opt/mx-1.2.16
...
controls@fb /opt/src/mx-1.2.16 0$ make
..
controls@fb /opt/src/mx-1.2.16 0$ make install
...

I then rebuilt/installed daqd so that it properly linked against the updated mx install:

controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ ./configure --enable-debug --disable-broadcast --without-myrinet --with-mx --with-epics=/opt/rtapps/epics/base --with-framecpp=/opt/rtapps/framecpp --enable-local-timing
...
controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ make
...
controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ install daqd /opt/rtcds/caltech/c1/target/fb/

It's now back to running and receiving data from the front ends (still not stable yet, though).

11400   Thu Jul 9 16:50:13 2015   Jamie   Summary   CDS   CDS upgrade: if all else fails try throwing metal at the problem

I roped Rolf into coming over and adding his eyes to the problem.  After much discussion we couldn't come up with any reasonable explanation for the problems we've been seeing other than daqd just needing a lot more resources than it did before.  He said he had some old Sun SunFire X4600s from which we could pilfer memory.  I went over to Downs and ripped all the CPU/memory cards out of one of his machines and stuffed them into fb:

fb now has 8 CPU and 16G of RAM

Unfortunately, this is still not enough.  Or at least it didn't solve the problem; daqd is showing the same instabilities, falling over a couple of minutes after I turn on trend frame writing.  As always, before daqd fails it starts spitting out the following to the logs:

[Thu Jul  9 16:37:09 2015] main profiler warning: 0 empty blocks in the buffer

followed by lines like:

[Thu Jul  9 16:37:27 2015] GPS MISS dcu 44 (ASX); dcu_gps=1120520264 gps=1120519812

right before it dies.
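
When scanning a long daqd log for these messages it helps to pull out the two GPS values and see how far apart they are. The sketch below is a minimal example: the regular expression is inferred from the single "GPS MISS" line quoted above (so the exact log format is an assumption), and the GPS-to-UTC conversion assumes the 17 s GPS-UTC offset that applies after the July 2015 leap second.

```python
import re
from datetime import datetime, timezone

GPS_EPOCH_UNIX = 315964800   # Unix time of the GPS epoch, 1980-01-06 00:00:00 UTC
GPS_UTC_LEAP_S = 17          # GPS-UTC offset valid from 2015-07-01 (assumed period)

# Pattern inferred from the one log line quoted above; other daqd versions may differ.
MISS_RE = re.compile(r"GPS MISS dcu (\d+) \((\w+)\); dcu_gps=(\d+) gps=(\d+)")


def parse_gps_miss(line):
    """Return a dict with the dcu id, name and both GPS times, or None."""
    m = MISS_RE.search(line)
    if m is None:
        return None
    return {
        "dcu": int(m.group(1)),
        "name": m.group(2),
        "dcu_gps": int(m.group(3)),
        "gps": int(m.group(4)),
    }


def gps_to_utc(gps_s, leap_s=GPS_UTC_LEAP_S):
    """Convert GPS seconds to a UTC datetime using a fixed leap-second offset."""
    return datetime.fromtimestamp(gps_s + GPS_EPOCH_UNIX - leap_s, tz=timezone.utc)


line = "[Thu Jul  9 16:37:27 2015] GPS MISS dcu 44 (ASX); dcu_gps=1120520264 gps=1120519812"
rec = parse_gps_miss(line)
print(rec["dcu_gps"] - rec["gps"], gps_to_utc(rec["dcu_gps"]), gps_to_utc(rec["gps"]))
```

For the quoted line the front-end time (dcu_gps) is 452 s ahead of the gps value, and dcu_gps converts to 23:37:27 UTC, which is the same instant as the 16:37:27 PDT log timestamp.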

I'm no longer convinced that this is a resource issue, though, judging by the resource usage right before the crash:

top - 16:47:32 up 48 min,  5 users,  load average: 0.91, 0.62, 0.61
Tasks:   2 total,   0 running,   2 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.9%us,  0.9%sy,  0.0%ni, 89.1%id,  0.9%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  15952104k total, 13063468k used,  2888636k free,   138648k buffers
Swap:  1023996k total,        0k used,  1023996k free,  7672292k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12016 controls  20   0 8098m 4.4g 104m S  106 29.1   6:45.79 daqd
4953 controls  20   0 53580 6092 5096 S    0  0.0   0:00.04 nds

Load average less than 1 per CPU, plenty of free memory (~3G free, 0 swap), no waiting for IO (0.9%wa), etc.  daqd is utilizing lots of  threads, which should be spread across many cpus, so even the >100%CPU should be ok.   I'm at a loss...

11402   Mon Jul 13 01:11:14 2015   Jamie   Summary   CDS   CDS upgrade: current assessment

daqd is still behaving unstably.  It's still unclear what the issue is.

The current failures look like disk IO contention.  However, it's hard to see any evidence that daqd is suffering from large IO wait while it's failing.

The frame size itself is currently smaller than it was before the upgrade:

controls@fb /frames/full 0$ ls -alth 11190 | head
total 369G
drwxr-xr-x 321 controls controls  36K Jul 12 22:20 ..
drwxr-xr-x   2 controls controls 268K Jun 23 06:06 .
-rw-r--r--   1 controls controls  67M Jun 23 06:06 C-R-1119099984-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:06 C-R-1119099968-16.gwf
-rw-r--r--   1 controls controls  69M Jun 23 06:05 C-R-1119099952-16.gwf
-rw-r--r--   1 controls controls  69M Jun 23 06:05 C-R-1119099936-16.gwf
-rw-r--r--   1 controls controls  67M Jun 23 06:05 C-R-1119099920-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:05 C-R-1119099904-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:04 C-R-1119099888-16.gwf
controls@fb /frames/full 0$ ls -alth 11208 | head
total 17G
drwxr-xr-x   2 controls controls  20K Jul 13 01:00 .
-rw-r--r--   1 controls controls  45M Jul 13 01:00 C-R-1120809632-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 01:00 C-R-1120809408-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809392-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809376-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809360-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:55 C-R-1120809344-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:55 C-R-1120809328-16.gwf
controls@fb /frames/full 0$

This would seem to indicate that it's not an increase in frame size that's to blame.

Because slow data is now transported to daqd over the MX data concentrator network rather than via EPICS (RTS 2.8), there is more traffic on the MX network. I note also that the channel lists have increased in size:

controls@fb /opt/rtcds/caltech/c1/chans/daq 0$ ls -alt archive/C1LSC* | head -20
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 18:21 archive/C1LSC_150706_182146.ini
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 18:16 archive/C1LSC_150706_181603.ini
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 16:09 archive/C1LSC_150706_160946.ini
-rw-r--r-- 1 4294967294 4294967294  43366 Jul  1 16:05 archive/C1LSC_150701_160519.ini
-rw-r--r-- 1 4294967294 4294967294  43366 Jun 25 15:47 archive/C1LSC_150625_154739.ini
...

I would have thought, though, that data transmission errors would show up in the daqd status bits.
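
The frame listings in this entry can also be checked for missing stretches of data. The sketch below assumes, from the file names shown in this entry, that full frames are named `C-R-<GPS start>-<duration>.gwf`; the helper names are illustrative and the script is not part of the RTS tooling.

```python
import re

# Naming pattern inferred from the listing above: <site>-<type>-<gps start>-<duration>.gwf
FRAME_RE = re.compile(r"^([A-Z]+)-([A-Za-z0-9_]+)-(\d+)-(\d+)\.gwf$")


def parse_frame_name(name):
    """Split a frame file name into (site, frame type, GPS start, duration in s)."""
    m = FRAME_RE.match(name)
    if m is None:
        raise ValueError("unexpected frame file name: %r" % name)
    site, ftype, start, dur = m.groups()
    return site, ftype, int(start), int(dur)


def find_gaps(names):
    """Return (gap_start, gap_end) GPS pairs where consecutive frames do not abut."""
    spans = sorted((s, s + d) for _, _, s, d in map(parse_frame_name, names))
    gaps = []
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps


def write_rate_mb_per_s(size_mb, duration_s):
    """Average write rate implied by one frame file."""
    return size_mb / duration_s


post_upgrade = [
    "C-R-1120809632-16.gwf", "C-R-1120809408-16.gwf", "C-R-1120809392-16.gwf",
    "C-R-1120809376-16.gwf", "C-R-1120809360-16.gwf", "C-R-1120809344-16.gwf",
    "C-R-1120809328-16.gwf",
]
print(find_gaps(post_upgrade), write_rate_mb_per_s(50, 16))
```

Applied to the post-upgrade listing, this flags one hole of 208 s before the newest frame, and a 50M frame every 16 s corresponds to about 3 MB/s.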

11404   Mon Jul 13 18:12:50 2015   Jamie   Summary   CDS   CDS upgrade: left running in semi-stable configuration

I have been watching daqd all day and I don't feel particularly closer to understanding what the issues are.  However, things are

Interestingly, though, the stability appears highly variable at the moment.  This morning, daqd was very unstable and was crashing within a couple of minutes of starting.  However this afternoon, things seemed much more stable.  As of this moment, daqd has been running for 25 minutes now, writing full frames as well as minute and second trends (no minute_raw), without any issues.  What has changed?

To reiterate, I have been closely watching disk IO to /frames.  I see no indication that there is any disk contention while daqd is failing.  It's still possible, though, that there are disk IO issues affecting daqd at a level that is not readily visible.  From dstat, the frame writes are visible, but nothing else.

I have made one change that could be positively affecting things right now: I un-exported /frames from NFS.  This eliminates anything external from reading /frames over the network.  In particular, it also shuts off the transfer of frames to LDAS.  Since I've done this, daqd has appeared to be more stable.  It's NOT totally stable, though, as the instance that I described above did eventually just die after 43 minutes, as I was writing this.

In any event, as things are currently as stable as I've seen them, I'm leaving it running in this configuration for the moment, with the following relevant daqdrc parameters:

start main 16;
start frame-saver;
sync frame-saver;
start trender 60 60;
start trend-frame-saver;
sync trend-frame-saver;
start minute-trend-frame-saver;
sync minute-trend-frame-saver;
start profiler;
start trend profiler;

11406   Tue Jul 14 09:08:37 2015   Jamie   Summary   CDS   CDS upgrade: left running in semi-stable configuration

Overnight daqd restarted itself only about twice an hour, which is an improvement:

controls@fb /opt/rtcds/caltech/c1/target/fb 0$ tail logs/restart.log
daqd: Tue Jul 14 03:13:50 PDT 2015
daqd: Tue Jul 14 04:01:39 PDT 2015
daqd: Tue Jul 14 04:09:57 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 06:43:18 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 07:58:16 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015

Un-exporting /frames might have helped a bit. However, the problem is obviously still not fixed.

11408   Tue Jul 14 10:28:02 2015   ericq   Summary   CDS   CDS upgrade: left running in semi-stable configuration

There remains a pattern to some of the restarts; the following times are all reported as restart times. (There are others in between, however.)

daqd: Tue Jul 14 00:02:48 PDT 2015
daqd: Tue Jul 14 01:02:32 PDT 2015
daqd: Tue Jul 14 03:02:33 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015
daqd: Tue Jul 14 10:02:03 PDT 2015

Before the upgrade, we suffered from hourly crashes too:

daqd_start Sun Jun 21 00:01:06 PDT 2015
daqd_start Sun Jun 21 01:03:47 PDT 2015
daqd_start Sun Jun 21 02:04:04 PDT 2015
daqd_start Sun Jun 21 03:04:35 PDT 2015
daqd_start Sun Jun 21 04:04:04 PDT 2015
daqd_start Sun Jun 21 05:03:45 PDT 2015
daqd_start Sun Jun 21 06:02:43 PDT 2015
daqd_start Sun Jun 21 07:04:42 PDT 2015
daqd_start Sun Jun 21 08:04:34 PDT 2015
daqd_start Sun Jun 21 09:03:30 PDT 2015
daqd_start Sun Jun 21 10:04:11 PDT 2015

So, this isn't necessarily new behavior, just something that remains unfixed.

11409   Tue Jul 14 11:57:27 2015   jamie   Summary   CDS   CDS upgrade: left running in semi-stable configuration

Quote:
> There remains a pattern to some of the restarts; the following times are all reported as restart times. (There are others in between, however.)
>
> daqd: Tue Jul 14 00:02:48 PDT 2015
> daqd: Tue Jul 14 01:02:32 PDT 2015
> daqd: Tue Jul 14 03:02:33 PDT 2015
> daqd: Tue Jul 14 05:02:46 PDT 2015
> daqd: Tue Jul 14 06:01:57 PDT 2015
> daqd: Tue Jul 14 07:02:19 PDT 2015
> daqd: Tue Jul 14 08:02:44 PDT 2015
> daqd: Tue Jul 14 09:02:24 PDT 2015
> daqd: Tue Jul 14 10:02:03 PDT 2015
>
> Before the upgrade, we suffered from hourly crashes too:
>
> daqd_start Sun Jun 21 00:01:06 PDT 2015
> daqd_start Sun Jun 21 01:03:47 PDT 2015
> daqd_start Sun Jun 21 02:04:04 PDT 2015
> daqd_start Sun Jun 21 03:04:35 PDT 2015
> daqd_start Sun Jun 21 04:04:04 PDT 2015
> daqd_start Sun Jun 21 05:03:45 PDT 2015
> daqd_start Sun Jun 21 06:02:43 PDT 2015
> daqd_start Sun Jun 21 07:04:42 PDT 2015
> daqd_start Sun Jun 21 08:04:34 PDT 2015
> daqd_start Sun Jun 21 09:03:30 PDT 2015
> daqd_start Sun Jun 21 10:04:11 PDT 2015
>
> So, this isn't necessarily new behavior, just something that remains unfixed.

That's interesting, that we're still seeing those hourly crashes.

We're not writing out the full set of channels, though, and we're getting more failures than just those at the hour, so we're still suffering.

11412   Tue Jul 14 16:51:01 2015   Jamie   Summary   CDS   CDS upgrade: problem is not disk access

I think I have now determined once and for all that the daqd problems are NOT due to disk IO contention.

I have mounted a tmpfs at /frames/tmp and have told daqd to write frames there. The tmpfs exists entirely in RAM. There is essentially zero IO wait for such a filesystem, so daqd should never have trouble writing out the frames.

But yet daqd continues to fail with the "0 empty blocks in the buffer" warnings. I've been down a rabbit hole.

11414   Tue Jul 14 17:14:23 2015   Eve   Summary   Summary Pages   Future summary pages improvements

Here is a list of suggested improvements to the summary pages. Let me know if there's something you'd like for me to add to this list!

• A lot of plots are missing axis labels and titles, and I often don't know what to call these labels. I could use some help with this.
• Check the weather and vacuum tabs to make sure that we're getting the expected output. Set the axis labels accordingly.
• Investigate past periods of missing data on DataViewer to see if the problem was with the data requisition process, the summary page production process, or something else.
• Based on trends in data over the past three months, set axis ranges accordingly to encapsulate the full data range.
• Create a CDS tab to store statistics of our digital systems. We will use the CDS signals to determine when the digital system is running and when the minute trend is missing. This will allow us to exclude irrelevant parts of the data.
• Provide duty ratio statistics for the IMC.
• Set triggers for certain plots. For example, for channels C1:LSC-XARM_OUT_DQ and C1:LSC-YARM_OUT_DQ to be plotted in the Arm LSC Control signals figures, C1:LSC-TRX_OUT_DQ and C1:LSC-TRY_OUT_DQ must be higher than 0.5, thus acting as triggers.
• Include some flag or other marking indicating when data is not being represented at a certain time for specific plots.
• Maybe include some cool features like interactive plots.

11415   Wed Jul 15 13:19:14 2015   Jamie   Summary   CDS   CDS upgrade: reducing mx end-points as last ditch effort

I tried one last thing, suggested by Keith and Gerrit. I tried reducing the number of mx end-points on fb to a single end-point, which should reduce the total number of fb threads, in the hope that the extra threads were causing the chokes.

On Tue, Jul 14 2015, Keith Thorne <kthorne@ligo-la.caltech.edu> wrote:
> Assumptions
> 1) Before the upgrade (from RCG 2.6?), the DAQ had been working, reading out front-ends, writing frames trends
> 2) In upgrading to RCG 2.9, the mx start-up on the frame builder was modified to use multiple end-points
> (i.e. /etc/init.d/mx has a line like
> # 1 10G card - X2
> MX_MODULE_PARAMS="mx_max_instance=1 mx_max_endpoints=16 $MX_MODULE_PARAMS"
>  (This can be confirmed by the daqd log file with lines at the top like
> 263596
> MX has 16 maximum end-points configured
> 2 MX NICs available
> [Fri Jul 10 16:12:50 2015] ->4: set thread_stack_size=10240
> [Fri Jul 10 16:12:50 2015] new threads will be created with the stack of size 10
> 240K
>
> If this is the case, the problem may be that the additional thread on the frame-builder (one per end-point) take up so many slots on the 8-core
> frame-builder that they interrupt the frame-writing thread, thus preventing the main buffer from being emptied.
>
> One could go back to a single end-point. This only helps keep restart of front-end A from hiccuping DAQ for front-end B.
>
> You would have to remove code on front-ends (/etc/init.d/mx_stream) that chooses endpoints. i.e.
> # find line number in rtsystab. Use that to mx_stream slot on card (0-15)
> line_num=`grep -v ^# /etc/rtsystab | grep --perl-regexp -n "^${hostname}\s" | sed 's/^\([0-9]*\):.*/\1/g'`
> line_off=$(expr $line_num - 1)
> epnum=$(expr $line_off % 2)
> cnum=$(expr $line_off / 2)
>
> start-stop-daemon --start --quiet -b -m --pidfile /var/log/mx_stream0.pid --exec /opt/rtcds/tst/x2/target/x2daqdc0/mx_stream -- -e 0 -r "$epnum" -W 0 -w 0 -s "$sys" -d x2daqdc0:$cnum -l /opt/rtcds/tst/x2/target/x2daqdc0/mx_stream_logs/$hostname.log

As per Keith's suggestion, I modified the mx startup script to only initialize a single endpoint, and I modified the mx_stream startup to point them all to endpoint 0. I verified that daqd was indeed using a single MX end-point:

MX has 1 maximum end-points configured

It didn't help. After 5-10 minutes daqd crashes with the same "0 empty blocks" messages.

I should also mention that I'm pretty sure the start of these messages does not seem coincident with any frame writing to disk; further evidence that it's not a disk IO issue.

Keith is looking at the system now, so we'll see if he can spot anything obvious. If not, I will start reverting to 2.5.

11417   Wed Jul 15 18:19:12 2015 JamieSummaryCDSCDS upgrade: tentative stability?

Keith Thorne provided his eyes on the situation today and had some suggestions that might have helped things.

Reorder ini file list in master file. Apparently the EDCU.ini file (C0EDCU.ini in our case), which describes EPICS subscriptions to be recorded by the daq, now has to be specified *after* all other front end ini files. It's unclear why, but it has something to do with RTS 2.8 which changed all slow channels to be transported over the mx network. This alone did not fix the problem, though.

Increase second trend frame size. Interestingly, this might have been the key. The second trend frame size was increased to 600 seconds:

start trender 600 60;

The two numbers are the lengths in seconds for the second and minute trends respectively.
They had been set to "60 60", but Keith suggested that longer second trend frames are better, for whatever reason. It seems he may be right, given that daqd has been running and writing full and trend frames for 1.5 hours now without issue.

As I'm writing this, though, the daqd just crashed again. I note, though, that it's right after the hour, and immediately following writing out a one hour minute trend file. We've been seeing these hourly, on the hour, crashes of daqd for quite a while now. So maybe this is nothing new. I've actually been wondering if the hourly daqd crashes were associated with writing out the minute trend frames, and I think we might have more evidence to point to that.

If increasing the size of the second trend frames from 60 seconds (35M) to 600 seconds (70M) made a difference in stability, could there be an issue with writing out files that are smaller than some value? The full frames are 60M, and the minute trends are 35M.

11427   Sat Jul 18 15:37:19 2015 JamieSummaryCDSCDS upgrade: current status

So it appears we have found a semi-stable configuration for the DAQ system post upgrade. Here are the issues:

## daqd

daqd is running mostly stably for the moment, although it still crashes at the top of every hour (see below). Here are some relevant points about the current configuration:

• recording data from only a subset of front-ends, to reduce the overall load:
  • c1x01
  • c1scx
  • c1x02
  • c1sus
  • c1mcs
  • c1pem
  • c1x04
  • c1lsc
  • c1ass
  • c1x05
  • c1scy
• 16 second main buffer:
  start main 16;
• trend lengths: second: 600, minute: 60
  start trender 600 60;
• writing to frames:
  • full
  • second
  • minute
  • (NOT raw minute trends)
• frame compression ON

This eliminates most of the random daqd crashing. However, daqd still crashes at the top of every hour after writing out the minute trend frame. Still unclear what the issue is, but Keith is investigating.
In some sense this is no worse than where we were before the upgrade, since daqd was also crashing hourly then. It's still crappy, though, so hopefully we'll figure something out.

The inittab on fb automatically restarts daqd after it crashes, and monit on all of the front ends automatically restarts the mx_stream processes.

## front ends

The front end modules are mostly running fine.

One issue is that the execution times seem to have increased a bit, which is problematic for models that were already on the hairy edge. For instance, the rough average for c1sus has gone from ~48us to 50us. This is most problematic for c1cal, which is now running at ~66us out of 60, which is obviously untenable. We'll need to reduce the load in c1cal somehow.

All other front end models seem to be working fine, but a full test is still needed.

There was an issue with the DACs on c1sus, but I rebooted and everything came up fine, optics are now damped:

11437   Wed Jul 22 22:06:42 2015 EveSummarySummary PagesFuture summary pages improvements

- CDS Tab

We want to monitor the status of the digital control system.

1st plot
Title: EPICS DAQ Status
I wonder if we can plot the binary numbers as statuses of the data acquisition for the realtime codes. We want to use the status indicators. Like this: https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary/day/20150722/plots/H1-MULTI_A8CE50_SEGMENTS-1121558417-86400.png

channels:
C1:DAQ-DC0_C1X04_STATUS
C1:DAQ-DC0_C1LSC_STATUS
C1:DAQ-DC0_C1ASS_STATUS
C1:DAQ-DC0_C1OAF_STATUS
C1:DAQ-DC0_C1CAL_STATUS
C1:DAQ-DC0_C1X02_STATUS
C1:DAQ-DC0_C1SUS_STATUS
C1:DAQ-DC0_C1MCS_STATUS
C1:DAQ-DC0_C1RFM_STATUS
C1:DAQ-DC0_C1PEM_STATUS
C1:DAQ-DC0_C1X03_STATUS
C1:DAQ-DC0_C1IOO_STATUS
C1:DAQ-DC0_C1ALS_STATUS
C1:DAQ-DC0_C1X01_STATUS
C1:DAQ-DC0_C1SCX_STATUS
C1:DAQ-DC0_C1ASX_STATUS
C1:DAQ-DC0_C1X05_STATUS
C1:DAQ-DC0_C1SCY_STATUS
C1:DAQ-DC0_C1TST_STATUS

2nd plot
Title: IOP Fast Channel DAQ Status
These have two bits each. How can we handle it?
If we need to shrink it to a single bit, take the "AND" of them.

C1:FEC-40_FB_NET_STATUS (legend: c1x04, if a legend can be placed)
C1:FEC-20_FB_NET_STATUS (legend: c1x02)
C1:FEC-33_FB_NET_STATUS (legend: c1x03)
C1:FEC-19_FB_NET_STATUS (legend: c1x01)
C1:FEC-46_FB_NET_STATUS (legend: c1x05)

3rd plot
Title: C1LSC CPU Meters
channels:
C1:FEC-40_CPU_METER (legend: c1x04)
C1:FEC-42_CPU_METER (legend: c1lsc)
C1:FEC-48_CPU_METER (legend: c1ass)
C1:FEC-22_CPU_METER (legend: c1oaf)
C1:FEC-50_CPU_METER (legend: c1cal)
The range is from 0 to 75 except for c1oaf, which could go to 500. Can we plot c1oaf with the value divided by 8? (Then the legend should be c1oaf /8)

4th plot
Title: C1SUS CPU Meters
channels:
C1:FEC-20_CPU_METER (legend: c1x02)
C1:FEC-21_CPU_METER (legend: c1sus)
C1:FEC-36_CPU_METER (legend: c1mcs)
C1:FEC-38_CPU_METER (legend: c1rfm)
C1:FEC-39_CPU_METER (legend: c1pem)
The range is from 0 to 75 except for c1pem, which could go to 500. Can we plot c1pem with the value divided by 8? (Then the legend should be c1pem /8)

5th plot
Title: C1IOO CPU Meters
channels:
C1:FEC-33_CPU_METER (legend: c1x03)
C1:FEC-34_CPU_METER (legend: c1ioo)
C1:FEC-28_CPU_METER (legend: c1als)
The range is from 0 to 75.

6th plot
Title: C1ISCEX CPU Meters
channels:
C1:FEC-19_CPU_METER (legend: c1x01)
C1:FEC-45_CPU_METER (legend: c1scx)
C1:FEC-44_CPU_METER (legend: c1asx)
The range is from 0 to 75.

7th plot
Title: C1ISCEY CPU Meters
channels:
C1:FEC-46_CPU_METER (legend: c1x05)
C1:FEC-47_CPU_METER (legend: c1scy)
C1:FEC-91_CPU_METER (legend: c1tst)
The range is from 0 to 75.

=====================

IOO

We want a duty ratio plot for the IMC. C1:IOO-MC_TRANS_SUM >1e4 is the good period.
The duty ratio plot looks like the right plot of the following link: https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary/day/20150722/lock/segments/

=====================

SUS: OPLEV

OL_PIT_INMON and OL_YAW_INMON are good for the slow drift monitor. But their sampling rate is too slow for the PSDs.
Can you use C1:SUS-ETM_OPLEV_PERROR C1:SUS-ETM_OPLEV_YERROR etc... For the PSDs? They are 2kHz sampling DQ channels. You would be able to plot it up to ~1kHz. In fact, we want to monitor the PSD from 100mHz to 1kHz. How can you set up the resolution (=FFT length)? ===================== LSC / ASC / ALS tabs Let's make new tabs LSC, ASC, and ALS LSC: We should have a plot for C1:LSC-TRX_OUT_DQ C1:LSC-TRY_OUT_DQ C1:LSC-POPDC_OUT_DQ It's OK to use the minute trend for now. You can check the range using dataviewer. ASC: Let's use C1:SUS_MC1_ASCPIT_OUT16 (legend: IMC WFS) C1:ASS-XARM_ITM_YAW_OSC_CLKGAIN (legend: XARM ASS) C1:ASS-YARM_ITM_YAW_OSC_CLKGAIN (legend: YARM ASS) C1:ASX-XARM_M1_PIT_OSC_CLKGAIN (legend: XARM Green ASS) as the status indicators. There is no YARM Green ASS yet. ALS: Title: ALS Green transmission We want a time series of ALS-TRX_OUT16 ALS-TRY_OUT16 Title: ALS Green beatnote Another time series ALS-BEATX_FINE_Q_MON ALS-BEATY_FINE_Q_MON Title: Frequency monitor We have frequency counter outputs, but I have to talk to Eric to know the channel names 11441 Thu Jul 23 20:57:15 2015 JessicaSummaryGeneralApplying Pre-filter to data before IIR Wiener Filtering I updated my bandpass filter and have included the bode plot below in Figure 1. It is a fourth order elliptic bandpass filter with a passband ripple of 1dB and a stopband attenuation of 30 dB. It emphasizes the area between 3 and 40 Hz. Below, I applied this filter to the huddle test data. The results from this were only slightly better in the targeted region than when no pre-filter was applied. When I pre-filtered the mode cleaner data and then used an IIR wiener filter, I found that the results did not differ much from the data that was not pre-filtered. I'm not sure yet if I'm targeting the right region of this data with my bandpass filter, and will be looking more into choosing a better region. 
Also, I am only using certain regions of ff when calculating the transfer function, and need to optimize that region also. I uploaded the code I used to make these plots to github. 11456 Tue Jul 28 20:42:50 2015 JessicaSummaryGeneralNew Seismometer Data Coherence I was looking at the new seismometer data and plotted the coherence between the different arms of C1:PEM_GUR1 and C1:PEM_GUR2. There was not much coherence in the X arms, Y arms, or Z arms of each seismometer, but there were within the x and y arms of the seismometer. I think the area we should focus on with filtering is lower ranges, between 0.01 and 0.1, because that it where coherence is most clearly high. It is higher in high frequencies but also incredibly noisy, meaning it probably wouldn't be good to try to filter there. 11457 Wed Jul 29 10:34:42 2015 IgnacioSummaryLSCCoherence of arms and seismometers Jessica and I took 45 mins (GPS times from 1122099200 to 1122101950) worth of data from the following channels: C1:IOO-MC_L_DQ (mode cleaner) C1:LSC-XARM_IN1_DQ (X arm length) C1:LSC-YARM_IN1_DQ (Y arm length) and for the STS, GUR1, and GUR2 seismometer signals. The PSD for MCL and the arm length signals is shown below, I looked at the coherence between the arm length and each of the three seismometers, plot overload incoming below, For the coherence between STS and XARM and YARM, For GUR1, Finally for GUR2, A few remarks: 1) From the coherence plots, we can see that the arm length signals are coherent with the seismometer signals the most from 0.5 - 50 Hz. This is most evident in the coherence with STS. I think subtraction will be most useful in this range. This agrees with what we see in the PSD of the arm length signals, the magnitude of the PSD starts increasing from 1 Hz and reaches a maximum at about 30 Hz. This is indicative of which frequencies most of the noise is present. 2) Eric did not remember which of GUR1 and GUR2 corresponded to the ends of XARM and YARM. 
So, I went to the end of XARM, and jumped for a couple seconds to disturb whatever Gurald was in there. Using dataviewer I determined it was GUR1. Anyways, my point is, why is GUR1 less coherent with both arms and not just XARM? Since it is at the end of XARM, I was expecting GUR1 to be more coherent with XARM. Is it because, though different arms, the PSD's of both arms are roughly the same? 3) Similarly, GUR2 shows about the same levels of coherence for both arms, but it is more coherent. Is GUR2 noisier because of its location? Code: ARMS_COH.m.zip 11458 Wed Jul 29 11:15:21 2015 JessicaSummaryLSCPSDs of Arms with seismometer subtraction Ignacio and I downloaded data from the STS, GUR1, and GUR2 seismometers and from the mode cleaner and the x and y arms. The PSDs along the arms have the most noise at frequencies greater than 1 Hz, so we should focus on filtering in that area. The noise levels start dropping at around 30 Hz, but are still much higher than is seen at frequencies below 1 Hz. However, because the spectra is so low at frequencies below that, Wiener filtering alone injected a significant amount of noise into those frequencies and did not do much to reduce the noise at higher frequencies. Pre-filtering will be required, and I have started implementing a pre-filter, but with no improvements yet. So far, I have tried making a bandpass filter, but a highpass filter may be ideal in this case because so much of the noise is above 1 Hz. 11461 Wed Jul 29 21:40:39 2015 KojiSummaryCDSStatus of the frame data syncing The trend data hasn't been synced with LDAS since Jul 27 5AM local. 
40m: controls@nodus|minute > pwd /frames/trend/minute controls@nodus|minute > ls -l 11222 | tail total 590432 -rw-r--r-- 1 controls controls 35758781 Jul 29 11:59 C-M-1122228000-3600.gwf -rw-r--r-- 1 controls controls 35501472 Jul 29 12:59 C-M-1122231600-3600.gwf -rw-r--r-- 1 controls controls 35296271 Jul 29 13:59 C-M-1122235200-3600.gwf -rw-r--r-- 1 controls controls 35459901 Jul 29 14:59 C-M-1122238800-3600.gwf -rw-r--r-- 1 controls controls 35550346 Jul 29 15:59 C-M-1122242400-3600.gwf -rw-r--r-- 1 controls controls 35699944 Jul 29 16:59 C-M-1122246000-3600.gwf -rw-r--r-- 1 controls controls 35549480 Jul 29 17:59 C-M-1122249600-3600.gwf -rw-r--r-- 1 controls controls 35481070 Jul 29 18:59 C-M-1122253200-3600.gwf -rw-r--r-- 1 controls controls 35518238 Jul 29 19:59 C-M-1122256800-3600.gwf -rw-r--r-- 1 controls controls 35514930 Jul 29 20:59 C-M-1122260400-3600.gwf LDAS Minute trend: [koji.arai@ldas-pcdev3 C-M-11]$ pwd /archive/frames/trend/minute-trend/40m/C-M-11 [koji.arai@ldas-pcdev3 C-M-11]$ls -l | tail -rw-r--r-- 1 1001 1001 35488497 Jul 26 19:59 C-M-1121997600-3600.gwf -rw-r--r-- 1 1001 1001 35477333 Jul 26 21:00 C-M-1122001200-3600.gwf -rw-r--r-- 1 1001 1001 35498259 Jul 26 21:59 C-M-1122004800-3600.gwf -rw-r--r-- 1 1001 1001 35509729 Jul 26 22:59 C-M-1122008400-3600.gwf -rw-r--r-- 1 1001 1001 35472432 Jul 26 23:59 C-M-1122012000-3600.gwf -rw-r--r-- 1 1001 1001 35472230 Jul 27 00:59 C-M-1122015600-3600.gwf -rw-r--r-- 1 1001 1001 35468199 Jul 27 01:59 C-M-1122019200-3600.gwf -rw-r--r-- 1 1001 1001 35461729 Jul 27 02:59 C-M-1122022800-3600.gwf -rw-r--r-- 1 1001 1001 35486755 Jul 27 03:59 C-M-1122026400-3600.gwf -rw-r--r-- 1 1001 1001 35467084 Jul 27 04:59 C-M-1122030000-3600.gwf 11465 Thu Jul 30 11:47:54 2015 KojiSummaryCDSStatus of the frame data syncing Today it was synced at 5AM but that was all. 
40m: controls@nodus|minute > pwd /frames/trend/minute controls@nodus|minute > ls -l 11222|tail -rw-r--r-- 1 controls controls 35521183 Jul 29 21:59 C-M-1122264000-3600.gwf -rw-r--r-- 1 controls controls 35509281 Jul 29 22:59 C-M-1122267600-3600.gwf -rw-r--r-- 1 controls controls 35511705 Jul 29 23:59 C-M-1122271200-3600.gwf -rw-r--r-- 1 controls controls 35809690 Jul 30 00:59 C-M-1122274800-3600.gwf -rw-r--r-- 1 controls controls 35752082 Jul 30 01:59 C-M-1122278400-3600.gwf -rw-r--r-- 1 controls controls 35927246 Jul 30 02:59 C-M-1122282000-3600.gwf -rw-r--r-- 1 controls controls 35775843 Jul 30 03:59 C-M-1122285600-3600.gwf -rw-r--r-- 1 controls controls 35648583 Jul 30 04:59 C-M-1122289200-3600.gwf -rw-r--r-- 1 controls controls 35643898 Jul 30 05:59 C-M-1122292800-3600.gwf -rw-r--r-- 1 controls controls 35704049 Jul 30 06:59 C-M-1122296400-3600.gwf controls@nodus|minute > ls -l 11223|tail total 139616 -rw-r--r-- 1 controls controls 35696854 Jul 30 08:02 C-M-1122300000-3600.gwf -rw-r--r-- 1 controls controls 35675136 Jul 30 08:59 C-M-1122303600-3600.gwf -rw-r--r-- 1 controls controls 35701754 Jul 30 09:59 C-M-1122307200-3600.gwf -rw-r--r-- 1 controls controls 35718038 Jul 30 10:59 C-M-1122310800-3600.gwf LDAS Minute trend: [koji.arai@ldas-pcdev3 C-M-11]$ pwd /archive/frames/trend/minute-trend/40m/C-M-11 [koji.arai@ldas-pcdev3 C-M-11]\$ ls -l |tail -rw-r--r-- 1 1001 1001 35518238 Jul 29 19:59 C-M-1122256800-3600.gwf -rw-r--r-- 1 1001 1001 35514930 Jul 29 20:59 C-M-1122260400-3600.gwf -rw-r--r-- 1 1001 1001 35521183 Jul 29 21:59 C-M-1122264000-3600.gwf -rw-r--r-- 1 1001 1001 35509281 Jul 29 22:59 C-M-1122267600-3600.gwf -rw-r--r-- 1 1001 1001 35511705 Jul 29 23:59 C-M-1122271200-3600.gwf -rw-r--r-- 1 1001 1001 35809690 Jul 30 00:59 C-M-1122274800-3600.gwf -rw-r--r-- 1 1001 1001 35752082 Jul 30 01:59 C-M-1122278400-3600.gwf -rw-r--r-- 1 1001 1001 35927246 Jul 30 02:59 C-M-1122282000-3600.gwf -rw-r--r-- 1 1001 1001 35775843 Jul 30 03:59 C-M-1122285600-3600.gwf 
-rw-r--r-- 1 1001 1001 35648583 Jul 30 04:59 C-M-1122289200-3600.gwf

11502   Thu Aug 13 12:06:39 2015 Jessica SummaryIOOBetter predicted subtraction did not work as well Online

Yesterday I adjusted the preweighting of my IIR fit to the transfer function of MC2, and also managed to reduce the number of poles and zeros from 8 to 6, giving a smoother rolloff. The Bode plots are pictured here:

The predicted IIR subtraction was very close to the predicted FIR subtraction, so I thought these coefficients would lead to a better online filter.

However, the actual subtraction of the MCL was not as good and noise was injected into the Y arm.

The final comparison of the subtraction factors between the online and offline data showed that the preweighting, while it improved the offline subtraction, needs more work to improve the online subtraction also.

11509   Fri Aug 14 23:49:34 2015 KojiSummaryGeneralB&K Shaker fixed

I fixed a shaker that was claimed to be broken. I had to cut the rubber membrane to open the head.

Once it was opened, the cause of the trouble was obvious. The soldering joint could not put up with the motion of the head.

It is interesting to see that the spring has the damping layer between the metal sheets.

After the repair the DC resistance was measured. It was 1.9Ohm. The side of the shaker chassis said "3.5Ohm, Max 15VA". So it can take more than 4A (wow).

I gave 2A DC from the bench top supply and turned the current on and off. I could confirm the head was moving.

I'll claim the use of this shaker for the seismometer development.

11524   Sat Aug 22 15:48:32 2015 KojiSummaryLSCArm locking recovery

As per Ignacio's request, I restored the arm locking.

- MC WFS relief

- Slow DC restored to ~0V

- Turned off DARM/CARM

- XARM/YARM turned on

11551   Tue Sep 1 02:44:44 2015 KojiSummaryCDSc1oaf, c1mcs modified for the IMC angular FF

[Koji, Ignacio]

In order to allow us to work on the IMC angular FF, we made the signal paths from PEM to MC SUSs.
In fact, there already were the paths from c1pem to c1oaf. So, the new paths were made from c1oaf to c1mcs. (Attachment 1~3)

After some debugging those two models started running. The additional cost of the processing time is insignificant.
FB was restarted to accommodate the change.

Once the modification of the models was completed, the OAF screens were modified. It seemed that the Kissel button
for the output matrix hadn't been updated for the PRM ASC implementation. This was fixed, as the button was updated this time.
In addition, the button for the FM matrix was also made and pasted.

11569   Thu Sep 3 19:52:24 2015 ranaSummarySUSSUS drift monitor

Since Andrey's SUS Drift mon screen back in 2007, we've had several versions which used different schemes and programming languages. Diego made an update back in January.

Today I added his stuff to the SVN since it was lost in the NFS disks somewhere. It's in SUS/DRIFT_MON/.

Since we've been updating our userapps directory recently to pull in the screens and scripts from the sites, we also got a copy of the Thomas Abbott drift mon stuff which is better (Diego actually removed the yellow/red functionality as part of the 'upgrade'), but more complicated. For now we have the old one. I updated the good values with all optics roughly aligned just a few minutes ago.

11590   Thu Sep 10 09:37:34 2015 IgnacioSummaryIOOFilters left on MCL static module

The following MCL filters were left loaded in the T240-X and T240-Y FF filter modules (filters go in pairs, both on):

FM7: SISO filters for MCL elog:11541

FM8: MISO v1 elog:11547

FM9: MISO v1.1 Small improvement over MISO v1

FM10: MISO v2 elog:11563

FM5: MISO v3.1 elog:11584 (best one)

FM6: MISO 3.1.1 elog:11584 (second best one)

11597   Tue Sep 15 01:14:10 2015 ranaSummaryLSCneed to check LSC Whitening switch logic ... again

Tonight we noticed that the REFL_DC signal has gone bipolar, even though the whitening gain is 0 dB and the whitening filter is requested to be OFF.

We should check out the switch operation of several of the LSC channels in the daytime - where is the procedure for this diagnostic posted?

11598   Tue Sep 15 15:01:23 2015 ranaSummaryLSCdisabling the LSC AA filters + mod to whitening

While investigating the BIO situation with the LSC machine and the iscaux2 processor last night, we wondered if maybe the Anti-Aliasing filters were mistakenly disabled. But why do we need these anyway?

Our ADCs digitize at 64 kHz and there is a digital lowpass in the IOP at 5 kHz before we downsample to 16 kHz. So mainly we're trying to prevent some aliasing at the 64 kHz IOP rate. But our analog AA filter is an 8th order ELP at 7570 Hz, so it's overkill.

So, I propose that we bypass the AA via hardwiring the board and implement a 10 kHz pole in the whitening board (D990694) before the whitening by turning R127, etc. into a 0.1 uF cap. Along with the 100 Ohm series resistor, this will make a pole at ~15 kHz. Probably ought to check that the input resistor is metal film. Also, if we replace C158/C159, etc. with a 0.47 nF cap, we'll get 2 poles at 35 kHz to limit the higher frequencies from saturating.
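As a quick check of the pole frequency quoted above, here is a minimal sketch that assumes the 100 Ohm series resistor and the 0.1 uF cap form a simple single-pole RC lowpass, f = 1/(2*pi*R*C). The function name is made up for illustration; only the component values come from the entry.

```python
import math

def rc_pole_hz(r_ohm, c_farad):
    """Corner frequency of a single-pole RC lowpass: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

# Values quoted in the entry: 100 Ohm series resistor, 0.1 uF capacitor.
f_pole = rc_pole_hz(100.0, 0.1e-6)
print(f"{f_pole / 1e3:.1f} kHz")  # prints "15.9 kHz"
```

Under that assumption the pole lands near 16 kHz, in line with the "~15 kHz" estimate. The 35 kHz poles from the 0.47 nF caps are not checked here because the associated resistances are not given in the entry.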

11599   Tue Sep 15 15:10:48 2015 gautam, ericq, ranaSummaryLSCPRFPMI lock & various to-do's
I was observing Eric while he was attempting to lock the PRFPMI last night. The handoff from ALS to LSC was not very smooth, and Rana suggested looking at some control signals while parked close to the PRFPMI resonance to get an idea of what frequency bands the noise dominated in. The attached power spectrum was taken while CARM and DARM were under ALS control, and the PRMI was locked using REFL_165. The arm power was fluctuating between 15 and 50. Most of the power seems to be in the 1-5Hz band and the 10-30Hz band.

Rana made a number of suggestions, which I'm listing here. Some of these may directly help the above situation, while the others are with regards to the general state of affairs.

• Reroute both (MC and arm) FF signals to the SUS model
• For MC, bypass LSC
• Rethink the MC FF -
• Leave the arm FF on all the time?
• The positioning of the accelerometer used for MC FF has to be bettered - it should be directly below the tank
• The IOO model is over-clocking - needs to be re-examined
• Fix up the DC F2P - Rana mentioned an old (~10 yr) script called F2P ratio, we should look to integrate the Python scripts used for lock-in/demod at the sites with this
• Look to calibrate MC_F
• Implement a high BW CARM servo using ALS
• Gray code implementation for EPICS gain-stepping

11601   Tue Sep 15 18:35:21 2015 ericqSummaryLSCsome further notes

About the analog CARM control with ALS:

We're looking at using a Sigg-designed, remotely switchable delay line box on the currently undelayed side of the ALS DFD beat. For a beat frequency of 50MHz, one cycle is 20ns; this box has 24ns total delay capability, so we should be able to get pretty close to a zero crossing of the analog I or Q outputs of the demod board. This can be used as IN2 for the common mode board.

Gautam is testing the functionality of the delay and switching, and should post a link to the DCC page of the schematic. Rana and Koji have been discussing the implementation of the remote switching (RCG vs. VME).

I spent some time this afternoon trying to lock the X arm in this way, but instead of at IR resonance, just wherever the I output of the DFD had a zero crossing. However, I didn't give enough thought to the loop shapes; Koji helped me think it through. Tomorrow, I'll make a little pomona box to go before the CM IN2 that will give the ALS loop shape a pole where we expect the CARM coupled cavity pole to be (~120Hz), so that the REFL11 and ALS signals have a similar shape when we're trying to transition.

The common mode board does have a filter for this kind of thing for single arm tests, but puts in a zero as well, as it expects the single arm pole, which isn't present in the ALS sensing, so maybe I'll whip up something appropriate for this, too.

11603   Tue Sep 15 20:44:13 2015 gautamSummaryLSCChecking the delay line phase shifter DS050339
I checked out the delay line phase shifter D050339 (theory of operation here) this afternoon. I first checked that the power connection was functional, which it was, though the power connector is not the usual chassis one (see image attached, do we need to change this?).

The box has two modes of operation: you can change the delay either by flipping switches on the front panel or via a 25-pin D-sub connector on the back. The pin numberings for this connector on the datasheet are a little misleading, but I determined that pins 1-9 on the D-sub connector correspond to the 9 delays on the front panel in ascending order; pin 10 is the mode selector switch and should be high for remote operation; pins 11 and 13 are NC; pin 12 is VCC of 5V; and pins 14-25 are grounded. I first checked the front-panel mode of operation, using an oscilloscope to measure the delay between the direct signal from the Fluke 6061 and the output from the D050339. This corresponds to the first set of datapoints in the plot attached (signal was 100MHz sine wave).

I then used a 25-pin D-sub breakout board to check the remote operation mode as well, which corresponds to the second set of datapoints in the plot attached. For this measurement, I used the Agilent network analyzer to measure the phase lag between the direct signal (for all delays, I measured the phase lag at 100MHz, having first calibrated the "thru" path by connecting the R and A inputs of the network analyzer using a barrel BNC) and the delayed output from the box, and then converted it to a time delay.
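The phase-to-delay conversion mentioned above can be sketched as follows. This is only an illustration, not the script used for the measurement: the function name is hypothetical, and it assumes the phase lag has already been unwrapped, since at 100MHz one full cycle corresponds to only 10ns.

```python
def phase_to_delay_ns(phase_lag_deg, freq_hz):
    """Convert an (unwrapped) phase lag in degrees to a time delay in ns.

    One full cycle (360 deg) corresponds to one period, 1/freq_hz.
    """
    return phase_lag_deg / 360.0 / freq_hz * 1e9

# Example at the 100 MHz measurement frequency quoted in the entry:
# a 90 degree lag is a quarter of the 10 ns period.
quarter_cycle_ns = phase_to_delay_ns(90.0, 100e6)
```

Delays longer than one period show up as more than 360 degrees of lag, so the phase has to be tracked continuously across delay settings before converting.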

Both sets of data are linear, with a slope nearly equal to 1 as expected. I conclude that the box is functioning as expected. Right now, Koji is checking a board which will be used to remotely control this box. On the hardware side it remains to make a cable going from the DS050339 Dsub input to the driver board output (also 25 pin Dsub).
11604   Wed Sep 16 03:37:06 2015 KojiSummaryGreen LockingWorkable delay line setup prepared

[Koji Gautam]

The variable delay line has been setup for practical use. The hardware and basic software are ready.

The delay time is given by [512-1-mod(C1:LSC-BO_1_0_SET, 512)]*(1/16) ns

Giving 511 (LLLL LLLH HHHH HHHH) to C1:LSC-BO_1_0_SET makes the delayline shortest (+0ns).
Giving 0 (LLLL LLLL LLLL LLLL) to C1:LSC-BO_1_0_SET makes the delayline longest (~32ns).
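The mapping above can be sketched in Python as a pair of helper functions. The function names are hypothetical, and the rounding and clamping in the inverse mapping are my assumptions; only the formula and the 1/16 ns step come from the entry.

```python
def delay_ns(bo_value):
    """Delay in ns for a given C1:LSC-BO_1_0_SET value, per the quoted formula."""
    return (512 - 1 - (bo_value % 512)) * (1.0 / 16.0)

def bo_value_for_delay(target_ns):
    """Nearest C1:LSC-BO_1_0_SET value for a requested delay (clamped to range)."""
    steps = round(target_ns * 16)      # number of 1/16 ns steps
    steps = max(0, min(511, steps))    # the low 9 bits give 0..511 steps
    return 511 - steps                 # 511 is the shortest delay, 0 the longest

shortest = delay_ns(511)   # 0.0 ns
longest = delay_ns(0)      # 31.9375 ns
```

An inverse mapping of this kind is the sort of thing a delay time slider would need, since the slider value in ns has to be turned back into the bit pattern written to the DO card.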

The SR785 was removed from the rack for our access >> Eric

DO setup

- Three CONTEC DO-32L-PE cards are found in the Yarm digital cabinet. (I brought a card from WB, but will bring it back).
- The card was installed in the C1LSC chassis.

- The models for c1x04 and c1lsc were modified to include the card. Once they were restarted, the card was recognized without problem.
The frame builder also needed to be restarted (Attachment 1&2). The changes were committed to the repository.

- MEDM screen "CDS_BO_STATUS.adl" has been modified to include the bit monitors for the new DO card. (Attachment 3)

Epics values "C1:LSC-BO_1_0_SET" and "C1:LSC-BO_1_1_SET" are hooked up to the DO block.

Cables

- The DO board has DB37(F). I made a I/F cable with a DB37(M) crimp connector, DB25 breakout board, and a ribbon cable.
Pin 1 is connected to pin 14 of the DB25 (GND of the delayline circuit).
Pin 2~10 are connected to pin 1~9 of the DB25 (Switch 1~9 of the delayline circuit)
Pin 18 is connected to X01 (external = spare) (Attachment 4)

- [CONFESSION] A bench +15V power supply was prepared to power the transistors of the DO (Attachment 6). The hot side is connected to X01 (not connected to the DB25),
and the cold side is connected to pin 14 of the DB25. Once we find this is a useful setup we need to make a dedicated interface unit to convert DB37
into DB25 (and provide more connectivities).

- A DB25 M-F cable was installed on the cable tray above the LSC racks.

Delay line unit

- The delay line box was mounted on 34H of the LSC analog rack (Attachment 5).

- The side cross connect power supply was not available (to be described later). Therefore we decided to use the same +15V supply as the one for the DO card.

- Checked the functionality of the local switches using a function generator @30MHz and the front panel switches. The maximum (~32ns) delay was confirmed.
(Just not enough to have 360 deg shift).

- Now the delay line function was tested with the front panel switch at "ext". We confirmed that the delay time changes with the number given to C1:LSC-BO_1_0_SET.

What we need further

- Implement delay time slider control (511 = 0ns, 0 = 31.94ns). The delay time is given by
[512-1-mod(C1:LSC-BO_1_0_SET, 512)]*(1/16) ns

- Some independent RF issues I found. (Next entry)

11606   Wed Sep 16 15:04:33 2015 ericqSummaryLSCDC PD Whitening Board Fixed
 Quote: Tonight we noticed that the REFL_DC signal has gone bipolar, even though the whitening gain is 0 dB and the whitening filter is requested to be OFF.

Fixed! I noticed that whitening gain changes weren't having any effect on CM_SLOW. I then checked REFL_DC, where this also seemed to be the case. Since the gain is controlled via VME machine, and whitening filter switching is controlled via RCG, I figured there must be something wrong with the board. I checked all of the DC PD signals, which share a whitening filter board, and they all had the same symptoms.

I went and peeked at the board, and it turns out the backplane cable had fallen off.

I plugged it in, things look ok.

11609   Thu Sep 17 03:48:10 2015 ericqSummaryLSCsome further notes

Something odd is happening with the CM board. Measuring from either input to OUT1 (the "slow output") shows a nice flat response up until many 10s of kHz.

However, when I connect my independently confirmed 120Hz LPF to either input, the pole frequency moves up to ~360Hz and the DC gain falls some 10dB. This happens regardless of whether the input is in use; I saw this shape at a tee on the output of the LPF when the other leg of the tee was connected to a CM board input.

This has sabotaged my high bandwidth ALS efforts. I will investigate the board's input situation tomorrow.

11611   Thu Sep 17 13:06:05 2015 ericqSummaryLSCLow input impedance on CM board

As it turns out, our version of the common mode board does not have high input impedance. I think this is what is messing with the lowpass.
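
As a rough consistency check, the numbers from the previous entry (120Hz pole moving to ~360Hz, DC gain down by some 10dB) are what one would expect from resistive loading of a passive single-pole RC lowpass. The sketch below assumes that topology (series R, shunt C, purely resistive load); the actual LPF circuit and component values are not given in these entries, so the resistances here are placeholders.

```python
import math


def loaded_rc_lowpass(f0_hz: float, r_series: float, r_load: float):
    """Pole frequency (Hz) and DC gain (dB) of an RC lowpass loaded by r_load.

    Assumes a series-R, shunt-C filter whose unloaded pole is f0_hz.
    The load forms a divider with r_series (DC gain g) and the capacitor
    then sees r_series in parallel with r_load, so the pole moves to f0 / g.
    """
    g = r_load / (r_series + r_load)
    return f0_hz / g, 20.0 * math.log10(g)


if __name__ == "__main__":
    # Placeholder values: a load equal to half the series resistance
    pole_hz, gain_db = loaded_rc_lowpass(120.0, r_series=1.0, r_load=0.5)
    print(f"pole = {pole_hz:.0f} Hz, DC gain = {gain_db:.1f} dB")
```

Under these assumptions a factor of 3 in pole frequency goes together with a DC gain of 1/3, i.e. about -9.5dB, and would imply an input impedance of about half the filter's series resistance.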

I added photos of the PCB to our 40m DCC page about this board: D1500308, wherein you can see that we have Revision B.

On the aLIGO wiki's CommonModeServo page, one finds that high input impedance was added in Revision E. At LIGO-D040180, one finds this was implemented via an additional dual AD829 instrumentation amplifier stage before the input amplification stage that exists on our board.

Also, I find that the boosts installed are the default 40:4k, 1k:20k, 1k:20k, 500:10k pole zero pairs. Given our 30-40kHz UGF for CARM thus far, maybe we would like to lower some of these boost corner frequencies, to actually be able to use them; so far we only use the first two.

11615   Thu Sep 17 19:58:06 2015 gautamSummaryComputer Scripts / ProgramsFrequency counting algorithm

I made some changes to the c1tst model running on c1iscey in order to test my algorithm for frequency counting. I followed the steps listed in elog 8909 to make, install and start the model.

I need to debug a few things and run some more diagnostics so I am leaving the model in its edited version (Eric had committed it to the svn before I made any changes).

11628   Mon Sep 21 18:31:06 2015 gautamSummaryComputer Scripts / ProgramsFrequency counting algorithm

I have been working on setting up a frequency counting module that can give us a readout of the beat frequency, divided by a factor of 2^14 using the Wenzel frequency dividers as described here. This is a summary of what I have thus far.

The algorithm, and simulink model

The basic idea is to pass the digitized signal through a Schmitt trigger (existing RCG module), which provides some noise immunity, and should in theory output a clean square wave with the same frequency as the input. The output of the Schmitt trigger module is either 0 (for input less than the lower threshold value) or 1 (for input greater than the upper threshold value). By differencing this between successive samples, we can detect a "zero-crossing", and by measuring the time interval between successive zero crossings, we can take the reciprocal to get the frequency. The last bit of this operation (i.e. measuring the interval) is done using a piece of custom C code. Initially, I was trying to use the part "GPS" from CDS_PARTS to get the current GPS time and hence measure intervals between successive zero-crossings, but this didn't work out because the output of GPS is in seconds, and that doesn't give me the required precision to count frequency. I tried implementing more precise timing using the clock_gettime() function, which is capable of giving nanosecond precision, but this didn't work for me. So I am now using a cruder way of measuring the interval: a counter variable is incremented each time a zero-crossing is NOT detected, and is then converted to time using the FE_RATE macro (=16384). In any case, the ADC sampling rate limits the resolution of frequency counting using zero-crossing detection (more on this later). Attachment 1 shows the SIMULINK block diagram for this entire procedure.
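
The procedure described above can be sketched offline in Python. This is not the custom C code running in the model, only an illustration of the same steps (Schmitt trigger, differencing, counter). It assumes that only rising edges are counted, so that one interval is one full period; the function name and default thresholds are illustrative.

```python
FE_RATE = 16384  # model sample rate in Hz, as quoted in the entry


def count_frequency(samples, low=-1000.0, high=1000.0, fe_rate=FE_RATE):
    """Return a frequency readout (Hz) per sample, held between edges."""
    state = 0           # Schmitt trigger output, 0 or 1
    counter = 0         # samples without a detected edge since the last edge
    seen_edge = False   # the first edge only starts the count
    freq = 0.0
    readout = []
    for x in samples:
        previous = state
        if x > high:
            state = 1
        elif x < low:
            state = 0
        # between the thresholds the state is left unchanged (hysteresis)
        if state - previous == 1:          # rising edge detected
            if seen_edge:
                freq = fe_rate / (counter + 1)
            seen_edge = True
            counter = 0
        else:
            counter += 1
        readout.append(freq)
    return readout


if __name__ == "__main__":
    # Square wave with a period of exactly 8 samples (2048 Hz at FE_RATE)
    square = [2500.0 if (n % 8) < 4 else -2500.0 for n in range(64)]
    print(count_frequency(square)[-1])
```

An input whose period is not a whole number of samples makes the counter alternate between neighbouring integers, which is the jitter discussed further down.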

Testing the model

I implemented all of this on c1tst, and followed the steps listed here to get the model up and running. I then used one of the DB37 breakout boards to send a signal to the ADC using the DS345 function generator. Attachment 2 shows some diagnostic plots - input signal was a 2.5Vpp (chosen to match the output from the Wenzel dividers) square wave at 2kHz:

• Bottom left: digitized version of the input signal - I used this to set the upper and lower thresholds on the Schmitt trigger at +1000 counts and -1000 counts respectively.
• Top left: Schmitt trigger output (red trace) and the difference between successive samples of the Schmitt trigger output (blue trace - this variable is used to detect a zero crossing)
• Top right: Counter variable used to measure intervals between successive zero crossings, and hence, the frequency. The frequency output is held until the next zero crossing is detected, at which time counter is reset
• Bottom right: frequency output in Hz.

The right column pointed me to the limitations of frequency counting using this method - even though the input frequency was constant (2kHz), the counter variable, and hence the frequency readout, was neither accurate nor precise. But this was to be expected given the limitations imposed by ADC sampling? We only get information about the state of the input signal once within each sampling interval, and hence, we cannot know if a zero crossing has occurred until the next sampling interval. Moreover, we can only count frequency in discrete steps. In attachments 3 and 4, I've plotted these discrete frequencies which can be measured - the error bars indicate the error in the frequency readout if the counter variable is 1 more or less than the "true" value - this can (and does) happen if the high and low times of the Schmitt trigger are not equal over time (see top left plot in Attachment 2; it's not very obvious, but not all the "low" times are equal, and so the interval between detected zero crossings is not constant). This becomes a problem for small values of the counter variable, i.e. at high input frequencies. I was having a look at the elogs Aidan wrote some years ago for a different digital frequency counting approach, and I guess the conclusion there was similar - for high input frequencies, the error is large.
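
The discrete steps and the off-by-one error described above can be made concrete with a short calculation. This sketch assumes the readout is FE_RATE divided by the number of samples per period; whether the model's counter equals that number or is one less depends on the implementation, which is not shown here.

```python
FE_RATE = 16384  # model sample rate in Hz, as quoted in the entry


def readout_hz(n_samples: int, fe_rate: int = FE_RATE) -> float:
    """Frequency reported when one period spans n_samples samples."""
    return fe_rate / n_samples


def miscount_error_hz(n_samples: int, fe_rate: int = FE_RATE):
    """Readout error (Hz) if the count is one too many or one too few.

    Requires n_samples >= 2.
    """
    f = readout_hz(n_samples, fe_rate)
    too_many = readout_hz(n_samples + 1, fe_rate) - f
    too_few = readout_hz(n_samples - 1, fe_rate) - f
    return too_many, too_few


if __name__ == "__main__":
    for n in (164, 33, 8, 4):
        low, high = miscount_error_hz(n)
        print(f"N={n:4d}  f={readout_hz(n):8.1f} Hz  "
              f"error {low:+8.1f} / {high:+8.1f} Hz")
```

Near 2kHz (N = 8) a single miscount already shifts the readout by more than 200Hz, whereas near 100Hz (N = 164) the shift is below 1Hz.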

I further did two frequency sweeps using the DS345, to see if I could recover this in the frequency readout. Attachments 5 and 6 show the results of these sweeps. For low frequencies, i.e. 100-500 Hz, the jitter in the readout is small (though this will be multiplied by a factor of 2^14), but by the time the input frequency gets up to 2kHz, the jitter in the readout is pretty bad (and gets worse for even higher frequencies).

Bottom line

Some refinements can be made to the algorithm, perhaps by introducing some averaging (i.e. not reading out frequency for every pair of zero crossings, but every 5) which may improve the jitter in the readout, but I would think that the current approach is not very useful above 2kHz (corresponding to ~30MHz of pre-divider frequency), because of the limitations shown in attachments 3 and 4.

11629   Mon Sep 21 23:18:55 2015 ericqSummaryComputer Scripts / ProgramsFrequency counting algorithm

I definitely think lowpassing the output is the way to go. Since this frequency readback will be used for slow control of the beatnote frequency via auxiliary laser temperature, even lowpassing at tens of Hz is fine. The jitter doesn't mean it's useless, though.

If we lowpass at 16Hz, we're effectively averaging over 16384/16 = 1024 samples, bringing, for example, the +-2kHz jitter on a 6kHz signal that you posted down to 2kHz/sqrt(1024) ~ 60Hz, which is 1% of the carrier. This seems ok to me.
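
The arithmetic above, written out as a small helper. It assumes the sqrt(N) scaling used in the estimate, i.e. that the readout errors being averaged are independent from sample to sample. Since the readout is held between zero crossings, the number of independent values is probably smaller in practice, so this should be read as an optimistic estimate. The function name is illustrative.

```python
import math

FE_RATE = 16384  # model sample rate in Hz, as quoted in the thread


def averaged_jitter_hz(jitter_hz: float, lowpass_hz: float,
                       fe_rate: int = FE_RATE) -> float:
    """Jitter after averaging over roughly fe_rate / lowpass_hz samples.

    Assumes independent errors, so the jitter scales as 1/sqrt(N).
    """
    n_avg = fe_rate / lowpass_hz
    return jitter_hz / math.sqrt(n_avg)


if __name__ == "__main__":
    residual = averaged_jitter_hz(2000.0, 16.0)
    print(f"{residual:.1f} Hz, {100 * residual / 6000.0:.1f}% of a 6 kHz signal")
```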

11631   Tue Sep 22 02:11:17 2015 ranaSummaryComputer Scripts / ProgramsFrequency counting algorithm

I was going to suggest using a software PLL, but perhaps averaging gives the same result. The same ADC signal can be fed to multiple blocks with different averaging times and we can just use whichever one seems the most useful.
