40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 62 of 344  Not logged in ELOG logo
ID Date Author Typeup Category Subject
  11398   Thu Jul 9 13:26:47 2015 JamieSummaryCDSCDS upgrade: new mx 1.2.16 installed

I rebuilt/installed mx 1.2.16 to use "ether-mode", instead of the default MX-10G:

controls@fb /opt/src/mx-1.2.16 0$ ./configure --enable-ether-mode --prefix=/opt/mx-1.2.16
...
controls@fb /opt/src/mx-1.2.16 0$ make
..
controls@fb /opt/src/mx-1.2.16 0$ make install
...

I then rebuilt/installed daqd so that it properly linked against the updated mx install:

controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ ./configure --enable-debug --disable-broadcast --without-myrinet --with-mx --with epics=/opt/rtapps/epics/base --with-framecpp=/opt/rtapps/framecpp --enable-local-timing
...
controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ make
...
controls@fb /opt/rtcds/rtscore/release/src/daqd 0$ install daqd /opt/rtcds/caltech/c1/target/fb/

It's now back to running and receiving data from the front ends (still not stable yet, though).

  11400   Thu Jul 9 16:50:13 2015 JamieSummaryCDSCDS upgrade: if all else fails try throwing metal at the problem

I roped Rolf into coming over and adding his eyes to the problem.  After much discussion we couldn't come up with any reasonable explanation for the problems we've been seeing other than daqd just needing a lot more resources that it did before.  He said he had some old Sun SunFire X4600s from which we could pilfer memory.  I went over to Downs and ripped all the CPU/memory cards out of one of his machines and stuffed them into fb:

fb now has 8 CPU and 16G of RAM

Unfortunately, this is still not enough.  Or at least it didn't solve the problem; daqd is showing the same instabilities, falling over a couple of minutes after I turn on trend frame writing.  As always, before daqd fails it starts spitting out the following to the logs:

[Thu Jul  9 16:37:09 2015] main profiler warning: 0 empty blocks in the buffer

followed by lines like:

[Thu Jul  9 16:37:27 2015] GPS MISS dcu 44 (ASX); dcu_gps=1120520264 gps=1120519812

right before it dies.

I'm no longer convinced that this is a resource issue, though, judging by the resource usage right before the crash:

top - 16:47:32 up 48 min,  5 users,  load average: 0.91, 0.62, 0.61
Tasks:   2 total,   0 running,   2 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.9%us,  0.9%sy,  0.0%ni, 89.1%id,  0.9%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  15952104k total, 13063468k used,  2888636k free,   138648k buffers
Swap:  1023996k total,        0k used,  1023996k free,  7672292k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12016 controls  20   0 8098m 4.4g 104m S  106 29.1   6:45.79 daqd
 4953 controls  20   0 53580 6092 5096 S    0  0.0   0:00.04 nds

Load average less than 1 per CPU, plenty of free memory (~3G free, 0 swap), no waiting for IO (0.9%wa), etc.  daqd is utilizing lots of  threads, which should be spread across many cpus, so even the >100%CPU should be ok.   I'm at a loss...

  11402   Mon Jul 13 01:11:14 2015 JamieSummaryCDSCDS upgrade: current assessment

daqd is still behaving unstably.  It's still unclear what the issue is.

The current failures look like disk IO contention.  However, it's hard to see any evidince of daqd is suffering from large IO wait while it's failing.

The frame size itself is currently smaller than it was before the upgrade:

controls@fb /frames/full 0$ ls -alth 11190 | head
total 369G
drwxr-xr-x 321 controls controls  36K Jul 12 22:20 ..
drwxr-xr-x   2 controls controls 268K Jun 23 06:06 .
-rw-r--r--   1 controls controls  67M Jun 23 06:06 C-R-1119099984-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:06 C-R-1119099968-16.gwf
-rw-r--r--   1 controls controls  69M Jun 23 06:05 C-R-1119099952-16.gwf
-rw-r--r--   1 controls controls  69M Jun 23 06:05 C-R-1119099936-16.gwf
-rw-r--r--   1 controls controls  67M Jun 23 06:05 C-R-1119099920-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:05 C-R-1119099904-16.gwf
-rw-r--r--   1 controls controls  68M Jun 23 06:04 C-R-1119099888-16.gwf
controls@fb /frames/full 0$ ls -alth 11208 | head
total 17G
drwxr-xr-x   2 controls controls  20K Jul 13 01:00 .
-rw-r--r--   1 controls controls  45M Jul 13 01:00 C-R-1120809632-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 01:00 C-R-1120809408-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809392-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809376-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:56 C-R-1120809360-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:55 C-R-1120809344-16.gwf
-rw-r--r--   1 controls controls  50M Jul 13 00:55 C-R-1120809328-16.gwf
controls@fb /frames/full 0$

This would seem to indicate that it's not an increase in frame size that's to blame.

Because slow data is now transported to daqd over the MX data concentrator network rather than via EPICS (RTS 2.8), there is more network on the MX network.   I note also that the channel lists have increased in size:

controls@fb /opt/rtcds/caltech/c1/chans/daq 0$ ls -alt archive/C1LSC* | head -20
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 18:21 archive/C1LSC_150706_182146.ini
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 18:16 archive/C1LSC_150706_181603.ini
-rw-r--r-- 1 4294967294 4294967294 262554 Jul  6 16:09 archive/C1LSC_150706_160946.ini
-rw-r--r-- 1 4294967294 4294967294  43366 Jul  1 16:05 archive/C1LSC_150701_160519.ini
-rw-r--r-- 1 4294967294 4294967294  43366 Jun 25 15:47 archive/C1LSC_150625_154739.ini
...

I would have thought, though, that data transmission errors would show up in the daqd status bits.

  11404   Mon Jul 13 18:12:50 2015 JamieSummaryCDSCDS upgrade: left running in semi-stable configuration

I have been watching daqd all day and I don't feel particularly closer to understanding what the issues are.  However, things are

Interestingly, though, the stability appears highly variable at the moment.  This morning, daqd was very unstable and was crashing within a couple of minutes of starting.  However this afternoon, things seemed much more stable.  As of this moment, daqd has been running for for 25 minutes now, writing full frames as well as minute and second trends (no minute_raw), without any issues.  What has changed?

To reiterate, I have been closing watching disk IO to /frames.  I see no indication that there is any disk contention while daqd is failing.  It's still possible, though, that there are disk IO issues affecting daqd at a level that is not readily visible.  From dstat, the frame writes are visible, but nothing else.

I have made one change that could be positively affecting things right now: I un-exported /frames from NFS.  This eliminates anything external from reading /frames over the network.  In particular, it also shuts off the transfer of frames to LDAS.  Since I've done this, daqd has appeared to be more stable.  It's NOT totally stable, though, as the instance that I described above did eventually just die after 43 minutes, as I was writing this.

In any event, as things are currently as stable as I've seen them, I'm leaving it running in this configuration for the moment, with the following relevant daqdrc parameters:

start main 16;
start frame-saver;
sync frame-saver;
start trender 60 60;
start trend-frame-saver;
sync trend-frame-saver;
start minute-trend-frame-saver;
sync minute-trend-frame-saver;
start profiler;
start trend profiler;
  11406   Tue Jul 14 09:08:37 2015 JamieSummaryCDSCDS upgrade: left running in semi-stable configuration

Overnight daqd restarted itself only about twice an hour, which is an improvement:

controls@fb /opt/rtcds/caltech/c1/target/fb 0$ tail logs/restart.log
daqd: Tue Jul 14 03:13:50 PDT 2015
daqd: Tue Jul 14 04:01:39 PDT 2015
daqd: Tue Jul 14 04:09:57 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 06:43:18 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 07:58:16 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015

Un-exporting /frames might have helped a bit.  However, the problem is obviously still not fixed.

  11408   Tue Jul 14 10:28:02 2015 ericqSummaryCDSCDS upgrade: left running in semi-stable configuration

There remains a pattern to some of the restarts, the following times are all reported as restart times. (There are others in between, however.)

daqd: Tue Jul 14 00:02:48 PDT 2015
daqd: Tue Jul 14 01:02:32 PDT 2015
daqd: Tue Jul 14 03:02:33 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015
daqd: Tue Jul 14 10:02:03 PDT 2015

Before the upgrade, we suffered from hourly crashes too:

daqd_start Sun Jun 21 00:01:06 PDT 2015
daqd_start Sun Jun 21 01:03:47 PDT 2015
daqd_start Sun Jun 21 02:04:04 PDT 2015
daqd_start Sun Jun 21 03:04:35 PDT 2015
daqd_start Sun Jun 21 04:04:04 PDT 2015
daqd_start Sun Jun 21 05:03:45 PDT 2015
daqd_start Sun Jun 21 06:02:43 PDT 2015
daqd_start Sun Jun 21 07:04:42 PDT 2015
daqd_start Sun Jun 21 08:04:34 PDT 2015
daqd_start Sun Jun 21 09:03:30 PDT 2015
daqd_start Sun Jun 21 10:04:11 PDT 2015

So, this isn't neccesarily new behavior, just something that remains unfixed. 

  11409   Tue Jul 14 11:57:27 2015 jamieSummaryCDSCDS upgrade: left running in semi-stable configuration
Quote:

There remains a pattern to some of the restarts, the following times are all reported as restart times. (There are others in between, however.)

daqd: Tue Jul 14 00:02:48 PDT 2015
daqd: Tue Jul 14 01:02:32 PDT 2015
daqd: Tue Jul 14 03:02:33 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015
daqd: Tue Jul 14 10:02:03 PDT 2015

Before the upgrade, we suffered from hourly crashes too:

daqd_start Sun Jun 21 00:01:06 PDT 2015
daqd_start Sun Jun 21 01:03:47 PDT 2015
daqd_start Sun Jun 21 02:04:04 PDT 2015
daqd_start Sun Jun 21 03:04:35 PDT 2015
daqd_start Sun Jun 21 04:04:04 PDT 2015
daqd_start Sun Jun 21 05:03:45 PDT 2015
daqd_start Sun Jun 21 06:02:43 PDT 2015
daqd_start Sun Jun 21 07:04:42 PDT 2015
daqd_start Sun Jun 21 08:04:34 PDT 2015
daqd_start Sun Jun 21 09:03:30 PDT 2015
daqd_start Sun Jun 21 10:04:11 PDT 2015

So, this isn't neccesarily new behavior, just something that remains unfixed. 

That's interesting, that we're still seeing those hourly crashes.

We're not writing out the full set of channels, though, and we're getting more failures than just those at the hour, so we're still suffering.

  11412   Tue Jul 14 16:51:01 2015 JamieSummaryCDSCDS upgrade: problem is not disk access

I think I have now determined once and for all that the daqd problems are NOT due to disk IO contention.

I have mounted a tmpfs at /frames/tmp and have told daqd to write frames there.  The tmpfs exists entirely in RAM.  There is essentially zero IO wait for such a filesystem, so daqd should never have trouble writing out the frames.

But yet daqd continues to fail with the "0 empty blocks in the buffer" warnings.  I've been down a rabbit hole.

  11414   Tue Jul 14 17:14:23 2015 EveSummarySummary PagesFuture summary pages improvements

Here is a list of suggested improvements to the summary pages. Let me know if there's something you'd like for me to add to this list!

  • A lot of plots are missing axis labels and titles, and I often don't know what to call these labels. I could use some help with this.
  • Check the weather and vacuum tabs to make sure that we're getting the expected output. Set the axis labels accordingly. 
  • Investigate past periods of missing data on DataViewer to see if the problem was with the data requisition process, the summary page production process, or something else.
  • Based on trends in data over the past three months, set axis ranges accordingly to encapsulate the full data range.
  • Create a CDS tab to store statistics of our digital systems. We will use the CDS signals to determine when the digital system is running and when the minute trend is missing. This will allow us to exclude irrelevant parts of the data.
  • Provide duty ratio statistics for the IMC.
  • Set triggers for certain plots. For example, for channels C1:LSC-XARM OUT DQ and page 4 LIGO-T1500123–v1 C1:LSC-YARM OUT DQ to be plotted in the Arm LSC Control signals figures, C1:LSCTRX OUT DQ and C1:LSC-TRY OUT DQ must be higher than 0.5, thus acting as triggers.
  • Include some flag or other marking indicating when data is not being represented at a certain time for specific plots.
  • Maybe include some cool features like interactive plots.
  11415   Wed Jul 15 13:19:14 2015 JamieSummaryCDSCDS upgrade: reducing mx end-points as last ditch effort

I tried one last thing, suggested by Keith and Gerrit.  I tried reducing the number of mx end-points on fb to zero, which should reduce the total number of fb threads, in the hope that the extra threads were causing the chokes.

On Tue, Jul 14 2015, Keith Thorne <kthorne@ligo-la.caltech.edu> wrote:
> Assumptions
>  1) Before the upgrade (from RCG 2.6?), the DAQ had been working, reading out front-ends, writing frames trends
>  2) In upgrading to RCG 2.9, the mx start-up on the frame builder was modified to use multiple end-points
> (i.e. /etc/init.d/mx has a line like
> # 1 10G card - X2
> MX_MODULE_PARAMS="mx_max_instance=1 mx_max_endpoints=16 $MX_MODULE_PARAMS"
>  (This can be confirmed by the daqd log file with lines at the top like
> 263596
> MX has 16 maximum end-points configured
> 2 MX NICs available
> [Fri Jul 10 16:12:50 2015] ->4: set thread_stack_size=10240
> [Fri Jul 10 16:12:50 2015] new threads will be created with the stack of size 10
> 240K
>
> If this is the case, the problem may be that the additional thread on the frame-builder (one per end-point) take up so many slots on the 8-core
> frame-builder that they interrupt the frame-writing thread, thus preventing the main buffer from being emptied.  
>
> One could go back to a single end-point. This only helps keep restart of front-end A from hiccuping DAQ for front-end B.
>
> You would have to remove code on front-ends (/etc/init.d/mx_stream) that chooses endpoints. i.e.
> # find line number in rtsystab. Use that to mx_stream slot on card (0-15)
> line_num=`grep -v ^# /etc/rtsystab | grep --perl-regexp -n "^${hostname}\s" | se
> d 's/^\([0-9]*\):.*/\1/g'`
> line_off=$(expr $line_num - 1)
> epnum=$(expr $line_off % 2)
> cnum=$(expr $line_off / 2)
>
>     start-stop-daemon --start --quiet -b -m --pidfile /var/log/mx_stream0.pid --exec /opt/rtcds/tst/x2/target/x2daqdc0/mx_stream -- -e 0 -r "$epnum" -W 0 -w 0 -s "$sys" -d x2daqdc0:$cnum -l /opt/rtcds/tst/x2/target/x2daqdc0/mx_stream_logs/$hostname.log

As per Keith's suggestion, I modified the mx startup script to only initialize a single endpoint, and I modified the mx_stream startup to point them all to endpoint 0.  I verified that indeed daqd was a single MX end-point:

MX has 1 maximum end-points configured

It didn't help.  After 5-10 minutes daqd crashes with the same "0 empty blocks" messages.

I should also mention that I'm pretty sure the start of these messages does not seem coincident with any frame writing to disk; further evidence that it's not a disk IO issue.

Keith is looking at the system now, so we if he can see anything obvious.  If not, I will start reverting to 2.5.

  11417   Wed Jul 15 18:19:12 2015 JamieSummaryCDSCDS upgrade: tentative stabilty?

Keith Thorne provided his eyes on the situation today and had some suggestions that might have helped things

Reorder ini file list in master file.  Apparently the EDCU.ini file (C0EDCU.ini in our case), which describes EPICS subscriptions to be recorded by the daq, now has to be specified *after* all other front end ini files.  It's unclear why, but it has something to do with RTS 2.8 which changed all slow channels to be transported over the mx network.  This alone did not fix the problem, though.

Increase second trend frame size.  Interestingly, this might have been the key.  The second trend frame size was increased to 600 seconds:

start trender 600 60;

The two numbers are the lengths in seconds for the second and minute trends respectively.  They had been set to "60 60", but Keith suggested that longer second trend frames are better, for whatever reason.  It seems he may be right, given that daqd has been running and writing full and trend frames for 1.5 hours now without issue. 


As I'm writing this, though, the daqd just crashed again.  I note, though, that it's right after the hour, and immediately following writing out a one hour minute trend file.  We've been seeing these hour, on the hour, crashes of daqd for quite a while now.  So maybe this is nothing new.  I've actually been wondering if the hourly daqd crashes were associated with writing out the minute trend frames, and I think we might have more evidence to point to that.

If increasing the size of the second trend frames from 60 seconds (35M) to 600 seconds (70M) made a difference in stability, could there be an issue since writing out files that are smaller than some value?  The full frames are 60M, and the minute trends are 35M.

  11427   Sat Jul 18 15:37:19 2015 JamieSummaryCDSCDS upgrade: current status

So it appears we have found a semi-stable configuration for the DAQ system post upgrade:

Here are the issues:

daqd

dadq is running mostly stably for the moment, although it still crashes at the top of every hour (see below).  Here are some relevant points of about the current configuration:

  • recording data from only a subset of front-ends, to reduce the overall load:
    • c1x01
    • c1scx
    • c1x02
    • c1sus
    • c1mcs
    • c1pem
    • c1x04
    • c1lsc
    • c1ass
    • c1x05
    • c1scy
  • 16 second main buffer:
    start main 16;
  • trend lengths: second: 600, minute: 60
    start trender 600 60;
  • writing to frames:
    • full
    • second
    • minute
    • (NOT raw minute trends)
  • frame compression ON

This elliminates most of the random daqd crashing.  However, daqd still crashes at the top of every hour after writing out the minute trend frame. Still unclear what the issue is, but Keith is investigating.  In some sense this is no worse that where we were before the upgrade, since daqd was also crashing hourly then.  It's still crappy, though, so hopefully we'll figure something out.

The inittab on fb automatically restarts daqd after it crashes, and monit on all of the front ends automatically restarts the mx_stream processes.

front ends

The front end modules are mostly running fine.

One issue is that the execution times seem to have increased a bit, which is problematic for models that were already on the hairy edge.  For instance, the rough aversage for c1sus has some from ~48us to 50us.  This is most problematic for c1cal, which is now running at ~66us out of 60, which is obviously untenable.  We'll need to reduce the load in c1cal somehow.

All other front end models seem to be working fine, but a full test is still needed.

There was an issue with the DACs on c1sus, but I rebooted and everything came up fine, optics are now damped:

  11437   Wed Jul 22 22:06:42 2015 EveSummarySummary PagesFuture summary pages improvements

- CDS Tab

We want to monitor the status of the digital control system.

1st plot
Title: EPICS DAQ Status
I wonder we can plot the binary numbers as statuses of the data acquisition for the realtime codes.
We want to use the status indicators. Like this:
https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary/day/20150722/plots/H1-MULTI_A8CE50_SEGMENTS-1121558417-86400.png

channels:
C1:DAQ-DC0_C1X04_STATUS
C1:DAQ-DC0_C1LSC_STATUS
C1:DAQ-DC0_C1ASS_STATUS
C1:DAQ-DC0_C1OAF_STATUS
C1:DAQ-DC0_C1CAL_STATUS

C1:DAQ-DC0_C1X02_STATUS
C1:DAQ-DC0_C1SUS_STATUS
C1:DAQ-DC0_C1MCS_STATUS
C1:DAQ-DC0_C1RFM_STATUS
C1:DAQ-DC0_C1PEM_STATUS

C1:DAQ-DC0_C1X03_STATUS
C1:DAQ-DC0_C1IOO_STATUS
C1:DAQ-DC0_C1ALS_STATUS

C1:DAQ-DC0_C1X01_STATUS
C1:DAQ-DC0_C1SCX_STATUS
C1:DAQ-DC0_C1ASX_STATUS

C1:DAQ-DC0_C1X05_STATUS
C1:DAQ-DC0_C1SCY_STATUS
C1:DAQ-DC0_C1TST_STATUS

1st plot
Title: IOP Fast Channel DAQ Status
These have two bits each. How can we handle it?
If we need to shrink it to a single bit take "AND" of them.
C1:FEC-40_FB_NET_STATUS (legend: c1x04, if a legend placable)
C1:FEC-20_FB_NET_STATUS (legend: c1x02)
C1:FEC-33_FB_NET_STATUS (legend: c1x03)
C1:FEC-19_FB_NET_STATUS (legend: c1x01)
C1:FEC-46_FB_NET_STATUS (legend: c1x05)

3rd plot
Title C1LSC CPU Meters
channels:
C1:FEC-40_CPU_METER (legend: c1x04)
C1:FEC-42_CPU_METER (legend: c1lsc)
C1:FEC-48_CPU_METER (legend: c1ass)
C1:FEC-22_CPU_METER (legend: c1oaf)
C1:FEC-50_CPU_METER (legend: c1cal)
The range is from 0 to 75 except for c1oaf that could go to 500.
Can we plot c1oaf with the value being devided by 8? (Then the legend should be c1oaf /8)

4th plot
Title C1SUS CPU Meters
channels:
C1:FEC-20_CPU_METER (legend: c1x02)
C1:FEC-21_CPU_METER (legend: c1sus)
C1:FEC-36_CPU_METER (legend: c1mcs)
C1:FEC-38_CPU_METER (legend: c1rfm)
C1:FEC-39_CPU_METER (legend: c1pem)
The range is be from 0 to 75 except for c1pem that could go to 500.
Can we plot c1pem with the value being devided by 8? (Then the legend should be c1pem /8)

5th plot
Title C1IOO CPU Meters
channels:
C1:FEC-33_CPU_METER (legend: c1x03)
C1:FEC-34_CPU_METER (legend: c1ioo)
C1:FEC-28_CPU_METER (legend: c1als)
The range is be from 0 to 75.

6th plot
Title C1ISCEX CPU Meters
channels:
C1:FEC-19_CPU_METER (legend: c1x01)
C1:FEC-45_CPU_METER (legend: c1scx)
C1:FEC-44_CPU_METER (legend: c1asx)
The range is be from 0 to 75.

7th plot
Title C1ISCEY CPU Meters
channels:
C1:FEC-46_CPU_METER (legend: c1x05)
C1:FEC-47_CPU_METER (legend: c1scy)
C1:FEC-91_CPU_METER (legend: c1tst)
The range is be from 0 to 75.

=====================

IOO

We want a duty ratio plot for the IMC. C1:IOO-MC_TRANS_SUM >1e4 is the good period.

Duty ratio plot looks like the right plot of the following link
https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary/day/20150722/lock/segments/

=====================

SUS: OPLEV

OL_PIT_INMON and OL_YAW_INMON are good for the slow drift monitor.
But their sampling rate is too slow for the PSDs.
Can you use
C1:SUS-ETM_OPLEV_PERROR
C1:SUS-ETM_OPLEV_YERROR
etc...
For the PSDs? They are 2kHz sampling DQ channels. You would be able to plot
it up to ~1kHz. In fact, we want to monitor the PSD from 100mHz to 1kHz.
How can you set up the resolution (=FFT length)?

=====================

LSC / ASC / ALS tabs

Let's make new tabs LSC, ASC, and ALS

LSC:

We should have a plot for
C1:LSC-TRX_OUT_DQ
C1:LSC-TRY_OUT_DQ
C1:LSC-POPDC_OUT_DQ
It's OK to use the minute trend for now.
You can check the range using dataviewer.

ASC:

Let's use
C1:SUS_MC1_ASCPIT_OUT16 (legend: IMC WFS)
C1:ASS-XARM_ITM_YAW_OSC_CLKGAIN (legend: XARM ASS)
C1:ASS-YARM_ITM_YAW_OSC_CLKGAIN (legend: YARM ASS)
C1:ASX-XARM_M1_PIT_OSC_CLKGAIN (legend: XARM Green ASS)
as the status indicators. There is no YARM Green ASS yet.

ALS:

Title: ALS Green transmission
We want a time series of
ALS-TRX_OUT16
ALS-TRY_OUT16

Title: ALS Green beatnote
Another time series
ALS-BEATX_FINE_Q_MON
ALS-BEATY_FINE_Q_MON

Title: Frequency monitor
We have frequency counter outputs, but I have to talk to Eric to know the channel names

  11441   Thu Jul 23 20:57:15 2015 JessicaSummaryGeneralApplying Pre-filter to data before IIR Wiener Filtering

I updated my bandpass filter and have included the bode plot below in Figure 1. It is a fourth order elliptic bandpass filter with a passband ripple of 1dB and a stopband attenuation of 30 dB. It emphasizes the area between 3 and 40 Hz.

Below, I applied this filter to the huddle test data. The results from this were only slightly better in the targeted region than when no pre-filter was applied. 

When I pre-filtered the mode cleaner data and then used an IIR wiener filter, I found that the results did not differ much from the data that was not pre-filtered. I'm not sure yet if I'm targeting the right region of this data with my bandpass filter, and will be looking more into choosing a better region. Also, I am only using certain regions of ff when calculating the transfer function, and need to optimize that region also. I uploaded the code I used to make these plots to github.

  11456   Tue Jul 28 20:42:50 2015 JessicaSummaryGeneralNew Seismometer Data Coherence

I was looking at the new seismometer data and plotted the coherence between the different arms of C1:PEM_GUR1 and C1:PEM_GUR2. There was not much coherence in the X arms, Y arms, or Z arms of each seismometer, but there were within the x and y arms of the seismometer.

I think the area we should focus on with filtering is lower ranges, between 0.01 and 0.1, because that it where coherence is most clearly high. It is higher in high frequencies but also incredibly noisy, meaning it probably wouldn't be good to try to filter there. 

  11457   Wed Jul 29 10:34:42 2015 IgnacioSummaryLSCCoherence of arms and seismometers

Jessica and I took 45 mins  (GPS times from 1122099200 to 1122101950) worth of data from the following channels:

C1:IOO-MC_L_DQ (mode cleaner)
C1:LSC-XARM_IN1_DQ (X arm length)
C1:LSC-YARM_IN1_DQ (Y arm length)

and for the STS, GUR1, and GUR2 seismometer signals.

The PSD for MCL and the arm length signals is shown below,

I looked at the coherence between the arm length and each of the three seismometers, plot overload incoming below,

For the coherence between STS and XARM and YARM,

  

For GUR1,

  

Finally for GUR2,

 

A few remarks:

1) From the coherence plots, we can see that the arm length signals are coherent with the seismometer signals the most from 0.5 - 50 Hz. This is most evident in the coherence with STS. I think subtraction will be most useful in this range. This agrees with what we see in the PSD of the arm length signals, the magnitude of the PSD starts increasing from 1 Hz and reaches a maximum at about 30 Hz. This is indicative of which frequencies most of the noise is present.

2) Eric did not remember which of  GUR1 and GUR2 corresponded to the ends of XARM and YARM. So, I went to the end of XARM, and jumped for a couple seconds to disturb whatever Gurald was in there. Using dataviewer I determined it was GUR1. Anyways, my point is, why is GUR1 less coherent with both arms and not just XARM?  Since it is at the end of XARM, I was expecting GUR1 to be more coherent with XARM. Is it because, though different arms, the PSD's of both arms are roughly the same? 

3) Similarly, GUR2 shows about the same levels of coherence for both arms, but it is more coherent. Is GUR2 noisier because of its location?

Code: ARMS_COH.m.zip

  11458   Wed Jul 29 11:15:21 2015 JessicaSummaryLSCPSDs of Arms with seismometer subtraction

Ignacio and I downloaded data from the STS, GUR1, and GUR2 seismometers and from the mode cleaner and the x and y arms. The PSDs along the arms have the most noise at frequencies greater than 1 Hz, so we should focus on filtering in that area. The noise levels start dropping at around 30 Hz, but are still much higher than is seen at frequencies below 1 Hz. However, because the spectra is so low at frequencies below that, Wiener filtering alone injected a significant amount of noise into those frequencies and did not do much to reduce the noise at higher frequencies. Pre-filtering will be required, and I have started implementing a pre-filter, but with no improvements yet. So far, I have tried making a bandpass filter, but a highpass filter may be ideal in this case because so much of the noise is above 1 Hz. 

  11461   Wed Jul 29 21:40:39 2015 KojiSummaryCDSStatus of the frame data syncing

The trend data hasn't been synced with LDAS since Jul 27 5AM local.

40m:

controls@nodus|minute > pwd
/frames/trend/minute
controls@nodus|minute > ls -l 11222 | tail
total 590432
-rw-r--r-- 1 controls controls 35758781 Jul 29 11:59 C-M-1122228000-3600.gwf
-rw-r--r-- 1 controls controls 35501472 Jul 29 12:59 C-M-1122231600-3600.gwf
-rw-r--r-- 1 controls controls 35296271 Jul 29 13:59 C-M-1122235200-3600.gwf
-rw-r--r-- 1 controls controls 35459901 Jul 29 14:59 C-M-1122238800-3600.gwf
-rw-r--r-- 1 controls controls 35550346 Jul 29 15:59 C-M-1122242400-3600.gwf
-rw-r--r-- 1 controls controls 35699944 Jul 29 16:59 C-M-1122246000-3600.gwf
-rw-r--r-- 1 controls controls 35549480 Jul 29 17:59 C-M-1122249600-3600.gwf
-rw-r--r-- 1 controls controls 35481070 Jul 29 18:59 C-M-1122253200-3600.gwf
-rw-r--r-- 1 controls controls 35518238 Jul 29 19:59 C-M-1122256800-3600.gwf
-rw-r--r-- 1 controls controls 35514930 Jul 29 20:59 C-M-1122260400-3600.gwf

 

LDAS Minute trend:

[koji.arai@ldas-pcdev3 C-M-11]$ pwd
/archive/frames/trend/minute-trend/40m/C-M-11
[koji.arai@ldas-pcdev3 C-M-11]$ ls -l | tail
-rw-r--r-- 1 1001 1001 35488497 Jul 26 19:59 C-M-1121997600-3600.gwf
-rw-r--r-- 1 1001 1001 35477333 Jul 26 21:00 C-M-1122001200-3600.gwf
-rw-r--r-- 1 1001 1001 35498259 Jul 26 21:59 C-M-1122004800-3600.gwf
-rw-r--r-- 1 1001 1001 35509729 Jul 26 22:59 C-M-1122008400-3600.gwf
-rw-r--r-- 1 1001 1001 35472432 Jul 26 23:59 C-M-1122012000-3600.gwf
-rw-r--r-- 1 1001 1001 35472230 Jul 27 00:59 C-M-1122015600-3600.gwf
-rw-r--r-- 1 1001 1001 35468199 Jul 27 01:59 C-M-1122019200-3600.gwf
-rw-r--r-- 1 1001 1001 35461729 Jul 27 02:59 C-M-1122022800-3600.gwf
-rw-r--r-- 1 1001 1001 35486755 Jul 27 03:59 C-M-1122026400-3600.gwf
-rw-r--r-- 1 1001 1001 35467084 Jul 27 04:59 C-M-1122030000-3600.gwf

 

  11465   Thu Jul 30 11:47:54 2015 KojiSummaryCDSStatus of the frame data syncing

Today it was synced at 5AM but that was all.

40m:

controls@nodus|minute > pwd
/frames/trend/minute
controls@nodus|minute > ls -l 11222|tail
-rw-r--r-- 1 controls controls 35521183 Jul 29 21:59 C-M-1122264000-3600.gwf
-rw-r--r-- 1 controls controls 35509281 Jul 29 22:59 C-M-1122267600-3600.gwf
-rw-r--r-- 1 controls controls 35511705 Jul 29 23:59 C-M-1122271200-3600.gwf
-rw-r--r-- 1 controls controls 35809690 Jul 30 00:59 C-M-1122274800-3600.gwf
-rw-r--r-- 1 controls controls 35752082 Jul 30 01:59 C-M-1122278400-3600.gwf
-rw-r--r-- 1 controls controls 35927246 Jul 30 02:59 C-M-1122282000-3600.gwf
-rw-r--r-- 1 controls controls 35775843 Jul 30 03:59 C-M-1122285600-3600.gwf
-rw-r--r-- 1 controls controls 35648583 Jul 30 04:59 C-M-1122289200-3600.gwf
-rw-r--r-- 1 controls controls 35643898 Jul 30 05:59 C-M-1122292800-3600.gwf
-rw-r--r-- 1 controls controls 35704049 Jul 30 06:59 C-M-1122296400-3600.gwf
controls@nodus|minute > ls -l 11223|tail
total 139616
-rw-r--r-- 1 controls controls 35696854 Jul 30 08:02 C-M-1122300000-3600.gwf
-rw-r--r-- 1 controls controls 35675136 Jul 30 08:59 C-M-1122303600-3600.gwf
-rw-r--r-- 1 controls controls 35701754 Jul 30 09:59 C-M-1122307200-3600.gwf
-rw-r--r-- 1 controls controls 35718038 Jul 30 10:59 C-M-1122310800-3600.gwf

LDAS Minute trend:

[koji.arai@ldas-pcdev3 C-M-11]$ pwd
/archive/frames/trend/minute-trend/40m/C-M-11
[koji.arai@ldas-pcdev3 C-M-11]$ ls -l |tail
-rw-r--r-- 1 1001 1001 35518238 Jul 29 19:59 C-M-1122256800-3600.gwf
-rw-r--r-- 1 1001 1001 35514930 Jul 29 20:59 C-M-1122260400-3600.gwf
-rw-r--r-- 1 1001 1001 35521183 Jul 29 21:59 C-M-1122264000-3600.gwf
-rw-r--r-- 1 1001 1001 35509281 Jul 29 22:59 C-M-1122267600-3600.gwf
-rw-r--r-- 1 1001 1001 35511705 Jul 29 23:59 C-M-1122271200-3600.gwf
-rw-r--r-- 1 1001 1001 35809690 Jul 30 00:59 C-M-1122274800-3600.gwf
-rw-r--r-- 1 1001 1001 35752082 Jul 30 01:59 C-M-1122278400-3600.gwf
-rw-r--r-- 1 1001 1001 35927246 Jul 30 02:59 C-M-1122282000-3600.gwf
-rw-r--r-- 1 1001 1001 35775843 Jul 30 03:59 C-M-1122285600-3600.gwf
-rw-r--r-- 1 1001 1001 35648583 Jul 30 04:59 C-M-1122289200-3600.gwf

  11502   Thu Aug 13 12:06:39 2015 Jessica SummaryIOOBetter predicted subtraction did not work as well Online

Yesterday I adjusted the preweighting of my IIR fit to the transfer function of MC2, and also managed to reduce the number of poles and zeros from 8 to 6, giving a smoother rolloff. The bode plots are pictured here:

The predicted IIR subtraction was very close to the predicted FIR subtraction, so I thought these coefficients would lead to a better online filter.

However, the actual subtraction of the MCL was not as good and noise was injected into the Y arm.

The final comparison of the subtraction factors between the online and offline data showed that the preweighting, while it improved the offline subtraction, needs more work to improve the online subtraction also.

  11509   Fri Aug 14 23:49:34 2015 KojiSummaryGeneralB&K Shaker fixed

I fixed a shaker that was claimed to be broken. I had to cut the rubber membrane to open the head.

Once it was opened, the cause of the trouble was obvious. The soldering joint could not put up with the motion of the head.

It is interesting to see that the spring has the damping layer between the metal sheets.

After the repair the DC resistance was measured. It was 1.9Ohm. The side of the shaker chassis said "3.5Ohm, Max 15VA". So it can take more than 4A (wow).

I gave 2A DC from the bench top supply and turn the current on and off. I could confirm the head was moving.

I'll claim the use of this shaker for the seismometer development.

  11524   Sat Aug 22 15:48:32 2015 KojiSummaryLSCArm locking recovery

As per Ignacio's request, I restored the arm locking.

- MC WFS relief

- Slow DC restored to ~0V

- Turned off DARM/CARM

- XARM/YARM turned on

- XARM/YARM ASS& Offset offloading

  11551   Tue Sep 1 02:44:44 2015 KojiSummaryCDSc1oaf, c1mcs modified for the IMC angular FF

[Koji, Ignacio]

In order to allow us to work on the IMC angular FF, we made the signal paths from PEM to MC SUSs.
In fact, there already were the paths from c1pem to c1oaf. So, the new paths were made from c1oaf to c1mcs. (Attachment 1~3)

After some debugging those two models started running. The additional cost of the processing time is insignificant.
FB was restarted to accomodate the change.

Once the modification of the models was completed, the OAF screens were modified. It seemed that the Kissel button
for the output matrix haven't been updated for the PRM ASC implementation. This was fixed as the button was updated this time.
In addition, the button for the FM matrix was also made and pasted.

 

  11569   Thu Sep 3 19:52:24 2015 ranaSummarySUSSUS drift monitor

Since Andrey's SUS Drift mon screen back in 2007, we've had several versions which used different schemes and programming languages. Diego made an update back in January.

Today I added his stuff to the SVN since it was lost in the NFS disks somewhere. Its in SUS/DRIFT_MON/.

Since we've been updating our userapps directory recently to pull in the screens and scripts from the sites, we also got a copy of the Thomas Abbott drift mon stuff which is better (Diego actually removed the yellow/red functionality as part of the 'upgrade'), but more complicated. For now we have the old one. I updated the good values with all optics roughly aligned just a few minutes ago.
 

  11590   Thu Sep 10 09:37:34 2015 IgnacioSummaryIOOFilters left on MCL static module

The following MCL filters were left loaded in the T240-X and T240-Y FF filter modules (filters go in pairs, both on):

FM7: SISO filters for MCL elog:11541

FM8: MISO v1 elog:11547

FM9: MISO v1.1 Small improvement over MISO v1

FM10 MISO v2 elog:11563

FM5 MISO v3.1 elog:11584 (best one)

FM6 MISO 3.1.1 elog:11584 (second best one)

 

  11597   Tue Sep 15 01:14:10 2015 ranaSummaryLSCneed to check LSC Whitening switch logic ... again

Tonight we noticed that the REFL_DC signal has gone bipolar, even though the whitening gain is 0 dB and the whitening filter is requested to be OFF.

We should check out the switch operation of several ofthe LSC channels in the daytime - where is the procedure for this diagnostic posted?

  11598   Tue Sep 15 15:01:23 2015 ranaSummaryLSCdisabling the LSC AA filters + mod to whitening

While investigating the BIO situation with the LSC machine and the iscaux2 processor last night, we wondered if maybe the Anti-Aliasing filters were mistakenly disabled. But why do we need these anyway?

Our ADCs digitize at 64 kHz and there is a digital lowpass in the IOP at 5 kHz before we downsample to 16 kHz. So mainly we're trying to prevent some aliasing at the 64 kHz IOP rate. But our analog AA filter is a 8th order ELP at 7570 Hz, so its overkill.

So, I propose that we bypas the AA via hardwiring the board and implement a 10 kHz pole in the whitening board (D990694) before the whitening by turning R127, etc. into a 0.1 uF cap. Along with the 100 Ohm series resistor, this will make a pole at ~15 kHz. Probably ought to check that the input resistor is metal film. Also, if we replace C158/C159, etc. with a 0.47 nF cap, we'll get 2 poles at 35 kHz to limit the higher frequencies from saturating.

  11599   Tue Sep 15 15:10:48 2015 gautam, ericq, ranaSummaryLSCPRFPMI lock & various to-do's
I was observing Eric while he was attempting to lock the PRFPMI last night. The handoff from ALS to LSC was not very smooth, and Rana suggested looking at some control signals while parked close to the PRFPMI resonance to get an idea of what frequency bands the noise dominated in. The attached power spectrum was taken while CARM and DARM were under ALS control, and the PRMI was locked using REFL_165. The arm power was fluctuating between 15 and 50. Most of the power seems to be in the 1-5Hz band and the 10-30Hz band.

Rana made a number of suggestions, which I'm listing here. Some of these may directly help the above situation, while the others are with regards to the general state of affairs.

  • Reroute both (MC and arm) FF signals to the SUS model
  • For MC, bypass LSC
  • Rethink the MC FF -
  • Leave the arm FF on all the time?
  • The positioning of the accelerometer used for MC FF has to be bettered - it should be directly below the tank
  • The IOO model is over-clocking - needs to be re-examined
  • Fix up the DC F2P - Rana mentioned an old (~10 yr) script called F2P ratio, we should look to integrate the Python scripts used for lock-in/demod at the sites with this
  • Look to calibrate MC_F
  • Implement a high BW CARM servo using ALS
  • Gray code implementation for EPICS gain-stepping

  11601   Tue Sep 15 18:35:21 2015 ericqSummaryLSCsome further notes

About the analog CARM control with ALS:

We're looking at using a Sigg designed remotely switchable delay line box on the currently undelayed side of the ALS DFD beat. For a beat frequency of 50MHz, one cycle is 20ns, this thing has 24ns total delay capability, so we should be able to get pretty close to a zero crossing of the analog I or Q outputs of the demod board. This can be used as IN2 for the common mode board. 

Gautam is testing the functionality of the delay and switching, and should post a link to the DCC page of the schematic. Rana and Koji have been discussing the implementation of the remote switching (RCG vs. VME). 

I spent some time this afternoon trying to lock the X arm in this way, but instead of at IR resonance, just wherever the I output of the DFD had a zero crossing. However, I didn't give enough thought to the loop shapes; Koji helped me think it through. Tomorrow, I'll make a little pomona box to go before the CM IN2 that will give the ALS loop shape a pole where we expect the CARM coupled cavity pole to be (~120Hz), so that the REFL11 and ALS signals have a similar shape when we're trying to transition. 

The common mode board does have a filter for this kind of thing for single arm tests, but puts in a zero as well, as it expects the single arm pole, which isn't present in the ALS sensing, so maybe I'll whip up something appropriate for this, too. 

  11603   Tue Sep 15 20:44:13 2015 gautamSummaryLSCChecking the delay line phase shifter DS050339
I checked out the delay line phase shifter D050339, (theory of operation here) this afternoon. I first checked that the power connection was functional, which it was, though the power connector is is not the usual chassis one (see image attached, do we need to change this?).

The box has two modes of operation - you can either change the delay by flipping switches on the front panel or via a 25pin D-sub connector on the back (the pin numberings for this connector on the datasheet are a little misleading, but I determined that pins 1-9 on the D-sub connector correspond to the 9 delays on the front panel in ascending order, pin 10 is the mode selector switch, should be high for remote operation, pins 11 and 13 are NC, pin 12 is VCC of 5V, and pins 14-25 are grounded). I first checked the front-panel mode of operation, using an oscilloscope to measure the delay between the direct signal from the Fluke 6061 and the output from the D050339. This corresponds to the first set of datapoints in the plot attached (signal was 100MHz sine wave).

I then used a 25 pin D sub breakout boards to check the remote operation mode as well, which corresponds to the second set of datapoints in the plot attached. For this measurement, I used the Agilent network analyzer to measure the phase lag between the direct signal (for all delays, I measured the phase lag at 100MHz, having first calibrated the "thru" path by connecting the R and A inputs of the network analyzer using a barrel BNC) and the delayed output from the box, and then converted it to a time delay.

Both sets of data are linear, with a slope nearly equal to 1 as expected. I conclude that the box is functioning as expected. Right now, Koji is checking a board which will be used to remotely control this box. On the hardware side it remains to make a cable going from the DS050339 Dsub input to the driver board output (also 25 pin Dsub).
  11604   Wed Sep 16 03:37:06 2015 KojiSummaryGreen LockingWorkable delay line setup prepared

[Koji Gautam]

The variable delay line has been setup for practical use. The hardware and basic software are ready.

The delay time is given by [512-1-mod(C1:LSC-BO_1_0_SET, 512)]*(1/16) ns

Giving 511 (LLLL LLLH HHHH HHHH) to C1:LSC-BO_1_0_SET makes the delayline shortest (+0ns).
Giving 0 (LLLL LLLL LLLL LLLL) to C1:LSC-BO_1_0_SET makes the delayline longest (~32ns).

The SR785 was removed from the rack for our access >> Eric


DO setup

- Three CONTEC DO-32L-PE cards are found in the Yarm digital cabinet. (I brought a card from WB, but will bring it back).
- The card was installed in the C1LSC chassis.

- The models for c1x04 and c1lsc were modified to include the card. Once they are restarted, the card was recognized without problem.
  The frame builder also needed to be restarted (Attachment 1&2). The changes were committed to the repository.

- MEDM screen "CDS_BO_STATUS.adl" has been modified to include the bit monitors for the new DO card. (Attachment 3)

Epics values "C1:LSC-BO_1_0_SET" and "C1:LSC-BO_1_1_SET" are hooked up to the DO block.

Cables

- The DO board has DB37(F). I made a I/F cable with a DB37(M) crimp connector, DB25 breakout board, and a ribbon cable.
  Pin 1 is connected to pin 14 of the DB25 (GND of the delayline circuit).
  Pin 2~10 are connected to pin 1~9 of the DB25 (Switch 1~9 of the delayline circuit)
  Pin 18 is connected to X01 (external = spare) (Attachment 4)
 

- [CONFESSION] A bench +15V power supply was prepared to power the transisters of the DO (Attachment 6). The hot side is connected to X01 (not connected to the DB25),
  and the cold side is connected to pin 14 of the DB25. Once we find this is a useful setup we need to make a dedicated interface unit to convert DB37
  into DB25 (and provide more connectivities).

- A DB25 M-F cable was installed on the cable tray above the LSC racks.

Delay line unit

- The delay line box was mounted on 34H of the LSC analog rack (Attachment 5).

- The side cross connect power supply was not available (to be described later). Therefore we decided to use the same +15V supply as the one for the DO card.

- Checked the functionarity of the local switches using a function generator @30MHz and the front panel switches. The maximum (~32ns) delay was confirmed.
  (Just not enough to have 360 deg shift).

- Now the delay line function was tested with the front panel swicth at "ext". We confirmed that the delay time changes with the number given to C1:LSC-BO_1_0_SET.


What we need further

- Implement delay time slider control (511 = 0ns, 0 = 31.94ns). The delay time is given by
  [512-1-mod(C1:LSC-BO_1_0_SET, 512)]*(1/16) ns

- Some independent RF issues I found. (Next entry)

  11606   Wed Sep 16 15:04:33 2015 ericqSummaryLSCDC PD Whitening Board Fixed
Quote:

Tonight we noticed that the REFL_DC signal has gone bipolar, even though the whitening gain is 0 dB and the whitening filter is requested to be OFF.

Fixed! I noticed that whitening gain changes weren't having any effect on CM_SLOW. I then checked REFL_DC, where this also seemed to be the case. Since the gain is controlled via VME machine, and whitening filter switching is controlled via RCG, I figured there must be something wrong with the board. I checked all of the DC PD signals, which share a whitening filter board, and they all had the same symptoms. 

I went and peeked at the board, and it turns out the backplane cable had fallen off. frown

I plugged it in, things look ok. 

  11609   Thu Sep 17 03:48:10 2015 ericqSummaryLSCsome further notes

Something odd is happening with the CM board. Measuring from either input to OUT1 (the "slow output") shows a nice flat response up until many 10s of kHz. 

However, when I connect my idependently confirmed 120Hz LPF to either input, the pole frequency gets moved up to ~360Hz and the DC gain falls some 10dB. This happens regardless if the input is used or not, I saw this shape at a tee on the output of the LPF when the other leg of the tee was connected to a CM board input. 

This has sabotaged my high bandwidth ALS efforts. I will investigate the board's input situation tomorrow.

  11611   Thu Sep 17 13:06:05 2015 ericqSummaryLSCLow input impedance on CM board

As it turns out, our version of the common mode board does not have high input impedence. I think this is what is messing with the lowpass. 

I added photos of the PCB to our 40m DCC page about this board: D1500308, wherein you can see that we have Revision B. 

On the aLIGO wiki's CommonModeServo page, one finds that high input impedence was added in Revision E. At LIGO-D040180, one finds this was implemented via an additional dual AD829 instrumentation amplifier stage before the input amplification stage that exists on our board.

Also, I find that the boosts installed are the default 40:4k, 1k:20k, 1k:20k, 500:10k pole zero pairs. Given our 30-40kHz UGF for CARM thus far, maybe we would like to lower some of these boost corner frequencies, to actually be able to use them; so far we only use the first two.

  11615   Thu Sep 17 19:58:06 2015 gautamSummaryComputer Scripts / ProgramsFrequency counting algorithm

I made some changes to the c1tst model running on c1iscey in order to test my algorithm for frequency counting. I followed the steps listed in elog 8909 to make, install and start the model. 

I need to debug a few things and run some more diagnostics so I am leaving the model in its edited version (Eric had committed it to the svn before I made any changes). 

  11628   Mon Sep 21 18:31:06 2015 gautamSummaryComputer Scripts / ProgramsFrequency counting algorithm

I have been working on setting up a frequency counting module that can give us a readout of the beat frequency, divided by a factor of 2^14 using the Wenzel frequency dividers as described here. This is a summary of what I have thus far.

The algorithm, and simulink model

The basic idea is to pass the digitized signal through a Schmitt trigger (existing RCG module), which provides some noise immunity, and should in theory output a clean square wave with the same frequency as the input. The output of the Schmitt trigger module is either 0 (for input < lower threshold value) and 1 (for input greater than the high threshold value). By differencing this between successive samples, we can detect a "zero-crossing", and by measuring the time interval between successive zero crossings, we can take the reciprocal to get the frequency. The last bit of this operation (i.e. measuring the interval) is done using a piece of custom C code. Initially, I was trying to use the part "GPS" from CDS_PARTS to get the current GPS time and hence measure intervals between successive zero-crossings, but this didn't work out because the output of GPS is in seconds, and that doesn't give me the required precision to count frequency. I tried implementing some more precision timing using the clock_gettime() function, which is capable of giving nanosecond precision, but this didn't work for me. So I am now using a more crude way of measuring the interval, by using a counter variable that is incremented each time a zero-crossing is NOT detected, and then converting this to time using the FE_RATE macro (=16384). In any case, the ADC sampling rate limits the resolution of frequency counting using zero-crossing detection (more on this later). Attachment 1 shows the SIMULINK block diagram for this entire procedure.

Testing the model

I implemented all of this on c1tst, and followed the steps listed here to get the model up and running. I then used one of the DB37 breakout boards to send a signal to the ADC using the DS345 function generator. Attachment 2 shows some diagnostic plots - input signal was a 2.5Vpp (chosen to match the output from the Wenzel dividers) square wave at 2kHz:

  • Bottom left: digitized version of the input signal - I used this to set the upper and lower thresholds on the Schmitt trigger at +1000 counts and -1000 counts respectively.
  • Top left: Schmitt trigger output (red trace) and the difference between successive samples of the Schmitt trigger output (blue trace - this variable is used to detect a zero crossing)
  • Top right: Counter variable used to measure intervals between successive zero crossings, and hence, the frequency. The frequency output is held until the next zero crossing is detected, at which time counter is reset
  • Bottom right: frequency output in Hz.

The right column pointed me to the limitations of frequency counting using this method - even though the input frequency was constant (2kHz), the counter variable, and hence the frequency readout, was neither accurate nor precise. But this was to be expected given the limitations imposed by ADC sampling? We only get information of the state of the input signal once within each sampling interval, and hence, we cannot know if a zero crossing has occurred until the next sampling interval. Moreover, we can only count frequency in discrete steps. In attachments 3 and 4, I've plotted these discrete frequencies which can be measured - the error bars indicate the error in the frequency readout if the counter variable is 1 more or less than the "true" value - this can (and does) happen if the high and low times of the Schmitt trigger are not equal over time (see top left plot in Attachment 2, its not very obvious, but all the "low" times are not equal, and so, the interval between detected zero crossings is not equal). This becomes a problem for small values of the counter variable, i.e. at high input frequencies. I was having a look at the elogs Aidan wrote some years ago for a different digital frequency counting approach, and I guess the conclusion there was similar - for high input frequencies, the error is large. 

I further did two frequency sweeps using the DS345, to see if I could recover this in the frequency readout. Attachments 5 and 6 show the results of these sweeps. For low frequencies, i.e. 100-500 Hz, the jitter in the readout is small (though this will be multiplied by a factor of 2^14), but by the time the input frequency gets up to 2kHz, the jitter in the readout is pretty bad (and gets worse for even higher frequencies.

Bottom line

Some refinements can be made to the algorithm, perhaps by introducing some averaging (i.e. not reading out frequency for every pair of zero crossings, but every 5) which may improve the jitter in the readout, but I would think that the current approach is not very useful above 2kHz (corresponding to ~30MHz of pre-divider frequency), because of the limitations shown in attachments 3 and 4. 

  11629   Mon Sep 21 23:18:55 2015 ericqSummaryComputer Scripts / ProgramsFrequency counting algorithm

I definitely think lowpassing the output is the way to go. Since this frequency readback will be used for slow control of the beatnote frequency via auxillary laser temperature, even lowpassing at tens of Hz is fine. The jitter doesn't mean its useless, though.

If we lowpass at 16Hz, we're effectively averaging over 1024 samples, bringing, for example, a +-2kHz jitter of a 6kHz signal as you post down to 2kHz/sqrt(1024) ~ 60Hz, which is 1% of the carrier. This seems ok to me. 

  11631   Tue Sep 22 02:11:17 2015 ranaSummaryComputer Scripts / ProgramsFrequency counting algorithm

I was going to suggest using a software PLL, but perhaps averaging gives the same result. The same ADC signal can be fed to multiple blocks with different averaging times and we can just use whichever ones seems the most useful.

  11635   Tue Sep 22 16:52:36 2015 ericqSummaryGeneralRandom Notes

Some things bouncing around my head that haven't made it to ELOG yet:

  • Last week, Rana and I were investigating excess power line noise coming from the DFD demodulation. We put transformers on the green beat signals where they arrive at the LSC rack, to avoid connecting their signal ground from the PSL table to the LSC rack ground. This didn't help; it's unclear what the culprit is; maybe the demod board power board?
  • Lately, when the interferometer loses lock, the Y arm will not lock on POY, or even flash its IR resonance. for a little while. The green beam can be locked, the X arm can be locked, and no excess angular noise is evident from glancing at the oplev XY plots. Mysterious.  
  • Sometimes, when writing new values to the C1LSC SDF table, the c1lscepics process dies (though the write is successful). This is highly annoying. This may have been adressed in some slightly newer RCG code. 
  • C1OAF is running with a big red NO SYNC message on its GDS screen. C1LSC has shown this too, but I think only when the SDF/epics crash happens. 
  • C1OAF also doesn't seem to properly load the "safe" SDF table when starting up, and errantly puts ones in every element in the static FF matrix. Be careful when restarting OAF!
  11763   Fri Nov 13 22:32:54 2015 KojiSummaryPSLPMC LO degraded, usual ERA-5 replacement, LO recovered

[Yutaro, Koji]

We found that the PMC LO level was fluctuating in a strage way (it was not stable but had many clitches like an exponential decay), we suspected the infamous PMC LO level decay. In fact, in June 2014 when Rana recalibrated the LO level,  the number on the medm screen (C1:PSL-PMC_LO_CALC) was about 11dBm. However, today it was about 6dBm. So we decided to jump in to the 1X1 rack.

The LO and PC outputs of the PMC Crystal module (D980353) were measured to be 6.2dBm and 13.3dBm. Rana reported in ELOG 10160 that it was measured to be 11.5dBm. So apparently the LO level decayed. Unfortunately, there was no record of the PC output level. In any case, we decided to pull the module for the replacement of ERA-5 chips.

Once we opened the box we found that the board was covered by some greasy material. The ERA-5 chip on the LO chain seemed unreasonably brittle. It was destryed during desoldering. We also replaced the ERA-5 chip in the PC chain, just in case. The board was cleaned by the defluxing liquids.

Taking an advatage of this chance, the SMA  cables around the PMC were checked. By removing some of the heat shrinks, suspicious broken shields of the connectors were found. We provided additional solder to repair them.

After the repair, the LO and PC output levels became icreased to 17.0dBm(!) and 13.8dBm, respectively. (Victory)
This LO level is way too much compared to Rana's value. The MEDM LO power adj has little effect and the adj range was 16dBm~17dBm. Therefore we moved the slider to 10, which yields 16dBm out, and added a 5dB attenuator. The measured LO level after the attenuator was measured to be 11.2dBm.

Locking of the PMC was tried and immediately acquired the lock. However, we noticed that the nomoinal gain of 10dB cause the oscillation of the servo. As we already adjusted the LO level to recover the nominal value, we suspeced that the modulation depth could be larger than before. We left the gain at 0dB that doesn't cause the oscillation. It should be noted that the demodulation phase and the openloop gain were optimized. This should be done in the day time as soon as possible.

When the PMC LO repair was completed, the transmission of the PMC got decreased to 0.700V. The input alignment has been adjusted and the transmission level of 0.739V has been recovered.

The IMC lock stretch is not stable as before yet. Therefore, there would still be the issue somewhere else.

  11764   Sat Nov 14 00:52:25 2015 KojiSummaryCDSInvestion on EPICS freeze

[Yutaro, Koji]

Recently "EPICS Freeze" is so frequent and the normal work on the MEDM screen became almost impossible.

As a part of the investigation, all 19 realtime processes were stopped in order to see any effect on the probem.

IN FACT, when the realtime processes were absent (still slow machines were running), frequency of EPICS Freeze
became much less. This might mean that the issue is related to the data collection of the slow channels. We need more investigation.

After the testing, all the processes were restored although it was not so straghtforward. Particularly, c1sus DAC had an error which was
not visible from the CDS Status screen. We noticed it as suspension damping was not effective on any of the vertex suspensions.
This has been solved by restarting c1x02 process.

  11765   Sun Nov 15 22:43:48 2015 KojiSummaryPSLPMC LO degraded, usual ERA-5 replacement, LO recovered

I think the IMC locking was somewhat improved. Still it is not solid as long time before.

Before the PMC fix (attachment 1)
After the PMC fix (attachment 2)

To do
- PMC loop inspection / phase check / spectral measurements
- PMC / IMC interaction
- IMC loop check
 

  11768   Mon Nov 16 19:05:59 2015 KojiSummaryPSLPMC servo circuit review, follow up measurements

PMC follow up measurements have been done. The servo circuit was reviewed.

Now the PMC, IMC, X/Y arms are locked and aligned waiting for the IFO work although I still think something is moving (ITMX?)
as the FPMI fringe is quite fast.

  11769   Mon Nov 16 21:32:49 2015 KojiSummaryPSLPMC servo circuit review, follow up measurements

The result of the precise inspection for the PMC servo board for the 40m was done.

The record, including the photo of the board, can be found at https://dcc.ligo.org/D1400221-v2

- I found some ceramic 1uF caps are used in the signal path. They have been replaced with film caps by WIMA.

- In later measurements with the openloop TF measurement, it was found that the notch frequency (14.6kHz) was off from the a sharp PZT resonance at 12.2kHz.
I replaced the combined caps of 1220pF to 1742pF. This resulted nice agreement of the notch freq with the PZT resonant freq.


Past related elogs:

SRA-3MH mixer installed in 2009: http://nodus.ligo.caltech.edu:8080/40m/1502

R20 increased for more LO Mon gain: http://nodus.ligo.caltech.edu:8080/40m/10172

  11775   Tue Nov 17 16:21:10 2015 KojiSummaryPSLPMC servo circuit review, follow up measurements

I'm still analyzing the open loop TF data. Here I report some nominal settings of the PMC servo

Nominal phase setting: 5.7
Nominal gain setting: 3dB

After the tuning of the notch frequency, I thought I could increase the gain from 5dB to 9dB.
However, after several hours of the modification, the PMC servo gradually started to have oscillation.
This seemed to be mitigated by reducing the gain down to 4dB. This may mean that the notch freq got drifted away
due to themperature rise in the module. PA85 produce significant amount of heat.

(The notch frequency did not change. Just the 22kHz peak was causing the oscillation.)

  11776   Tue Nov 17 20:40:08 2015 yutaroSummaryASCLoss maps of arm cavities

In preparation for the measurement of loss maps of arm cavities, I measured the relationship between:

the offset just after the demodulation of dithering loop (C1:ASS-YARM_ETM_PIT_L_DEMOD_I_OFFSET and C1:ASS-YARM_ETM_YAW_L_DEMOD_I_OFFSET)

and

the angle of ETMY measured with oplev (C1:SUS-ETMY_OL_PIT_INMONC1:SUS-ETMY_OL_PIT_INMON and C1:SUS-ETMY_OL_PIT_INMONC1:SUS-ETMY_OL_PIT_INMON)

while the dithering script is running. With the angle of ETMY, we can calculate the beam spot on the ETMY assuming that the beam spot on the ITMY is not changed thanks to the dithering. What we have to do is to check the calbration of oplev with another way to measure the angle, to see if the results are reliable or not.

I will report the results later.

  11780   Wed Nov 18 16:51:58 2015 KojiSummaryPSLPMC servo calibration

Summary

The PMC servo error (MIX OUT MON on the panel) and actuation (HV OUT MON) have been calibrated using the swept cavity.

Error signal slope in round-trip displacement: 2.93e9 +/- 0.05e9 [V/m]
HV OUT calibration (round-trip displacement): 5.36e-7 +/- 0.17e-7 [m/V]
PZT calibration (round-trip displacement): 10.8 +/- 0.3 [nm/V]
=> corresponds to ~2.5 fringes for 0~250V full range => not crazy

Measurement condition

The transmission level: 0.743V (on the PMC MEDM screen)
LO level: ~13dBm (after 3dB attenuation)
Phase setting: 5.7
PMC Servo gain: 7dB during the measurement (nominal 3dB)

Method

- Chose PMC actuation "BLANK" to disable servo
- Connect DS345 function generator to EXT DC input on the panel
- Monitor "MIX OUT MON" and "HV OUT MON" with an oscilloscope
- Inject a triangular wave with ~1Vpp@1 or 2Hz with appropriate offset to see the cavity resonance at about the middle of the sweep.
  The frequency of the sweep was decided considering the LPF corner freq formed by the output impedance and the capacitance of the PZT. (i.e. 11.3Hz, see next entry)

Result

- 4 sweep was taken (one 2Hz seep, three 1Hz sweep)
- The example of the sweep is shown in the attachment.
- The input triangular wave and the PDH slopes were fitted by linear lines.
- Spacing between the sideband zero crossing corresponds to twice of the modulation frequency (2x35.5MHz = 71MHz)
- The error signal slope was calibrated as V/MHz
- FSR of the PMC is given by google https://www.google.com/search?q=LIGO+pmc.m
  => Cavity round trip length is 0.4095m, FSR is 732.2MHz
- Convert frequency into round-trip displacement

- Convert HV OUT MON signal into displacement in the same way.
- The voltage applied to the PZT element is obtained considering the ratio of 49.6 between the actual HV and the HV OUT MON voltage.

  11781   Wed Nov 18 16:53:17 2015 KojiSummaryPSLPMC PZT capacitance

The PMC PZT capacitance was measured.
- Turn off the HV supplies. Disconnect HV OUT cable.
- Make sure the cable is discharged.
- Measure the capacity at the cable end with an impedance meter.
=> The PMC PZT capacitance at the cable end was measured to be 222nF
Combined with the output impedance of 63.3kOhm, the LPF pole is at 11.3Hz

  11782   Wed Nov 18 17:09:22 2015 KojiSummaryPSLPMC Servo analysis

Summary

The PMC servo was analysed. OLTF was measured and modeled by ZPK (Attachment 1). The error and actuator signals were calibrated in m/rtHz (Attachment 2)

Measurement methods

OLTF:

- The PMC servo board does not have dedicated summing/monitor points for the OLTF measurement. Moreover the PZT HV output voltage is monitored with 1/49.6 attenuation.
  Therefore we need a bit of consideration.

- The noise injection can be done at EXT DC.
- Quantity (A): Transfer function between HV OUT MON and MIX OUT MON with the injection.
  We can measure the transfer function between the HV OUT (virtual) and the MIX OUT. (HV OUT->MIX OUT). In reality, HV OUT is attenuated by factor of 49.6.
  i.e. A = (HV_OUT->MIX_OUT)*49.6
- Quantity (B): Transfer function between HV OUT MON and MIX OUT MON without the injection.
 
This is related to the transfer function between the MIX OUT and HV OUT. In reality, HV OUT is attenuated. 
  i.e. B = 1/((MIX_OUT->HV_OUT)/49.6)

- What we want to know is HV_OUT->MIX_OUT->HV_OUT. i.e. A/B = (HV_OUT->MIX_OUT*49.6)*((MIX_OUT->HV_OUT)/49.6) = HV_OUT->MIX_OUT->HV_OUT

PSD:

- The MIX OUT and HV OUT spectra have been measured. The MIX OUT was calibrated with the calibration factor in the previous entry. This is the inloop stability estimation.
  From the calibrated MIX OUT and HV OUT, the free running stability of the cavity was estimated, by mutiplying with |1-OLTF| and |1-1/(1-OLTF)|, respectively, in order to recover
  the free running motion.

OLTF Modeling

Here is the model function for the open loop TF. The first line comes from the circuit diagram. The overall factor was determined by eye-fit.
The second and third lines are to reproduce the peak/notch feature at 12kHz. The fourth line is to reproduce 28kHz feature.
The LPF right after the mixer was analyzed by a circuit simulation (Circuit Lab). It can be approximated as 150kHz LPF as the second pole
seems to come at 1.5MHz.

The sixth line comes from the LPF formed by the output resistance and the PZT capacitance.

The seventh line is to reproduce the limit by the GBW product of OP27. As the gain is 101 in one of the stages,
it yields the pole freq of ~80kHz. But it is not enough to explain the phase delay at low frequency. Therefore this
discrepancy was compensated by empirical LPF at 30kHz.

function cmpOLTFc = PMC_OLTF_model(freqOLTFc)

cmpOLTFc = -7e5*pole1(freqOLTFc,0.162).*zero1(freqOLTFc,491)... % from the circuit diagram
    .*zero2(freqOLTFc,12.5e3,100)... % eye-fit
    .*pole2(freqOLTFc,12.2e3,6)... % eye-fit
    .*pole2(freqOLTFc,27.8e3, 12)... % eye-fit
    .*pole1(freqOLTFc,150e3)... % Mixer LPF estimated from Circuit Lab Simulation
    .*pole1(freqOLTFc,11.3)... % Output Impedance + PZT LPF
    .*pole1(freqOLTFc,8e6/101)... % GBW OP27
    .*pole1(freqOLTFc,3e4); % Unknown

end

Result

Attachment 1:

The nominal OLTF (Nov 17 data) shows the nominal UGF is ~1.7kHz and the phase margin of ~60deg.

The measured OLTF was compared with the modelled OLTF. In the end they show very sufficient agreement for further calibration.
The servo is about to be instable at 28kHz due to unknown series resonance. Later in the same day, the gain of the PMC loop had to be
reduced from 7dB to 3dB to mitigate servo oscillation. It is likely that this peak caused the oscillation. The notch frequency was measured
next day and it showed no sign of frequncy drift. That's good.

We still have some phase to reduce the high freq peaks by an LPF in order to increase the over all gain.

Attachment 2:

The red curve shows the residual floor displacement of 2~10x10-15 m/rtHz. Below 4Hz there is a big peak. I suspect that I forgot to close
the PSL shutter and the IMC was locked during the measurement. Then does this mean the measured noise corresponds to the residual laser
freq noise or the PMC cavity displacement? This is interesting to see.

The estimated free running motion from the error and actuation signals agrees very well. This ensures the precision of the caibration in the precious entries.
 

 

  11792   Fri Nov 20 09:45:39 2015 steveSummarySUSoplev laser summary

 

Quote:

 

Quote:

 

      May  13, 2014             ETMX,  .............laser in place 90 d

      May  22, 2012             ETMY, 

     Oct.  7,  2013             ETMY,  LT  503 d  or  1.4 y............bad beam quality ?

     Aug. 8,  2014              ETMY,  .............laser in place   425 days  or  1.2 y

 

      Sept. 5, 2014              new 1103P, sn P893516  installed at SP table for aLIGO oplev use qualification

 

ELOG V3.1.3-