40m Log
ID Date Author Type Category Subject
  16299   Wed Aug 25 18:20:21 2021   Jamie   Update   CDS   GPS time on fb1 fixed, daqd writing correct frames again

I have no idea what happened to the GPS timing on fb1, but it seems like the issue was coincident with the power glitch on Monday.

As was noted by Koji above, the GPS time kernel interface was off by a year, which was causing the frame builder to write out files with the wrong names.  fb1 was using DAQD components from the advligorts 3.3 release, which used the old "symmetricom" kernel module for the GPS time.  This old module was also known to have issues with time offsets.  This issue is reminiscent of previous timing issues with the DAQ on fb1.

I noted that a newer version of advligorts, version 3.4, was available for Debian jessie, the system running on fb1.  advligorts 3.4 includes a newer version of the GPS time module, renamed gpstime.  I checked with Jonathan Hanks that the interfaces did not change between 3.3 and 3.4, and that 3.4 was mostly a bug fix and packaging release, so I decided to upgrade the DAQ to get the new components.  I therefore did the following:

  • updated the archive info in /etc/apt/sources.list.d/cdssoft.list, and added the "jessie-restricted" archive which includes the mx packages: https://git.ligo.org/cds-packaging/docs/-/wikis/home

  • removed the symmetricom module from the kernel

    sudo rmmod symmetricom

  • upgraded the advligorts-daqd components (NOTE I did not upgrade the rest of the system, although there are outstanding security upgrades needed):

    sudo apt install advligorts-daqd advligorts-daqd-dc-mx

  • loaded the new gpstime module and checked that the GPS time was correct:

    sudo modprobe gpstime

  • restarted all the daqd processes

    sudo systemctl restart daqd_*

Everything came up fine at that point, and I checked that the correct frames were being written out.
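
One way to confirm that the frame builder is naming files with the right GPS time is to parse the start time out of a frame file name and compare it with a trusted GPS time. The sketch below assumes the <site>-<type>-<GPS start>-<duration>.gwf naming pattern seen in C-R-1282268576-16.gwf (entry 16294). The function names and the 600 s tolerance are hypothetical and are not part of advligorts.

```python
import re

# Assumed naming pattern: <site>-<type>-<gps_start>-<duration>.gwf
FRAME_RE = re.compile(
    r"^(?P<site>[A-Z]+)-(?P<tag>[A-Za-z0-9_]+)-(?P<gps>\d+)-(?P<dur>\d+)\.gwf$"
)


def frame_gps_span(filename):
    """Return (gps_start, gps_end) encoded in a frame file name."""
    match = FRAME_RE.match(filename)
    if match is None:
        raise ValueError("not a recognized frame file name: %r" % filename)
    start = int(match.group("gps"))
    return start, start + int(match.group("dur"))


def frame_is_current(filename, gps_now, tolerance_s=600):
    """True if gps_now lies within tolerance_s of the span the file claims to cover."""
    start, end = frame_gps_span(filename)
    return start - tolerance_s <= gps_now <= end + tolerance_s


# Values quoted in entry 16294: the file name is about one year behind.
print(frame_is_current("C-R-1282268576-16.gwf", gps_now=1313890914))
```

Here gps_now would come from a clock that is known to be good (not from /proc/gps on the machine under test).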

  16298   Wed Aug 25 17:31:30 2021   Paco   Update   CDS   FB is writing the frames with a year old date

[paco, tega, koji]

After invaluable assistance from Jamie in fixing this yearly offset in the gps time reported by cat /proc/gps, we managed to restart the real time system correctly (while still manually synchronizing the front end machine times). After this, we recovered the mode cleaner and were able to lock the arms with not much fuss.

Nevertheless, tega and I noticed some weird noise in the C1:LSC-TRX_OUT which was not present in the YARM transmission, and that is present even in the absence of light (we unlocked the arms and just saw it on the ndscope as shown in Attachment #1). It seems to affect the XARM and in general the lock acquisition...

We took a quick spectrum with diaggui (Attachment #2), but it doesn't look normal; there seems to be broadband excess noise with a remarkable 1 kHz component. We will probably look into it in more detail.

Attachment 1: TRX_noise_2021-08-25_17-40-55.png
Attachment 2: TRX_TRY_power_spectra.pdf
  16297   Wed Aug 25 11:48:48 2021   Yehonathan   Update   CDS   c1auxey assembly

After confirming that leaving the RTN connection floating can indeed cause reliability issues, we decided to make these connections in the c1auxex analog input units.

According to Johannes' wiring scheme (excluding the anti-image and OPLEV since they are decommissioned), Acromag unit 1221b accepts analog inputs from two modules. All of these channels are single-ended according to their schematics.

One option is to use the Acromag ground and connect it to the RTNs of both 1221b and 1221c. Another is to connect the minus wire of one module, which is tied to the module's ground, to the RTN. We shouldn't tie the grounds of the different modules together by connecting them to the same RTN point.

We should take some OSEM spectra of the X end arm before and after this work to confirm we didn't produce more noise by doing so. Right now, it is impossible due to issues caused by the recent power surge.

Quote:

{Yehonathan, Jon}

We poked around the c1auxex chassis (looked in situ with a flashlight, not disturbing any connections) to better understand the wiring scheme.

To our surprise, we found that nothing was connected to the RTNs of the analog input Acromag modules. From previous experience and the Acromag manual, there can't be any meaningful voltage measurement without it.

 

  16296   Wed Aug 25 08:53:33 2021   Jordan   Update   SUS   2" Adapter Ring for SOS Arrived 8/24/21

8 of the 2"->3" adapter rings (D2100377) arrived from RDL yesterday. I have not tested the threads but dimensional inspection on SN008 cleared. Parts look very good. The rest of the parts should be shipping out in the next week.

Attachment 1: 20210824_152259.jpg
Attachment 2: 20210824_152259.jpg
Attachment 3: 20210824_152308.jpg
  16295   Tue Aug 24 22:37:40 2021   Anchal   Update   General   Time synchronization not really working

I attempted to install chrony and run it on one of the FE machines. It didn't work and in doing so, I lost the working NTP client service on the FE computers as well. Following are some details:

  • I added the following two mirrors in the apt source list of root.jessie at /etc/apt/sources.list
    deb http://ftp.us.debian.org/debian/ jessie main contrib non-free
    deb-src http://ftp.us.debian.org/debian/ jessie main contrib non-free
  • Then I installed chrony in the root.jessie using
    sudo apt-get install chrony
    • I was getting an error E: Can not write log (Is /dev/pts mounted?) - posix_openpt (2: No such file or directory) . To fix this, I had to run:
      sudo mount -t devpts none "$rootpath/dev/pts" -o ptmxmode=0666,newinstance
      sudo ln -fs "pts/ptmx" "$rootpath/dev/ptmx"
    • Then, I had another error to resolve.
      Failed to read /proc/cmdline. Ignoring: No such file or directory
      start-stop-daemon: nothing in /proc - not mounted?
      To fix this, I had to exit to fb1 and run:
      sudo mount --bind /proc /diskless/root.jessie/proc
    • With these steps, chrony was finally installed, but I immediately saw an error message saying:
      Starting /usr/sbin/chronyd...
      Could not open NTP sockets
  • I figured this must be due to ntp running in the FE machines.  I logged into c1iscex and stopped and disabled the ntp service:
    sudo systemctl stop ntp
    sudo systemctl disable ntp
    • I saw some error messages from the above commands, as the FEs have read-only file systems:
      Synchronizing state for ntp.service with sysvinit using update-rc.d...
      Executing /usr/sbin/update-rc.d ntp defaults
      insserv: fopen(.depend.stop): Read-only file system
      Executing /usr/sbin/update-rc.d ntp disable
      update-rc.d: error: Read-only file system
    • So I went back to the chroot on fb1 and ran the two commands above that failed:
      /usr/sbin/update-rc.d ntp defaults
      /usr/sbin/update-rc.d ntp disable
    • The last line gave the output:
      insserv: warning: current start runlevel(s) (empty) of script `ntp' overrides LSB defaults (2 3 4 5).
      insserv: warning: current stop runlevel(s) (2 3 4 5) of script `ntp' overrides LSB defaults (empty).
    • I ignored this and moved forward.
  • I copied the chronyd.service from nodus to the chroot in fb1 and configured it to use nodus as the server. Then I started the chronyd.service

    sudo systemctl status chronyd.service
    but got the same NTP sockets error.

    ● chronyd.service - NTP client/server
       Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled)
       Active: failed (Result: exit-code) since Tue 2021-08-24 21:52:30 PDT; 5s ago
      Process: 790 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=1/FAILURE)

    Aug 24 21:52:29 c1iscex systemd[1]: Starting NTP client/server...
    Aug 24 21:52:30 c1iscex chronyd[790]: Could not open NTP sockets
    Aug 24 21:52:30 c1iscex systemd[1]: chronyd.service: control process exited, code=exited status=1
    Aug 24 21:52:30 c1iscex systemd[1]: Failed to start NTP client/server.
    Aug 24 21:52:30 c1iscex systemd[1]: Unit chronyd.service entered failed state.

  • I tried a few things to resolve this, but couldn't get it to work. So I gave up on using chrony and decided to at least go back to the ntp service.

  • I stopped, disabled and checked status of chrony:
    sudo systemctl stop chronyd
    sudo systemctl disable chronyd
    sudo systemctl status chronyd
    This gave the output:

    ● chronyd.service - NTP client/server
       Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled)
       Active: failed (Result: exit-code) since Tue 2021-08-24 22:09:07 PDT; 25s ago

    Aug 24 22:09:07 c1iscex systemd[1]: Starting NTP client/server...
    Aug 24 22:09:07 c1iscex chronyd[2490]: Could not open NTP sockets
    Aug 24 22:09:07 c1iscex systemd[1]: chronyd.service: control process exited, code=exited status=1
    Aug 24 22:09:07 c1iscex systemd[1]: Failed to start NTP client/server.
    Aug 24 22:09:07 c1iscex systemd[1]: Unit chronyd.service entered failed state.
    Aug 24 22:09:15 c1iscex systemd[1]: Stopped NTP client/server.

  • I went back to fb1 chroot and removed chrony package and deleted the configuration files and systemd service files:
    sudo apt-get remove chrony

  • But when I started the ntp daemon service again on c1iscex, it gave an error:
    sudo systemctl restart ntp
    Job for ntp.service failed. See 'systemctl status ntp.service' and 'journalctl -xn' for details.

  • Status shows:

    sudo systemctl status ntp
    ● ntp.service - LSB: Start NTP daemon
       Loaded: loaded (/etc/init.d/ntp)
       Active: failed (Result: exit-code) since Tue 2021-08-24 22:09:56 PDT; 9s ago
      Process: 2597 ExecStart=/etc/init.d/ntp start (code=exited, status=5)

    Aug 24 22:09:55 c1iscex systemd[1]: Starting LSB: Start NTP daemon...
    Aug 24 22:09:56 c1iscex systemd[1]: ntp.service: control process exited, code=exited status=5
    Aug 24 22:09:56 c1iscex systemd[1]: Failed to start LSB: Start NTP daemon.
    Aug 24 22:09:56 c1iscex systemd[1]: Unit ntp.service entered failed state.

  • I tried to re-enable the ntp service with sudo systemctl enable ntp. I got read-only filesystem error messages similar to the earlier ones:
    Synchronizing state for ntp.service with sysvinit using update-rc.d...
    Executing /usr/sbin/update-rc.d ntp defaults
    insserv: warning: current start runlevel(s) (empty) of script `ntp' overrides LSB defaults (2 3 4 5).
    insserv: warning: current stop runlevel(s) (2 3 4 5) of script `ntp' overrides LSB defaults (empty).
    insserv: fopen(.depend.stop): Read-only file system
    Executing /usr/sbin/update-rc.d ntp enable
    update-rc.d: error: Read-only file system

    • I went back to chroot in fb1 and ran:
      /usr/sbin/update-rc.d ntp defaults
      insserv: warning: current start runlevel(s) (empty) of script `ntp' overrides LSB defaults (2 3 4 5).
      insserv: warning: current stop runlevel(s) (2 3 4 5) of script `ntp' overrides LSB defaults (empty).
      and
      /usr/sbin/update-rc.d ntp enable

  • I came back to c1iscex and tried restarting the ntp service but got same error messages as above with exit code 5.

  • I checked c1sus; ntp was running there. I tested the configuration by restarting the ntp service, and it then failed with the same error message. So the remaining three FEs, c1lsc, c1ioo and c1iscey, still have a running ntp service, but they won't be able to restart it.

  • As a last try, I rebooted c1iscex to see if ntp comes back online nicely, but it doesn't.

Bottom line, I went to try chrony in the FEs, and I ended up breaking the ntp client services on the computers as well. We have no NTP synchronization in any of the FEs.

Even though Paco and I are learning about the ntp and cds stuff, I think it's time we get help from someone with real experience. The lab has not been in a good state for far too long.

Quote:

tl;dr: NTP servers and clients were never synchronized, are not synchronizing even with ntp... nodus is synchronized but uses chronyd; should we use chronyd everywhere?

 

  16294   Tue Aug 24 18:44:03 2021   Koji   Update   CDS   FB is writing the frames with a year old date

Dan Kozak pointed out that the new frame files of the 40m have not been written with 2021 GPS times but with 2020 GPS times.

Current GPS time is 1313890914 (or something like that), but the new files are written as C-R-1282268576-16.gwf

I don't know how this can happen, but it may explain why we can't get agreement between the FB GPS time and the RTS GPS time.

(dataviewer seems dependent on the FB GPS time and it indicates 2020 date. DTT/diaggui does not.)


This is the way to check the gpstime on fb1. It's apparently a year off.

controls@fb1:~ 0$ cat /proc/gps
1282269402.89
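
To see which calendar date a GPS reading corresponds to, it can be converted to UTC by hand. This is a minimal sketch that assumes a fixed GPS-UTC offset of 18 leap seconds (the value in effect since 2017) and uses the two numbers quoted above; the helper name gps_to_utc is illustrative.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
LEAP_SECONDS = 18  # assumption: GPS-UTC offset valid for 2020 and 2021


def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a UTC datetime using a fixed leap-second offset."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - LEAP_SECONDS)


reported = gps_to_utc(1282269402.89)  # value read from /proc/gps on fb1
expected = gps_to_utc(1313890914)     # approximate current GPS time from the entry
offset_days = (expected - reported).total_seconds() / 86400.0

print("fb1 reports :", reported.isoformat())
print("should be   :", expected.isoformat())
print("offset [day]:", round(offset_days, 2))
```

The offset comes out at about 366 days, which is one year because 2020 was a leap year.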

Attachment 1: Screen_Shot_2021-08-24_at_18.46.24.png
  16293   Tue Aug 24 18:11:27 2021   Paco   Update   General   Time synchronization not really working

tl;dr: NTP servers and clients were never synchronized, are not synchronizing even with ntp... nodus is synchronized but uses chronyd; should we use chronyd everywhere?


Spent some time investigating the ntp synchronization. In the morning, after Anchal set up all the ntp servers / FE clients I tried restarting the rts IOPs with no success. Later, with Tega we tried the usual manual matching of the date between c1iscex and fb1 machines but we iterated over different n-second offsets from -10 to +10, also without success.

This afternoon, I tried debugging the FE and fb1 timing differences. For this I inspected the ntp configuration files, /etc/ntp.conf on fb1 and /diskless/root.jessie/etc/ntp.conf (for the FE machines), and tried different combinations with and without nodus, and with and without restrict lines, all while looking at the output of sudo journalctl -f on c1iscey. Every time I changed the ntp config file, I restarted the service using sudo systemctl restart ntp.service.

Looking through some online forums, people suggested basic pinging to see if the ntp servers were up (and broadcasting their times over the local network), but ping failed to run (read-only filesystem), so I went into fb1 and ran sudo chroot /diskless/root.jessie/ /bin/bash to allow me to change file permissions. The first test was with /bin/ping, which couldn't even open a socket (root access needed): I ran chmod 4755 /bin/ping, then ssh-ed into c1iscey and pinged the fb1 machine successfully. After this, I ran chmod 4755 /usr/sbin/ntpd so that the ntp daemon would have no problem reaching the server, in case this was blocking the synchronization. I exited the chroot shell and restarted the ntp daemon on c1iscey, but ntpstat still showed unsynchronised status.

I also learned that when running an ntp query with ntpq -p, a client that has succeeded in synchronizing its time to a server marks that peer with an asterisk at the start of its line. This was not the case on any FE machine... and looking at fb1, it was not true there either. Although the fb1 peers are correctly listed as nodus, the Caltech ntp server, and a broadcast (.BCST.) server from local time (meant to serve the FE machines), none appears to have synchronized... Going one level further, on nodus I checked the time synchronization servers by running chronyc sources; the output shows

controls@nodus|~> chronyc sources
210 Number of sources = 4
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* testntp1.superonline.net      1  10   377   280  +1511us[+1403us] +/-   92ms
^+ 38.229.59.9                   2  10   377   206  +8219us[+8219us] +/-  117ms
^+ tms04.deltatelesystems.ru     2  10   377   23m    -17ms[  -17ms] +/-  183ms
^+ ntp.gnc.am                    3  10   377   914  -8294us[-8401us] +/-  168ms

I then ran chronyc clients to find if fb1 was listed (as I would have expected) but the output shows this --

Hostname                   Client    Peer CmdAuth CmdNorm  CmdBad  LstN  LstC
=========================  ======  ======  ======  ======  ======  ====  ====
501 Not authorised

So clearly chronyd succeeded in synchronizing nodus' time to whatever server it was pointed at, but downstream from there, neither fb1 nor any of the FE machines seems to be synchronizing properly. It may be as simple as figuring out the correct ntp configuration file, or switching to chronyd for all machines (for the sake of homogeneity?)
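
A more direct test than ping is to send a single SNTP request to the server and look at the reply header, which says whether the server itself considers its clock synchronized. This is a minimal sketch under stated assumptions: the packet layout follows the standard 48-byte NTP header, the function names are hypothetical, and 192.168.113.201 is the fb1 address quoted in entry 16285. A reply with leap indicator 3 or stratum 0 means the server is answering but is not synchronized.

```python
import socket
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request():
    """48-byte client request: LI=0, VN=3, Mode=3 (client), everything else zero."""
    return b"\x1b" + 47 * b"\0"


def parse_response(data):
    """Extract leap indicator, stratum and transmit time from an NTP reply."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    fields = struct.unpack("!BBbb11I", data[:48])
    leap = fields[0] >> 6
    stratum = fields[1]
    tx_seconds, tx_fraction = fields[13], fields[14]
    return {
        "leap": leap,
        "stratum": stratum,
        "unix_time": tx_seconds - NTP_UNIX_OFFSET + tx_fraction / 2.0 ** 32,
        "synchronized": leap != 3 and stratum != 0,
    }


def query(host, timeout=2.0):
    """Send one request to host:123 over UDP and parse the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, 123))
        data, _ = sock.recvfrom(512)
    return parse_response(data)


# Example (needs network access to the server):
#   print(query("192.168.113.201"))
```

A timeout from query() would point to a network or firewall problem, while a reply with synchronized set to False would point to the server's own upstream configuration.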

  16292   Tue Aug 24 09:22:48 2021   Anchal   Update   General   Time synchronization working now

Jamie told me to use chroot to get into the chroot jail of the Debian OS that is exported to the FEs and install ntp there. I took the following steps, at the end of which all FEs have NTP synchronized.

  • I logged into fb1 through nodus.
  • chroot /diskless/root.jessie /bin/bash took me to the bash terminal for debian os that is exported to all FEs.
  • Here, I ran sudo apt-get install ntp which ran without any errors.
  • I then edited the file /etc/ntp.conf; I removed the default servers and added the following lines for servers (fb1 and nodus ip addresses):
    server 192.113.168.201
    server 192.113.168.201
  • I logged into each FE machine and ran following commands:
    sudo systemctl stop systemd-timesyncd.service; sudo systemctl status systemd-timesyncd.service;
    timedatectl; sleep 2;sudo systemctl daemon-reload;  sudo systemctl start ntp; sleep 2; sudo systemctl status ntp; timedatectl
    sudo hwclock -s
    • The first line ensures that systemd-timesyncd.service is not running anymore. I did not uninstall timesyncd and left its configuration file as it is.
    • The second line first shows the times of the local and RTC clocks. Then it reloads the daemon services to get ntp registered. Then it starts ntp.service and shows its status. Finally, the timedatectl command shows the synchronized clocks and that NTP synchronization has occurred.
    • The last line sets the local clock same as RTC clock. Even though this wasn't required as I saw that the clocks were already same to seconds, I just wanted a point where all the local clocks are synchronized to the ntp server.
  • Hopefully, this will resolve our issue of restarting the models any time some glitch happens or when we need to update something in one of them.

Edit Tue Aug 24 10:19:11 2021:

I also disabled timesyncd on all FEs using sudo systemctl disable systemd-timesyncd.service

I've added this wiki page for summarizing the NTP synchronization knowledge.

  16291   Mon Aug 23 22:51:44 2021   Anchal   Update   General   Time synchronization efforts

Related elog thread: 16286


I didn't really achieve anything but I'm listing what I've tried.

  • I know now that the timesyncd isn't working because systemd-timesyncd is known to have issues when running on a read-only file system. In particular, the service does not have privileges to change the clock or drift settings at /run/systemd/clock or /etc/adjtime.
  • The workarounds to these problems are poorly rated/reviewed on stack exchange and require me to change the /etc/systemd/timesyncd.conf file, but I'm unable to edit this file.
  • I know that Paco was able to change these files earlier, as the files are now changed and configured to follow a debian ntp pool server, which won't work as the FEs do not have internet access. So the conf file needs to be restored to using ntpserver as the ntp server.
  • From system messages, the ntpserver is recognized by the service as shown in the second part of 16285. I really think the issue is in file permissions. The file /etc/adjtime has never been updated since 2017.
  • I got help from Paco on how to edit files for FE machines. The FE machines' directories are exported from fb1:/diskless/root.jessie/
  • I restored the /etc/systemd/timesyncd.conf file to how it was before, with just the servers=ntpserver line. I restarted the timesyncd service on all FEs, but the synchronization did not happen.
  • I tried a few suggestions from stackexchange but none of them worked. The only rated solution creates a tmpfs directory outside of the read-only filesystem and uses that to run timesyncd. So, in my opinion, timesyncd would never work on our diskless read-only file system FE machines.
  • One issue in an archlinux discussion ended with the questioner resorting to openntpd from the OpenBSD distribution. The user claimed that openntpd is simple enough that it can run ntp synchronization on a read-only file system.
  • Somewhat painfully, I 'kind of' installed the openntpd tool in the fb1:/diskless/root.jessie directory following directions from here. I had to manually add a user and group for the FEs (which I might not have done correctly). I was not able to get the openntpd daemon to start properly after some tries.
  • I restored everything back to how it was and restarted timesyncd in c1sus even though it would not do anything really.
Quote:

This time no matter how we try to set the time, the IOPs do not run with "DC status" green. (We kept having 0x4000)

 

  16290   Mon Aug 23 19:00:05 2021   Koji   Update   General   Campus Wide Power Glitch Reported: Monday, 8/23/21 at 9:30am

Restarting the RTS was unsuccessful because of the timing discrepancy error between the RT machines and the FB. This time no matter how we try to set the time, the IOPs do not run with "DC status" green. (We kept having 0x4000)

We then decided to work on the recovery without the data recorded. After some burtrestores, the IMC was locked and the spot appeared on the AS port. However, IPC seemed down and no WFS could run.

  16289   Mon Aug 23 15:25:59 2021   Ian MacMillan   Update   CDS   SUS simPlant model

I am adding a State-space block to the SimPlant cds model using the example Chris gave. I made a new folder in controls called SimPlantStateSpace. I used the code below to make a state-space LTI model of a 1D pendulum, then I converted it to a discrete system using the MATLAB c2d function. Then I used these in the rtss.m file to create the state space code I need in the SimPlantStateSpace_1D_model.h file.

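%sys_model.m

Q = 1000;         % quality factor
phi = 1/Q;        % loss angle
g = 9.806;
m = 0.24;         % mass of pendulum
l = 0.248;        % length of pendulum
w_0 = sqrt(g/l);  % resonant angular frequency

f = 16000;        % this is the frequency of the channel that will be used

% continuous-time state space model
A = [0 1; -w_0^2*(1+1i*phi) -w_0/Q]
B = [0; 1/m];
C = [1 0];
D = [0];
sys_dc = ss(A,B,C,D)

% discretize at the channel rate (c2d defaults to zero-order hold)
sys = c2d(sys_dc, 1/f)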

This code outputs the discrete state space that is added to the header file attached.
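
As a cross-check on the numbers that go into the header file, the zero-order-hold discretization can be reproduced outside MATLAB. This is a sketch under stated assumptions, not the rtss.m procedure: it drops the complex structural-damping term and keeps only the real viscous term -w_0/Q so that all matrices stay real, and it computes the matrix exponential with a plain Taylor series, which is adequate here because the time step is tiny. The helper names are hypothetical.

```python
import math


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def expm_taylor(M, terms=20):
    """Matrix exponential by truncated Taylor series (fine for small norm of M)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def c2d_zoh(A, B, dt):
    """Zero-order-hold discretization via exp of the augmented matrix [[A, B], [0, 0]] * dt."""
    n = len(A)
    M = [[A[i][j] * dt for j in range(n)] + [B[i] * dt] for i in range(n)]
    M.append([0.0] * (n + 1))
    E = expm_taylor(M)
    Ad = [row[:n] for row in E[:n]]
    Bd = [row[n] for row in E[:n]]
    return Ad, Bd


# Parameters from sys_model.m (real-valued damping only; see assumptions above)
Q, g, m, l, f = 1000.0, 9.806, 0.24, 0.248, 16000.0
w_0 = math.sqrt(g / l)
A = [[0.0, 1.0], [-w_0 ** 2, -w_0 / Q]]
B = [0.0, 1.0 / m]
Ad, Bd = c2d_zoh(A, B, 1.0 / f)
print(Ad)
print(Bd)
```

Because the sample period is much shorter than the pendulum period, Ad stays close to the identity plus A/f and Bd close to B/f, which gives a quick sanity check on the coefficients.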

Attachment 1: SimPlantStateSpace.zip
  16288   Mon Aug 23 11:51:26 2021   Koji   Update   General   Campus Wide Power Glitch Reported: Monday, 8/23/21 at 9:30am

Campus Wide Power Glitch Reported: Monday, 8/23/21 at 9:30am (more like 9:34am according to nodus log)

nodus: rebooted. ELOG/apache/svn is running. (looks like Anchal worked on it)

chiara: survived the glitch thanks to UPS

fb1: not responding -> @1pm open to login / seemed rebooted only at 9:34am (network path recovered???)

megatron: not responding

optimus: no route to host

c1aux: ping ok, ssh not responding -> needed to use telnet (vme / vxworks)
c1auxex: ssh ok
c1auxey: ping ok, ssh not responding -> needed to use telnet (vme / vxworks)
c1psl: ping NG, power cycled the switch on 1X2 -> ssh OK now
c1iscaux: ping NG -> rebooted the machine -> ssh recovered

c1iscaux2: does not exist any more
c1susaux: ping NG -> responds after 1X2 switch reboot

c1pem1: telnet ok (vme / vxworks)
c1iool0: does not exist any more

c1vac1: ethernet service restarted locally -> responding
ottavia: does not exist?
c1teststand: ping ok, ssh not responding

3:20PM we started restarting the RTS

  16287   Mon Aug 23 10:17:21 2021   Paco   Summary   Computers   system reboot glitch

[paco]

At 09:34 PST I noted a glitch in the controls room as the machines went down except for c1ioo. Briefly, the video feeds disappeared from the screens, though the screens themselves didn't lose power. At first I thought this was some kind of power glitch, but upon checking with Jordan, it most likely was related to some system crash. Coming back to the controls room, I could see the MC reflection beam swinging, but unfortunately all the FE models came down. I noticed that the DAQ status channels were blank.

I ssh-ed into c1ioo with no problem and ran "rtcds stop c1ioo c1als c1omc", then "rtcds restart c1x03" to do a soft restart. This worked, but the DAQ status was still blank. I then tried to ssh into c1sus and c1lsc without success; similarly, c1iscex and c1iscey were unreachable. I did a hard restart on c1iscex by switching it off, then its extension chassis, then unplugging the power cords, then reversing these steps, and could then ssh into it from rossa. I ran "rtcds start c1x01" and saw the same blank DAQ status. I noticed the elog was also down... so nodus was also affected?

[paco, anchal]

Anchal got on zoom to offer some assistance. We discovered that the fb1 and nodus were subject to some kind of system reboot at precisely 09:34. The "systemctl --failed" command on fb1 displayed both the daqd_dc.service and rc-local.service as loaded but failed (inactive). Is it a good idea to try and reboot the fb1 machine? ... Anchal was able to bring elog back up from nodus (ergo, this post).

[paco]

Although it probably needs the DAQ service from the fb1 machine to be up and running, I tried running the scripts/cds/rebootC1LSC.sh script. This didn't work. I tried running sudo systemctl restart daqd_dc from the fb1 machine without success. Running systemctl reset-failed "worked" for the daqd_dc and rc-local services on fb1 in the sense that they were no longer listed by systemctl --failed, but they remained inactive (dead) when running systemctl status on them. Following 15303, I succeeded in restarting the daqd services. It turned out I needed to manually start the open-mx and mx services on fb1. I reran the restartC1LSC script without success. The script fails because some machines need to be rebooted by hand.
 

  16286   Fri Aug 20 06:24:18 2021   Anchal   Update   CDS   Time synchronization not running

I read on some stack exchange that the 'NTP synchronized' indicator turns 'yes' in the output of the timedatectl command only when the RTC clock has been adjusted at some point. I also read that timesyncd does not make the change if the time difference is too large, roughly more than 3 seconds.

So I logged into all FE machines and ran sudo hwclock -w to synchronize them all to the system clocks, and then waited to see if timesyncd makes any correction to the RTC. It did not. A few hours later, I found the RTC clocks drifting again from the system clocks. So even if the timesyncd service is running as it should, it is not performing time correction for whatever reason.

Maybe we should try to use some other service?

Quote:
 

The NTP synchronized flag in output of timedatectl command did not change to yes and the RTC is still 3 seconds ahead of the local clock.

 

  16285   Fri Aug 20 00:28:55 2021   Anchal   Update   CDS   Time synchronization not running

I added ntpserver as a known host name for address 192.168.113.201 (fb1's address where ntp server is running) in the martian host list in the following files in Chiara:

/var/lib/bind/martian.hosts
/var/lib/bind/rev.113.168.192.in-addr.arpa

Note: a host name called ntp was already defined at 192.168.113.11 but I don't know what computer this is.

Then, I restarted the DNS on chiara by doing:

sudo service bind9 restart

Then I logged into c1lsc and c1ioo and ran following:

controls@c1ioo:~ 0$ sudo systemctl restart systemd-timesyncd.service

controls@c1ioo:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Fri 2021-08-20 07:24:03 UTC; 53s ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 23965 (systemd-timesyn)
   Status: "Idle."
   CGroup: /system.slice/systemd-timesyncd.service
           └─23965 /lib/systemd/systemd-timesyncd

Aug 20 07:24:03 c1ioo systemd[1]: Starting Network Time Synchronization...
Aug 20 07:24:03 c1ioo systemd[1]: Started Network Time Synchronization.
Aug 20 07:24:03 c1ioo systemd-timesyncd[23965]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:24:35 c1ioo systemd-timesyncd[23965]: Using NTP server 192.168.113.201:123 (ntpserver).
controls@c1ioo:~ 0$ timedatectl
      Local time: Fri 2021-08-20 07:25:28 UTC
  Universal time: Fri 2021-08-20 07:25:28 UTC
        RTC time: Fri 2021-08-20 07:25:31
       Time zone: Etc/UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a

The same output is shown in c1lsc too. The NTP synchronized flag in output of timedatectl command did not change to yes and the RTC is still 3 seconds ahead of the local clock.
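
Checking this by eye on every FE is tedious, so the same two facts (the NTP synchronized flag and the RTC offset) can be pulled out of the timedatectl text automatically. This is a minimal sketch that parses the exact output format shown above for c1ioo; the function names are hypothetical, and newer systemd versions label the fields differently, so the keys would need adjusting there.

```python
from datetime import datetime

# Output copied from the c1ioo session above
SAMPLE = """\
      Local time: Fri 2021-08-20 07:25:28 UTC
  Universal time: Fri 2021-08-20 07:25:28 UTC
        RTC time: Fri 2021-08-20 07:25:31
       Time zone: Etc/UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
"""


def parse_timedatectl(text):
    """Split 'key: value' lines into a dict (splitting at the first colon only)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def _stamp(value):
    """Parse 'Fri 2021-08-20 07:25:28 UTC' (weekday and zone are ignored)."""
    parts = value.split()
    return datetime.strptime(parts[1] + " " + parts[2], "%Y-%m-%d %H:%M:%S")


def clock_report(text):
    """Return the NTP sync flag and the RTC-minus-UTC offset in seconds."""
    fields = parse_timedatectl(text)
    offset = _stamp(fields["RTC time"]) - _stamp(fields["Universal time"])
    return {
        "ntp_synchronized": fields.get("NTP synchronized") == "yes",
        "rtc_minus_utc_s": offset.total_seconds(),
    }


print(clock_report(SAMPLE))
```

In practice the text would come from running timedatectl on each FE (for example over ssh) instead of from the SAMPLE string.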

Then I went to c1sus to see what the status output was before restarting the timesyncd service. I got the following output:

controls@c1sus:~ 0$ sudo systemctl status systemd-timesyncd.service -l
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled)
   Active: active (running) since Tue 2021-08-17 04:38:03 UTC; 3 days ago
     Docs: man:systemd-timesyncd.service(8)
 Main PID: 243 (systemd-timesyn)
   Status: "Idle."
   CGroup: /system.slice/systemd-timesyncd.service
           └─243 /lib/systemd/systemd-timesyncd

Aug 20 02:02:18 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 02:36:27 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:10:35 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 03:44:43 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:18:51 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 04:53:00 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 05:27:08 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:01:16 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 06:35:24 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).
Aug 20 07:09:33 c1sus systemd-timesyncd[243]: Using NTP server 192.168.113.201:123 (ntpserver).

This actually shows that the service was able to find ntpserver correctly at 192.168.113.201 even before I changed the name server file in chiara. So I'm retracting the changes made to name server. They are probably not required.

The timesyncd.conf configuration files are read-only even with sudo. I tried changing permissions but that did not work either. Maybe these files are not correctly configured. The man page of timesyncd says to use the field 'NTP' to give the ntp servers. Our files are using the field 'Servers'. But since we are not getting any error message, I don't think this is the issue here.

I'll look more into this problem.

  16284   Thu Aug 19 14:14:49 2021   Koji   Update   CDS   Time synchronization not running

131.215.239.14 looks like Caltech's NTP server (ntp-02.caltech.edu)
https://webmagellan.com/explore/caltech.edu/28415b58-837f-4b46-a134-54f4b81bee53

I can't say it is correct or not as I did not make the survey at your level. I think you need a few tests of reconfiguring and restarting the NTP clients to see if time synchronization starts. Because the local time is not regulated right now anyway, this operation is safe I think.

 

  16283   Thu Aug 19 03:23:00 2021   Anchal   Update   CDS   Time synchronization not running

I tried to read a bit and understand the NTP synchronization implementation on the FE computers. I'm quite sure that 'NTP synchronized' should be 'yes' in the output of timedatectl on these computers if timesyncd is running correctly. As Koji reported in 15791, this is not the case. I logged into c1lsc, c1sus and c1ioo and saw that the RTC has drifted from the software clocks too, which would not happen if NTP synchronization were active. This would mean that almost certainly, if the computers are rebooted, the synchronization will be lost and the models will fail to come online.

My current findings are the following (this should be documented in the wiki once we set everything up):

  • nodus is running an NTP server using chronyd. One can check the configuration of this NTP server in /etc/chronyd.conf
  • fb1 is running an NTP server using ntpd that follows nodus and an IP address 131.215.239.14. This can be seen in /etc/ntp.conf.
  • There are no comments to describe what this other server (131.215.239.14) is. Does the GC network have an NTP server too?
  • c1lsc, c1sus and c1ioo all have systemd-timesyncd.service running with configuration file in /etc/systemd/timesyncd.conf.
  • The configuration file sets Servers=ntpserver, but echo $ntpserver produces nothing (blank) on these computers and I've been unable to find any place where ntpserver is defined.
  • In chiara (our name server), the name server file /etc/hosts does not have any entry for ntpserver either.
  • I think the problem might be that these computers are unable to find the ntpserver as it is not defined anywhere.

The solution to this issue could be as simple as just defining ntpserver in the name server list. But I'm not sure if my understanding of this issue is correct. Comments/suggestions are welcome for future steps.
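One caveat on the check above: echo $ntpserver only expands a shell variable, so a blank result does not show whether the hostname ntpserver resolves. A minimal sketch of a direct resolution check follows. The function name resolve and the injectable resolver argument are assumptions made for illustration; only the hostname comes from this entry.

```python
import socket
from typing import Callable, Optional


def resolve(hostname: str,
            resolver: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Return the IPv4 address for `hostname`, or None if it does not resolve.

    `resolver` defaults to the system resolver (hosts file, DNS, ...).
    It is injectable only so the function can be exercised offline.
    """
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror (name not known) is a subclass of OSError.
        return None


if __name__ == "__main__":
    # Run on the FE machine in question; prints an address or None.
    print(resolve("ntpserver"))
```

If this prints an address, the name is defined somewhere in the resolver chain even though no shell variable of that name exists.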

 

  16282   Wed Aug 18 20:30:12 2021 AnchalUpdateASSFixed runASS scripts

Late elog: Original time of work Tue Aug 17 20:30 2021


I locked the arms remotely yesterday and tried running the runASS.py scripts (generally run by clicking the Run ASS buttons on the IFO OVERVIEW screen or the ASC screen). We have known for a few weeks that this script stopped working for some reason. It would start the dithering and optimize the alignment, but then fail to freeze the state and save the alignment.

I found that caget('C1:LSC-TRX_OUT') and caget('C1:LSC-TRY_OUT') were not working on any of the workstations. This is weird, since caget was able to acquire these fast channel values earlier and we have seen this script work for about a month without any issue.

Anyway, to fix this, I just changed the channel name to 'C1:LSC-TRY_OUT16' where the script checks at the end whether the arm has indeed been aligned. It was only this step that was failing. Now the script is working fine and I tested it on both arms. On the Y arm, I misaligned the arm by adding a bias in yaw, changing C1:SUS-ITMY_YAW_OFFSET from -8 to 22. The script was able to align the arm back.
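The final check can be sketched as follows. This is an illustration only, not the actual runASS.py code: the function name arm_is_aligned, the threshold value and the injected caget callable are assumptions, and the X arm channel name is assumed by symmetry with the Y arm one named above.

```python
from typing import Callable, Optional

# Slow EPICS readbacks of the arm transmissions.
# TRY_OUT16 is named in this entry; TRX_OUT16 is assumed by symmetry.
TRANS_CHANNELS = {"X": "C1:LSC-TRX_OUT16", "Y": "C1:LSC-TRY_OUT16"}


def arm_is_aligned(arm: str,
                   caget: Callable[[str], Optional[float]],
                   threshold: float = 0.9) -> bool:
    """Return True if the arm transmission readback exceeds `threshold`.

    `caget` is any callable mapping a channel name to a value, or None on
    failure (pyepics' epics.caget behaves this way). The threshold is a
    hypothetical number, not the one used in the real script.
    """
    value = caget(TRANS_CHANNELS[arm])
    if value is None:
        # A failed read counts as "not aligned" so the caller can retry.
        return False
    return value > threshold
```

On a workstation one would pass epics.caget as the second argument. Returning False on a failed read makes a silent channel failure visible as an unaligned arm instead of an exception deep inside the script.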

  16281   Tue Aug 17 04:30:35 2021 KojiUpdateSUSNew electronics

Received:

Aug 17, 2021 2x ISC Whitening

Delivered 2x Sat Amp board to Todd

 

Attachment 1: P_20210816_234136.jpg
Attachment 2: P_20210816_235106.jpg
Attachment 3: P_20210816_234220.jpg
  16280   Mon Aug 16 23:30:34 2021 PacoUpdateCDSAS WFS commissioning; restarting models

[koji, ian, tega, paco]

With the remote/local assistance of Tega/Ian last Friday, I made changes to the c1sus model by connecting the C1:ASC model outputs (found within a block in c1ioo) to the BS and PRM suspension inputs (pitch and yaw). Koji reviewed these changes today and pointed out that no changes were actually needed, since the blocks were already in place and connected to the right ports; the model probably just hadn't been rebuilt...

So, today we ran "rtcds make" and "rtcds install" on the c1ioo and c1sus models (in that order), but the whole system crashed. We spent a great deal of time restarting the machines and their processes, and we struggled quite a lot with setting the right dates to match the GPS times. What seemed to work in the end was to follow the format of the date on the fb1 machine and try to match the timing to the sub-second level. This is especially tricky when done by hand, so the whole task is tedious. We nevertheless completed the reboot for all the models except c1oaf (which tends to make things crashy), since we won't need it right away for the tasks ahead. One potentially annoying issue we found while manually rebooting c1iscey is that one of its network ports is loose (the ethernet cable won't click in place) and it appears to use this link to boot (!!), so for a while this machine just wasn't coming back up.

Finally, as we restored the suspension controls and reopened the shutters, we noticed a great deal of misalignment to the point no reflected beam was coming back to the RFPD table. So we spent some time verifying the PRM alignment and TT1 and TT2 (tip tilts) and it turned out to be mostly the latter pair that were responsible for it. We used the green beams to help optimize the XARM and YARM transmissions and were able to relock the arms. We ran ASS on them, and then aligned the PRM OpLevs which also seemed off. This was done by giving a pitch offset to the input PRM oplev beam path and then correcting for it downstream (before the qpd). We also adjusted the BS OpLev in the end.


Summary: the ASC BS and PRM outputs are now built into the SUS models. Let the AS WFS loops be closed soon!


Addenda by KA
- Upon the RTS restarting,

  • Date/Time adjustment
    sudo date --set='xxxxxx'
  • If the time on the CDS status medm screen for each IOP matched the FB local time, we ran
    rtcds start c1x01
    (or c1x02, etc)
  • Every time we restarted the IOPs, fb was restarted with
    telnet fb1 8083
    > shutdown

    and restarted mx_stream from the CDS screen because these actions change the "DC" status.

- Today we succeeded once in restarting the vertex machines. However, the RFM signal transmission failed. So the two end machines were power cycled, as well as c1rfm, but this turned all the machines RED again. Hell...

- We checked the PRM oplev. The spot was around the center but was clipped, which confused us. Our conclusion was that the oplev was already like that before the RTS reboot.
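For the date/time adjustment step above, it can help to compute the GPS time that a given UTC wall-clock time corresponds to. The sketch below assumes the fixed GPS minus UTC offset of 18 s, which holds for dates after 2017-01-01 as of this writing; the function names are illustrative. It only works in whole seconds, so it does not replace the sub-second matching done by hand.

```python
from datetime import datetime, timedelta, timezone

# GPS time counts seconds since 1980-01-06 00:00:00 UTC with no leap seconds.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
# Leap seconds inserted into UTC since the GPS epoch (valid after 2017-01-01).
GPS_MINUS_UTC = 18


def utc_to_gps(t: datetime) -> int:
    """Convert a timezone-aware UTC datetime to integer GPS seconds."""
    return int((t - GPS_EPOCH).total_seconds()) + GPS_MINUS_UTC


def gps_to_utc(gps: int) -> datetime:
    """Convert integer GPS seconds back to a UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps - GPS_MINUS_UTC)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(now.isoformat(), "->", utc_to_gps(now))
```

The input must be timezone-aware; passing a naive datetime raises a TypeError in the subtraction, which is preferable to silently assuming local time.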

  16279   Thu Aug 12 20:52:04 2021 KojiUpdateGeneralPSL shutter was closed this morning

I did a bit more investigation on this.

- I checked P1~P4, PTP2/3, N2, TP2, TP3, but found that only P1a and P2 were affected.

- Looking at the min/mean/max of P1a and P2 (Attachment 1), the signal had a large fluctuation. It is impossible for P1a to go from 0.004 to 0 instantaneously.

- Looking at the raw data of P1a and P2 (Attachment 2), the value was not steadily large. Instead it looks like fluctuating noise.

So my conclusion is that, for an unknown reason, some unknown noise coupled only into P1a and P2 and tripped the PSL shutter. I still don't know the status of the mail alert.

Attachment 1: Screen_Shot_2021-08-12_at_20.51.19.png
Attachment 2: Screen_Shot_2021-08-12_at_20.51.34.png
  16278   Thu Aug 12 14:59:25 2021 KojiUpdateGeneralPSL shutter was closed this morning

What I was afraid of was the vacuum interlock. And indeed there was a pressure surge this morning. Is this real? Why didn't we receive the alert?

Attachment 1: Screen_Shot_2021-08-12_at_14.58.59.png
  16277   Thu Aug 12 11:04:27 2021 PacoUpdateGeneralPSL shutter was closed this morning

Thu Aug 12 11:04:42 2021 Arrived to find the PSL shutter closed. Why? Who? When? How? No elog, no fun. I opened it, IMC is now locked, and the arms were restored and aligned.

  16276   Wed Aug 11 12:06:40 2021 YehonathanUpdateCDSOpto-isolator for c1auxey

I redid the differential input experiment using the DS360 function generator we recently got. I generated a low frequency (0.1 Hz) sine wave with an amplitude of 0.5 V and connected the + and - outputs to a differential input on the new c1auxey Acromag chassis. I recorded a time series of the corresponding EPICS channel with and without the common on the DS360 connected to the Ref connector on the Acromag unit. The common connector on the DS360 is not normally grounded (there are a few tens of kOhm between the ground and common connectors). The attachment shows that, indeed, the analog input readout is extremely noisy with the Ref disconnected. The point where the Ref was connected to common is marked in the picture.

Conclusion: Ref connector on the analog input Acromag units must be connected to some stable voltage source for normal operation.

Attachment 1: SUS-ETMY_SparePDMon0_2.png
  16275   Wed Aug 11 11:35:36 2021 PacoUpdateLSCPRMI MICH orthogonality plan

[yehonathan, paco]

Yesterday we discussed a bit about working on the PRMI sensing matrix.

In particular we will start with the "issue" of non-orthogonality in the MICH actuated by BS + PRM. Yesterday afternoon we played a little with the oscillators and ran sensing lines in MICH and PRCL (gains of 50 and 5 respectively) in the times spanning [1312671582 -> 1312672300], [1312673242 -> 1312677350] for PRMI carrier and [1312673832 -> 1312674104] for PRMI sideband. Today we realized that we could have enabled the notchSensMat filter, which is a notch filter exactly at the oscillator's frequency, in FM10 and run a lower gain to get a similar SNR. We want to investigate this in more depth anyway, so here is our tentative plan of action, which implies redoing these measurements:

Task: investigate orthogonality (or lack thereof) in the MICH when actuated by BS & PRM
    1) Run sensing MICH and PRCL oscillators with PRMI Carrier locked (remember to turn NotchSensMat filter on).
    2) Analyze data and establish the reference sensing matrix.
    3) Write a script that performs steps 1 and 2 in a robust and safe way.
    4) Scan the C1:LSC-LOCKIN_OUTMTRX, MICH to BS and PRM elements around their nominal values.
    5) Scan the MICH and PRCL RFPD rotation angles around their nominal values.

We also talked about the possibility that the sensing matrix is strongly frequency dependent, such that measuring it at 311 Hz doesn't give us an accurate estimate of it. Is it worthwhile to try to measure it at lower frequencies using an appropriate notch filter?


Wed Aug 11 15:28:32 2021 Updated plan after group meeting

- The problem may be in the actuators since the orthogonality seems fine when actuating on the ITMX/ITMY, so we should instead focus on measuring the actuator transfer functions using OpLevs for example (same high freq. excitation so no OSEM will work > 10 Hz).

  16274   Tue Aug 10 17:24:26 2021 pacoUpdateGeneralFive day trend

Attachment 1 shows a five and a half day minute-trend of the three temperature sensors. Logging started last Thursday ~ 2 pm when all sensors were finally deployed. While it appears that there is a 7 degree gradient along the XARM it seems like the "vertex" (more like ITMX) sensor was just placed on top of a network switch (which feels lukewarm to the touch) so this needs to be fixed. A similar situation is observed in the ETMY sensor. I shall do this later today.


Done. The temperature reading should now be more independent from nearby instruments.


Wed Aug 11 09:34:10 2021 I updated the plot with the full trend before and after rearranging the sensors.

Attachment 1: six_day_minute_trend.png
  16273   Mon Aug 9 10:38:48 2021 AnchalUpdateBHDc1teststand subnetwork now accessible remotely

I had to add the following two lines to the /etc/network/interfaces file to make the special IP routes persist across reboots:

post-up ip route add 192.168.113.200 via 10.0.1.1 dev eno1
post-up ip route add 192.168.113.216 via 10.0.1.1 dev eno1

  16272   Fri Aug 6 17:10:19 2021 PacoUpdateIMCMC rollercoaster

[anchal, yehonathan, paco]

For whatever reason (i.e. we don't really know) the MC unlocked into a weird state at ~ 10:40 AM today. We first tried to find a likely cause as we saw it couldn't recover itself after ~ 40 min... so we decided to try a few things. First we verified that no suspensions were acting weird by looking at the OSEMs on MC1, MC2, and MC3. After validating that the sensors were acting normally, we moved on to the WFS. The WFS loops were disabled the moment the IMC unlocked, as they should. We then proceeded to the last resort of tweaking the MC alignment a bit, first with MC2 and then MC1 and MC3 in that order to see if we could help the MC catch its lock. This didn't help much initially and we paused at about noon.

At about 5 pm, we resumed since the IMC had remained locked to some higher order mode (TEM-01 by the looks of it). While looking at C1:IOO-MC_TRANS_SUMFILT_OUT on ndscope, we kept on shifting the MC2 Yaw alignment slider (steps = +-0.01 counts) slowly to help the right mode "hop". Once the right mode caught on, the WFS loops triggered and the IMC was restored. The transmission during this last stage is shown in Attachment #1.

Attachment 1: MC2_trans_sum_2021-08-06_17-18-54.png
  16271   Fri Aug 6 13:13:28 2021 AnchalUpdateBHDc1teststand subnetwork now accessible remotely

The c1teststand subnetwork is now accessible remotely. To log into this network, one needs to do the following:

  • Log into nodus or pianosa. (This will only work from these two computers)
  • ssh -CY controls@192.168.113.245
  • Password is our usual workstation password.
  • This will log you into c1teststand network.
  • From here, you can log into fb1, chiara, c1bhd and c1sus2, which are all part of the teststand subnetwork.

Just to document the IT work I did: setting up this connection was a bit less trivial than usual.

  • The martian subnetwork is created by a NAT router which connects only nodus to outside GC network and all computers within the network have ip addresses 192.168.113.xxx with subnet mask of 255.255.255.0.
  • The cloned test stand network was also running on the same IP address scheme, mostly because fb1 and chiara are clones in this network. So every computer in this network also had ip addresses 192.168.113.xxx.
  • I set up a NAT router to connect to the martian network, forwarding ssh requests to the c1teststand computer. My NAT router creates a separate subnet with IP addresses 10.0.1.xxx and subnet mask 255.255.255.0, gated through 10.0.1.1.
  • However, the issue is that for c1teststand there are now two accessible networks with the same IP addresses 192.168.113.xxx. So when you try to ssh, it always searches in its local c1teststand subnetwork instead of routing through the NAT router to the martian network.
  • To work around this, I had to manually provide IP routes on c1teststand for connecting to two of the computers (nodus and pianosa) in the martian network. This is done by:
    ip route add 192.168.113.200 via 10.0.1.1 dev eno1
    ip route add 192.168.113.216 via 10.0.1.1 dev eno1
  • This gives c1teststand a specific path for ssh requests to/from these computers in the martian network.
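Why the two host routes are enough can be illustrated with a small longest-prefix-match lookup, which is how an IP routing table selects a route. The sketch below is a toy model, not a dump of the real routing table on c1teststand: the table entries are reconstructed from the description above, and the next-hop labels are free text.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (destination prefix, next-hop description)

# Toy routing table reconstructed from the description in this entry.
ROUTES: List[Route] = [
    ("192.168.113.0/24", "local teststand subnet"),
    ("10.0.1.0/24", "NAT router subnet"),
    ("192.168.113.200/32", "via 10.0.1.1"),  # host route added by hand
    ("192.168.113.216/32", "via 10.0.1.1"),  # host route added by hand
]


def lookup(dest: str, routes: List[Route] = ROUTES) -> Optional[str]:
    """Return the next hop of the most specific route matching `dest`."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # The longest matching prefix wins, so a /32 beats the /24.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None
```

With the full table, lookup("192.168.113.200") goes through 10.0.1.1, while every other 192.168.113.xxx address stays on the local teststand subnet. Dropping the two /32 entries sends everything to the local subnet, which is the behaviour described before the workaround.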
  16270   Thu Aug 5 14:59:31 2021 AnchalUpdateGeneralAdded temperature sensors at Yend and Vertex too

I've added the other two temperature sensor modules at the Y end (on 1Y4, IP: 192.168.113.241) and in the vertex (on 1X2, IP: 192.168.113.242). I've updated the martian host table accordingly. From inside the martian network, one can point a browser at the IP address to see the temperature sensor status. These sensors can be set to trigger an alarm and send emails/SMS etc. if the temperature goes out of a defined range.

I feel something is off though. The vertex sensor shows temperature of ~28 degrees C, Xend says 20 degrees C and Yend says 26 degrees C. I believe these sensors might need calibration.

The remaining tasks are the following:

  • Modbus TCP solution:
    • If we get it right, this will be the easiest solution.
    • We just need to add these sensors as streaming devices in some slow EPICS machine in their .cmd file and add the temperature sensing channels in a corresponding database file.
  • Python workaround:
    • Might be faster but dirty.
    • We run a python script on megatron which requests temperature values every second or so from the IP addresses and writes them to a soft EPICS channel.
    • We would still need to create a soft EPICS channel for this and add it to the framebuilder data acquisition list.
    • An even shorter workaround for the near future could be to just write the temperature every 30 min to a log file in some location.

[anchal, paco]

We made a script under scripts/PEM/temp_logger.py and ran it on megatron. The script uses the requests package to query the latest sensor data from the three sensors every 10 minutes as a json file and outputs accordingly. This is not a permanent solution.
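A polling logger of this kind can be sketched as below. This is not the contents of temp_logger.py: the function name log_lines, the JSON key "temperature" and the log line format are placeholders, since the sensors' actual reply schema is not recorded here. The IP addresses are the ones given in these entries, and labelling the first unit as the X end is an inference.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, List

# IP addresses from the elog entries; the location labels are inferred
# (the first unit, at .240, is assumed to be the X end one).
SENSORS: Dict[str, str] = {
    "xend": "192.168.113.240",
    "yend": "192.168.113.241",
    "vertex": "192.168.113.242",
}


def log_lines(fetch: Callable[[str], str], now: datetime,
              sensors: Dict[str, str] = SENSORS) -> List[str]:
    """Query each sensor once and return one log line per sensor.

    `fetch(ip)` must return the sensor's JSON reply as text. The key
    "temperature" is a placeholder, not the real schema.
    """
    lines = []
    for name, ip in sensors.items():
        try:
            temp = float(json.loads(fetch(ip))["temperature"])
            lines.append(f"{now.isoformat()} {name} {temp:.2f}")
        except (OSError, ValueError, KeyError, TypeError) as err:
            # Record the failure and carry on with the other sensors.
            lines.append(f"{now.isoformat()} {name} ERROR {type(err).__name__}")
    return lines
```

A real fetch function would wrap an HTTP GET (for example with the requests package) against whatever URL the sensor exposes, and calling log_lines in a loop with a 600 s sleep would give the 10 minute cadence. Keeping the fetch injectable means one unreachable sensor produces an ERROR line instead of stopping the log.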

  16269   Wed Aug 4 18:19:26 2021 pacoUpdateGeneralAdded infrasensing temperature unit to martian network

[ian, anchal, paco]

We hooked up the infrasensing unit to power and changed its default IP address from 192.168.11.160 (factory default) to 192.168.113.240 in the martian network. The sensor is online with user controls and the usual password for most workstations in that IP address.

  16268   Tue Aug 3 20:20:08 2021 AnchalUpdateOptical LeversRecentered ETMX, ITMX and ETMY oplevs at good state

Late elog. Original time 08/02/2021 21:00.

I locked both arms and ran ASS to reach optimum alignment. The readings were ETMY PIT > 10urad, ITMX P > 10urad and ETMX P < -10urad; everything else was fine, with an absolute value less than 10urad. I recentered these three.

Then I locked PRMI, ran ASS on PRCL and MICH, and checked the BS and PRM alignment. They were also below 10urad in absolute value.

  16267   Mon Aug 2 16:18:23 2021 PacoUpdateASCAS WFS MICH commissioning

[anchal, paco]

We picked up AS WFS commissioning for daytime work as suggested by gautam. In the end we want to commission this for the PRFPMI, but also for PRMI, and MICH for completeness. MICH is the simplest so we are starting here.

We started by restoring the MICH configuration and aligning the AS DC QPD (on the AS table) by zeroing the C1:ASC-AS_DC_YAW_OUT and C1:ASC-AS_DC_PIT_OUT. Since the AS WFS gets the AS beam in transmission through a beamsplitter, we had to correct that beamsplitter's alignment to recenter the AS beam onto the AS110 PD (for this we looked at the signal on a scope).

We then checked the rotation (R) C1:ASC-AS_RF55_SEGX_PHASE_R and delay (D) angles C1:ASC-AS_RF55_SEGX_PHASE_D (where X = 1, 2, 3, 4 for segment) to rotate all the signal into the I quadrature. We found that this optimized the PIT content on C1:ASC-AS_RF55_I_PIT_OUT and YAW content on C1:ASC-AS_RF55_I_YAW_OUTMON which is what we want anyways.

Finally, we set up some simple integrators for these WFS on the C1ASC-DHARD_PIT and C1ASC-DHARD_YAW filter banks with a pole at 0 Hz, a zero at 0.8 Hz, and a gain of -60 dB (similar to MC WFS). Nevertheless, when we closed the loop by actuating on the BS ASC PIT and ASC YAW inputs, it seemed like the ASC model outputs are not connected to the BS SUS model ASC inputs, so we might need to edit accordingly and restart the model.
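To get a feel for the integrator shape described above, the sketch below evaluates the response of a filter with a pole at 0 Hz and a zero at 0.8 Hz. The normalization is an assumption (the overall gain is taken as the high-frequency asymptote, which may differ from the convention used when the filter was actually designed), and the function name is illustrative.

```python
import cmath
import math


def integrator_response(f_hz: float, f_zero_hz: float = 0.8,
                        gain_db: float = -60.0) -> complex:
    """Complex response of a filter with a pole at 0 Hz and a zero at f_zero_hz.

    Assumed normalization: H(f) = g * (1 + f_zero / (i f)), so |H| tends to g
    well above the zero and grows as 1/f below it.
    """
    g = 10.0 ** (gain_db / 20.0)
    return g * (1.0 + f_zero_hz / (1j * f_hz))


if __name__ == "__main__":
    for f in (0.01, 0.1, 0.8, 10.0):
        h = integrator_response(f)
        mag_db = 20.0 * math.log10(abs(h))
        phase_deg = math.degrees(cmath.phase(h))
        print(f"{f:6.2f} Hz  {mag_db:7.1f} dB  {phase_deg:6.1f} deg")
```

Under this normalization the magnitude at the zero frequency is sqrt(2) times the high-frequency gain and the phase there is -45 degrees, with the phase approaching -90 degrees at lower frequencies where the integrator dominates.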

  16266   Thu Jul 29 14:51:39 2021 PacoUpdateOptical LeversRecenter OpLevs

[yehonathan, anchal, paco]

Yesterday around 9:30 pm, we centered the BS, ITMY, ETMY, ITMX and ETMX oplevs (in that order) on their respective QPDs by turning the last mirror before the QPDs. We did this after running the ASS dither for the XARM/YARM configurations to use as the alignment reference. We did this in preparation for PRFPMI lock acquisition, which we had to stop due to an earthquake around midnight.

  16265   Wed Jul 28 20:20:09 2021 YehonathanUpdateGeneralThe temperature sensors and function generator have arrived in the lab

I put the temperature sensors box on Anchal's table (attachment 1) and the function generator on the table in front of the c1auxey Acromag chassis (attachment 2).

 

Attachment 1: 20210728_201313.jpg
Attachment 2: 20210728_201607.jpg
  16264   Wed Jul 28 17:10:24 2021 AnchalUpdateLSCSchnupp asymmetry

[Anchal, Paco]

I redid the measurement of Schnupp asymmetry today and found it to be 3.8 cm \pm 0.9 cm.


Method

  • One of the arms is misaligned at both the ITM and the ETM.
  • The other arm is locked and aligned using ASS.
  • The SRCL oscillator's output is changed to the ETM of the chosen arm.
  • The AS55_Q channel in demodulation of SRCL oscillator is configured (phase corrected) so that all signal comes in C1:CAL-SENSMAT_SRCL_AS55_Q_DEMOD_I_OUT.
  • The rotation angle of AS55 RFPD is scanned and the C1:CAL-SENSMAT_SRCL_AS55_Q_DEMOD_I_OUT is averaged over 10s after waiting for 5s to let the transients pass.
  • This data is used to find the zero crossing of AS55_Q signal when light is coming from one particular arm only.
  • The same is repeated for the other arm.
  • The difference in the zero crossing phase angles is twice the phase accumulated by a 55 MHz signal in travelling the length difference between the arm cavities i.e. the Schnupp Asymmetry.

I measured a phase difference of 5 \pm1 degrees between the two paths.
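The conversion from the measured phase difference to a length follows from the last step of the method: if the zero crossings differ by twice the phase accumulated over the asymmetry l, then delta_theta = 2 * (2 pi f l / c), so l = delta_theta * c / (4 pi f). A minimal sketch, assuming a modulation frequency of exactly 55 MHz (the nominal value quoted above; the true sideband frequency may differ slightly):

```python
import math

C = 299_792_458.0  # speed of light in m/s


def schnupp_asymmetry(delta_theta_deg: float, f_mod_hz: float = 55e6) -> float:
    """Length asymmetry in metres from the zero-crossing angle difference.

    delta_theta = 2 * (2*pi*f_mod*l/c)  =>  l = delta_theta * c / (4*pi*f_mod)
    """
    return math.radians(delta_theta_deg) * C / (4.0 * math.pi * f_mod_hz)


if __name__ == "__main__":
    print(f"{100.0 * schnupp_asymmetry(5.0):.2f} cm")
```

For a 5 degree difference this gives about 3.8 cm, consistent with the value quoted at the top of this entry. The relation is linear, so the uncertainty in the phase propagates proportionally to the length.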

The uncertainty in this measurement is much larger than in gautam's 15956 measurement. I'm not sure why yet, but I will look into it.

 

Quote:

I used the Valera technique to measure the Schnupp asymmetry to be \approx 3.5 \, \mathrm{cm}, see Attachment #1. The data points are points, and the zero crossing is estimated using a linear fit. I repeated the measurement 3 times for each arm to see if I get consistent results - seems like I do. Subtle effects like possible differential detuning of each arm cavity (since the measurement is done one arm at a time) are not included in the error analysis, but I think it's not controversial to say that our Schnupp asymmetry has not changed by a huge amount from past measurements. Jamie set a pretty high bar with his plot which I've tried to live up to. 

 

Attachment 1: Lsch.pdf
  16263   Wed Jul 28 12:47:52 2021 YehonathanUpdateCDSOpto-isolator for c1auxey

To simulate a differential output I used two power supplies connected in series. The outer connectors were used as the outputs and the common connector was connected to the ground and used as a reference. I hooked these outputs to one of the differential analog channels and measured it over time using Striptool. The setup is shown in attachment 3.

I tested two cases: With reference disconnected (attachment 1), and connected (attachment 2). Clearly, the non-referred case is way too noisy.

Attachment 1: SUS-ETMY_SparePDMon0_NoRef.png
Attachment 2: SUS-ETMY_SparePDMon0_Ref_WithGND.png
Attachment 3: DifferentialOutputTest.png
  16262   Wed Jul 28 12:00:35 2021 YehonathanUpdateBHDSOS assembly

After receiving two new tubes of EP-30 I resumed the gluing activities. I made a spreadsheet to track the assemblies that have been made, their position on the metal sheet in the cleanroom, their magnetic field, and the batch number.

I made another batch of 6 magnets yesterday (4th batch), the assembly from the 2nd batch is currently being tested for bonding strength.

One thing that we overlooked in calculating the amount of glue needed is that, in addition to the minimum 8 g of EP-30 needed for every gluing session, 4 g of EP-30 is wasted in the mixing tube. That means 12 g of EP-30 is used in every gluing session. We need 5 more batches, so at least 60 g of EP-30 is needed. Luckily, we bought two tubes of 50 g each.
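The glue budget above is simple enough to keep as a two-line calculation for when the batch count changes. The function and parameter names are illustrative; the numbers are the ones quoted in this entry.

```python
def ep30_needed_g(sessions: int, min_mix_g: float = 8.0,
                  tube_waste_g: float = 4.0) -> float:
    """Total EP-30 in grams: each session uses the minimum mix plus tube waste."""
    return sessions * (min_mix_g + tube_waste_g)


if __name__ == "__main__":
    needed = ep30_needed_g(5)        # 5 remaining batches
    available = 2 * 50.0             # two tubes of 50 g each
    print(f"needed {needed:.0f} g, margin {available - needed:.0f} g")
```

With 5 sessions left this leaves a 40 g margin, which is enough for three additional sessions should any batch need to be redone.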

  16261   Tue Jul 27 23:04:37 2021 AnchalUpdateLSC40 meter party

[ian, anchal, paco]

After our second attempt at locking PRFPMI tonight, we tried to restore the XARM and YARM locks to IR by clicking on IFO_CONFIGURE>Restore XARM (POX) and IFO_CONFIGURE>Restore YARM (POY), but the arms did not lock. The green lasers were locked to the arms at maximum power, so the relative alignment of each cavity was ok. We were also able to lock PRMI using IFO_CONFIGURE>Restore PRMI carrier.

This was very weird to us. We were pretty sure that the alignment was correct, so we decided to check the POX/POY signal chain. There was essentially no signal coming at POX11 and there was a -100 offset on it. We could see some PDH signal on POY11, but not enough to catch the locks.

We tried running IFO_CONFIGURE>LSC OFFSETS to cancel out any dark current DC offsets. The changes made by the script are shown in attachment 1.

We went to check the tables and found no light visible on beam finder cards at POX11 or POY11. We found that ITMX was stuck on one of the coils. We unstuck it using the shaking method. After this, the OPLEVs on ITMX could not be switched on, as the OPLEV servos were railing at their limits. But when we ran Restore XARM (POX) again, they started working fine. This script does something that we are not aware of.

We're stopping here. We still can not lock any of the single arms.


Wed Jul 28 11:19:00 2021 Update:

[gautam, paco]

Gautam found that restoring POX/POY failed to restore the whitening filter gains in POX11 / POY11. These are meant to be restored to 30 dB and 18 dB for POX11 and POY11 respectively, but were set to 0 dB, to the detriment of any POX/POY triggering/locking. The reason these are lowered is to avoid saturating the speakers during lock acquisition. Yesterday, burt-restore didn't work because we restored c1lscepics.snap, but said gains are actually in c1lscaux.snap. After manually restoring the POX11 and POY11 whitening filter gains, gautam ran the LSCOffsets script. The XARM and YARM were able to lock quickly after we restored these settings.

The root of our issue may be that we didn't run the CARM & DARM watch script (which can be accessed from the ALS/Watch Scripts in medm). Gautam added a line on the Transition_IR_ALS.py script to run the watch script instead.

Attachment 1: Screenshot_2021-07-27_22-19-58.png
  16260   Tue Jul 27 20:12:53 2021 KojiUpdateBHDSOS assembly

1 or 2. The stained ones are just fine. If you find the vented 1/4-20 screws in the clean room, you can use them.

For the 28 screws, yeah find some spares in the clean room (faster), otherwise just order.

  16259   Tue Jul 27 17:14:18 2021 YehonathanUpdateBHDSOS assembly

Jordan has made 1/4" tap holes in the lower EQ stop holders (attachment). The 1/4" stops (schematics) fit nicely in them. Also, they are about the same length as the small EQ stops, so they can be used.

However, counting all the 1/4"-3/4" vented screws we have shows that we are missing 2 screws to cover all the 7 SOSs. We can either:

1. Order new vented screws.

2. Use 2 old (stained but clean) EQ stops.

3. Screw holes into existing 1/4"-3/4" screws and clean them.

4. Use small EQ stops for one SOS.

etc.

Also, I found a mistake in the schematics of the SOS tower. The 4-40 screws used to hold the lower EQ stop holders should be SS and not silver plated as noted. I'll have to find some (28) spares in the cleanroom or order new ones.

 

Attachment 1: 20210727_154506.png
  16257   Mon Jul 26 17:34:23 2021 PacoUpdateLoss MeasurementLoss measurement

[gautam, yehonathan, paco]

We went back to the loss data from last week and more carefully estimated the ARM loss uncertainties.

Before we simply stitched all N=16 repetitions into a single time-series and computed the loss: e.g. see Attachment 1 for such a YARM loss data. The mean and stdev for this long time series give the quoted loss from last time. We knew that the uncertainty was most certainly overestimated, as different realizations need not sample similar alignment conditions and are sensitive to different imperfections (e.g. beam angular motion, unnormalizable power fluctuations, etc...).


Today we analyzed the locked/misaligned cycles individually. From each cycle, it is possible to obtain a mean value of the loss as well as a std dev *across the duration of the trace*, but because we have a measurement ensemble, it is also possible to obtain an ensemble averaged mean and a statistical uncertainty estimate *across the independent cycle realizations*. While the mean values don't change much, in the latter estimate we find a much smaller statistical uncertainty. We obtain an XARM loss of 37.6 \pm 2.6 ppm and a YARM loss of 38.9 \pm 0.6 ppm. To make the distinction clearer, Attachment 2 and Attachment 3 show the YARM and XARM loss measurement ensembles respectively, with single realization (time-series) standard deviations as vertical error bars and the 1 sigma statistical uncertainty estimate as a filled color band. Note that the XARM loss drifts across different realizations (which happen to be ordered in time), which we think arises from inconsistent ASS dither alignment convergence. This is yet to be tested.
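The difference between the two estimates can be illustrated with a short sketch. It is not the analysis code used for the numbers above: the function names are illustrative, and taking the standard error of the per-cycle means as the "statistical uncertainty across realizations" is one common choice that may differ from what was actually computed.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def stitched_stats(cycles: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Mean and std dev of all samples stitched into one long time series."""
    flat = [x for cycle in cycles for x in cycle]
    return statistics.fmean(flat), statistics.stdev(flat)


def ensemble_stats(cycles: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Mean of the per-cycle means and its standard error across cycles."""
    means: List[float] = [statistics.fmean(cycle) for cycle in cycles]
    std_err = statistics.stdev(means) / math.sqrt(len(means))
    return statistics.fmean(means), std_err


if __name__ == "__main__":
    # Made-up numbers: noisy traces whose means agree well with each other.
    cycles = [[m - 10.0, m, m + 10.0] for m in (38.0, 39.0, 40.0, 39.0)]
    print("stitched:", stitched_stats(cycles))
    print("ensemble:", ensemble_stats(cycles))
```

When the scatter within each trace is much larger than the scatter between cycle means, the stitched standard deviation is dominated by the within-trace noise, while the ensemble estimate reflects only how repeatable the cycle means are.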


For budgeting the excessive uncertainties from a single locked/misaligned cycle, we could look at beam pointing, angular drift, power, and systematic differences in the paths from both reflection signals. We should be able to estimate the power fluctuations by looking at the recorded arm transmissions, the recorded MC transmission, PD technical noise, etc... and we might be able to correlate recorded oplev signals with the reflection data to identify angular drift. We have not done this yet.

Attachment 1: LossMeasurement_RawData.pdf
Attachment 2: YARM_loss_stats.pdf
Attachment 3: XARM_loss_stats.pdf
  16256   Sun Jul 25 20:41:47 2021 ranaUpdateLoss MeasurementLoss measurement

What are the quantitative root causes for why the statistical uncertainty is so large? It's larger than 1/sqrt(N)

  16255   Sun Jul 25 18:21:10 2021 KojiUpdateGeneralCanon camera / small silver tripod / macro zoom lens / LED ring light returned / Electronics borrowed

Camera and accessories returned

One HAM-A coildriver and one sat amp borrowed -> QIL

https://nodus.ligo.caltech.edu:8081/QIL/2616

 

  16254   Thu Jul 22 16:06:10 2021 PacoUpdateLoss MeasurementLoss measurement

[yehonathan, anchal, paco, gautam]

We concluded estimating the XARM and YARM losses. The hardware configuration from yesterday remains, but we repeated the measurements because we realized our REFL55_I_ERR and REFL55_Q_ERR signals representing the PD520 and MC_TRANS were scaled, offset, and rotated in a way that wasn't trivially undone by our postprocessing scripts... Another caveat that we encountered today was the need to add a "macroscopic" misalignment to the ITMs when doing the measurement to avoid any accidental resonances.

The final measurements were done with 16 repetitions, 30 second duration, and the logfiles are under scripts/lossmap_scripts/armLoss/logs/20210722_1423.txt and scripts/lossmap_scripts/armLoss/logs/20210722_1513.txt

Finally, the estimated YARM loss is 39\pm7 ppm, while the estimated XARM loss is 38\pm8 ppm. This is consistent with the inferred PRC gain from Monday and a PRM loss of ~ 2%.


Future measurements may want to look into slow drift of the locked vs misaligned traces (systematic errors?) and a better way of estimating the statistical uncertainty (e.g. by splitting the raw time traces into short segments)

  16253   Wed Jul 21 18:08:35 2021 yehonathanUpdateLoss MeasurementLoss measurement

{Gautam, Yehonathan, Anchal, Paco}

We prepared for the loss measurement using the DC reflection method. We made the following changes:

1. REFL55_Q was disconnected and replaced with MC_T cable coming from the PD on the MC2 table. The cable has a red tag on it. Consequently we lost the AS beam. We realigned the optics and regained arm locks. The spot on the AS QPD had to be corrected.

2. We tried using AS55 as the PD for the DC measurement but we got ratios of ~ 0.97 which implies losses of more than 100 ppm. We decided to go with the traditional PD520 used for these measurements in the past.

3. We placed the PD520 used for loss measurements in front of the AS55 PD and optimized its position.

4. AS110 cable was disconnected from the PD and connected to PD520 to be used as the loss measurement cable.

5. In 1Y2 rack, AS110 PD cable was disconnected, REFL55_I was disconnected and AS110 cable was connected to REFL55_I channel.

So for the test, the MC transmission was measured at REFL55_Q and the AS DC was measured at REFL55_I.

We used the scripts/lossmap_scripts/armLoss/measArmLoss.py script. Note that this script assumes that you begin with the arm locked.

We are leaving the IFO in the configuration described above overnight and we plan to measure the XARM loss early in the morning, after which we shall restore the affected electrical and optical paths.


We ran the /scripts/lossmap_scripts/armLoss/measureArmLoss.py script on pianosa with 25 repetitions and a 30 s "duty cycle" (wait time) for the Y arm. Preliminary results give an estimated individual arm loss of ~ 30 ppm (for both the X and Y arms), but we will provide a better estimate from this measurement.

  16252   Wed Jul 21 14:50:23 2021 KojiUpdateSUSNew electronics

Received:

Jun 29, 2021 BIO I/F 6 units
Jul 19, 2021 PZT Drivers x2 / QPD Transimpedance amp x2

 

Attachment 1: P_20210629_183950.jpeg
Attachment 2: P_20210719_135938.jpeg
  16251   Mon Jul 19 22:16:08 2021 pacoUpdateLSCPRFPMI locking

[gautam, paco]

Gautam managed to lock PRFPMI a little before ~ 22:00 local time. The ALS to RF handoff logic was found to be repeatable, which enabled us to lock a total of 4 times this evening. Under this nominal state, we can work on PRFPMI to narrow down less known issues and carry out systematic optimization. The second time we achieved lock, we ran sensing lines before entering the ASC stage (which we knew would destroy the lock), and offline analysis of the sensing matrix is pending (gpstime = 1310792709 + 5 min).
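For anyone picking up the offline analysis, the GPS time quoted above can be converted to UTC with a short sketch like this one. It uses only the Python standard library and hard-codes the GPS-UTC offset of 18 leap seconds, which is valid for dates since 2017-01-01 (an assumption built into the code, not something taken from the DAQ).

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
LEAP_SECONDS = 18  # GPS - UTC offset, valid for dates since 2017-01-01

def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a timezone-aware UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - LEAP_SECONDS)

start = gps_to_utc(1310792709)
stop = start + timedelta(minutes=5)  # sensing lines were on for ~5 min
print(start.isoformat())  # 2021-07-20T05:04:51+00:00
print(stop.isoformat())   # 2021-07-20T05:09:51+00:00
```

The start time corresponds to about 22:05 Pacific time on Jul 19, consistent with the timeline in this entry.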

Things to note:

(a) there is an unexpected offset suggesting that the ALS and RF disagreed on what the lock setpoint should be, and it is still unclear where the offset is coming from.

(b) the first time the lock was reached, the ASC up stage destroyed it, suggesting these loops need some care (we were able to engage the ASC loops at low gains, 0.2 instead of 1, but as soon as we enabled some integrators this consistently destroyed the lock)

(c) gautam had (burt) restored to the settings from back in March when the PRFPMI was last locked, suggesting there was a small but somehow significant difference in the IFO that helped today relative to last week


Take-home message: The mere fact that we were able to lock PRFPMI rules out considerably more serious problems with the signal chain electronics or processing. This should also be a good starting point for further debugging and optimization.


gautam: the circulating power, when the ASC was tweaked, hit 400 (normalized to single arm locked with a misaligned PRM) suggesting a recycling gain of 22.5, and an average arm loss of ~30ppm round trip (assuming 2% loss in the PRC). 
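The step from the normalized buildup to the recycling gain can be sketched as below. The reasoning is that the normalization (single arm locked, PRM misaligned) already includes one pass through the PRM, so the buildup equals the recycling gain divided by the PRM power transmission. The PRM transmission of 5.637% is an assumed value that is not stated in this entry, and the function name is hypothetical.

```python
T_PRM = 0.05637  # assumed PRM power transmission (not from this entry)

def recycling_gain(normalized_buildup, t_prm=T_PRM):
    """Recycling gain from the arm power normalized to the
    single-arm, misaligned-PRM case."""
    return normalized_buildup * t_prm

print(round(recycling_gain(400), 1))  # 22.5
```

Going from the recycling gain to an arm loss additionally requires a model of the PRC losses and the arm reflectivity, which is not attempted here.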

  16250   Sat Jul 17 00:52:33 2021 KojiUpdateGeneralCanon camera / small silver tripod / macro zoom lens / LED ring light borrowed -> QIL

Canon camera / small silver tripod / macro zoom lens / LED ring light borrowed -> QIL

Attachment 1: P_20210716_213850.jpg
  16249   Fri Jul 16 16:26:50 2021 gautamUpdateComputersDocker installed on nodus

I wanted to try hosting some docker images on a "private" server, so I installed Docker on nodus following the instructions here. The install seems to have succeeded, and as far as I can tell, none of the functionality of nodus has been disturbed (I can ssh in, access shared drive, elog seems to work fine etc). But if you find a problem, maybe this action is responsible. Note that nodus is running Scientific Linux 7.3 (Nitrogen).
