Message ID: 16652     Entry time: Wed Feb 9 11:56:24 2022     In reply to: 16651     Reply to this: 16653
Author: Anchal 
Type: Update 
Category: General 
Subject: Bringing back CDS 

[Anchal, Paco]

Bringing back CDS took a lot of work yesterday. I'm gonna try to summarize the main points here.


mx_start_stop

For some reason, fb1 was not able to mount the mx devices automatically on system boot. I faced the same issue earlier on fb1 (clone). The fix is to run the script:

controls@fb1:/opt/mx/sbin/mx_start_stop start

To make this persistent, I've configured a systemd service (/etc/systemd/system/mx_start_stop.service) on fb1 that runs once on system boot and mounts the mx devices as mentioned above. We did not see this issue on later reboots yesterday.
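
The unit file itself is not reproduced in this entry. As a rough sketch only, a run-once-at-boot unit could be written out like this. The helper name write_oneshot_unit, the Description= text and the multi-user.target install target are assumptions for illustration, not the contents of the file on fb1.

```shell
# Write a run-once-at-boot systemd unit to the given path.
# Hypothetical helper: the actual unit on fb1 may differ.
write_oneshot_unit() {
    unit_path="$1"    # e.g. /etc/systemd/system/mx_start_stop.service
    description="$2"  # free-text description (assumed)
    exec_start="$3"   # command to run once at boot
    cat > "$unit_path" <<EOF
[Unit]
Description=$description

[Service]
# oneshot: systemd runs the command to completion instead of supervising a daemon
Type=oneshot
ExecStart=$exec_start
# keep the unit listed as active after the command has exited
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF
}

# Example (needs root to write under /etc/systemd/system):
# write_oneshot_unit /etc/systemd/system/mx_start_stop.service \
#     "Mount mx devices at boot" "/opt/mx/sbin/mx_start_stop start"
# systemctl daemon-reload && systemctl enable mx_start_stop.service
```

Type=oneshot fits here because mx_start_stop exits once the devices are mounted, so there is no long-running process for systemd to track.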


gpstime

Next was the issue of the gpstime module being out of date on fb1. This issue is known from the past and requires us to do the following:

controls@fb1:~ 0$ sudo modprobe -r gpstime
controls@fb1:~ 1$ sudo modprobe gpstime

Again, to make this persistent, I've configured a systemd service (/etc/systemd/system/re-add-gpstime.service) on fb1 to run the above commands once on system boot. This corrected gpstime automatically and we did not face these problems again.


time synchronization

Later we found that NTP time synchronization between fb1 and the FE computers was not working, mainly because fb1 was unable to access the internet. As a rule of thumb, it is always a good idea to try pinging www.google.com from fb1 to confirm that it is connected to the internet. The issue was that fb1 could not find any nameserver. We fixed it by restarting the bind9 service on chiara a couple of times. We're not really sure why it wasn't working.

~>sudo service bind9 stop
~>sudo service bind9 start
~>sudo service bind9 status
* bind9 is running

After the above, we saw that the ntp server on fb1 was working fine. You see the following output on fb1 when that is the case:

controls@fb1:~ 0$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-table-moral.bnr 110.142.180.39   2 u  399  512  377  195.034  -14.618   0.122
*server1.quickdr .GPS.            1 u   67   64  377  130.483   -1.621   1.077
+ntp2.tecnico.ul 56.99.239.27     2 u  473  512  377  184.648   -0.775   2.231
+schattenbahnhof 129.69.1.153     2 u  365  512  377  144.848    3.841   1.092
 192.168.123.255 .BCST.          16 u    -   64    0    0.000    0.000   0.000
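
In this output, the line starting with '*' is the peer ntpd has currently selected as its sync source, and a reach value of 377 (octal) means the last eight polls were all answered. A small sketch of checking for a selected peer automatically follows; the function name sync_peer is made up for illustration.

```shell
# Read `ntpq -p` output on stdin and print the name of the selected
# system peer (the line whose first character is '*').
# Exits non-zero if no peer is selected. Hypothetical helper name.
sync_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# Usage on fb1:
# ntpq -p | sync_peer || echo "ntpd has no selected sync source"
```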

On the FE computers, timedatectl should show yes in the NTP synchronized field. That wasn't happening even after we restarted the systemd-timesyncd service. After this, I just tried restarting all FE computers and it started working.
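
A possible way to script the timedatectl check is sketched below. The function name ntp_synchronized and the HOST placeholder are hypothetical. To my knowledge, older systemd versions label the field "NTP synchronized" and newer ones "System clock synchronized", so the pattern accepts both.

```shell
# Read `timedatectl` output on stdin and succeed only if the clock is
# reported as synchronized. Accepts both field labels (see note above).
ntp_synchronized() {
    grep -Eq '^[[:space:]]*(NTP|System clock) synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Usage, with HOST standing in for an FE computer name (placeholder):
# ssh HOST timedatectl | ntp_synchronized || echo "HOST: not synchronized"
```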


CDS

We had removed all db9 enabling plugs on the new SOSs beforehand to keep coils off just in case CDS does not come back online properly.

Everything in CDS loaded properly except the c1oaf model, which kept showing the 0x2bad status. This meant that some IPC flags were red on c1sus, c1mcs and c1lsc as well, but everything else was green. See attachment 1. I then burt-restored everything in the /opt/rtcds/caltech/c1/burt/autoburt/snapshots/2022/Feb/4/12:19 directory. This includes the snapshot of c1vac, which I added to autoburt that day. All burt restore statuses were green OK. I think we are in a good state now to start watchdogs on the new SOSs and put back the db9 enabling plugs.


Future work:

When somebody gets time, we should make the custom service files in fb1:/etc/systemd/system/ symbolic links to a repo directory and version control these important services. We should also make sure that their dependencies and startup order are correctly configured. I might have done a half-assed job there since I only recently learned how to make unit files. We should do the same on nodus and chiara too. Our hope is that on one glorious day, the lab can be restarted without spending more than 20 min on booting up the computers and network.
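
A minimal sketch of the symlink idea, assuming the unit files live in some version-controlled directory. The function name link_units and both example paths are hypothetical, and no such repo exists yet.

```shell
# Link every *.service file from a version-controlled directory into a
# systemd unit directory. Pass an absolute repo path so the links resolve.
link_units() {
    repo_dir="$1"   # e.g. a git checkout holding the unit files (assumed)
    unit_dir="$2"   # e.g. /etc/systemd/system
    for f in "$repo_dir"/*.service; do
        [ -e "$f" ] || continue   # the glob matched nothing
        ln -sfn "$f" "$unit_dir/$(basename "$f")"
    done
}

# On the real host (as root), followed by a reload so systemd rereads the units:
# link_units /path/to/repo/fb1 /etc/systemd/system && systemctl daemon-reload
```

systemctl link is an alternative that has systemd create the symlink itself; which approach behaves better with enable/disable on these machines would need to be checked.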

 

Attachment 1: Screenshot_2022-02-09_12-11-33.png (33 kB)