ID   Date   Author   Type   Category   Subject
  16691   Tue Mar 1 20:38:49 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

During my investigation, I inadvertently overwrote the serial port configuration for the connected devices, so I am now working to get it all back. I have attached screenshots of the config settings that restored ungarbled communication. There is no physical connection to port 6, which I guess was initially used for the UPS serial communication but is no longer. Also, ports 9 and 10 are connected to Hornet and SuperBee, both of which have not been communicating for a while and are due to be replaced, so there is no way to confirm communication with them. Otherwise, the remaining devices seem to be communicating as before.

I still could not establish communication with the XGS-600 controller using the serial port settings given in the manual, which also happen to work via the Serial-to-USB adapter, so I will revisit the problem later. My immediate plan is to chain Serial to Ethernet, then Ethernet to Serial, and finally Serial to USB, to see if the USB code still works. If it does, then at least I know the problem is not coming from the Serial-to-Ethernet adapters. Then I guess I will replace the controller with my laptop and see what signal comes through when I send a message to the controller via the IOLAN serial device server. Hopefully, I can discover what's wrong by this point.
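As a note for the next test, here is a minimal sketch of how the two paths could be compared once things are rewired: send the same XGS-600 read command ("#0001\r", as in the attached test script) once through the Serial-to-USB adapter and once through the IOLAN SDS TCP socket. The serial device name, baud rate, IOLAN IP and TCP port below are placeholders, not the verified settings.

#!/usr/bin/env python
# Sketch only: compare the USB-serial path with the IOLAN SDS TCP-socket path
# by sending the same XGS-600 read command through both. Device name, baud
# rate, host and TCP port are placeholders.
import socket
import serial  # pyserial

CMD = b'#0001\r'  # "Read XGS contents" command, as in the attached test script

def query_usb(port='/dev/cu.usbserial-1410', baud=9600):
    # send the command through the local Serial-to-USB adapter
    with serial.Serial(port, baudrate=baud, timeout=2) as ser:
        ser.write(CMD)
        return ser.read(64)

def query_iolan(host='192.168.114.22', tcp_port=4001):
    # send the command to the raw TCP socket that the IOLAN SDS maps to its serial port
    with socket.create_connection((host, tcp_port), timeout=2) as sock:
        sock.sendall(CMD)
        return sock.recv(64)

if __name__ == '__main__':
    print('USB-serial response :', query_usb())
    print('IOLAN TCP response  :', query_iolan())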

 

Note to self: Before doing anything, do a sanity check by comparing the settings on the IOLAN SDS with the config settings that worked for the Serial-to-USB communication, and post an elog for reference.

Attachment 1: Working_Serial_Port_List_1.png
Attachment 2: Working_Serial_Port_List_2.png
Attachment 3: Working_Config_Ports#1-5.png
Attachment 4: Working_Config_Ports#7-8.png
  16692   Wed Mar 2 11:50:39 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

Here are the IOLAN SDS TCP socket settings and the USB-serial settings for comparison.

I have also included the Python script and output from the earlier USB-serial test.

Attachment 1: XGS600_IOLAN_settings_1.png
Attachment 2: XGS600_IOLAN_settings_2.png
Attachment 3: XGS600_USBserial_settings.png
Attachment 4: XGS600_comm_test.py
#!/usr/bin/env python

#Created 2/24/22 by Tega Edo
'''Script to read/write to the XGS-600 Gauge Controller'''

import serial
import sys,os,math,time

ser = serial.Serial('/dev/cu.usbserial-1410') # open serial port 

... 74 more lines ...
Attachment 5: XGS600_comm_test_result.txt
----- Multiple Sensor Read Commands -----

Sent to XGS-600 -> #0001\r : Read XGS contents
response : >FE4CFE4CFE4C

Sent to XGS-600 -> #0003\r : Read Setpoint States
response : >0000

Sent to XGS-600 -> #0005\r : Read software revision
response : >0206,0200,0200,0200
... 69 more lines ...
  16693   Wed Mar 2 12:40:08 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

Connector Test:

A quick test to rule out any issue with the Ethernet to Serial adapter was done using the setup shown in Attachment 1. The results rule out any connector problem.

 

IOLAN COMM test (as per Koji's suggestion):

The next step is to swap the controller with a laptop set up to receive serial commands using the same settings as the XGS600 controller. Basically, we run a slightly modified version of the Python script that goes into listening mode, then send commands to the TCP socket on the IOLAN SDS unit from c1vac and check what data makes its way to the laptop USB-serial terminal. After working on this for a bit, I realized that we do not need to do anything special on the c1vac machine; we only need to start the service as it would normally run. So I wrote a small Python script that emulates a basic XGS-600 controller, see Attachment 4. The outputs from the laptop and c1vac terminals are Attachments 5 and 6 respectively.
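For reference, the emulator boils down to something like the sketch below (this is not the attached script; the port name, baud rate and reply values are illustrative, with the reply string following the XGS-600 read format seen in Attachment 5).

#!/usr/bin/env python
# Sketch of the XGS-600 emulator idea: listen on the laptop's USB-serial port
# for whatever c1vac sends through the IOLAN SDS, print it, and reply with a
# fake pressure string in the XGS-600 read format. Port, baud and values are
# illustrative.
import time
import serial  # pyserial

with serial.Serial('/dev/cu.usbserial-1410', baudrate=9600, timeout=1) as ser:
    count = 0
    while True:
        cmd = ser.read_until(b'\r')  # command relayed by the IOLAN unit (may be empty on timeout)
        count += 1
        print('Command received from c1vac [%d] : %r' % (count, cmd))
        # fake 6-gauge readout so the c1vac service has something to parse
        reply = '>%.3E,NOCBL    ,NOCBL    ,NOCBL    ,2.00E+00,NOCBL\r' % float(count)
        ser.write(reply.encode())
        time.sleep(1)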

These results show that we can communicate via the assigned IP address "192.168.114.22" and that the commands sent from c1vac reach the laptop in the correct format. Furthermore, the serial_XGS service, part of the modbusIOC_XGS service, which usually exits with an error, seems fine now after successfully communicating with the laptop. I don't know why it did not die after the tests. I also found a bug in my code as a result of the test: the status field for the fourth gauge didn't get written to.

 

Pressure reading issue:

I noticed that the pressure reading was not giving the atmospheric value of ~760 Torr as expected. Looking through my previous readouts, it seems the unit showed this atmospheric value of ~761 Torr when the first gauge was attached. However, a closer look at the issue revealed a transient behavior, i.e. when the unit is turned on the reading dips to the atmospheric value but eventually rises up to 1000 Torr. I don't think this is a calibration problem because 1000 Torr is the maximum value of the gauge range. I also found out that when the XGS-600 controller has been running for a while, a power cycle does not show this transient behavior. So maybe a faulty capacitor somewhere? I have attached a short video clip that shows what happens when the XGS-600 controller unit is turned on.

Attachment 1: IMG_20220302_123529382.jpg
Attachment 2: XGS600_Serial2Ethernet2Serial2USB_comm_test_result.txt
$ python3 XGS600_comm_test.py  

----- Multiple Sensor Read Commands -----

Sent to XGS-600 -> #0001\r : Read XGS contents
response : >FE4CFE4CFE4C

Sent to XGS-600 -> #0003\r : Read Setpoint States
response : >0000

... 73 more lines ...
Attachment 3: VID-20220302-WA0001.mp4
Attachment 4: comm_test_c1vac_to_laptop_via_iolansds.py
#!/usr/bin/env python

#Created 3/2/22 by Tega Edo
'''Script to emulate XGS-600 controller using laptop USBserial port'''

import serial
import sys,os,math,time

ser = serial.Serial('/dev/cu.usbserial-1410') # open serial port 

... 19 more lines ...
Attachment 5: laptop_terminal.txt
(base) tega.edo@Tegas-MBP serial % python3 comm_test_c1vac_to_laptop_via_iolansds.py

----- Listen for USBserial command and asynchronously send data in XGS600 format -----

Command received from c1vac [1] : 

Data sent to c1vac [1] : >1.000E+00,NOCBL    ,NOCBL    ,NOCBL    ,2.00E+00,NOCBL\r
Command received from c1vac [2] : 

Data sent to c1vac [2] : >2.000E+00,NOCBL    ,NOCBL    ,NOCBL    ,3.00E+00,NOCBL\r
... 54 more lines ...
Attachment 6: c1vac_terminal.txt
controls@c1vac:/opt/target/python/serial$ caget C1:Vac-FRG1_status && caget C1:Vac-FRG2_status && caget C1:Vac-FRG3_status && caget C1:Vac-FRG4_status && caget C1:Vac-FRG5_status
C1:Vac-FRG1_status             1.530E+02
C1:Vac-FRG2_status             OFF
C1:Vac-FRG3_status             OFF
C1:Vac-FRG4_status             NO COMM
C1:Vac-FRG5_status             1.55E+02
controls@c1vac:/opt/target/python/serial$ caget C1:Vac-FRG1_status && caget C1:Vac-FRG2_status && caget C1:Vac-FRG3_status && caget C1:Vac-FRG4_status && caget C1:Vac-FRG5_status
C1:Vac-FRG1_status             1.630E+02
C1:Vac-FRG2_status             OFF
C1:Vac-FRG3_status             OFF
... 70 more lines ...
  16704   Sun Mar 6 18:14:45 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

Following repeated failure to establish communication between c1vac and the XGS600 controller via the Perle IOLAN serial device server, I decided to monitor the signal voltage of the communication channels (pin#2, pin#3 and pin#5) using an oscilloscope. The result of this investigation is presented in the attached pdf document. In summary, it seems I have used a cross-wired RS232 serial cable instead of a normal RS232 serial cable, so the c1vac read request command is being relayed on the wrong comm channel (pin#2 instead of pin#3). I will swap out the cable to see if this resolves the problem.

Attachment 1: iolan_xgs_comm_investigation.pdf
  16706   Mon Mar 7 13:53:40 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

So it appears that my deduction from the pictures that a cable swap was needed was correct; however, it turns out that the installed cable was actually the normal RS232 cable and what we need instead is an RS232 null-modem cable. After the swap was done, the communication between c1vac and the XGS600 controller became active. Although the data all makes it to c1vac without any issues, the scope view of it shows that it mainly uses the upper half of the voltage range, which is just over 50% of the available range. I don't know what to make of this.

 

I guess the only remaining issue now is the incorrect atmospheric pressure reading of 1000 Torr.

 

Quote:

Following repeated failure to establish communication between c1vac and the XGS600 controller via Perle IOLAN serial device server, I decided to monitor the signal voltage of the communication channels (pin#2, pin#3 and pin#5) using an oscilloscope. The result of this investigation is presented in the attached pdf document. In summary, it seems I have used a crossed wired RS232 serial cable instead of a normal RS232 serial cable, so the c1vac read request command is being relayed on the wrong comm channel (pin#2 instead of pin#3). I will swap out the cable to see if this resolves the problem.  

 

Attachment 1: iolan_xgs_comm_live.pdf
  16713   Tue Mar 8 12:08:47 2022   Tega   Update   VAC   Ongoing work to get the FRG gauge readouts to EPICS channels

OMG, it worked! It was indeed a calibration issue and all I had to do was press the "OK" button after selecting the "CAL" tab beside the pressure reading. Wow.

Quote:

Great trouble shoot!

> I guess, the only remaining issue now is the incorrect atmospheric pressure reading of 1000 Torrs. 

This is just a calibration issue. The controller should have the calibration function.
(The other Pirani showing 850Torr was also a calibration issue although I didn't bother to correct it. I think the pirani's typically has large distribution of the calibration values and requires individual calibration)

 

Attachment 1: XGS600_calibration.pdf
  16747   Tue Mar 29 15:07:11 2022   Tega   Summary   VAC   Opening of MC and ETM chambers

[Anchal, Chub, Ian, Paco, Tega]

Today, we opened the MC chamber, ETMX chamber and ETMY chamber.

  16751   Fri Apr 1 14:26:50 2022   Tega   Update   Optical Levers   Simplified sketch on MC table

Here is an early sketch of the MC table. 

 

Quote:

I have made an editable draw.io diagram of the planned simplified BHD setup on the ITMY table (see attached).  10 pts = 1 inch.

This is very sketchy but easily adjustable since we are removing everything but the ITMY Oplev from that table.

 

Attachment 1: MC_Table.drawio.pdf
  16755   Mon Apr 4 15:49:06 2022   Tega   Update   Optical Levers   Simplified sketch on BS table

I have updated the BS table using feedback from Koji and Paco and the attached pdf document is the latest iteration.

Attachment 1: BS_Table.drawio.pdf
  16773   Wed Apr 13 12:50:07 2022   Tega   Update   VAC   New Pressure Gauge Install/Pump Spool Vent

[Jordan, JC, Tega]

We have installed all the FRGs and updated the VAC MEDM screens to display their sensor readings. The replacement map is CC# -> FRG#, where # in [1..4], and PRP1 -> FRG5. We now need to clean up the c1vac Python code so that it is not overloaded with non-functioning gauges (CC1, CC2, CC3, CC4, PRP1). We also need to remove the connection cables for the old, replaced gauges.
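For the code cleanup, the replacement map amounts to something like this sketch; the channel name pattern is illustrative and should be checked against the vac repo before use.

# Sketch of the gauge replacement map described above (illustrative channel pattern)
GAUGE_MAP = {'CC1': 'FRG1', 'CC2': 'FRG2', 'CC3': 'FRG3', 'CC4': 'FRG4', 'PRP1': 'FRG5'}

def to_frg_channel(channel):
    """Map e.g. 'C1:Vac-CC1_pressure' -> 'C1:Vac-FRG1_pressure'."""
    for old, new in GAUGE_MAP.items():
        if old in channel:
            return channel.replace(old, new)
    return channel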

Attachment 1: VAC_FRG_UPGRADE.png
  16796   Thu Apr 21 16:36:56 2022   Tega   Update   VAC   cleanup work for vacuum git repo

git repo - https://git.ligo.org/40m/vac

Finally incorporated the FRGs into the main modbusIOC service and everything seems to be working fine. I have also removed the old sensors (CC1, CC2, CC3, CC4, PTP1, IG1) from the serial client list and their corresponding EPICS channels. Furthermore, the interlock service Python script has been updated so that all occurrences of the old sensors (which turns out to be only CC1) are replaced by the corresponding new FRG sensor (FRG1), and a redundancy was also added for P1a, where the interlock condition is replicated with P1a replaced by FRG1, because they both sense the main volume pressure.
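Conceptually, the added redundancy is of this form (a sketch only; the real check lives in the interlock service in the repo above, and the channel names and threshold here are illustrative):

# Sketch of the redundant main-volume condition described above; the actual
# implementation is in the interlock service. Channel names and the threshold
# are illustrative.
from epics import caget  # pyepics

MAIN_VOLUME_TRIP = 3e-3  # Torr, placeholder threshold

def main_volume_over_pressure():
    """Trip if either main-volume gauge (P1a or its FRG1 counterpart) reads high."""
    readings = [caget('C1:Vac-P1a_pressure'), caget('C1:Vac-FRG1_pressure')]
    return any(p is not None and p > MAIN_VOLUME_TRIP for p in readings)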

  16800   Thu Apr 21 18:42:47 2022   Tega   Update   BHD   Part V of BHR upgrade - Align LO and AS beams to BHD BS

[Tega, Yuta, Paco]

We tried aligning the LO and AS beams onto the BHD beamsplitter. During the alignment process, we noticed that the damping loop for AS1 was not working. Paco drew our attention to the fact that the UR OSEM signal was always close to zero, so we checked to ensure that the magnet was still within the OSEM recess, and it looks OK. Next we checked the electrical connection at the interface between the copper OSEM cables and the blue in-vacuum flat cable, and this also looks alright. Since the AS1 coil driver was recently modified, it is possible we might find the problem there, so I'll ask Koji about this.


So Koji clarified that the coil driver board and the SAT AMP board are different, so we should connect this issue to the coil driver board.

  16803   Fri Apr 22 12:03:09 2022   Tega   Update   BHD   AS1 UR OSEM problem localized in the chamber

[JC, Tega, Ian, Paco]

We found that the UR cable was clamped to the table by one of the ITMY OPLEV steering mirror mounts that was recently installed. After freeing the cable, the UR signal is now active again.

Quote:

This indicates that the AS1 UR OSEM problem is localized in the chamber. Please check if the DSUB pins are touching the table or something else.

 

  16830   Wed May 4 15:08:59 2022   Tega   Update   BHD   XARM alignment to get flashing

[Yuta, Tega]

We aligned the BS, ITMY, and ETMY PIT and YAW to get flashing on the X-arm whilst also keeping the flashing of the Y-arm. From Attachment 1, it is clear that the POXDC photodiode is not receiving any light, so our next task is to work on the POX alignment.

 

Quote:
  • Align X-arm cavity and regain flashing.
  • Fix the Oplev path for ITMX.
  • Tune POX11 phase angle to get an error signal with which we can lock the cavity.
  • Finish AS beam path setup.

 

Attachment 1: IFO_XY_ARMS_Flashing.png
  16831   Wed May 4 18:52:43 2022   Tega   Update   BHD   POP beam too high at POP SM4 on the ITMX chamber

[Yuta, Tega]

We needed to sort out the POXDC signal so we could work on X-arm alignment. Given that the POXDC channel value was approx. 6 compared to the POYDC value of approx. 180, we decided to open the ITMX chamber to see if we could improve the situation. We worked on the alignment of the POX beam but could not improve the DC level, which suggests that it was already optimized. As an aside, we also noticed a stray IR beam from the BS chamber, just above the POX beam, which we could not identify.

Next we moved on to the POP beam alignment, where we noticed that the beam height on LO1 and POP_SM4 was a bit on the high side. Basically, the beam was completely missing the 1" POP_SM4 mirror and was close to the top edge of LO1. So we changed the TT2 pitch value from 0.0143 to -0.2357 in order to move the beam position on the POP_SM4 mirror. This changed the input alignment, so we compensated using PR2 (0.0 -> 49.0) and PR3 (-5976.560 -> -5689.800). This did not recover the alignment as anticipated, so we moved ITMY pitch from 0.9297 to 0.9107. All of these alignment changes moved the POP beam down by approx. 1/5 of an inch, from outside the mirror to the edge of the POP_SM4 mirror, where about half of the beam is still clipped.

 

Next Steps:

We need to repeat these alignment procedures with, say, 1.5 times the change in TT2 pitch to center the beam on the POP_SM4 mirror.

Attachment 1: snnp_pr2_pr3_itmy_aligment.png
  16833   Thu May 5 17:05:31 2022   Tega   Update   BHD   IMC & X/Y-arm alignment

[Yuta, Tega]

In order to setup POP camera and RFPD on the ITMX table, we decided to first work on the IMC and X/Y-arm alignment.

 

IMC alignment:

We zeroed IMC WFS outputs and aligned IMC manually to get IMC transmission of 1200 and reflection of 0.35.

 

Y-arm alignment:

We used the new video game tool that moves the pairs of mirrors - PR3 & ETMY, ITMY & ETMY - in common and differential modes. This brought the Y-arm flashing to 0.8. Note that we used the _OFFSET bias values for PR3 & ETMY alignment instead of the _COMM bias values.
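For the record, the common/differential moves amount to something like the sketch below (using pyepics; the channel suffix, step size and sign convention are illustrative, not the tool's actual implementation).

# Sketch of a common/differential move of a mirror pair, in the spirit of the
# alignment tool above. Channel suffix, step size and signs are illustrative.
from epics import caget, caput

def move_pair(optic1, optic2, dof='PIT', step=0.01, mode='common'):
    sign2 = +1 if mode == 'common' else -1   # differential: move the pair in opposite directions
    for optic, s in ((optic1, +1), (optic2, sign2)):
        chan = 'C1:SUS-%s_%s_OFFSET' % (optic, dof)
        caput(chan, caget(chan) + s * step)

# e.g. move_pair('ITMY', 'ETMY', dof='PIT', step=0.01, mode='differential')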

 

X-arm alignment:

We repeated the same procedure of moving the pairs of mirrors - BS & ETMX, ITMX & ETMX - in common and differential modes but manually this time. This brought the X-arm flashing to ~1.0.

Attachment 1: IFO_aligment_Y_locking_on.png
  16838   Mon May 9 18:49:05 2022   Tega   Update   BHD   Relocate green TRX and TRY PDs/cams/optics from PSL table to BS table

[Paco, Tega]

Started work on relocating the green transmission optics, cameras and PDs. Before removing any of the optics, we checked and confirmed that the PDs and cams are indeed connected to the GRN TRX/Y MEDM channels. We then added labels to the cables before moving them.

Plumbing:

  • Moved all power and signal cables for the PDs and cameras from PSL table to BS table. See attachment #1

Relocated Optics & PDs & Cameras:

  • TRX and TRY cameras
  • TRX and TRY PDs
  • 1 BS, 2 lenses for the PDs, and a steering mirror, see Attachment #2

 

Attachment 1: IMG_20220509_184439943.jpg
Attachment 2: IMG_20220509_184154722.jpg
  16843   Tue May 10 16:31:14 2022   Tega   Update   BHD   ITMX optlev return beam steered to QPD

Followed the steps below to complete the ITMX optlev installation. The ITMX optlev return beam now reaches its QPD without being blocked by the input steering mirror.

Although I centered the ITMX optlev readout, this was not done with the XARM flashing maximized because the IMC chamber was being worked on, so this should be redone later when the IR beam is back.

Quote:

Next steps:

  • The Oplev beam paths need to be adjusted.
    • The ongoing beam steering mirror is blocking the returning beam, so the ongoing path needs to be changed.
    • First setup two irises to save ingoing path.
    • Then make space for the returning beam by changing the steering mirror positions.
    • Then recover the ingoing path to the center of irises.
    • Steer the returning beam to the QPD.
    • Then maximize the flashing on XARM and center the oplev to save this position.
  16848   Thu May 12 19:55:01 2022   Tega   Update   BHD   AS path alignment

[Yuta, Tega]

We finally managed to steer the AS beam from ITMY chamber, through BS and IMC chambers, to the in-air AP table.

We moved the AS5 mirror north to its nominal position and we also moved the ASL lens on BS chamber back to its nominal position. Attached photos are taken after today's alignment work.

Attachment 1: AS.JPG
Attachment 2: OMCchamber.JPG
  16867   Fri May 20 12:42:23 2022   Tega   Update   VAC   Door installation on the end stations

[JC, Tega, Chub]

Today we installed the 200 lb doors on the end station chambers.

  16870   Tue May 24 10:37:09 2022   Tega   Update   VAC   added FRG channels to slow channel ini file

[Vacuum gauge sensors]

Paco informed me that the FRG sensor EPICS channels are not available in dataviewer, so I added them to the slow channels ini file (/opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini). I also commented out the old CC1, CC2, CC3 and CC4 gauges. A service restart is required for them to become available, but this cannot be done right now because it would adversely affect the progress of the upgrade work, so it will be done at a later date.
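The edit amounts to entries of roughly this form (the channel names are assumed from the FRG naming used above; the exact names, comment style and any per-channel fields should be checked against the rest of C0EDCU.ini rather than copied from here):

# new FRG slow channels (names assumed, check against the modbusIOC database)
[C1:Vac-FRG1_pressure]
[C1:Vac-FRG2_pressure]
[C1:Vac-FRG3_pressure]
[C1:Vac-FRG4_pressure]
[C1:Vac-FRG5_pressure]
# old gauges commented out
#[C1:Vac-CC1_pressure]
#[C1:Vac-CC2_pressure]
#[C1:Vac-CC3_pressure]
#[C1:Vac-CC4_pressure]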

Quote:

git repo - https://git.ligo.org/40m/vac

Finally incorporated the FRGs into the main modbusIOC service and everything seems to be working fine. I have also removed the old sensors (CC1,CC2,CC3,CC4,PTP1,IG1) from the serial client list and their corresponding EPICS channels. Furthermore, the interlock service python script has been updated so that all occurrence of old sensors (turns out to be only CC1) were replaced by their corresponding new FRG sensor (FRG1) and a redundnacy was also enacted for P1a where the interlock condition is replicated with P1a being replaced with FRG1 because they both sense the main volume pressure.

 

  16973   Wed Jul 6 15:28:18 2022   Tega   Update   SUS   Output matrix diagonalisation : F2P coil balancing

git repo : https://git.ligo.org/40m/scripts/-/tree/main/SUS/OutMatCalc

local dir:  /opt/rtcds/caltech/c1/Git/40m/scripts/SUS/OutMatCalc

 

Here is an update on our recent attempt at diagonalization of the SUS output matrices. There are two different parts to this: the first is coil balancing using the existing F2P code, which had stopped working because of an old-style (Python 2) use of the print statement; the second, I believe, should now focus on the mixing amongst the various degrees of freedom (dof) without a DC/AC split. The F2P code is now working and has been consolidated in the git repo.
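For the record, the F2P breakage was the usual Python 2 to Python 3 issue; the fix is of this form (illustrative lines with made-up values, not the actual F2P code):

# Old Python 2 style print statement that breaks under Python 3:
#     print "F2P coil balancing gains:", gains
# Python 3 compatible form used in the fix:
gains = {'UL': 1.02, 'UR': 0.98, 'LL': 1.01, 'LR': 0.99}  # made-up example values
print("F2P coil balancing gains:", gains)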

 

TODO:

  • The remaining task is to make it so that we only call a single file that combines the characterization code and filter generation code, preferably with the addition of a safety feature that restores any changed values in case of an error or interruption from the user. The safety functionality is already implemented in the output matrix diagonalization stem of the code, so we just need to copy this over.  
  • Improve the error minimization algorithm for minimizing the cross-coupling between the various dof by adjusting the elements of the output matrix. 

 

Previous work 

https://nodus.ligo.caltech.edu:8081/40m/4762

https://nodus.ligo.caltech.edu:8081/40m/4719

https://nodus.ligo.caltech.edu:8081/40m/4688

https://nodus.ligo.caltech.edu:8081/40m/4682

https://nodus.ligo.caltech.edu:8081/40m/4673

https://nodus.ligo.caltech.edu:8081/40m/4327

https://nodus.ligo.caltech.edu:8081/40m/4326


  16976   Wed Jul 6 22:40:03 2022   Tega   Summary   CDS   Use OSEM variance to turn off SUS damping instead of coil outputs

I updated the database files for the 7 BHD optics to separate the OSEM variance trigger and the LATCH_OFF trigger operations, so that an OSEM variance value exceeding the maximum of, say, 200 counts turns off the damping loop, whereas pressing the LATCH_OFF button cuts power to the coils. I restarted the modbusIOC service on c1susaux2 and checked that the new functionality is behaving as expected. So far so good.
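Conceptually, the split now works like the sketch below. The real logic is implemented in the c1susaux2 EPICS database records, not a Python loop, and the channel names and damping switch here are hypothetical; only the 200-count threshold comes from the change above.

# Conceptual sketch only -- the actual implementation is in the c1susaux2
# EPICS database. Channel names below are hypothetical.
from epics import caget, caput

VARIANCE_MAX = 200  # counts

def check_osem_variance(optic):
    """If any OSEM variance exceeds the threshold, turn off the damping loop
    only; cutting power to the coils is reserved for the LATCH_OFF button."""
    for osem in ('UL', 'UR', 'LL', 'LR', 'SD'):
        var = caget('C1:SUS-%s_%s_VAR' % (optic, osem))
        if var is not None and var > VARIANCE_MAX:
            caput('C1:SUS-%s_DAMPING_ENABLE' % optic, 0)  # hypothetical damping switch
            break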

 

TODO

Figure out the next layer of watchdogging needed for the BHD optics.  

 

Quote:

[Anchal, JC, Ian, Paco]

We have now fixed all issues with the PD mons of c1susaux2 chassis. The slow channels are now reading same values as the fast channels and there is no arbitrary offset. The binary channels are all working now except for LO2 UL which keeps showing ENABLE OFF. This was an issue earlier on LO1 UR and it magically disappeared and now is on LO2. I think the optical isolators aren't very robust. But anyways, now our watchdog system is fully functional for all BHD suspended optics.

 

  16979   Thu Jul 7 21:25:48 2022   Tega   Summary   CDS   Use OSEM variance to turn off SUS damping instead of coil outputs

[Anchal, Tega]

Implemented a ramp-down of the coil bias voltage when the BHD optics watchdog is tripped. Also added a watchdog reset button to the SUS MEDM screen that turns on damping and ramps the coil PIT/YAW bias voltages back up to their nominal values. I believe this concludes the watchdog work.

Quote:

TODO

Figure out the next layer of watchdogging needed for the BHD optics.  

 

  16986   Mon Jul 11 17:25:43 2022   Tega   Update   VAC   fixed obsolete reference bug in serial_XGS600 service

Koji noticed that the FRG sensors were not updating due to a reference to an obsolete modbusIOC_XGS service, which was used temporarily to test the operation of the serial XGS sensor readout to EPICS. The information in this service was later moved into modbusIOC.service, but the dependency on modbusIOC_XGS.service was not removed from serial_XGS600.service. This did not present any issue before the shutdown, probably because the obsolete service was already loaded, but after the restart of c1vac the obsolete service file modbusIOC_XGS.service was no longer available. This resulted in serial_XGS600.service throwing a failure-to-load error for the missing obsolete modbusIOC_XGS service. The fix involved replacing two references to 'modbusIOC_XGS' with 'modbusIOC' in /opt/target/services/serial_XGS600.service.
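The change in /opt/target/services/serial_XGS600.service is of this form (the directive names shown are the usual systemd ones; the actual unit file may use a different pair of directives):

# Before: unit depends on the obsolete service
#   Requires=modbusIOC_XGS.service
#   After=modbusIOC_XGS.service
# After: point both references at the consolidated unit
Requires=modbusIOC.service
After=modbusIOC.service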

I also noticed that the date logged in the commit message was Oct 2010 and that I could not do a push from c1vac due to an error in resolving git.ligo.org. I was able to push the commit from my laptop git repo but was unable to do a pull on c1vac to keep it synced with the remote repo.

  17025   Thu Jul 21 21:50:47 2022   Tega   Configuration   BHD   c1sus2 IPC update

IPC issue still unresolved.

Updated the shared memory tag so that 'SUS' -> 'SU2' in c1hpc, c1bac and c1su2. Removed obsolete 'HPC/BAC-SUS' references from the IPC file, C1.ipc. Restarted the FE models, but the c1sus2 machine froze, so I did a manual reboot. This brought down the vertex machines---which I restarted using /opt/rtcds/caltech/c1/scripts/cds/rebootC1LSC.sh---and the end machines, which I restarted manually. Everything but the BHD optics now has its previous values, so we need to burt-restore these.
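For context, C1.ipc entries are RCG-generated blocks of roughly the form below, so the cleanup was a matter of deleting the stale blocks that still carried the old 'SUS' naming. The channel name and field values here are purely illustrative and from memory of the generated format, not copied from the actual file.

[C1:HPC-EXAMPLE_SIGNAL]
ipcType=SHMEM
ipcRate=16384
ipcHost=c1sus2
ipcNum=0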
 

# IPC file:
/opt/rtcds/caltech/c1/chans/ipc/C1.ipc

# Model file locations:
/opt/rtcds/userapps/release/isc/c1/models/isc/c1hpc.mdl
/opt/rtcds/userapps/release/sus/c1/models/c1su2.mdl
/opt/rtcds/userapps/release/isc/c1/models/isc/c1bac.mdl

# Log files:
/cvs/cds/rtcds/caltech/c1/rtbuild/3.4/c1hpc.log
/cvs/cds/rtcds/caltech/c1/rtbuild/3.4/c1su2.log
/cvs/cds/rtcds/caltech/c1/rtbuild/3.4/c1bac.log


SUS overview medm screen :

  • Reduced the entire screen width
  • Reverted to the old screen-style watchdog layout
  17026   Fri Jul 22 15:05:26 2022   Tega   Configuration   BHD   c1sus2 shared memory and ADC fix

[Tega, Yuta]

We were able to fix the shared memory issue by updating the receiver model name from 'SUS' to 'SU2', and the ADC zero issue by including both ADC0 and ADC1 in the c1hpc and c1bac models as well as removing the grounding of the unused ADC channels (including chn#16 and chn#17, which are actually used in c1hpc) in c1su2. We also used shared memory to move the DCPD_A/B error signals (after signal conditioning and mixing A/B; now named A_ERR and B_ERR) from c1hpc to c1bac.
C1:HPC-DCPD_A_IN1 and C1:HPC-DCPD_B_IN1 are now available (they are essentially the same as C1:LSC-DCPD_A_IN1 and C1:LSC-DCPD_B_IN1, except that they are digitized by a different ADC; see elog 40m/16954 and Attachment #1).
The Dolphin IPC error in sending the signal from c1hpc to c1lsc still remains.

Attachment 1: Screenshot_2022-07-22_15-04-33_DCPD.png
Attachment 2: Screenshot_2022-07-22_15-12-19_models.png
Attachment 3: Screenshot_2022-07-22_15-15-11_ERR.png
Attachment 4: Screenshot_2022-07-22_15-32-19_GDS.png
  17033   Mon Jul 25 17:58:10 2022   Tega   Configuration   BHD   c1sus2 IPC dolphin issue update

From the 40m wiki, I was able to use the instructions here to map out what to do to get the IPC issue resolved. Here is a summary of my findings.

I updated the /etc/dis/dishost.conf file on the frame builder machine to include the c1sus2 machine, which runs the sender model, c1hpc (see below). After this, the file became available on the c1sus2 machine (see Attachment 1) and the c1sus2 node shows up in the dxadmin GUI (see Attachment 2). However, the c1sus2 machine was not active. I noticed that the log file for the dis_nodemgr service (see Attachment 3), which is responsible for setting things up, indicated that the dis_irm service may not be up, so I checked and confirmed that this was indeed the case (see Attachment 4). I tried restarting this service but was unsuccessful. I restarted the machine, but this did not help either. I have reached out to Jonathan Hanks for assistance.

Attachment 1: Screen_Shot_2022-07-25_at_5.43.28_PM.png
Attachment 2: Screen_Shot_2022-07-25_at_5.21.10_PM.png
Attachment 3: Screen_Shot_2022-07-25_at_5.30.58_PM.png
Attachment 4: Screen_Shot_2022-07-25_at_5.35.19_PM.png
  17034   Mon Jul 25 18:09:41 2022   Tega   Configuration   BHD   BHD Homodyne Phase control MEDM screen

[Paco, Tega, Yuta]

Today, we made a custom MEDM screen for the BHD Homodyne Phase Control, which is basically an overview of the c1hpc model. See Attachments 1 & 2 for details.

Attachment 1: Screen_Shot_2022-07-25_at_6.12.08_PM.png
Attachment 2: Screen_Shot_2022-07-25_at_6.18.09_PM.png
  17044   Thu Jul 28 16:51:55 2022   Tega   Update   BHD   Shaking test for LO beam and AS beam to BHD DCPDs

[Yuta, Tega, Yehonathan]

To investigate the BHD power imbalance and clipping issues, we did some shaking tests of the mirrors in the LO and AS paths. The results suggest the following:

  • The clipping is happening after the BHD BS in the DCPD_A path, as opposed to our initial guess of BHD BS transmission clipping in elog 17040.
  • The LO beam we are seeing is probably a ghost beam from PR2

We performed both PIT and YAW shaking of all mirrors and looked at the output at DCPD_A and DCPD_B; see the tables below for details. Since we only see the dithering signal in DCPD_A, this suggests that the clipping is occurring after the BHD BS and is also confined to the path between the BHD BS and DCPD_A. We also swapped the camera location from DCPD_A to DCPD_B on the ITMY table and confirmed that the beam was clipped for DCPD_A but not for DCPD_B.

This result discounts the possibility of clipping being responsible for the power imbalance and therefore suggests that the power imbalance may actually be due to the BHD BS not being 50:50. From the measurement in elog 17040, the transmission of the BHD BS is 44 ± 0.3% and the reflectivity is 56 ± 0.3%. Note that DCPD_A is the transmission of the BHD BS for the AS beam, whereas DCPD_B is the reflection of the BHD BS for the AS beam (elog 16932).

We expect the shaking of PR2 to give no signal in either DCPD_A or DCPD_B when the LO beam is purely in transmission; however, we see a signal in DCPD_A, suggesting that the LO beam transit path through PR2 may not be as expected, i.e. the beam might be exiting the side of PR2 instead of the AR-coated surface.

Finally, we measured the coherence between the dithering dof and DCPD_A/B & POP (see Attachment 2), where we noticed that both DCPD_A/B have high coherence in the 1 Hz-10 Hz frequency band whereas there was no coherence in POP, as expected. This suggests that there may also be some small clipping in the DCPD_B path.

 

LO Beam Shaking (LO1, LO2, PR2):

Color    OPTIC  DOF  Freq   OSC Amp (cnts)  Comment
Black                       0               Reference
Blue     LO1    YAW  304.4  2000            Signal in DCPD_A & No signal in DCPD_B
Orange   LO1    PIT  304.4  2000            No signal
Magenta  LO2    YAW  312.2  10000           No signal
Purple   LO2    PIT  312.2  10000           Signal in DCPD_A & No signal in DCPD_B
Green    PR2    YAW  308.8  20000           Signal in DCPD_A & No signal in DCPD_B
Red      PR2    PIT  308.8  20000           Signal in DCPD_A & No signal in DCPD_B

 

AS Beam Shaking (AS1 and AS4)

Color    OPTIC  DOF  Freq   OSC Amp (cnts)  Comment
Black                       0               Reference
Blue     AS1    YAW  305.5  2000            Signal in DCPD_A & No signal in DCPD_B
Orange   AS1    PIT  305.5  2000            Signal in DCPD_A & No signal in DCPD_B
Magenta  AS4    YAW  313.3  2000            Signal in DCPD_A & No signal in DCPD_B
Purple   AS4    PIT  313.3  2000            No signal

 

 

Attachment 1: Screenshot_2022-07-28_16-58-34_LOandASShaking.png
Attachment 2: Screenshot_2022-07-28_18-08-55_DCPDPOPSuspensionCoherence.png
  17046   Fri Jul 29 18:24:53 2022   Tega   Update   BHD   LO beam power improved by factor of 6 after LO and AS beam alignment

[Yuta, Tega]

From our previous work (elog 17044) of shaking PR2 and seeing a signal in DCPD_A, and the fact that the LO beam power was far smaller than the expected nominal value, we decided to use TT1 and TT2 to realign the LO beam. This resulted in the LO beam power going up by a factor of 6 and an improvement in the LO beam shape. We are still unable to find an LO and AS alignment that realizes the BHD fringe with no clipping anywhere.

Deformed LO beam issue: Following the TT1 and TT2 alignments, we used PR2 and PR3 to recover the transmission of the X and Y arms to 1. We also used LO1 and LO2 offsets to further reduce the beam deformation, by eliminating the concentric HOM fringes that surrounded the LO beam, and to maximize the DCPD outputs. The BHD optics on the ITMY table were tweaked a lot to keep the LO beam centered on the BHD DCPDs and camera. The improved LO beam is still astigmatic in the yaw direction but at least now looks like a TEM00 mode. We also repositioned the DCPD_A path camera lens to remove the circular diffuse fringes due to lens clipping. After the alignment, the powers were measured to be the following. The alignment also reduced the coherence between the DCPD outputs and suspension motions (see attached).

                       LO         ITMX
C1:HPC-DCPD_A_OUT16    +127.50    +96.24 (ITMX single bounce consistent to 40m/17040)
C1:HPC-DCPD_B_OUT16    +120.51    +141.52
Power at viewport A    504 uW (almost as expected 40m/17040)
Power at viewport B    385 uW

AP table AS beam clipping: We also noticed clipping of the AS beam on the AP table, which we removed by moving SR2 and AS1 in YAW; we then used AS4 to keep the BHD AS beam centered on the BHD DCPDs.

BHD fringe: After overlapping the LO and AS beams, we saw diagonal fringes indicating a tilt of the LO beam with respect to the AS beam, so we tried to remove the AS beam tilt using AS1 and AS4 but failed to do so because the AS4 mirror seemed to completely distort the beam. So instead we decided to use SR2 and AS1 to remove the tilt between the LO and AS beams, which realized the BHD fringe. But the motion of SR2 and AS1 then moved the AS beam so much that it is no longer seen on the AP table. The alignment settings that realize the LO and AP AS beams without clipping, and those that realize the BHD fringe, are attached.

Attachment 1: LO_and_oldAS_settings.png
Attachment 2: BHD_fringe_settings.png
Attachment 3: Screenshot_2022-07-29_19-37-39.png
Attachment 4: FromTheLeft-AS-POP-LO.JPG
  17052   Mon Aug 1 18:42:39 2022   Tega   Configuration   BHD   c1sus2 IPC dolphin issue update

[Yuta, Tega]

We decided to give the dolphin debugging another go. Firstly, we noticed that c1sus2 was no longer recognising the dolphin card, which can be checked using

lspci | grep Stargen

or looking at the status light on the dolphin card of c1sus2, which was orange for both ports A and B.

We decided to do a hard reboot of c1sus2 and turned off the DAQ chassis for a few minutes, then restarted c1sus2. This solved the card recognition problem as well as the 'dis_irm' driver loading issue (I think the driver does not get loaded if the system does not recognise a valid card, as I also saw the missing dis_irm driver module on c1teststand).

Next, we confirmed the status of all dolphin cards on fb1, using

controls@fb1$ /opt/DIS/sbin/dxadmin

It looks like the dolphin card on c1sus2 has now been configured and is available to all other nodes. We then restarted all the FE machines and models to see if we were in the clear. Unfortunately, we are not so lucky, since the problem persisted.

Looking at the output of 'dmesg', we could only identify two notable differences between the operational dolphin cards on c1sus/c1ioo/c1lsc and c1sus2, namely the card number being equal to zero and the memory addresses, which are also zero; see image below.

Anyway, at least we can now eliminate driver issues and will move on to debugging the models next.

Attachment 1: c1sus2_dolphin.png
Attachment 2: fb1_dxamin_status.png
Attachment 3: dolphin_num_mem_init2.png
  17054   Tue Aug 2 17:25:18 2022   Tega   Configuration   BHD   c1sus2 dolphin IPC issue solved

[Yuta, Tega, Chris]

We did it!

Following Chris's suggestion, we added "pciRfm=1" to the CDS parameter block in c1x07.mdl - the IOP model for c1sus2. We then restarted the FE machines, and this solved the dolphin IPC problem on c1sus2. We no longer see the RT Netstat error for 'C1:HPC-LSC_DCPD_A' and 'C1:HPC-LSC_DCPD_B' on the LSC IPC status page, see Attachment 1.
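For context, the CDS parameter block of an IOP model is a short list of key=value lines, and the fix was simply adding the pciRfm entry. The other lines below are illustrative placeholders (dcuid=XX is not the real value), not the actual c1x07 block:

site=c1
rate=64K
dcuid=XX
adcMaster=1
pciRfm=1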

Attachment 2 shows the module dependencies before and after the change was made, which confirms that the IOP model was not using the dolphin driver before the change.


We encountered a burt restore problem with missing snapfiles from yesterday when we tried restoring the EPICS values after restarting the FE machines. Koji helped us debug the problem, but the summary is that restarting the FE models somehow fixed the issue.

Log files:
/opt/rtcds/caltech/c1/burt/burtcron.log
/opt/rtcds/caltech/c1/burt/autoburt/autoburtlog.log
 
Request File list:
/opt/rtcds/caltech/c1/burt/autoburt/requestfilelist
 
Snap files location:
/opt/rtcds/caltech/c1/burt/autoburt/today
/opt/rtcds/caltech/c1/burt/autoburt/snapshots
 
Autoburt crontab on megatron:
19 * * * * /opt/rtcds/caltech/c1/scripts/autoburt/autoburt.cron > /opt/rtcds/caltech/c1/burt/burtcron.log 2>&1
Attachment 1: c1lsc_IPC_status.png
Attachment 2: FE_lsmod_dependencies_c1sus2_b4_after_iop_unpdate.png
  17058   Thu Aug 4 19:01:59 2022   Tega   Update   Computers   Front-end machine in supermicro boxes

Koji and JC looked around the lab today and found some Supermicro boxes, which I was told to look into to see if they contain any useful computers.

 

Boxes next to Y-arm cabinets (3 boxes: one empty)

We were expecting to see a smaller machine in the first box - like the top machine in Attachment 1 - but it turns out to actually contain the front-end we need, see the bottom machine in Attachment 1. This is the same machine as c1bhd currently on the teststand. Attachment 2 is an image of the machine in the second box (maybe a new machine for the framebuilder?). The third box is empty.

 

Boxes next to X-arm cabinets (3 boxes)

Attachment 3 shows the 3 boxes, each of which contains the same FE machine we saw earlier at the bottom of Attachment 1. The middle box contains the note shown in Attachment 4.

 

Box opposite Y-arm cabinets (1 empty box)

 

In summary, it looks like we have 3 new front-ends, 1 new front-end with a networking issue, and 1 new tower machine (possibly a framebuilder replacement).

Attachment 1: IMG_20220804_184444473.jpg
Attachment 2: IMG_20220804_191658206.jpg
Attachment 3: IMG_20220804_185336240.jpg
Attachment 4: IMG_20220804_185023002.jpg
  17066   Mon Aug 8 17:16:51 2022   Tega   Update   Computers   Front-end machine setup

Added 3 FE machines - c1ioo, c1lsc, c1sus - to the teststand following the instructions in elog 15947. Note that we also updated /etc/hosts on chiara by adding the names and IPs of the new FEs, since we wish to ssh from there, given that chiara is where we land when we connect to c1teststand.

Two of the FE machines - c1lsc & c1ioo - have the 6-core X5680 @ 3.3GHz processor, and their BIOS was already mostly configured because they came from LLO, I believe. The third machine - c1sus - has the 6-core X5650 @ 2.67GHz processor and required a complete BIOS configuration according to the doc.

Next Step:  I think the next step is to get the latest RTS working on the new fb1 (tower machine), then boot the frontends from there.


KVM switch note:

All current front-ends have PS/2 keyboard and mouse connectors except for fb1, which only has USB ports. So we may not be able to connect to fb1 using a PS/2 KVM switch that works for all the current front-ends. The new tower machine does have a PS/2 connector, so if we decide to use that as the boot server and framebuilder, then we should be fine.

Attachment 1: IMG_20220808_170349717.jpg
  17073   Wed Aug 10 20:30:54 2022   Tega   Summary   SUS   Characterisation of suspension damping

[Yuta, Tega]

We diagnosed the suspension damping of the IMC/BHD/recycling optics by kicking the various degrees of freedom (dof) and then tuning the gain so that we get a residual Q of approx. 5 in the cases where this can be achieved.

MC1: Good
MC2: SIDE-YAW coupling, but OK
MC3: Too much coupling between dofs, NEEDS ATTENTION
LO1: Good
LO2: Good
AS1: POS-PIT coupling, close to oscillation, cnt2um off, NEEDS ATTENTION
AS4: PIT-YAW coupling, cannot increase YAW gain because of coupling, No cnt2um, No Cheby, NEEDS ATTENTION
PR2: No cnt2um, No Cheby
PR3: POS-PIT coupling, cannot increase POS/PIT/YAW gain because of coupling, No cnt2um, No Cheby, NEEDS ATTENTION
SR2: No cnt2um

Attachment 1: BHD_SUSPIT_KICK.png
Attachment 2: BHD_SUSPOS_KICK.png
Attachment 3: BHD_SUSSIDE_KICK.png
Attachment 4: BHD_SUSYAW_KICK.png
Attachment 5: IMC_SUSPIT_KICK.png
Attachment 6: IMC_SUSPOS_KICK.png
Attachment 7: IMC_SUSSIDE_KICK.png
Attachment 8: IMC_SUSYAW_KICK.png
Attachment 9: PRSR_SUSPIT_KICK.png
Attachment 10: PRSR_SUSPOS_KICK.png
Attachment 11: PRSR_SUSSIDE_KICK.png
Attachment 12: PRSR_SUSYAW_KICK.png
  17074   Wed Aug 10 20:51:14 2022   Tega   Update   Computers   CDS upgrade Front-end machine setup

Here is a summary of what needs doing following the chat with Jamie today.

 

Jamie brought over the KVM switch shown in the attachment and I tested all 16 ports and 7 cables and can confirm that they all work as expected.

 

TODO

1. Do a rack space budget to get a clear picture of how many front-ends we can fit into the new rack

2. Look into what needs doing and how much effort would be needed to clear rack 1X7 and use that instead of the new rack. The power down on Friday would present a good opportunity to do this work on Monday, so get the info ready before then. 

3. Start mounting front-ends, KVM and dolphin network switch

4. Add the BOX rack layout to the CDS upgrade page.

Attachment 1: IMG_20220810_171002928.jpg
Attachment 2: IMG_20220810_171019633.jpg
  17083   Tue Aug 16 18:22:59 2022   Tega   Update   Computers   c1teststand rack mounting for CDS upgrade

[Tega, Yuta]

I keep getting confused about the purpose of the teststand. The view I am adopting going forward is to use it as a platform for testing the compatibility of the new hardware upgrade, instead of thinking of it as an independent system that works with the old hardware.

The initial idea of clearing 1X7 cannot be done for now, because I missed the deadline for providing a detailed enough plan before Monday's power-up of the lab, so we are just going to go ahead and use the new rack as initially intended and get the latest hardware and software tested here.

We mounted the DAQ, subnet and dolphin IX switches, see Attachment 1. The mounting ears that came with the dolphin switch did not fit and so could not be used for mounting. We looked around the lab and decided to use one of the NavePoint mounting brackets, which we found next to the teststand, see Attachment 2.

We plan to move the new rack to the current location of the teststand and use the power connection from there. It is also closer to 1X7, so moving the front-ends and switches to 1X7 should be straightforward after we complete all the CDS upgrade testing.

Attachment 1: IMG_20220816_180157132.jpg
Attachment 2: IMG_20220816_175125874.jpg
  17086   Wed Aug 17 10:23:05 2022   Tega   Update   General   c1vac issues, pressure gauge replacement

- Disk full

I updated the configuration file '/etc/logrotate.d/rsyslog' to set a file size limit of 50M on 'syslog' and 'daemon.log', since these are the two log files that capture caget & caput terminal outputs. I also reduced the number of backup files to 2.

controls@c1vac:~$ cat /etc/logrotate.d/rsyslog
/var/log/syslog
{
    rotate 2
    daily
    size 50M
    missingok
    notifempty
    delaycompress
    compress
    postrotate
        invoke-rc.d rsyslog rotate > /dev/null
    endscript
}

/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/mail.log
/var/log/daemon.log
{
    rotate 2
    missingok
    notifempty
    size 50M
    compress
    delaycompress
    postrotate
        invoke-rc.d rsyslog rotate > /dev/null
    endscript
}
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
    rotate 4
    weekly
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        invoke-rc.d rsyslog rotate > /dev/null
    endscript
}

- Vacuum gauge

The XGS-600 can handle 6 FRGs and we currently have 5 of them connected. Yes, having a spare would be good. I'll see about placing an order for these then.

Quote:

- Disk Full: Just use the usual /etc/logrotate thing

- Vacuum gauge

I rather feel not replacing P1a. We used to have Ps and CCs as they didn't cover the entire pressure range. However, this new FRG (=Full Range Gauge) does cover from 1atm to 4nTorr.

Why don't we have a couple of FRG spares, instead?

Questions to Tega: How many FRGs can our XGS-600 controller handle?

 

 

  17098   Mon Aug 22 19:02:15 2022   Tega   Update   Computers   c1teststand rack mounting for CDS upgrade II

[Tega, JC]

Moved the rack to the location of the test stand just behind 1X7; we plan to remove the other two small test stand racks to create some space there. We then mounted the c1bhd I/O chassis and 4 front-end machines on the test stand (see Attachment 1).

Installed the dolphin IX cards on all 4 front-end machines: c1bhd, c1ioo, c1sus, c1lsc. I also removed the dolphin DX card that was previously installed on c1bhd.

Found a single OneStop host card with a mini PCI slot mounting plate in a storage box (see Attachment 2). Since this only fits into the dual PCI riser card slot on c1bhd, I swapped out the full-length PCI slot OneStop host card on c1bhd and installed it on c1lsc (see Attachments 3 & 4).

 

Attachment 1: IMG_20220822_185437763.jpg
Attachment 2: IMG_20220822_131340214.jpg
Attachment 3: c1bhd.jpeg
Attachment 4: c1lsc.jpeg
  17100   Tue Aug 23 22:30:24 2022   Tega   Update   Computers   c1teststand OS upgrade - I

[JC, Tega, Chris]

After moving the test stand front-ends, chiara (name server) and fb1 (boot server) to the new rack behind 1X7, we powered everything up and checked that we can reach c1teststand via pianosa and that the front-ends are still able to boot from fb1. After confirming these tests, we decided to start the software upgrade to Debian 10. We installed Buster on fb1 and are now in the process of setting up diskless boot. I have been looking around for CDS instructions on how to do this and found the CdsFrontEndDebian10 page, which contains most of the info we require. The page suggests that it may be cleaner to start the Debian 10 installation on a front-end that is connected to an I/O chassis with at least 1 ADC and 1 DAC card, then move the installation disk to the boot server and continue from there, so I moved the disk from fb1 to one of the front-ends, but I had trouble getting it to boot. I decided to do a clean install on another disk on the c1lsc front-end, which has a host adapter card that can be connected to the c1bhd I/O chassis. We can then mount this disk on fb1 and use it to set up the diskless boot OS.

  17108   Fri Aug 26 14:05:09 2022   Tega   Update   Computers   rack reshuffle proposal for CDS upgrade

[Tega, Jamie]

Here is a proposal for what we would like to do in terms of reshuffling a few pieces of rack-mounted equipment for the CDS upgrade.

  • Frequency Distribution Amp - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Then disconnect power and signal cables one at a time to enable optimum rerouting for strain relief and declutter.  

 

  • GPS Receiver Tempus LX - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Then disconnect power and signal cables one at a time to enable optimum rerouting for strain relief and declutter.

 

  • PEM & ADC Adapter - Move the unit from 1X7 to 1X6 without disconnecting the attached cables. Disconnect the single signal cable from the rear of the ADC adapter to allow for optimum rerouting for strain relief.

 

  • Martian Network Switch - Make a note of all connections, disconnect them, move the switch to 1X7 and reconnect ethernet cables. 

 

  • MARTIAN NETWORK SWITCH CONNECTIONS
    #   LABEL                            #   LABEL
    1   Tempus LX (yellow, unlabeled)    13  FB1
    2   1Y6 HUB                          14  FB
    3   C0DCU1                           15  NODUS
    4   C1PEM1                           16
    5   RFM-BYPASS                       17  CHIARA
    6   MEGATRON/PROCYON                 18
    7   MEGATRON                         19  CISC/C1SOSVME
    8   BR40M                            20  C1TESTSTAND [blue/unlabelled]
    9   C1DSCL1EPICS0                    21  JetStar [blue/unlabelled]
    10  OP340M                           22  C1SUS [purple]
    11  C1DCUEPICS                       23  unknown [88/purple/goes to top-back rail]
    12  C1ASS                            24  unknown [stonewall/yellow/goes to top-front rail]

     

I believe all of this can be done in one go followed by CDS validation. Please comment so we can improve the plan. Should we move FB1 to 1X7 and remove old FB & JetStor during this work?

Attachment 1: Reshuffling proposal

Attachment 2: Front of 1X7 Rack

Attachment 3: Rear of 1X7 Rack

Attachment 4: Front of 1X6 Rack

Attachment 5: Rear of 1X6 Rack

Attachment 6: Martian switch connections

Attachment 1: rack_change_proposal.pdf
Attachment 2: IMG_20220826_131331042.jpg
Attachment 3: IMG_20220826_131153172.jpg
Attachment 4: IMG_20220826_131428125.jpg
Attachment 5: IMG_20220826_131543818.jpg
Attachment 6: IMG_20220826_142620810.jpg
  17111   Mon Aug 29 15:15:46 2022   Tega   Update   Computers   3 FEs from LLO got delivered today

[JC, Tega]

We got the 3 front-ends from LLO today. The contents of each box are:

  1. FE machine
  2. OSS adapter card for connecting to I/O chassis
  3. PCI riser cards (x2)
  4. Timing Card and cable
  5. Power cables, mounting brackets and accompanying screws
Attachment 1: IMG_20220829_145533452.jpg
Attachment 2: IMG_20220829_144801365.jpg
  17113   Tue Aug 30 15:21:27 2022   Tega   Update   Computers   3 FEs from LHO got delivered today

[Tega, JC]

We received the remaining 3 front-ends from LHO today. They each have a timing card and an OSS host adapter card installed. We also received 3 dolphin DX cards. As with the previous packages from LLO, each box contains a rack-mounting kit for the Supermicro machine.

Attachment 1: IMG_20220830_144925325.jpg
Attachment 2: IMG_20220830_142307495.jpg
Attachment 3: IMG_20220830_143059443.jpg
  17144   Mon Sep 19 20:21:06 2022   Tega   Update   Computers   1X7 and 1X6 work

[Tega, Paco, JC]


We moved the GPS network time server and the frequency distribution amplifier from 1X7 to 1X6, and the PEM AA, ADC adapter and martian network switch from 1X6 to 1X7. We also mounted the dolphin IX switch at the rear of 1X7, together with the DAQ and martian switches. This cleared up enough space to mount all the front-ends; however, we found that the mounting brackets for the front-ends do not fit in the 1X7 rack, so I have decided to mount them on the upper part of the test stand for now while we come up with a fix for this problem. Attachments 1 to 3 show the current state of racks 1X6, 1X7 and the teststand.

 

Attachment 1: Front of racks 1X6 and 1X7

Attachment 2: Rear of rack 1X7

Attachment 3: Front of teststand rack


Plan for the remainder of the week

Tuesday

  • Setup the 6 new front-ends to boot off the FB1 clone.
  • Test PCIe I/O cables by connecting them btw the front-ends and teststand I/O chassis one at a time to ensure they work
  • Then lay the fiber cables to the various I/O chassis.

Wednesday

  • Migrate the current models on the 6 front-ends to the new system.
  • Replace RFM IPC parts with dolphin IPC parts in the c1rfm model running on the c1sus machine
  • Replace the RFM parts in the c1iscex and c1iscey models
  • Drop the c1daf and c1oaf models from the c1lsc machine, since the front-ends only have 6 cores
  • Build and install models

Thursday [CAN I GET THE IFO ON THIS DAY PLEASE?]

  • Complete any remaining model work
  • Connect all I/O chassis to their respective (new) front-ends and see if we can start the models (need to think of a safe way to do this; should we disconnect the coil drivers before starting the models?)

Friday

  • Tie-up any loose ends
Attachment 1: IMG_20220919_204013819.jpg
Attachment 2: IMG_20220919_203541114.jpg
Attachment 3: IMG_20220919_203458952.jpg
  17148   Tue Sep 20 23:06:23 2022   Tega   Update   Computers   Setup the 6 new front-ends to boot off the FB1 clone

[Tega, Radhika, JC]

We wired the front-ends for power, DAQ and martian network connections. We then moved the I/O chassis from the bottom of the rack to the middle, just above the KVM switch, so we can leave the top of the I/O chassis open for access to the ports of the OSS target adapter card for testing the extension fiber cables.

Attachment 1 (top -> bottom): c1sus2, c1iscey, c1iscex, c1ioo, c1sus, c1lsc

When I turned on the test stand with the new front-ends, after a few minutes the power to 1X7 was cut off, due to overloading I assume. This brought down nodus, chiara and FB1. After Paco reset the tripped switch, everything came back without us actually doing anything, which is an interesting observation.


After this event, I moved the test stand power plug to the side wall rail socket. This seems fine so far. I then brought chiara (clone) and FB1 (clone) online. Here are some changes I made to get things going:

Chiara (clone)

  • Edited '/etc/dhcp/dhcpd.conf' to update the MAC addresses of the front-ends to match the new machines (see the sketch after this list), then ran
  • sudo service isc-dhcp-server restart
  • then restarted the front-ends
  • Edited '/etc/hosts' on chiara to include c1iscex and c1iscey, as these were missing
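A sketch of the kind of dhcpd.conf host stanza that was edited (the MAC and IP below are placeholders, not the real martian-network assignments):

host c1lsc {
  hardware ethernet 00:25:90:xx:xx:xx;   # new machine's MAC (placeholder)
  fixed-address 192.168.113.xxx;         # front-end's martian IP (placeholder)
}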

 

FB1 (clone)

Getting the new front-ends booting off FB1 clone:

1. I found that the KVM screen was flooded with setup info about the dolphin cards on the LLO machines. This actually prevented login using the KVM switch for two of these machines. Strangely, one of them, 'c1sus', seemed to be fine (see Attachment 2), so I guessed this was because the dolphin network was already configured earlier when we were testing the dolphin communications. So I decided to configure the remaining dolphin cards. To do so, we do the following:

Dolphin Configuration:

1a. Ideally running

sudo /opt/DIS/sbin/dis_mkconf -fabrics 1 -sclw 8 -stt 1 -nodes c1lsc c1sus c1ioo c1iscex c1iscey c1sus2 -nosessions

should set up all the nodes, but this did not happen. In fact, I could no longer use the '/opt/DIS/sbin/dis_admin' GUI after running this operation and restarting the 'dis_networkmgr.service' via

sudo systemctl restart dis_networkmgr.service

so  I logged into each front-end and configured the dolphin adapter there using

sudo /opt/DIS/sbin/dis_config

After that, I shut down FB1 (clone) because restarting it earlier didn't work, waited a few minutes and then started it. Everything was fine afterward, although I am not quite sure what solved the issue, as I tried a few things and was just glad to see the problem go!

1b. After configuring all the dolphin nodes, I later found that 2 of them failed the '/opt/DIS/sbin/dis_diag' test, with an error message suggesting three possible issues, one of which was 'faulty cable'. I looked at the units in question and found that swapping both cables with the remaining spares solved the problem. So it seems like these cables are faulty (need to double-check this). Attachment 3 shows the current state of the dolphin nodes on the front-ends and the dolphin switch.


2. I noticed that the NFS mount service for the mount points '/opt/rtcds' and '/opt/rtapps' in /etc/fstab had exited with an error (the expected fstab entries are sketched after item 3 below), so I ran 

sudo mount -a

3. Edited '/etc/hosts' to include c1iscex and c1iscey, as these were missing.
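
For reference, the NFS entries in /etc/fstab for these two mount points would look roughly like the sketch below; the NFS server name and the mount options shown are just an illustration, not copied from the actual FB1 (clone) fstab:

chiara:/opt/rtcds     /opt/rtcds     nfs     rw,bg,soft,nofail     0 0
chiara:/opt/rtapps    /opt/rtapps    nfs     rw,bg,soft,nofail     0 0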

 

Front-ends

To test the PCIe extension fiber cables that connect the front-ends to their respective I/O chassis, we run the following command (after booting the machine with the cable connected): 

controls@c1lsc:~$ lspci -vn | grep 10b5:3
    Subsystem: 10b5:3120
    Subsystem: 10b5:3101

If we see the output above, then both the cable and the OSS host adapter card are fine (we know from previous tests that the OSS card on the I/O chassis is good). Since we only have one I/O chassis, we repeat the step above 8 times, also cycling through the six new front-ends as we go so that the installed OSS host adapter cards get tested at the same time. I was able to verify 4 cables and 4 OSS host cards (c1lsc, c1sus, c1ioo, c1sus2), but the remaining results were inconclusive: they seem to suggest that 3 of the remaining 5 fiber cables are faulty, which would be unfortunate in itself, but I began to doubt the reliability of the test when I went back to check the 2 remaining OSS host cards with a cable that had passed earlier and it did not pass. After a few retries, I decided to call it a day before I lost my mind. These tests need to be redone tomorrow.
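
As a sanity check during these tests, one can make the pass/fail criterion explicit by counting the matching subsystem lines (this assumes no other device on these hosts reports a 10b5:3xxx subsystem ID):

controls@c1lsc:~$ lspci -vn | grep -c 10b5:3
2

A count of 2 means the PCIe link to the I/O chassis came up; 0 points to the cable, the OSS host adapter card, or its seating.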

 

Note: We were unable to lay the cables today because these tests were not complete, so we are a bit behind the plan. We will see if we can catch up tomorrow.

 

Quote:

Plan for the remainder of the week

Tuesday

  • Setup the 6 new front-ends to boot off the FB1 clone.
  • Test PCIe I/O cables by connecting them btw the front-ends and teststand I/O chassis one at a time to ensure they work
  • Then lay the fiber cables to the various I/O chassis.

 

Attachment 1: IMG_20220921_084220465.jpg
IMG_20220921_084220465.jpg
Attachment 2: dolphin_err_init_state.png
dolphin_err_init_state.png
Attachment 3: dolphin_final_state.png
dolphin_final_state.png
  17151   Wed Sep 21 17:16:14 2022 TegaUpdateComputersSetup the 6 new front-ends to boot off the FB1 clone

[Tega, JC]

We laid 4 out of 6 fiber cables today. The remaining 2 cables are for the I/O chassis at the vertex, so we will test and lay them tomorrow. We were also able to identify the problems with the 2 supposedly faulty cables, which turned out not to be faulty at all: one had a small bend in the connector that I was able to straighten out with small pliers, and the other was a loose connection at the switch end. So there were no faulty cables, which is great! Chris wrote a MATLAB script that migrates all the model files. I am going through them, i.e. looking at the CDS parameter block of each, to check that all is well. The next task is to build and install the updated models. We also need to update the '/opt/rtcds' and '/opt/rtapps' directories to the latest versions from the 40m chiara system.

 

  17153   Thu Sep 22 20:57:16 2022 TegaUpdateComputersbuild, install and start 40m models on teststand

[Tega, Chris]

We built, installed and started all the 40m models on the teststand today. The configuration we employ is to connect c1sus to the teststand I/O chassis and use dolphin to distribute the timing to the other front-ends. To get this far, we encountered a few problems that were solved by doing the following:

0. Fixed frontend timing sync to FB1 via ntp

1. Set the rtcds environment variable `CDS_SRC=/opt/rtcds/userapps/trunk/cds/common/src` in the file '/etc/advligorts/env'

2. Resolved chgrp error during models installation using sticky bits on chiara, i.e. `sudo chmod g+s -R /opt/rtcds/caltech/c1/target`

3. Replaced `sqrt` with `lsqrt` in `RMSeval.c` to eliminate compilation error for c1ioo

4. Created a symlink for 'activateDQ.py' and 'generate_KisselButton.py' in '/opt/rtcds/caltech/c1/post_build'

5. Installed and configured dolphin for new frontend 'c1shimmer'

6. Replaced 'RFM0' with 'PCIE' in the ipc file, '/opt/rtcds/caltech/c1/chans/ipc/C1.ipc' (a sample entry is sketched below)
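
For reference, an entry in C1.ipc looks roughly like the sketch below after this change; the channel name and numeric values are placeholders, and ipcType is the only field we actually touched (RFM0 -> PCIE):

[C1:LSC-EXAMPLE_IPC]
ipcType=PCIE
ipcRate=16384
ipcHost=c1lsc
ipcNum=0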

 

We still have a few issues namely:

1. The user models are not running smoothly: the CPU usage jumps to its maximum value every second or so.

2. c1omc seems to be unable to get its timing from its IOP model (issue resolved by changing the CDS block parameter 'specific_cpu' from 6 to 4, because the new FEs only have 6 cores, numbered 0-5)

3. The need to load the `dolphin-proxy-km` kernel module and start the `rts-dolphin_daemon` service whenever we reboot a front-end

Attachment 1: dolphin_state_plus_c1shimmer.png
dolphin_state_plus_c1shimmer.png
Attachment 2: FE_status_overview.png
FE_status_overview.png
  17158   Fri Sep 23 19:07:03 2022 TegaUpdateComputersWork to improve stability of 40m models running on teststand

[Chris, Tega]

Timing glitch investigation:

  • Moved the dolphin transmit node from c1sus to c1lsc because we suspect the glitch might be coming from the c1sus machine (earlier, c1pem on c1sus was running faster than realtime).
  • Installed and started c1oaf to remove the shared-memory IPC errors to/from the c1lsc model
  • /opt/DIS/sbin/dis_diag gives two warnings on c1sus2
    • [WARN] IXH Adapter 0 - PCIe slot link speed is only Gen1
    • [WARN] Node 28 not reachable, but is an entry in the dishosts.conf file - c1shimmer is currently off, so this is fine.

DAQ network setup:

  • Added the DAQ ethernet MAC addresses and fixed IPv4 addresses of the front-ends to '/etc/dhcp/dhcpd.conf'
  • Added the fixed DAQ IPv4 address and port of each front-end to '/etc/advligorts/subscriptions.txt' for the `cps_recv` service
  • Edited '/etc/advligorts/master' to include the '.ini' files of all the IOP and user models in '/opt/rtcds/caltech/c1/chans/daq/' (which contain the channel info) and the corresponding testpoint files in '/opt/rtcds/caltech/c1/target/gds/param/' (an excerpt is sketched after this list)
  • Created a systemd environment file for each front-end in '/diskless/root/etc/advligorts/' containing the arguments for the local data concentrator and DAQ data transmitter (`local_dc_args` and `cps_xmit_args`). While chasing the timing glitches, we staggered the transmit delay (-D waitValue) of the front-ends by setting it to the last octet of each DAQ IP address; we should probably set it back to zero to see if that has any effect.
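
As a rough illustration, the lines added to '/etc/advligorts/master' follow this pattern (one model shown as an example; the testpoint file naming assumes the usual tpchn_<model>.par convention):

/opt/rtcds/caltech/c1/chans/daq/C1LSC.ini
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1lsc.par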

Other:

  • Edited /etc/resolv.conf on fb1 and in 'diskless/root' to enable name resolution (so that, for example, `host c1shimmer` works), but the file gets overwritten on chiara for some reason

Issues:

  1. Frame writing is not working at the moment. It worked for a couple of days in the past but stopped earlier today, and we can't quite figure out why. 
  2. We can't get data via diaggui or ndscope either. Again, we recall this working in the past, but we are not sure why it has stopped now.   
  3. The CPU load of c1su2 is too high, so we should split it into two models.
  4. We still get the occasional IPC glitch, both for shared memory and dolphin; see the attachments.
Attachment 1: dolphin_state_all_green.png
dolphin_state_all_green.png
Attachment 2: dolphin_state_IPC_glitch.png
dolphin_state_IPC_glitch.png
  17169   Mon Oct 3 08:35:59 2022 TegaUpdateIMCAdding IMC channels to frames for NN test

[Rana]

For the upcoming NN test on the IMC, we need to add some more channels to the frames. Can someone please add the MC2 TRANS SUM, PIT, and YAW at 256 Hz, and then make sure they're in the frames?

And even though it's not working correctly, it would be good if someone could turn the MC WFS on for a little while. I'd just like to get some data to test some code. If it's easy to roughly close the loops, that would be helpful too.


[Anchal]

Currently, none of these channels are being written to frames. From the Simulink model, it seems the channels:

  • C1:IOO-MC_TRANS_SUMFILT_OUT_DQ

  • C1:IOO-MC_TRANS_PIT_OUT_DQ

  • C1:IOO-MC_TRANS_YAW_OUT_DQ

are supposed to be DQed but are not present in the /opt/rtcds/caltech/c1/chans/daq/C1MCS.ini file. I tried simply adding these channels to the file and restarting the daqd_ services, but that caused a 0x2000 error on the c1mcs model. In my attempt I did not know what chnnum to give these channels, so I omitted it, and maybe that is the issue.
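
For context, a DQ channel entry in C1MCS.ini normally looks something like the sketch below; the chnnum and datatype values here are placeholders, since the RCG assigns them at install time, which may be exactly why hand-adding the channels without a valid chnnum produced the 0x2000 error:

[C1:IOO-MC_TRANS_PIT_OUT_DQ]
acquire=1
datarate=256
datatype=4
chnnum=12345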

The only way I know to fix this is to make and install the c1mcs model again, which would bring these channels into the C1MCS.ini file. But we'll have to run activateDQ.py if we do that, and I am not totally sure it is in running condition right now. @Christopher Wipf do you have any suggestions?


[Rana]

Aren't they all filtered? If so, perhaps we can adopt whatever the equivalent naming at the LIGO sites is, rather than rolling our own again.

@Tega Edo can we run activateDQ.py or will that break everything now?


[Tega]

@Rana Adhikari Looking into this now.

@Anchal Gupta The only problem I see with activateDQ.py is its use of the Python 2 print statement, i.e. print var instead of print(var). After fixing that, it runs OK and does not change the input INI files, as they already have the required channel names. I have created a temporary folder, /opt/rtcds/caltech/c1/chans/daq/activateDQtests/, populated with copies of the original INI files, a modified version of activateDQ.py that does not overwrite the original input files, and a script, difftest.sh, that compares the input and output files so we can test the functionality of activateDQ.py in isolation. Looking through the code also suggests that all is well. Can you check what I have done to confirm that this is indeed the case? If so, your suggestion of rebuilding and installing the updated c1mcs model and running activateDQ.py afterward should do the trick.

I tested the code with:

cd /opt/rtcds/caltech/c1/chans/daq/activateDQtests/

./activateDQ.py

which creates output files with an _ prefix (for example, _C1MCS.ini is the output file for C1MCS.ini); then I ran

./difftest

to compare all the input and corresponding output files.
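
For completeness, a minimal sketch of the kind of comparison difftest.sh performs (not the exact script) would be:

#!/bin/bash
# compare each original INI file against the _-prefixed output of the modified activateDQ.py
for f in C1*.ini; do
    echo "== ${f} =="
    diff "${f}" "_${f}"
done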

Note that the channel names you are proposing would change after running activateDQ.py, i.e.

C1:IOO-MC_TRANS_SUMFILT_OUT_DQ -> C1:IOO-MC_TRANS_SUM_ERR

C1:IOO-MC_TRANS_PIT_OUT_DQ -> C1:IOO-MC_TRANS_PIT_ERR

C1:IOO-MC_TRANS_YAW_OUT_DQ -> C1:IOO-MC_TRANS_YAW_ERR
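
To make the mapping concrete, here is a minimal standalone sketch of the renaming rule (an illustration only, not the actual code inside activateDQ.py):

#!/usr/bin/env python3
# Illustrative sketch of the OUT_DQ -> ERR renaming listed above (not the real activateDQ.py)
import re

def rename_channel(name):
    name = re.sub(r'SUMFILT_OUT_DQ$', 'SUM_ERR', name)  # the SUM channel also drops its FILT suffix
    name = re.sub(r'_OUT_DQ$', '_ERR', name)            # PIT/YAW channels keep their DOF name
    return name

for chan in ['C1:IOO-MC_TRANS_SUMFILT_OUT_DQ',
             'C1:IOO-MC_TRANS_PIT_OUT_DQ',
             'C1:IOO-MC_TRANS_YAW_OUT_DQ']:
    print(chan, '->', rename_channel(chan))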

My question is this: why aren't we using the correct channel names in the first place so that we have less work to do later on when we finally decide to stop using this postbuild script?


[Anchal]

Yeah, I found that these ERR channels are acquired and stored. I don't think we should do this either; I'm not sure what the original motivation for the renaming was. I tried commenting out this part of activateDQ.py and remaking and reinstalling c1mcs, but it seems activateDQ.py is called automatically as a post-build script on install and uses some other copy of the file, as my changes did not take effect and the DQ name change still happened.


[Tega]

Ah, we encountered the same puzzle. Chris found that our models have `SCRIPT=activateDQ.py` embedded in the CDS parameter block description (see the attached image); we believe this is what triggers the post-build call to `activateDQ.py`. As for the file location, a modern rtcds setup would have it in /opt/rtcds/caltech/c1/post_build, but I am not sure where ours lives. A quick search in the usual place came up empty, so I looked around a bit and found this:

controls@rossa> find /opt/rtcds/userapps/ -name "activateDQ.py"

/opt/rtcds/userapps/trunk.bak/cds/c1/scripts/activateDQ.py

/opt/rtcds/userapps/tags/H2OAT_RCG2.5.1/cds/c1/scripts/activateDQ.py

/opt/rtcds/userapps/branches/StanfordGuardianDev/cds/c1/scripts/activateDQ.py

/opt/rtcds/userapps/branches/branch-2.3/cds/c1/scripts/activateDQ.py

/opt/rtcds/userapps/branches/branch-2.4/cds/c1/scripts/activateDQ.py

/opt/rtcds/userapps/trunk/cds/c1/scripts/activateDQ.py

My guess is the last one /opt/rtcds/userapps/trunk/cds/c1/scripts/activateDQ.py.

Maybe we can ask @Yuta Michimura since he wrote this script?

Anyway, we could also try removing SCRIPT=activateDQ.py from the CDS parameter block description to see if that stops the post-build call, but keep in mind that doing so would also stop the update of the OSEM and oplev channel names. We would then at least know which script is being used, since we would have to run it by hand after every install (which is a bad idea).

controls@c1sus:~ 0$ env | grep script

CDS_SCRIPTS_PATH=:/opt/rtcds/userapps/release/cds/c1/scripts:/opt/rtcds/userapps/release/cds/common/scripts:/opt/rtcds/userapps/release/isc/c1/scripts:/opt/rtcds/userapps/release/isc/common/scripts:/opt/rtcds/userapps/release/sus/c1/scripts:/opt/rtcds/userapps/release/sus/common/scripts

It looks like the guess was correct. Note that in the newer version of rtcds, we can use `rtcds env` instead of `env` to see what is going on.

Attachment 1: Screen_Shot_2022-09-30_at_9.52.39_AM.png
Screen_Shot_2022-09-30_at_9.52.39_AM.png
ELOG V3.1.3-