Entry  Mon Dec 7 11:25:31 2020, gautam, Update, SUS, MC1 suspension glitchy again 
    Reply  Wed Dec 9 16:22:57 2020, gautam, Update, SUS, Yet another round of Sat. Box. switcharoo WFS2.pngWFS_lineNoise.pngWFSchar.pdf
       Reply  Sun Jan 3 16:26:06 2021, Koji, Update, SUS, IMC WFS check (Yet another round of Sat. Box. switcharoo) Screen_Shot_2021-01-03_at_17.14.57.png
    Reply  Thu May 13 11:55:04 2021, Anchal, Paco, Update, SUS, MC1 suspension misbehaving Screenshot_from_2021-05-13_09-50-24.pngMC1_Glitches_Invest2.pdf
       Reply  Thu May 13 19:38:54 2021, Anchal, Update, SUS, MC1 Satellite Amplifier Debugged MC1_UL_Channel_Fixed.png
          Reply  Mon May 24 19:14:15 2021, Anchal, Paco, Summary, SUS, MC1 Free Swing Test set to trigger 
             Reply  Tue May 25 10:22:16 2021, Anchal, Paco, Summary, SUS, MC1 new input matrix calculated and uploaded SUS_Input_Matrix_Diagonalization.pdf
          Reply  Thu Jun 17 11:45:42 2021, Anchal, Paco, Update, SUS, MC1 Gave trouble again SummaryScreenShot.pngMC1_LL_SENSOR_DEAD.png
             Reply  Thu Jun 17 16:37:23 2021, Anchal, Paco, Update, SUS, c1susaux computer rebooted 
                Reply  Tue Jun 22 11:56:16 2021, Anchal, Paco, Update, SUS, ADC/Slow channels issues CDS_FE_Status.png
                   Reply  Tue Jun 22 16:52:28 2021, Paco, Update, SUS, ADC/Slow channels issues shake_and_damp.png
                      Reply  Wed Jun 23 09:05:02 2021, Anchal, Update, SUS, MC lock acquired back again 
                         Reply  Thu Jun 24 16:40:37 2021, Koji, Update, SUS, MC lock acquired back again P_20210624_163641_1.jpg
Message ID: 16209     Entry time: Thu Jun 17 11:45:42 2021     In reply to: 16139     Reply to this: 16210
Author: Anchal, Paco 
Type: Update 
Category: SUS 
Subject: MC1 Gave trouble again 

TL;DR

The MC1 LL sensor showed large, fluctuating offsets. We looked for a fault in the satellite box but found none. After power cycling, the sensor returned to normal. But while putting the box back, we bumped something and the c1susaux slow channels froze. We tried to reboot c1susaux, but that didn't work and the channels no longer exist.


This morning we found that the IMC had struggled to lock all night (see Attachment 1). We had some indication yesterday evening that the MC1 LL sensor PD had a higher variance than usual, and Paco had to reset the WFS offsets because they had integrated the noise from this sensor. Something similar happened last night: a false offset and its fluctuation overwhelmed the WFS and MC1 got misaligned, making it impossible for the IMC to lock.

In the morning, Paco again reset the WFS offsets, but now we were sure that the PD variance from the MC1 LL OSEM was very high. Attachment 2 shows that only this one OSEM has higher noise than the other 4 OSEMs. This behavior is similar to what we saw earlier in 16138, but that time on the UL sensor. Koji and I fixed it in 16139 and we tested all other channels too.
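The "one OSEM noisier than the other 4" comparison can be sketched as a quick variance check. This is only an illustration on synthetic data, not the analysis we actually ran: the sensor labels other than UL and LL, the noise levels, and the factor-of-3 threshold are all assumptions.

```python
import numpy as np

def flag_noisy_sensors(readouts, ratio_threshold=3.0):
    """Return names of sensors whose standard deviation exceeds
    ratio_threshold times the median standard deviation of all sensors.

    readouts: dict mapping sensor name -> 1D array of samples.
    ratio_threshold: hypothetical cutoff, chosen for illustration only.
    """
    stds = {name: float(np.std(x)) for name, x in readouts.items()}
    median_std = float(np.median(list(stds.values())))
    return [name for name, s in stds.items() if s > ratio_threshold * median_std]

# Synthetic stand-in for the five sensor PD time series (not real data).
rng = np.random.default_rng(0)
readouts = {name: rng.normal(0.0, 1.0, 4096) for name in ("UL", "UR", "LR", "SD")}
readouts["LL"] = rng.normal(0.0, 10.0, 4096)  # one sensor with 10x the noise

noisy = flag_noisy_sensors(readouts)
print(noisy)
```

Using the median rather than the mean as the reference keeps a single bad sensor from inflating the baseline it is compared against.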

So Paco and I took out the MC1 satellite amplifier box (S2100029, D1002812), opened the top, and checked all the PD channel testpoints with no input current. We didn't find anything odd. Next we checked the LED driver circuit testpoints with LED OUT and GND shorted. We got 4.997 V on all LED MON testpoints, which indicates normal functioning.

We then hooked everything back up on the MC1 satellite box and checked the sensor channels again on the medm screens. To our surprise, it started functioning normally. So maybe just a power cycle was required, but we still don't know what caused this issue.

BUT when I (Anchal) was plugging the power cables and DB25 connectors back in on the back side in 1X4 after moving the box back into the rack, we found that the slow channels stopped updating. They just froze!

We got worried for some time because the negative power supply indicator LEDs on the acromag chassis (which is just below the MC1 satellite box) were not ON. We checked the power cables and had to open the side panel of the 1X4 rack to see how they are connected. We found that there is no third wire in the power cables and the acromag chassis takes only a single-rail supply. We confirmed this by looking at another acromag chassis on Xend. We pasted a note on the acromag chassis for future reference, saying that it uses only the positive rail and that the negative LED monitors are normally not ON.

Back to the frozen acromag issue: we conjectured that maybe the ethernet connection was broken. The DB25 cables for the satellite box are a bit short and pull other cables around with them when connected. We checked all the ethernet cabling and it looked fine. On the c1susaux computer, we saw that the monitor LED for ethernet port 2, which is connected to the acromag chassis, was solid ON, while the other one (which is probably the connection to the switch) was blinking.

We tried to telnet into the computer, but it didn't work: the host refused the connection from the pianosa workstation. Pinging c1susaux did work. So we concluded that most probably the EPICS modbus server hosting the slow channels on c1susaux is unable to communicate with the acromag chassis, hence the solid LED on that ethernet port instead of a blinking one. We checked the computer restart procedure page for SLOW computers on the wiki, which says that if telnet is not working, we can hard reboot the computer.
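The "ping works but telnet is refused" reasoning above can be written down as a small reachability check. This is a hedged sketch and not a script we ran: the function names are made up for illustration, and port 23 is assumed only because it is the standard telnet port.

```python
import socket

def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    A refused connection or a timeout both count as 'not open'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def summarize(ping_ok, telnet_ok):
    """Turn the two observations into a one-line diagnosis (illustrative wording)."""
    if ping_ok and telnet_ok:
        return "host up, login service reachable"
    if ping_ok:
        return "host up on the network, but login service refused or not running"
    if telnet_ok:
        return "TCP reachable but ping blocked or failing"
    return "host unreachable"

# What we saw before the hard reboot: ping OK, telnet refused.
print(summarize(ping_ok=True, telnet_ok=False))

# Hypothetical usage from a workstation (assumes telnet on port 23):
#   telnet_ok = tcp_port_open("c1susaux", 23)
```

After the hard reboots, the same check would land in the last branch, since the machine no longer answers ping either.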

We hard rebooted the computer by long-pressing the power button and then pressing it again to power it back on. We did this 3 times with the same result: the ethernet port 2 LED (acromag chassis) would blink, but the ethernet port 1 LED (connected to the switch) would not turn ON. We now cannot even ping the machine, let alone telnet into it. All SUS slow monitor channels are of course not present now. We also tried pressing the reset button once (which the manual said would reboot the machine), but we got the same outcome.

At this point, we decided to stop poking around until someone with more experience can help us with this.


Bottom line: We don't know what caused the LL sensor issue, so it has not been fixed and it can happen again. We lost all C1SUSAUX slow channels, which are the OSEM and COIL slow monitor channels for PRM, BS, ITMX, ITMY, MC1, MC2 and MC3.

Attachment 1: SummaryScreenShot.png  171 kB  Uploaded Thu Jun 17 12:53:07 2021
Attachment 2: MC1_LL_SENSOR_DEAD.png  377 kB  Uploaded Thu Jun 17 12:58:09 2021