ID   Date   Author   Type   Category   Subject
  15150   Thu Jan 23 23:07:04 2020   Jon   Configuration   PSL   c1psl breakout board wiring

To facilitate wiring the c1psl chassis and scripting loopback tests, I've compiled a distilled spreadsheet with the Acromag-to-breakout board wiring, broken down by connector. This information is extractable from the master spreadsheet, but not easily. There were also a few apparent typos which are fixed here.

The wiring assignments at the time of writing are attached below. Here is the link to the latest spreadsheet.
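Since the point of the distilled spreadsheet is to script loopback tests, here is a minimal sketch of what such a test could look like, assuming pyepics is available on the workstation and that a DAC channel is wired back into an ADC channel. The two channel names below are placeholders, not entries from the wiring spreadsheet.

import time
import epics

DAC_CHAN = "C1:PSL-EXAMPLE_DAC"   # placeholder: a DAC (AO) channel on the chassis
ADC_CHAN = "C1:PSL-EXAMPLE_ADC"   # placeholder: the ADC channel it is looped back to

def loopback_check(value, tol=0.05):
    """Write a test value to the DAC channel and verify the ADC reads it back."""
    epics.caput(DAC_CHAN, value, wait=True)
    time.sleep(0.5)                # give the slow EPICS channels time to update
    readback = epics.caget(ADC_CHAN)
    ok = abs(readback - value) < tol
    print(f"{DAC_CHAN} -> {ADC_CHAN}: wrote {value}, read {readback}, {'OK' if ok else 'FAIL'}")
    return ok

for v in (-1.0, 0.0, 1.0):
    loopback_check(v)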

Attachment 1: c1psl_feedthrough_wiring.pdf
  15117   Mon Jan 13 15:47:37 2020   shruti   Configuration   Computer Scripts / Programs   c1psl burt restore

[Yehonathan, Jon, Shruti]

Since the PMC would not lock, we initially burt-restored the c1psl machine to the last available snapshot (Dec 10th 2019), but it still would not lock.

Then, it was burt-restored to midnight of Dec 1st, 2019, after which it could be locked.

  13742   Mon Apr 9 23:28:49 2018   johannes   Configuration   DAQ   c1psl channel list

I made a list of all the physical c1psl channels to get a better idea of how many Acromags we need to eventually replace it. The 3123 unit is the one whose failure had prevented c1psl from booting, which is why it was unplugged (elog post 12852), and its channels have been inactive since. Are the 126MOPA channels used for the current Mephisto? 126 tells me it's for an old Lightwave laser, but I checked a few and found that they have non-zero, changing values, so they may have been rewired.

It also hosts some virtual channels for the ISS with root C1:PSL-ISS_ defined in iss.db and dc.db, the PSL particle counter with root C1:PEM- defined in PCount.db, and a whole lot of PSL status channels defined in pslstatus.db. Transferring these virtual channels to a different machine is almost trivial, but the serial readout of the particle counter would have to find a new home.

Long story short - we need:

Function   Type     # Channels   # Channels (no MOPA)   # Units   # Units (no MOPA)
ADC        XT1221   34           21                     5         3
DAC        XT1541   17           14                     3         2
BIO        XT1111   19           10                     2         1
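As a sanity check of the unit counts above: assuming the usual Acromag channel counts per module (XT1221: 8 differential ADC channels, XT1541: 8 analog outputs, XT1111: 16 binary I/O channels — these per-module capacities are my assumption, not stated in the entry), the required number of units is just the channel count divided by the per-module capacity, rounded up:

import math

channels_per_unit = {"XT1221": 8, "XT1541": 8, "XT1111": 16}   # assumed module capacities
needed = {
    "XT1221": (34, 21),   # (# channels, # channels without MOPA)
    "XT1541": (17, 14),
    "XT1111": (19, 10),
}

for module, (n_all, n_no_mopa) in needed.items():
    per = channels_per_unit[module]
    print(module, math.ceil(n_all / per), math.ceil(n_no_mopa / per))
# prints 5/3, 3/2, 2/1 respectively, matching the table above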

 



3113 - ADC

C1:PSL-126MOPA_126PWR
C1:PSL-126MOPA_DTMP
C1:PSL-126MOPA_LTMP
C1:PSL-126MOPA_DMON
C1:PSL-126MOPA_LMON
C1:PSL-126MOPA_CURMON
C1:PSL-126MOPA_DTEC
C1:PSL-126MOPA_LTEC
C1:PSL-126MOPA_CURMON2
C1:PSL-126MOPA_HTEMP
C1:PSL-126MOPA_HTEMPSET
C1:PSL-FSS_RFPDDC
C1:PSL-FSS_LODET
C1:PSL-FSS_FAST
C1:PSL-FSS_PCDRIVE
C1:PSL-FSS_MODET
C1:PSL-FSS_VCODETPWR
C1:PSL-FSS_TIDALOUT
C1:PSL-PMC_RFPDDC
C1:PSL-PMC_LODET
C1:PSL-PMC_PZT
C1:PSL-PMC_MODET


3123 - ADC (failed)

C1:PSL-126MOPA_AMPMON
C1:PSL-126MOPA_126MON
C1:PSL-FSS_RCTRANSPD
C1:PSL-FSS_MINCOMEAS
C1:PSL-FSS_RMTEMP
C1:PSL-FSS_RCTEMP
C1:PSL-FSS_MIXERM
C1:PSL-FSS_SLOWM
C1:PSL-FSS_TIDALINPUT
C1:PSL-PMC_PMCTRANSPD
C1:PSL-PMC_PMCERR
C1:PSL-PPKTP_TEMP


4116 - DAC

C1:PSL-126MOPA_126CURADJ
C1:PSL-126MOPA_DCAMP
C1:PSL-126MOPA_DCAMP-
C1:PSL-FSS_INOFFSET
C1:PSL-FSS_MGAIN
C1:PSL-FSS_FASTGAIN
C1:PSL-FSS_PHCON
C1:PSL-FSS_RFADJ
C1:PSL-FSS_SLOWDC
C1:PSL-FSS_VCOMODLEVEL
C1:PSL-FSS_TIDAL
C1:PSL-FSS_TIDALSET
C1:PSL-PMC_GAIN
C1:PSL-PMC_INOFFSET
C1:PSL-PMC_PHCON
C1:PSL-PMC_RFADJ
C1:PSL-PMC_RAMP


XVME-210 - Binary Input

C1:PSL-126MOPA_FAULT
C1:PSL-126MOPA_INTERLOCK
C1:PSL-126MOPA_SHUTTER
C1:PSL-126MOPA_126LASE
C1:PSL-126MOPA_AMPON


XVME-220 - Binary Output

C1:PSL-126MOPA_126NE
C1:PSL-126MOPA_126STANDBY
C1:PSL-126MOPA_SHUTOPENEX
C1:PSL-126MOPA_STANDBY
C1:PSL-FSS_SW1
C1:PSL-FSS_SW2
C1:PSL-FSS_FASTSWEEP
C1:PSL-FSS_PHFLIP
C1:PSL-FSS_VCOTESTSW
C1:PSL-FSS_VCOWIDESW
C1:PSL-PMC_SW1
C1:PSL-PMC_SW2
C1:PSL-PMC_PHFLIP
C1:PSL-PMC_BLANK

  15253   Wed Mar 4 22:38:31 2020   Jon   Update   PSL   c1psl communications problem resolved

I investigated the problem reported earlier today with the BIO1 channels. By logging the systemd messages generated when the IOC starts, I was immediately able to determine that the problem was not limited to BIO1. The modbus communications were failing for several other units as well.

Because some in-situ rewiring of a handful of channels had recently been done (more on this soon), I initially suspected that one of the Acromags had been damaged in the process. However, removing BIO1 (or other non-communicating modules) did not restore communications with the rest of the modules. To test whether the chassis was the source of the problem at all, we set up a fresh ADC (new out of the package) and directly connected it to the secondary Ethernet interface of c1psl. With only the one new ADC connected, the modbus IOC failed in exactly the same way.

To confirm that the new ADC did in fact work, we connected it to c1auxex in the same configuration. The unit worked fine connected to c1auxex. This established that the source of the problem was the c1psl host. After some extensive debugging, I traced the problem to a pre-execution script (part of the modbus IOC systemd service) which resets the secondary network interface (the one connected to the Acromag chassis) prior to launching the IOC. This was to ensure the secondary interface always had the correct IP address. It appears this reset was somehow creating a race condition that allowed the modbus initializations (first communications with the Acromags) to sometimes start before the network interface had actually come back up.

I still don't understand how this was happening, or why the pre script worked just fine up until yesterday, but eliminating the network interface reset fixes the problem in 100% of the trials we ran. Unfortunately we lost the entire day to debugging this problem, so the final round of testing is still to be completed. We plan to pick it back up tomorrow afternoon.
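One way to make the pre-execution step robust against this kind of race would be to wait explicitly for the interface to report itself up before the IOC is launched, rather than relying on timing. The sketch below is not the script that was removed; it is a hypothetical guard, and the interface name is a placeholder.

import time

IFACE = "eth1"   # placeholder name for the interface facing the Acromag chassis

def wait_for_interface(iface, timeout=30.0, poll=0.5):
    """Block until /sys/class/net/<iface>/operstate reads 'up', or raise on timeout."""
    path = f"/sys/class/net/{iface}/operstate"
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with open(path) as f:
                if f.read().strip() == "up":
                    return
        except FileNotFoundError:
            pass               # interface not created yet
        time.sleep(poll)
    raise RuntimeError(f"{iface} did not come up within {timeout} s")

if __name__ == "__main__":
    wait_for_interface(IFACE)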

  14817   Tue Jul 30 09:13:31 2019   gautam   Update   PSL   c1psl keyed, Agilent setup cleared
  1. IMC would not lock. c1psl EPICS channels were unresponsive. I keyed the crate and went through the usual burtrestore/PMC-relocking dance.
  2. While at 1X2, I decided to take this opportunity to clean up the AG4395 setup that has been setup there unused for several weeks now.
    • Unplugged the active probe connected via BNC-T connector to the mixer IF output.
    • Noticed that the active probe (S/N 2850J01450) did not have its power connection connected. According to the manual, this is bad. I don't know if the probe is damaged or not.
    • Moved the AG4395 cart out of the way so that there is a little more room around 1X1/1X2.
  15184   Mon Feb 3 15:22:39 2020   Jon   Update   PSL   c1psl progress/Acromag ADC grounding

I tested the c1psl AO channels on the electronics bench on Friday. While I found all the wiring to be correct, some of the channels exhibited excess noise with all appearances of a grounding problem.

Today Jordan, Gautam, and I investigated this further. It is indeed a grounding problem, but actually with the Acromag ADCs. The Acromag DAC outputs are single-ended (return is grounded), so (for the purpose of a loopback test) I would expect to leave the ADC inputs ungrounded. This is the configuration I tested Friday. Today we also tested driving the ADC with a floating source. The ADC noise behavior is exactly the same, whether the source end is grounded or not.

However, grounding the minus pin of the ADC channel eliminates the noise. We don't understand why this seems to be required irrespective of the driving source, so there is something we're missing about the ADC design. As it turns out, this same fix was made to the AI channels of the previously-upgraded Acromag machines. I know Chub and I had to do this for the AI channels of c1vac, but at the time we thought the source grounding was causing the issue. However, today Jordan and I looked inside c1iscaux, which Chub wired, and confirmed that its AI channels are wired in the same way.

So in any case, Jordan is grounding the c1psl AI channels in the same way as c1iscaux. Once this is done, we'll continue with the bench testing tomorrow.

gautam: here are my notes about this issue when I was doing the c1iscaux testing. As I note there, "previously-upgraded Acromag machines" in the plural may be a bit of a stretch - I have no idea what the grounding situation is in c1susaux / c1auxex for example.

  15115   Fri Jan 10 14:21:19 2020   Yehonathan   Update   PSL   c1psl reboot

PSL controls on the sitemap went blank. Rebooted c1psl. PSL screens seem normal again.

  1976   Tue Sep 8 19:30:33 2009   rana   Update   PSL   c1psl rebooted for new RCPID database settings

The RC thermal PID is now controllable from its own MEDM screen which is reachable from the FSS screen. The slowpid.db and psl.db have been modified to add these records and all seems to be working fine.

Also, I've attached the c1psl startup output that we got on the terminal. This is just for posterity.

I'm also done tuning the PID for now. Using Kp = -1.0, Ki = -0.01, and Kd = 0, the can servo now has a time constant of ~10 minutes and good damping, as can be seen in the StripTool snap below. These values are also now in the saverestore.req so hopefully it's fully commissioned.
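For reference, a minimal discrete PID loop with the gains quoted above (Kp = -1.0, Ki = -0.01, Kd = 0) would look like the following sketch. This is only to make the sign convention explicit; the actual servo is implemented in the EPICS database and MEDM screen, not in Python, and the setpoint/measurement numbers here are purely illustrative.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# e.g. one heater-drive correction per second from the RC can temperature readback
pid = PID(kp=-1.0, ki=-0.01, kd=0.0, dt=1.0)
drive = pid.step(setpoint=35.0, measurement=34.8)   # illustrative numbers only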

I bet that it's much better now than the MINCO at holding against the 24 hour cycle and can nicely handle impulses (like when Steve scans the table). Let's revisit this in a week to see if it requires more tuning.

Attachment 1: c1psl-term-dump.txt.gz
Attachment 2: C1PSL_FSS_RCPID.png
Attachment 3: Picture_1.png
  1983   Thu Sep 10 18:25:15 2009   rana   Update   PSL   c1psl rebooted for new RCPID database settings

I added a new database record (C1:PSL-FSS_RCPID_SETPOINT) to allow for changing of the RC setpoint while the loop is on. This will enable us to step the can's temperature and see the result in the NPRO's SLOWDC.

 

  15231   Thu Feb 27 17:50:36 2020   gautam   Update   PSL   c1psl setup setup

[many people]

in prep for the install tomorrow, we did the following:

  • Install the c1psl Supermicro in the 1X2 rack (Attachment 1). To make room we removed the anti-image filter and mounted it on the OMC rack.
  • Set up a local workstation (monitor+mouse+keyboard) for the Supermicro so we can do some local testing (Attachment 2).
  • Clear up the immediate area around the 1X1/1X2 rack, setup a cart for the Acromag.
  • Make sure there are sufficient adaptor boards and cables (DB37, DB15, DB9, DB25, Ethernet) etc. available at the cart.
  • Label cables, connect on Acromag chassis end (Attachment 3).
  • Keep some large (A3) printouts of the channel mapping handy by the cart.
  • made sure we have open fuse-able DIN rail connectors for +/-15 V DC and +/-24 V DC for the Acromag box (we are waiting on some thinner-gauge cabling for the 24 V supply; once that arrives, we will power the box from the Sorensens. For now, they are powered by bench supplies on the cart).
  • made sure c1psl1 (still this name for the Supermicro) is ssh-able.

Barring objections, tomorrow (Friday 28 Feb 2020) morning I will commence the switch (I still want to work on the IFO tonight).

Attachment 1: 20200227_173535.jpg
Attachment 2: 20200227_173454_HDR.jpg
Attachment 3: 20200227_172659.jpg
  15234   Fri Feb 28 08:05:22 2020   gautam   Update   PSL   c1psl setup setup

And so it begins.

Quote:

Barring objections, tomorrow (Friday 28 Feb 2020) morning I will commence the switch

  15235   Fri Feb 28 10:04:41 2020   gautam   Update   PSL   c1psl setup setup

Summary:

There are several problems evident already.

  1. Several EPICS database entries were missing. WTF.
  2. After fixing the missing entries, the PMC could be locked. However, the IMC could not be locked.
  3. I think the FSS Interface card is not configured correctly.

For now, I've returned the old c1psl connections, the PMC and IMC are both locked. Need to do some debugging on the bench.

  15239   Mon Mar 2 16:35:12 2020   gautam   Update   CDS   c1psl test status

Channel list with test status
== Test Status ==

[done] Lock PMC and IMC
[done] IMC Servo board test
[done] IMC LO Det Mon channel check
[0th order] WFS quadrant DC mon
[none] WFS I/F monitors
[0th order] WFS attenuators
[none] IOO QPD channels
[done] FSS readbacks 
[done] PMC readbacks


Some more detailed elogs about the individual tests will follow.

Basically, I have characterized the IMC Servo board in detail. The summary finding is that the IN2 (=AO gain) slider needs to be investigated. 

All other channels need to be verified more thoroughly; my basic checks were only meant to guarantee the core interferometer functionality, which is what matters most to me.

  12849   Thu Feb 23 15:48:43 2017   johannes   Update   Computers   c1psl un-bootable

Using the PDA520 detector on the AS port I tried to get some better estimates for the round-trip loss in both arms. While setting up the measurement I noticed some strange output on the scope I'm using to measure the amount of reflected light.

The interferometer was aligned using the dither scripts for both arms. Then, ITMY was majorly misaligned in pitch AND yaw such that the PD reading did not change anymore. Thus, only light reflected from the XARM was incident on the AS PD. The scope was showing strange oscillations (Channel 2 is the AS PD signal):

For the measurement we compare the DC level of the reflection with the ETM aligned (and the arm locked) vs a misaligned ETM (only ITM reflection). This ringing could be observed in both states, and was qualitatively reproducible with the other arm. It did not show up in the MC or ARM transmission. I found that changing the pitch of the 'active' ITM (=of the arm under investigation) either way by just a couple of ticks made it go away and settle roughly at the lower bound of the oscillation:

In this configuration the PD output follows the mode cleaner transmission (Channel 3 in the screen caps) quite well, but we can't take the differential measurement like this, because it is impossible to align and lock the arm but then misalign the ITM. Moving the respective other ITM for potential secondary beams did not seem to have an obvious effect, although I do suspect a ghost/secondary beam to be the culprit for this. I moved the PDA520 on the optical table but didn't see a change in the ringing amplitude. I do need to check the PD reflection though.

Obviously it will be hard to determine the arm loss this way, but for now I used the averaging function of the scope to get rid of the ringing. What this gave me was:
(16 +/- 9) ppm losses in the x-arm and (-18+/-8) ppm losses in the y-arm

The negative loss obviously makes little sense, and even the x-arm number seems a little too low to be true. I strongly suspect the ringing is responsible and wanted to investigate this further today, but a problem with c1psl came up that shut down all work on this until it is fixed:

I found the PMC unlocked this morning and c1psl (amongst other slow machines) was unresponsive, so I power-cycled them. All except c1psl came back to normal operation. The PMC transmission, as recorded by c1psl,  shows that it has been down for several days:

Repeated attempts to reset and/or power-cycle it by Gautam and myself could not bring it back. The fail indicator LED of a single daughter card (the DOUT XVME-212) turns off after reboot, all others stay lit. The sysfail LED on the crate is also on, but according to elog 10015 this is 'normal'. I'm following up that post's elog tree to monitor the startup of c1psl through its system console via a serial connection to find out what is wrong.

  12850   Thu Feb 23 18:52:53 2017   rana   Update   Computers   c1psl un-bootable

The fringes seen on the oscope are most likely due to the interference from multiple light beams. If there are laser beams hitting mirrors which are moving, the resultant interference signal could be modulated at several Hertz, if, for example, one of the mirrors had its local damping disabled.

  12851   Thu Feb 23 19:44:48 2017   johannes   Update   Computers   c1psl un-bootable

Yes, that was one of the things that I wanted to look into. One thing Gautam and I did that I didn't mention was to reconnect the SRM satellite box and move the optic around a bit, which didn't change anything. Once the c1psl problem is fixed we'll resume with that.

Quote:

The fringes seen on the oscope are most likely due to the interference from multiple light beams. If there are laser beams hitting mirrors which are moving, the resultant interference signal could be modulated at several Hertz, if, for example, one of the mirrors had its local damping disabled.

 

Speaking of which:

Using one of the grey RJ45-to-D-Sub cables with an RS232-to-USB adapter I was able to capture the startup log of c1psl (using the USB camera Windows laptop). I also logged the startup of the "healthy" c1aux; both are attached. c1psl stalls at the point where c1aux starts testing for present VME modules and does not continue; however it is not strictly hung up, as it still registers to the logger when external login attempts via telnet occur. The telnet client simply reports that the "shell is locked" and exits. It is possible that one of the daughter cards causes this. This seems to happen after iocInit is called by the startup script at /cvs/cds/caltech/target/c1psl/startup.cmd, as it never gets to the next item "coreRelease()". Gautam and I were trying to find out what happens inside iocInit, but it's not clear to us at this point from where it is even called. iocInit.c and compiled binaries exist in several places on the shared drive. However, all belong to R3.14.x EPICS releases, while the logfile states that the R3.12.2 EPICS core is used when iocInit is called.

Next we'll interrupt the autoboot procedure and try to work with the machine directly.
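For anyone repeating this, a capture like the one attached could also be taken with a few lines of pyserial instead of a terminal program; the device path and baud rate below are guesses for a typical RS232-to-USB adapter, not the settings actually used here.

import serial   # pyserial, assumed installed

with serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1) as port, \
        open("c1psl_startup.log", "wb") as log:
    print("Capturing console output; power-cycle the crate now (Ctrl-C to stop).")
    try:
        while True:
            data = port.read(4096)   # returns b'' if nothing arrives within the timeout
            if data:
                log.write(data)
                log.flush()
    except KeyboardInterrupt:
        pass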

Attachment 1: slow_startup_logs.tar.gz
  12854   Tue Feb 28 01:28:52 2017   johannes   Update   Computers   c1psl un-bootable

It turned out the 'ringing' was caused by the respective other ETM still being aligned. For these reflection measurements both test masses of the other arm need to be misaligned. For the ETM it's sufficient to use the Misalign button in the medm screens, while the ITM has to be manually misaligned to move the reflected beam off the PD.

I did another round of armloss measurements today. I encountered some problems along the way

  • Some time today (around 6 pm) most of the front end models had crashed and needed to be restarted. [GV: actually it was only the models on c1lsc that had crashed. I noticed this on Friday too.]
  • ETMX keeps getting kicked up seemingly randomly. However, it settles quickly back into its original position.

General Stuff:

  • Oscilloscope should sample both MC power (from MC2 transmitted beam) and AS signal
  • Channel data can only be loaded from the scope one channel at a time, so 'stop' scope acquisition and then grab the relevant channels individually
  • Averaging needs to be restarted every time the mirrors are moved; triggering stop and run remotely via the HTTP interface scripts does this.

Procedure:

  1.     Run LSC Offsets
  2.     With the PSL shutter closed measure scope channel dark offsets, then open shutter
  3.     Align all four test masses with dithering to make sure the IFO alignment is in a known state
  4.     Pick an arm to measure
  5.     Turn the other arm's dither alignment off
  6.     'Misalign' that arm's ETM using medm screen button
  7.     Misalign that arm's ITM manually (after disabling its OpLev servos) while watching the AS port camera, and make sure its beam no longer hits the PD.
  8.     Disable dithering for primary arm
  9.     Record MC and AS time series from (paused) scope
  10.     Misalign primary ETM
  11.     Repeat scope data recording

Each pair of readings gives the reflected power at the AS port normalized to the IMC stored power:

\widehat{P}=\frac{P_{AS}-\overline{P}_{AS}^\mathrm{dark}}{P_{MC}-\overline{P}_{MC}^\mathrm{dark}}

which is then averaged. The loss is calculated from the ratio of reflected power in the locked (L) vs misaligned (M) state from

\mathcal{L}=\frac{T_1}{4\gamma}\left[1-\frac{\overline{\widehat{P}_L}}{\overline{\widehat{P}_M}} +T_1\right ]-T_2

Acquiring data this way yielded P_L/P_M=1.00507 +/- 0.00087 for the X arm and P_L/P_M=1.00753 +/- 0.00095 for the Y arm. With \gamma_x=0.832 and \gamma_y=0.875 (from m1=0.179, m2=0.226 and 91.2% and 86.7% mode matching in the X and Y arm, respectively) this yields round trip losses of:

\mathcal{L}_X=21\pm4\,\mathrm{ppm}  and  \mathcal{L}_Y=13\pm4\,\mathrm{ppm}, which is assuming a generalized 1% error in test mass transmissivities and modulation indices. As we discussed, this seems a little too good to be true, but at least the numbers are not negative.
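As a numerical cross-check of the loss formula above (a sketch only): the power ratios and gammas are taken from the text, while the test-mass transmissivities T1 (ITM) and T2 (ETM) are assumed nominal 40m values and are not quoted in this entry.

def round_trip_loss(p_ratio, gamma, T1=1.384e-2, T2=13.7e-6):
    """L = T1/(4*gamma) * [1 - P_L/P_M + T1] - T2  (round-trip power loss)."""
    return T1 / (4.0 * gamma) * (1.0 - p_ratio + T1) - T2

for arm, p_ratio, gamma in [("X", 1.00507, 0.832), ("Y", 1.00753, 0.875)]:
    print(f"{arm} arm: {round_trip_loss(p_ratio, gamma) * 1e6:.0f} ppm")
# with these assumed T1/T2 the output lands near the ~21 ppm and ~13 ppm quoted above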

  14455   Thu Feb 14 23:14:12 2019   gautam   Update   CDS   c1rfm errors

The pressure is still 2e-4 torr according to CC1 so I thought I'd give ASS debugging a go tonight. But the arm transmission signal isn't coming through to the LSC model from the end PDs - so a resurfacing of this problem. Rebooting the sender model, c1scy, did not fix the problem. Moreover, c1susaux is dead. The last time I rebooted it, ITMY got stuck so I'm not going to attempt a revival tonight.

  15240   Mon Mar 2 19:32:41 2020   gautam   Update   CDS   c1rfm errors

Had to reboot both end machines and the c1rfm model to get the TRX and TRY signals to the LSC models. Now both arms can be locked using POX/POY respectively.

Attachment 1: RFMerrors.png
  14457   Fri Feb 15 15:22:08 2019   gautam   Update   CDS   c1rfm errors persist

I restarted c1scy and c1rfm (so both sender and receiver models were cycled) and power-cycled the c1iscey and c1sus machines. The TRY PD is certainly seeing light - it is just not getting piped over to c1rfm. dmesg doesn't give any clues. I'm out of ideas.

P.S. The new reality seems to be that getting ITMY stuck in the event of a c1susaux reboot is inevitable. As is the practice for ITMX, I tried slowly ramping the PIT and YAW biases to 0 - but in the process of ramping YAW to 0, the optic got stuck. I am ramping in steps of 0.1 (in units of the PIT/YAW sliders, waiting ~3 seconds between steps); I guess I can try ramping even more slowly.

Update: I power cycled the physical RFM switch. This necessitated reboot of all vertex FEs. But seems like things are back to normal now...

Note: to unstick ITMY, seems like the best approach is:

  1. Jiggle the bias until the SIDE shadow sensor is on average above its half-light level. This is the critical step. A bias of +20000 cts on the fast SIDE output seems to help.
  2. Set the YAW bias to -10, then ramp down the BIAS in steps of 0.1 (a scripted-ramp sketch follows this list), watching the shadow sensor levels to ensure the optic doesn't get stuck again.
  3. Hope for the best. Iterate if necessary.
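A scripted version of the slow ramp in step 2 could look like the sketch below, assuming pyepics is available; the channel name and the step/dwell values are illustrative, and the shadow sensors should be watched while it runs.

import time
import epics

def ramp_bias(channel, target, step=0.1, dwell=3.0):
    """Ramp an alignment bias slider to `target` in small steps, pausing between steps."""
    current = epics.caget(channel)
    direction = 1.0 if target > current else -1.0
    while abs(target - current) > step:
        current += direction * step
        epics.caput(channel, current, wait=True)
        time.sleep(dwell)
    epics.caput(channel, target, wait=True)

# e.g. bring a yaw bias back to zero slowly (channel name is a guess; check the MEDM screen)
ramp_bias("C1:SUS-ITMY_YAW_OFFSET", 0.0)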
Quote:

The pressure is still 2e-4 torr according to CC1 so I thought I'd give ASS debugging a go tonight. But the arm transmission signal isn't coming through to the LSC model from the end PDs - so a resurfacing of this problem. Rebooting the sender model, c1scy, did not fix the problem. Moreover, c1susaux is dead. The last time I rebooted it, ITMY got stuck so I'm not going to attempt a revival tonight.

Attachment 1: Screenshot_from_2019-02-15_15-21-47.png
  15920   Mon Mar 15 20:22:01 2021   gautam   Update   ASC   c1rfm model restarted

On Friday, I felt that the ASC performance when the PRFPMI was locked was not as good as it used to be, so I looked into the situation a bit more. As part of my ASC model revamp in December, I made a bunch of changes to the signal routing, and my suspicion was that the control signals weren't even reaching the ETMs. My log says that I recompiled and reinstalled the c1rfm model (used to pipe the ASC control signals to the ETMs), and indeed, the file was modified on Dec 21. But for whatever reason, the C1RFM.ini (=Dolphin receiver since the ASC control signals are sent to this model over the Dolphin network from the c1ioo machine which hosts the C1:ASC- namespace, and RFM sender to the ETMs, but this path already existed) file never picked up the new channels. Today, I recompiled, re-installed, and restarted the models, and confirmed that the control signals actually make it to the ETMs. So now we can have the QPD-based ASC loops engaged once again for the PRFPMI lock. The CDS system did not crash 🎉 . See Attachments #1-3.

I checked the loop performance in the POX/POY locked config by first deliberately misaligning the ETMs, and then engaging the loops - seems to work (Attachment #4). The loop shapes have to be tweaked a bit and I didn't engage the integrators, hence the DC pointing wasn't recovered. Also, added a line to the script that turns the ASC loops on to set limits for all the loops - in the testing process, one of the loops ran away and I tripped the ETMY watchdog. It has since been recovered. I SDFed a limit of 100cts just to be on the conservative side for model reboot situations - the value in the script can be raised/lowered as deemed necessary (sorry, I don't know the cts-->urad number off the top of my head).

But the hope is this improves the power buildup, and provides stability so that I can begin to commission the AS WFS system a bit.

Attachment 1: RFM.png
Attachment 2: CDSoverview.png
Attachment 3: RFMchans.png
Attachment 4: ASCloops.png
  11883   Tue Dec 15 11:22:53 2015   gautam   Update   CDS   c1scx and c1asx crashed

I noticed what I thought was excessive movement of the beam spot on ITMX and ETMX on the control room monitors, and when I checked the CDS FE status overview MEDM screen, I saw that c1scx and c1asx had crashed. I ssh-ed into c1iscex and restarted both models, and then restarted fb as well. However, the DAQ-DCO_C1SCX_STATUS indicator remains red even after restarting fb (see attached screenshot). I am not sure how to fix this so I am leaving it as is for now, and the X arm looks to have settled down.

Attachment 1: CDS_FE_STATUS_OVERVIEW_15DEC2015.png
  7008   Mon Jul 23 18:57:52 2012   Jamie   Update   CDS   c1scx and c1scy models recompiled and restarted

After the changes listed in 7005 and 7007, I have rebuilt, installed, and restarted the c1scx and c1scy models.  Everything seems to have come back up ok.

Running into some daqd troubles because of a change to c1ioo, but will report on the new ALS channels when I can.

  6436   Thu Mar 22 16:45:06 2012   kiwamu   Update   CDS   c1scx and c1scy not properly running

It seems that neither c1scx nor c1scy is working properly as their ADC counts are showing digital-zeros.

However the IOPs, c1gcx and c1gcy look running fine, and also the IOPs seem successfully recognizing the ADCs according to dmesg.

Also there is one more confusing fact : c1scx and c1scy are synchronizing to the timing signal somehow.

I restarted the c1scx front end model to see if this helps, but unfortunately it didn't work.

As this is not the top priority concern for now, I am leaving them as they are with the watchdogs off.

(I may try hardware rebooting them in this evening)

Quote from #6434

The power was turned back on at 4 pm. It took some time for Suresh to restart the computers. We have damping but things are not perfect yet. Auto BURT did not work well.

 

  6438   Thu Mar 22 17:41:15 2012   suresh   Update   CDS   c1scx and c1scy not properly running

Quote:

It seems that neither c1scx nor c1scy is working properly as their ADC counts are showing digital-zeros.

Quote from #6434

The power was turned back on at 4 pm. It took some time for Suresh to restart the computers. We have damping but things are not perfect yet. Auto BURT did not work well.

 When Steve and I restarted the c1iscex and c1iscey computers after the power shutdown, the models within them did not start up automatically.  I had to start them manually from a terminal in the control room.

I also tried rebooting the FB a couple of times.  Did not make any difference.

Manually starting the c1x05, c1scy and c1x01, c1scx models (with the Burt Restore button ON) did not resolve the issue of zeros in the EPICS screens, though it did re-establish timing.

  6439   Thu Mar 22 23:43:56 2012   Koji   Update   CDS   c1scx and c1scy not properly running

Did you guys check whether the simplant switch is set to "REAL WORLD" mode?

Edit by KI:

Bingo ! The input signals were bypassed to the simplant. I switched the simplant settings to REAL WORLD and now both end suspensions are working fine.

  5535   Sat Sep 24 01:38:14 2011   kiwamu   Update   CDS   c1scx and c1x01 restarted

[Koji / Kiwamu]

 The c1scx and c1x01 realtime processes became frozen. We restarted them around 1:30 by sshing and running the kill/start scripts.

  6175   Fri Jan 6 01:00:56 2012   kiwamu   Update   CDS   c1scx out of sync

Both the c1scx and its IOP realtime processes became out of sync.

Initially I found that the c1scx didn't show any ADC signals, though the sync sign was green.

Then I software-rebooted the c1iscex machine and then it became out of sync.

For tonight this is fine because I am concentrating on the central part anyway.

  4173   Thu Jan 20 04:03:02 2011   kiwamu   Update   CDS   c1scy error

 I found that c1scy was not running due to a daq initialization error.

 I couldn't figure out how to fix it, so I am leaving it to Joe.


 Here is the error messages in the dmesg on c1iscey
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 
 
Before I found this fact, I rebooted c1iscey in order to recover the synchronization with fb.
The synchronization had been lost probably because I shut down the daqd on fb.
  4175   Thu Jan 20 10:15:50 2011   josephb   Update   CDS   c1scy error

This is caused by an insufficient number of active DAQ channels in the C1SCY.ini file located in /opt/rtcds/caltech/c1/chans/daq/.  A quick look (grep -v # C1SCY.ini) indicates there are no active channels.  Experience tells me you need at least 2 active channels.

Taking a look at the activateDAQ.py script in the daq directory, it looks like the C1SCY.ini file is included, but the loop over optics is missing ETMY.  This caused the file to be improperly updated when the activateDAQ.py script was run.  I have fixed the C1SCY.ini file (ran a modified version of the activate script on just C1SCY.ini).

I have restarted the c1scy front end using the startc1scy script and it is currently working.
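A small script along the lines of that grep check, for counting the active (uncommented) channel blocks in a DAQ .ini file, is sketched below. It assumes the usual CDS format in which each active channel is a [C1:...] section and commented-out channels begin with '#'.

import sys

def active_channels(ini_path):
    chans = []
    with open(ini_path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("[") and line.endswith("]") and line != "[default]":
                chans.append(line.strip("[]"))
    return chans

if __name__ == "__main__":
    chans = active_channels(sys.argv[1])    # e.g. /opt/rtcds/caltech/c1/chans/daq/C1SCY.ini
    print(f"{len(chans)} active channels")  # the front end reportedly needs at least 2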

Quote:
 Here is the error messages in the dmesg on c1iscey
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 

 

  8626   Thu May 23 10:24:23 2013   Jamie   Summary   CDS   c1scy model continues to run at the hairy edge

c1scy, the controller model at the Y END, is still running very long, typically at 55/60 microseconds, or ~92% of its cycle.  It's currently showing a recorded max cycle time (since last restart or reset) of 60, which means that it has actually hit its limit sometime in the very recent past.  This is obviously not good, since it's going to inject big glitches into ETMY.

c1scy is actually running a lot less code than c1scx, but c1scx caps out its load at about 46 us.  This indicates to me that it must be some hardware configuration setting in the c1iscey computer.

I'll try to look into this more as soon as I can.

  9441   Wed Dec 4 21:33:24 2013   Koji   Update   CDS   c1scy time-over issue mitigated

c1scy had frequent time-over. This caused the glitches of the OSEM damping servos.

Today Eric Q was annoyed by the glitches while he worked on the green PDH inspection at the Y-end.

In order to mitigate this issue, low priority RFM channels are moved from c1scy to c1tst.
The moved channels (see Attachment 1) are supposed to be less susceptible to the additional delay.

This modification required the following models to be modified, recompiled, reinstalled, and restarted
in the listed order:
c1als, c1sus, c1rfm, c1tst, c1scy

Now the models are running. CDS status is all green.
The time consumption of c1scy is now ~30 us (previously ~60 us)
(see Attachment 2)

I am looking at the cavity lock of TEM00 and I have seen no more glitches.
In fact, the OSEM signals have no glitch. (see Attachment 3)

We still have c1mcs regularly going over time. Can I remove the WFS->OAF connections temporarily?

Attachment 1: TST.png
Attachment 2: CDS.png
Attachment 3: no_glitch.png
  5786   Wed Nov 2 17:29:10 2011   Katrin   Update   CDS   c1scy.mdl compiled

Slight modification on that model:

  • terminated Q_out of Lockins to be able to compile the old model
  • assigned other ADC channels to GCY (green YARM)
  16728   Tue Mar 15 14:10:41 2022   Anchal   Summary   CDS   c1su2 model remade, reinstalled, restarted after the update

I have restarted the c1su2 model with the Run/Acquire switch connected to the analog filters on the coil drivers. The following steps were taken:

First ssh to c1sus2 and then:

controls@c1sus2:~ 0$ rtcds make c1su2
buildd: /opt/rtcds/caltech/c1/rtbuild/release
### building c1su2...
Cleaning c1su2...
Done
Parsing the model c1su2...
Done
Building EPICS sequencers...
Done
Building front-end Linux kernel module c1su2...
Done
RCG source code directory:
/opt/rtcds/rtscore/branches/branch-3.4
The following files were used for this build:
/opt/rtcds/userapps/release/cds/common/models/lockin.mdl
/opt/rtcds/userapps/release/cds/common/models/rtbitget.mdl
/opt/rtcds/userapps/release/cds/common/models/rtdemod.mdl
/opt/rtcds/userapps/release/isc/common/models/QPD.mdl
/opt/rtcds/userapps/release/sus/c1/models/c1su2.mdl
/opt/rtcds/userapps/release/sus/c1/models/lib/sus_single_control.mdl

Successfully compiled c1su2
***********************************************
Compile Warnings, found in c1su2_warnings.log:
***********************************************
WARNING  *********** No connection to subsystem output named  SUS_DAC1_12  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_13  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_14  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_15  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_7  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_8  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_9  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_10  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_11  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_12  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_13  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_14  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_15  
***********************************************
controls@c1sus2:~ 0$ rtcds install c1su2
buildd: /opt/rtcds/caltech/c1/rtbuild/release
### installing c1su2...
Installing system=c1su2 site=caltech ifo=C1,c1
Installing /opt/rtcds/caltech/c1/chans/C1SU2.txt
Installing /opt/rtcds/caltech/c1/target/c1su2/c1su2epics
Installing /opt/rtcds/caltech/c1/target/c1su2
Installing start and stop scripts
/opt/rtcds/caltech/c1/scripts/killc1su2
/opt/rtcds/caltech/c1/scripts/startc1su2
Performing install-daq
Updating testpoint.par config file
/opt/rtcds/caltech/c1/target/gds/param/testpoint.par
/opt/rtcds/rtscore/branches/branch-3.4/src/epics/util/updateTestpointPar.pl -par_file=/opt/rtcds/caltech/c1/target/gds/param/archive/testpoint_220315_135808.par -gds_node=26 -site_letter=C -system=c1su2 -host=c1sus2
Installing GDS node 26 configuration file
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1su2.par
Installing auto-generated DAQ configuration file
/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini
Installing Epics MEDM screens
Running post-build script

/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-AS1_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-AS1_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-AS1_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-AS4_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-AS4_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-AS4_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-LO1_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-LO1_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-LO1_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-LO2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-LO2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-LO2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-PR2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-PR2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-PR2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-PR3_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-PR3_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-PR3_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-SR2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-SR2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-SR2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_TO_COIL_KB.adl
safe.snap exists
controls@c1sus2:~ 0$

Then on rossa, run activateSUS2DQ.py which creates a file C1SU2.ini.NEW. Remove old backup file C1SU2.ini.bak, rename C1SU2.ini to C1SU2.ini.bak and rename C1SU2.ini.NEW to C1SU2.ini:

~> cd /opt/rtcds/caltech/c1/chans/daq/
daq>python2 activateSUS2DQ.py 
/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini
daq>rm C1SU2.ini.bak
daq>mv C1SU2.ini C1SU2.ini.bak
daq>mv C1SU2.ini.NEW C1SU2.ini

Then ssh back to c1sus2 and restart the rtcds model:

controls@c1sus2:~ 0$ rtcds restart c1su2
### stopping c1su2...
### starting c1su2...
c1su2epics: no process found
Number of ADC cards on bus = 2
Number of DAC16 cards on bus = 3
Number of DAC18 cards on bus = 0
Number of DAC20 cards on bus = 0
Specified filename iocC1.log does not exist.
c1su2epics C1 IOC Server started
c1su2 RT ready in 4
awg_server Version $Id$
channel_client Version $Id$
testpoint_server Version $Id$
/opt/rtcds/caltech/c1/target/gds/bin/awgtpman -s c1su2 -l /opt/rtcds/caltech/c1/target/gds/awgtpman_logs/c1su2.log started on host c1sus2 hostid ffffffffa8c05771 
awgtpman Version $Id$
controls@c1sus2:~ 0$

Then restart daqd services from rossa and burtrestore to latest snap of c1su2epics.snap:

daq>telnet fb 8083
Trying 192.168.113.201...
Connected to fb.martian.
Escape character is '^]'.
daqd> shutdown
OK
Connection closed by foreign host.
daq>burtgooey
>burtwb -f /opt/rtcds/caltech/c1/burt/autoburt/latest/c1su2epics.snap -l /tmp/controls_1220315_140755_0.write.log -o /tmp/controls_1220315_140755_0.nowrite.snap -v <
daq>

All suspensions are back online and everything is the same as before now. I will test the Run/Acquire switch functionality later.

  16726   Tue Mar 15 11:52:34 2022   Anchal   Summary   CDS   c1su2 model updated for sending Run/Acquire Binary Output to Binary Interface card

I routed the XXX_COIL_DW signals from the 7 SOS blocks in c1su2.mdl (located at /cvs/cds/rtcds/userapps/trunk/sus/c1/models/c1su2.mdl) to the binary outputs from the FE model. The routing is done such that when these binary outputs are routed through the binary interface card mounted on 1Y0, they go to the acromag chassis just installed and from there they go to the binary inputs of the coil drivers together with the acromag controlled coil outputs.

I have not restarted the rtcds models yet. This needs more care and the instructions from 40m/16533 need to be followed. Will do that sometime later, or Koji can follow up this work.

Attachment 1: c1su2.pdf
  16533   Wed Dec 22 17:40:22 2021   Anchal   Summary   CDS   c1su2 model updated with SUS damping blocks for 7 SOSs

[Anchal, Koji]

I've updated the c1su2 model today with model suspension blocks for the 7 new SOSs (LO1, LO2, AS1, AS4, SR2, PR2 and PR3). The model is running properly now but we had some difficulty in getting it to run.

Initially, we were getting a 0x2000 error on the c1su2 model CDS screen. The issue probably was the high data transmission rate required for all 7 SOSs in this model. Koji dug up a script /opt/rtcds/caltech/c1/userapps/trunk/cds/c1/scripts/activateDQ.py that has been used historically for updating the data rate of some of the DQ channels in the suspension block. However, this script was not working properly for Koji, so he created a new script at /opt/rtcds/caltech/c1/chans/daq/activateSUS2DQ.py.

[Ed by KA: I could not make this modified script run such that it replaces the input file (i.e. C1SU2.ini). So the output file is named C1SU2.ini.NEW and the original file needs to be replaced manually.]

With this, Koji was able to reduce the acquisition rate of SUSPOS_IN1_DQ, SUSPIT_IN1_DQ, SUSYAW_IN1_DQ, SUSSIDE_IN1_DQ, SENSOR_UL, SENSOR_UR, SENSOR_LL, SENSOR_LR, SENSOR_SIDE, OPLEV_PERROR, OPLEV_YERROR, and OPLEV_SUM to 2048 Sa/s. The script modifies the /opt/rtcds/caltech/c1/chans/daq/C1SU2.ini file, which would get re-written if the c1su2 model is remade and reinstalled. After this modification, the 0x2000 error stopped appearing and the model is running fine.
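For the record, the essence of what that step accomplishes can be sketched as below: rewrite the datarate of the selected DQ channels in C1SU2.ini and write the result to C1SU2.ini.NEW, which is then swapped in by hand as described above. This is a simplified stand-in assuming the usual 'datarate=' field in the .ini, not the actual activateSUS2DQ.py.

import re

SUFFIXES = ("SUSPOS_IN1_DQ", "SUSPIT_IN1_DQ", "SUSYAW_IN1_DQ", "SUSSIDE_IN1_DQ",
            "SENSOR_UL", "SENSOR_UR", "SENSOR_LL", "SENSOR_LR", "SENSOR_SIDE",
            "OPLEV_PERROR", "OPLEV_YERROR", "OPLEV_SUM")

def downsample_ini(src, dst, rate=2048):
    in_target = False
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if line.startswith("["):
                in_target = any(line.strip("[]\n").endswith(s) for s in SUFFIXES)
            if in_target and line.strip().startswith("datarate"):
                line = re.sub(r"datarate\s*=\s*\d+", f"datarate={rate}", line)
            fout.write(line)

downsample_ini("/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini",
               "/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini.NEW")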


Should we change the library model part for sus_single_control.mdl

We notice that all our suspension models need to go through this weird Python script that modifies auto-generated .ini files to reduce the data rate. Ideally, there is a simpler solution: simply add the datarate 2048 in the '#DAQ Channels' block in the model library part /cvs/cds/rtcds/userapps/trunk/sus/c1/models/lib/sus_single_control.mdl, which is the root model in all the suspensions. With this change, the .ini files would automatically be written with the correct datarate and there would be no need for the activateDQ script. But we couldn't find out why this simple solution was not implemented in the past, so we want to know if there is more going on here than we know. Changing the library model would obviously change every suspension model and we don't want a broken CDS system on our heads at the beginning of the holidays, so we'll leave this delicate task for the near future.

  16537   Wed Dec 29 20:09:40 2021   rana   Summary   CDS   c1su2 model updated with SUS damping blocks for 7 SOSs

We want to maintain the 16 kHz sample rate for the COIL DAQ channels, but nothing wrong with reducing the others.

I would suggest setting the DQ sample rates to 256 Hz for the SUS DAMP channels and 1024 Hz for the OPLEV channels (for noise diagnostics).

Maybe you can put these numbers into a new library part and we can have the best of all worlds?

Quote:
 

Should we change the library model part for sus_single_control.mdl

We notice that all our suspension models need to go through this weird Python script that modifies auto-generated .ini files to reduce the data rate. Ideally, there is a simpler solution: simply add the datarate 2048 in the '#DAQ Channels' block in the model library part /cvs/cds/rtcds/userapps/trunk/sus/c1/models/lib/sus_single_control.mdl, which is the root model in all the suspensions. With this change, the .ini files would automatically be written with the correct datarate and there would be no need for the activateDQ script. But we couldn't find out why this simple solution was not implemented in the past, so we want to know if there is more going on here than we know. Changing the library model would obviously change every suspension model and we don't want a broken CDS system on our heads at the beginning of the holidays, so we'll leave this delicate task for the near future.

 

  7165   Mon Aug 13 20:12:29 2012   jamie   Update   CDS   c1sup model moved to c1lsc machine

I moved the c1sup simplant model to the c1lsc machine, where there was one remaining available processor.  This requires changing a bunch of IPC routing in the c1sus and c1lsp models.  I have rebuilt and installed the models, and have restarted c1sup, but have not restarted c1sus and c1lsp since they're currently in use.  I'll restart them first thing tomorrow.

  6619   Mon May 7 22:39:37 2012   Den   Update   CDS   c1sus

[Jenne, Den]

We decided to reboot the c1sus machine in the hope that this would fix the problem with the seismic channels. After the reboot the machine could not connect to the framebuilder. We restarted mx_stream but this did not help. Then we manually executed

/opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x02 c1sus c1mcs c1rfm c1pem -d fb:0 -l /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log

but c1sus still could not connect to fb. This script returned the following error:

controls@c1sus ~ 128$ cat /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log


c1x02
c1sus
c1mcs
c1rfm
c1pem
mmapped address is 0x7fb5ef8cc000
mapped at 0x7fb5ef8cc000
mmapped address is 0x7fb5eb8cc000
mapped at 0x7fb5eb8cc000
mmapped address is 0x7fb5e78cc000
mapped at 0x7fb5e78cc000
mmapped address is 0x7fb5e38cc000
mapped at 0x7fb5e38cc000
mmapped address is 0x7fb5df8cc000
mapped at 0x7fb5df8cc000
send len = 263596
OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
mx_connect failed

Looks like a CDS error. We are leaving the WATCHDOGS OFF for the night.

  10135   Mon Jul 7 13:44:21 2014   Jenne   Update   CDS   c1sus - bad fb connection

Quote:

 

I managed to recover c1sus.  It required stopping all the models, and the restarting them one-by-one:

$ rtcds stop all     # <-- this does the right thing to stop all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.

 In addition to needing obnoxiously regular mxstream restarts, this afternoon the sus machine was doing something slightly differently.  Only 1 fb block per core was red (the mxstream symptom is 3 fb-related blocks are red per core), and restarting the mxstream didn't help.  Anyhow, I was searching through the elog, and this entry to which I'm replying had similar symptoms.  However, by the time I went back to the CDS FE screen, c1sus had regular mxstream symptoms, and an mxstream restart fixed things right up. 

So, I don't know what the issue is or was, nor do I know why it is fixed, but it's fine for now; I wanted to make a note for the future.

  3945   Thu Nov 18 11:06:20 2010   josephb   Update   CDS   c1sus and ADCs

Problem:

ADCs are timing out on c1sus when we have more than 3.

Talked with Rolf:

Alex will be back tomorrow (he took yesterday and today off), so I talked with Rolf.

He said ordering shouldn't make a difference and he's not sure why we would be having a problem. However, when he loads the chassis, he tends to put all the ADCs on the same PCI bus (the back plane apparently contains multiples).  Slot 1 is its own bus, slots 2-9 should be the same bus, and slots 10-17 should be the same bus.

He also mentioned that when you use dmesg and see a line like "ADC TIMEOUT # ##### ######", the first number should be the ADC number, which is useful for determining which one is reporting back slow.
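Following that hint, a quick way to tally which ADC the timeouts point at is to parse dmesg for those lines; a rough sketch, assuming the message format quoted above:

import re
import subprocess

def adc_timeouts():
    out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    hits = re.findall(r"ADC TIMEOUT\s+(\d+)", out)
    counts = {}
    for adc in hits:
        counts[adc] = counts.get(adc, 0) + 1
    return counts

for adc, n in sorted(adc_timeouts().items()):
    print(f"ADC {adc}: {n} timeout messages")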

Plan:

Disconnect the c1sus IO chassis completely, pull it out, pull out all cards, check connectors, and repopulate following Rolf's suggestions, keeping this elog in mind.

Regarding the RFM, it looks like one of the fibers had been disconnected from the c1sus chassis RFM card (it's plugged in in the middle of the chassis so it's hard to see) during all the plugging in and out of the cables and cards last night.

  4733   Tue May 17 18:09:13 2011   Jamie, Kiwamu   Configuration   CDS   c1sus and c1auxey crashed, rebooted

c1sus and c1auxey crashed, required hard reboot

For some reason, we found that c1sus and c1auxey were completely unresponsive.  We went out and gave them a hard reset, which brought them back up with no problems.

This appears to be related to a very similar problem report by Kiwamu just a couple of days ago, where c1lsc crashed after editing the C1LSC.ini and restarting the daqd process, which is exactly what I just did (see my previous log).  What could be causing this?

  6737   Fri Jun 1 02:33:40 2012   Jenne   Update   Computers   c1sus and c1iscex - bad fb connections

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

  6740   Fri Jun 1 09:50:50 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

  6742   Fri Jun 1 14:40:24 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

I managed to recover c1sus.  It required stopping all the models, and the restarting them one-by-one:

$ rtcds stop all     # <-- this does the right thing to stop all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.

  6738   Fri Jun 1 08:01:46 2012   steve   Update   Computers   c1sus and c1iscex are down

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 

 

Attachment 1: compdown.png
  4183   Fri Jan 21 15:26:15 2011   josephb   Update   CDS   c1sus broken yesterday and now fixed

[Joe, Koji]
Yesterday's CDS swap of c1sus and c1iscex left the interfometer in a bad state due to several issues.

The first was the need to actually power down the IO chassis completely (I eventually waited for a green LED to stop glowing and then plugged the power back in) when switching computers.  I also unplugged and re-plugged the interface cable between the IO chassis and computer while powered down.  This let the computer actually see the IO chassis (previously the host interface card was glowing just red, no green lights).

Second, the former c1iscex computer, now the new c1sus computer, only has 6 CPUs, not 8 like most of the other front ends.  Because it was running 6 models (c1sus, c1mcs, c1rms, c1rfm, c1pem, c1x02) and 1 CPU needed to be reserved for the operating system, 2 models were not actually running (recycling mirrors and PEM).  This meant the recycling mirrors were left swinging uncontrolled.

To fix this I merged the c1rms model with the c1sus model.  The c1sus model now controls BS, ITMX, ITMY, PRM, SRM.  I merged the filter files in the /chans/ directory, and reactivated all the DAQ channels.  The master file for the fb in the /target/fb directory had all references to c1rms removed, and then the fb was restarted via "telnet fb 8088" and then "shutdown".

My final mistake was starting the work late in the day.

So the lesson for Joe is, don't start changes in the afternoon.

Koji has been helping me test the damping and confirm things are really running.  We were having some issues with some of the matrix values.  Unfortunately I had to add them by hand since the previous snapshots no longer work with the models.

  3653   Tue Oct 5 16:58:41 2010   josephb, yuta   Update   CDS   c1sus front end status

We moved the filters for the mode cleaner optics over from the C1SUS.txt file in /opt/rtcds/caltech/c1/chans/ to the C1MCS.txt file, and placed SUS_ on the front of all the filter names.  This has let us load the filters for the mode cleaner optics.

At the moment, we cannot seem to get testpoints for the optics (i.e. dtt is not working, even the specially installed ones on rosalba). I've asked Yuta to enter the correct matrix elements and turn the correct filters on, then save with a burt backup.

  6787   Thu Jun 7 17:49:09 2012   Jamie   Update   CDS   c1sus in weird state, running models but unresponsive otherwise

Somehow c1sus was in a very strange state.  It was running models, but EPICS was slow to respond.  We could not log into it via ssh, and we could not bring up test points.  Since we didn't know what else to do we just gave it a hard reset.

Once it came up, none of the models were running.  I think this is a separate problem with the model startup scripts that I need to debug.  I logged on to c1sus and ran:

rtcds restart all

(which handles proper order of restarts) and everything came up fine.

Have no idea what happened there to make c1sus freeze like that.  Will keep an eye out.

  3946   Thu Nov 18 14:05:06 2010   josephb, yuta   Update   CDS   c1sus is alive!

Problem:

We broke c1sus by moving ADC cards around.

Solution:

We pulled all the cards out, examined all contacts (which looked fine), found 1 poorly connected cable internally, going between an ADC and ADC timing interface card  (that probably happened last night), and one of the two RFM fiber cables pulled out of its RFM card.

We then placed all of the cards back in with a new ordering, tightened down everything, and triple checked all connections were on and well fit.

 

Gotcha!

Joe forgot that slot 1 and slot 2 of the timing interface boards have their last channels reserved for duotone signals.  Thus, they shouldn't be used for any ADCs or DACs that need their last channel (such as MC3_LR sensor input).  We saw a perfect timing signal come in through the MC3_LR sensor input, which prevented damping. 

We moved the ADC timing interface card out of the 1st slot  of the timing interface board and into slot 6 of the timing interface board, which resolved the problem.

Final Configuration:

 

 Timing Interface Board

Slot          Card
1 (Duotone)   None
2 (Duotone)   DAC interface (can't use last channel)
3             ADC interface
4             ADC interface
5             ADC interface
6             ADC interface
7             None
8             None
9             None
10            DAC interface
11            DAC interface
12            None
13            None

 PCIe Chassis

Slot   PCIe Number   Card
0      Do Not Use    None
1      1             ADC
2      6             DAC
3      5             ADC
4      4             ADC
5      9             ADC
6      8             BO
7      7             BO
8      3             BO
9      2             BO
10     14            DAC
11     13            DAC
12     12            BIO
13     17            RFM
14     16            None
15     15            None
16     11            None
17     10            None

Still having Issues with:

ITM West damps.  ITM South damps, but the coil gains are opposite to the other optics in order to damp properly.

We also need to look into switching the channel names for the watchdogs on ITMX/Y in addition to the front end code changes.
