ID   Date   Author   Type   Category   Subject
  17052   Mon Aug 1 18:42:39 2022   Tega   Configuration   BHD   c1sus2 IPC dolphin issue update

[Yuta, Tega]

We decided to give the dolphin debugging another go. First, we noticed that c1sus2 was no longer recognising the dolphin card, which can be checked using

lspci | grep Stargen

or looking at the status light on the dolphin card of c1sus2, which was orange for both ports A and B.
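
For the record, the same check can be scripted so it can be run on several front ends in a row. This is only a rough sketch: the function names are made up for illustration, and the only thing taken from the command above is that the card shows up in the lspci listing with the string "Stargen".

```python
import subprocess


def has_dolphin_card(lspci_output: str, vendor_string: str = "Stargen") -> bool:
    """Return True if any line of the lspci listing contains the vendor string.

    Same case-sensitive substring match as `lspci | grep Stargen`.
    """
    return any(vendor_string in line for line in lspci_output.splitlines())


def check_this_host() -> bool:
    """Run lspci on the local machine and look for the card (hypothetical helper)."""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return has_dolphin_card(result.stdout)
```

Calling check_this_host() on a front end should return False in the state described above, where the card is not recognised.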

We decided to do a hard reboot of c1sus2 and turned off the DAQ chassis for a few minutes, then restarted c1sus2. This solved the card recognition problem as well as the 'dis_irm' driver loading issue (I think the driver does not get loaded if the system does not recognise a valid card, as I also saw the missing dis_irm driver module on c1testand).

Next, we confirmed the status of all dolphin cards on fb1, using

controls@fb1$ /opt/DIS/sbin/dxadmin

It looks like the dolphin card on c1sus2 has now been configured and is available to all other nodes. We then restarted all the FE machines and models to see if we were in the clear. Unfortunately, we were not so lucky: the problem persisted.

Looking at the output of 'dmesg', we could only identify two notable differences between the operational dolphin cards on c1sus/c1ioo/c1lsc and the one on c1sus2, namely the card number being equal to zero and the memory addresses, which are also zero (see image below).

Anyway, at least we can now eliminate driver issues and will move on to debugging the models next.

Attachment 1: c1sus2_dolphin.png
Attachment 2: fb1_dxamin_status.png
Attachment 3: dolphin_num_mem_init2.png
  16414   Tue Oct 19 18:20:33 2021   Ian MacMillan   Summary   CDS   c1sus2 DAC to ADC test

I ran a DAC to ADC test on c1sus2 channels where I hooked up the outputs on the DAC to the input channels on the ADC. We used different combinations of ADCs and DACs to make sure that there were no errors that cancel each other out in the end. I took a transfer function across these channel combinations to reproduce figure 1 in T2000188.

As seen in the two attached PDFs, the channels seem to be working properly: they have a flat response with a gain of 0.5 (-6 dB). This is the expected response and is the result of the DAC signal being sent as a single-ended signal and the ADC receiving it as a differential input signal. This should result in a recorded signal of 0.5 times the amplitude of the actual output signal.

The drop off on the high frequency end is the result of the anti-aliasing filter and the anti-imaging filter. Both of these are 8-pole elliptical filters so when combined we should get a drop off of 320dB per decade. I measured the slope on the last few points of each filter and the averaged value was around 347dB per decade. This is slightly steeper than expected but since it is to cut off higher frequencies it shouldn't have an effect on the operation of the system. Also it is very close to the expected value.
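
The numbers above can be cross-checked with a few lines of Python. This is a sketch, not the script used for the plots: the function names are invented here, and the 20 dB per decade per pole rule is the simple asymptotic estimate that the 320 dB figure appears to be based on (an elliptic filter has stopband zeros, so the slope close to the cutoff need not match it exactly).

```python
import math


def gain_to_db(gain: float) -> float:
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(gain)


def expected_rolloff_db_per_decade(n_filters: int, poles_per_filter: int) -> float:
    """Asymptotic estimate: 20 dB per decade for every pole in the chain."""
    return 20.0 * n_filters * poles_per_filter


def fitted_slope_db_per_decade(freqs_hz, mags_db) -> float:
    """Least-squares slope of magnitude (dB) against log10(frequency)."""
    xs = [math.log10(f) for f in freqs_hz]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mags_db) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mags_db))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Gain of 0.5 expressed in dB, and the AA + AI estimate (two 8-pole filters)
print(round(gain_to_db(0.5), 2))
print(expected_rolloff_db_per_decade(n_filters=2, poles_per_filter=8))
```

Feeding the last few (frequency, magnitude) points of a measured transfer function into fitted_slope_db_per_decade gives a slope that can be compared with the estimate.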

The ripples seen before the drop off are also an effect of the elliptical filters and are seen in T2000188.

Note: the transfer function that doesn't seem to match the others is the heartbeat timing signal.

Attachment 1: data3_Plots.pdf
Attachment 2: data2_Plots.pdf
  16415   Tue Oct 19 23:43:09 2021   Koji   Summary   CDS   c1sus2 DAC to ADC test

(Because of a totally unrelated reason) I was checking the electronics units for the upgrade. And I realized that the electronics units at the test stand have not been properly powered.

I found that the AA/AI stack at the test stand (Attachment 1) has an unusual powering configuration (Attachment 2).
- Only the positive power supply was used
- The supply voltage is only +15V
- The GND reference is not connected to anywhere.

For confirmation, I checked the voltage across the DC power strip (Attachments 3/4). The positive was +5.3V and the negative was -9.4V. This is subject to change depending on the earth potential.

This is not a good condition at all. The asymmetric powering of the circuit may damage the opamps. So I turned off the switches of the units.

The power configuration should be immediately corrected.

  1. Use both positive and negative supply (2 power supply channels) to produce the positive and the negative voltage potentials. Connect the reference potential to the earth post of the power supply.
    https://www.youtube.com/watch?v=9_6ecyf6K40   [Dual Power Supply Connection / Serial plus minus electronics laboratory PS with center tap]
  2. These units have a DC power regulator which produces +/-15V out of +/-18V. So the DC power supplies are supposed to be set at +18V.

 

Attachment 1: P_20211019_224433.jpg
Attachment 2: P_20211019_224122.jpg
Attachment 3: P_20211019_224400.jpg
Attachment 4: P_20211019_224411.jpg
  16430   Tue Oct 26 18:24:00 2021   Ian MacMillan   Summary   CDS   c1sus2 DAC to ADC test

[Ian, Anchal, Paco]

After Koji found that there was a problem with the power source, Anchal and I fixed the power and then reran the measurement. The only change this time around is that I increased the excitation amplitude to 100. In the first run the excitation amplitude was 1, which seemed to come out noise free but is too low to give a reliable value.

link to previous results

The new plots are attached.

Attachment 1: data2_Plots.pdf
Attachment 2: data3_Plots.pdf
  3638   Fri Oct 1 18:19:24 2010   josephb, kiwamu   Update   CDS   c1sus work

The c1sus model was split into 2, so that c1sus controls BS, PRM, SRM, ITMX, ITMY, while c1mcs controls MC1, MC2, MC3.  The c1mcs model uses shared memory to tell c1sus what signals to send to the binary outputs (which control the analog whitening/dewhitening filters), since two models can't control the same binary output.

This split was done because the CPU time was running above 60 microseconds (the limit allowable since we're trying to run at 16kHz). Apparently the work Alex had done getting testpoints working had put a greater load on the cpu and pushed it over an acceptable maximum.    After removing the MC optics controls, the CPU time dropped to about 47 microseconds from about 67 microseconds.  The c1mcs is taking about 20 microseconds per cycle.
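
As a sanity check on the timing budget, the per-cycle limit follows directly from the model rate. The sketch below assumes "16kHz" means 16384 Hz (2^14 samples per second); that exact value is an assumption, and the helper names are made up for illustration.

```python
def cycle_period_us(rate_hz: float) -> float:
    """Time available per cycle, in microseconds, for a model running at rate_hz."""
    return 1e6 / rate_hz


def cpu_margin_us(rate_hz: float, cpu_time_us: float) -> float:
    """Remaining headroom per cycle; a negative value means the model overruns."""
    return cycle_period_us(rate_hz) - cpu_time_us


MODEL_RATE_HZ = 16384  # assumed exact value of the "16kHz" rate

print(cycle_period_us(MODEL_RATE_HZ))      # budget per cycle, about 61 microseconds
print(cpu_margin_us(MODEL_RATE_HZ, 67.0))  # before the split: negative, i.e. overrun
print(cpu_margin_us(MODEL_RATE_HZ, 47.0))  # after the split: positive headroom
```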

The new model is using the top_names functionality to still call the channels C1SUS-XXX_YYY.  However, the directory to find the actual medm filter modules is /opt/rtcds/caltech/c1/medm/c1mcs, and the gds testpoint screen for that model is called C1MCS-GDS_TP.adl.  I'm currently in the process of updating the medm screens to point to the correct location.

Also, while plugging in the cables from the coil dewhitening boards, we realized I (Joe) had made a mistake in the assignment of channels to the binary output boards.  I need to re-examine Jay's old drawings and fix the simulink model binary outputs.

  3665   Thu Oct 7 10:37:42 2010   josephb   Update   CDS   c1sus with flaky ssh

Currently trying to understand why the ssh connections to c1sus  are flaky.  This morning, every time I tried to make the c1sus model on the c1sus machine, the ssh session would be terminated at a random spot midway through the build process.  Eventually restarting c1sus fixed the problem for the moment.

However, earlier within the last 48 hours, the c1sus machine had stopped responding to ssh logins while still appearing to be running the front end code.  The next time this occurs, we should attach a monitor and keyboard and see what kind of state the computer is in.  It's interesting to note that we didn't have these problems before we switched over to the Gentoo kernel from the real-time linux Centos 5.5 kernel.

  3160   Tue Jul 6 17:07:56 2010   josephb   Update   CDS   c1sus status

I talked to Alex, and he explained the steps necessary to get the real time linux kernel installed.  Basically: copy the files in the directory /opt/rtlidk-2.2 from c1iscex (the one he installed last month) to c1sus locally.  Then go into rtlinux_kernel_2_6, and run make and make install (or something like that - need to look at the make file).  Then edit the grub loader file to look like the one on c1iscex (located at /boot/grub/menu.lst).

This will then hopefully let us try out the RCG code on c1sus and see if it works.

  3662   Wed Oct 6 16:16:48 2010   josephb, yuta   Update   CDS   c1sus status

At the moment, c1sus and c1mcs on the c1sus machine seem to be dead in the water.  At this point, it is unclear to me why.

Apparently during the 40m meeting, Alex was able to get test points working for the c1mcs model.  He said he "had to slow down mx_stream startup on c1sus".   When we returned at 2pm, things were running fine. 

We began updating all the matrix values on the medm screens.  Somewhere towards the end of this the c1sus model seemed to have crashed, leaving only c1x02 and c1mcs running.  There were no obvious error messages I saw in dmesg and the target/c1sus/logs/log.txt file (although that seems to be empty these days).  We quickly saved two burt snapshots, one of c1sus and one of c1mcs, and put them in the /opt/rtcds/caltech/c1/target/snapshots directory temporarily.  We then ran the killc1sus script on c1sus, and then after confirming the code was removed, ran the startup script, startc1sus.  The code seemed to come back partly.  It was syncing up and finding the ADC/DAC boards, but not doing any real computations.  The cycle time was reporting reasonable values, but the usr time (representing computation done for the model) was 0.  There were no updating monitor channels on the medm screens and filters would not turn on.

At this point I tried bringing down all 3 models, and restarting c1x02, then c1sus and c1mcs.  At this point, both c1sus and c1mcs came back partly, doing no real calculations.  c1x02 appears to be working normally (or at least the two filter banks in that model are showing changing channels from ADCs properly).  I then tried rebooting the c1sus machine.  It came back in the same state, working c1x02, non-calculating c1sus and c1mcs.

  3666   Thu Oct 7 10:48:41 2010   josephb, yuta   Update   CDS   c1sus status

This problem has been resolved.

Apparently during one of Alex's debugging sessions, he had commented out the feCode function call on line 1532 of the controller.c file (located in /opt/rtcds/caltech/c1/core/advLigoRTS/src/fe/ directory).

This function is the one that actually calls all the front end specific code and without it, the code just doesn't do any computations.  We had to then rebuild the front end codes with this corrected file.

Quote:

At the moment, c1sus and c1mcs on the c1sus machine seem to be dead in the water.  At this point, it is unclear to me why.

Apparently during the 40m meeting, Alex was able to get test points working for the c1mcs model.  He said he "had to slow down mx_stream startup on c1sus".   When we returned at 2pm, things were running fine. 

We began updating all the matrix values on the medm screens.  Somewhere towards the end of this the c1sus model seemed to have crashed, leaving only c1x02 and c1mcs running.  There were no obvious error messages I saw in dmesg and the target/c1sus/logs/log.txt file (although that seems to be empty these days).  We quickly saved two burt snapshots, one of c1sus and one of c1mcs, and put them in the /opt/rtcds/caltech/c1/target/snapshots directory temporarily.  We then ran the killc1sus script on c1sus, and then after confirming the code was removed, ran the startup script, startc1sus.  The code seemed to come back partly.  It was syncing up and finding the ADC/DAC boards, but not doing any real computations.  The cycle time was reporting reasonable values, but the usr time (representing computation done for the model) was 0.  There were no updating monitor channels on the medm screens and filters would not turn on.

At this point I tried bringing down all 3 models, and restarting c1x02, then c1sus and c1mcs.  At this point, both c1sus and c1mcs came back partly, doing no real calculations.  c1x02 appears to be working normally (or at least the two filter banks in that model are showing changing channels from ADCs properly).  I then tried rebooting the c1sus machine.  It came back in the same state, working c1x02, non-calculating c1sus and c1mcs.

 

  3668   Thu Oct 7 14:57:52 2010   josephb, yuta   Update   CDS   c1sus status

Around noon, Yuta and I were trying to figure out why we were getting no signal out to the mode cleaner coils.  It turns out the mode cleaner optic control model was not talking to the IOP model. 

Alex and I were working under the incorrect assumption that you could use the same DAC piece in multiple models, and simply use a subset of the channels.  He finally went and asked Rolf, who said that the same DAC simulink piece in different models doesn't work.  You need to use shared memory locations to move the data to the model with the DAC card.  Rolf says there was a discussion (probably a long while back) where it was asked if we needed to support DAC cards in multiple models and the decision was that it was not needed.

Rolf and Alex have said they'd come over and discuss the issue.

In the meantime, I'm moving forward by adding shared memory locations for all the mode cleaner optics to talk to the DAC in the c1sus model.

 

Note by KA: Important fact that is worth remembering

  3673   Thu Oct 7 17:19:55 2010   josephb, alex, rolf   Update   CDS   c1sus status

As noted by Koji, Alex and Rolf stopped by.

We discussed the feasibility of getting multiple models using the same DAC.  We decided that we in fact did need it (i.e. 8 optics through 3 DACs does not divide nicely), and went about changing the controller.c file so as to gracefully handle that case.  Basically, if a particular model that is sharing a DAC goes down, it now writes a 0 to the channel rather than repeating the last output.

In a separate issue, we found that when skipping DACs in a model (say using DACs 1 and 2 only) there was a miscommunication to the IOP, resulting in the wrong DACs getting the data.  The temporary solution is to have all DACs in each model, even if they are not used.  This will eventually be fixed in code.

At this point, we *seem* to be able to control and damp optics.  Look for an elog from Yuta confirming or denying this later tonight (or maybe tomorrow).

 

  3687   Mon Oct 11 10:49:03 2010   josephb   Update   CDS   c1sus stability

Taking a look at the c1sus machine, it looks as if all of the front end codes it's running (c1sus - running BS, ITMX, ITMY, c1mcs - running MC1, MC2, MC3, and c1rms - running PRM and SRM) worked over the weekend.  As I see no

Running dmesg on c1sus reports a single long cycle on c1x02, where it took 17 microseconds (~15 microseconds is the maximum because the c1x02 IOP process is running at 64kHz).

Both the c1sus and c1mcs models are running at around 39-42 microseconds USR time and 44-50 microseconds CPU time.  It would run into problems at 60-62 microseconds.

Looking at the filters that are turned on, it looks as if these models were running with only a single optic's worth of filters turned on via the medm screens.  I.e. the MC2 and ITMY filters were properly set, but not the others.

The c1rms model is running at around 10 microseconds USR time and 14-18 microseconds CPU time.  However it apparently had no filters on.

It looks as if no test points were used this weekend.  We'll turn on the rest of the filters and see if we start seeing crashes of the front end again.

Edit:

The filters for all the suspensions have been turned on, and all matrix elements entered.  The USR and CPU times have not appreciably changed.  No long cycles have been reported through dmesg on c1sus at this time.  I'm going to let it run and see if it runs into problems.

  6020   Mon Nov 28 06:53:30 2011   kiwamu   Update   CDS   c1sus shutdown

I restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after a little bit of taking care of the interferometer.

Quote from #6016

c1sus has been shut down so that the optics don't bang around.  This is because the watchdogs are not working.

  6033   Tue Nov 29 04:47:49 2011   kiwamu   Update   CDS   c1sus shut down again

I have shut down the c1sus machine at 3:30 AM.

  3636   Fri Oct 1 16:34:06 2010   josephb   Update   CDS   c1sus not booting due to fb dhcp server not running

For some reason, the dhcp server running on the fb machine, which assigns the IP address to c1sus (since it's running a diskless boot), was down.  This was preventing c1sus from coming up properly.  The symptom was an error indicating that no DHCP offers were made (seen when I plugged a keyboard and monitor in).

To check if the dhcp server is running, run ps -ef | grep dhcpd.  If it's not, it can be started with "sudo /etc/init.d/dhcpd start"
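
One gotcha with the ps check is that the grep process itself shows up in the listing and contains the string "dhcpd". A small sketch of a check that ignores it is below; the function name is made up, and it assumes the usual 8-column `ps -ef` layout with the command in the last column.

```python
def dhcpd_is_running(ps_output: str) -> bool:
    """Return True if the `ps -ef` listing contains a dhcpd process.

    Lines whose command is the grep itself are ignored, so the check does not
    report a false positive when only `grep dhcpd` is running.
    """
    for line in ps_output.splitlines():
        # Assumed columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces)
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        command = fields[7]
        if "dhcpd" in command and not command.startswith("grep"):
            return True
    return False
```

If this returns False for the output captured on fb, the server needs to be started with the init script quoted above.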

  3157   Fri Jul 2 11:33:15 2010   josephb   Update   CDS   c1sus needs real time linux to be setup on it

I connected a monitor and keyboard to the new c1sus machine and discovered it's not running RTL linux.  I changed the root password to the usual, however, without help from Alex I don't know where to get the right version or how to install it, since it doesn't seem to have an obvious CD rom drive or the like.  Hopefully Tuesday I can get Alex to come over and help with the setup of it, and the other 1-2 IO chassis.

  6042   Tue Nov 29 18:54:29 2011   kiwamu   Update   CDS   c1sus machine up

[Zach / Kiwamu]

 Woke up the c1sus machine in order to lock PSL to MC so that we can observe the effect of not having the EOM heater.

  7182   Tue Aug 14 17:47:44 2012   Jamie   Update   CDS   c1sus machine replaced

Rolf and Alex came back over with a replacement machine for c1sus.   We removed the old machine, removed its timing, dolphin, and PCIe extension cards and put them in the new machine.  We then installed the new machine and booted it and it came up fine.  The BIOS in this machine is slightly different, and it wasn't having the same failure-to-boot-with-no-COM issue that the previous one was.  The COM ports are turned off on this machine (as is the USB interface).

Unfortunately the problem we were experiencing with the old machine, that unloading certain models was causing others to twitch and that dolphin IPC writes were being dropped, is still there.  So the problem doesn't seem to have anything to do with hardware settings...

After some playing, Rolf and Alex determined that for some reason the c1rfm model is coming up in a strange state when started during boot.  It runs faster, but the IPC errors are there.  If instead all models are stopped, the c1rfm model is started first, and then the rest of the models are started, the c1rfm model runs ok.  They don't have an explanation for this, and I'm not sure how we can work around it other than knowing the problem is there and do manual restarts after boot.  I'll try to think of something more robust.

A better "fix" to the problems is to clean up all of our IPC routing, a bunch of which we're currently doing very inefficiently.  We're routing things through c1rfm that don't need to be, which is introducing delays.  In particular, things that can communicate directly over RFM or dolphin should just do so.  We should also figure out if we can put the c1oaf and c1pem models on the same machine, so that they can communicate directly over shared memory (SHMEM).  That should cut down on overhead quite a bit.  I'll start to look at a plan to do that.

 

  6026   Mon Nov 28 16:46:55 2011   kiwamu   Update   CDS   c1sus is now up

I have restarted the c1sus machine and burt-restored c1sus and c1mcs to the day before Thanksgiving, namely the 23rd of November.

Quote from #6020

I have restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after a little bit of taking care of the interferometer.

  6923   Thu Jul 5 16:49:35 2012   Jenne   Update   Computers   c1sus is funny

I was trying to use a new BLRMs c-code block that the seismic people developed, instead of Mirko's more clunky version, but putting this in crashed c1sus.

I reverted to a known good c1pem.mdl, and Jamie and I did a reboot, but c1sus is still funny - none of the models are actually running. 

rtcds restart all - all the models are happy again, c1sus is fine.

But, we still need to figure out what was wrong with the c-code block.

Also, the BLRMS channels are listed in a Daq Channels block inside of the (new) library part, so they're all saved with the new CDS system which became effective as of the upgrade.  (I made the Mirko copy-paste BLRMS into a library part, including a DAQ channels block before trying the c-code.  This is the known-working version to which I reverted, and we are currently running.)

  14719   Tue Jul 2 16:57:09 2019   gautam   Update   CDS   c1sus is flaky

Since the work earlier this morning, the fast c1sus model has crashed ~5 times. Tried rebooting vertex FEs using the reboot script a few times, but the problem is persisting. I'm opting to do the full hard reboot of the 3 vertex FEs to resolve this problem.

Judging by Attachment #1, the processes have been stable overnight.

Attachment 1: c1sus_timing.png
  6924   Fri Jul 6 01:12:02 2012   Jenne   Update   Computers   c1sus is fine

Quote:

I was trying to use a new BLRMs c-code block that the seismic people developed, instead of Mirko's more clunky version, but putting this in crashed c1sus.

I reverted to a known good c1pem.mdl, and Jamie and I did a reboot, but c1sus is still funny - none of the models are actually running. 

rtcds restart all - all the models are happy again, c1sus is fine.

But, we still need to figure out what was wrong with the c-code block.

Also, the BLRMS channels are listed in a Daq Channels block inside of the (new) library part, so they're all saved with the new CDS system which became effective as of the upgrade.  (I made the Mirko copy-paste BLRMS into a library part, including a DAQ channels block before trying the c-code.  This is the known-working version to which I reverted, and we are currently running.)

 The reason I started looking at BLRMS and c1sus today was that the BLRMS striptool was totally wacky.  I finally figured out that the pemepics hadn't been burt restored, so none of the channels were being filtered.  It's all better now, and will be even better soon when Masha finishes updating the filters (she'll make her own elog later)

  3946   Thu Nov 18 14:05:06 2010   josephb, yuta   Update   CDS   c1sus is alive!

Problem:

We broke c1sus by moving ADC cards around.

Solution:

We pulled all the cards out, examined all contacts (which looked fine), found 1 poorly connected cable internally, going between an ADC and ADC timing interface card  (that probably happened last night), and one of the two RFM fiber cables pulled out of its RFM card.

We then placed all of the cards back in with a new ordering, tightened down everything, and triple checked all connections were on and well fit.

 

Gotcha!

Joe forgot that slot 1 and slot 2 of the timing interface boards have their last channels reserved for duotone signals.  Thus, they shouldn't be used for any ADCs or DACs that need their last channel (such as MC3_LR sensor input).  We saw a perfect timing signal come in through the MC3_LR sensor input, which prevented damping. 

We moved the ADC timing interface card out of the 1st slot  of the timing interface board and into slot 6 of the timing interface board, which resolved the problem.

Final Configuration:

 

 Timing Interface Board

Timing Interface Slot   Card
1 (Duotone)             None
2 (Duotone)             DAC interface (can't use last channel)
3                       ADC interface
4                       ADC interface
5                       ADC interface
6                       ADC interface
7                       None
8                       None
9                       None
10                      DAC interface
11                      DAC interface
12                      None
13                      None

 PCIe Chassis

Slot   PCIe Number   Card
0      Do Not Use    None
1      1             ADC
2      6             DAC
3      5             ADC
4      4             ADC
5      9             ADC
6      8             BO
7      7             BO
8      3             BO
9      2             BO
10     14            DAC
11     13            DAC
12     12            BIO
13     17            RFM
14     16            None
15     15            None
16     11            None
17     10            None

Still having Issues with:

ITM West damps.  ITM South damps, but the coil gains are opposite to the other optics in order to damp properly.

We also need to look into switching the channel names for the watchdogs on ITMX/Y in addition to the front end code changes.

  6787   Thu Jun 7 17:49:09 2012   Jamie   Update   CDS   c1sus in weird state, running models but unresponsive otherwise

Somehow c1sus was in a very strange state.  It was running models, but EPICS was slow to respond.  We could not log into it via ssh, and we could not bring up test points.  Since we didn't know what else to do we just gave it a hard reset.

Once it came up, none of the models were running.  I think this is a separate problem with the model startup scripts that I need to debug.  I logged on to c1sus and ran:

rtcds restart all

(which handles proper order of restarts) and everything came up fine.

Have no idea what happened there to make c1sus freeze like that.  Will keep an eye out.

  3653   Tue Oct 5 16:58:41 2010   josephb, yuta   Update   CDS   c1sus front end status

We moved the filters for the mode cleaner optics over from the C1SUS.txt file in /opt/rtcds/caltech/c1/chans/ to the C1MCS.txt file, and placed SUS_ on the front of all the filter names.  This has let us load the filters for the mode cleaner optics.

At the moment, we cannot seem to get testpoints for the optics (i.e. dtt is not working, even the specially installed ones on rosalba). I've asked Yuta to enter in the correct matrix elements and turn the correct filters on, then save with a burt backup.

  4183   Fri Jan 21 15:26:15 2011   josephb   Update   CDS   c1sus broken yesterday and now fixed

[Joe, Koji]
Yesterday's CDS swap of c1sus and c1iscex left the interferometer in a bad state due to several issues.

The first was the need to actually power down the IO chassis completely (I eventually waited for a green LED to stop glowing and then plugged the power back in) when switching computers.  I also unplugged and replugged the interface cable between the IO chassis and the computer while powered down.  This let the computer actually see the IO chassis (previously the host interface card was glowing just red, no green lights).

Second, the former c1iscex computer and now new c1sus computer only has 6 CPUs, not 8 like most of the other front ends.  Because it was running 6 models (c1sus, c1mcs, c1rms, c1rfm, c1pem, c1x02) and 1 CPU needed to be reserved for the operating system, 2 models were not actually running (recycling mirrors and PEM).  This meant the recycling mirrors were left swinging uncontrolled.

To fix this I merged the c1rms model with the c1sus model.  The c1sus model now controls BS, ITMX, ITMY, PRM, SRM.  I merged the filter files in the /chans/ directory, and reactivated all the DAQ channels.  The master file for the fb in the /target/fb directory had all references to c1rms removed, and then the fb was restarted via "telnet fb 8088" and then "shutdown".

My final mistake was starting the work late in the day.

So the lesson for Joe is, don't start changes in the afternoon.

Koji has been helping me test the damping and confirm things are really running.  We were having some issues with some of the matrix values.  Unfortunately I had to add them by hand since the previous snapshots no longer work with the models.

  6738   Fri Jun 1 08:01:46 2012   steve   Update   Computers   c1sus and c1iscex are down

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 

 

Attachment 1: compdown.png
  6737   Fri Jun 1 02:33:40 2012   Jenne   Update   Computers   c1sus and c1iscex - bad fb connections

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

  6740   Fri Jun 1 09:50:50 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

  6742   Fri Jun 1 14:40:24 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

I managed to recover c1sus.  It required stopping all the models, and then restarting them one-by-one:

$ rtcds stop all     # <-- this does the right thing: it stops all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.
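
For future reference, the order used above can be written down as a small helper that only builds the command list (it does not run anything). This is a sketch: the function name and arguments are made up, and the rtcds subcommands are just the ones quoted in this entry.

```python
def recovery_commands(models=("c1sus", "c1mcs", "c1rfm"), iop_name="iop"):
    """Return the shell commands for the recovery sequence, in order.

    1. stop everything (rtcds stops the IOP last),
    2. start the IOP on its own,
    3. start the remaining models.
    """
    return [
        "rtcds stop all",
        f"rtcds start {iop_name}",
        "rtcds start " + " ".join(models),
    ]


for cmd in recovery_commands():
    print("$", cmd)
```

Keeping the IOP start as its own step makes the dependency explicit: the user models are only started once the IOP is back.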

  4733   Tue May 17 18:09:13 2011   Jamie, Kiwamu   Configuration   CDS   c1sus and c1auxey crashed, rebooted

c1sus and c1auxey crashed, required hard reboot

For some reason, we found that c1sus and c1auxey were completely unresponsive.  We went out and gave them a hard reset, which brought them back up with no problems.

This appears to be related to a very similar problem report by Kiwamu just a couple of days ago, where c1lsc crashed after editing the C1LSC.ini and restarting the daqd process, which is exactly what I just did (see my previous log).  What could be causing this?

  3945   Thu Nov 18 11:06:20 2010   josephb   Update   CDS   c1sus and ADCs

Problem:

ADCs are timing out on c1sus when we have more than 3.

Talked with Rolf:

Alex will be back tomorrow (he took yesterday and today off), so I talked with Rolf.

He said ordering shouldn't make a difference and he's not sure why we would be having a problem. However, when he loads the chassis, he tends to put all the ADCs on the same PCI bus (the back plane apparently contains multiple buses).  Slot 1 is its own bus, Slots 2-9 should be the same bus, and 10-17 should be the same bus.

He also mentioned that when you use dmesg and see a line like "ADC TIMEOUT # ##### ######", the first number should be the ADC number, which is useful for determining which one is reporting back slow.
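
A quick way to use this tip is to tally the timeouts per ADC from a saved dmesg dump. The sketch below assumes only what is stated above, namely that the first number after "ADC TIMEOUT" is the ADC number; the meaning of the other two fields is not known here, and the function name is made up.

```python
import re
from collections import Counter

# First integer after the literal text "ADC TIMEOUT" is taken as the ADC number.
_ADC_TIMEOUT = re.compile(r"ADC TIMEOUT\s+(\d+)")


def timeout_counts(dmesg_text: str) -> Counter:
    """Count ADC TIMEOUT lines per ADC number in a block of dmesg output."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = _ADC_TIMEOUT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

The ADC number with the largest count is the first candidate to check when repopulating the chassis.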

Plan:

Disconnect c1sus IO chassis completely, pull it out, pull out all cards, check connectors, and repopulate with Rolf's suggestions and keeping this elog in mind.

In regards to the RFM, it looks like one of the fibers had been disconnected from the c1sus chassis RFM card (it's plugged in in the middle of the chassis so it's hard to see) during all the plugging in and out of the cables and cards last night.

  10135   Mon Jul 7 13:44:21 2014 JenneUpdateCDSc1sus - bad fb connection

Quote:

 

I managed to recover c1sus.  It required stopping all the models, and then restarting them one-by-one:

$ rtcds stop all     # <-- this does the right thing: it stops all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.

 In addition to needing obnoxiously regular mxstream restarts, this afternoon the sus machine was doing something slightly different.  Only 1 fb block per core was red (the mxstream symptom is that 3 fb-related blocks per core are red), and restarting the mxstream didn't help.  Anyhow, I was searching through the elog, and the entry to which I'm replying had similar symptoms.  However, by the time I went back to the CDS FE screen, c1sus had the regular mxstream symptoms, and an mxstream restart fixed things right up. 

So, I don't know what the issue is or was, nor do I know why it is fixed. It's fine for now, but I wanted to make a note for the future.

  6619   Mon May 7 22:39:37 2012 DenUpdateCDSc1sus

[Jenne, Den]

We decided to reboot the C1SUS machine in the hope that this would fix the problem with the seismic channels. After the reboot the machine could not connect to the framebuilder. We restarted mx_stream but this did not help. Then we manually executed

/opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x02 c1sus c1mcs c1rfm c1pem -d fb:0 -l /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log

but c1sus still could not connect to fb. This script returned the following error:

controls@c1sus ~ 128$ cat /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log


c1x02
c1sus
c1mcs
c1rfm
c1pem
mmapped address is 0x7fb5ef8cc000
mapped at 0x7fb5ef8cc000
mmapped address is 0x7fb5eb8cc000
mapped at 0x7fb5eb8cc000
mmapped address is 0x7fb5e78cc000
mapped at 0x7fb5e78cc000
mmapped address is 0x7fb5e38cc000
mapped at 0x7fb5e38cc000
mmapped address is 0x7fb5df8cc000
mapped at 0x7fb5df8cc000
send len = 263596
OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
mx_connect failed

Looks like a CDS error. We are leaving the WATCHDOGS OFF for the night.
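
For future occurrences, a small check like the following could pull the failure lines out of the mx_stream log automatically. It is a sketch only: the two marker strings are taken from the log output above, and the function name is made up here. In practice the text would be read from the c1sus.log file given with -l in the command above.

```python
# Marker strings copied from the log output in this entry.
MX_ERROR_MARKERS = ("Peer Not Found in the Table", "mx_connect failed")


def find_mx_errors(log_text):
    """Return the log lines that contain one of the known failure markers."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in MX_ERROR_MARKERS)
    ]


if __name__ == "__main__":
    # Last three lines of the log quoted in this entry.
    log_tail = "\n".join([
        "send len = 263596",
        "OMX: Failed to find peer index of board 00:00:00:00:00:00 "
        "(Peer Not Found in the Table)",
        "mx_connect failed",
    ])
    for line in find_mx_errors(log_tail):
        print(line)
```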

  7165   Mon Aug 13 20:12:29 2012 jamieUpdateCDSc1sup model moved to c1lsc machine

I moved the c1sup simplant model to the c1lsc machine, where there was one remaining available processor.  This requires changing a bunch of IPC routing in the c1sus and c1lsp models.  I have rebuilt and installed the models, and have restarted c1sup, but have not restarted c1sus and c1lsp since they're currently in use.  I'll restart them first thing tomorrow.

  16533   Wed Dec 22 17:40:22 2021 AnchalSummaryCDSc1su2 model updated with SUS damping blocks for 7 SOSs

[Anchal, Koji]

I've updated the c1su2 model today with model suspension blocks for the 7 new SOSs (LO1, LO2, AS1, AS4, SR2, PR2 and PR3). The model is running properly now but we had some difficulty in getting it to run.

Initially, we were getting a 0x2000 error on the c1su2 model CDS screen. The issue was probably the high data transmission rate required for all 7 SOSs in this model. Koji dug up a script, /opt/rtcds/caltech/c1/userapps/trunk/cds/c1/scripts/activateDQ.py, that has been used historically for updating the data rate on some of the DQ channels in the suspension block. However, this script was not working properly for Koji, so he created a new script at /opt/rtcds/caltech/c1/chans/daq/activateSUS2DQ.py.

[Ed by KA: I could not make this modified script replace the input file (i.e. C1SU2.ini) in place, so the output file is named C1SU2.ini.NEW and needs to manually replace the original file.]

With this, Koji was able to reduce the acquisition rate of SUSPOS_IN1_DQ, SUSPIT_IN1_DQ, SUSYAW_IN1_DQ, SUSSIDE_IN1_DQ, SENSOR_UL, SENSOR_UR, SENSOR_LL, SENSOR_LR, SENSOR_SIDE, OPLEV_PERROR, OPLEV_YERROR, and OPLEV_SUM to 2048 Sa/s. The script modifies the /opt/rtcds/caltech/c1/chans/daq/C1SU2.ini file, which would get rewritten if the c1su2 model is remade and reinstalled. After this modification, the 0x2000 error stopped appearing and the model is running fine.
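
The kind of edit the script performs can be sketched as below. This is not the contents of activateSUS2DQ.py. It assumes a simplified .ini layout in which each channel is a [CHANNEL_NAME] section followed by key=value lines including a datarate= line; the real C1SU2.ini may differ. The channel names and rates in the demo text are made up for illustration.

```python
import re

# Assumed layout: "[CHANNEL]" section headers followed by "key=value" lines.
SECTION = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")
DATARATE = re.compile(r"^datarate\s*=\s*\d+\s*$")


def set_datarate(ini_text, suffixes, rate):
    """Set datarate=<rate> in every section whose name ends with a suffix."""
    suffixes = tuple(suffixes)
    out = []
    in_target = False
    for line in ini_text.splitlines():
        section = SECTION.match(line)
        if section:
            in_target = section.group("name").endswith(suffixes)
        elif in_target and DATARATE.match(line):
            line = f"datarate={rate}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up example text, not a real excerpt of C1SU2.ini.
    demo = (
        "[C1:SUS-LO1_SUSPOS_IN1_DQ]\n"
        "datarate=16384\n"
        "[C1:SUS-LO1_EXAMPLE_OTHER_DQ]\n"
        "datarate=16384\n"
    )
    print(set_datarate(demo, ["SUSPOS_IN1_DQ", "SENSOR_UL"], 2048), end="")
```

Writing the returned text to a separate file such as C1SU2.ini.NEW, rather than overwriting the input, matches the behaviour described in the editor's note above and leaves the original untouched until it is checked.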


Should we change the library model part for sus_single_control.mdl

We notice that all our suspension models need to go through this weird python script modifying auto-generated .ini files to reduce the data rate. Ideally, there is a simpler solution: adding the datarate 2048 in the '#DAQ Channels' block in the model library part /cvs/cds/rtcds/userapps/trunk/sus/c1/models/lib/sus_single_control.mdl, which is the root model in all the suspensions. With this change, the .ini files will automatically be written with the correct datarate and there will be no need for the activateDQ script. But we couldn't find why this simple solution was not implemented in the past, so we want to know if there is more going on here than we know. Changing the library model would obviously change every suspension model and we don't want a broken CDS system on our heads at the beginning of the holidays, so we'll leave this delicate task for the near future.

  16537   Wed Dec 29 20:09:40 2021 ranaSummaryCDSc1su2 model updated with SUS damping blocks for 7 SOSs

We want to maintain the 16 kHz sample rate for the COIL DAQ channels, but there is nothing wrong with reducing the others.

I would suggest setting the DQ sample rates to 256 Hz for the SUS DAMP channels and 1024 Hz for the OPLEV channels (for noise diagnostics).

Maybe you can put these numbers into a new library part and we can have the best of all worlds?
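
As a rough illustration of these numbers, the rates could be kept in one lookup table keyed on channel-name fragments. This is a sketch under assumptions: "16 kHz" is taken to mean 16384 Sa/s, the SUS DAMP channels are assumed to be the SUSPOS/SUSPIT/SUSYAW/SUSSIDE ones from the quoted entry, and the matching on name fragments is made up here (the channel names in the demo are illustrative). It does not describe how a library part would actually store the rates.

```python
# (name fragment, DQ sample rate in Sa/s); first match wins.
RATE_RULES = [
    ("COIL", 16384),    # assumed meaning of "16 kHz"
    ("OPLEV", 1024),
    ("SUSPOS", 256),    # assumed "SUS DAMP" channels
    ("SUSPIT", 256),
    ("SUSYAW", 256),
    ("SUSSIDE", 256),
]


def rate_for(channel):
    """Return the suggested DQ rate for a channel, or None if no rule applies."""
    for fragment, rate in RATE_RULES:
        if fragment in channel:
            return rate
    return None


if __name__ == "__main__":
    # Illustrative channel names only.
    for name in ("C1:SUS-LO1_OPLEV_PERROR", "C1:SUS-LO1_SUSPOS_IN1_DQ",
                 "C1:SUS-LO1_SENSOR_UL"):
        print(name, rate_for(name))
```

Channels with no rule (the SENSOR channels here) return None, since no rate was suggested for them above.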

Quote:
 

Should we change the library model part for sus_single_control.mdl

We notice that all our suspension models need to go through this weird python script modifying auto-generated .ini files to reduce the data rate. Ideally, there is a simpler solution: adding the datarate 2048 in the '#DAQ Channels' block in the model library part /cvs/cds/rtcds/userapps/trunk/sus/c1/models/lib/sus_single_control.mdl, which is the root model in all the suspensions. With this change, the .ini files will automatically be written with the correct datarate and there will be no need for the activateDQ script. But we couldn't find why this simple solution was not implemented in the past, so we want to know if there is more going on here than we know. Changing the library model would obviously change every suspension model and we don't want a broken CDS system on our heads at the beginning of the holidays, so we'll leave this delicate task for the near future.

 

  16726   Tue Mar 15 11:52:34 2022 AnchalSummaryCDSc1su2 model updated for sending Run/Acquire Binary Output to Binary Interface card

I routed the XXX_COIL_DW signals from the 7 SOS blocks in c1su2.mdl (located at /cvs/cds/rtcds/userapps/trunk/sus/c1/models/c1su2.mdl) to the binary outputs from the FE model. The routing is done such that when these binary outputs are routed through the binary interface card mounted on 1Y0, they go to the acromag chassis just installed and from there they go to the binary inputs of the coil drivers together with the acromag controlled coil outputs.

I have not restarted the rtcds models yet. This needs more care and requires following the instructions in 40m/16533. I will do that sometime later, or Koji can follow up on this work.

Attachment 1: c1su2.pdf
c1su2.pdf
  16728   Tue Mar 15 14:10:41 2022 AnchalSummaryCDSc1su2 model remade, reinstalled, restarted after the update

I have restarted the c1su2 model with the connections of the Run/Acquire switch to the analog filters on the coil drivers. The following steps were taken:

First ssh to c1sus2 and then:

controls@c1sus2:~ 0$ rtcds make c1su2
buildd: /opt/rtcds/caltech/c1/rtbuild/release
### building c1su2...
Cleaning c1su2...
Done
Parsing the model c1su2...
Done
Building EPICS sequencers...
Done
Building front-end Linux kernel module c1su2...
Done
RCG source code directory:
/opt/rtcds/rtscore/branches/branch-3.4
The following files were used for this build:
/opt/rtcds/userapps/release/cds/common/models/lockin.mdl
/opt/rtcds/userapps/release/cds/common/models/rtbitget.mdl
/opt/rtcds/userapps/release/cds/common/models/rtdemod.mdl
/opt/rtcds/userapps/release/isc/common/models/QPD.mdl
/opt/rtcds/userapps/release/sus/c1/models/c1su2.mdl
/opt/rtcds/userapps/release/sus/c1/models/lib/sus_single_control.mdl

Successfully compiled c1su2
***********************************************
Compile Warnings, found in c1su2_warnings.log:
***********************************************
WARNING  *********** No connection to subsystem output named  SUS_DAC1_12  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_13  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_14  
WARNING  *********** No connection to subsystem output named  SUS_DAC1_15  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_7  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_8  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_9  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_10  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_11  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_12  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_13  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_14  
WARNING  *********** No connection to subsystem output named  SUS_DAC2_15  
***********************************************
controls@c1sus2:~ 0$ rtcds install c1su2
buildd: /opt/rtcds/caltech/c1/rtbuild/release
### installing c1su2...
Installing system=c1su2 site=caltech ifo=C1,c1
Installing /opt/rtcds/caltech/c1/chans/C1SU2.txt
Installing /opt/rtcds/caltech/c1/target/c1su2/c1su2epics
Installing /opt/rtcds/caltech/c1/target/c1su2
Installing start and stop scripts
/opt/rtcds/caltech/c1/scripts/killc1su2
/opt/rtcds/caltech/c1/scripts/startc1su2
Performing install-daq
Updating testpoint.par config file
/opt/rtcds/caltech/c1/target/gds/param/testpoint.par
/opt/rtcds/rtscore/branches/branch-3.4/src/epics/util/updateTestpointPar.pl -par_file=/opt/rtcds/caltech/c1/target/gds/param/archive/testpoint_220315_135808.par -gds_node=26 -site_letter=C -system=c1su2 -host=c1sus2
Installing GDS node 26 configuration file
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1su2.par
Installing auto-generated DAQ configuration file
/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini
Installing Epics MEDM screens
Running post-build script

/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-AS1_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-AS1_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-AS1_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS1_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-AS4_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-AS4_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-AS4_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_AS4_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-LO1_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-LO1_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-LO1_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO1_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-LO2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-LO2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-LO2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_LO2_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-PR2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-PR2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-PR2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR2_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-PR3_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-PR3_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-PR3_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_PR3_TO_COIL_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 4 5 C1:SUS-SR2_INMATRIX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_INMATRIX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 2 4 C1:SUS-SR2_LOCKIN_INMTRX > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_LOCKIN_INMTRX_KB.adl
/opt/rtcds/userapps/release/cds/common/scripts/generate_KisselButton.py 5 6 C1:SUS-SR2_TO_COIL --fi > /opt/rtcds/caltech/c1/medm/c1su2/C1SUS_SR2_TO_COIL_KB.adl
safe.snap exists
controls@c1sus2:~ 0$

Then on rossa, run activateSUS2DQ.py which creates a file C1SU2.ini.NEW. Remove old backup file C1SU2.ini.bak, rename C1SU2.ini to C1SU2.ini.bak and rename C1SU2.ini.NEW to C1SU2.ini:

~> cd /opt/rtcds/caltech/c1/chans/daq/
daq>python2 activateSUS2DQ.py 
/opt/rtcds/caltech/c1/chans/daq/C1SU2.ini
daq>rm C1SU2.ini.bak
daq>mv C1SU2.ini C1SU2.ini.bak
daq>mv C1SU2.ini.NEW C1SU2.ini
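
The three file operations above (rm, mv, mv) could also be done in one guarded step, so that nothing is touched if the .NEW file is missing. A minimal sketch, with a made-up function name and a temporary directory standing in for the daq directory in the demo:

```python
import tempfile
from pathlib import Path


def promote_new_ini(daq_dir, name="C1SU2.ini"):
    """Rotate <name> to <name>.bak and <name>.NEW to <name>."""
    daq_dir = Path(daq_dir)
    current = daq_dir / name
    backup = daq_dir / (name + ".bak")
    new = daq_dir / (name + ".NEW")
    if not new.exists():
        # Refuse to delete the old backup when there is nothing to promote.
        raise FileNotFoundError(new)
    if backup.exists():
        backup.unlink()             # rm C1SU2.ini.bak
    if current.exists():
        current.rename(backup)      # mv C1SU2.ini C1SU2.ini.bak
    new.rename(current)             # mv C1SU2.ini.NEW C1SU2.ini
    return current, backup


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "C1SU2.ini").write_text("old\n")
        (Path(tmp) / "C1SU2.ini.NEW").write_text("new\n")
        current, backup = promote_new_ini(tmp)
        print(current.read_text().strip(), backup.read_text().strip())
```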

Then ssh back to c1sus2 and restart the rtcds model:

controls@c1sus2:~ 0$ rtcds restart c1su2
### stopping c1su2...
### starting c1su2...
c1su2epics: no process found
Number of ADC cards on bus = 2
Number of DAC16 cards on bus = 3
Number of DAC18 cards on bus = 0
Number of DAC20 cards on bus = 0
Specified filename iocC1.log does not exist.
c1su2epics C1 IOC Server started
c1su2 RT ready in 4
awg_server Version $Id$
channel_client Version $Id$
testpoint_server Version $Id$
/opt/rtcds/caltech/c1/target/gds/bin/awgtpman -s c1su2 -l /opt/rtcds/caltech/c1/target/gds/awgtpman_logs/c1su2.log started on host c1sus2 hostid ffffffffa8c05771 
awgtpman Version $Id$
controls@c1sus2:~ 0$

Then restart the daqd services from rossa and burtrestore to the latest snap of c1su2epics.snap:

daq>telnet fb 8083
Trying 192.168.113.201...
Connected to fb.martian.
Escape character is '^]'.
daqd> shutdown
OK
Connection closed by foreign host.
daq>burtgooey
>burtwb -f /opt/rtcds/caltech/c1/burt/autoburt/latest/c1su2epics.snap -l /tmp/controls_1220315_140755_0.write.log -o /tmp/controls_1220315_140755_0.nowrite.snap -v <
daq>

All suspensions are back online and everything is the same as before. I will test the Run/Acquire switch functionality later.

  5786   Wed Nov 2 17:29:10 2011 KatrinUpdateCDSc1scy.mdl compiled

Slight modifications to that model:

  • terminated Q_out of Lockins to be able to compile the old model
  • assigned other ADC channels to GCY (green YARM)
  9441   Wed Dec 4 21:33:24 2013 KojiUpdateCDSc1scy time-over issue mitigated

c1scy had frequent time-overs, which caused glitches in the OSEM damping servos.

Today Eric Q was annoyed by the glitches while he worked on the green PDH inspection at the Y-end.

To mitigate this issue, low-priority RFM channels were moved from c1scy to c1tst.
The moved channels (see Attachment 1) are supposed to be less susceptible to the additional delay.

This modification required the following models to be modified, recompiled, reinstalled, and restarted
in the listed order:
c1als, c1sus, c1rfm, c1tst, c1scy

Now the models are running. CDS status is all green.
The time consumption of c1scy is now ~30us (previously ~60us)
(see Attachment 2)

I am looking at the cavity lock of TEM00 and no longer see any glitches.
In fact, the OSEM signals have no glitches. (see Attachment 3)

c1mcs still regularly runs over time. Can I remove the WFS->OAF connections temporarily?

Attachment 1: TST.png
TST.png
Attachment 2: CDS.png
CDS.png
Attachment 3: no_glitch.png
no_glitch.png
  8626   Thu May 23 10:24:23 2013 JamieSummaryCDSc1scy model continues to run at the hairy edge

c1scy, the controller model at the Y END, is still running very long, typically at 55/60 microseconds, or ~92% of its cycle.  It's currently showing a recorded max cycle time (since last restart or reset) of 60, which means that it has actually hit its limit sometime in the very recent past.  This is obviously not good, since it's going to inject big glitches into ETMY.

c1scy is actually running a lot less code than c1scx, but c1scx caps out its load at about 46 us.  This indicates to me that it must be some hardware configuration setting in the c1iscey computer.
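
The load figures quoted above are just the cycle time divided by the 60 us limit. A short sketch of that arithmetic, using the 55 us and 46 us values from this entry (the function name is made up here):

```python
def cycle_load(cycle_time_us, budget_us=60.0):
    """Fraction of the per-cycle time budget that the model uses."""
    return cycle_time_us / budget_us


if __name__ == "__main__":
    for model, time_us in [("c1scy", 55), ("c1scx", 46)]:
        print(f"{model}: {time_us}/60 us = {cycle_load(time_us):.0%}")
    # prints:
    # c1scy: 55/60 us = 92%
    # c1scx: 46/60 us = 77%
```

So c1scx, while running more code, leaves roughly 14 us of headroom per cycle, and c1scy leaves about 5 us.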

I'll try to look into this more as soon as I can.

  4173   Thu Jan 20 04:03:02 2011 kiwamuUpdateCDSc1scy error

 I found that c1scy was not running due to a daq initialization error.

 I couldn't figure out how to fix it, so I am leaving it to Joe.


 Here are the error messages in dmesg on c1iscey:
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 
 
Before I found this, I rebooted c1iscey in order to recover the synchronization with fb.
The synchronization had been lost, probably because I shut down the daqd on fb.
  4175   Thu Jan 20 10:15:50 2011 josephbUpdateCDSc1scy error

This is caused by an insufficient number of active DAQ channels in the C1SCY.ini file located in /opt/rtcds/caltech/c1/chans/daq/.  A quick look (grep -v # C1SCY.ini) indicates there are no active channels.  Experience tells me you need at least 2 active channels.
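
The same check can be scripted so it is easy to run on any of the .ini files. The sketch below assumes that an active channel is an uncommented [CHANNEL] section header and that a [default] section, if present, does not count as a channel; both are assumptions about the file layout. The threshold of 2 is the rule of thumb stated above, and the demo text is made up.

```python
MIN_ACTIVE_CHANNELS = 2  # rule of thumb from this entry


def active_channels(ini_text):
    """List uncommented [CHANNEL] section names, skipping [default]."""
    names = []
    for line in ini_text.splitlines():
        line = line.strip()
        # Commented-out sections start with '#', so they never match here.
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name.lower() != "default":
                names.append(name)
    return names


if __name__ == "__main__":
    # Made-up example text, not the real C1SCY.ini.
    demo = "\n".join([
        "[default]",
        "#[C1:SUS-ETMY_EXAMPLE_A]",
        "#[C1:SUS-ETMY_EXAMPLE_B]",
    ])
    found = active_channels(demo)
    if len(found) < MIN_ACTIVE_CHANNELS:
        print(f"only {len(found)} active channel(s); DAQ init may fail")
```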

Taking a look at the activateDAQ.py script in the daq directory, it looks like the C1SCY.ini file is included, but the loop over optics is missing ETMY.  This caused the file to be improperly updated when the activateDAQ.py script was run.  I have fixed the C1SCY.ini file (ran a modified version of the activate script on just C1SCY.ini).

I have restarted the c1scy front end using the startc1scy script and it is currently working.

Quote:
 Here are the error messages in dmesg on c1iscey:
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 

 

  6175   Fri Jan 6 01:00:56 2012 kiwamuUpdateCDSc1scx out of sync

Both c1scx and its IOP realtime process went out of sync.

Initially I found that the c1scx didn't show any ADC signals, though the sync sign was green.

Then I software-rebooted the c1iscex machine and then it became out of sync.

For tonight this is fine because I am concentrating on the central part anyway.

  5535   Sat Sep 24 01:38:14 2011 kiwamuUpdateCDSc1scx and c1x01 restarted

[Koji / Kiwamu]

 The c1scx and c1x01 realtime processes became frozen. We restarted them around 1:30 by sshing and running the kill/start scripts.

  6436   Thu Mar 22 16:45:06 2012 kiwamuUpdateCDSc1scx and c1scy not properly running

It seems that neither c1scx nor c1scy is working properly, as their ADC counts are showing digital zeros.

However, the IOPs, c1gcx, and c1gcy appear to be running fine, and the IOPs seem to be successfully recognizing the ADCs according to dmesg.

There is one more confusing fact: c1scx and c1scy are somehow synchronizing to the timing signal.

I restarted the c1scx front end model to see if this helps, but unfortunately it didn't work.

As this is not the top priority concern for now, I am leaving them as they are, with the watchdogs off.

(I may try hardware rebooting them this evening)

Quote from #6434

The power was turned back on at 4pm. It took some time for Suresh to restart the computers. We have damping but things are not perfect yet. Auto BURT did not work well.

 

  6438   Thu Mar 22 17:41:15 2012 sureshUpdateCDSc1scx and c1scy not properly running

Quote:

It seems that neither c1scx nor c1scy is working properly as their ADC counts are showing digital-zeros.

Quote from #6434

The power was turned back on at 4pm. It took some time for Suresh to restart the computers. We have damping but things are not perfect yet. Auto BURT did not work well.

 When Steve and I restarted the c1iscex and c1iscey computers after the power shutdown, the models within them did not start-up automatically.  I had to start them manually from a terminal in the control room. 

I also tried rebooting the FB a couple of times.  Did not make any difference.

Manually starting the c1x05, c1scy and c1x01, c1scx models (with the Burt Restore button ON) did not resolve the issue of zeros in the epics screens, though it did re-establish timing. 

  6439   Thu Mar 22 23:43:56 2012 KojiUpdateCDSc1scx and c1scy not properly running

Did you guys check if the simplant switch is set to "REAL WORLD" mode?

Edit by KI:

Bingo ! The input signals were bypassed to the simplant. I switched the simplant settings to REAL WORLD and now both end suspensions are working fine.

  7008   Mon Jul 23 18:57:52 2012 JamieUpdateCDSc1scx and c1scy models recompiled and restarted

After the changes listed in 7005 and 7007, I have rebuilt, installed, and restarted the c1scx and c1scy models.  Everything seems to have come back up ok.

Running into some daqd troubles because of a change to c1ioo, but will report on the new ALS channels when I can.
