ID Date Author Type Category Subject
  5424   Thu Sep 15 20:16:15 2011   jamie   Update   CDS   New c1oaf model installed and running

[Jamie, Jenne, Mirko]

New c1oaf model installed

We have installed the new c1oaf (online adaptive feed-forward) model.  This model is now running on c1lsc.  It's not really doing anything at the moment, but we wanted to get the model running, with all of its interconnections to the other models.

c1oaf has interconnections to both c1lsc and c1pem via the following routes:

c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf

Therefore c1lsc, c1pem, and c1rfm also had to be modified to receive/send the relevant signals.

As always, when adding PCIx senders and receivers, we had to compile all the models multiple times in succession so that the /opt/rtcds/caltech/c1/chans/ipc/C1.ipc would be properly populated with the channel IPC info.

Issues:

There were a couple of issues that came up when we installed and re/started the models:

c1oaf not being registered by frame builder

When the c1oaf model was started, the C1:DAQ-FB0_C1OAF_STATUS channel that it is supposed to have did not exist.  In the daqd log (/opt/rtcds/caltech/c1/target/fb/logs/daqd.log.19901) I found the following:

Unable to find GDS node 22 system c1oaf in INI files

It turns out this channel is actually created by the frame builder, and it could not find the channel definition file for the new model, so it was failing to create the channels for it.  The frame builder "master" file (/opt/rtcds/caltech/c1/target/fb/master) needs to list the c1oaf daq ini files:

/opt/rtcds/caltech/c1/chans/daq/C1OAF.ini
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1oaf.par

These were added, and the framebuilder was restarted, after which C1:DAQ-FB0_C1OAF_STATUS appeared correctly.
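A minimal sketch of how one could check the master file for a new model before restarting the frame builder. The two path patterns come from the entry above; the function name, the assumption that '#' starts a comment, and the example file content are hypothetical.

```python
# Sketch: report which files a model needs that are absent from the daqd "master" list.
# Path patterns are taken from the entry above; everything else is illustrative.

def missing_master_entries(master_text, model):
    """Return the expected file paths for `model` that `master_text` does not list."""
    expected = [
        "/opt/rtcds/caltech/c1/chans/daq/%s.ini" % model.upper(),
        "/opt/rtcds/caltech/c1/target/gds/param/tpchn_%s.par" % model.lower(),
    ]
    listed = set()
    for line in master_text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if line:
            listed.add(line)
    return [path for path in expected if path not in listed]


# Made-up master file content that only knows about c1lsc.
example_master = """\
/opt/rtcds/caltech/c1/chans/daq/C1LSC.ini
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1lsc.par
"""

for path in missing_master_entries(example_master, "c1oaf"):
    print("missing from master:", path)
```

An empty result for a model means both of its files are already listed.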

SHMEM errors on c1lsc and c1oaf

This turned out to be because of an oversight in how we wired up the skeleton c1oaf model.  For the moment the c1oaf model has only the PCIx sends and receives.  I had therefore grounded the inputs to the SHMEM parts that were meant to send signals to C1LSC.  However, this made the RCG think that these SHMEM parts were actually receivers, since it's the grounding of the inputs to these parts that actually tells the RCG that the part is a receiver.  I fixed this by adding a filter module to the input of all the senders.

Once this was all fixed, the models were recompiled, installed, and restarted, and everything came up fine.

All model changes were of course committed to the cds_user_apps svn as well.

  5426   Thu Sep 15 21:56:01 2011   Mirko   Update   CDS   c1oaf check, possible shmem problem

After Jamie installed the c1oaf model (entry 5424) I went and checked the intermodel communication.

Remember the config is:

c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf

I checked at least one of every communication type.

-All signals reach their destinations.
-c1lsc_to_c1oaf_via_shmem is noisier: the transfer adds noise to the signal. c1lsc runs at 16 kHz and c1oaf at 2 kHz, but that should actually smooth things out.

c1lsc_to_c1oaf_via_shmem.png

 

  5486   Tue Sep 20 17:45:30 2011   kiwamu   Update   CDS   daqd is restarting by itself?

[Jenne / Kiwamu]

 Fb was sick. Dataviewer and Fourier Tools didn't work for a while.

About 10 minutes later they became healthy again. No idea what exactly was going on.

One thing we found: while fb was sick, daqd seemed to be restarting by itself. Is this normal?

Here are the last lines of restart.log. Apparently daqd was restarting although we didn't command it to.

  daqd_start Tue Sep 20 02:41:17 PDT 2011
  daqd_start Tue Sep 20 13:18:12 PDT 2011
  daqd_start Tue Sep 20 17:33:00 PDT 2011
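To see how irregular these restarts are, the timestamps can be turned into intervals. This is only a sketch: the function name is hypothetical, and the time zone token is simply dropped, which assumes all lines share the same zone.

```python
from datetime import datetime


def restart_times(log_text):
    """Parse 'daqd_start <date>' lines into datetime objects (time zone token dropped)."""
    times = []
    for line in log_text.splitlines():
        parts = line.split()
        # Expected tokens: daqd_start Tue Sep 20 02:41:17 PDT 2011
        if len(parts) == 7 and parts[0] == "daqd_start":
            stamp = " ".join(parts[2:5] + parts[6:7])  # 'Sep 20 02:41:17 2011'
            times.append(datetime.strptime(stamp, "%b %d %H:%M:%S %Y"))
    return times


log_tail = """\
  daqd_start Tue Sep 20 02:41:17 PDT 2011
  daqd_start Tue Sep 20 13:18:12 PDT 2011
  daqd_start Tue Sep 20 17:33:00 PDT 2011
"""

times = restart_times(log_tail)
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
for gap in gaps:
    print("%.1f hours between restarts" % (gap / 3600.0))
```

For the three lines above the gaps are roughly 10.6 and 4.2 hours, so the restarts do not look like a scheduled job.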

  5535   Sat Sep 24 01:38:14 2011   kiwamu   Update   CDS   c1scx and c1x01 restarted

[Koji / Kiwamu]

 The c1scx and c1x01 realtime processes froze. We restarted them around 1:30 by sshing in and running the kill/start scripts.

  5561   Wed Sep 28 02:42:04 2011   kiwamu   Update   CDS   some DAQ channels lost in c1sus: fb, c1sus and c1pem restarted

Somehow some DAQ channels for C1SUS have disappeared from the DAQ channel list.

Indeed there are only a few DAQ channels listed in the C1SUS.ini file.

I ran activateDQ.py and restarted daqd.

Everything looks okay.  C1SUS and C1PEM were restarted because they froze.

  5579   Fri Sep 30 02:56:56 2011   kiwamu   Update   CDS   C1IOO.ini and C1LSC.ini files reverted

[Suresh/Kiwamu]

We found that the C1LSC.ini and C1IOO.ini files had been refreshed and only a few recorded channels were left in them.

So we manually recovered the C1LSC.ini file using the daqconfig GUI screen.

The C1IOO.ini file we simply replaced with the archived one, which had been saved on the 22nd of September.

Then we restarted daqd.

  5580   Fri Sep 30 03:14:18 2011   kiwamu   Update   CDS   suspension became crazy: c1sus rebooted

Quote from #5579

Then we restarted daqd.

[Suresh / Kiwamu]

The c1lsc and c1sus machines were rebooted.

 

- - (CDS troubles)

 After we restarted daqd and pressed some DAQ RELOAD buttons, the c1lsc machine crashed.

The machine didn't respond to ssh, so it was physically rebooted by pressing the reset button.

Then we found that all the realtime processes on the c1sus machine had frozen, so we restarted them by sshing in and running the start scripts.

However, after that the vertex suspensions became undamped, even though we did the burt restore correctly.

This symptom was exactly the same as the one Jenne reported (#5571).

We tried the same technique as Jenne: a hardware reboot of the c1sus machine. Then everything became okay.

The burt restore was done for c1lsc, c1asc, c1sus and c1mcs.

 

- - (ITMX trouble)

While we were trying to recover the damping, the ITMX mirror seemed to be stuck to an OSEM. The UL readout went to zero and the rest went to full range.

Eventually, introducing a small offset in C1:SUS-ITMX_YAW_COMM released the mirror. The offset we introduced was about +1.

  5607   Mon Oct 3 20:47:51 2011   kiwamu   Update   CDS   c1lsc and c1sus didn't run

[Mirko / Jenne / Kiwamu]

Just a quick update. None of the realtime processes on the c1lsc and c1sus machines are running.

Somehow the c1xxxfe.ko kernel modules, where xxx is x04, x02, lsc, ass, sus, mcs, pem or rfm, failed to load with insmod.

The timing indicators on the c1lsc and c1sus machines say NO SYNC.

 

- According to log files (target/c1lsc/logs/log.txt)

insmod: error inserting '/opt/rtcds/caltech/c1/target/c1lsc/bin/c1lscfe.ko': -1 Unknown symbol in module

- dmesg on c1lsc (c1sus also dumps the same error message):

[   45.831507] DXH Adapter 0 : sci_map_segment - Failed to map segment - error=0x40000d01
[   45.833006] c1x04: DIS segment mapping status 1073745153

The DXH adapter is part of the Dolphin connection.

When a realtime code starts up, it checks the Dolphin connections.

The checking procedure is defined in dolphin.c (/src/fe/dolphin.c).

According to a printk statement in dolphin.c, the status in the second message above should be 0 if everything is fine.

The first message above carries an error code from a Dolphin function called sci_map_segment, which is called in dolphin.c.

So something failed in this sci_map_segment function and is preventing the realtime code from starting up.
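A quick arithmetic check on the two dmesg lines quoted above: the decimal status in the second message is the same number as the hexadecimal error in the first, which suggests both lines report one and the same failure. The variable names are only for illustration.

```python
# Values copied from the two dmesg lines quoted above.
status_decimal = 1073745153   # "DIS segment mapping status 1073745153"
error_hex = 0x40000d01        # "sci_map_segment ... error=0x40000d01"

# Print the decimal status in hex and compare it with the reported error code.
print(hex(status_decimal))
same_code = (status_decimal == error_hex)
print("same error code:", same_code)
```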

Note that sci_map_segment is defined in genif.h and genif.c which reside in /opt/srcdis/src/IRM_DSX/drv/src.

  5608   Mon Oct 3 21:20:30 2011   Jenne   Update   CDS   c1lsc and c1sus didn't run

[Jenne, Mirko, Kiwamu, Koji, and Jamie by phone]

We just got off the phone with Jamie.  In addition to all the stuff that Kiwamu mentioned, Mirko reverted the c1oaf model and C-code to stuff that was working successfully on Friday (using "svn export", rev # 1134) since that's what we were working on when all hell broke loose.

We did a few rounds of "sudo shutdown -h now" on c1lsc and c1sus machines, and pulled the power cords out.  

We also switched the c1ioo and c1lsc 1PPS fibers in the fanout chassis, to see if that would fix the problem.  Nope.  c1ioo is still fine, and c1lsc is still not fine.

Still getting "No Sync".

We're going to call in Alex in the morning, if we can't figure it out soon.

  5609   Mon Oct 3 23:52:49 2011   kiwamu   Update   CDS   some more tests for the Dolphin

[Koji / Kiwamu]

 We did several tests to figure out what could be the source of the computer issue.

The Dolphin switch box looks suspicious, but we are not 100% sure.

 

(what we did)

 + Removed the pciRfm line from the c1x04 model to disable the Dolphin connection in the software.

 + Found no difference in the Makefile, in which the Dolphin connection lines were supposed to be commented out.

   ==> So we had to edit the Makefile ourselves.

 + Did a hand-compile by editing the Makefile and running a make command.

 + Restarted the c1x04 process and it ran without problems.

   ==> the Dolphin connection was somehow preventing the c1x04 process from running.

 +  Unplugged the Dolphin cables on the back side of the Dolphin box and re-plugged them into other ports.

   ==> didn't improve the issue.

 + During those tests, c1lsc frequently froze. We disabled the automatic start of c1lsc, c1ass and c1oaf by editing rtsystab.

  ==> after the test we reverted it.

 + We reverted everything to the previous configuration.

  5610   Tue Oct 4 07:51:36 2011   steve   Update   CDS   c1lsc and c1sus are still down

 

 c1lsc and c1sus are still down. Only ETMX and ETMY are damped

  5611   Tue Oct 4 10:35:08 2011   Mirko   Update   CDS   c1lsc and c1sus running again

[Alex, Mirko]

Alex fixed the computers this morning. It was in fact a dolphin problem:

Hi Jenne,  figured it out. Even though dxadmin said the Dolphin net was fine, it wasn't. Something happened to the DIS network manager and I had to restart it. It is running on fb:
controls@fb ~ $ ps -e | grep dis
12280 ?        00:00:00 dis_networkmgr
controls@fb ~ $ sudo /etc/init.d/dis_networkmgr restart
Once the restart was done both c1lsc and c1sus nodes were configured correctly, they printed the usual "node 12 is OK" "node 8 is OK" messages into dmesg and I was able to run /etc/start_fes.sh on lsc and sus to load all the FEs.  Alex

Some lights on c1lsc were still red: C1:DAQ-FBO_C1SYS_SYS and the smaller red light left of it. Restarted the fb. Didn't help. Restarted c1lsc, all green now.
Restored autoburt from Oct 3. 19:07 on c1lsc and c1sus.
  5621   Wed Oct 5 14:18:09 2011   kiwamu   Update   CDS   some DAQ channels lost in c1sus: fb, c1sus and c1pem restarted

I found that the ini files had been refreshed again.

I ran the activateDQ.py script (link to the script wiki page) and restarted the daqd process on fb.

The activateDQ.py script should be included in the recompile or rebuild scripts so that we don't have to run it by hand every time.

I am going to add this topic to the CDS todo list (wiki page).

Quote from #5561

Somehow some DAQ channels for C1SUS have disappeared from the DAQ channel list.

 

  5663   Thu Oct 13 21:44:48 2011   Mirko   Update   CDS   Seismic BLRMS channels, new RMS calculation

[Rana, Koji, Mirko]

We looked into the CDS RMS block C code as described in Rolf's RCG app guide. It seems the block uses a first-order LP filter with a corner freq. / time of 20k execution cycles. There are also some weird thresholds at +-2000 counts in there.

I was looking into implementing a hand-made RMS block, by squaring, filtering, rooting. The new RMS (left) seems nicer than the old one (bottom right). Signal was a 141-count sine at 4 Hz.

Filters used: Before squaring: 4th order Butterworth BP at given freq. & (new) 6th order inverse Chebyshev 20dB at 0.9*lower BP freq. and 1.1*upper BP freq. => about 1dB at BP freq.

                       After squaring: 4th order butterworth LP @ 1Hz.
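The square / filter / root chain can be sketched in a few lines. This is not the c1pem implementation: the band-pass stage is left out, a single-pole low-pass with a 0.1 Hz corner stands in for the 4th order Butterworth, and the 256 Hz sample rate and the function name are assumptions made only to keep the example short.

```python
import math


def running_rms(samples, fs, corner_hz):
    """Square each sample, low-pass the result (single pole), then take the square root."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * corner_hz / fs)  # single-pole smoothing factor
    mean_square = 0.0
    out = []
    for x in samples:
        mean_square += alpha * (x * x - mean_square)  # low-pass the squared signal
        out.append(math.sqrt(mean_square))
    return out


fs = 256.0  # assumed sample rate in Hz
# Test signal as in the entry: a 141-count sine at 4 Hz, here 60 s long.
signal = [141.0 * math.sin(2.0 * math.pi * 4.0 * n / fs) for n in range(int(60 * fs))]
rms = running_rms(signal, fs, corner_hz=0.1)

expected = 141.0 / math.sqrt(2.0)  # RMS of a sine of amplitude 141
print("settled RMS %.1f, ideal %.1f" % (rms[-1], expected))
```

A sine of amplitude 141 counts has an RMS of 141/sqrt(2), about 100 counts, so the settled output should sit close to that value with a small residual ripple at 8 Hz.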

C1PEM execution time increased from about 20us to about 45us.

Made a new medm screen with the respective filters in place of the empty C1PEM_OVERVIEW. Should go onto the sitemap.

New_RMS_vs_old_RMS.png

Original RMS LP is slower than 0.1Hz, see below for single LP at 0.1Hz in the new RMS. Original RMS is faster than single LP @ 0.01Hz

Original_RMS_LP_slower_than_0.1Hz.png

Some of the channels are recorded as 256Hz DAQ channels now. Need to figure out how to record these as 16Hz EPICS channels.

  5676   Mon Oct 17 10:43:14 2011   Mirko   Update   CDS   Committed changes to c1rfm

I want to make changes to c1rfm. Found uncommitted changes to it from Sept 27. Since we have recompiled it since then, it should be safe to commit them, so I did. See svn log for details.

  5677   Mon Oct 17 11:06:31 2011   Mirko   Update   CDS   Piping data from c1lsc to c1oaf

Changed, recompiled, installed and restarted c1rfm and c1oaf to get the MC1-3 Pitch and Yaw data into the c1oaf model.
Also changed c1oaf to use MCL as a witness channel (as well as an actuator).
Added the changes to svn.

  5679   Mon Oct 17 14:26:22 2011   Mirko   Update   CDS   Seismic BLRMS channels, new RMS calculation

Quote:

[Rana, Koji, Mirko]

We looked into the CDS RMS block C code as described in Rolf's RCG app guide. It seems the block uses a first-order LP filter with a corner freq. / time of 20k execution cycles. There are also some weird thresholds at +-2000 counts in there.

I was looking into implementing a hand-made RMS block, by squaring, filtering, rooting. The new RMS (left) seems nicer than the old one (bottom right). Signal was a 141-count sine at 4 Hz.

Filters used: Before squaring: 4th order Butterworth BP at given freq. & (new) 6th order inverse Chebyshev 20dB at 0.9*lower BP freq. and 1.1*upper BP freq. => about 1dB at BP freq.

                       After squaring: 4th order butterworth LP @ 1Hz.

C1PEM execution time increased from about 20us to about 45us.

Made a new medm screen with the respective filters in place of the empty C1PEM_OVERVIEW. Should go onto the sitemap.

New_RMS_vs_old_RMS.png

Original RMS LP is slower than 0.1Hz, see below for single LP at 0.1Hz in the new RMS. Original RMS is faster than single LP @ 0.01Hz

Original_RMS_LP_slower_than_0.1Hz.png

Some of the channels are recorded as 256Hz DAQ channels now. Need to figure out how to record these as 16Hz EPICS channels.

 Channels are now going into EPICS channels (e.g. C1:PEM-ACC1_RMS_1_3 ). Adapted the PEM_SLOW.ini file. Channels don't yet show up in dataviewer. Probably due to other C1PEM machine

  5695   Wed Oct 19 12:06:26 2011   Mirko   Update   CDS   Included the MC servo channel in CDS

[Jamie, Mirko]

Included the 'Servo' output from the D040180 in c1ioo, which I hoped would be a better measure of the MC length fluctuations. It goes into ADC6, labeled CH7 on the physical board.
Servo-output => C1:IOO-MC_SERVO. (Already present is Out1-output => C1:IOO-MC_F).
At low freq. the servo signal is about 4.5dB bigger. Both are recorded at 256Hz now which is the reason for the downward slope at about 100Hz.

MC_F_versus_MC_SERVO.png

Coh_MC_F_MC_SERVO.png

  5702   Wed Oct 19 16:53:38 2011   kiwamu   Update   CDS   some screens need labels

Untitled.png

Some of the sub-suspension screens need labels to describe what the rows and columns are.

  5727   Fri Oct 21 18:20:54 2011   Mirko   Update   CDS   First OAF version running

[Jenne, Jamie, Mirko]

We got the first version of the oaf code based on Matt's code running!! :-)
It already produces data for e.g. the MICH DOF. But don't trust that. It's only 10 taps long and the delay is not adjusted.
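For readers unfamiliar with what a "10 taps long" adaptive filter does, here is a generic normalized-LMS sketch. It is explicitly not Matt's code nor the c1oaf C code: the algorithm choice, step size, and the synthetic three-tap coupling between witness and target are all assumptions for illustration.

```python
import random


def nlms(witness, target, taps=10, mu=0.5, eps=1e-9):
    """Adapt an FIR filter so that its output from `witness` tracks `target`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent witness sample first
    residual = []
    for x, d in zip(witness, target):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        err = d - estimate  # what is left after subtracting the prediction
        power = sum(h * h for h in history) + eps
        # Normalized LMS update: step along the input, scaled by the input power.
        weights = [w + mu * err * h / power for w, h in zip(weights, history)]
        residual.append(err)
    return weights, residual


rng = random.Random(0)
witness = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
coupling = [0.5, -0.3, 0.1]  # hypothetical witness-to-target coupling
target = [
    sum(c * witness[n - k] for k, c in enumerate(coupling) if n - k >= 0)
    for n in range(len(witness))
]

weights, residual = nlms(witness, target)
print(["%.3f" % w for w in weights])
```

With noise-free synthetic data the first three weights converge to the coupling coefficients and the rest to zero. Real signals would also need the delay adjustment mentioned in the entry.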

  5731   Mon Oct 24 20:00:21 2011   Mirko   Update   CDS   Tiny little scripts

Located in /opt/rtcds/caltech/c1/userapps/release/cds/common/scripts

Script 1: Diagreset.sh
Hits the diag reset buttons on the models on c1lsc and c1sus computers.

Script 2: Burtrest.sh
Restores the burt files from "today" 4am. Use a text editor if you want to change the times.

  5736   Tue Oct 25 18:09:44 2011   jamie   Update   CDS   New DEMOD part

I forgot to elog (bad Jamie) that I broke out the demodulator from the LOCKIN module to make a new DEMOD part:

DEMOD.png

The LOCKIN part now references this part, and the demodulator can now be used independently.  The 'LO SIN in' and 'LO COS in' should receive their input from the SIN and COS outputs of the OSCILLATOR part.
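As a reminder of what a demodulator with sine and cosine LO inputs computes, here is a generic I/Q sketch. It does not reproduce the internals of the DEMOD part (which are only shown in the attached image): plain averaging stands in for whatever low-pass filtering follows the mixers, and all names and numbers are assumptions.

```python
import math


def demodulate(signal, lo_sin, lo_cos):
    """Mix the signal with both LO phases and average (a crude low-pass)."""
    n = len(signal)
    i_out = 2.0 * sum(s * ls for s, ls in zip(signal, lo_sin)) / n
    q_out = 2.0 * sum(s * lc for s, lc in zip(signal, lo_cos)) / n
    return i_out, q_out


fs, f_lo, n_samples = 1024, 16, 1024  # one second of data, 16 full LO cycles
phase = [2.0 * math.pi * f_lo * n / fs for n in range(n_samples)]
lo_sin = [math.sin(p) for p in phase]  # stands in for the oscillator SIN output
lo_cos = [math.cos(p) for p in phase]  # stands in for the oscillator COS output

# Test signal: amplitude 3 at the LO frequency, shifted by 0.5 rad.
signal = [3.0 * math.sin(p + 0.5) for p in phase]

i_out, q_out = demodulate(signal, lo_sin, lo_cos)
amplitude = math.hypot(i_out, q_out)
phase_shift = math.atan2(q_out, i_out)
print("amplitude %.3f, phase %.3f rad" % (amplitude, phase_shift))
```

The I and Q outputs together recover both the amplitude and the phase of the signal component at the LO frequency.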

  5755   Fri Oct 28 12:47:38 2011   jamie   Update   CDS   CSS/BOY installed on pianosa

I've installed Control System Studio (CSS) on pianosa, from the version 3.0.2 Red Hat binary zip.  It should be available as "css" from the command line.

CSS is a new MEDM replacement. Its output is .opi files, instead of .adl files.  It's supposed to include some sort of converter, but I didn't play with it enough to figure it out.

Please play around with it and let me know if there are any issues.

links:

  5756   Fri Oct 28 14:56:02 2011   Jenne   Update   CDS   CSS/BOY installed on pianosa

Quote:

I've installed Control System Studio (CSS) on pianosa, from the version 3.0.2 Red Hat binary zip.  It should be available as "css" from the command line.

CSS is a new MEDM replacement. Its output is .opi files, instead of .adl files.  It's supposed to include some sort of converter, but I didn't play with it enough to figure it out.

Please play around with it and let me know if there are any issues.

links:

 So far I've only given it about half an hour of my time, but it is *really* frustrating so far.  There don't seem to be any instructions on how to tell it what our channels are / how to link CSS to our EPICS databases.  Or, the instructions that are there say "do it!", but they neglect to mention how...  Also, there exists (maybe?) an ADL->BOY converter, but I can't find any buttons to click, or how to import an .adl, or what I'm supposed to do.  Also, it's not clear how to get to the editor to start making screens from scratch. 

It looks like it has lots of nifty indicators and buttons, but I would have felt better if I had been able to do anything.

Another thing that is going to be a problem:  the Shell Command button that we use all over the place in our MEDM screens is not supported by this program.  It's listed in the "limitations" of the ADL2BOY converter.  This may kill the CSS program immediately.  Jamie: did Rolf/anyone mention a game plan for this?  It's super nice to be able to run scripts from the screens.

Moral of the story:  I'm annoyed, and going to continue making my OAF screens in MEDM for now.

  5786   Wed Nov 2 17:29:10 2011   Katrin   Update   CDS   c1scy.mdl compiled

Slight modification on that model:

  • terminated Q_out of Lockins to be able to compile the old model
  • assigned other ADC channels to GCY (green YARM)
  5789   Wed Nov 2 20:56:49 2011   Katrin   Update   CDS   digital zeros at C1:X05-MADC0 (c1scy)

Channels C1:X05-MADC0_TP_XXX with XXX 2-9, 14-19, 21-27, 29-31 showed digital zeros.

Some of these channels are used in c1scy.mdl, e.g. for OSEM stuff. I guess this is not optimal...

  5790   Wed Nov 2 21:15:06 2011   Katrin   Update   CDS   and again c1scy.mdl compiled

I changed an ADC channel for GCY_ERR and thus recompiled the c1scy model.

  5791   Wed Nov 2 21:49:59 2011   Katrin   Update   CDS   c1iscey computer died again

while I was not doing anything on the machine.

  5795   Thu Nov 3 15:14:22 2011   Koji   Update   CDS   CSS/BOY installed on pianosa

How to run/use CSS/BOY

(PREPARATION)

0) Everything runs on pianosa for now.

1) type css to launch CSS IDE.

2) You may want to create your own project folder as generally everything happens below this folder.

=== How to make a new project ===
- Right-click on tree view of Navigator pane
- You are asked to select a wizard. Select "General -> Project". Click "Next".
- Type in an appropriate project name (like KOJI). Click "Finish"
- The actual location of the project is /home/controls/CSS-Workspaces/Default/KOJI/ in the above example

(PLAY WITH "Data Browser", THE STRIPTOOL ALTERNATIVE)

1) Select the menu "CSS -> Trends -> Data Browser". A new data browser window appears.

2) Right-click on the data browser window. Select the menu "Add PV". Type in the channel name (e.g. "C1:LSC-ASDC_OUT16")

3) Once the plot configuration is completed, it can be saved as a template. Select the menu "File -> Save" and put it in your project folder.

4) Everything else is relatively straightforward. You can add multiple channels. Log scaling is also available.
I still haven't found how to split the vertical axis to make stacked charts, but I wouldn't be surprised if that is not available.
CSS_snapshot1.png

(HOW TO MAKE A NEW "BOY" SCREEN)

0) Simply put, BOY is the alternative to MEDM. BOY screen files are named "*.opi", similar to "*.adl" for MEDM.

1) To create a new opi file, right-click on the navigator tree and select the menu "New -> Other".

2) You are asked to select a wizard. Select "BOY -> OPI file" and click "Next".

3) Type in the name of the opi file. Also select the location of the file in the project tree. Click "Finish".

4) Now you are in OPI EDITOR. Place your widgets as you like.

5) To test the OPI screen, push the green round button at the top right. The short cut key is  "CTRL-G".
OPI_EDITOR.png

(HOW TO EDIT AN EXISTING OPI FILE)

1) Right-click an OPI file in the navigator tree. Select the menu "Open With -> OPI Editor". That's it.

(HOW TO CONVERT AN EXISTING ADL FILE INTO OPI FILE)

1) You need to copy your ADL file into your project folder. In this example, it is /home/controls/CSS-Workspaces/Default/KOJI/

2) Once the ADL file is in the project folder, it should appear in the navigator tree. If not, right-click the navigator pane and select "Refresh".

3) Make sure the "ADL Parser" button at the top left is selected, although this has no essential function.
This button just changes the window layout and does nothing else. But the ADL Tree View pane is interesting to see.

4) If you select the ADL file by clicking it, the tree structure of the ADL file is automatically interpreted and appears in the ADL Tree View pane.
But it is just a display and does nothing.

5) Right-click the ADL file in the navigator pane. Now you can see the new menu "BOY". Select "BOY -> Convert ADL File to OPI".

6) Now you get the opi version of that file.
The conversion is not perfect, as one would expect. It works fine for simple screens
(e.g. matrix screens).
But the filter module screens get weird. And the new LSC screen did not work properly (maybe too heavy?)
ADL2OPI1.png ADL2OPI2.png

(HOW TO RUN OUR SHELL/PERL/PYTHON SCRIPTS FROM BOY)

CSS has javascript/python scripting capability.
I suspect that we can make a wrapper to run external commands from python script although it is not obvious yet.
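A minimal sketch of such a wrapper, written with the standard Python subprocess module. Whether the interpreter embedded in BOY exposes this module, and how a widget would call the function, has not been checked here; the function name is hypothetical.

```python
import subprocess
import sys


def run_script(argv):
    """Run an external command and return (exit code, combined output as text)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()  # wait for the command to finish
    return proc.returncode, output.decode("utf-8", "replace")


# Stand-in for one of our shell/perl/python scripts: a one-line Python command.
code, text = run_script([sys.executable, "-c", "print('hello from a script')"])
print(code, text.strip())
```

Passing the command as a list avoids shell quoting problems, and a non-zero exit code can be used to flag a failed script on the screen.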

  5841   Tue Nov 8 17:48:21 2011   Mirko   Update   CDS   Dolphin weirdness

Since yesterday evening I had some trouble getting a channel from rfm on c1sus to oaf on c1lsc via dolphin. Several restarts of c1lsc and c1sus didn't help. At some point this morning a restart of c1lsc helped. Everything is ok again.
At the bad time the dolphin TF looked like this:

Dolphn_TF.png

Should be flat at gain 1 and no phase change obviously.

  5854   Wed Nov 9 18:02:42 2011   jamie   Update   CDS   ISCEX front-end working again (for the moment...)

The c1iscex IO chassis seems to be working again, and the iscex front-end is running again.

However, I can't say that I actually fixed the problem.

Originally I thought the timing slave board had died because the front LED indicators next to the fiber IO were out.  I didn't initially consider this a power supply problem since there were other LEDs on the board that were lit.  I finally managed to track down Rolf to give Downs the OK to pull the timing boards out of a spare IO chassis for us to use.  However, when I replaced the timing boards in the chassis with the new ones, they showed the exact same behavior.

I then checked the power to the timing boards, which comes off a 2-pin connector from the backplane board in the back of the IO chassis.  Apparently it's supposed to be 12V, but it was only showing ~2.75V.  Since it was showing the same behavior for both timing boards, I assumed that the issue was on the IO chassis backplane.

I (with the help of Todd Etzel) started pulling cards out of the IO chassis (while power cycling appropriately, of course) to see if that changed anything.  After pulling out both the ADC and DAC cards, the timing system then came up fine, with full power.  The weird part is that everything then stayed fine after we started plugging all the cards back in.  We eventually got back to the fully assembled configuration with everything working.  But, nothing was changed, other than just re-seating all the cards.

Clearly there's some sort of flaky connection on the IO chassis board.  Something is prone to shorting, or something, that overloads the power supply and causes the voltage supply to the timing card to drop.

All I can do at this point is keep an eye on it and go through another round of debugging if it happens again.

If it does happen again, I ask that everyone please not touch the IO chassis and let me look at it first.  I want to try to poke around before anyone jiggles any cables so I can track down where the issue might be.

  5861   Thu Nov 10 11:52:00 2011   Jenne   Update   CDS   RFM signal transferring

I am not so happy with the control signals that are coming into the OAF via the RFM/Dolphin/shmem. 

The MCL/MCF signal travels via RFM from the IOO computer to the RFM model on the SUS computer, and then via dolphin to the OAF model on the LSC computer.

The MICH and PRCL signals travel via shmem from the LSC model to the OAF model, all on the LSC computer.  They don't go through the RFM model.

The seismometer channels travel via shmem between the PEM model on the SUS computer and the RFM model on the SUS computer, and then via dolphin between the SUS computer and the OAF model on the LSC computer.

Each pdf shows the power spectrum and a time series of the signals in their "original" model, and in the OAF model.  The seismometer is the only one that seems fine.  The time series match, except for a delay which is not surprising, since the signals have to travel.  The other signals seem pretty distorted.  What is going on??? Why can we trust some, but not all, of the signals that move between models and between computers???
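One way to put a number on the transfer delay between a signal in its original model and its copy in the OAF model is to look for the lag that maximizes their cross-correlation. The sketch below uses synthetic data with a known three-sample delay; the function name and the data are assumptions, and real channels would first have to be brought to a common sample rate.

```python
import random


def best_lag(original, received, max_lag):
    """Return the lag (in samples) at which `received` best matches `original`."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        shifted = received[lag:]
        n = min(len(original), len(shifted))
        # Mean product of the overlapping samples at this lag.
        score = sum(original[i] * shifted[i] for i in range(n)) / n
        if score > best_score:
            best, best_score = lag, score
    return best


rng = random.Random(1)
original = [rng.gauss(0.0, 1.0) for _ in range(2000)]
received = [0.0] * 3 + original[:-3]  # synthetic copy, delayed by three samples

lag = best_lag(original, received, max_lag=20)
print("estimated delay: %d samples" % lag)
```

A clean transfer gives one sharp peak at the delay. A distorted transfer would show up as a low or smeared peak.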

 (This data was all taken while the MC was locked, but MICH and PRCL were not.  I don't think this should have any effect on the signal transfer though).

The MCL isn't soooo bad, so maybe we can keep moving forward with it, but I'm concerned that we're not really going to be successful OAF-ing the other degrees of freedom if the signals are so distorted.

  5895   Tue Nov 15 15:16:04 2011   kiwamu   Update   CDS   dataviewer doesn't run

Dataviewer is somehow unable to access fb.

I restarted daqd on fb but it didn't help.

Also the status screen is showing a blank white form for all the realtime models. Something bad is happening.

blank.png

JAMIEEEE !!!!

  5896   Tue Nov 15 15:56:23 2011   jamie   Update   CDS   dataviewer doesn't run

Quote:

Dataviewer is somehow unable to access fb.

I restarted daqd on fb but it didn't help.

Also the status screen is showing a blank white form for all the realtime models. Something bad is happening.

 So something very strange was happening to the framebuilder (fb).  I logged on the fb and found this being spewed to the logs once a second:

[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

Apparently some daqd subprocess or thread was trying to call /bin/gcore, and was failing since that file doesn't exist.  This started at around 5:52 AM this morning:

[Tue Nov 15 05:46:52 2011] main profiler warning: 1 empty blocks in the buffer
[Tue Nov 15 05:46:53 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:54 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:55 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:56 2011] main profiler warning: 0 empty blocks in the buffer
...
[Tue Nov 15 05:52:43 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:44 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:45 2011] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1005400026 to 1005400379
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

The gcore I believe it's looking for is a debugging tool that is able to retrieve core images of running processes.  I'm guessing that something caused something in the fb to eat crap, and it was stuck trying to debug itself.  I can't tell what exactly happened, though.  I'll ping the CDS guys about it.  The daqd process was continuing to run, but it was not responding to anything, which is why it could not be restarted via the normal means, and maybe why the various FB0_*_STATUS channels were seemingly dead.
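The log excerpt above already contains the two facts worth extracting: the size of the GPS time jump and the time of the first crash. A small parsing sketch follows; the function name is hypothetical and the sample text is a shortened copy of the excerpt.

```python
import re


def summarize_daqd_log(log_text):
    """Return (GPS jump in seconds or None, timestamp of the first signal 11 or None)."""
    jump = re.search(r"GPS time jumped from (\d+) to (\d+)", log_text)
    gap = int(jump.group(2)) - int(jump.group(1)) if jump else None
    first_crash = None
    for line in log_text.splitlines():
        match = re.match(r"\[(.+?)\] going down on signal 11", line)
        if match:
            first_crash = match.group(1)
            break
    return gap, first_crash


sample = """\
[Tue Nov 15 05:52:45 2011] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1005400026 to 1005400379
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 05:52:46 2011] going down on signal 11
"""

gap, first_crash = summarize_daqd_log(sample)
print("GPS jump of %d s, first crash at %s" % (gap, first_crash))
```

The jump works out to 353 s, almost six minutes, immediately before the first signal 11 (a segmentation fault).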

I manually killed the daqd process, and monit seemed to bring up a new process with no problem.  I'll keep an eye on it.

  5901   Tue Nov 15 23:44:44 2011   Mirko   Update   CDS   C1:LSC & C1:SUS restarted

Earlier this evening C1:LSC died when I hit the DAQ reload after adding an OAF channel to be recorded. No model was changed. Had to restart C1:SUS too. Reloaded burts from this morning 5am, except for C1:IOO, which I loaded from 16:07.

  5938   Fri Nov 18 01:12:14 2011   Suresh   Update   CDS   MC1 LR dead for > 1 month; now revived temporarily

[Den, Mirko, Suresh]

    We were investigating why there is no correlation between the MC1 OSEM signals and seismic motion.   During this we noticed a recurrence of the old problem of the MC1_LR sensor being dead.  I went and pressed down the chip holders where the AA filters used to sit and which now hold the jumper wires.  The board is large and flexible, so it is quite likely that some solder joint is broken on the MC1_LR path on this board.

   The signal came back to life and is okay now. But it can break off again any time.

 

 

Quote:

 Since the MC1 LRSEN channel wasn't working, my input matrix diagonalization wasn't working. So I decided to fix it somehow.

I went to the rack and traced the signal: first at the LEMO monitor on the whitening card, secondly at the 4-pin LEMO cable which goes into the AA chassis.

The signal existed at the input to the AA chassis but not in the screen. So I pressed the jumper wire (used to be AA filter) down for the channel corresponding to the MC1 LRSEN channel.

It now has come back and looks like the other sensors. As you can see from this plot and Joe's entry from a couple weeks ago, this channel has been dead since May 17th.

The ELOG reveals that Kiwamu caught Steve doing some (un-elogged) fooling around there. Burnt Toast -> Steve.

bt.jpg

993190663   =      free swinging ringdown restarted again

 

  5971   Mon Nov 21 17:07:34 2011   Mirko   Update   CDS   c1pem model dead

For some reason C1PEM doesn't seem to work anymore after a recompilation. It did recompile fine. We just changed some channel / subsystem names.

Tried reverting to the svn version. Doesn't work. Rebooting C1SUS didn't help either.

  5973   Mon Nov 21 22:51:55 2011   Mirko   Update   CDS   c1pem model dead

Quote:

For some reason C1PEM doesn't seem to work anymore after a recompilation. It did recompile fine. We just changed some channel / subsystem names.

Tried reverting to the svn version. Doesn't work. Reboot C1SUS also no good.

 It is fine again. Thanks Jamie.

  5979   Tue Nov 22 18:15:39 2011   jamie   Update   CDS   c1iscex ADC found dead. Replaced, c1iscex working again

c1iscex has not been running for a couple of days (since the power shutdown at least). I was assuming that the problem was a recurrence of the c1iscex IO chassis issue from a couple weeks ago (5854).  However, upon investigation I found that the timing signals were all fine.  Instead, the IOP was reporting that it was finding no ADC, even though there is one in the chassis.

Since I had a spare ADC that was going to be used for the CyMAC, I decided to try swapping it out to see if that helped.  Sure enough, the system came up fine with the new ADC.  The IOP (c1x01) and c1scx are now both running fine.

I assume the issue before might have been caused by a failing and flaky ADC, which has now failed.  We'll need to get a new ADC for me to give back to the CyMAC.

  6002   Thu Nov 24 15:27:15 2011   kiwamu   Update   CDS   c1iscey hardware rebooted

The c1iscey machine crashed around 1:00 AM last night, and I did a hardware reboot by pressing a button on the front panel of the machine.
After the reboot it's been running okay so far.
The crash happened after I pressed the "Diag Reset" button on the CDS status screen.

  6011   Fri Nov 25 22:11:12 2011   Mirko   Update   CDS   Beware of fancy filter modules

[Rana, Den, Mirko]

It seems you can shoot yourself in the foot if your filter modules are too complex.

Den discovered this while looking into the C1:SUS-MC?_SUSPOS filter module named Cheby, which consists of cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501): he noticed that the coherence between the input and the output of the filter is low.

Cheby filter:

Cheby.png

CoherenceCheby.pdf

This is most likely because the filter spans more than the roughly 16 decimal orders of magnitude that the double data type can resolve.

The coherence is fine when one splits the filter in two, giving every cheby1 filter its own module. The coherence is also fine when you use the Cheby filter in a 2kHz system, although the freq. response looks very odd.

Black: 16kHz, Red: 2kHz (yes, the filter was converted correctly; no text file editing there)

ChebyAt16kHzBlackand2kHzRed.png

The problem occurs on the c1lsc as well as the c1sus computer.

 

Looking into the foton files does point to a precision problem, given the huge range of scales covered in there:

C1:MCS 16kHz (Cheby: Original filter with low coherence. CHbyTST & ChebyTST: Original filter split amongst two filter modules)
################################################################################
### SUS_MC3_LSC                                                              ###
################################################################################
# DESIGN   SUS_MC3_LSC 0 zpk([0],[30],0.333333,"n")
# DESIGN   SUS_MC3_LSC 1 cheby1("LowPass",6,1,12)
# DESIGN   SUS_MC3_LSC 2 cheby1("LowPass",2,0.1,3)gain(1.13501) \
#                       
# DESIGN   SUS_MC3_LSC 3 cheby1("LowPass",2,0.1,3)gain(1.13501)cheby1("LowPass",6,1,12)
# DESIGN   SUS_MC3_LSC 4 ellip("BandStop",4,1,40,16.1,16.9)ellip("BandStop",4,1,40,23.7,24.5)gain(1.25871)
###                                                                          ###
SUS_MC3_LSC 0 12 1  32768      0 30:0.0          9.942903833923793  -0.9885608209680459   0.0000000000000000  -1.0000000000000000   0.0000000000000000
SUS_MC3_LSC 1 21 3      0      0 CHbyTST     9.095012702673064e-18  -1.9978637592754149   0.9978663974923444   2.0000000000000000   1.0000000000000000
                                                                 -1.9984258494490537   0.9984376515442090   2.0000000000000000   1.0000000000000000
                                                                 -1.9994068831713223   0.9994278587363880   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 2 12 1  32768      0 ChebyTST    1.228759186937126e-06  -1.9972699801052749   0.9972743606395355   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 3 12 4  32768      0 Cheby       1.117558041371939e-23  -1.9972699801052749   0.9972743606395355   2.0000000000000000   1.0000000000000000
                                                                 -1.9978637592754149   0.9978663974923444   2.0000000000000000   1.0000000000000000
                                                                 -1.9984258494490537   0.9984376515442090   2.0000000000000000   1.0000000000000000
                                                                 -1.9994068831713223   0.9994278587363880   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 4 12 8  32768      0 BounceRoll     0.9991466189294013  -1.9996634951844035   0.9997010181703262  -1.9999611719719754   0.9999999999999997
                                                                 -1.9999303040590390   0.9999684339228864  -1.9999605309876360   0.9999999999999999
                                                                 -1.9999248796830529   0.9999668732412945  -1.9999594299327190   1.0000000000000002
                                                                 -1.9996385459838455   0.9996812069238987  -1.9999587601905868   1.0000000000000000
                                                                 -1.9996161812709703   0.9996978939989944  -1.9999163485656493   0.9999999999999999
                                                                 -1.9998855694973159   0.9999681878303275  -1.9999154056705493   0.9999999999999998
                                                                 -1.9998788577090287   0.9999671193335300  -1.9999137972442669   1.0000000000000000
                                                                 -1.9995951159123118   0.9996843310430819  -1.9999128255920269   1.0000000000000000

C1:OAF 2kHz
###############################################################################
### YARM_IN                                                                  ###
################################################################################
# DESIGN   YARM_IN 0 zpk([0],[30],0.333333,"n")
# DESIGN   YARM_IN 3 cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501)
# DESIGN   YARM_IN 4 ellip("BandStop",4,1,40,16.1,16.9)ellip("BandStop",4,1,40,23.7,24.5)gain(1.25871)
# DESIGN   YARM_IN 8 cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501)zpk([],[10],1,"n")
###                                                                          ###
YARM_IN  0 12 1   4096      0 30:0.0           9.56649943398763  -0.9119509539166185   0.0000000000000000  -1.0000000000000000   0.0000000000000000
YARM_IN  3 12 4   4096      0 Cheby       1.829878084970283e-16  -1.9828889048300398   0.9830565293861987   2.0000000000000000   1.0000000000000000
                                                                 -1.9868188576622443   0.9875701115261976   2.0000000000000000   1.0000000000000000
                                                                 -1.9940934073784453   0.9954330165532327   2.0000000000000000   1.0000000000000000
                                                                 -1.9781245722853238   0.9784022621062476   2.0000000000000000   1.0000000000000000
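
To put numbers on the "range of scale" in the listings above, here is a minimal Python sketch (not part of the front-end code). The four gain factors are copied from the foton lines above; the constant names and the helper orders_of_magnitude are made up for this illustration. It checks that the gain of the combined 16 kHz Cheby module is just the product of the two split modules' gains, and counts how many decimal orders of magnitude each gain sits below unity.

```python
import math

# Overall gain factors copied from the foton listings above.
GAIN_CHBYTST_16K = 9.095012702673064e-18    # cheby1("LowPass",6,1,12) alone, 16 kHz
GAIN_CHEBYTST_16K = 1.228759186937126e-06   # cheby1("LowPass",2,0.1,3)gain(1.13501) alone, 16 kHz
GAIN_CHEBY_16K = 1.117558041371939e-23      # both in one module, 16 kHz
GAIN_CHEBY_2K = 1.829878084970283e-16       # both in one module, 2 kHz


def orders_of_magnitude(gain: float) -> float:
    """Decimal orders of magnitude between unity and the given gain factor."""
    return abs(math.log10(gain))


# The combined module's gain should be the product of the two split gains.
product = GAIN_CHBYTST_16K * GAIN_CHEBYTST_16K
print(f"product of split gains: {product:.6e}")
print(f"combined Cheby gain   : {GAIN_CHEBY_16K:.6e}")

for name, gain in [
    ("CHbyTST  16 kHz", GAIN_CHBYTST_16K),
    ("ChebyTST 16 kHz", GAIN_CHEBYTST_16K),
    ("Cheby    16 kHz", GAIN_CHEBY_16K),
    ("Cheby     2 kHz", GAIN_CHEBY_2K),
]:
    print(f"{name}: {orders_of_magnitude(gain):5.2f} orders below unity")
```

This only restates the numbers in the listings. The combined 16 kHz gain lies well beyond 16 orders below unity while the 2 kHz one stays just inside, but the split CHbyTST module is already near 1e-17 and shows fine coherence, so the arithmetic alone does not prove that the gain range is the cause.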

  6013   Sat Nov 26 02:05:43 2011   Mirko   Update   CDS   Beware of fancy filter modules

 

We replaced the complicated Cheby filter module with three separate filter modules. The filter probably doesn't need to be so complicated, but we would rather not change too many things at once. The new filter modules are called
Ch1, Ch2, and Ch3 and are in filter modules 3, 9, and 10 of the C1:SUS-MC?_SUSPOS filters. The coherence with these filters is fine. Someone should look into the possibility of simplifying these filters.

It would be good to check for numerical problems in other filters!

  6017   Sat Nov 26 10:55:40 2011   rana   Update   CDS   Beware of fancy filter modules

 

 It could be that what we're seeing is the noise floor of the Direct Form II filter structure; Matt's 2008 elog shows an example (also see G0900928-v1).

 

  6020   Mon Nov 28 06:53:30 2011   kiwamu   Update   CDS   c1sus shutdown

I restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after taking care of the interferometer for a bit.

Quote from #6016

c1sus has been shut down so that the optics don't bang around.  This is because the watchdogs are not working.

  6026   Mon Nov 28 16:46:55 2011   kiwamu   Update   CDS   c1sus is now up

I have restarted the c1sus machine and burt-restored c1sus and c1mcs to the day before Thanksgiving, namely the 23rd of November.

Quote from #6020

I restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after taking care of the interferometer for a bit.

  6030   Mon Nov 28 19:24:51 2011   Jenne   Update   CDS   Beware of fancy filter modules

[Rana, Jenne]

Some of the funniness comes from some kind of mysterious interaction between 2 filter modules in the filter banks.  Engaging just FM1 (30:0.0) or just FM4 (Cheby, which is 2 cheby1's) gives reasonable coherence.  Engaging both FM1 and FM4 together doesn't do so well: the coherence goes way down.

Just FM1 (30:0.0)

SUSPOS_ETMY_30and0_measured_vs_idealTF.pdf

Just FM4 (Cheby)

SUSPOS_ETMY_Cheby_measured_vs_idealTF.pdf

 Both FM1 and FM4

SUSPOS_ETMY_30and0andCheby_measured_vs_idealTF.pdf

 All the coherences plotted together

SUSPOS_ETMY_30and0andCheby_compareCoherence.pdf

You'd think that the signal encounters FM1, gets filtered, and that result is the signal sent to the next active filter module, FM4, so the 2 filter modules shouldn't interact.  But clearly there's some funny business here since engaging both makes things crappy. 

Matlab investigations to replicate this behavior offline are in progress.

  6031   Mon Nov 28 22:09:24 2011   rana   Update   CDS   Beware of fancy filter modules

To see what might be causing the problem, I used a version of the filter noise test matlab code that Matt had in the elog.

To see if it was a single precision problem, I just recast the input data:   x = single(x)

This is not strictly correct, since the rest of the operations are still done in double precision, but I think the attached plot shows that casting from double to single produces close to the right amount of noise to explain our excess noise in the 0.1-1 Hz region.

Den is going to interview Alex to find out if we have some kind of issue like this. My understanding was that all of our filter module calculations were being done in double precision (64 bit), but it's possible that some single-precision stuff has crept back in. Currently the FIR filtering code IS single precision, and in the past the SUS code which didn't carry the LSC signals (meaning ASC and damping) was done in single precision.
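
For anyone without Matlab at hand, the recast x = single(x) can be mimicked in plain Python. This is only a sketch of the cast itself, not of Matt's filter noise test: the helper names to_single and quantization_noise_ratio, and the 1 Hz test sine sampled at 2048 Hz, are assumptions made for this illustration.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE-754 single and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


def quantization_noise_ratio(samples) -> float:
    """RMS of the single-precision rounding error divided by the RMS of the signal."""
    err2 = sum((to_single(s) - s) ** 2 for s in samples)
    sig2 = sum(s * s for s in samples)
    return math.sqrt(err2 / sig2)


# Hypothetical test signal: one second of a 1 Hz sine sampled at 2048 Hz.
fs = 2048
signal = [math.sin(2.0 * math.pi * n / fs) for n in range(fs)]

ratio = quantization_noise_ratio(signal)
print(f"rounding noise relative to signal: {ratio:.2e}")
print(f"single-precision bound 2**-24    : {2.0 ** -24:.2e}")
```

Each sample picks up a relative rounding error of at most 2**-24 (about 6e-8), compared with about 1e-16 for a double. Whether that level matches the measured excess noise still has to be judged from the attached plot, not from this sketch.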

  6033   Tue Nov 29 04:47:49 2011   kiwamu   Update   CDS   c1sus shut down again

I have shut down the c1sus machine at 3:30 AM.

  6037   Tue Nov 29 15:30:01 2011   jamie   Update   CDS   location of currently used filter function

So I tracked down where the currently-used filter function code is defined (the following is all relative to /opt/rtcds/caltech/c1/core/release):

Looking at one of the generated front-end C source files (src/fe/c1lsc/c1lsc.c), it looks like the relevant filter function is:

filterModuleD()

which is defined in:

src/include/drv/fm10Gen.c

and an associated header file is:

src/include/fm10Gen.h

  6038   Tue Nov 29 15:57:43 2011   Den   Update   CDS   location of currently used filter function

 

We are interested in the following question: can the structures defined in fm10Gen.h (or in some other *.c / *.h files) whose variables are declared as float cause the filter calculations to be done in single precision instead of double?

 

typedef struct FM_OP_IN{
  UINT32 opSwitchE;     /* Epics Switch Control Register; 28/32 bits used*/
  UINT32 opSwitchP;     /* PIII Switch Control Register; 28/32 bits used*/
  UINT32 rset;          /* reset switches */
  float offset;         /* signal offset */
  float outgain;        /* module gain */
  float limiter;        /* used to limit the filter output to +/- limit val */
  int rmpcmp[FILTERS];  /* ramp counts: ramps on a filter for type 2 output*/
                        /* comparison limit: compare limit for type 3 output*/
                        /* not used for type 1 output filter */
  int timeout[FILTERS]; /* used to timeout wait in type 3 output filter */
  int cnt[FILTERS];     /* used to keep track of up and down cnt of rmpcmp */
                        /* should be initialized to zero */
  float gain_ramp_time; /* gain change ramping time in seconds */
} FM_OP_IN;  
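
As a first look at what the float members above can do on their own, here is a minimal sketch in Python (ctypes), not the front-end C. FmOpInFloats is a hypothetical mirror of only the four float members of FM_OP_IN, and the value 1.13501 is borrowed from the Cheby gain() purely as an example number. It shows that a value written into a C float field is rounded to single precision when it is stored. In C, a float operand is promoted to double when it is combined with a double, so a float parameter by itself gives a slightly rounded constant rather than single-precision arithmetic. Whether filterModuleD() keeps its signal path and filter history in double is not answered by this sketch.

```python
import ctypes


class FmOpInFloats(ctypes.Structure):
    """Hypothetical mirror of only the float members of FM_OP_IN."""
    _fields_ = [
        ("offset", ctypes.c_float),
        ("outgain", ctypes.c_float),
        ("limiter", ctypes.c_float),
        ("gain_ramp_time", ctypes.c_float),
    ]


def stored_as_float(value: float) -> float:
    """Value read back after being written into a C float field."""
    return FmOpInFloats(outgain=value).outgain


requested = 1.13501  # example number only, borrowed from the Cheby gain()
stored = stored_as_float(requested)
rel_err = abs(stored - requested) / requested
print(f"requested gain : {requested!r}")
print(f"stored in float: {stored!r}")
print(f"relative error : {rel_err:.2e}")

# Emulates the C promotion rule: double sample times float gain is a
# double-precision multiply that simply uses the rounded gain value.
samples = [0.25, -3.0, 1234.5]
outputs = [s * stored for s in samples]
```

The relative error of the stored gain stays below 2**-24 and is the same for every sample, so it acts as a tiny fixed gain offset rather than as added noise.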

 
