40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 41 of 341  Not logged in ELOG logo
ID Date Author Type Categoryup Subject
  5731   Mon Oct 24 20:00:21 2011 MirkoUpdateCDSTiny little scripts

Located in /opt/rtcds/caltech/c1/userapps/release/cds/common/scripts

Script 1: Diagreset.sh
Hits the diag reset buttons on the models on c1lsc and c1sus computers.

Script 2: Burtrest.sh
Restores the burt files from "today" 4am. Use a text editor if you want to change the times.

  5736   Tue Oct 25 18:09:44 2011 jamieUpdateCDSNew DEMOD part

I forgot to elog (bad Jamie) that I broke out the demodulator from the LOCKIN module to make a new DEMOD part:

DEMOD.png

The LOCKIN part now references this part, and the demodulator can now be used independently.  The 'LO SIN in' and 'LO COS in' should receive their input from the SIN and COS outputs of the OSCILLATOR part.

  5755   Fri Oct 28 12:47:38 2011 jamieUpdateCDSCSS/BOY installed on pianosa

I've installed Control System Studio (CSS) on pianosa, from the version 3.0.2 Red Hat binary zip.  It should be available as "css" from the command line.

CSS is a new MEDM replacement. It's output is .opi files, instead of .adl files.  It's supposed to include some sort of converter, but I didn't play with it enough to figure it out.

Please play around with it and let me know if there are any issues.

links:

  5756   Fri Oct 28 14:56:02 2011 JenneUpdateCDSCSS/BOY installed on pianosa

Quote:

I've installed Control System Studio (CSS) on pianosa, from the version 3.0.2 Red Hat binary zip.  It should be available as "css" from the command line.

CSS is a new MEDM replacement. It's output is .opi files, instead of .adl files.  It's supposed to include some sort of converter, but I didn't play with it enough to figure it out.

Please play around with it and let me know if there are any issues.

links:

 So far I've only given it about half an hour of my time, but it is *really* frustrating so far.  There don't seem to be any instructions on how to tell it what our channels are / how to link CSS to our EPICS databases.  Or, the instructions that are there say "do it!", but they neglect to mention how...  Also, there exists (maybe?) an ADL->BOY converter, but I can't find any buttons to click, or how to import an .adl, or what I'm supposed to do.  Also, it's not clear how to get to the editor to start making screens from scratch. 

It looks like it has lots of nifty indicators and buttons, but I would have felt better if I had been able to do anything.

Another thing that is going to be a problem:  the Shell Command button that we use all over the place in our MEDM screens is not supported by this program.  It's listed in the "limitations" of the ADL2BOY converter.  This may kill the CSS program immediately.  Jamie: did Rolf/anyone mention a game plan for this?  It's super nice to be able to run scripts from the screens.

Moral of the story:  I'm annoyed, and going to continue making my OAF screens in MEDM for now.

  5786   Wed Nov 2 17:29:10 2011 KatrinUpdateCDSc1scy.mdl compiled

Slight modification on that model:

  • terminated Q_out of Lockins to be able to compile the old model
  • assigned other ADC channels to GCY (green YARM)
  5789   Wed Nov 2 20:56:49 2011 KatrinUpdateCDSdigital zeros at C1:X05-MADC0 (c1scy)

Channels C1:X05-MADC0_TP_XXX with XXX 2-9, 14-19, 21-27, 29-31 showed digital zeros.

Some of these channels are used in c1scy.mdl, e.g. for OSEM stuff. I guess this is not optimal...

  5790   Wed Nov 2 21:15:06 2011 KatrinUpdateCDSand again c1scy.mdl compiled

I changed an ADC channel for GCY_ERR and thus recompiled the c1scy model.

  5791   Wed Nov 2 21:49:59 2011 KatrinUpdateCDSc1iscey computer died again

while I was not doing anything on the machine.

  5795   Thu Nov 3 15:14:22 2011 KojiUpdateCDSCSS/BOY installed on pianosa

How to run/use CSS/BOY

(PREPARATION)

0) Everything runs on pianosa for now.

1) type css to launch CSS IDE.

2) You may want to create your own project folder as generally everything happens below this folder.

=== How to make a new project ===
- Right-click on tree view of Navigator pane
- You are asked to select a wizard. Select "General -> Project". Click "Next".
- Type in an appropriate project name (like KOJI). Click "Finish"
- The actual location of the project is /home/controls/CSS-Workspaces/Default/KOJI/ in the above example

(PLAY WITH "Data Browser", THE STRIPTOOL ALTERNATIVE)

1) Select the menu "CSS -> Trends -> Data Browser". A new data browser window appears.

2) Right-click on the data browser window. Select the menu "Add PV". Type in the channel name (e.g. "C1:LSC-ASDC_OUT16")

3) Once the plot configuration is completed, it can be saved as a template. Select the menu "File -> Save" and put it in your project folder.

4) Everything else is relatively straightforward. You can add multiple channels. Log scaling is also available.
I still don't find how to split the vertical axis to make a stacked charts, but I don't surprise even if it is not available.
CSS_snapshot1.png

(HOW TO MAKE A NEW "BOY" SCREEN)

0) Simply to say BOY is the alternative of MEDM. The screen file of BOY is named as  "*.opi" similar to "*.adl" for MEDM.

1) To create a new opi file, right-click on the navigator tree and select the menu "New -> Other".

2) You are asked to select a wizard. Select "BOY -> OPI file" and click "Next".

3) Type in the name of the opi file. Also select the location of the file in the project tree. Click "Finish".

4) Now you are in OPI EDITOR. Place your widgets as you like.

5) To test the OPI screen, push the green round button at the top right. The short cut key is  "CTRL-G".
OPI_EDITOR.png

(HOW TO EDIT AN EXISTING OPI FILE)

1) Right click an OPI file in the navigator tree. Select the menu "Open WIth -> OPI Editor". That's it.

(HOW TO CONVERT AN EXISTING ADL FILE INTO OPI FILE)

1) You need to copy your ADL file into your project folder. In this example, it is /home/controls/CSS-Workspaces/Default/KOJI/

2) Once the ADL file is in the project folder, it should appear in the navigator tree. If not, right-click the navigator pane and select "Refresh".

3) Make sure "ADL Parser" button at the left top part is selected, although this has no essential function.
This button just changes the window layout and does nothing. But the ADL Tree View pane would be interesting to see.

3) If you select the ADL file by clicking it,the tree structure of the ADL file is automatically interpreted and appears in the ADL Tree View pane.
But it is just a display and does nothing.

4) Right-click the ADL file in the navigator pane. Now you can see the new menu "BOY". Select "BOY -> Convert ADL File to OPI".

5) Now you get the opi version of that file.
The conversion is not perfect as we can imagine. It works fine for the simple screens.
(e.g. matrix screens)
But the filter module screens get wierd. And the new LSC screen did not work properly (maybe too heavy?)
ADL2OPI1.png ADL2OPI2.png

(HOW TO RUN OUR SHELL/PERL/PYTHON SCRIPTS FROM BOY)

CSS has javascript/python scripting capability.
I suspect that we can make a wrapper to run external commands from python script although it is not obvious yet.

  5841   Tue Nov 8 17:48:21 2011 MirkoUpdateCDSDolphin weirdness

Had since yesterday evening some trouble with getting a channel from rfm on c1sus to oaf on c1lsc via dolphin. Several restarts of c1lsc and c1sus didn't help. At some point this morning a restart of c1lsc helped. Everything ok again.
At the bad time the dolphin TF looked like this:

Dolphn_TF.png

Should be flat at gain 1 and no phase change obviously.

  5854   Wed Nov 9 18:02:42 2011 jamieUpdateCDSISCEX front-end working again (for the moment...)

The c1iscex IO chassis seems to be working again, and the iscex front-end is running again.

However, I can't say that I actually fixed the problem.

Originally I thought the timing slave board had died by the fact that the front LED indicators next to the fiber IO were out.  I didn't initially consider this a power supply problem since there were other leds on the board that were lit.  I finally managed to track down Rolf to give downs the OK to pull the timing boards out of a spare IO chassis for us to use.  However, when I replaced the timing boards in the chassis with the new ones, they showed the exact same behavior.

I then checked the power to the timing boards, which comes off a 2-pin connector from the backplane board in the back of the IO chassis.  Apparently it's supposed to be 12V, but it was only showing ~2.75V.  Since it was showing the same behavior for both timing boards, I assumed that the issue was on the IO chassis backplane.

I (with the help of Todd Etzel) started pulling cards out of the IO chassis (while power cycling appropriately, of course) to see if that changed anything.  After pulling out both the ADC and DAC cards, the timing system then came up fine, with full power.  The weird part is that everything then stayed fine after we started plugging all the cards back in.  We eventually got back to the fully assembled configuration with everything working.  But, nothing was changed, other than just re-seating all the cards.

Clearly there's some sort of flaky connection on the IO chassis board.  Something is prone to shorting, or something, that overloads the power supply and causes the voltage supply to the timing card to drop.

All I can do at this point is keep an eye on it and go through another round of debugging if it happens again.

If it does happen again, I ask that everyone please not touch the IO chassis and let me look at it first.  I want to try to poke around before anyone giggles any cables so I can track down where the issue might be.

  5861   Thu Nov 10 11:52:00 2011 JenneUpdateCDSRFM signal transferring

I am not so happy with the control signals that are coming into the OAF via the RFM/Dolphin/shmem. 

The MCL/MCF signal travels via RFM from the IOO computer to the RFM model on the SUS computer, and then via dolphin to the OAF model on the LSC computer.

The MICH and PRCL signals travel via shmem from the LSC model to the OAF model, all on the LSC computer.  They don't go through the RFM model.

The seismometer channels travel via shmem between the PEM model on the SUS computer and the RFM model on the SUS computer, and then via dolphin between the SUS computer and the OAF model on the LSC computer.

Each pdf shows the power spectrum and a time series of the signals in their "original" model, and in the OAF model.  The seismometer is the only one that seems fine.  The time series match, except for a delay which is not surprising, since the signals have to travel.  The other signals seem pretty distorted.  What is going on??? Why can we trust some, but not all, of the signals that move between models and between computers???

 (This data was all taken while the MC was locked, but MICH and PRCL were not.  I don't think this should have any effect on the signal transfer though).

The MCL isn't soooo bad, so maybe we can keep moving forward with it, but I'm concerned that we're not really going to be successful OAF-ing the other degrees of freedom if the signals are so distorted.

  5895   Tue Nov 15 15:16:04 2011 kiwamuUpdateCDSdataviewer doesn't run

Dataviewer is not able to access to fb somehow.

I restarted daqd on fb but it didn't help.

Also the status screen is showing a blank white form in all the realtime model. Something bad is happening.

blank.png

JAMIEEEE !!!!

  5896   Tue Nov 15 15:56:23 2011 jamieUpdateCDSdataviewer doesn't run

Quote:

Dataviewer is not able to access to fb somehow.

I restarted daqd on fb but it didn't help.

Also the status screen is showing a blank while form in all the realtime model. Something bad is happening.

 So something very strange was happening to the framebuilder (fb).  I logged on the fb and found this being spewed to the logs once a second:

[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

Apparently /bin/gcore was trying to be called by some daqd subprocess or thread, and was failing since that file doesn't exist.  This apparently started at around 5:52 AM last night:

[Tue Nov 15 05:46:52 2011] main profiler warning: 1 empty blocks in the buffer
[Tue Nov 15 05:46:53 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:54 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:55 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:56 2011] main profiler warning: 0 empty blocks in the buffer
...
[Tue Nov 15 05:52:43 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:44 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:45 2011] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1005400026 to 1005400379
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

The gcore I believe it's looking for is a debugging tool that is able to retrieve images of running processes.  I'm guessing that something caused something int the fb to eat crap, and it was stuck trying to debug itself.  I can't tell what exactly happend, though.  I'll ping the CDS guys about it.  The daqd process was continuing to run, but it was not responding to anything, which is why it could not be restarted via the normal means, and maybe why the various FB0_*_STATUS channels were seemingly dead.

I manually killed the daqd process, and monit seemed to bring up a new process with no problem.  I'll keep an eye on it.

  5901   Tue Nov 15 23:44:44 2011 MirkoUpdateCDSC1:LSC & C1:SUS restarted

Earlier this evening C1:LSC died then I hit the DAQ reload after adding an OAF channel to be recorded. No change to any model. Had to restart C1:SUS too. Reloaded burts from this morning 5am, except for C1:IOO, which I loaded from 16:07.

  5938   Fri Nov 18 01:12:14 2011 SureshUpdateCDSMC1 LR dead for > 1 month; now revived temporarily

[Den, Mirko, Suresh]

    We were investigating why there is no correlation between MC1 osem signals and seismic motion.   During this we noticed a recurrence of this old problem of MC1_LR sensor being dead.  I went and pressed down the chip holders where the AA filters used to sit and which now hold the jumper wire.  The board is large and flexible it is quite likely some solder joint is broken on the MC1_LR path on this board.

   The signal came back to life and is okay now. But it can break off again any time.

 

 

Quote:

 Since the MC1 LRSEN channel is not wasn't working, my input matrix diagonalization hasn't worked today wasn't working. So I decided to fix it somehow.

I went to the rack and traced the signal: first at the LEMO monitor on the whitening card, secondly at the 4-pin LEMO cable which goes into the AA chassis.

The signal existed at the input to the AA chassis but not in the screen. So I pressed the jumper wire (used to be AA filter) down for the channel corresponding to the MC1 LRSEN channel.

It now has come back and looks like the other sensors. As you can see from this plot and Joe's entry from a couple weeks ago, this channel has been dead since May 17th.

The ELOG reveals that Kiwamu caught Steve doing some (un-elogged) fooling around there. Burnt Toast -> Steve.

bt.jpg

993190663   =      free swinging ringdown restarted again

 

  5971   Mon Nov 21 17:07:34 2011 MirkoUpdateCDSc1pem model dead

For some reason C1PEM doesn't seem to work anymore after a recompilation. It did recompile fine. We just changed some channel / subsystem names.

Tried reverting to the svn version. Doesn't work. Reboot C1SUS also no good.

  5973   Mon Nov 21 22:51:55 2011 MirkoUpdateCDSc1pem model dead

Quote:

For some reason C1PEM doesn't seem to work anymore after a recompilation. It did recompile fine. We just changed some channel / subsystem names.

Tried reverting to the svn version. Doesn't work. Reboot C1SUS also no good.

 It is fine again. Thanks Jamie.

  5979   Tue Nov 22 18:15:39 2011 jamieUpdateCDSc1iscex ADC found dead. Replaced, c1iscex working again

c1iscex has not been running for a couple of days (since the power shutdown at least). I was assuming that the problem was recurrence of the c1iscex IO chassis issue from a couple weeks ago (5854).  However, upon investigation I found that the timing signals were all fine.  Instead, the IOP was reporting that it was finding now ADC, even though there is one in the chassis.

Since I had a spare ADC that was going to be used for the CyMAC, I decided to try swapping it out to see if that helped.  Sure enough, the system came up fine with the new ADC.  The IOP (c1x01) and c1scx are now both running fine.

I assume the issue before might have been caused by a failing and flaky ADC, which has now failed.  We'll need to get a new ADC for me to give back to the CyMAC.

  6002   Thu Nov 24 15:27:15 2011 kiwamuUpdateCDSc1iscey hardware rebooted
The c1iscey machine crashed around 1:00 AM last night and I did a hard-ware reboot by pressing a button on the front panel of the machine.
After the reboot its been running okay so far.
The crash happened after I pressed the "Diag Reset" button on the CDS status screen.
  6011   Fri Nov 25 22:11:12 2011 MirkoUpdateCDSBeware of fancy filter modules

[Rana, Den, Mirko]

It seems you can shoot yourself in the foot if your filter modules are too complex.

Den discovered this when looking into the C1:SUS-MC?_SUSPOS filter module named Cheby, consisting of cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501) by noticing that the coherence between input and output of the filter is low.

Cheby filter:

Cheby.png

CoherenceCheby.pdf

This is most likely due to the filter spanning more than the 16 orders of precision that the double data type spans.

The coherence is fine when one splits the filter in two, giving every cheby1 filter its own module. The coherence is also fine when you use the Cheby filter in a 2kHz system, although the freq. response looks very odd

Black: 16kHz, Red 2kHz (yes the filter was converted correctly, no text file editing there)

ChebyAt16kHzBlackand2kHzRed.png

The problem occurs on c1lsc as well as c1sus computer.

 

Looking into the foton files actually points to a precision problem, with the huge range of scale covered in there:

C1:MCS 16kHz (Cheby: Original filter with low coherence. CHbyTST & ChebyTST: Original filter split amongst two filter modules)
################################################################################
### SUS_MC3_LSC                                                              ###
################################################################################
# DESIGN   SUS_MC3_LSC 0 zpk([0],[30],0.333333,"n")
# DESIGN   SUS_MC3_LSC 1 cheby1("LowPass",6,1,12)
# DESIGN   SUS_MC3_LSC 2 cheby1("LowPass",2,0.1,3)gain(1.13501) \
#                       
# DESIGN   SUS_MC3_LSC 3 cheby1("LowPass",2,0.1,3)gain(1.13501)cheby1("LowPass",6,1,12)
# DESIGN   SUS_MC3_LSC 4 ellip("BandStop",4,1,40,16.1,16.9)ellip("BandStop",4,1,40,23.7,24.5)gain(1.25871)
###                                                                          ###
SUS_MC3_LSC 0 12 1  32768      0 30:0.0          9.942903833923793  -0.9885608209680459   0.0000000000000000  -1.0000000000000000   0.0000000000000000
SUS_MC3_LSC 1 21 3      0      0 CHbyTST     9.095012702673064e-18  -1.9978637592754149   0.9978663974923444   2.0000000000000000   1.0000000000000000
                                                                 -1.9984258494490537   0.9984376515442090   2.0000000000000000   1.0000000000000000
                                                                 -1.9994068831713223   0.9994278587363880   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 2 12 1  32768      0 ChebyTST    1.228759186937126e-06  -1.9972699801052749   0.9972743606395355   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 3 12 4  32768      0 Cheby       1.117558041371939e-23  -1.9972699801052749   0.9972743606395355   2.0000000000000000   1.0000000000000000
                                                                 -1.9978637592754149   0.9978663974923444   2.0000000000000000   1.0000000000000000
                                                                 -1.9984258494490537   0.9984376515442090   2.0000000000000000   1.0000000000000000
                                                                 -1.9994068831713223   0.9994278587363880   2.0000000000000000   1.0000000000000000
SUS_MC3_LSC 4 12 8  32768      0 BounceRoll     0.9991466189294013  -1.9996634951844035   0.9997010181703262  -1.9999611719719754   0.9999999999999997
                                                                 -1.9999303040590390   0.9999684339228864  -1.9999605309876360   0.9999999999999999
                                                                 -1.9999248796830529   0.9999668732412945  -1.9999594299327190   1.0000000000000002
                                                                 -1.9996385459838455   0.9996812069238987  -1.9999587601905868   1.0000000000000000
                                                                 -1.9996161812709703   0.9996978939989944  -1.9999163485656493   0.9999999999999999
                                                                 -1.9998855694973159   0.9999681878303275  -1.9999154056705493   0.9999999999999998
                                                                 -1.9998788577090287   0.9999671193335300  -1.9999137972442669   1.0000000000000000
                                                                 -1.9995951159123118   0.9996843310430819  -1.9999128255920269   1.0000000000000000

C1:OAF 2kHz
###############################################################################
### YARM_IN                                                                  ###
################################################################################
# DESIGN   YARM_IN 0 zpk([0],[30],0.333333,"n")
# DESIGN   YARM_IN 3 cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501)
# DESIGN   YARM_IN 4 ellip("BandStop",4,1,40,16.1,16.9)ellip("BandStop",4,1,40,23.7,24.5)gain(1.25871)
# DESIGN   YARM_IN 8 cheby1("LowPass",6,1,12)cheby1("LowPass",2,0.1,3)gain(1.13501)zpk([],[10],1,"n")
###                                                                          ###
YARM_IN  0 12 1   4096      0 30:0.0           9.56649943398763  -0.9119509539166185   0.0000000000000000  -1.0000000000000000   0.0000000000000000
YARM_IN  3 12 4   4096      0 Cheby       1.829878084970283e-16  -1.9828889048300398   0.9830565293861987   2.0000000000000000   1.0000000000000000
                                                                 -1.9868188576622443   0.9875701115261976   2.0000000000000000   1.0000000000000000
                                                                 -1.9940934073784453   0.9954330165532327   2.0000000000000000   1.0000000000000000
                                                                 -1.9781245722853238   0.9784022621062476   2.0000000000000000   1.0000000000000000

  6013   Sat Nov 26 02:05:43 2011 MirkoUpdateCDSBeware of fancy filter modules

 

We replaced the complicated Cheby filter module with three separate filter modules. Probably the filter doesn't need to be so complicated, but rather not change too many things at once. The new filter modules are called:
Ch1, Ch2, Ch3 and are in filter module 3,9, and 10 of the C1:SUS-MC?_SUSPOS filters. The coherence with these filters is fine. Someone should look into the possibility of simplifying these filters.

It would be good to check for numerical problems in other filters!

  6017   Sat Nov 26 10:55:40 2011 ranaUpdateCDSBeware of fancy filter modules

 

 Could be that what we're seeing is the noise floor of the Direct Form II filter structure (see Matt's 2008 elog) which shows an example (also see G0900928-v1 ).

 

  6020   Mon Nov 28 06:53:30 2011 kiwamuUpdateCDSc1sus shutdown

I have restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after a little bit of taking care of the interferomter.

Quote from #6016

c1sus has been shutdown so that the optics dont bang around.  This is because the watch dogs are not working.

  6026   Mon Nov 28 16:46:55 2011 kiwamuUpdateCDSc1sus is now up

I have restarted the c1sus machine and burt-restored c1sus and c1mcs to the day before Thank giving, namely 23rd of November.

Quote from #6020

I have restarted the c1sus machine around 9:00 PM yesterday and then shut it down around 4:00 AM this morning after a little bit of taking care of the interferometer.

  6030   Mon Nov 28 19:24:51 2011 JenneUpdateCDSBeware of fancy filter modules

[Rana, Jenne]

Some of the funniness is some kind of mysterious interaction between 2 filter modules in the filter banks.  Just FM1 (30:0.0) or just FM4 (Cheby, which is 2 cheby1's) has reasonable coherence.  Both FM1 and FM4 together doesn't do so well - the coherence goes way down.

Just FM1 (30:0.0)

SUSPOS_ETMY_30and0_measured_vs_idealTF.pdf

Just FM4 (Cheby)

SUSPOS_ETMY_Cheby_measured_vs_idealTF.pdf

 Both FM1 and FM4

SUSPOS_ETMY_30and0andCheby_measured_vs_idealTF.pdf

 All the coherences plotted together

SUSPOS_ETMY_30and0andCheby_compareCoherence.pdf

You'd think that the signal encounters FM1, gets filtered, and that result is the signal sent to the next active filter module, FM4, so the 2 filter modules shouldn't interact.  But clearly there's some funny business here since engaging both makes things crappy. 

Matlab investigations to replicate this behavior offline are in progress.

  6031   Mon Nov 28 22:09:24 2011 ranaUpdateCDSBeware of fancy filter modules

To see what might be causing the problem, I used a version of the filter noise test matlab code that Matt had in the elog.

To see if it was a single precision problem, I just recast the input data:   x = single(x)

This is not strictly correct, since some of the rest of the operations are as double precision, but I think that attached plot shows that a casting from double to single is close to the right amount of noise to explain our excess noise problem in the 0.1-1 Hz region.

Den is going to interview Alex to find out if we have some kind of issue like this. My understanding was that all of our filter module calculations were being done in double precision (64 bit), but its possible that some single stuff has crept back in. Currently the FIR filtering code IS single precision and in the past, the SUS code which didn't carry the LSC signals (meaning ASC and damping) were done in single precision.

  6033   Tue Nov 29 04:47:49 2011 kiwamuUpdateCDSc1sus shut down again

I have shut down the c1sus machine at 3:30 AM.

  6037   Tue Nov 29 15:30:01 2011 jamieUpdateCDSlocation of currently used filter function

So I tracked down where the currently-used filter function code is defined (the following is all relative to /opt/rtcds/caltech/c1/core/release):

Looking at one of the generated front-end C source codes (src/fe/c1lsc/c1lsc.c) it looks like the relevant filter function is:

filterModuleD()

which is defined in:

src/include/drv/fm10Gen.c

and an associated header file is:

src/include/fm10Gen.h
  6038   Tue Nov 29 15:57:43 2011 DenUpdateCDSlocation of currently used filter function

 

We are interested in the following question : Can the structures defined in fm10Gen.h (or some other *.c *.h files with defined as FLOAT variables) create single precision instead of double in the filter calculations?

 

typedef struct FM_OP_IN{
  UINT32 opSwitchE;     /* Epics Switch Control Register; 28/32 bits used*/
  UINT32 opSwitchP;     /* PIII Switch Control Register; 28/32 bits used*/
  UINT32 rset;          /* reset switches */
  float offset;         /* signal offset */
  float outgain;        /* module gain */
  float limiter;        /* used to limit the filter output to +/- limit val */
  int rmpcmp[FILTERS];  /* ramp counts: ramps on a filter for type 2 output*/
                        /* comparison limit: compare limit for type 3 output*/
                        /* not used for type 1 output filter */
  int timeout[FILTERS]; /* used to timeout wait in type 3 output filter */
  int cnt[FILTERS];     /* used to keep track of up and down cnt of rmpcmp */
                        /* should be initialized to zero */
  float gain_ramp_time; /* gain change ramping time in seconds */
} FM_OP_IN;  

 

  6042   Tue Nov 29 18:54:29 2011 kiwamuUpdateCDSc1sus machine up

[Zach / Kiwamu]

 Woke up the c1sus machine in order to lock PSL to MC so that we can observe the effect of not having the EOM heater.

  6048   Wed Nov 30 01:35:49 2011 JenneUpdateCDSOSEM noise / nullstream and what does it mean for satellites

I'm picking points off of this no-magnet OSEM plot, and I thought I'd write them down somewhere so I don't have to do it again when I lose my sticky note...

1e-2 Hz        1.05e-2 um/rtHz

1e-1 Hz        3.4e-3 um/rtHz

1 Hz            1.3e-3 um/rtHz

10 Hz          2.5e-4 um/rtHz

60 Hz          7.5e-5 um/rtHz

100 Hz        7e-5 um/rtHz

400 Hz        7e-5 um/rtHz

  6049   Wed Nov 30 02:04:26 2011 rana, den, jenne, kiwamu, jzweizigUpdateCDSFiltering Noise issue tracked down ???

 You can read through all of our past tests to see what didn't work in tracking things down. As Den mentions, there was actually a lot of evidence that there was some double->single precision action in the filter calculation causing the noise we saw.

However, it turns out that this is NOT the case.

This afternoon I was so confused that I enlisted JZ to help us out. He came over and I tried to replicate the error. When looking at the time series, we noticed that it wasn't random noise; the signals seem to be getting clipped as they crossed zero. Sort of like a stiction problem. JZ left to go replicate the error on an offline system.

This turned out to be the important clue. As we examine the code we find this inside of fm10Gen.c:

if((new_hist < 1e-20) && (new_hist > -1e-20)) new_hist = new_hist<0 ? -1e-20: 1e-20;

this is line is basically trapping the filter history at 1e-20, to prevent some kind of numerical underflow problem (?). Seems reasonable, except that some filters which have higher order low passing in them actually have an overall scale factor which can be small (even as small as 1e-23, as Den pointed out).

So the reason we saw such weird behavior is that the first filter in SUSPOS is an AC coupling filter. This takes the OSEM signal and remove the large mean value. Then the next filter multiplies it by 1e-23 before doing the filtering and you end up with this noise in the filter history.

I looked and this line is commented out in the new BiQuad code, but as far as I can tell this issue has been around in aLIGO, eLIGO, iLIGO, etc. for a long time and could have been causing many cases of excess noise whenever we ended up a tiny gain factor in an IIR filter. At the 40m, there are easily a hundred such cases.

For now, I suppose we can just change this number to 1e-40 or so. I don't know how to calculate what the right number should be. Not sure why this underflow is not an issue for the BiQuad, however.

  6051   Wed Nov 30 11:04:26 2011 josephbUpdateCDSFiltering Noise issue tracked down ???

Quote:

For now, I suppose we can just change this number to 1e-40 or so. I don't know how to calculate what the right number should be. Not sure why this underflow is not an issue for the BiQuad, however.

According to the RCG SVN logs, the reason it was removed was a more general change done to the compiled code, not specific to just the biquad.  Basically, the ability to have an underflow number (subnormal) has been turned off completely by having any number that underflows set to zero. I'm not positive, but from a quick search looks that the smallest number before hitting is an underflow as a double is 2.2250738585072014e-308.

Alex's entry from the SVN log for 2663:

Added new fz_daz() function to turn on two bits in the FPU SSE control register.
Bits FZ (flush underflows to zero) and DOZ (denorms are zeros) are set to
avoid runaway code on float/double denorms (really small numbers).
Ref: http://software.intel.com/en-us/articles/how-to-avoid-performance-penalties-for-gradual-underflow-behavior/

SVN log 2664:

Removed +- 1e-20 limiting code, this is taken care of by setting FZ/DOZ bits
in the CPU SEE control register (see mathInline.h)

SVN log 2665:

Kill the underflows and roll down float denorms to zero,
see fz_doz() in mathInline.h.

  6052   Wed Nov 30 11:36:12 2011 DenUpdateCDSFiltering Noise issue tracked down ???

Quote:

 if((new_hist < 1e-20) && (new_hist > -1e-20)) new_hist = new_hist<0 ? -1e-20: 1e-20;

20 is indeed a random number. We can change it to 300. Alex said that during that iir filter calculations sometimes numbers are very small and if they are less then 1e-308 then a very slow code in the processor is executed and this will crash the online system. For single precision this number is 1e-38 and may be 10 years ago it was not decided for sure what to use - float or double. 20 will be "OK" for both but as we can see causes other problems.

Anyway, Alex removed this line from the code and added another code that sets the two proper bits in the MXCSR register and prohibits to the CPU to run the slow code. As far as I understand if the numbers are less then 1e-308 they become 0. Roughly, this is equivalent to 

 if((new_hist < 1e-308) && (new_hist > -1e-308)) new_hist = 0;

This is in 2.4 release. It is in the svn. I think we can install it and figure out if the problem is gone.

 

  6091   Thu Dec 8 19:48:23 2011 kiwamuUpdateCDSrestarted c1lsc machine and daqd

Since the c1lsc machine became frozen I restarted the c1lsc machined and daqd.

Then I burtrestored c1lsc, c1ass and c1oaf to this evening. They seem running okay.

  6095   Fri Dec 9 15:14:41 2011 DenUpdateCDSrelease 2.4

Alex has created a 2.4 branch of the RCD. Jamie, we can try to compile and install it. As a test a did it for c1oaf, it compiles, installs and runs once variables SITE, IFO, RCD_LIBRARY_PATH are properly defined. As we do not want to run one model at 2.4 code and others at 2.1, I recompiled c1oaf back to 2.1. Jamie, please, let me know when you are ready to upgrade to 2.4 release.

  6106   Mon Dec 12 13:02:08 2011 kiwamuUpdateCDSdaqd restarted

I have restarted the daqd process at 1:01 PM since I have added some new ALS's daq channels.

  6124   Thu Dec 15 11:47:43 2011 jamieUpdateCDSRTS UPGRADE IN PROGRESS

I'm now in the middle of upgrading the RTS to version 2.4.

All RTS systems will be down until futher notice...

  6125   Thu Dec 15 22:22:18 2011 jamieUpdateCDSRTS upgrade aborted; restored to previous settings

Unfortunately, after working on it all day, I had to abort the upgrade and revert the system back to yesterday's state.

I think I got most of the upgrade working, but for some reason I could never get the new models to talk to the framebuilder.  Unfortunately, since the upgrade procedure isn't document anywhere, it was really a fly by the seat of my pants thing.  I got some help from Joe, which got me through one road block, but I ultimately got stumped.

I'll try to post a longer log later about what exactly I went through.

In any event, the system is back to the state is was in yesterday, and everything seems to be working.

  6149   Mon Dec 26 12:04:41 2011 kiwamuUpdateCDSc1gcy.ini hand edited

I have edited c1scx.ini by hand in order to acquire some green locking related channels.

Somehow c1sus.ini, c1mcs.ini, c1scx.ini and c1scy.ini are not accessible via the daqconfig script.

As far as I remember it had been accessible via daqconfig a week ago when I edited c1scy.ini.

Anyway I had to edit it by hand. They need to be fixed at some point

  6173   Thu Jan 5 09:59:27 2012 JamieUpdateCDSRTS/RCG/DAQ UPGRADE TO COMMENCE

RTS/RCG/DAQ UPGRADE TO COMMENCE

I will be attempting (again) to upgrade the RTS, including the RCG and the daqd, to version 2.4 today.  The RTS will be offline until further notice.

  6174   Thu Jan 5 20:40:21 2012 JamieUpdateCDSRTS upgrade aborted; restored to previous settings; fb symmetricom card failing?

After running into more problems with the upgrade, I eventually decided to abort todays upgrade attempt, and revert back to where we were this morning (RTS 2.1).  I'll try to follow this with a fuller report explaining what problems I encountered when attempting the upgrade.

However, when Alex and I were trying to figure out what was going wrong in the upgrade, it appears that the fb symmetricom card lost the ability to sync with the GPS receiver.  When the symmeticom module is loaded, dmesg shows the following:

[  285.591880] Symmetricom GPS card on bus 6; device 0
[  285.591887] PIC BASE 2 address = fc1ff800
[  285.591924] Remapped 0x17e2800
[  285.591932] Current time 947125171s 94264us 800ns 
[  285.591940] Current time 947125171s 94272us 600ns 
[  285.591947] Current time 947125171s 94280us 200ns 
[  285.591955] Current time 947125171s 94287us 700ns 
[  285.591963] Current time 947125171s 94295us 800ns 
[  285.591970] Current time 947125171s 94303us 300ns 
[  285.591978] Current time 947125171s 94310us 800ns 
[  285.591985] Current time 947125171s 94318us 300ns 
[  285.591993] Current time 947125171s 94325us 800ns 
[  285.592001] Current time 947125171s 94333us 900ns 
[  285.592005] Flywheeling, unlocked...

Because of this, the daqd doesn't get the proper timing signal, and consequently is out of sync with the timing from the models.

It's completely unclear what caused this to happen.  The card seemed to be working all day today, then Alex and I were trying to debug some other(maybe?) timing issues and the symmetricom card all of a sudden stopped syncing to the GPS.  We tried rebooting the frame builder and even tried pulling all the power to the machine, but it never came back up.  We checked the GPS signal itself and to the extend that we know what that signal is supposed to look like it looked ok.

I speculate that this is also the cause of the problems were were seeing earlier in the week.  Maybe the symmetricom card has just been acting flaky, and something we did pushed it over the edge.

Anyway, we will try to replace it tomorrow, but Alex is skeptical that we have a replacement of this same card.  There may be a newer Spectracom card we can use, but there may be problems using it on the old sun hardware that the fb is currently running on.  We'll see.

In the mean time, the daqd is running rogue, off of it's own timing.  Surprisingly all of the models are currently showing 0x0 status, which means no problems.  It doesn't seem to be recording any data, though.  Hopefully we'll get it all sorted out tomorrow.

  6175   Fri Jan 6 01:00:56 2012 kiwamuUpdateCDSc1scx out of sync

Both the c1scx and its IOP realtime processes became out of sync.

Initially I found that the c1scx didn't show any ADC signals, though the sync sign was green.

Then I software-rebooted the c1iscex machine and then it became out of sync.

For tonight this is fine because I am concentrating on the central part anyway.

  6176   Fri Jan 6 11:49:13 2012 JamieUpdateCDSframebuilder taken offline to diagnose problem with symmetricom timing card

Alex and I have taken the framebuilder offline to try to see what's wrong with the symmetricom card.  We have removed the card from the chassis and Alex has taken it back to downs to do some more debugging.

We have been formulating some alternate methods to get timing to the fb in case we can't end up getting the card working.

  6177   Fri Jan 6 14:31:54 2012 JamieUpdateCDSframebuilder back online, using NTP time syncronization

The framebuilder is back online now, minus it's symmetricom GPS card.  The card seems to have failed entirely, and was not able to be made to work at downs either.  It has been entirely removed from fb.

As a fall back, the system has been made to work off of the system NTP-based time synchronization.  The latest symmetricom driver, which is part of the RCG 2.4 branch, will fall back to using local time if the GPS synchronization fails.  The new driver was compiled from our local checkout of the 2.4 source in the new to-be-used-in-the-future rtscore directory:

controls@fb ~ 0$ diff {/opt/rtcds/rtscore/branches/branch-2.4/src/drv/symmetricom,/lib/modules/2.6.34.1/kernel/drivers/symmetricom}/symmetricom.ko
controls@fb ~ 0$ 

The driver was reloaded.   daqd was also linked against the last running stable version and restarted:

controls@fb ~ 0$ ls -al $(readlink -f /opt/rtcds/caltech/c1/target/fb/daqd)
-rwxr-xr-x 1 controls controls 6592694 Dec 15 21:09 /opt/rtcds/caltech/c1/target/fb/daqd.20120104
controls@fb ~ 0$ 

We'll have to keep an eye on the system, to see that it continues to record data properly, and that the fb and the front-ends remain in sync.

The question now is what do we do moving forward.  CDS is not supporting the symmetricom cards anymore, and have moved to using Spectracom GPS/IRIG-B cards.  However, Downs has neither at the moment.  Even if we get a new Spectracom card, it might not work in this older Sun hardware, in which case we might need to consider upgrading the framebuilder to a new machine (one supported by CDS).

  6179   Fri Jan 6 20:10:48 2012 ranaUpdateCDSframebuilder back online, using NTP time syncronization

 

 You (Jamie) should talk with Rolf and Alex to find out what framebuilder they will support for > 3 years. Then we should buy that along with the adapter card which allows us to use the same RAID we now have for the frames.

  6182   Mon Jan 9 23:52:15 2012 kiwamuUpdateCDSSUS channels not accessible from dataviewer

[John / Kiwamu]

 We found that some of the suspensions channels (for example C1:SUS-BS_POS_IN1 and etc) were not accessible from dataviewer for some reasons.

So far it seems none of the channels associated with c1sus are accessible from dataviewer.

  6204   Tue Jan 17 02:44:59 2012 kiwamuUpdateCDSawg not working

AWG is not working. This needs to be fixed.

I could set the channel and the parameters in the AWGGUI screen, but it never inject signals to the realtime system.

  6207   Tue Jan 17 16:09:20 2012 kiwamuUpdateCDSawg not working on the c1sus machine

Actually awg works fine without any problems when the excitation channels belong to the c1lsc machine.

It seems that the awg doesn't inject signals on the channels of the c1sus machine, for example C1:SUS-BS_LSC_EXC and so on.

Quote from #6204

AWG is not working. This needs to be fixed.

ELOG V3.1.3-