ID   Date   Author   Type   Category   Subject
  4772   Tue May 31 14:29:00 2011   jzweizig   Update   CDS   frames

There seems to be something strange going on with the 40m frame builder.
Specifically, there is a gap in the frames in /frames/full near the start of
each 100k second subdirectory. For example, frames for the following times are missing:

990200042-990200669
990300045-990300492
990400044-990400800
990500032-990500635
990600044-990600725
990700037-990700704
990800032-990800677
990900037-990900719


To summarize, after writing the first two frames in a data directory, the next ~10 minutes of frames are usually missing. To make matters worse (for
the nds2 frame finder, at least) the first frame after the gap (and all successive frames) start at an arbitrary time, usually not aligned to a 16-second boundary. Is there something about the change of directories that is causing the frame builder to crash? Or is the platform/cache disk too slow to complete the directory switch-over without loss of data?

  4773   Tue May 31 15:45:37 2011   Jamie   Update   CDS   c1iscey IO chassis powered off for some reason. Repowered.

We found that both of the c1iscey models (c1x05 and c1scy) were unresponsive, and weren't coming back up even after a reboot.  We then found that the c1iscey IO chassis was actually powered off.  Steve accepts some sort of responsibility, since he was monkeying around down there for some reason.  After powering it up and rebooting, everything is running again.

  4776   Wed Jun 1 11:31:50 2011   josephb   Update   CDS   MC1 LR digital reading close to zero, readback ~0.7 volts

There appears to be a bad cable connection somewhere on the LR sensor path for the MC1 optic.

The channel C1:SUS-MC1_LRPDMon is reading back 0.664 volts, but the digital sensor channel, C1:SUS-MC1_LRSEN_INMON, is reading about -16.  This should be closer to +1000 or so.

We've temporarily turned off the LRSEN filter module output while this is being looked into.

I briefly went out and checked the cables around the whitening and AA boards for the suspension sensors, wiggling them and making sure everything was plugged in solidly, but the problem persisted.  There was one semi-loose connection, though not on the MC1 board; I pushed it all the way in anyway.  The monitor point on the AA board looks correct for the LR channels, although ITMX LR struck me as very low, at about -0.05 volts.

According to data viewer, the MC1 LR sensor channel went bad roughly two weeks ago, around 00:40 on 5/18 UTC, or 17:40 on 5/17 PDT.

 

UPDATE:

It appears the AA board (or possibly the SCSI cable connected to it) is the problem in the chain.

  4781   Thu Jun 2 16:31:41 2011   Jamie   Update   CDS   acquired SUS channel name suffixes changed from _DAQ to _DQ

CDS changed the suffix for all acquired channel names from _DAQ to _DQ.  When we rebuilt the sus models, as described in the previous log, the channel names were changed and consequently the channel files were completely rewritten.

To fix the issue, the latest archived channel file was copied back into the chans directory, and the suffixes were changed, as follows:

cd /opt/rtcds/caltech/c1/chans
cp archive/C1SUS_110602_155403.ini  C1SUS.ini
sed -i 's/DAQ/DQ/g' C1SUS.ini
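A quick sanity check that the substitution actually took, for example:

grep -c _DQ C1SUS.ini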

We then restarted the models and the framebuilder.

  4790   Mon Jun 6 18:29:01 2011   Jamie, Joe   Update   CDS   COMPLETE FRONT-END REBUILD (WITH PROBLEMS (fixed))

Today Joe and I undertook a FULL rebuild of all front end systems with the head of the 2.1 branch of the RCG.  Here is the full report of what we did:

  1. checked out advLigoRTS/branches/branch-2.1, r2457 into core/branches/branch-2.1
  2. linked core/release to branches/branch-2.1
  3. linked models into core/release/src/epics/simLink using Joe's new script (userapps/release/cds/c1/scripts/link_userapps)
  4. removed unused/out-of-date models:

     c1dafi.mdl
     c1lsp.mdl
     c1gpv.mdl
     c1sup_vertex_plant_shmem.mdl

  5. modified core/release/Makefile so that it can find the models:

     --- Makefile	(revision 2451)
     +++ Makefile	(working copy)
     @@ -346,7 +346,7 @@
      #MDL_MODELS = x1cdst1 x1isiham x1isiitmx x1iss x1lsc x1omc1 x1psl x1susetmx x1susetmy x1susitmx x1susitmy x1susquad1 x1susquad2 x1susquad3 x1susquad4 x1x12 x1x13 x1x14 x1x15 x1x16 x1x20 x1x21 x1x22 x1x23

      #MDL_MODELS = $(wildcard src/epics/simLink/l1*.mdl)
     -MDL_MODELS = $(shell cd src/epics/simLink; ls m1*.mdl | sed 's/.mdl//')
     +MDL_MODELS = $(shell cd src/epics/simLink; ls c1*.mdl | sed 's/.mdl//')

      World: $(MDL_MODELS)
      showWorld:

  6. removed channel files for models that we know will be renumbered
     • For this rebuild we are also building modified sus models, which now use libraries, so the channel numbering is changing.
  7. make World
     • this makes all the models
  8. make installWorld
     • this installs all the models
  9. ran the activateDQ.py script to activate all the relevant channels
     • this script was modified to handle the new "_DQ" channels
  10. made and installed the new awgtpman:

      cd src/gds
      make
      cp awgtpman /opt/rtcds/caltech/c1/target/gds/bin

  11. turned off all watchdogs
  12. test-restarted one front end: c1iscex
  13. BIG PROBLEM

      The c1iscex models (c1x01 and c1scx) did not come back up.  c1x01 was running long on every cycle, until the model crashed and brought down the computer.  After many hours, and with Alex's help, we managed to track down the issue to a patch from Rolf at r2361.  The code included in that patch should have been wrapped in an "#ifndef RFM_DIRECT_READ".  This was fixed and committed to branches/branch-2.1 at r2460 and to trunk at r2461.

  14. updated core/branches/branch-2.1 to r2460
  15. make World && make installWorld with the new fixed code
  16. restarted all computers
  17. restarted the frame builder
  18. burt restored to 8am this morning
  19. turned on all watchdogs

Everything is now green, and things seem to be working.  Mode cleaner is locked.  X arm locked.

 

  4800   Thu Jun 9 16:18:03 2011   josephb   Update   CDS   Second trends only go back 12 days

While answering a quick question from Kiwamu, I noticed we only had second trends going back to GPS time 990500000, May 27th 2011.

Trends (I thought) were intended to be kept forever, and certainly longer than full data, which currently goes back several months.

Jamie will need to look into this.

  4801   Thu Jun 9 18:25:22 2011   kiwamu   HowTo   CDS   look back at a channel which doesn't exist any more

For some purposes I needed to look back at the data of some channels which don't exist any more.  Here I explain how to do it.

If this method is not listed on the wiki, I will put this instruction on a wiki page.

 

(How to)

   (1) Edit an "ini" file which is not associated with the real-time control (e.g. IOP_SLOW.ini)

   (2) In the file, write a channel name which you are interested in. The channel name should be bracketed like the other existing channels.

               example:  [C1:LSC-REFL11_I_OUT_DAQ]

   (3) Define the data rate. If you want to look at the full data, write

              datarate = 2048

        just below each channel name.  (A complete example stanza is shown after this list.)

        Or if you want to look at only the trends, don't write anything.

   (4) Save the ini file and restart fb. If necessary hit "DAQ Reload" button on a C1:AAA_GDS_TP.adl screen to make the indicators green.

   (5) Now you should be able to look at the data, with dataviewer for example.

   (6) After you finish the job, don't forget to clean up the lines that you put in the ini file, because otherwise the channel will always show up in the channel list in dtt and is just confusing.

        Also don't forget to restart fb to reflect the change.
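Putting steps (2) and (3) together, the lines added to the ini file would look something like this (using the example channel from above, full-data case):

[C1:LSC-REFL11_I_OUT_DAQ]
datarate = 2048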

  4803   Fri Jun 10 12:02:10 2011   rana   Update   CDS   Second trends only go back 12 days

Quote:

While answering a quick question from Kiwamu, I noticed we only had second trends going back to GPS time 990500000, May 27th 2011.

Trends (I thought) were intended to be kept forever, and certainly longer than full data, which currently goes back several months.

Jamie will need to look into this.

 Our concept is to keep second trends for 1-2 months and minute trends forever. The scheme that Alan had worked out many years ago had it such that we could look back to 1998 and that the minute trends would be backed up somehow.

If it's not working, we need to get Alan's help to recover the previous configuration.

  4809   Mon Jun 13 15:33:55 2011   Jamie, Joe   Update   CDS   Dolphin fiber between 1Y3 and 1X4 appears to be dead

The fiber that connects the Dolphin card in the c1lsc machine (in the 1Y3 rack) to the Dolphin switch in the 1X4 rack appears to have died spontaneously this morning.  This was indicated by loss of Dolphin communication between c1lsc and c1sus.

We definitively tracked it down to the fiber by moving the c1lsc machine over to 1X4 and testing the connection with a short cable.  This worked fine.  Moving it back to using the fiber again failed.

Unfortunately, we have no replacement Dolphin fiber.  As a workaround, we are stealing a long computer->IO chassis cable from Downs and moving the c1lsc machine to 1X4.

This will be a permanent reconfiguration.  The original plan was to have the c1lsc machine also live in 1X4.  The new setup will put the computer farther from the RF electronics, and more closely mimic the configuration at the sites, both of which are good things.

  4811   Mon Jun 13 18:40:08 2011   Jamie, Joe   Update   CDS   Snags in the repair of LSC CDS

We've run into a problem with our attempts to get the LSC control back up and running.

As reported previously, the Dolphin fiber connection between c1lsc and c1sus appears to be dead.  Since we have no replacement fiber, the solution was to move the c1lsc machine into the 1X4 rack, which would allow us to use one of the many available short Dolphin cables, and then use a long fiber PCIe extension cable to connect c1lsc to its IO chassis.  However, the long PCIe extension cable we got from Downs does not appear to work with our setup.  We tested the cable with c1sus, and it does not work there either.

We've run out of options today.  Tomorrow we're going to head back to Downs to see if we can find a cable that at least works with the test-stand setup they have there.

  4812   Mon Jun 13 19:26:42 2011   Jamie, Joe   Configuration   CDS   SUS binary IO chassis 2 and 3 moved from 1X5 to 1X4

While prepping 1X4 for the installation of c1lsc, we removed some old VME crates that were no longer in use.  This freed up lots of space in 1X4.  We then moved the SUS binary IO chassis 2 and 3, which plug into the 1X4 cross-connect, from 1X5 into the newly freed space in 1X4.  This makes the cable run from these modules to the cross-connect much cleaner.

  4815   Tue Jun 14 09:25:17 2011   Jamie   Update   CDS   Dolphin fiber tested with c1sus, still bad

The bad Dolphin fiber was still bad when tested with a connection between c1sus and the Dolphin switch.

I'm headed over to Downs to see if we can resolve the issue with the PCIe extension fiber.

  4816   Tue Jun 14 12:23:44 2011   Jamie, Joe   Update   CDS   WE ARE ALL GREEN! LSC back up and running in new configuration.

After moving the c1lsc computer to 1X4, then connecting c1lsc to its IO chassis in 1Y3 by a fiber PCIe extension cable, everything is back up and running and the status screen is all green.  c1lsc is now directly connected to c1sus via a short copper Dolphin cable.

After lunch we will do some more extensive testing of the system to make sure everything is working as expected.

  4837   Mon Jun 20 09:28:19 2011   Jamie   Update   CDS   Shutting down low-voltage DC power in 1X1/1X2 racks

In order to install the BO module in 1X2, I need to shut down all DC power to the 1X1 and 1X2 racks.

  4838   Mon Jun 20 10:45:43 2011   Jamie   Update   CDS   Power restored to 1X1/1X2 racks. IOO binary output module installed.

All power has been restored to the 1X1 and 1X2 racks.  The modecleaner is locked again.

I have also hooked up the binary output module in 1X2, which was never actually powered.  This controls the whitening filters for MC WFS.  Still needs to be tested.

  4843   Mon Jun 20 17:58:00 2011   rana   Update   CDS   Gateway program killed

There was a rogue, undocumented gateway process running on NODUS since ~4 PM.  This guy was broadcasting channels back into the Martian network and causing lockups in the IOO controls.  I did a kill -9 on its process.

Someone will pay for this.

  4867   Thu Jun 23 21:34:21 2011   kiwamu   Update   CDS   no foton on the CentOS machines

For some reason, foton's default sample rate is NOT correct when it runs on the CentOS machines.

It tries to set the sample rate to 2048 Hz instead of 16384 Hz until you specify the frequency.

To avoid an accidental change of the sample rate,

running foton on CentOS is forbidden until further notice.

Run foton only on Pianosa.

 

Additionally, I added an alias to cshrc.40m such that people cannot run foton on CentOS (under csh and tcsh, technically speaking).
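The alias is essentially a one-liner along these lines (a sketch; the exact line in cshrc.40m may differ slightly):

alias foton 'echo "DO NOT use foton on CentOS"'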

Below is an example of the raw output when I typed foton on a CentOS machine.

    rossa:caltech>foton
    DO NOT use foton on CentOS

  4871   Thu Jun 23 22:53:02 2011   kiwamu   Update   CDS   ran activateDQ.py

I found that some DQ channels (e.g. SENSOR_UL, etc.) for C1SUS hadn't been activated, so I ran activateDQ.py.

Then I restarted daqd on fb as usual.  So far the DQ channels appear to be working fine.

  4881   Fri Jun 24 22:35:23 2011   rana   Configuration   CDS   dataviewer broken on pianosa

When I try to get a minute trend, it says "word too long".

  4886   Sun Jun 26 16:17:22 2011   rana   Update   CDS   diagonalization of MC input matrix

I have updated the scripts/SUS/peakFit/ directory so that it now finds the SUS input matrix coefficients in addition to just finding the free-swinging peaks.

Procedure:

  1. Get OSEM sensor data via NDS2 from a time when the optics have been kicked and then left free swinging.
  2. Downsample the data to 64 Hz and save.
  3. Make power spectra with a 1 mHz resolution (i.e. we need a few hours of data) and  ~10 averages.
  4. Use the fminsearch Lorentzian peak fitter -> save the peak frequencies
  5. Make Transfer Function estimate matrix at the peak frequencies between all OSEMs (this makes a 5x4 complex matrix)
  6. The matrix should be real, so make sure it's mostly real and then take the real part only
  7. Normalize (height of biggest peak for each f_DOF should be 1)
  8. Add a Butterfly mode vector. This makes the sensing matrix go from 5x4 to 5x5. (Butterfly a.k.a. Pringle)
  9. Invert (steps 6-10 are sketched in code just after this list)
  10. Normalize so that the biggest element in each Sensor2DOF column is 1.
  11. Load values into MEDM screen and then verify by another free swinging data run.
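Here is a minimal Matlab sketch of steps 6-10 (variable names are hypothetical; tf is the 5x4 complex transfer-function matrix from step 5, with sensor order UL, UR, LR, LL, SD assumed):

% sketch of steps 6-10; 'tf' is the 5x4 complex TF matrix from step 5
sensMat = real(tf);                                      % step 6: keep only the real part
sensMat = bsxfun(@rdivide, sensMat, max(abs(sensMat)));  % step 7: biggest peak per DOF -> 1
butter  = [1; -1; 1; -1; 0];   % step 8: butterfly (pringle) vector (sign pattern assumed)
sensMat = [sensMat butter];    % sensing matrix is now 5x5
inMat   = inv(sensMat);                                  % step 9: invert
inMat   = bsxfun(@rdivide, inMat, max(abs(inMat)));      % step 10: biggest element per column -> 1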

 

The attached PDF shows how much rejection of the unwanted DOFs we get between the existing diagonal input matrix and this new empirical matrix. Previously, the decoupling was only a factor of a few for some of the modes. Now the decoupling is more like orders of magnitude (at least according to this calculation). It will be worse when we load it and then try another free swinging run. However, the fact that the suppression can be this good means that the variation in the coefficients at the ~hours time scale is at least this small (~< 0.1%)

 

That's the basic procedure, but there are a lot of important but mainly technical details:

  1. Free swinging data must be taken with the angle bias ON. Otherwise, we are not measuring the correct sensing gain (i.e. the magnets are not in their nominal place within the OSEM-LED beam)
  2. Data must be checked so that the shadow sensor outputs are in their linear regime: if they are exploring the cubic part, then the fundamental is being suppressed.
  3. Instead of just using the peak frequency, I average a few points around the peak to get better SNR before inversion. I think this will make the results more stable.
  4. All previous input matrix diagonalization efforts (Buckley, Sakata & Kawamura, Black, Barton, Gonzalez, Adhikari & Lawrence, Saulson,...) for the past ~15 years have been using the spectra's peak height data. Today's technique uses the TF and so is more precise. The coherent transfer function is always better than just using the magnitude data.
  5. This method is now fairly automatic - there's no human intervention in fudging values, choosing peak heights, frequencies, etc.
  6. We'll have to rerun this, of course, after the mirrors are aligned and after the OSEM whitening fiasco is cleaned up somewhat.

I'll set the optics to be aligned and then swing tonight.

Attachment 1: inMatDiag.pdf
  4888   Sun Jun 26 22:38:20 2011   rana   Update   CDS   MC1 LR dead for > 1 month; now revived temporarily

Since the MC1 LRSEN channel wasn't working, my input matrix diagonalization wasn't working today either.  So I decided to fix it somehow.

I went to the rack and traced the signal: first at the LEMO monitor on the whitening card, second at the 4-pin LEMO cable which goes into the AA chassis.

The signal existed at the input to the AA chassis but not on the screen.  So I pressed down the jumper wire (where the AA filter used to be) for the channel corresponding to the MC1 LRSEN channel.

It now has come back and looks like the other sensors. As you can see from this plot and Joe's entry from a couple weeks ago, this channel has been dead since May 17th.

The ELOG reveals that Kiwamu caught Steve doing some (un-elogged) fooling around there. Burnt Toast -> Steve.

bt.jpg

993190663 = free swinging ringdown restarted again

Attachment 1: lrsen.png
  4889   Mon Jun 27 00:23:11 2011   rana   Update   CDS   ETMX SIDE problem

The slow readback of the ETMX side sensor also seems to have something flaky and bi-stable about it.  This is not an issue for damping, but it disables the SIDE watchdog for ETMX and makes it unsafe if we accidentally use the wrong damping sign.

Attachment 1: etmx-side.png
  4918   Thu Jun 30 06:54:07 2011   josephb   Update   CDS   Modified the automated scripts for producing model webviews

Dave Barker pointed out last week that the webview of our simulink model files, generated from the installed models (i.e. in /opt/rtcds/caltech/c1/target/<system name>/simLink/) was not handling libraries properly.  Essentially the web pages generated couldn't see inside library parts.

This was caused by two problems.  The first was that the userapps directories were not in the Matlab path when the slwebview call was done, so it couldn't even find the libraries.  The second was that the slwebview code by default doesn't follow libraries and links, and needs a special option to be told to do so.

I added the following lines to the webview_simlink_update.m file:

addpath('/opt/rtcds/caltech/c1/core/trunk/src/epics/simLink/lib')
for sub = {'cds','isc','isi','sus','psl'}
 for spath = {'common/models','c1/models/lib'}
   addpath(['/opt/rtcds/caltech/c1/userapps/release/' sub{1} '/' spath{1}]);
 end
end

I also changed the following:

temp = slwebview(final_files{x},'viewFile',false);

became

temp = slwebview(final_files{x},'viewFile',false,'FollowLinks','on','FollowModelReference','on');

After confirming these changes worked, I sent a corrected version to Dave and Keith.

The webview results can be found at: https://nodus.ligo.caltech.edu:30889/FE/

 

 

  4961   Tue Jul 12 10:18:05 2011   Jamie   Update   CDS   C1:DAQ-FB0_C1???_STATUS indicators red, restored after controller restarts

Yesterday I found the C1:DAQ-FB0_C1???_STATUS lights to be red for the SUS, MCS, SCX, and SCY controllers.  I know this has something to do with model communication with the framebuilder, but I unfortunately don't remember exactly what it is.  I decided to try restarting the affected models to see if that cleared up the problem.  It did.  After restarting c1scx, c1scy, c1sus, and c1mcs everything came back up green.

We need some better documentation about what all of these status indicators mean.

  5006   Wed Jul 20 20:04:54 2011   Jamie   Update   CDS   C1:DAQ-FB0_C1XXX_STATUS sometimes inexplicably goes red

I have been noticing this happening occasionally, but I don't understand what is causing it:

status-fb-red1.png

The channel in question above is C1:DAQ-FB0_C1SCX_STATUS.  This channel is (I believe) reporting some status of the front end model communication with the frame builder, but I'm not sure exactly what.

Usually this problem goes away when I restart the model or the frame builder, but it didn't work this time.  Tomorrow I will figure out what this channel means, why it's sporadically going red, and how to correct it.

  5030   Mon Jul 25 13:01:24 2011   kiwamu   Update   CDS   c1ioo Make problem

[Suresh / Kiwamu]

HELP US Jamieeeeeeee !! We are unable to compile c1ioo.

 

It looks like something is wrong with the Makefile.

We ran make c1ioo -- this was successful every time.  However, make install-c1ioo doesn't run.

Below are the error messages we got.

        make install-target-c1ioo
        make[1]: Entering directory `/opt/rtcds/caltech/c1/core/branches/branch-2.1'
        Please make c1ioo first

We then looked at the Makefile and tried to find what was wrong, and found the line (the 36th from the top) saying

        if test $(site)no = no; then echo Please make $$system first; exit 1; fi;\

We thought the lack of the site-name specification caused the error.

So then we tried compiling it again with the site name specified, by typing

     export site=c1

in the terminal window.

It went ahead a little bit further, but it still doesn't run all the way through the make process.

 

  5031   Mon Jul 25 13:09:39 2011   Jamie   Update   CDS   c1ioo Make problem

> It looks like something wrong with Makefile.

Sorry, this was my bad.  I was making a patch to the makefile to submit back upstream and I forgot to revert my changes.  I've reverted them now, so everything should be back to normal.

  5049   Wed Jul 27 15:49:13 2011   jamie   Configuration   CDS   dataviewer now working on pianosa

Not exactly sure what the problem was, but I updated to the head of the SVN and rebuilt and it seems to be working fine now.

  5060   Fri Jul 29 12:39:26 2011   jamie   Update   CDS   c1iscex mysteriously crashed

c1iscex was behaving very strangely this morning.  Steve earlier reported that he was having trouble pulling up some channels from the c1scx model.  I went to investigate and noticed that indeed some channels were not responding.

While I was in the middle of poking around, c1iscex stopped responding altogether.  I walked down there and did a hard reset.  Once it rebooted and I did a burt restore from early this morning, everything appeared to be working again.

The fact that problems were showing up before the machine crashed worries me.  I'll try to investigate more this afternoon.

  5094   Tue Aug 2 16:43:23 2011   jamie   Update   CDS   NDS2 server on mafalda restarted for access to new channels

In order to get access to new DQ channels from the NDS2 server, the NDS2 server needs to be told about the new channels and restarted.  The procedure is as follows:

ssh mafalda
cd /users/jzweizig/nds2-mafalda
./build_channel_history
./install_channel_list
pkill nds2
# wait a few seconds for the process to quit and release the server port
./start_nds2

This procedure needs to be run every time new _DQ channels are added.

We need to set this up as a proper service, so the restart procedure is more elegant.

An additional comment from John Z.:

    The --end-gps parameter in ./build_channel_history seems to be causing
    some trouble. It should work without this parameter, but there is a
    directory with a GPS time of 1297900000 (evidently a test for GPS1G)
    that might screw up the channel list generation. So, it appears that
    the end time requires a time for which data already exists. This
    wouldn't seem to be a big deal, but it means that it has to be modified
    by hand before running. I haven't fixed this yet, but I think that I
    can probably pick out the most recent frame and use that as an end-time
    point. I'll see if I can make that work...

  5136   Mon Aug 8 00:12:58 2011   rana   Update   CDS   diagonalization of MC input matrix

I've finally completed the SUS/peakFit/ scripts which find the new input matrix for the SUS. MC1, MC2, MC3, and ITMX have been matrix'd.

I tried to do the BS, but it came out with very funny matrix elements. Also the BS is missing its DAQ channels again (JAMIE !) so we can't diagnose it with the free swinging method.

To continue, we have to get some good data and try this again. Right now there are some weird issues with a lot of the optics. I've also set the damping gains for the optics with the new matrices.

 Ex.

new_matrix = findMatrix('ITMX')

writeSUSinmat('ITMX', new_matrix)

writeSUSinmat.m

This script writes the values to the MEDM SUS input matrix. To do the writing, I used the low-level 'caput' command instead of ezcawrite, since the ezca libraries are getting deprecated.

caput doesn't really have good diagnostics, so I use Matlab to check the return status and then display it to the terminal. You can just rerun it if it gives you an error.
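The check is essentially a wrapper around Matlab's system() call; a minimal sketch (the channel name and value here are made up for illustration):

% minimal sketch: write one matrix element with caput and check the return status
[status, result] = system('caput C1:SUS-ITMX_INMATRIX_1_1 1.0');
if status ~= 0
    error('caput failed (just rerun the script): %s', result);
end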

 

 

A couple of normalization notes:

1) The POS/PIT/YAW rows are scaled so that the mean of abs(FACE elements) = 1. Previously, I had the max element = 1.

2) The SIDE row is scaled so that the SIDE element = +1.

3) I then normalized the ROWS according to the geometrical factors that Jamie has calculated and almost put into the elog.

 

All these scripts have been added to the SVN. I've removed the large binary data files from the directory though. You can just rsync them in to your laptop if you want to run this stuff remotely.

  5137   Mon Aug 8 00:58:26 2011   rana   Update   CDS   diagonalization of MC input matrix

Besides the purpose of correctly tuning the suspensions, my hidden goal in the input matrix diagonalization has been to figure out what the 'true' sensing noise of the OSEMs is, so that we can accurately predict the noise impact on the OAF.

The attached plot shows the DOFs of ITMX calibrated into microns or microrad as per Jamie's ethereal input matrix calculations.

The main result is in the ratio of POS to BUTTER. It tells us that even at nighttime (when this data was taken) we should be able to get some reduction in the arms at 1 Hz.

Whether we can get anything down to 0.1 Hz depends on how the arm control signal compares to the POS signal here. I leave it to Jenne to overlay those traces using a recent Arm lock.

Attachment 1: null.png
  5143   Mon Aug 8 19:45:27 2011   jamie   Update   CDS   activateDQ script run; SUS channels being acquired again

> Also the BS is missing its DAQ channels again (JAMIE !) so we can't diagnose it with the free swinging method.

I'm not sure why the BS channels were not being acquired.  I reran the activateDQ script, which seemed to fix everything.  The BS DQ channels are now there.

I also noticed that for some reason there were SUS-BS-ASC{PIT,YAW}_IN1_DQ channels, even though they had their acquire flags set to 0.  This means that they were showing up like test point channels, but not being written to frames by the frame builder.  This is pretty unusual, so I'm not sure why they were there.  I removed them.

  5162   Wed Aug 10 00:21:10 2011   jamie   Update   CDS   updates to peakFit scripts

I updated the peakFit routines to make them a bit more user friendly:

  • modified so that any subset of optics can be processed at a time, instead of just all
  • broke out tweakable fit parameters into a separate parameters.m file
  • added a README that describes use

These changes were committed to the 40m svn.

  5211   Fri Aug 12 16:50:37 2011   Yoichi   Configuration   CDS   FE Status screen rearranged
I rearranged the FE_STATUS.adl so that I have space to add c1ffc to the screen.
So please be aware that the FE monitors are no longer in their original positions
on the screen.
  5214   Fri Aug 12 17:27:49 2011   Yoichi   Summary   CDS   Toggle button for RCG
Bottom line: I made an RCG block to realize a toggle button easily.

Read on if you need such a button, or if you want to know how to
write a new RCG block with C.

-----------------
When I was making MEDM screens for FFC, I wanted to have a toggle
button to enable/disable the FFC path.
I wanted something like the ON/OFF buttons of the filter bank
screens, one that changes its state every time I click on it.
However, I could not find an easy way to realize that.

From MEDM, I can send a value to an EPICS channel using a "Message Button".
This value is always the same, say 1.
In the RCG model, I used a cdsEpicsMomentary block so that whenever the channel
gets set to 1, it stays at 1 for a while and turns back to 0 after a second or so.
This generates a pulse of 1 when I click on a message button on an MEDM screen.
Then I needed a block that keeps an internal state (0 or 1) and flips its state
whenever it receives a pulse of 1.
Since I couldn't find such a block in the current RCG library, I implemented one
using the cdsFunctionCall block, which allows you to implement a block with C code.

There is a good explanation of how to use this block in the CDS_PARTS library.
Here is basically what I did.

(1) Drag and drop the cdsFunctionCall block into my model.

(2) In the "Block Properties", I put the following line in the Description field.
inline cdsToggle /opt/rtcds/caltech/c1/userapps/release/cds/common/src/cdsToggle.c
This means to call a function cdsToggle(), whose code is in the file indicated above.

(3) The contents of the source code are very simple.
void cdsToggle(double *in, int inSize, double *out, int outSize){
  static double x = 0;  /* internal toggle state (the block's output) */
  static double y = 0;  /* input value seen on the previous call */

  if (*in != y){        /* the input has changed since the last cycle */
    y = *in;
    if (y == 1){        /* rising edge of the pulse: flip the toggle state */
      x = (x == 1) ? 0 : 1;
      *out = x;
    }
  }
}
The function prototype is always the same. *in and *out are the pointers to the arrays of doubles
for the input and output signals of the block. In Simulink, the signals have to be
multiplexed so that the RCG can know how many signals are handed to or returned from the function.
In order to keep the internal state of my custom block, I used the "static" keyword in the
variable declarations. The rest of the code should be obvious.

(4) Just compile the model as usual. The RCG will automatically include the source code and put
a call to the function in the proper place.

I made the block into a library so that people can use it:
/opt/rtcds/caltech/c1/userapps/trunk/cds/common/models/cdsToggle.mdl
For an example of its usage, please have a look at
/opt/rtcds/caltech/c1/userapps/trunk/isc/c1/models/c1lsc
  5312   Sat Aug 27 15:47:59 2011   rana   Update   CDS   OSEM noise / nullstream and what it means for satellites

In the previous elog of mine, I looked at the nullstream (aka butterfly mode) to find out if the intrinsic OSEM noise is limiting the displacement noise of the interferometer or possibly the Wiener FF performance.

The conclusion was that it's not, above ~0.2 Hz.  Due to the fortuitous breaking of the ITMX magnet, we also have a chance to check the 'bright noise': what the noise is with no magnet to occlude the LED beam.

As expected, the noise spectra with no magnets are lower than the calculated nullstream.  The attached plot shows the comparison of the LL OSEM (all the bright spectra look basically alike) with the damped optic spectra from one week ago.

From 0.1 - 10 Hz, the motion is cleanly larger than the noise.  Below ~0.2 Hz, it's possible that the common-mode rejection of the short cavity lengths is ruined by this.  We should try to see if the low-frequency noise in the PRC/SRC is explainable with our current knowledge of seismicity and the 2-dimensional 2-point correlation functions of the ground.

So, the question is, "Should we try to upgrade the satellite boxes to improve the OSEM sensing noise?"

Attachment 1: Untitled.png
  5315   Sun Aug 28 22:49:40 2011   Suresh   Update   CDS   fb down

I recompiled c1ioo after making some changes and restarted fb (about 9:45 - 10 PM PDT), but fb failed to restart.  It responds to ping, but refuses both ssh and telnet connections.  The screen output is:

allegra:~>ssh fb
ssh: connect to host fb port 22: Connection refused
allegra:~>telnet fb 8087
Trying 192.168.113.202...
telnet: connect to address 192.168.113.202: Connection refused
telnet: Unable to connect to remote host: Connection refused
allegra:~>
 

Nor am I able to connect to c1ioo....

 

 

  5316   Mon Aug 29 00:49:00 2011   kiwamu   Update   CDS   Re: fb down

Fb is in a bad situation. It needs a MANUAL fsck to fix the file system.

HELP US, Jamieeeeeeeeeeee !!!

 

When Suresh and I connected a display and tried to see what was going on, the fb computer was in the middle of a file system check.

This was because Suresh had done a hardware reboot by pressing the power button on the front panel.

Since the file check was taking so long and didn't seem to be progressing, we pressed the reset button and then the power button again.

Actually, the reset button didn't seem to work; it just made some indicator lights flash.

After the second reboot, the boot message said that it needs a manual fsck to fix the file system.  This may be because we interrupted the file check.

We are leaving it to Jamie, because the fsck command could do something bad if unfamiliar people like us run it.

 

In addition, the boot message was also saying that line 37 in /etc/fstab was bad.

We logged into the machine in safe mode, and found that line 37 of fstab was indeed an empty line.

We tried erasing this empty line, but failed for some reason.  We were able to edit the file with vi, but weren't able to save it.

  5317   Mon Aug 29 12:05:32 2011   jamie   Update   CDS   Re: fb down

fb was requiring a manual fsck of its disks because it was sensing filesystem errors.  The errors had to do with filesystem timestamps being in the future.  It turned out that fb's system date was set to something in 2005.  I'm not sure what caused the date to be so far off (motherboard battery problem?), but I did determine, after I got the system booting, that the NTP client on fb was misconfigured and was therefore incapable of setting the system date.  It seems that it was configured to query a non-existent NTP server.  Why the hell it would have been set like this I have no idea.

In any event, I did a manual check on /dev/sdb1, which is the root disk, and postponed a check on /dev/sda1 (the RAID mounted at /frames) until I had the system booting.  /dev/sda1 is being checked now, since there are filesystem errors that need to be corrected, but it will probably take a couple of hours to complete.  Once the filesystems are clean I'll reboot fb and try to get everything up and running again.

  5319   Mon Aug 29 18:16:10 2011   jamie   Update   CDS   Re: fb down

fb is now up and running, although the /frames RAID is still undergoing an fsck, which is likely to take another day.  Consequently there is no daqd, and no frames are being written to disk.  fb is running and providing the diskless root to the rest of the front-end systems, so the rest of the IFO should be operational.

I burt restored the following (which I believe is everything that was rebooted), from Saturday night:

/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1lscepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1susepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1iooepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1assepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1mcsepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1gcvepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1gfdepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1rfmepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1pemepics.snap

 

  5323   Tue Aug 30 11:28:56 2011   jamie   Update   CDS   framebuilder back up

The fsck on the framebuilder (fb) raid array (/dev/sda1) completed overnight without issue.  I rebooted the framebuilder and it came up without problem.

I'm now working on getting all of the front-end computers and models restarted and talking to the framebuilder.

  5324   Tue Aug 30 11:42:29 2011   jamie   Update   CDS   testpoint.par file found to be completely empty

The testpoint.par file, located at /opt/rtcds/caltech/c1/target/gds/param/testpoint.par, which tells GDS processes where to find the various awgtpman processes, was completely empty.  The file was there but was just 0 bytes.  Apparently the awgtpman processes themselves also consult this file when starting, which means that none of the awgtpman processes would start.

This file is manipulated in the "install-daq-%" target in the RCG Makefile, ultimately being written with output from the src/epics/util/updateTestpointPar.pl script, which creates a stanza for each front-end model.  Rebuilding and installing all of the models properly regenerated this file.

I have no idea what would cause this file to get truncated, but apparently this is not the first time: elog #3999.  I'm submitting a bug report with CDS.

 

  5325   Tue Aug 30 14:33:52 2011   jamie   Update   CDS   all front-ends back up and running

All the front-ends are now running.  Many of them came back on their own after the testpoint.par was fixed and the framebuilder was restarted.  Those that didn't just needed to be restarted manually.

The c1ioo model is currently in a broken state: it won't compile.  I assume that this was what Suresh was working on when the framebuilder crash happened.  This model needs to be fixed.

  5408   Wed Sep 14 20:04:05 2011   jamie   Update   CDS   Update to frame builder wiper.pl script for GPS 1000000000

I have updated the wiper.pl script (/opt/rtcds/caltech/c1/target/fb/wiper.pl) that runs on the framebuilder (from crontab) to keep the file system from overloading by deleting the oldest frames.  As it was, the script was not sorting the frame file names numerically, which would have caused it to delete post-GPS-1000000000 frames first.  This issue was identified at LHO, and below is the patch that I applied to the script.


--- wiper.pl.orig  2011-04-11 13:54:40.000000000 -0700
+++ wiper.pl       2011-09-14 19:48:36.000000000 -0700
@@ -1,5 +1,7 @@
 #!/usr/bin/perl
 
+use File::Basename;
+
 print "\n" .  `date` . "\n";
 # Dry run, do not delete anything
 $dry_run = 1;
@@ -126,14 +128,23 @@
 
 if ($du{$minute_trend_frames_dir} > $minute_frames_keep) { $do_min = 1; };
 
+# sort files by GPS time split into prefixL-T-GPS-sec.gwf
+# numerically sort on 3rd field
+sub byGPSTime {
+    my $c = basename $a;
+    $c =~ s/\D+(\d+)\D+(\d+)\D+/$1/g;
+    my $d = basename $b;
+    $d =~ s/\D+(\d+)\D+(\d+)\D+/$1/g;
+    $c <=> $d;
+}
+
 # Delete frame files in $dir to free $ktofree Kbytes of space
 # This one reads file names in $dir/*/*.gwf sorts them by file names
 # and progressively deletes them up to $ktofree limit
 sub delete_frames {
        ($dir, $ktofree) = @_;
        # Read file names; Could this be inefficient?
-       @a= <$dir/*/*.gwf>;
-       sort @a;
+       @a = sort byGPSTime <$dir/*/*.gwf>;
        $dacc = 0; # How many kilobytes we deleted
        $fnum = @a;
        $dnum = 0;
@@ -145,6 +156,7 @@
          if ($dacc >= $ktofree)  { last; }
          $dnum ++;
          # Delete $file here
+         print "- " . $file . "\n";
          if (!$dry_run) {     
            unlink($file);
          }

  5424   Thu Sep 15 20:16:15 2011   jamie   Update   CDS   New c1oaf model installed and running

[Jamie, Jenne, Mirko]

New c1oaf model installed

We have installed the new c1oaf (online adaptive feed-forward) model.  This model is now running on c1lsc.  It's not really doing anything at the moment, but we wanted to get the model running, with all of its interconnections to the other models.

c1oaf has interconnections to both c1lsc and c1pem via the following routes:

c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf

Therefore c1lsc, c1pem, and c1rfm also had to be modified to receive/send the relevant signals.

As always, when adding PCIx senders and receivers, we had to compile all the models multiple times in succession so that the /opt/rtcds/caltech/c1/chans/ipc/C1.ipc would be properly populated with the channel IPC info.

Issues:

There were a couple of issues that came up when we installed and re/started the models:

c1oaf not being registered by frame builder

When the c1oaf model was started, it had no C1:DAQ-FB0_C1OAF_STATUS channel, which it is supposed to have.  In the daqd log (/opt/rtcds/caltech/c1/target/fb/logs/daqd.log.19901) I found the following:

Unable to find GDS node 22 system c1oaf in INI files

It turns out this channel is actually created by the frame builder, and it could not find the channel definition file for the new model, so it was failing to create the channels for it.  The frame builder "master" file (/opt/rtcds/caltech/c1/target/fb/master) needs to list the c1oaf daq ini files:

/opt/rtcds/caltech/c1/chans/daq/C1OAF.ini
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1oaf.par

These were added and the framebuilder was restarted, after which the C1:DAQ-FB0_C1OAF_STATUS channel appeared correctly.

SHMEM errors on c1lsc and c1oaf

This turned out to be because of an oversight in how we wired up the skeleton c1oaf model.  For the moment the c1oaf model has only the PCIx sends and receives.  I had therefore grounded the inputs to the SHMEM parts that were meant to send signals to C1LSC.  However, this made the RCG think that these SHMEM parts were actually receivers, since it's the grounding of the inputs to these parts that actually tells the RCG that the part is a receiver.  I fixed this by adding a filter module to the input of all the senders.

Once this was all fixed, the models were recompiled, installed, and restarted, and everything came up fine.

All model changes were of course committed to the cds_user_apps svn as well.

  5426   Thu Sep 15 21:56:01 2011   Mirko   Update   CDS   c1oaf check, possible shmem problem

After Jamie installed the c1oaf model (entry 5424), I went and checked the inter-model communication.

Remember the config is:

c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf

I checked at least one of every communication type.

- All signals reach their destinations.
- c1lsc_to_c1oaf_via_shmem is noisier, adding noise to the signal.  c1lsc runs at 16 kHz and c1oaf at 2 kHz, but that should actually smooth things out.

c1lsc_to_c1oaf_via_shmem.png

 

Attachment 1: c1lsc_to_c1oaf_via_shmem.png
Attachment 2: c1oaf_to_c1lsc_via_shmem_fixed_sine_inj_at_100Hz.png
Attachment 3: c1oaf_to_c1lsc_via_shmem_white_noise_inj.png
Attachment 4: c1pem_to_c1oaf_via_rfm.png
  5486   Tue Sep 20 17:45:30 2011   kiwamu   Update   CDS   daqd is restarting by itself?

[Jenne / Kiwamu]

 Fb was sick. Dataviewer and Fourier Tools didn't work for a while.

About 10 minutes later they became healthy again.  No idea what exactly was going on.

One thing we found was that, during fb's sickness, it looks like daqd was restarting by itself.  Is this normal?

Here are the last lines of restart.log.  Apparently daqd was restarting, although we didn't command it to do so.

  daqd_start Tue Sep 20 02:41:17 PDT 2011
  daqd_start Tue Sep 20 13:18:12 PDT 2011
  daqd_start Tue Sep 20 17:33:00 PDT 2011

  5535   Sat Sep 24 01:38:14 2011   kiwamu   Update   CDS   c1scx and c1x01 restarted

[Koji / Kiwamu]

The c1scx and c1x01 realtime processes froze.  We restarted them around 1:30 by sshing in and running the kill/start scripts.

  5561   Wed Sep 28 02:42:04 2011   kiwamu   Update   CDS   some DAQ channels lost in c1sus: fb, c1sus and c1pem restarted

Somehow some DAQ channels for C1SUS have disappeared from the DAQ channel list.

Indeed there are only a few DAQ channels listed in the C1SUS.ini file.

I ran activateDQ.py and restarted daqd.

Everything looks okay.  C1SUS and C1PEM were restarted because they became frozen.
