  3995   Tue Nov 30 12:25:08 2010   josephb   Update   CDS   LSC computer to chassis cable dead


We seem to have a broken fiber link for use between the LSC and its IO chassis.  It is unclear to me when this damage occurred.  The cable had been sitting in a box with styrofoam padding, and the kink is in the middle of the fiber, with no other obvious damage nearby.  However, the cable may have been used previously by the people in Downs for testing, and it was possibly damaged then.  Alternatively, we may have kinked it while stringing it.

Tried Solutions:

I talked to Alex yesterday, and he suggested unplugging the power on both the computer and the IO chassis completely, then plugging in the new fiber connector, as he had to do that once with a fiber connection at Hanford.  We tried this this morning, but still no joy.  At this point I plan to toss the fiber, as I don't know of any way to rehabilitate kinked fibers.

Note this means that I rebooted c1sus and then did a burt restore from the Nov/30/07:07 directory for c1susepics, c1rmsepics, c1mcsepics.  It looks like all the filters switched on.

Current Plan:

We do, however, have a Dolphin fiber which was originally intended to go between the LSC and its IO chassis, before Rolf was told it doesn't work well that way.  However, we were going to connect the LSC machine to the rest of the network via Dolphin.

We can put the LSC machine next to its chassis in the LSC rack, and connect the chassis to the rest of the front ends by the Dolphin fiber.  In that case we just need the usual copper style cable going between the chassis and the computer.


  3999   Tue Nov 30 16:02:18 2010   josephb   Update   CDS   status


1) Turns out the /opt/rtcds/caltech/c1/target/gds/param/testpoint.par file had been emptied or deleted at one point, and the only entry in it was c1pem.  This had been causing us a lack of test points for the last few days.  It is unclear when or how this happened.  The file has been fixed to include all the front end models again.  (Fixed)

2) Alex and I worked on tracking down why there's a GPS difference between the front ends and the frame builder, which is why we see a 0x4000 error on all the front end GDS screens. This involved several rebuilds of the front end codes and reboots of the machines involved. (Broken)

3) Still working on understanding why the RFM communication is not working, which I think is related to the timing issues we're seeing.  I know the data is being transferred on the card, but it seems to be rejected after being read in, suggesting a time stamp mismatch. (Broken)

4) The c1iscex binary output card still doesn't work.  (Broken)
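The testpoint.par problem in item 1 could be caught early with a small check like the one below. This is only a sketch: it assumes each model name appears as a whole word somewhere in the file, and the `system=` line used in the test is a hypothetical layout, not the documented format of testpoint.par.

```python
import re


def missing_models(par_text, expected_models):
    """Return the expected model names that never appear in the file text.

    Assumption: a model is "listed" if its name shows up as a whole token.
    The real testpoint.par syntax is not reproduced here.
    """
    tokens = set(re.findall(r"[A-Za-z0-9_]+", par_text))
    return [model for model in expected_models if model not in tokens]
```

Calling it with the contents of the file and the list of front end models you expect returns the names that still need to be added.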


Alex and I will be working on the above issues tomorrow morning.


Currently, the c1ioo, c1sus and c1iscex computers are running with their front ends. They all still have the 0x4000 error.  However, you can still look at channels in dataviewer, for example.  There is a possibility of inconsistent timing between computers (although all models on a single computer will be in sync).

All the front ends were burt restored to 07:07 this morning.  I spot checked several optic filter banks and they look to have been turned on.

  4009   Fri Dec 3 15:37:10 2010   josephb   Update   CDS   fb, front ends fixed - tested RFM between c1ioo and c1iscex


The front ends and fb computers were unresponsive this morning.

This was due to the fb machine having its ethernet cable plugged into the wrong input.   It should be plugged into the port labeled 0.

Since all the front end machines mount their root partition from fb, this caused them to also hang.


The cable has been relabeled to "fb" on both ends, and plugged into the correct jack.  All the front ends were rebooted.


Testing RFM for green locking:

I tested the RFM connection between c1ioo and c1scx.  Unfortunately, on the first test, it turns out the c1ioo machine had its gps time off by 1 second compared to c1sus and c1iscex.  A second reboot seems to have fixed the issue.

However, it bothers me that the code didn't come up with the correct time on the first boot.

The test was done using the c1gcv model and by modifying the c1scx model.  At the moment, the MC_L channel is being passed to the MC_L input of the ETMX suspension.  In the final configuration, this will be a properly shaped error signal from the green locking.

The MC_L signal is currently not actually driving the optic, as the ETMX POS MATRIX currently has a 0 for the MC_L component.

  4014   Mon Dec 6 11:59:41 2010   josephb   Update   CDS   New c1lsc computer moved to lsc rack

Computer moved:

The c1lsc computer has been moved over to the 1Y3 rack, just above the c1lsc IO chassis. 

It will talk to the c1sus computer via a Dolphin PCIe reflected memory card.  The cards were installed into c1lsc and c1sus this morning.

It will talk to its IO chassis via the usual short IO chassis cable.


To Do:

The Dolphin fiber still needs to be strung between c1sus and c1lsc.

The DAQ cable between c1lsc and the DAQ router (which lets the frame builder talk directly with the front ends) also needs to be strung.

c1lsc needs to be configured to use fb as a boot server, and the fb needs to be configured to handle the c1lsc machine.

  4015   Mon Dec 6 16:49:43 2010   josephb   Update   CDS   c1lsc halfway to working

C1LSC Status:

The c1lsc computer is running Gentoo off of the fb server. It has been connected to the DAQ network and is handling mx_streams properly (so we're not flooding the network with error messages like we used to with c1iscex).  It is using the old c1lsc IP address.  It can be ssh'd into.

However, it is not talking properly to the IO chassis.  The IO chassis turns on when the computer turns on, but the host interface board in the IO chassis only has 2 red lights on (as opposed to many green lights on the host interface boards in the c1sus, c1ioo, and c1iscex IO chassis).  The c1lsc IO processor (called c1x04) doesn't see any ADCs, DACs, or Binary cards.  The timing slave is receiving 1PPS and is locked to it, but because the chassis isn't communicating, c1x04 is running off the computer's internal clock, causing it to be several seconds off. 

Need to investigate why the computer and chassis are not talking to each other.

General Status:

The c1sus and c1ioo computers are not talking properly to the frame builder.  A reboot of c1iscex fixed the same problem earlier.  However, as Kiwamu and Suresh are working in the vacuum, I'm leaving those computers alone for the moment; a reboot and burt restore should probably be done later today for c1sus and c1ioo.


Current CDS status:

  4020   Tue Dec 7 16:09:53 2010   josephb   Update   CDS   c1iscex status

I swapped out the IO chassis which could only handle 3 PCIe cards with another chassis which has space for 17, but which previously had timing issues.  A new cable going between the timing slave and the rear board seems to have fixed the timing issues.

I'm hoping to get a replacement PCI extension board which can handle more than 3 cards this week from Rolf and then eventually put it in the Y-end rack.  I'm also still waiting for a repaired Host interface board to come in for that as well.

At this point, RFM is working to c1iscex, but I'm still debugging the binary outputs to the analog filters.  As of this time they are not working properly: turning the digital filters on and off seems to have no effect on the transfer function measured from an excitation in SUSPOS all the way around to IN1 of the sensor inputs (i.e. measured before the digital filters).  Ideally I should see a difference when I switch the digital filters on and off (since the analog ones should also switch on and off), but I do not.

  4025   Wed Dec 8 12:26:56 2010   josephb   Update   CDS   megatron set up - as a test front end

[josephb, Osamu]

Megatron Setup:

To show Osamu how to set up a front end, as well as provide a test computer for Osamu's use, we used the new megatron (sunfire x4600 with 16 cores and 8 gigabytes of memory) as a front end without an IO chassis.

The steps we followed are in the wiki, here.

The new megatron's IP address is  It is running the c1x99 front end code.

  4028   Wed Dec 8 14:51:09 2010   josephb   Update   CDS   c1pem now recording data


c1pem model was reporting all zeros for all the PEM channels.


Two fold.  On the software end, I added ADCs 0, 1, and 2 to the model.  ADC 3 was already present and is the actual ADC taking in PEM information.

Alex and Rolf noted a while back that there's a problem with the way the DACs and ADCs are numbered internally in the code.  Missing ADCs or DACs prior to the one you're actually using can cause problems.

At some point that problem should be fixed by the CDS crew, but for now, always include all ADCs and DACs up to and including the highest number ADC/DAC you need to use for that model.

On the physical end, I checked the AA filter chassis and found the power was not plugged in.  I plugged it in.
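The software-side workaround (include every ADC or DAC up to and including the highest one the model needs) can be written down as a one-line rule. The helper name below is hypothetical and is only meant to illustrate the rule, not any part of the RCG.

```python
def cards_to_include(used):
    """Given the ADC (or DAC) numbers a model actually reads, return every
    card number from 0 up to and including the highest one in use."""
    if not used:
        return []
    return list(range(max(used) + 1))
```

For c1pem, which only reads ADC 3, this gives ADCs 0, 1, 2 and 3, matching what was added to the model.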


We now have PEM channels being recorded by the FB, which should make Jenne happier.

  4029   Wed Dec 8 17:05:39 2010   josephb   Update   CDS   Put in dolphin fiber between c1sus and c1lsc


We put in the fiber for use with the Dolphin reflected memory between c1sus and c1lsc (rack 1X4 to rack 1Y3).  I still need to set up the Dolphin hub in the 1X4 rack, but once that is done, we should be able to test the Dolphin memory tomorrow.

  4046   Mon Dec 13 17:18:47 2010   josephb   Update   CDS   Burt updates


Autoburt wouldn't restore settings for front ends on reboot

What was done:

First I moved the burt directory over to the new directory structure.

This involved moving /cvs/cds/caltech/burt/ to /opt/rtcds/caltech/c1/burt.

Then I updated the burt.cron file in the new location, /opt/rtcds/caltech/c1/burt/autoburt/.  This pointed to the new autoburt.pl script.

I created an autoburt directory in the /opt/rtcds/caltech/c1/scripts directory and placed the autoburt.pl script there.

I modified the autoburt.pl script so that it pointed to the new snapshot location.  I also modified it so it updates a directory called "latest" located in the /opt/rtcds/caltech/c1/burt/autoburt directory.  In there is a set of soft links to the latest autoburt backup.

Lastly, I edited the crontab on op340m (using crontab -e) to point to the new burt.cron file in the new location.

This was the easiest solution since the start script is just a simple bash script and I couldn't think of a quick and easy way to have it navigate the snapshots directory reliably.

I then modified the Makefile located in /opt/rtcds/caltech/c1/core/advLigoRTS/ which actually generates the start scripts, to point at the "latest" directory when doing restores.  Previously it had been pointing to /tmp/ which didn't really have anything in it.

So in the future, when building code, it should point to the correct snapshots now.  Using sed I modified all the existing start scripts to point to the latest directory when grabbing snapshots.
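The "latest" directory of soft links described above could be refreshed along these lines. This is a minimal Python sketch, not the actual autoburt.pl logic (which is Perl and is not shown in this entry); the function name and the assumption that snapshots end in `.snap` are illustrative.

```python
import os


def update_latest_links(snapshot_dir, latest_dir):
    """Point latest_dir/<name>.snap at each snapshot in snapshot_dir.

    Assumption: snapshot files end in ".snap". Existing links of the
    same name are replaced so "latest" always tracks the newest backup.
    """
    os.makedirs(latest_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(snapshot_dir)):
        if not name.endswith(".snap"):
            continue
        target = os.path.join(snapshot_dir, name)
        link = os.path.join(latest_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # drop the stale link before re-creating it
        os.symlink(target, link)
        linked.append(name)
    return linked
```

Because the start scripts only ever read from the "latest" directory, they do not need to know how the dated snapshot directories are laid out.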


According to Keith's directory documentation (see T1000248), the burt restores should live in the individual target system directories, i.e. /target/c1sus/burt, /target/c1lsc/burt, etc.  This is a distinctly different paradigm from what we've been using in the autoburt script, and would require a fairly extensive rewrite of that script to handle properly.  For the moment I'm keeping the old style: everything in one directory by date.  It would probably be worth discussing if and how to move over to the new system.

  4053   Tue Dec 14 11:24:35 2010   josephb   Update   CDS   burt restore

I had updated the individual start scripts, but forgotten to update the rc.local file on the front ends to handle burt restores on reboot.

I went to the fb machine and into /diskless/root/etc/ and modified the rc.local file there.

Basically in the loop over systems, I added the following line:

/opt/epics-3.14.9-linux/base/bin/linux-x86/burtwb -f /opt/rtcds/caltech/c1/burt/autoburt/latest/${i}epics.snap  -l /opt/rtcds/caltech/c1/burt/autoburt/logs/${i}epics.log.restore -v

The ${i} gets replaced with the system name in the loop (c1sus, c1mcs, c1rms, etc)
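To make the substitution explicit, the sketch below builds the same burtwb command line for a given system name. It only constructs the argument list (it does not run anything) and uses just the paths and the -f, -l and -v options that appear in the rc.local line above; the function name is hypothetical.

```python
BURTWB = "/opt/epics-3.14.9-linux/base/bin/linux-x86/burtwb"
AUTOBURT_DIR = "/opt/rtcds/caltech/c1/burt/autoburt"


def burt_restore_command(system):
    """Return the burtwb argument list used to restore one system."""
    return [
        BURTWB,
        "-f", f"{AUTOBURT_DIR}/latest/{system}epics.snap",
        "-l", f"{AUTOBURT_DIR}/logs/{system}epics.log.restore",
        "-v",
    ]
```

For example, `burt_restore_command("c1sus")` points at the c1susepics.snap file in the "latest" directory.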

  4057   Wed Dec 15 13:36:44 2010   josephb   Update   CDS   ETMY IO chassis update

I gave Alex a sob story over lunch about having to go and try to resurrect dead VME crates.  He and Rolf then took pity on me and handed me their last host interface board from their test stand, although I was warned by Rolf that this one (the latest generation board from One Stop) seems to be flakier than previous versions, and may require reboots if it starts in a bad state.

Anyways, with this in hand I'm hoping to get c1iscey damping by tomorrow at the latest.

  4060   Wed Dec 15 17:21:20 2010   josephb   Update   CDS   ETMY controls status


The c1iscey was converted over to be a diskless Gentoo machine like the other front ends, following the instructions found here.  Its front end model, c1scy, was copied from the c1scx model and appropriately changed, along with the filter banks.  A new IOP, c1x05, was created and assigned to c1iscey.

The c1iscey IO chassis had the small 4 PCI slot board removed and a large 17 PCI slot board put in.  It was repopulated with an ADC/DAC/BO and RFM card.  The host interface board from Rolf was also put in. 

On start up, the IOP process did not see or recognize any of the cards in the IO chassis.

Four reboots later, the IOP code had seen the ADC/DAC/BO/RFM card once.  And on that reboot, there was a time out on the ADC which caused the IOP code to exit.

In addition to not seeing the PCI cards most of the time, several cables still need to be put together for plugging into the adapter boards, and a box needs to be made for the DAC adapter electronics.


  4064   Thu Dec 16 10:52:42 2010   josephb   Update   Cameras   New PoE digital cameras

We have two new Basler acA640-100gm cameras.  These are power over ethernet (PoE) and very tiny.

Attachment 1: basler.jpg
  4082   Tue Dec 21 11:52:58 2010   josephb   Update   Computers   RGA scripts fixed, c0rga fixed

c0rga apparently had a full hard drive.  There was a 1 GB log file in the /var/log directory, called Xorg.0.log.old, which I deleted; this freed up about 20% of the hard drive.  This then let me modify the crontab file (which previously had been complaining about no room on disk to make edits).

I updated the crontab to look at the new scripts location, updated the RGA script itself to write to the new log location, and then created a soft link in the /opt directory to /cvs/cds/rtcds on c0rga.

The RGA script should now be running again once a day.
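When a disk fills up like this, a quick way to find the offending file is to list the largest files under a directory. The sketch below is a generic helper (hypothetical name, standard library only) rather than anything that was actually run on c0rga.

```python
import os


def largest_files(root, count=5):
    """Return up to `count` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # unreadable or vanished file: skip it
                continue
    sizes.sort(reverse=True)
    return sizes[:count]
```

Pointing it at /var/log would have put Xorg.0.log.old at the top of the list.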

  4097   Fri Dec 24 09:01:33 2010   josephb   Update   CDS   Borrowed ADC

Osamu has borrowed an ADC card from the LSC IO chassis (which currently has a flaky generation 2 Host interface board).  He has used it to get his temporary Dell test stand running daqd successfully as of yesterday.

This is mostly a note to myself so I remember this in the new year, assuming Osamu hasn't replaced the evidence by January 7th.

  4132   Tue Jan 11 11:19:13 2011   josephb   Summary   CDS   Storing FE harddrives down Y arm

Lacking a better place, I've chosen the cabinet down the Y arm which had ethernet cables and various VME cards as a location to store some spare CDS computer equipment, such as hard drives.  I've added (or will add in 5 minutes) a label "FE COMPUTER HARD DRIVES" to this cabinet.

  4135   Tue Jan 11 14:05:11 2011   josephb   Update   Computers   Martian host table updated daily

I created two simple cron jobs, one running on linux1 and one running on nodus, to produce an updated copy of the martian host table linkable from the wiki every day.

The scripts live in /opt/rtcds/caltech/c1/scripts/AutoUpdate/.  One is called updateHostTable.cron and is run on linux1 every day at 4 am, and the other is called moveHostTable.cron and is run on nodus every day at 5 am.

The new link has been added to the Martian Host table wiki page  here.


  4136   Tue Jan 11 16:04:17 2011   josephb   Update   CDS   Script to update web views of models for all installed front ends

I wrote a new script that is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/ called  webview_simlink_update.m. 

This m-file when run in matlab will go to the /opt/rtcds/caltech/c1/target directory and for each c1 front end, generate the corresponding webview files for that system and place them in the AutoUpdate directory. 

Afterwards the files can be moved on Nodus to the /users/public_html/FE/ directory with:

mv /opt/rtcds/caltech/c1/scripts/AutoUpdate/*slwebview* /users/public_html/FE/
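If the move step is ever scripted instead of typed by hand, a Python equivalent of the mv command above might look like this. It is a sketch only: the function name is hypothetical and the directories are passed in as arguments rather than hard-coded.

```python
import glob
import os
import shutil


def move_webview_files(src_dir, dest_dir):
    """Move every file or directory matching *slwebview* into dest_dir."""
    moved = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*slwebview*"))):
        name = os.path.basename(path)
        shutil.move(path, os.path.join(dest_dir, name))
        moved.append(name)
    return moved
```

It would be called with the AutoUpdate directory as the source and /users/public_html/FE/ as the destination.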

This was run today, and the files can be viewed at:


Long term, I'd like to figure out a way of automating this to produce automatically updated screens without having to run it manually.  However, simulink seems to stubbornly require an X window to work.

  4144   Wed Jan 12 17:50:21 2011   josephb   Update   CDS   Worked on c1lsc, MC2 screens

[josephb, osamu, kiwamu]

We worked over by the 1Y2 rack today, trying to debug why we didn't get any signal to the c1lsc ADC.

We turned off the power to the rack several times while examining cards, including the whitening filter board, AA board, and the REFL 33 demod board.  I will note, I incorrectly turned off power in the 1Y1 rack briefly. 

We noticed a small wire on the whitening filter board on the channel 5 path.  Rana suggested this was part of a fix for channels 4 and 5 having too much cross talk.  A trace was cut and this jumper added to fix that particular problem.

We confirmed we could pass signals through each individual channel on the AA and whitening filter boards.  When we put them back in, we did notice a large offset when the inputs were not terminated.  After terminating all inputs, values at the ADC were reasonable, measuring from 0 to about -20 counts.  We applied a 1 Hz, 0.1 Vpp signal and confirmed we saw the digital controls respond back with the correct sine wave.

We examined the REFL 33 demod board and confirmed it would work for demodulating 11 MHz, although without tuning, the I and Q phases will not be exactly 90 degrees apart.

The REFL 33 I and Q outputs have been connected to the whitening board's 1 and 2 inputs, respectively.  Once Kiwamu adds appropriate LO and PD signals to the REFL 33 demod board, he should be able to see the resulting I and Q signals digitally on the PD1 I and Q channels.


In an unrelated fix, we examined the suspensions screens, specifically the Dewhitening lights.  Turns out the lights were still looking at SW2 bit 7 instead of SW2 bit 5.  The actual front end models were using the correct bit (21 which corresponds to the 9th filter bank), so this was purely a display issue.  Tomorrow I'll take a look at the binary outputs and see why the analog filters aren't actually changing.
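The display bug comes down to testing the wrong bit of a status word. A minimal illustration, with a made-up example word (the real SW2 layout is not reproduced here):

```python
def bit_is_set(word, bit):
    """Return True if bit number `bit` (0 = least significant) is set."""
    return bool((word >> bit) & 1)


# Hypothetical status word with only bit 5 set.
example_sw2 = 1 << 5
```

With this example word, a light wired to bit 7 stays dark while one wired to bit 5 lights up, which is the kind of mismatch the screens had.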




  4150   Thu Jan 13 14:21:13 2011   josephb   Update   CDS   Webview of front end model files automated

After Rana pointed me to Yoichi's MEDM snapshot script, I learned how to use Xvfb, which is what Yoichi used to write screens without a real screen.  With this I wrote a new cron script, which I added to Mafalda's cron tab to be run once a day at 6am.

The script is called webview_update.cron and is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/.

# Virtual display for Xvfb; :6 is assumed here to match the -display :6 passed to matlab below
DISPLAY=:6
export DISPLAY
#Check if Xvfb server is already running
pid=`ps -eaf|grep vfb | grep $DISPLAY | awk '{print $2}'`
if [ -n "$pid" ]; then
        echo "Xvfb already running [pid=${pid}]" >/dev/null
else
        # Start Xvfb and record the pid of the new server
        echo "Starting Xvfb on $DISPLAY"
        Xvfb $DISPLAY -screen 0 1600x1200x24 >&/dev/null &
        pid=$!
        echo $pid > /opt/rtcds/caltech/c1/scripts/AutoUpdate/Xvfb.pid
        sleep 3
fi

#Running the matlab process
/cvs/cds/caltech/apps/linux/matlab/bin/matlab -display :6 -logfile /opt/rtcds/caltech/c1/scripts/AutoUpdate/webview.log -r webview_simlink_update

  4151   Thu Jan 13 16:34:02 2011   josephb   Update   Computers   32 bit matlab updated

There was a problem with running the webview report generator in matlab on Mafalda.  It complained of not having a spare report generator license to use, even though the report generator was working before and after on other machines such as Rosalba.  So I moved the old 32 bit matlab directory from /cvs/cds/caltech/apps/Linux/matlab to /cvs/cds/caltech/apps/Linux/matlab_old.  I installed the latest R2010b matlab from IMSS in /cvs/cds/caltech/apps/Linux/matlab and this seems to have made the cron job work on Mafalda now.

  4152   Thu Jan 13 16:41:07 2011   josephb   Update   CDS   Channel names for LSC updated

I renamed most of the filter banks in the c1lsc model.  The input filters are now labeled based on the RF photodiode's name, plus I or Q.  The last set of filters in the OM subsystem (output matrix) have had the TO removed, and are now sensibly named ETMX, ETMY, etc.

We also removed the redundant filter banks between the LSCMTRX and the LSC_OM_MTRX.  There is now only one set, the DARM, CARM, etc ones.

The webview of the LSC model can be found here.

  4157   Fri Jan 14 17:13:39 2011   josephb   Update   Cameras   Pylon driver for Basler Cameras installed on Megatron

After getting some help from the Basler technical support, I was directed to the following ftp link:


I went to the pylon 2.1.0 directory and downloaded the pylon-2.1.0-1748-bininst-64.tar.bz2 file.  Inside of this tar file was another one called pylon-bininst-64.tar.bz2 (along with some other sample programs). I ran tar -jxf on pylon-bininst-64.tar.bz2 and placed the results into the /opt/pylon directory.  It produced a directory of includes, libraries and binaries there.

After playing around with the make files for several sample programs they provided, I have finally been able to compile them.  At several places I had to have the make files point to /opt/pylon/lib64 rather than /opt/pylon/lib.  I'll be testing the camera with these programs on Monday.  I'd also like to see if this particular distribution will work on Centos machines.  There are some comments in one of the INSTALL help files suggesting packages needed for an install on Fedora 9, which may mean it's possible to get this version working on the Centos machines.

  4163   Mon Jan 17 15:31:50 2011   josephb   Update   Cameras   Test the Basler acA640-100gm camera

The Basler acA640-100gm is a power over ethernet camera.  It uses a power injector to supply power over an ethernet cable to the camera.  Once I got past some initial IP difficulties, the camera worked fine out of the box.

You need to set some environment variables first, so the code knows where its libraries are located.

setenv PYLON_ROOT /opt/pylon
setenv GENICAM_ROOT_V1_1 /opt/pylon
setenv GENICAM_CACHE /cvs/cds/caltech/users/josephb/xml_cache
setenv LD_LIBRARY_PATH /opt/pylon/lib64:$LD_LIBRARY_PATH

I then run the /opt/pylon/bin/PylonViewerApp
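If the viewer is launched from a script rather than an interactive csh session, the same environment can be assembled in Python and handed to a child process. This is a sketch with a hypothetical helper name; the variable names and values are the ones from the setenv lines above.

```python
import os


def pylon_environment(base_env=None):
    """Return a copy of the environment with the pylon variables added."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYLON_ROOT"] = "/opt/pylon"
    env["GENICAM_ROOT_V1_1"] = "/opt/pylon"
    env["GENICAM_CACHE"] = "/cvs/cds/caltech/users/josephb/xml_cache"
    # Prepend the 64 bit library directory, keeping any existing entries.
    old = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = "/opt/pylon/lib64" + (":" + old if old else "")
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when starting /opt/pylon/bin/PylonViewerApp.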

Notes on IP:

Initially, you need to set the computer connecting to the camera to an ip in the 169.254.0.XXX range.  I used on megatron's eth1 ethernet connection.  I also set mtu to 9000.

You can then run the IpConfigurator in /opt/pylon/bin/ to change the camera IP as needed.

Attachment 1: PylonViewer.jpg
  4168   Wed Jan 19 10:31:24 2011   josephb   Update   elog   Elog restarted again

Elog wasn't responding at around 10 am this morning.  I killed the elogd process, then used the restart script.

  4175   Thu Jan 20 10:15:50 2011   josephb   Update   CDS   c1scy error

This is caused by an insufficient number of active DAQ channels in the C1SCY.ini file located in /opt/rtcds/caltech/c1/chans/daq/.  A quick look (grep -v # C1SCY.ini) indicates there are no active channels.  Experience tells me you need at least 2 active channels.
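The "grep -v #" check can be turned into a small script that counts the active channels. The sketch below assumes the DAQ .ini file uses one bracketed section per channel, that a leading "#" comments a line out, and that "[default]" is not a channel; the channel names in the test are made up, and the minimum of 2 is just the rule of thumb quoted above.

```python
def active_channels(ini_text):
    """Return the names of the uncommented channel sections.

    Assumptions: one "[NAME]" section per channel, "#" comments a line
    out, and the "[default]" section is not a channel.
    """
    names = []
    for line in ini_text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name != "default":
                names.append(name)
    return names


def has_enough_channels(ini_text, minimum=2):
    """True if the file has at least `minimum` active channels."""
    return len(active_channels(ini_text)) >= minimum
```

Running this on a freshly generated .ini file before restarting a front end would flag the empty-file case described here.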

Taking a look at the activateDAQ.py script in the daq directory, it looks like the C1SCY.ini file is included, but the loop over optics is missing ETMY.  This caused the file to be improperly updated when the activateDAQ.py script was run.  I have fixed the C1SCY.ini file (ran a modified version of the activate script on just C1SCY.ini).

I have restarted the c1scy front end using the startc1scy script and it is currently working.

Here are the error messages in dmesg on c1iscey:
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting


  4179   Thu Jan 20 18:20:55 2011   josephb   Update   CDS   c1iscex computer and c1sus computer swapped

Since the 1U sized computers don't have enough slots to hold the host interface board, RFM card, and a dolphin card, we had to move the 2U computer from the end to the middle to replace c1sus.

We're hoping this will reduce the time associated with reads off the RFM card compared to when it's in the IO chassis.  Previous experience on c1ioo shows this change provides about a factor of 2 improvement, with 8 microseconds per read dropping to 4 microseconds per read, per this elog.

So the dolphin card was moved into the 2U chassis, as well as the RFM card.  I had to swap the PMC to PCI adapter on the RFM card since the one originally on it required an external power connection, which the computer doesn't provide.  So I swapped with one of the DAC cards in the c1sus IO chassis.

But then I forgot to hit submit on this elog entry..............

  4183   Fri Jan 21 15:26:15 2011   josephb   Update   CDS   c1sus broken yesterday and now fixed

[Joe, Koji]
Yesterday's CDS swap of c1sus and c1iscex left the interferometer in a bad state due to several issues.

The first was the need to actually power down the IO chassis completely when switching computers (I eventually waited for a green LED to stop glowing and then plugged the power back in).  I also unplugged and replugged the interface cable between the IO chassis and computer while powered down.  This let the computer actually see the IO chassis (previously the host interface card was glowing just red, no green lights).

Second, the former c1iscex computer and now new c1sus computer only has 6 CPUs, not 8 like most of the other front ends.  Because it was running 6 models (c1sus, c1mcs, c1rms, c1rfm, c1pem, c1x02) and 1 CPU needed to be reserved for the operating system, 2 models were not actually running (recycling mirrors and PEM).  This meant the recycling mirrors were left swinging uncontrolled.

To fix this I merged the c1rms model with the c1sus model.  The c1sus model now controls BS, ITMX, ITMY, PRM, SRM.  I merged the filter files in the /chans/ directory, and reactivated all the DAQ channels.  The master file for the fb in the /target/fb directory had all references to c1rms removed, and then the fb was restarted via "telnet fb 8088" and then "shutdown".

My final mistake was starting the work late in the day.

So the lesson for Joe is, don't start changes in the afternoon.

Koji has been helping me test the damping and confirm things are really running.  We were having some issues with some of the matrix values.  Unfortunately I had to add them by hand since the previous snapshots no longer work with the models.

  4194   Mon Jan 24 10:39:16 2011   josephb   HowTo   DAQ   DAQ Wiki Failure

Actually both port 8087 and 8088 work to talk to the frame builder.  Don't let the lack of a daqd prompt fool you.


Here's putting in the commands:

rosalba:~>telnet fb 8088
Trying
Connected to fb.martian (
Escape character is '^]'.


0000Connection closed by foreign host.

rosalba:~>date
Mon Jan 24 10:30:59 PST 2011


Then looking at the last 3 lines of restart.log in /opt/rtcds/caltech/c1/target/fb/

daqd_start Fri Jan 21 15:20:48 PST 2011

daqd_start Fri Jan 21 23:06:38 PST 2011

daqd_start Mon Jan 24 10:30:29 PST 2011


So clearly it's talking to the frame builder; it just doesn't have the right formatting for the prompt.  If you try typing in "help" at the prompt, you still get all the frame builder commands listed and can try using any of them.
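The restart.log check can also be done programmatically. The sketch below parses "daqd_start" lines in the format shown above and returns the most recent restart times; it drops the time-zone word rather than interpreting it, and the function name is hypothetical.

```python
from datetime import datetime


def last_restarts(log_text, count=3):
    """Return the last `count` daqd_start times as naive datetimes."""
    times = []
    for line in log_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != "daqd_start":
            continue
        fields = parts[1].split()
        if len(fields) != 6:
            continue
        # fields: weekday, month, day, time, zone, year; skip the zone
        stamp = " ".join(fields[:4] + fields[5:])
        times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return times[-count:]
```

Comparing the last entry with the time the shutdown command was sent shows whether daqd actually restarted.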

However, I'll edit the DAQ wiki and indicate 8087 should be used because of the better formatting for the prompt.

Apparently, 8087 is the right port. Various elog entries from Joe and Kiwamu say 8087 or 8088. Not sure what's going on here.

After figuring this out, I activated the C1:GCV-XARM_COARSE_OUT_DAQ and C1:GCV-XARM_FINE_OUT_DAQ and set both of them to be recorded at 2048 Hz. We are loading filters and setting gains into these filter modules such that the OUT signals will be calibrated into Hz (that's why we used the OUT instead of the IN1 as there was last night).


  4200   Tue Jan 25 15:20:38 2011   josephb   Update   CDS   Updated c1rfm model plus new naming convention for RFM/Dolphin

After sitting down for 5 minutes and thinking about it, I realized the names I had been using for internal RFM communication were pretty bad.  It was because looking at a model didn't let you know where the RFM connection was coming from or going to.  So to correct my previous mistakes, I'm instituting the following naming convention for reflected memory, PCIE reflected memory (dolphin) and shared memory names.  These don't actually get used anywhere but the models, and thus don't show up as channel names anywhere else.  They are replaced by raw hex memory locations in the actual code through the use of the IPC file (/opt/rtcds/caltech/c1/chans/ipc/C1.ipc).  However it will make understanding the models easier for anyone looking at them or modifying them.


The new naming convention for RFM and Dolphin channels is as follows.

SITE:Sending Model-Receiving Model_DESCRIPTION_HERE

The description should be unique to the data being transferred and reused if it's the same data.  Thus if it's transferred to another model, it's easy to identify it as the same information.

The model should be the .mdl file name, not the subsystem it's a part of.  So SCX is used instead of SUS.  This is to make it easier to track where data is going.

In the unlikely case of multiple models receiving, it should be of the form SITE:Sending Model-Receiving Model 1-Receiving Model 2_DESCRIPTION_HERE.  Separate models by dashes and description by underscores.
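The convention is simple enough to express as a small helper, which may be handy for checking names while editing models. This is only an illustration: the function name is hypothetical, and the names used in the test are example inputs rather than a list of the channels actually defined.

```python
def ipc_name(site, sender, receivers, description):
    """Build SITE:SENDER-RECEIVER[-RECEIVER...]_DESCRIPTION.

    Models are separated by dashes; the description is attached with an
    underscore and should itself use underscores between words.
    """
    models = "-".join([sender] + list(receivers))
    return f"{site}:{models}_{description}"
```

For instance, data sent from the LSC model to the RFM model with the description ETMX_LSC comes out as C1:LSC-RFM_ETMX_LSC.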



This channel goes from the LSC model (on c1lsc) to the RFM model (on c1sus).  It transfers ETMX LSC position feedback.  The second LSC may seem redundant until we look at the next channel in the chain.


This channel goes from the RFM model to the SCX model (on c1iscex). It contains the same information as the first channel, i.e. ETMX LSC position feedback.


I have updated all the models that had RFM and SHMEM connections, as well as adding all the LSC communication connections to c1rfm.  This includes c1sus, c1rfm, c1mcs, c1ioo, c1gcv, c1lsc, c1scx, c1scy.  I have not yet built all the models since I didn't finish the updates until this afternoon.  I will build and test the code tomorrow morning.




  4206   Wed Jan 26 10:58:48 2011   josephb   Update   CDS   Front End multiple crash

Looking at dmesg on c1lsc, it looks like the model is starting, but then eventually times out due to a long ADC wait. 

[  114.778001] c1lsc: cycle 45 time 23368; adcWait 14; write1 0; write2 0; longest write2 0
[  114.779001] c1lsc: ADC TIMEOUT 0 1717 53 181

I'm not sure what caused the time out, although there are about 20 messages indicating a failed time stamp read from c1sus (it's sending TRX information to c1lsc via the dolphin connection) before the time out.

Not seeing any other obvious error messages, I killed the dead c1lsc model by typing:

sudo rmmod c1lscfe

I then tried starting just the front end model again by going to the /opt/rtcds/caltech/c1/target/c1lsc/bin/ directory and typing:

sudo insmod c1lscfe.ko

This started up just the FE again (I didn't use the restart script because the EPICS processes were running fine since we had non-white channels).  At the moment, c1lsc is now running and I see green lights and 0x0 for FB0 status on the C1LSC_GDS_TP screen.

At this point I'm not sure what caused the timeout.  I'll be adding some more troubleshooting steps to the wiki though.  Also, c1scx and c1scy are probably in need of a restart to get them properly sync'd to the framebuilder.

I did a quick test on dataviewer and can see LSC channels such as C1:LSC-TRX_IN1, as well as other channels on C1SUS such as BS sensor channels.



  • Rebooted c1lsc and c1sus. Restarted fb many times.
  • c1sus seems working.
  • All of the suspensions are damped / Xarm is locked by the green
  • Thermal control for the green is working
  • c1lsc is frozen
  • FB status: c1lsc 0x4000, c1scx/c1scy 0x2bad
  • dataviewer not working 


  4208   Wed Jan 26 12:04:31 2011 josephbUpdateCDSExplanation of why c1sus and c1lsc models crash when the other one goes down

So apparently with the current Dolphin drivers, when one of the nodes goes down (say c1lsc), it causes all the other nodes to freeze for up to 20 seconds.

This 20 seconds can force a model to go over the 60 microsecond limit and is long enough to force the FE to time out.  Alex and Rolf have been working with the vendors to get this problem fixed, as having all your front ends go down because you rebooted a single computer is bad.

[40184.120912] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[40184.120914] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782
[44472.627835] c1mcs: ADC TIMEOUT 0 7718 38 7782
[44472.627849] c1sus: ADC TIMEOUT 0 7718 38 7782
[44472.644677] c1rfm: cycle 1945 time 17872; adcWait 15; write1 0; write2 0; longest write2 0
[44472.644682] c1x02: cycle 7782 time 17849; adcWait 12; write1 0; write2 0; longest write2 0
[44472.646898] c1rfm: ADC TIMEOUT 0 8133 5 7941
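
A quick way to pull the timed-out models out of a dmesg dump is sketched below. This is only an illustration, assuming the log lines keep the "[time] model: ADC TIMEOUT ..." layout seen above; the function name find_adc_timeouts is made up for this example.

```python
import re

# Matches lines such as "[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782".
TIMEOUT_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\w+): ADC TIMEOUT")


def find_adc_timeouts(dmesg_text):
    """Return (timestamp_seconds, model_name) for every ADC TIMEOUT line."""
    hits = []
    for line in dmesg_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# Sample text copied from the dmesg excerpt above.
SAMPLE = """\
[40184.120914] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782
[44472.627835] c1mcs: ADC TIMEOUT 0 7718 38 7782
[44472.627849] c1sus: ADC TIMEOUT 0 7718 38 7782
[44472.646898] c1rfm: ADC TIMEOUT 0 8133 5 7941
"""

for stamp, model in find_adc_timeouts(SAMPLE):
    print(f"{model} timed out at {stamp:.6f} s")
```

Feeding it the output of dmesg from a front end should list every model that needs to be restarted by hand.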

The solution for the moment is either to start the computers at exactly the same time, so that Dolphin is up before the front ends, or to restart the models by hand once the computer is up and Dolphin is running, after the models have timed out.  The restart by hand is done with:

sudo rmmod c1SYSfe

sudo insmod /opt/rtcds/caltech/c1/target/c1SYS/bin/c1SYSfe.ko
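
The rmmod/insmod pair above can be generated for any model name with a small helper. This is a sketch only: restart_commands and restart_model are hypothetical names, and the only thing taken from the entry is the module name pattern (c1SYSfe) and the target path.

```python
import subprocess

TARGET_ROOT = "/opt/rtcds/caltech/c1/target"


def restart_commands(model):
    """Build the rmmod/insmod pair for one front-end model, e.g. 'c1lsc'."""
    module = f"{model}fe"
    return [
        ["sudo", "rmmod", module],
        ["sudo", "insmod", f"{TARGET_ROOT}/{model}/bin/{module}.ko"],
    ]


def restart_model(model, dry_run=True):
    """Print the commands, and run them only when dry_run is False."""
    for cmd in restart_commands(model):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: prints the two commands without executing anything.
restart_model("c1lsc")
```

Keeping dry_run=True as the default means nothing is unloaded by accident while checking the paths.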


Alex and Rolf have been working with the vendors to get this fixed, and we may simply need to update our Dolphin drivers.  I'm trying to get in contact with them and see if this is the case.

  4212   Thu Jan 27 15:16:43 2011 josephbUpdateCDSUpdated generate_master_screens.py

I modified the generate_master_screens.py script in /opt/rtcds/caltech/c1/medm/master/ to handle changing the MCL (and MC_L) listings to ALS for the two ETM suspension screens and associated sub-screens.

The relevant added code is:

custom_optic_channels = ['ETMX',


for index in range(len(custom_optic_channels)//2):
   if optic == custom_optic_channels[index*2]:
     for swap in custom_optic_channels[index*2+1]:
       sed_command = start_sed_string + swap + "/" + custom_optic_channels[index*2+1][swap] + middle_sed_string + optic + file

When run, it generates the correctly named C1:SUS-ETMX_ALS channels, and replaces MCL and MC_L with ALS in the matrix screens.
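
For reference, the lookup logic can be sketched as below. The list contents here are a hypothetical stand-in (the actual list is not reproduced in this entry), the sed command layout is an assumption, and the .adl file name is made up; only the "optic name followed by a dict of swaps" structure and the MCL/MC_L to ALS renaming come from the description above.

```python
# Hypothetical stand-in for the flat list in the script: an optic name followed
# by a dict of {old_name: new_name}. The real entries are not shown in this entry.
custom_optic_channels = [
    "ETMX", {"MCL": "ALS", "MC_L": "ALS"},
    "ETMY", {"MCL": "ALS", "MC_L": "ALS"},
]


def swaps_for(optic, table=custom_optic_channels):
    """Return the swap dict for an optic, or an empty dict if it has none."""
    for name, swaps in zip(table[0::2], table[1::2]):
        if name == optic:
            return swaps
    return {}


def sed_commands(optic, screen_file):
    """Build one sed command per swap (the command layout is an assumption)."""
    return [
        f"sed -i 's/{old}/{new}/g' {screen_file}"
        for old, new in swaps_for(optic).items()
    ]


# "C1SUS_ETMX.adl" is a made-up file name for illustration.
for command in sed_commands("ETMX", "C1SUS_ETMX.adl"):
    print(command)
```

Pairing the list with zip over the even and odd slices avoids the index arithmetic, and optics without an entry simply get no sed commands.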


  4219   Fri Jan 28 11:08:44 2011 josephbUpdateGreen Lockingno transmission of ALS signals

As you've correctly noted, the source of the C1:GCV-SCX_ETMX_ALS channels is in the c1gcv model. The first 3 letters of the channel name indicate this (GCV).

The destination of this channel is c1scx, the 2nd 3 letters indicate this (SCX). If it passed through the c1rfm model, it would be written like C1:GCV-RFM_ETMX_ALS.

This particular channel doesn't pass through the c1rfm model, because the computers these two run on (c1ioo and c1scx) are directly connected via our old VMIC 5565 RFM cards, and don't need to pass through the c1sus computer. This is in contrast to all communications going to or from the c1lsc machine, since that is only connected to the c1sus machine by the Dolphin RFM. The c1rfm also handles a bunch of RFM reads from the mode cleaner WFS, since each eats up 3-4 microseconds and I didn't want to slow the c1mcs model by 24 microseconds (and ~50 microseconds before the c1sus/c1scx computer switch).

So basically c1rfm is only used for LSC communications and for some RFM reads for local suspensions on c1sus.

As for the reason we have no transmission, that looks to be a problem on c1ioo's end. I'm also noticing that MCL is not updating on the MC2 suspension screen as well as no changes to MC PIT and YAW channels, which suggests we're not transmitting properly.

I rebooted the c1ioo machine and then did a burt restore of the c1ioo and c1gcv models. These are now up and running, and I'm seeing both MCL and ALS data being transmitted now.

It's possible that when we were working on the c1gfd (green frequency divider model) on the c1ioo machine we disturbed the RFM communication somehow. Although what exactly, I'm not sure.


No signal is transmitted from C1:GCV-SCX_ETMX_ALS (on c1gcv) to C1:GCV-SCX_ETMX_ALS (on c1scx)

I can't find RFM definition for ALS channels in c1rfm. Where are they???


  4220   Fri Jan 28 12:15:58 2011 josephbUpdateCDSUpdating conlog channel list/ working on "HealthCheck" script

I've updated the scan_adls script (currently located in /cvs/cds/caltech/conlog/bin) to look at the new location of our medm screens.  I made a backup of the old conlog channel list as /cvs/cds/caltech/conlog/data/conlog_channels.old-2011-01-28.

I then ran the update_chanlist script in the same directory, which calls the scan_adl script.  After about 5 minutes it finished updating the channel list.  I restarted the conlogger just to be sure, and checked that our new model channels showed up in the conlog (which they do).

I have added a cron job to the op340m crontab to run the update_conlog script once a day at 7am.

Next, I'm working on a HealthCheck script which looks at the conlog channel list, checks whether channels are actually changing over short time scales, and then reports possibly non-functioning channels back to the user.
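
The core comparison for such a check could look like the sketch below. This is not the actual HealthCheck script: find_static_channels is a hypothetical name, the values are invented, and a real version would read every channel in the conlog channel list twice, a short time apart.

```python
def find_static_channels(first, second, tolerance=0.0):
    """Return channels whose value moved by no more than `tolerance`
    between two snapshots, plus channels missing from the second snapshot."""
    suspects = []
    for name, old in sorted(first.items()):
        new = second.get(name)
        if new is None or abs(new - old) <= tolerance:
            suspects.append(name)
    return suspects


# Invented snapshot values, for illustration only.
before = {"C1:LSC-TRX_IN1": 0.412, "C1:GCV-GREEN_TRX": 1.250}
after = {"C1:LSC-TRX_IN1": 0.398, "C1:GCV-GREEN_TRX": 1.250}

print("Possibly non-functioning:", find_static_channels(before, after))
```

A non-zero tolerance would help with channels that only carry ADC noise, at the cost of hiding slowly drifting signals.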

  4231   Mon Jan 31 10:31:30 2011 josephbUpdateWienerFilteringImprovement in H1 Wiener FF prediction by using weights and taps

Rossa is a rather beefy machine. It effectively has 8 Intel i7 cores (2.67 GHz each) and 12 Gigs of RAM.  Megatron only has 8 Gigs of RAM and just 8 Opterons (1 GHz each).  Rosalba has 4 Quad Core2 (2.4 GHz) with only 4 Gigs of RAM.

  4241   Wed Feb 2 15:07:20 2011 josephbUpdateCDSactivateDAQ.py now includes PEM channels

[Joe, Jenne]

We modified the activateDAQ.py script to handle the C1PEM.ini file (defining the PEM channels being recorded by the frame builder) in addition to all the optics channels.  Jenne will be modifying it further so as to rename more channels.

  4246   Thu Feb 3 16:45:28 2011 josephbUpdateCDSGeneral CDS updates

Updated the FILTER.adl file to have the yellow button moved up, and replaced the symbol in the upper right with a white A with black background.  I made a backup of the filter file called FILTER_BAK.adl.  These are located in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util.

I also modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ to make the startc1SYS scripts it makes take in an argument.  If you type in:

sudo startc1SYS 1

it automatically writes 1 to the BURT RESTORE channel so you don't have to open the GDS_TP screen and by hand put a 1 in the box before the model times out.

The scripts also point to the correct burtwb and burtrb files, so they should stop complaining about not finding them, and they now put a time stamped burt snapshot in the /tmp directory when the kill or start scripts are run.  The Makefile was also backed up to Makefile_bak.


  4247   Thu Feb 3 17:25:03 2011 josephbUpdateComputersrsync script was not really backing up /cvs/cds

So today, after an "rm" error while working with the autoburt.pl script and burt restores in general, I asked Dan Kozak how to actually look at the backup data.  He said there's no way to actually look at it at the moment.  You can reverse the rsync command or ask him to grab the data/file if you know what you want.  However, in the course of this, we realized there was no /cvs/cds data backup.

Turns out, the rsync command line in the script had a "-n" option.  This means do a dry run.  Everything *but* the actual final copying.
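
A simple guard against this kind of mistake is to scan the command line for the dry-run flag before trusting a backup script. The sketch below is a heuristic, not part of the actual backup script: rsync_is_dry_run is a hypothetical helper, and it only knows that -n and --dry-run are the two spellings of the rsync dry-run option and that short options can be bundled.

```python
import shlex


def rsync_is_dry_run(command_line):
    """True if an rsync command line carries -n or --dry-run.

    Heuristic: any bundled short-option token containing 'n' counts,
    so unusual options with attached arguments could give false alarms.
    """
    tokens = shlex.split(command_line)
    if not tokens or not tokens[0].endswith("rsync"):
        return False
    for token in tokens[1:]:
        if token == "--dry-run":
            return True
        # Short options can be bundled, e.g. "-avn".
        if token.startswith("-") and not token.startswith("--") and "n" in token[1:]:
            return True
    return False


# Made-up command lines; "backuphost:/backup/" is a placeholder destination.
print(rsync_is_dry_run("rsync -avn /cvs/cds/ backuphost:/backup/"))
print(rsync_is_dry_run("rsync -av /cvs/cds/ backuphost:/backup/"))
```

Running this over each rsync line of a script before installing it in a crontab would have flagged the problem described above.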

I have removed the -n from the script and started it on nodus, so we're backing up as of 5:22pm today.

I'm thinking we should have a better way of viewing the backup data, so I may ask Dan and Stewart about a better setup where we can login and actually look at the backed up files.

In addition, tomorrow I'm planning to add cron jobs which will put changes to files in the /chans and /scripts directories into the SVN on a daily basis, since the backup procedure doesn't really provide a history for those, just a 1 day back backup.

  4249   Fri Feb 4 13:31:16 2011 josephbUpdateCDSFE start scripts moved to scripts/FE/ from scripts/

All start and kill scripts for the front end models have been moved into the FE directory under scripts:  /opt/rtcds/caltech/c1/scripts/FE/.  I modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ to update and place new scripts in that directory. 

This was done by using

sed -i 's[scripts/start$${system}[scripts/FE/start$${system}[g' Makefile

sed -i 's[scripts/kill$${system}[scripts/FE/kill$${system}[g' Makefile
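
The effect of the two sed substitutions on a single Makefile line can be reproduced as below. This is only an illustration: move_to_fe is a hypothetical name and the sample line is a made-up fragment, not an actual line from the Makefile.

```python
def move_to_fe(line):
    """Apply both substitutions from the sed commands above to one line."""
    for kind in ("start", "kill"):
        line = line.replace(f"scripts/{kind}$${{system}}",
                            f"scripts/FE/{kind}$${{system}}")
    return line


# Made-up Makefile fragment, for illustration only.
sample = "scripts/start$${system}"
print(move_to_fe(sample))
```

Because the replaced text no longer contains the original pattern, applying the substitution a second time leaves the line unchanged.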

  4250   Fri Feb 4 13:45:25 2011 josephbUpdateComputersTemporarily removed cronjob for rsync.backup

I removed the rsync backup from nodus' crontab temporarily so as to not have multiple backup jobs running.  The job I started from yesterday was still running.  Hopefully the backup will finish by Monday.

The line I removed was:

0 5 * * * /opt/rtcds/caltech/c1/scripts/backup/rsync.backup

  4251   Fri Feb 4 15:03:20 2011 josephbUpdateComputersModified cshrc.40m

Removed some lines from the PATH environment variable since they point to old codes which no longer work with the new frame builder and setup.

The change was:


  4256   Mon Feb 7 10:37:28 2011 josephbUpdateComputersTemporarily removed cronjob for rsync.backup

The backup appears to have finished on nodus, and I've put the rsync job back in the crontab.


I removed the rsync backup from nodus' crontab temporarily so as to not have multiple backup jobs running.  The job I started from yesterday was still running.  Hopefully the backup will finish by Monday.

The line I removed was:

0 5 * * * /opt/rtcds/caltech/c1/scripts/backup/rsync.backup

  4262   Tue Feb 8 16:04:58 2011 josephbUpdateCDSHard coded decimation filters need to be fixed

[Joe, Rana]

Filter definitions for the decimation filters to epics readback channels (like _OUT16) can be found in the fm10Gen.c code (in /opt/rtcds/caltech/c1/core/advLigoRTS/src/include/drv).

At the moment, the code is broken for systems running at 32k, 64k as they look to be defaulting to the 16k filter.  I'd like to also figure out the notation and plot the actual filter used for the 16k.

Rana has suggested a 2nd order, 2 dB ripple, low pass Cheby1 filter at 1 Hz.


#if defined(SERVO16K) || defined(SERVOMIXED) || defined(SERVO32K) || defined(SERVO64K) || defined(SERVO128K) || defined(SERVO256K)
static double sixteenKAvgCoeff[9] = {1.9084759e-12,
                                     -1.99708675982420, 0.99709029700517, 2.00000005830747, 1.00000000739582,
                                     -1.99878510620232, 0.99879373895648, 1.99999994169253, 0.99999999260419};
#endif

#if defined(SERVO2K) || defined(SERVOMIXED) || defined(SERVO4K)
static double twoKAvgCoeff[9] = {7.705446e-9,
                                 -1.97673337437048, 0.97695747524900,  2.00000006227141,  1.00000000659235,
                                 -1.98984125831661,  0.99039139954634,  1.99999993772859,  0.99999999340765};
#endif

#ifdef SERVO16K
#define avgCoeff sixteenKAvgCoeff
#elif defined(SERVO32K) || defined(SERVO64K) || defined(SERVO128K) || defined(SERVO256K)
#define avgCoeff sixteenKAvgCoeff
#elif defined(SERVO2K)
#define avgCoeff twoKAvgCoeff
#elif defined(SERVO4K)
#define avgCoeff twoKAvgCoeff
#elif defined(SERVOMIXED)
#define filterModule(a,b,c,d) filterModuleRate(a,b,c,d,16384)
#elif defined(SERVO5HZ)
#else
#error need to define 2k or 16k or mixed
#endif
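
As a first step toward figuring out the notation, the coefficients above can be evaluated numerically. The sketch below ASSUMES the layout is [gain, then (a1, a2, b1, b2) for each second-order section], with a1, a2 in the denominator and b1, b2 in the numerator; this layout is a guess and has not been checked against fm10Gen.c. The function name magnitude is made up for this example.

```python
import cmath
import math

# Coefficients copied from the sixteenKAvgCoeff array in fm10Gen.c above.
SIXTEEN_K_AVG_COEFF = [
    1.9084759e-12,
    -1.99708675982420, 0.99709029700517, 2.00000005830747, 1.00000000739582,
    -1.99878510620232, 0.99879373895648, 1.99999994169253, 0.99999999260419,
]


def magnitude(coeffs, freq_hz, sample_rate_hz):
    """|H| at freq_hz, ASSUMING the layout [gain, (a1, a2, b1, b2) per section]
    with H(z) = gain * prod (1 + b1/z + b2/z^2) / (1 + a1/z + a2/z^2)."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    response = complex(coeffs[0])
    for i in range(1, len(coeffs), 4):
        a1, a2, b1, b2 = coeffs[i:i + 4]
        response *= (1 + b1 * z_inv + b2 * z_inv**2) / (1 + a1 * z_inv + a2 * z_inv**2)
    return abs(response)


# Same coefficients evaluated at the 16k rate and at a 64k rate.
for rate in (16384, 65536):
    for freq in (0.0, 1.0, 10.0, 100.0):
        mag = magnitude(SIXTEEN_K_AVG_COEFF, freq, rate)
        print(f"fs={rate:6d} Hz  f={freq:6.1f} Hz  |H|={mag:.3e}")
```

Under this assumed layout the DC gain comes out very close to 1, which is what one would expect for an averaging low pass and is at least consistent with the guess. Since a digital filter's response depends only on the ratio of frequency to sample rate, running the same coefficients in a 32k or 64k model moves the corner up by the same factor as the rate.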

  4265   Wed Feb 9 15:26:22 2011 josephbUpdateCDSUpdated c1scx with lockin, c1gcv for green transmission pd

Updated the c1scx model to have two Lockin demodulators (C1:SUS-ETMX_LOCKIN1 and C1:SUS-ETMX_LOCKIN2).  There is a matrix C1:SUS-ETMX_INMUX which directs signals to the inputs of LOCKIN1 and LOCKIN2.  Currently GREEN_TRX is the only signal going in to this matrix; the other 3 inputs are grounds.  The actual clocks themselves had to be at the top level (they don't work inside blocks) and are thus named C1:SCX-ETMX_LOCKIN1_OSC and C1:SCX-ETMX_LOCKIN2_OSC.


There is a signal (IPC name is C1:GCV-SCX_GREEN_TRX) going from the c1gcv model to the c1scx model, which will carry the output from Jenne's green transmission PD once it is installed. I've placed a filter bank on it in the c1gcv model as a monitor point, and it corresponds to C1:GCV-GREEN_TRX.


The suspension control screens were modified to have a screen for the Matrix feeding signals into the two lockin demodulators.  The green medm screen was also modified to have readbacks for the GREEN_TRX and GREEN_TRY channels.


So on the board, the top channel (labeled 1, corresponds to code ADC_0_0) is MCL.

Channel 2 (ADC_0_1) is assigned to frequency divided green signal.

Channel 3 (ADC_0_2) is assigned to the beat PD's DC output.

Channel 4 (ADC_0_3) is assigned to the green power transmission for the x-arm.

Channel 5 (ADC_0_4) is assigned to the green power transmission for the y-arm.
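
The off-by-one between the board labels and the code names is easy to trip over, so here is the mapping written out as a small sketch. The dictionary just restates the list above; adc_code_name is a hypothetical helper, and the card index 0 is taken from the ADC_0_x names.

```python
# Board label (1-based) -> signal, as listed above.
BOARD_CHANNELS = {
    1: "MCL",
    2: "frequency-divided green signal",
    3: "beat PD DC output",
    4: "green power transmission, x-arm",
    5: "green power transmission, y-arm",
}


def adc_code_name(board_label, adc_card=0):
    """Board labels count from 1; the code's ADC_<card>_<channel> counts from 0."""
    if board_label < 1:
        raise ValueError("board labels start at 1")
    return f"ADC_{adc_card}_{board_label - 1}"


for label, signal in BOARD_CHANNELS.items():
    print(f"Channel {label} ({adc_code_name(label)}): {signal}")
```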

  4270   Thu Feb 10 14:07:18 2011 josephbUpdateCDSUpdating dolphin drivers to eliminate timeouts when one dolphin card is shutdown


Alex came over and we installed the new Dolphin drivers so that the front ends using the Dolphin PCIe RFM network don't pause for a long time when one of the other nodes in the network goes down.  Generally this pause would cause the code to time out and quit.  Now you can take c1lsc or c1sus down without having the other have problems.

We did note on reboot, however, that the Dolphin_wait script sometimes (not always) seems to hang.  This script is run at boot up to ensure the Dolphin card has had enough time to allocate memory space for data to be written/read by the IOP process, so if it hangs, nothing else in the startup script gets run.  In this case, running "pkill dolphin_wait" may be necessary.

Note that you may still have problems if you hit the power button to force a shutdown (i.e. holding it for 4 seconds for immediate power off), but as long as you do a "reboot" or "shutdown -r now" type command, it should come down gracefully. 


What was done:

Alex grabbed the code from his server, and put it in /home/controls/DIS/ on fb.

He ran the following commands in that directory to build the code.

./configure  '--with-adapter=DX' '--prefix=/opt/DIS'


sudo make install

He proceeded to modify the /diskless/root/etc/rc.local to have the line:

insmod /lib/modules/

In that same file he commented out

cd /root


exec /bin/bash/

He then modified the run levels in /diskless/root/etc/inittab. Level 0, level 3, and level 6 were changed:




Then he created the scripts he was referring to:

rc.level3 is just:

exec /bin/bash

rc.halt is:

/opt/DIS/sbin/dxtool prepare-shutdown 0
sleep 3
halt -p

rc.reboot is:



Basically rc.halt calls a special code which prepares the Dolphin RFM card to shut down nicely.  This is why just hitting the power button for 4 seconds will cause problems for the rest of the Dolphin network.

We then checked out of svn the latest dolphin.c in  /opt/rtcds/caltech/c1/core/advLigoRTS/src/fe


The Dolphin RFM cards have a new numbering scheme.  4 is reserved for special  broadcasts to everyone, so the Dolphin node IDs now start at 8.  So we needed to change the c1lsc and c1sus Dolphin node IDs.

To change them we went to /etc/dis/dishosts.conf on the fb machine, and changed the following lines from:

ADAPTER:  c1sus_a0 4 0 4
ADAPTER:  c1lsc_a0 8 0 4

to:

ADAPTER:  c1sus_a0 8 0 4
ADAPTER:  c1lsc_a0 12 0 4
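
A quick sanity check for the new numbering scheme is sketched below. It ASSUMES the first number on an ADAPTER line is the node ID, which is the only field that changed in the edit above; the helper names are made up for this example and are not part of the Dolphin tools.

```python
FIRST_NODE_ID = 8  # node IDs now start here; 4 is reserved for broadcasts


def adapter_node_ids(conf_text):
    """Map adapter name -> node ID, ASSUMING the first number on an
    ADAPTER line is the node ID (the only field changed above)."""
    ids = {}
    for line in conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ADAPTER:":
            ids[fields[1]] = int(fields[2])
    return ids


def invalid_adapters(conf_text):
    """Names of adapters whose node ID is below the first allowed ID."""
    return sorted(name for name, node in adapter_node_ids(conf_text).items()
                  if node < FIRST_NODE_ID)


# The ADAPTER lines before and after the change described above.
OLD = "ADAPTER:  c1sus_a0 4 0 4\nADAPTER:  c1lsc_a0 8 0 4\n"
NEW = "ADAPTER:  c1sus_a0 8 0 4\nADAPTER:  c1lsc_a0 12 0 4\n"

print("old config, bad adapters:", invalid_adapters(OLD))
print("new config, bad adapters:", invalid_adapters(NEW))
```

This would be worth running whenever another machine gets added to the Dolphin network.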

The FE models for the c1lsc and c1sus machines were recompiled and then the computers were rebooted.  After having them come back up, we tested that there was no timeout by shutting down c1lsc and watching c1sus. We then reversed the test and shut down c1sus while watching c1lsc.  No problems occurred.  Currently they are up and communicating fine.


  4291   Mon Feb 14 18:27:39 2011 josephbUpdateCDSBegan updating to latest CDS svn, reverted to previous state

[Joe, Alex]

This morning I began the process of bringing our copy of the CDS code up to date with the version installed at Livingston. The motivation was to get fixes to various parts, the oscillator part among others.  This would allow cleaning up the front end model .mdl files, so that clk, sin, cos channels no longer have to be passed for every optic through 3 layers of simulink boxes.

I also began the process of using a similar startup method, which involved creating /etc/init.d/ start and stop scripts for the various processes which get run on the front ends, including awgtpman and mx_streams.  This allows the monitor software called monit to remotely restart those processes or provide a web page with a real time status of those processes.  A cleaner rc.local file utilizing sub-scripts was also adapted.

I did some testing of the new codes on c1iscey.  This testing showed a problem with the timing part of the code, with cycles going very long.  We think it has something to do with the code not accounting for the fact that we do not have IRIG-B timing cards in the IO chassis providing GPS time, which the sites do have.  We rely on the computer clock and ntpd.

At the moment, we've reverted to svn revision 2174 of the CDS code, and I've put the previously working version of the c1scy and c1x05 (running on the c1iscey computer) back. It's from the /opt/rtcds/caltech/c1/target/c1x05/c1x05_11014_163146 directory.  I've put the old rc.local file back in the /diskless/root/etc/ directory on the fb machine.  Currently running code on the other front end computers was not touched.

  4300   Tue Feb 15 11:56:17 2011 josephbUpdateCDSUpdated some DAQ channel names
That is my fault for not running the activateDAQ.py script after a round of rebuilds. I have run the script this morning, and confirmed that the oplev channels are showing up in dataviewer.


Although Joe and Kiwamu claim that they have inserted the correct DAQ names for the OPLEVs (e.g. PERROR and YERROR) back in Jan. 11, when I look today, I see that these channels are missing!

I want my PERROR/YERRORs back!



  4302   Tue Feb 15 15:06:25 2011 josephbUpdateCDSCDS todo list for tomorrow morning

Currently, there is a test directory called /opt/rtcds/caltech/c1/new_core where we have the latest svn checkout.  Tomorrow (after everything works), it will become the core directory.

1) Modify on the fb machine the /diskless/root/etc/ld.so.cache file.  This is done by logging into fb, going to /etc/ld.so.conf.d/, modifying epics-x86_64.conf to only have .10 stuff , and running sudo /sbin/ldconfig.  Copy the newly generated /etc/ld.so.cache file to /diskless/root/etc/.

2) Modify the rc.local file on the fb machine in /diskless/root/etc/ to take advantage of the new subscripts and init.d/ start scripts.

3) Add the no_rfm_dma to all the iop models (c1x01,c1x02,c1x03,c1x04,c1x05).

4) Rebuild all front end models with new code.  Install.

5) Build awgtpman and mx_streams with new code.

6) Rerun activateDAQ.py (to fix channel names from all the rebuilt code).

7) Double check Burt request files have the switch fix.

8) Restart the front ends.

9) Restart the frame builder.

10) Check channels, excitations, RFM connections.

11) Check Monit is working.

