  847   Mon Aug 18 15:32:18 2008   josephb   Configuration   Cameras   How to multicast with gstreamer and Gige Cameras
In order to get multicasting to work, one simply needs to understand the address scheme.

In general, the address range - are reserved for multicasting. Within this address space, there are some base level operations in the 224.0.0.x range which shouldn't be interfered with.

For a single site, the address range between and is probably best.

Gstreamer and the current 40m network hubs are designed to handle this kind of communication already, so one merely needs to point them at the correct addresses.
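
As a rough aid for picking an address, here is a minimal Python sketch that checks whether an address is IPv4 multicast and whether it falls in the 224.0.0.x block mentioned above. The function name classify and the example addresses are made up for illustration; only the standard ipaddress module is used.

```python
import ipaddress

# 224.0.0.x is the block used for base level network operations,
# so it should be left alone.
LOCAL_CONTROL_BLOCK = ipaddress.ip_network("224.0.0.0/24")


def classify(address):
    """Return a short description of how an address relates to multicast."""
    ip = ipaddress.ip_address(address)
    if not ip.is_multicast:
        return "not multicast"
    if ip in LOCAL_CONTROL_BLOCK:
        return "reserved (224.0.0.x)"
    return "usable multicast"


if __name__ == "__main__":
    # Example addresses only; they are not the ones used at the 40m.
    for candidate in ("224.0.0.5", "239.1.2.3", "192.0.2.1"):
        print(candidate, "->", classify(candidate))
```

This only checks the address class; it says nothing about which addresses are already in use on a given site.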

While in /cvs/cds/caltech/target/Prosilica/40mCode/SnapCode type:

CamServe -F 'Mono8' -c 44058 -E 20000 -X 0 -Y 0 -H 480 -W 752 -l 0 -m 300 | gst-launch-0.10 fdsrc fd=0 blocksize=360960 ! video/x-raw-gray, height=480, width=752, bpp=8,depth=8,framerate=60/1 ! ffmpegcolorspace ! queue ! smokeenc keyframe=8 qmax=40 ! udpsink host= port=5000

This will multicast to the address, using port 5000.

On the machine you wish to subscribe type:

gst-launch udpsrc multicast-group= port=5000 ! smokedec ! ffmpegcolorspace ! ximagesink sync=false
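
The blocksize=360960 in the CamServe pipeline above corresponds to exactly one Mono8 frame (752 x 480 pixels at 8 bits per pixel). A small Python sketch of that arithmetic, with frame_bytes being a made-up helper name, in case the frame size or pixel format is changed:

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bits_per_pixel // 8


# Values taken from the CamServe / gst-launch command above.
blocksize = frame_bytes(width=752, height=480, bits_per_pixel=8)
print(blocksize)
```

If the -H, -W or pixel format options are changed, the blocksize, height, width, bpp and depth values in the gst-launch caps presumably need to change to match.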
  852   Tue Aug 19 13:34:58 2008   josephb   Configuration   Computers   Switched c1pem1, c0daqawg, c0daqctrl over to new switches
Moved the Ethernet connections for c1pem1, c0daqawg, and c0daqctrl over to the Netgear Prosafe switch in 1Y6, using new cat6 cables.
  897   Fri Aug 29 11:01:49 2008   josephb   Configuration   Computers   Attempt to change a channel gain in ICS-110B
As noted earlier by Rana, I was playing around with the /cvs/cds/caltech/chans/daq/C1IOOF.ini file with help from Rob. I had made a backup beforehand and saved it as C1IOOF.ini.Aug-28-2008. (I have since been informed that C1IOOF.ini.082808 would have been preferred as a name.)

We had been trying to increase the gain of the C1:PSL-ISS_INMONPD_F channel in order to do a very low power PMC sweep, in an attempt to get clean modes for fitting. Initially we pressed the reconfig button on the C0DAQ_DETAIL screen, but all that seemed to do was change the Config File CRC. We proceeded to reboot fb40m remotely. However, any change to the ini file (even an extra space at the end of the file) caused a 0x2000 status for C1IOVME16k on the C0DAQ_DETAIL screen. At the time I presumed it was comparing the CRC of the ini-file to something else.

Digging around in Alex's webspace at http://www.ligo.caltech.edu/~aivanov/ , I found the NDS Access page, which indicated that 0x2000 was a conflict between the front-end and frame builder .ini files.

"There is also status bit 0x2000 which gets added when the DCU configuration is different in front-end and frame builder. That is you can change and .ini file an then reload DAQ configuration with Epics button, which reconfigures the front-end, but leaves frame builders with invalid old configuration. They will detect this change and set the status to 0x2000 to indicate this condition. You will have to restart frame builders to pick up new .ini file and set status back to zero for the affected DCU."
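
Since the quote says 0x2000 "gets added" to the status, it is presumably a bit flag that can be combined with other status bits. A minimal Python sketch of testing for it, where the constant and function names are invented for illustration and the meaning of the other bits is not assumed:

```python
# Bit described on the NDS Access page: DCU configuration differs
# between the front-end and the frame builder.
CONFIG_MISMATCH = 0x2000


def has_config_mismatch(status):
    """True if the 0x2000 bit is set in a DCU status word."""
    return bool(status & CONFIG_MISMATCH)


if __name__ == "__main__":
    for status in (0x0, 0x2000, 0x2001):
        print(hex(status), has_config_mismatch(status))
```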

It was when I was going to try resetting the c1iovme via the C0DAQ_RFMNETWORK medm screen that we realized the EPICS controls were not responding properly. The .ini file was returned to its original form, and mass reboots commenced.
  898   Fri Aug 29 11:05:11 2008   josephb   Summary   Computers   c1asc was down this morning
I had to manually reboot c1asc this morning, as for some unknown reason its status was red, and the fiber lights on the board were status:red, sig det:amber, own data: nothing. Shut the crate down, turned it back on, heard a beep, then followed wiki reboot instructions. Seems to be working now.
  900   Fri Aug 29 12:43:44 2008   josephb   Summary   Computers   c1susvme1 down
Around noon today, c1susvme was having problems. The C0DAQ_RFMNETWORK light was red. The status light was off, the sig det light was amber and the own data light was green. I could also ssh in, but could not run startup. I switched off the watchdogs for c1susvme2 (the watchdogs for c1susvme1 had already been tripped), and manually power cycled the crate.

However, when c1susvme1 came back up it had not mounted the usual /cvs/cds/ directories. c1susvme2 did, however. c1susvme1 has been on the new network for a while, while c1susvme2 was switched over today. So apparently switching networks doesn't help this particular problem.

I did a remote reboot of c1susvme1, and it came up with the correct files mounted. Both machines ran their appropriate startup.cmd files and are currently green.
  941   Thu Sep 11 11:29:14 2008   josephb   Configuration   Computers   Final netgear switch in place in 1Y2
I've placed the final (of 4) Netgear prosafe 24 port switch at the very top of 1Y2. At that location, there are no holes left to screw into, so it has 4 rubber feet and is sitting on the top most signal generator. It has been plugged in and connected to the control room hub with a labeled cat6 ethernet cable.

Its IP address has been set to, and has the usual controls password if using the "Smart Wizard Discovery Tool" which comes on the Netgear CD. The CD can be found in the Equipment manuals filing cabinet under Netgear. This program unfortunately only runs on a Windows PC.

To Do: Fix the C1:ASC ethernet connection which is currently coming straight out the front door and connected to the 1X4 switch (again through the front door).
  948   Mon Sep 15 14:00:52 2008   josephb   Configuration   Computers   1Y9 Hub and C1asc
The 1Y9 switch is now using a labeled Cat6 cable in cable trays to connect to the main switch in the offices. In addition, the c1asc cable which had been coming out the door was fixed last Friday, and is now labeled, going out the top and connects to the hub in 1Y2.

Note: Do not connect new ethernet cable from switch to switch without disconnecting the old cable to the rest of the network - this tends to make the Ethernet network unhappy with white flashing alarms.
  965   Thu Sep 18 14:36:54 2008   josephb   Configuration   Computers   Name server and Epics
The problems Rob was experiencing last night were due to part of the setup (or rather testing of the setup) of the new nameserver running on linux1.

The name server was set up on linux1 by doing the following:

1) Installed xorg-x11-xauth via yum which was necessary to get remote x windows to work in linux1

2) Installed xorg-x11-fonts-Type1 in order to get the gui system-config-* programs to work

3) Ran system-config-bind, which created a default set of nameserver files. I unfortunately didn't understand the gui all that well, so I manually edited and added files to these base ones. The base files were generated in /var/named/chroot/etc/ and /var/named/chroot/var/named.

4) I added martian.zone and 113.215.131.in-addr.arpa.zone, named.conf.local, and edited named.conf so it loaded named.conf.local. The martian.zone file acts as a forward lookup (i.e. give it a name and it returns an IP number). The 113.215.131.in-addr.arpa.zone file acts as a reverse lookup (i.e. give it an IP number and it tells you the name). The file named.conf.local merely points to these two files.

Note: One can add or change IP lookup by simply updating these two files. The format should be obvious from the files.

5) I specifically ssh'd in as root to linux1 (using su wasn't sufficient) and then typed "service named start" (without quotes). You can also use "restart" or "stop" instead of "start". This started the name server, giving an [Ok] message.

6) I edited the /etc/resolv.conf file on linux1 so that it pointed to itself first ("nameserver" at the top of the file). I also added the line "search martian", which allows one to simply use linux1 as opposed to linux1.martian.

I also edited the /etc/resolv.conf file on linux2, and it seems to resolve names fine.

7) And here is where I broke things. As a test, I moved /etc/hosts to /etc/hosts.bak, and then tested to see if names were being resolved correctly. By using the command host, I determined they were in fact working. I also tested with ssh.

However, something basic didn't like me moving the hosts file. Apparently when a front-end machine needed to reboot, it wouldn't come back up, and there was no way to SSH or telnet into it.

Yoichi and I did quite a bit of debugging this morning and determined the nameserver itself isn't conflicting; the lack of the hosts file alone was the source of the problem. One theory is that services don't know to go to DNS to resolve host names. I think by modifying the /etc/nsswitch.conf file to include dns as an option for services and other programs, it might work without the hosts file. However, I'm going to leave that to tomorrow morning, which is less likely to interfere with current operations.

As it stands, things are working with the nameserver running and the hosts file in place.
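
To illustrate how the reverse zone file name in step 4 relates to an IP number, here is a short Python sketch using the standard ipaddress module. The network 131.215.113.x is inferred from the zone file name; the host number 5 and the function names are made up for the example.

```python
import ipaddress


def reverse_name(address):
    """Name that a reverse (IP to name) lookup asks the name server about."""
    return ipaddress.ip_address(address).reverse_pointer


def reverse_zone(address):
    """Reverse zone holding the record, assuming one zone per /24 network."""
    # Drop the first label (the host number) to get the zone name.
    return reverse_name(address).split(".", 1)[1]


if __name__ == "__main__":
    example = "131.215.113.5"  # hypothetical host number
    print(reverse_name(example))
    print(reverse_zone(example))
```

The second value is the zone served from 113.215.131.in-addr.arpa.zone, which is why that file has the octets in reversed order.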
  973   Fri Sep 19 11:21:45 2008   josephb   Configuration   Computers   Nameserver and Rosalba
I tried modifying the nsswitch.conf file to include going to dns in addition to local files for everything (services, network, etc) and then moving the /etc/hosts file to /etc/hosts.bak. Unfortunately, this still didn't allow front-ends to reboot properly. So I'm not sure what is using the hosts file, but whatever it is, is apparently important. After the test I placed the hosts file back and reverted the nsswitch.conf file.

I also noticed that Rosalba was having problems connecting to the network. This apparently was because I had shut down the dhcp server on the NAT router, as had been discussed at the meeting on Wednesday.

To fix this, I modified the /etc/sysconfig/network-scripts/ifcfg-eth1 file to fix rosalba's ip as (which doesn't seem to be in use). I also updated rosalba's /etc/resolv.conf file to point at linux1's name server, and two additional name servers as well, and added the "search martian" line. I modified the /etc/sysconfig/network-scripts/ifcfg-eth0 file so the built in network card doesn't come up automatically, since it's currently not plugged into anything. Lastly, I added rosalba and its IP to linux1's name server files.
  992   Thu Sep 25 14:03:08 2008   josephb   Configuration   Computers


We (Joe) need to buy a GigE card for linux1 and to also set up conlog and conlogger to run on Nodus.

A spare Intel Pro 1000/GT desktop adapter (gigabit ethernet card) has been added to Linux1 and is now using that card to connect to the network.

This was after a slight scare when I somehow reset the bios on Linux1 during the first reboot after adding the card.
After some debugging and discussion with Yoichi, the bios was fixed and the computer works again, with its new faster network connection.

We both noted, though, that Linux1 is a rather old machine, with only half a gig of RAM and its 58 gigabyte hard drive (raid) at about 80% capacity. It might be worth upgrading in general.

Need to figure out how to install conlog/conlogger programs next...
  1006   Mon Sep 29 13:33:39 2008   josephb   Configuration   Computers   Gigabit network finished and conlog available on Nodus
The last 100 Mb unmounted hub has been removed (or at least of the ones I could find). We should be on a fully gigabit network with Cat6 cables and lots and lots of labels.

In other news, the Perl script that runs the web interface on linux1 for the conlog has been copied to /cvs/cds/caltech/apache/cgi-bin/ and is now being pointed to by the apache server on Nodus.

  1033   Wed Oct 8 12:35:56 2008   josephb   Configuration   Computers   New Network diagram for the 40m
Attached is a pdf of the new network diagram for the 40m after having removed all of the old hubs.
Attachment 1: 40m_network_10-07-08.pdf
  1067   Wed Oct 22 12:37:47 2008   josephb   Update   Computers   Network spreadsheet
Attached in open office format as well as excel format is a spreadsheet containing all the devices with IP addresses at the 40m. Please contact me with any corrections.
Attachment 1: 40m_network_10-15-08.ods
Attachment 2: 40m_network_10-15-08.xls
  1098   Tue Oct 28 12:01:01 2008   josephb   Configuration   Computers   linux2

I have removed linux2 and its cables from the control room and put it into 1Y3 along with op340m.

When Joe next comes in we can ask him to Cat6 it to the rest of the world, although it already
seems to me that the CDS hub/switch next to Alberto's desk is too full and that we need to purchase
a 48 port device for there.

Note I still need to remove a fair bit of cabling no longer in use from the Martian network switch next to Alberto's desk. There's actually about 8-10 cables there which show no connectivity and are not being used. So there's really about 33% of the ports open in the control room hub, it just doesn't look like it.

As for linux2, I'll probably just connect it to the 1Y2 or 1Y6 Hubs when I get the chance.
  1175   Thu Dec 4 16:29:20 2008   josephb   Configuration   Computers   Error message on Frame Builder Raid Array
The Fibrenetix FX-606-U4 RAID connected to the frame builder in 1Y7 is showing the following error message: IDE Channel #4 Error Reading
  1253   Mon Jan 26 14:51:54 2009   josephb   Configuration   VAC
We need a new RS-232 to Ethernet bridge in order to interface properly with the new RGA. The RGA has a fixed baud rate of 28.8k, and the current bridge (which used to work with the old RGA) doesn't have that baud rate as an option. Currently looking into purchasing a new bridge, and trying to make sure it can meet the communications requirements of the RGA.
  1294   Wed Feb 11 15:01:47 2009   josephb   Configuration   Computers   Allegra

After having broken Allegra by updating the kernel, I was able to get it running again by copying the xorg.conf.backup file over xorg.conf in /etc/X11. So at this point in time, Allegra is running with generic video drivers, as opposed to the ATI specific and proprietary drivers.

  1333   Mon Feb 23 16:42:08 2009   josephb   Configuration   Cameras   Camera Beta Testing

I've setup the GC650 camera (ID 32223) to look at the mode cleaner transmission.  I've also added an alias to the camera server and client for this camera.

To use, type: "pserv1 &" on the machine you want to run the server on and "pcam1 &" on the machine you want to actually view the video. At the moment, this only works for the 64-bit Centos 5 machines, Rosalba, Allegra and Ottavia.

Note, you will generally want to start the client first (pcam1 or pcam2) to see if a server is already running somewhere. The server will complain that it can't connect to a camera if it already is in use.

I've set up the GC750 camera (ID 44026) to look at the rightmost analog quad TV. This can be run by using "pserv2 &" and "pcam2 &".

If the image stops playing you can try stopping and starting the server to see if it will start back up.

You can also try increasing or decreasing the exposure, to see if that helps.  The increase and decrease buttons change the exposure by a factor of 2 for each press.

Lastly, the button "Read Epic Channel" reads in the current value from the channel "C1:PEM-stacis_EEEX_geo" and uses it as the exposure value, in microseconds (in principle 10 to 1000000 should work).

For example, to expose for 10000 microseconds, use "ezcawrite C1:PEM-stacis_EEEX_geo 10000" and press the "Read Epic Channel" button.
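
The exposure behaviour described above (a factor of 2 per button press, and a nominal range of 10 to 1000000 microseconds) can be sketched in Python as follows. This is not the camera client's actual code; the function names are invented, and clamping to the range is an assumption about what the client should do rather than what it does.

```python
MIN_EXPOSURE_US = 10       # lower end of the range quoted above
MAX_EXPOSURE_US = 1000000  # upper end of the range quoted above


def clamp_exposure(exposure_us):
    """Keep an exposure value inside the nominal working range (assumed)."""
    return max(MIN_EXPOSURE_US, min(MAX_EXPOSURE_US, exposure_us))


def increase(exposure_us):
    """One press of the increase button: double the exposure."""
    return clamp_exposure(exposure_us * 2)


def decrease(exposure_us):
    """One press of the decrease button: halve the exposure."""
    return clamp_exposure(exposure_us // 2)


if __name__ == "__main__":
    exposure = 10000
    for _ in range(3):
        exposure = increase(exposure)
    print(exposure)
```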

  1355   Wed Mar 4 17:20:04 2009   josephb   Update   Cameras   Camera code upgrades

I've updated the digital camera python code as well as changed the network topology.

At the moment, both cameras are connected to a small gigabit switch which only talks to Ottavia. This means all camera servers must be run on Ottavia, although camera output is still UDP multicast so any machine capable of running gstreamer can pick up the images.

The server and client programs now have the ability to read a configuration file for the setup of the cameras. They default to pcameraSettings.ini, but this default can be changed with a -c or --config option.

For example, "serverV3.py --config pcam1.ini" will run the server using the pcam1.ini settings file.  Similarly, "client.py --config pcam1.ini" will also take the IP settings from the config file so that it knows at which port and IP to listen.

These programs and .ini files have been placed in /cvs/cds/caltech/apps/linux64/python/pcamera/

I've updated the cshrc.40m aliases so that it uses the new configuration file options, so now pcam1 calls "client.py -c pcam1.ini" in the above directory.

So to start a client use pcam1 or pcam2 (for the 32223 camera in PSL looking at MC trans or 44026 looking at an analog monitor in the control room respectively). These can be run on Allegra, Rosalba or Ottavia at the moment.

To start a server, use pserv1 or pserv2.  These *must* be run on Ottavia.

I've also added a -n or --no-gui option at Yoichi's request, one which just starts up and plays, with no graphical gui.
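
For anyone writing a similar script, here is a minimal Python sketch of handling the -c/--config and -n/--no-gui options and reading the IP and port from an .ini file. It is not the code in serverV3.py or client.py; the section name "network" and the keys "ip" and "port" are hypothetical, since the layout of pcam1.ini is not shown here.

```python
import argparse
import configparser


def parse_args(argv):
    """Parse the options described above from a list of arguments."""
    parser = argparse.ArgumentParser(description="camera client sketch")
    parser.add_argument("-c", "--config", default="pcameraSettings.ini",
                        help="settings file to read")
    parser.add_argument("-n", "--no-gui", action="store_true",
                        help="start playing without the graphical interface")
    return parser.parse_args(argv)


def read_network_settings(ini_text):
    """Return (ip, port) from ini-formatted text.

    The [network] section and its keys are assumptions for this sketch.
    """
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    section = config["network"]
    return section["ip"], section.getint("port")


if __name__ == "__main__":
    args = parse_args(["--config", "pcam1.ini", "-n"])
    print(args.config, args.no_gui)
```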

Lastly, I've made some changes to the base pcamerasrc.py file, which should make display more robust. After a failed transmission of an image from the camera to Ottavia, it should re-attempt up to 10 times before giving up. I'm hoping this will make it more robust against packet loss. The change in network topology has also helped this, allowing 640x480 to be transmitted on both cameras for tens of minutes before a packet loss causes a stop.
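
The retry behaviour can be sketched roughly as below. This is an illustration, not the pcamerasrc.py code: grab_frame stands for whatever call fetches an image, and signalling a failed transfer with IOError is an assumption.

```python
def grab_with_retries(grab_frame, max_retries=10):
    """Call grab_frame(), re-attempting up to max_retries times on failure."""
    last_error = None
    # One initial attempt plus up to max_retries re-attempts.
    for _ in range(max_retries + 1):
        try:
            return grab_frame()
        except IOError as error:
            last_error = error
    raise IOError("no frame after %d retries" % max_retries) from last_error


if __name__ == "__main__":
    failures = [IOError("lost packet"), IOError("lost packet")]

    def flaky_grab():
        # Fails twice, then succeeds; stands in for a real camera call.
        if failures:
            raise failures.pop()
        return "frame"

    print(grab_with_retries(flaky_grab))
```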

  1385   Wed Mar 11 11:30:15 2009   josephb   Configuration   Cameras

I modified the Video.db file used by c1aux located in  /cvs/cds/caltech/target/c1aux.

I added the following channels to the db file, intended for either read in or read out by the digital camera scripts.











A better naming scheme can probably be devised, but these will do for now.

  1496   Sun Apr 19 11:34:33 2009   josephb   HowTo   Cameras   USB Frame Grabber - How to

To use the Sensoray 2250 USB frame grabber:

Ensure you have the following packages installed: build-essential, libusb-dev

Download the Linux manual and linux SDK from the Sensoray website at:


Go to the Software and Manual tab near the bottom to find the links.  The software can also be found on the 40m computers at /cvs/cds/caltech/users/josephb/sensoray/

The files are Manual2250LinuxV120.pdf and s2250_v120.tar.gz

Run the following commands in the directory where you have the files.

tar -xvf s2250_v120.tar.gz

cd s2250_v120


cd ezloader


sudo make modules_install

cd ..

At this point plug in the 2250 frame grabber.

sudo modprobe s2250_ezloader

Now you can run the demo with

./sraydemo or ./sraydemo64

Options will show up on screen.  A simple set to start with is "encode 0", which sets the recording type, "recvid test.mpg", which starts the recording in file test.mpg, and "stop", which stops recording.  Note there is no on screen playback.  One needs an installed mpeg player to view the saved file, such as Totem (which can screen cap to .png format) or mplayer.

All these instructions are on the first few pages of the Manual2250LinuxV120 pdf.



  1497   Sun Apr 19 11:51:05 2009   josephb   Update   Cameras   Mafalda may need an update

I tried installing libusb-dev on mafalda in order to try getting the usb frame grabber to work on it, but could not as it could not download the package.

I then tried to do a sudo apt-get update, which failed completely, as the repository seems to have ceased existing.  Basically I had all 404 Not Found errors.

Turns out Mafalda is still running Ubuntu 7.04, whose support ended late 2008. So there are a couple of things that can be done:

1) Ignore it, and simply not update Mafalda anymore.  This also means some newer software and hardware simply won't work with it (like the usb frame grabber)

2) Try to find another, unofficial repository which still has all of the Ubuntu 7.04 packages.

3) Upgrade to a newer, still supported Ubuntu, such as 7.10, 8.04, or 8.10.

I'd personally lean towards the 3rd option, and go to the 8.04 long term support version.  If people agree with it, I could do the upgrade sometime Monday or Tuesday.



  1581   Wed May 13 12:41:14 2009   josephb   Update   Cameras   Timing and stability tests of GigE Camera code

At the request of people down at LLO I've been trying to work on the reliability and speed of the GigE camera code.  In my testing, after several hours, the code would tend to lock up on the camera end.  It was also reported at LLO after several minutes the camera display would slow down, but I haven't been able to replicate that problem.

I've recently added some additional error checking and have updated to a more recent SDK, which seems to help. Attached are two plots of the frames per second of the code. In this case, the frames per second are derived from the time between calls to the C camera code for a new frame for gstreamer to encode and transmit. The data points in the first graph are actually the averaged time for sets of 1000 frames. The camera was sending 640x480 pixel frames, with an exposure time of 0.01 seconds. Since the FPS was mostly between 45 and 55, each frame takes about 0.02 seconds in total, so it is taking the code roughly 0.01 second to process, encode, and transmit a frame.

During the test, the memory usage by the server code was roughly 1% (or 40 megabytes out of 4 gigabytes) and 50% of a CPU (out of a total of  CPUs).
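
The averaging and the processing-time estimate above can be reproduced with a short Python sketch. The function names are invented, and the timestamps in the example are synthetic (an ideal 50 FPS stream), not measured data.

```python
def block_fps(timestamps, block=1000):
    """Average frames per second over consecutive sets of `block` frames.

    `timestamps` holds the time (in seconds) of each call for a new frame.
    """
    rates = []
    for start in range(0, len(timestamps) - block, block):
        elapsed = timestamps[start + block] - timestamps[start]
        rates.append(block / elapsed)
    return rates


def processing_time(fps, exposure_s):
    """Time per frame left over after subtracting the exposure time."""
    return 1.0 / fps - exposure_s


if __name__ == "__main__":
    # Synthetic example: one frame every 0.02 s.
    stamps = [i * 0.02 for i in range(2001)]
    print(block_fps(stamps))
    print(processing_time(50.0, 0.01))
```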

Attachment 1: newCodeFPS.png
Attachment 2: newCodeFPS_hist.png
  1590   Fri May 15 16:47:44 2009   josephb   Update   Cameras   Improved camera code

At Rob's request I've added the following features to the camera code.

The camera server, which can be started on Ottavia by just typing pserv1 (for camera 1) or pserv2 (for camera 2), now has the ability to save individual jpeg snap shots, as well as taking a jpeg image every X seconds, as defined by the user.

The first text box is for the file name (i.e. ./default.jpg will save the file to the local directory and call it default.jpg). If the camera is running (i.e. you've pressed start), pressing "Take Snapshot to" will take an image immediately and save it. If the camera is not running, it will take an image as soon as you do start it.

If you press "Start image capture every X seconds", it will do exactly that. The file name is the same as for the first button, but it appends a time stamp to the end of the file.

There is also a video recording client now. This is accessed by typing "pcam1-mov" or "pcam2-mov". The text box is for setting the file name. It is currently using the open source Theora encoder and Ogg format (.ogm). Totem is capable of reading this format (and, I believe, vlc as well). This can be run on any of the Linux machines.
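
One way to build such time-stamped file names is sketched below in Python. The exact stamp format the camera server uses is not stated here, so the YYYYMMDD_HHMMSS pattern, its placement before the extension, and the function name are assumptions.

```python
import os
import time


def timestamped_name(path, when=None):
    """Insert a date and time stamp before the file extension.

    `when` is seconds since the epoch; None means the current time.
    """
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(when))
    root, extension = os.path.splitext(path)
    return "%s_%s%s" % (root, stamp, extension)


if __name__ == "__main__":
    print(timestamped_name("./default.jpg"))
```

Keeping the extension at the end means image viewers still recognise the files as jpegs.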

The viewing client is still accessed by "pcam1" or "pcam2".

I'll try rolling out these updates to the sites on Monday.

The configuration files for camera 1 and camera 2 can be found by typing in camera (which is aliased to cd /cvs/cds/caltech/apps/linux64/python/pcamera) and are called pcam1.ini, pcam2.ini, etc.


  1901   Fri Aug 14 10:39:50 2009   josephb   Configuration   Computers   Raid update to Framebuilder (specs)

The RAID array servicing the Frame builder was finally switched over to a JetStor Sata 16 Bay raid array. Each bay contains a 1 TB drive. The raid is configured such that 13 TB is available, and the rest is used for fault protection.

The old Fibrenetix FX-606-U4, a 5 bay raid array which only had 1.5 TB space, has been moved over to linux1 and will be used to store /cvs/cds/.

This upgrade increases the look back time for all channels from 3-4 days to about 30 days. Final copying of old data occurred on August 5th, 2009, and the switch-over happened on that date.
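
As a quick sanity check of the numbers above, here is a small Python sketch. It assumes the look back time scales linearly with usable disk space at a constant data rate, which is a simplification; the function name is made up.

```python
def scaled_lookback_days(old_days, old_tb, new_tb):
    """Look back time after a capacity change, assuming linear scaling."""
    return old_days * new_tb / old_tb


BAYS = 16
DRIVE_TB = 1
USABLE_TB = 13

# Space not available for data, i.e. used for fault protection.
protection_tb = BAYS * DRIVE_TB - USABLE_TB

if __name__ == "__main__":
    print(protection_tb)
    # 3.5 days is the middle of the quoted 3-4 day range on 1.5 TB.
    print(scaled_lookback_days(3.5, 1.5, USABLE_TB))
```

The scaled value comes out close to the quoted 30 days, so the two figures are consistent with each other.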

  1904   Fri Aug 14 15:20:42 2009   josephb   Summary   Computers   Linux1 now has 1.5 TB raid drive



nodus was rebooted by Alex at Fri Aug 14 13:53. I launched elogd.

cd /export/elog/elog-2.7.5/
./elogd -p 8080 -c /export/elog/elog-2.7.5/elogd.cfg -D

 It looks like Alex also rebooted all of the control room computers.  Or something.  The alarm handler and strip tool aren't running.....after I fix susvme2 (which was down when I got in earlier today), I'll figure out how to restart those.

 Alex switched the mount point for /cvs/cds on Linux1 to the 1.5 TB RAID array after he finished copying the data from the old drives. This required a reboot of linux1, with all the resulting /cvs/cds mount points on the other computers becoming stale. The easiest way he found to fix that was to reboot all the control room machines. In addition, a reboot fest should probably happen in the near future for all the front end machines, since they will also have stale mount points from linux1.

The 1.5 TB RAID array mount is now mounted on /home of linux1, which was the old mount point of the ~300 GB drive.  The old drive is now at /oldhome on linux1.


  1967   Fri Sep 4 16:09:26 2009   josephb   Summary   VAC   Rebooted RGA computer and reset RGA settings

Steve noticed the RGA was not working today.  It was powered on but no other lights were lit.

Turns out the c0rga machine had not been rebooted when the file system on linux1 was moved to the raid array, and thus no longer had a valid mount to /cvs/cds/. Thus, the scripts that were run as cron jobs could not be called.

We rebooted c0rga, and then ran ./RGAset.py to reset all the RGA settings, which had been reset when the RGA had lost power (and thus was the reason for only the power light being lit).


Everything seems to be working now.  I'll be adding c0rga to the list of computers to reboot in the wiki.

  2068   Thu Oct 8 11:37:59 2009   josephb   Update   Computers   Reboot of dcuepics helped, c1susvme1 having problems

Power cycling c1dcuepics seems to have fixed the EPICS channel problems, and c1lsc, c1asc, and c1iovme are talking again.

I burt restored c1iscepics and c1Iosepics from the snapshot at 6 am this morning.

However, c1susvme1 never came back after the last power cycle of its crate that it shared with c1susvme2.  I connected a monitor and keyboard per the reboot instructions.  I hit ctrl-x, and it proceeded to boot, however, it displays that there's a media error, PXE-E61, suggests testing the cable, and only offers an option to reboot.  From a cursory inspection of the front, the cables seem to look okay.  Also, this machine had eventually come back after the first power cycle and I'm pretty sure no cables were moved in between.


  2183   Thu Nov 5 16:41:14 2009   josephb   Configuration   Computers   Megatron's personal network

In investigating why megatron wouldn't talk to the network, I re-discovered the fact that it had been placed on its own private network to avoid conflicts with the 40m's test point manager.  So I moved the linksys router (model WRT310N V2) down to 1Y9, plugged megatron into a normal network port, and connected its internet port to the rest of the gigabit network. 

Unfortunately, megatron still didn't see the rest of the network, and vice-versa. I brought out my laptop and started looking at the settings. It had been configured with the DMZ zone on for Megatron's IP, so communications should flow through the router. Turns out it needs the dhcp server on the gateway router to be on for everyone to talk to each other. However, this may not be the best practice. It'd probably be better to set the router IP to be fixed, and turn off the dhcp server on the gateway. I'll look into doing this tomorrow.

Also during this I found the DNS server running on linux1 had its IP to name and name to IP files in disagreement on what the IP of megatron should be.  The IP to name claimed while the name to IP claimed  I set it so both said  (These are in /var/named/chroot/var/ directory on linux1, the files are 113.215.131.in-addr.arpa.zone  and martian.zone - I modified the 113.215.131.in-addr.arpa.zone file).  This is the dhcp served IP address from the gateway, and in principle could change or be given to another machine while the dhcp server is on.
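
A disagreement like this between the name to IP and IP to name files can be caught with a simple cross-check. The Python sketch below works on plain dictionaries rather than parsing the zone files, and the host entries and addresses in it are placeholders from the documentation range, not the real 40m values.

```python
def find_mismatches(name_to_ip, ip_to_name):
    """Return (name, ip, reverse_name) for every entry that disagrees.

    reverse_name is None when the IP has no reverse entry at all.
    """
    mismatches = []
    for name, ip in sorted(name_to_ip.items()):
        reverse_name = ip_to_name.get(ip)
        if reverse_name != name:
            mismatches.append((name, ip, reverse_name))
    return mismatches


if __name__ == "__main__":
    # Placeholder data only.
    forward = {"linux1": "192.0.2.1", "megatron": "192.0.2.10"}
    reverse = {"192.0.2.1": "linux1", "192.0.2.11": "megatron"}
    for entry in find_mismatches(forward, reverse):
        print(entry)
```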

  2192   Fri Nov 6 10:35:56 2009   josephb   Update   Computers   RFM reboot fest and re-enabled ITMY coil drivers

As noted by Steve, the RFM network was down this morning. I noticed that the c1susvme1 sync counter was pegged at 16384, so I decided to start with reboots in that vicinity.

After power cycling the crates containing c1sosvme, c1susvme1, and c1susvme2 (since the reset buttons didn't work), only c1sosvme and c1susvme2 came back normally. I hooked up a monitor and keyboard to c1susvme1, but saw nothing. I power cycled the c1susvme crate again, and this time I watched it boot properly. I'm not sure why it failed the first time.

The RFM network is now operating normally. I have re-enabled the watchdogs after having turned them off for the reboots. Steve and I also re-enabled the ITMY coil drivers when I noticed them not damping once the watchdogs were re-enabled. The manual switches had been set to disabled, so we re-enabled them.

  2195   Fri Nov 6 17:04:01 2009   josephb   Configuration   Computers   RFM and Megatron

I took the RFM 5565 card dropped off by Jay and installed it into megatron.  It is not very secure, as it was too tall for the slot and could not be locked down.  I did not connect the RFM fibers at this point, so just the card is plugged in.

Unfortunately, on power up, and immediately after the splash screen I get "NMI EVENT!" and "System halted due to fatal NMI". 

The status light on the RFM card remains a steady red as well. There is a distinct possibility the card is broken in some way.

The card is a VMIPMC-5565 (which is the same as the card used by the ETMY front end machine).  We should get Alex to come in and look at it on Monday, but we may need to get a replacement.

  2196   Fri Nov 6 18:02:22 2009   josephb   Update   Computers   Elog restarted

While I was writing up an elog entry, the elog died again, and I restarted it.  Not sure what caused it to die since no one was uploading to it at the time.

  2197   Fri Nov 6 18:13:34 2009   josephb   Update   Computers   Megatron woes

I have removed the RFM card from Megatron and left it (along with all the other cables and electronics) on the trolley in front of the 1Y9 rack.

Megatron proceeded to boot normally up until it started loading Centos 5.  During the linux boot process it checks the file systems.  At this point we have an error:


/dev/VolGroup00/LogVol00 contains a file system with errors, check forced

Error reading block 28901403 (Attempt to read block from filesystem resulted short read) while doing inode scan.

/dev/VolGroup00/LogVol00 Unexpected Inconsistency; RUN fsck MANUALLY


So I ran fsck manually, to see if I could get some more information. fsck reports back that it can't read block 28901403 (due to a short read), and asks if you want to ignore(y)?. I ignore (by hitting space), and unfortunately touch it an additional time. The next question it asks is force rewrite(y)? So I apparently forced a rewrite of that block. On further ignores (but no forced rewrites) I continue seeing short read errors at 28901404, *40, *41, *71, *512, *513, etc. So they are not totally contiguous. Each iteration takes about 5-10 seconds. At this point I reboot, but the same problem happens again, although it starts at 28901404 instead of 28901403. So apparently the forced rewrite fixed something, but I don't know if this is the best way of going about this. I'm just wondering if there are any other tricks I can try before I start rewriting random blocks on the hard drive. I also don't know how widespread this problem is and how long it might take to complete (if it's a large swath of the hard drive and it takes 10 seconds for each bad block, it might take a while).

So for the moment, megatron is not functional.  Hopefully I can get some advice from Alex on Monday (or from anyone else who wants to chime in).  It may wind up being easiest to just wipe the drive and re-install real time linux, but I'm no expert at that.


  2264   Fri Nov 13 09:47:18 2009   josephb   Update   Computers   Megatron status lights lit

Megatron's top fan, rear ps, and temperature front panel lights were all lit amber this morning.  I checked the service manual, found at :


According to the manual, this means a front fan failed, a voltage event occured, and we hit a high temperature threshold.  However, there were no failure light on any of the individual front fans (which should have been the case given the front panel fan light).  The lights remained on after I shutdown megatron.  After unplugging, waiting 30 seconds, and replugging the power cords in, the lights went off and stayed off.  Megatron seems to come up fine.

I unplugged the IO chassis from megatron, rebooted, and tried to start Peter's plant model.  However, it still prints that it's starting, but really doesn't.  One thing I forgot to mention in the previous elog on the matter is that on the local monitor it prints "shm_open(): No such file or directory" every time we try to start one of these programs.

  2265   Fri Nov 13 09:54:14 2009 josephbConfigurationComputersMegatron switched to tcsh

I've changed megatron's controls account default shell to tcsh (like it was before).  It now sources cshrc.40m in /cvs/cds/caltech/ correctly at login, so all the usual aliases and programs work without doing any extra work.

  2273   Mon Nov 16 15:13:25 2009 josephbUpdateComputersezcaread updated to Yoichi style ezcawrite

In order to get the gige camera code running robustly here at the 40m, I created a "Yoichi style" ezcaread, which is now the default, while the original ezcaread is located in ezcaread.bin.  This tries 5 times before failing out of a read attempt.
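
The retry behaviour described above can be sketched as follows.  This is a minimal Python illustration, not the actual ezcaread wrapper: the function name read_with_retries, the reader callable, and the pause between attempts are assumptions made for the example; only the limit of 5 attempts comes from this entry.

```python
import time

def read_with_retries(reader, channel, attempts=5, pause=0.0):
    """Call reader(channel) up to `attempts` times and return the first success.

    `reader` is a hypothetical stand-in for whatever performs the real
    channel read. If every attempt fails, the last error is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return reader(channel)
        except Exception as err:  # assumption: any exception means a failed read
            last_error = err
            time.sleep(pause)
    raise last_error
```

Wrapping the read this way lets a transient failure be absorbed while a persistently dead channel still surfaces as an error after the fifth try.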

  2275   Mon Nov 16 15:58:02 2009 josephbConfigurationGeneralAdded Gige camera to AP table, added some screens

I placed a GC750 gige camera looking at a pickoff of the AS port, basically next to the analog camera, on the AP table.

I've modified the main sitemap to include a CAM button, for the digital cameras.  There's a half done screen associated with it.  At the moment, it reports on the X and Y center of mass calculation, the exposure setting, and displays a little graph with a dot indicating the center of mass location.  Currently this screen is associated with a GC750 camera looking at a pickoff of the AS port.  I'm having some issues with getting shell scripts to run from it, as well as with getting a slider to have limits other than 0 and 0.

  2276   Mon Nov 16 17:24:28 2009 josephbConfigurationComputersCamera medm functionality improved

Currently the Camera medm screen (now available from the sitemap) includes server and client script buttons.  The server button has two options: one starts a server, the other (for the moment) kills all copies of the server running on Ottavia.  The client button simply starts a video screen with the camera image.  The slider on this screen changes the exposure level.  The snap shot button saves a jpeg image in the /cvs/cds/caltech/cam/c1asport directory with a date and time stamp on it (up to the second).  For the moment, these buttons only work on Linux machines.

All channels were added to C0EDCU.ini, and should be being recorded for long term viewing.

Feel free to play around with it, break it, and let me know how it works (or doesn't).

  2279   Tue Nov 17 10:09:57 2009 josephbUpdateEnvironmentFumes

The smell of diesel is particularly bad this morning.  It's concentrated enough to be causing me a headache.  I'm heading off to Millikan and will be working remotely on Megatron.

  2299   Thu Nov 19 09:55:41 2009 josephbUpdateComputersTrying to get testpoints on megatron

This is a continuation from last night, where Peter, Koji, and I were trying to get test point channels working on megatron and with the TST module.

Things we noticed last night:

We could run starttst, and ./daqd -c daqdrc, which allowed us to get some channels in dataviewer.  The default 1k channel selection works, but none of the testpoints do. 

However, awgtpman -s tst does appear in the processes running list.

The error we get from dataviewer is:

Server error 861: unable to create thread
Server error 23328: unknown error
datasrv: DataWriteRealtime failed: daq_send: Illegal seek

Going to DTT, it starts with no errors in this configuration.  Initially it listed both MDC and TST channels.  However, as a test, I moved the tpchn_C4.par, tpchn_M4.par and tpchn_M5.par files to the directory backup, in /cvs/cds/caltech/target/gds/param.  This caused only the TST channels to show up (which is what we want when not running the mdc module).

We had changed the daqdrc file in /cvs/cds/caltech/target/fb, several times to get to this state.  According to the directions in the RCG manual written by Rolf, we're supposed to "set cit_40m=1" in the daqdrc file, but it was commented out.  However, when we uncommented it, it started causing errors on dtt startup, so we put it back.  We also tried adding lines:

set dcu_rate 13 = 16384;
set dcu_rate 14 = 16384;

But this didn't seem to help.  The reason we did this is we noticed dcuid = 13 and dcuid = 14 in the /cvs/cds/caltech/target/gds/param/tpchn_C1.par file.  We also edited the testpoint.par file so that it correctly corresponded to the tst module, and not the mdc and mdp modules.  We basically set:


in that file, and commented everything else out.

At this point, given all the things we've changed, I'm going to try a rebuild of the tst and daq and see if that solves things.


  2300   Thu Nov 19 10:19:04 2009 josephbUpdateComputersMegatron tst status

I did a full make clean and make uninstall-daq-tst, then rebuilt it.  I copied a good version of filters to C1TST.txt in /cvs/cds/caltech/chans/ as well as a good copy of screens to /cvs/cds/caltech/medm/c1/tst/.

Test points still appear to be broken.  For a single measurement in dtt I was somehow able to start, but the plots on the results page didn't seem to contain any actual data, so I'm not sure what happened there.  After that it just said unable to select test points, and it now says that when starting up as well.  The tst channels are the only ones showing up.  However, the 1k channels seem to have disappeared from Data Viewer, and now only 16k channels are selectable, but they don't actually work.  Now that I think about it, I'm not sure where the 1k channels were coming from earlier.  They were listed like C1:TST-ETMY-SENSOR_UL and so forth.

RA: Koji and I added the SENSOR channels by hand to the .ini file last night so that we could have data stored in the frames ala c1susvme1, etc.

  2301   Thu Nov 19 11:33:15 2009 josephbConfigurationComputersMegatron

I tried rebooting megatron, to see if that might help, but everything still acts the same. 

I tried using daqconfig and changed channels from deactivated to activated.  By activating them all, I learned that the daq can't handle that, and eventually aborts on an assert that checks for a buffer size being too small.  I also tried activating 2 and looking at those channels, and it looks like the _DAQ versions of those channels work, or at least I get 0's out of C1:TST-ETMY_ASCPIT_OUT_DAQ (which is set in the C1TST.ini file).

I've added the SENSOR channels back to the /cvs/cds/caltech/chans/daq/C1TST.ini file, and those are again working in data viewer.

At this point, I'm leaving megatron roughly in the same state as last night, and am going to wait on a response from Alex.

  2371   Wed Dec 9 10:53:41 2009 josephbUpdateCamerasCamera client wasn't able to talk to server on port 5010, reboot fixed it.

I finally got around to taking a look at the digital camera setup today.  Rob had complained the client had stopped working on Rosalba.

After looking at the code start up and not complain, yet not produce any window output, it looks like it was a network problem.  I tried rebooting Rosalba, but that didn't fix anything.

Using netstat -an, I looked for the port 5010 on both rosalba and ottavia, since that is the port that was being used by the camera.  Ottavia was saying there were 6 established connections after Rosalba had rebooted (rosalba is  I can only presume 6 instances of the camera code had somehow shutdown in such a way they had not closed the connection.

[root@ottavia controls]#netstat -an | grep 5010
tcp        0      0      *                   LISTEN     
tcp        0      0       ESTABLISHED
tcp        0      0       ESTABLISHED
tcp        1      0         CLOSE_WAIT 
tcp        0      0       ESTABLISHED
tcp        0      0       ESTABLISHED
tcp        0      0       ESTABLISHED
tcp        0      0       ESTABLISHED


I switched the code to use port 5022, which worked fine.  However, I'm not sure what would have caused the original connection closure failures, as I tested several close methods (including the kill command on the server end used by the medm screen), and none seemed to generate this broken connection state.  I rebooted Ottavia, and this seemed to fix the connections, and allowed port 5010 to work.  I also tried creating 10 connections, which all seemed to run fine simultaneously.  So it's not someone overloading that port with too many connections which caused the problem.  It's like the port stopped working somehow, which froze the connection status, but how or why I don't know at this point.
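
The check done by eye with netstat above can be sketched in Python.  This is a minimal illustration, assuming the usual Linux "netstat -an" column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); the function name count_states is made up for the example and is not part of the camera code.

```python
from collections import Counter

def count_states(netstat_text, port):
    """Count TCP connection states for lines whose local address uses `port`.

    Assumes Linux-style `netstat -an` rows with six whitespace-separated
    fields: proto, recv-q, send-q, local address, foreign address, state.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers, UDP rows, and truncated lines
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            counts[fields[5]] += 1
    return counts
```

Feeding it the saved output of "netstat -an" together with the port number gives a per-state tally, so a pile-up of ESTABLISHED or CLOSE_WAIT entries left over from dead clients stands out without reading every line.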

  2509   Tue Jan 12 11:34:26 2010 josephbUpdatePEMAllegra dataviewer


So that we can use both Guralps for Adaptive stuff, and so that I can look at the differential ground motion spectra, I've reconnected the Guralp Seismometers to the PEM ADCU, instead of where they've been sitting for a while connected to the ASS ADC.  I redid the ASS.mdl file, so that the PEM and PEMIIR matrices know where to look for the Gur2 data.  I followed the 'make ass' procedure in the wiki.  The spectra of the Gur1 and Gur2 seismometers look pretty much the same, so everything should be all good.

There's a problem with DataViewer though:  After selecting signals to plot, whenever I hit the "Start" button for the realtime plots, DataViewer closes abruptly. 

When I open dataviewer in terminal, I get the following output:

Warning: communication protocol revision mismatch: expected 11.3, received 11.4
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: communication protocol revision mismatch: expected 11.3, received 11.4
msgget: No space left on device
allegra:~>framer4 msgget:msqid: No space left on device

Does anyone have any inspiration for why this is, or what the story is?  I have GR class, but I'll try to follow up later this afternoon.

 This problem seems to be restricted to allegra.  Dataviewer works fine on Rosalba and op440m, as well as when using ssh -X into rosalba to run dataviewer remotely.  DTT seems to work fine on allegra.  The disk usage seems nowhere near full on allegra.  Without knowing which "device" it's referring to (although it must be a device local to allegra), I'm not sure what to look at now.

I'm going to do a reboot of allegra and see if that helps.

Update:  The reboot seems to have fixed the issue.
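
One possible reading of the "No space left on device" message, not confirmed in this entry: msgget returns ENOSPC when the system-wide limit on System V message queues (MSGMNI) would be exceeded, so the "device" need not be a disk.  The sketch below assumes a Linux host, where /proc/sysvipc/msg lists the existing queues (one header line, then one line per queue) and /proc/sys/kernel/msgmni holds the limit.  The function names are made up for the example.

```python
def count_message_queues(sysvipc_msg_text):
    """Count queues in /proc/sysvipc/msg-style text (first line is a header)."""
    lines = [ln for ln in sysvipc_msg_text.splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)

def queues_exhausted(sysvipc_msg_text, msgmni_text):
    """Return True if the number of existing queues has reached the limit."""
    return count_message_queues(sysvipc_msg_text) >= int(msgmni_text.strip())
```

On the affected machine one would pass in the contents of the two /proc files; "ipcs -q" shows the same list of queues.  A reboot clears all System V queues, which would be consistent with the reboot fixing the problem, although that alone does not prove this was the cause.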





  2514   Thu Jan 14 11:44:06 2010 josephbSummaryComputersMemory locations for TST model for ITMY

The main communications data structure is RFM_FE_COMMS, from the rts/src/include/iscNetDsc40m.h file.  The following comments regard sub-structures inside it.  I'm looking at all the files in /rts/src/fe/40m to determine how the structures are used, or if they seem to be unnecessary.

The dsccompad structure is used in the lscextra.c file.  I am assuming I don't need to add anything to the model for these.  They cover from 0x00000040 to 0x00001000.

FE_COMMS_DATA is used twice, once for dataESX (0x00001000 to 0x00002000), and once for dataESY (0x00002000 to 0x00003000).

Inside FE_COMMS_DATA we have:

status and cycle which look to be initialized then never changed (although they are compared to).

ascETMoutput[P,Y], ascQPDinput are all set to 0 then never used.

qpdGain is used, and set by asc40m, but not read by anything.  It is at offset 114 (0x72), so in dataESX it's at 4210 (0x00001072), and in dataESY it's at 8306 (0x00002072).
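
The address arithmetic for qpdGain can be checked with a few lines of Python.  The base addresses and the offset of 114 are taken from this entry; the constant and function names are made up for the example.

```python
# Base addresses of the two FE_COMMS_DATA instances, as listed in this entry.
DATA_ESX_BASE = 0x00001000
DATA_ESY_BASE = 0x00002000

# Offset of qpdGain inside FE_COMMS_DATA, in decimal (114 == 0x72).
QPD_GAIN_OFFSET = 114

def member_address(base, offset):
    """Absolute address of a structure member: base address plus offset."""
    return base + offset

for name, base in (("dataESX", DATA_ESX_BASE), ("dataESY", DATA_ESY_BASE)):
    addr = member_address(base, QPD_GAIN_OFFSET)
    print(f"{name}.qpdGain: {addr} (0x{addr:08x})")
```

This prints 4210 (0x00001072) for dataESX and 8306 (0x00002072) for dataESY.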

All the other parts of this substructure seem to be unused.

daqTest, dgsSet, low1megpad,mscomms seem unused.

dscPad is referenced, but doesn't seem to be set.

pCoilDriver is a structure of type ALL_CD_INFO, inside a union called suscomms, inside FE_COMMS_Data, and is used.  In this structure, we have:

extData[16], an array of DSC_CD_PPY structures, which is used.  Inside extData we have for each optic (ETMY has an offset of 9 inside the extData array):

Pos is set in sos40m.c via the line pRfm->suscomms.pCoilDriver.extData[jj].Pos = dsp[jj].data[FLT_SUSPos].filterInput;   Elsewhere, Pos seems to be set to 1.0

Similarly, Pit and Yaw are set in sos40m, except with FLT_SUSPitch and FLT_SUSYaw, and are set elsewhere to 1.1 and 1.2.  However, these are never applied to the ETMX and ETMY optics (it goes through offsets 0 through 7 inclusive).

Side is set 1.3 or 1.0 only, not used.

ascPit, ascYaw, lscPos are read by the losLinux.c code, and are updated by the sos40m.c code. For ETMY, their respective addresses are: 0x11a1c0, 0x11a1c4, 0x11a1c8.

lscTpNum, lscExNum, seem to be initialized, and read by the losLinux.c, and set by sos40m.c.

modeSwitch is read, but looks to be used for turning dewhitening on and off. Similarly dewhiteSW1R is read and used. 

This ends the DSC_CD_PPY structure.

lscCycle, which is used, although it seems to be an internal check.

dum is unused.

losOpLev is a substructure that is mostly unused.  Inside losOpLev, opPerror, opYerror, opYout seem to be unused, and opPout only seems ever to be set to 0.

That's the end of ALL_CD_INFO and pCoilDriver.

After that we have itmepics, itmfmdata, itmcoeffs, rmbsepics,...etmyepics, etmyfmdata, etmycoeffs, which I don't see in use.

We have substructure asc inside mcasc, with epics, filt, and coeff char arrays. These seem to be asc and iowfsDrv specific.

lscIpc, lscepics, and lscla seems lsc specific,

Then there is the lscdiag struct, which contains the v struct, which includes cpuClock, vmeReset, nSpob, nPtrx, nPtry.  These don't seem to be used by losLinux.c.

The lscfilt structure contains the FILT_MOD dspVME, which seems to be used only by lsc40m.

The lsccoeff structure contains the VME_COEF pRfmCoeff, which again seems to only interact with the lsc code.

Then we have aciscpad, ascisc, ascipc, ascinfo, and mscepics which do not seem to be used.

ascepics and asccoeff are used in asc.c, but do not seem to be referenced elsewhere.

hepiepics , hepidsp, hepicoeff, hepists do not appear to be used.







  2515   Fri Jan 15 11:21:05 2010 josephbUpdateComputersMegatron and tst model for ETMY

The tst model wasn't compiling this morning due to some incorrect lines in the RfmIOfloat.pm file located in /home/controls/cds/advLIgo/src/epics/util/lib on megatron.

The error was "Undefined subroutine &CDS::RfmIOfloat::partType called at lib/Parser2.pm line 354, <IN> line 3363."

I changed RfmIO to RfmIOfloat on lines 1 and 6.
Basically the first 6 lines are now

package CDS::RfmIOfloat;
use Exporter;
@ISA = ('Exporter');

sub partType {
        return RfmIOfloat;

The tst now compiles.  At the moment, I believe we should be able to switch megatron in for ETMY and attempt to lock the arm.  The whitening/dewhitening filters are still not automatically synced in software and hardware, but I don't think that should prevent locking.

  2530   Tue Jan 19 10:30:29 2010 josephbUpdateComputersBoot fest to restart the computer and c1iscey not responding.


This afternoon I found that the RFM network was in trouble. The frontends sync counters had railed to 16384 counts and some of the computers were not responding. I went for a bootfest, but before I rebooted c1dcu epics. I did it twice. Eventually it worked and I could get the frontends back to green.

Although trying to burtrestore to snapshots taken at any time after last wednesday till today would make the RFM crash again. Weird.

Also, c1iscey seems in a coma and doesn't want to come back. Power cycling it didn't work. I don't know how to be more persuasive with it.

During the testing of Megatron as the controller for ETMY, c1iscey had been disconnected from the ethernet hub.  Apparently we forgot to reconnect it after the test.  This prevented it from mounting the nfs directory from linux1, and thus prevented it from coming up after being shut down.  It has been reconnected, restarted, and is working properly now.

  2544   Mon Jan 25 11:42:24 2010 josephbUpdateComputersMegatron and BO board

I was talking with Vladimir on Friday about the Binary Output board.  We looked at the manual for it (Contec DO-32L-PE) and realized it's an opto-coupler isolated open-collector output.  He mentioned they had the right kind of 50 channel breakout board for testing in Riccardo's lab.

This morning I borrowed the 50 channel breakout board from Riccardo's lab and, along with some resistor loads, tested the BO board.  It seems to be working, and I can control the output channel on/off state.

  2558   Tue Feb 2 10:38:30 2010 josephbUpdateComputersMegatron BO test

Last night, I connected megatron's BO board to the analog dewhitening board.  However, I was unable to lock the y arm (although once I disconnected the cable and reconnected it back to the xy220 the yarm did lock).

So either A) I've got the wrong cable, or B) I've got the wrong logic being sent to the analog dewhitening filters.

During testing, I ran into an odd continuous on/off cycle on one of the side filter modules (on megatron).  It turns out I had forgotten to use a ground input to the control filter bank (which allows you to both set switches as well as read them out), and it was reading a random variable.  Grounding it in the model fixed the problem (after re-making).



  2570   Thu Feb 4 12:29:04 2010 josephbUpdateComputersMartian IP switch over notes

What is the change:

We will be moving the 131.215.113.XXX ip addresses of the martian network over to a 192.168.XXX.YYY scheme.

What will users notice:

Computer names (i.e. linux1, scipe25/c1dcuepics) will not change.  The domain name martian, will not change (i.e. linux1.martian.).  What will change is the underlying IP address associated with the host names.  Linux1 will no longer be but something like  If everything is done properly, that should be it.  There should be no impact or need to change anything in EPICS for example.  However, if there are custom scripts with hard coded IP addresses rather than hostnames, those would need to be updated, if they exist.
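
Finding scripts with hard coded addresses could be done with a small scan like the one below.  This is a minimal sketch: the 131.215.113.XXX pattern comes from this entry, while the function name find_hardcoded_ips is made up for the example, and which directories to scan is left to the user.

```python
import re

# Addresses on the old martian subnet, 131.215.113.XXX.
OLD_SUBNET = re.compile(r"\b131\.215\.113\.(\d{1,3})\b")

def find_hardcoded_ips(text):
    """Return (line_number, address) pairs for old-subnet addresses in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in OLD_SUBNET.finditer(line):
            if int(match.group(1)) <= 255:  # ignore impossible last octets
                hits.append((lineno, match.group(0)))
    return hits
```

Running this over the contents of each script gives the lines that would need to switch to a hostname (or to the new address) before the change-over.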

What needs to be done:

Each computer and router will need to be accessed, either remotely or directly, and the configuration files controlling the IP address (and/or dns lookup locations) modified.  Then it needs to be rebooted so the configuration changes take effect.  I'll be making an updated list of computers this week (tracked down via their physical ethernet cables).  Next week, probably on Thursday, we simply go down the list one by one.


For a linux machine, this means checking the /etc/hosts file and making sure it doesn't have old information.  It should look like:
127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

Then change the /etc/sysconfig/network-scripts/ifcfg-eth0 file (or ethX file depending on the ethernet card in question).  The IPADDR, NETWORK, and GATEWAY lines will need to be changed.  You can change the hostname (although I don't plan on it) by modifying the /etc/sysconfig/network file.
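
The edit to the ifcfg file amounts to replacing a few KEY=value lines.  A minimal Python sketch is below; the function name update_ifcfg is made up for the example, and any addresses passed in are the caller's choice (the final 192.168.XXX.YYY values are not fixed in this entry).

```python
def update_ifcfg(text, new_values):
    """Return ifcfg-style text with KEY=value lines replaced.

    `new_values` maps keys such as IPADDR, NETWORK, GATEWAY to their new
    values. Keys not already present in the file are appended at the end.
    Commented-out lines and unrelated keys are left untouched.
    """
    out = []
    seen = set()
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in new_values:
            out.append(f"{key}={new_values[key]}")
            seen.add(key)
        else:
            out.append(line)
    for key, value in new_values.items():
        if key not in seen:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

Working on the text rather than the file directly makes it easy to inspect the result (and keep a backup of the original) before writing anything back and rebooting.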

The /etc/resolv.conf file will need to be updated with the new DNS server location (i.e. to for example).


Similarly to linux, the /etc/hosts file will need to be updated and/or simplified.  The /etc/defaultrouter file will need to be updated to the new router ip.  /etc/defaultdomain will need to be updated.  The /etc/resolv.conf will need to be updated with the new dns server.


Looking at the vxWorks machines, the command bootChange can be used to view or edit the IP configuration.

The following is an example from c1iscey.

-> bootChange

'.' = clear field;  '-' = go to previous field;  ^D = quit

boot device          : eeE0
processor number     : 0
host name            : linux1
file name            : /cvs/cds/vw/pIII_7751/vxWorks
inet on ethernet (e) :
inet on backplane (b):
host inet (h)        :
gateway inet (g)     :
user (u)             : controls
ftp password (pw) (blank = use rsh):
flags (f)            : 0x0
target name (tn)     : c1iscey
startup script (s)   :
other (o)            :

value = 0 = 0x0

By updating the host (name of the machine it mounts /cvs/cds from, i.e. linux1), inet on ethernet (the IP of c1iscey) and host inet (linux1's ip address), we should be able to change all the vxWorks machines.


The DNS server running on linux1 will need to be updated with the new IPs and domain information.  The host file on linux1 will also need to be updated for all the new IP addresses as well.

This will need to be handled carefully, as the last time I tried getting away without the host file on linux1, it broke NFS mounting from other machines.  However, as long as the host file on linux1 is kept in sync with the dns server files, everything should work.
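
A consistency check between the host file and the DNS data could look like the sketch below.  It is only an illustration: the function names are made up, and dns_records is assumed to be a plain name-to-address mapping that someone has already extracted from the dns server files (the format of those files is not covered here).

```python
def parse_hosts(text):
    """Map each hostname or alias in /etc/hosts-style text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)  # first entry wins, as in lookups
    return table

def out_of_sync(hosts_text, dns_records):
    """Return {name: (hosts_address, dns_address)} for every mismatch.

    A name missing from the hosts text shows up with None as its
    hosts_address.
    """
    hosts = parse_hosts(hosts_text)
    return {
        name: (hosts.get(name), address)
        for name, address in dns_records.items()
        if hosts.get(name) != address
    }
```

An empty result means every name in the DNS data has the same address in the host file.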

