  40m Log, Page 88 of 335
ID   Date   Author   Type   Category   Subject
  5886   Mon Nov 14 12:16:41 2011   Jenne   Update   Computers   OAF model died for unknown reason

I am meditating on the OAF, and had it running and calculating things.  I had the outputs disabled so I could take reference traces in DTT, but the Adapt block was calculating for MCL.  At some point, all the numbers froze, and the CPU meter had gone up to ~256ms.  Usually it's around ~70 or so for the configuration I had (2 witness sensors and one degree of freedom enabled....no c-code calculations on any other signals).  The "alive" heartbeat was also frozen.

I ssh'ed into c1lsc, ran ./startc1oaf in the scripts directory, and it came back just fine.

Anyhow, I don't know why it got funny, but I wanted to record the event for posterity.  I'm back to OAFing now.

  5889   Mon Nov 14 21:22:48 2011   rana   Configuration   Computers   primetime RSYNC slowing down NODUS

nodus:elog>w; who ; date
  9:20pm  up 44 day(s),  5:14,  5 users,  load average: 0.29, 1.04, 1.35
User     tty           login@  idle   JCPU   PCPU  what
controls pts/1         9:18pm            5         -tcsh
controls pts/2         2:37pm  6:39  25:02  25:02  /opt/rsync/bin/rsync -avW /cvs/c
controls pts/3         9:14pm                      w
controls pts/4         4:20pm  1:56   5:02   5:02  ssh -X rosalba
controls pts/8         8:23pm    47   4:03         -tcsh
controls   pts/1        Nov 14 21:18    (pianosa.martian)
controls   pts/2        Nov 14 14:37    (ldas-cit.ligo.caltech.edu)
controls   pts/3        Nov 14 21:14    (rosalba)
controls   pts/4        Nov 14 16:20    (192.168.113.128)
controls   pts/8        Nov 14 20:23    (gwave-103.ligo.caltech.edu)
Mon Nov 14 21:20:48 PST 2011

we will ask the man to stop running backups at this time of night...

  6016   Sat Nov 26 07:22:20 2011   suresh   Update   Computers

c1sus has been shut down so that the optics don't bang around.  This is because the watchdogs are not working.

  6108   Mon Dec 12 16:30:17 2011   Jenne   Update   Computers   Did someone just do something to fb??

Dataviewer couldn't connect to the framebuilder, so I checked the CDS status screen, and all the fb-related things on each model went white, then red, then computer-by-computer they came back green.  Now dataviewer works again.  Is someone secretly doing shit while not in the lab???  Not cool man!

  6112   Tue Dec 13 11:51:33 2011   Jamie   Update   Computers   Did someone just do something to fb??

Quote:

Dataviewer couldn't connect to the framebuilder, so I checked the CDS status screen, and all the fb-related things on each model went white, then red, then computer-by-computer they came back green.  Now dataviewer works again.  Is someone secretly doing shit while not in the lab???  Not cool man!

This happens on occasion, and I have reported it to the CDS guys.  Something apparently causes the framebuilder to crash, but I haven't figured out what it is yet.  I doubt this particular instance had anything to do with remote futzing.

  6117   Wed Dec 14 12:22:00 2011   Vladimir   HowTo   Computers   ligo_viewer installed on pianosa

I made a test installation of ligo_viewer in /users/volodya/ligo_viewer-0.5.0c . It runs on pianosa (the Ubuntu machine) and needs Tcl/Tk 8.5.

 

To try it out run the following command on pianosa:

cd /users/volodya/ligo_viewer-0.5.0c/

./ligo_viewer.no_install

 

Press "CONNECT" to connect to the NDS server and explore. There are slides describing ligo_viewer at http://volodya-project.sourceforge.net/Ligo_viewer.pdf

 

Installation notes:

Use /users/volodya/ligo_viewer-0.5.0c.tgz or later version - it has been updated to work with 64 bit machines.

Make sure Tcl and Tk development packages are installed. You can find required packages by running

apt-file search tclConfig.sh

apt-file search tkConfig.sh

If apt-file returns empty output, run apt-file update and try again.

Unpack ligo_viewer-0.5.0c.tgz, change into the created directory.

Run the following command to configure:

export CFLAGS=-I/usr/include/tcl8.5
./configure --with-tcl=/usr/lib/tcl8.5/ --with-tk=/usr/lib/tk8.5/

This works on Ubuntu machines. --with-tcl and --with-tk should point to the directories containing tclConfig.sh and tkConfig.sh, respectively.

Run "make".

You can test the compilation with ./ligo_viewer.no_install

If everything works install with make install

If Tcl/Tk 8.5 is unavailable, it should work with Tcl/Tk 8.3 or 8.4.

 

 

Attachment 1: ligo_viewer_40m2.png
  6157   Tue Jan 3 15:45:04 2012   Jenne   Update   Computers   FB?

Is there a reason the framebuilder status light is red for all the front ends?

Also, I reenabled PRM watchdog.

  6159   Tue Jan 3 15:49:27 2012   Jamie   Update   Computers   possible front-end timing issue

Quote:

Is there a reason the framebuilder status light is red for all the front ends?

Also, I reenabled PRM watchdog.

Apparently there is a bug in the timing cards having to do with the new year roll-over that is causing front-end problems.  From Rolf:

For systems using the Spectracom IRIG-B cards for timing information, the code did not properly roll over the time for
2012 (still thinks it is 2011 and get reports from DAQ of timing errors (0x4000)). I have made a temporary fix for this
in the controller.c code in branch-2.3, branch-2.4 and release 2.3.1. 

I was going to check to see if the 40m is suffering from this. I'll be over to see if that's the problem.

  6168   Wed Jan 4 09:06:50 2012   steve   Update   Computers   possible front-end timing issue

Quote:

Quote:

Is there a reason the framebuilder status light is red for all the front ends?

Also, I reenabled PRM watchdog.

Apparently there is a bug in the timing cards having to do with the new year roll-over that is causing front-end problems.  From Rolf:

For systems using the Spectracom IRIG-B cards for timing information, the code did not properly roll over the time for
2012 (still thinks it is 2011 and get reports from DAQ of timing errors (0x4000)). I have made a temporary fix for this
in the controller.c code in branch-2.3, branch-2.4 and release 2.3.1. 

I was going to check to see if the 40m is suffering from this. I'll be over to see if that's the problem.

 The problem is the same as yesterday.

Attachment 1: rtntstat.png
  6171   Wed Jan 4 16:40:52 2012   Jamie   Update   Computers   front-end fb communication restored

Communication between the front end models and the framebuilder has been restored.  I'm not sure exactly what the issue was, but rebuilding the framebuilder daqd executable and restarting seems to have fixed the issue.

I suspect that the problem might have had to do with how I left things after the last attempt to upgrade to RCG 2.4.  Maybe the daqd that was running was linked against some library that I accidentally moved after starting the daqd process.  It would have kept running fine as it was, but if the process died and something tried to start it again, its broken linking might have kept it from running correctly.  I don't have any other explanation.

It turns out this was not (as best I can tell) related to the new year time sync issues that were seen at the sites.

  6243   Fri Feb 3 10:48:24 2012   Den   Update   Computers   c1lsc kernel

This morning I killed the c1lsc kernel again with the new implementation of the fxlms algorithm. It works fine when built with the gcc compiler during the tests. However, something forbidden for the kernel is going on. I'll spend some more time investigating it. The interesting thing is that I had not even pressed "On" on the OAF MEDM screen to make the code run; c1lsc was suspended even before that. Maybe there is some function-name mismatch.

After the c1lsc suspension I recompiled back the non-working code and rebooted c1lsc. c1sus also goes bad after a c1lsc reboot, since the two communicate. I killed the x04, lsc, ass and oaf models on the c1lsc computer and the sus, mcs, rfm and pem models on the c1sus computer. Then I restarted the x02 model and restored its burt snapshot from 08:07. After that I started all the models again and restored their burt snapshots from 08:07. Then I did a diag reset on all the started models.

Before starting the new fxlms code I had shut down all the optics so that a possible c1lsc suspension would not make them go crazy. After the reboot I turned the coils back on. Everything seems to work fine.

  6249   Fri Feb 3 17:29:28 2012   Den   Update   Computers   c1lsc kernel

The reason I killed the c1lsc kernel was the following: when the code starts to run, it initializes some parameters, and this takes ~0.2 msec per DOF. The old code did nothing with a DOF if C1:OAF-ADAPT_???_ONOFF == OFF. My code still initialized the parameters but then did nothing, because no witness channels were given. But it spent 8*0.2 = 1.6 msec initializing all 8 DOFs. As the code is called at a 2k rate, this was the reason for the crash. I have now corrected my code; it compiles, runs and does not kill c1lsc. However, the old code would also kill the kernel if all DOFs were filtered. So when we use all 8 DOFs, we'll have to split up the variable initialization.
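
The arithmetic above can be checked with a short sketch. It assumes that "2k" means a 2048 Hz model rate (with 2000 Hz the conclusion is the same) and it ignores everything else the model has to do in a cycle, so the real margin is smaller than shown.

```python
# Per-cycle time budget versus initialization cost, using the numbers
# quoted in the entry above.
MODEL_RATE_HZ = 2048          # assumption: "2k" means 2048 Hz
INIT_TIME_PER_DOF_MS = 0.2    # from the entry above
N_DOF = 8


def cycle_budget_ms(rate_hz):
    """Time available for one model cycle, in milliseconds."""
    return 1000.0 / rate_hz


def init_cost_ms(n_dof, per_dof_ms=INIT_TIME_PER_DOF_MS):
    """Time spent initializing n_dof degrees of freedom in one cycle."""
    return n_dof * per_dof_ms


def max_dofs_per_cycle(rate_hz, per_dof_ms=INIT_TIME_PER_DOF_MS):
    """Largest number of DOFs whose initialization fits in one cycle."""
    return int(cycle_budget_ms(rate_hz) // per_dof_ms)


budget_ms = cycle_budget_ms(MODEL_RATE_HZ)
cost_ms = init_cost_ms(N_DOF)
print("budget per cycle: %.3f ms" % budget_ms)
print("init cost for %d DOFs: %.1f ms" % (N_DOF, cost_ms))
print("DOFs that fit in one cycle:", max_dofs_per_cycle(MODEL_RATE_HZ))
```

Under these assumptions, initializing all 8 DOFs in a single cycle overruns the budget several times over, which is why the initialization would have to be spread across cycles.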

But this is not the biggest problem. The C1OAF model must be corrected because, as of now, all 8 DOFs call the same ADAPT_XFCODE function. As this function uses static variables, they will all be messed up by the different DOF signals.
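
The static-variable problem can be illustrated in Python. This is only an analogy for the C code: the running-average "filter", the function names and the XARM label are hypothetical, and a module-level list stands in for a C static variable.

```python
# One buffer shared by every caller, like a static variable in a C
# function that all DOFs call.
_shared_history = []


def adapt_shared(sample):
    """Running average that keeps its state in shared storage."""
    _shared_history.append(sample)
    return sum(_shared_history) / len(_shared_history)


class AdaptState:
    """Running average that keeps its state per instance (one per DOF)."""

    def __init__(self):
        self.history = []

    def step(self, sample):
        self.history.append(sample)
        return sum(self.history) / len(self.history)


# Two DOFs with very different signal levels.
mcl_samples = [1.0, 1.0, 1.0]
xarm_samples = [100.0, 100.0, 100.0]

# Shared state: each DOF's output is contaminated by the other's samples.
shared_out = [(adapt_shared(a), adapt_shared(b))
              for a, b in zip(mcl_samples, xarm_samples)]

# Separate state: each DOF only ever sees its own samples.
states = {"MCL": AdaptState(), "XARM": AdaptState()}
separate_out = [(states["MCL"].step(a), states["XARM"].step(b))
                for a, b in zip(mcl_samples, xarm_samples)]

print("shared:  ", shared_out)
print("separate:", separate_out)
```

With separate state the two averages stay at 1 and 100; with shared state both drift toward a mixture of the two signals.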

  6314   Fri Feb 24 16:10:48 2012   mike   Update   Computers   PyNDS and a Plot

Power Spectral Density plot using PyNDS, comparing 5 fast data channels for ETMX.

**EDIT** Script here:

import time

import matplotlib.pyplot as plt
import nds

# Connect to the framebuilder's NDS server and get the channel list.
daq = nds.daq('fb', 8088)
channels = daq.recv_channel_list()

# Approximate current GPS time (Unix time minus a fixed offset).
start = int(time.time() - 315964819)

# Requested channels
rqst = ['C1:SUS-ETMX_SENSOR_UR', 'C1:SUS-ETMX_SENSOR_UL',
        'C1:SUS-ETMX_SENSOR_LL', 'C1:SUS-ETMX_SENSOR_LR',
        'C1:SUS-ETMX_SENSOR_SIDE']

# Fetch the last 100 s of each requested channel and compute its PSD.
# plt.psd returns (Pxx, freqs); it also draws on the current figure,
# which is cleared below before the comparison plot is made.
psds = []    # (label, freqs, Pxx) for each requested channel found
for c in channels:
    if c.name in rqst:
        daq = nds.daq('fb', 8088)    # reconnect before each fetch, as in the original
        data = daq.fetch(start - 100, start, c.name)
        pxx, freqs = plt.psd(data[0], NFFT=16384, Fs=c.rate)
        psds.append((c.name, freqs, pxx))

plt.figure(1)
plt.clf()
plt.title('PSD Comparison')
plt.grid(True, which='both')    # major and minor grid lines
plt.xlabel(r'Frequency $Hz$')
plt.ylabel(r'Decibels $\frac{dB}{Hz}$')
for label, freqs, pxx in psds:
    plt.loglog(freqs, 10 * pxx, label=label)
plt.legend()
plt.show()
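
The script builds its start time by subtracting a fixed offset from Unix time. A minimal sketch of the usual Unix-to-GPS conversion follows, assuming the GPS epoch of 1980-01-06 00:00:00 UTC (Unix time 315964800) and the GPS-UTC leap-second offset of 15 s that applied in early 2012. The function name is made up for illustration. The script's constant is 315964800 + 19, so its start time lands about 34 s behind the value computed here. That is harmless when asking for the most recent 100 s of data and may be a deliberate margin; the entry does not say.

```python
import calendar

GPS_EPOCH_UNIX = 315964800    # 1980-01-06 00:00:00 UTC in Unix seconds
GPS_MINUS_UTC_2012 = 15       # leap-second offset, 2009-01-01 to 2012-06-30


def unix_to_gps(unix_seconds, gps_minus_utc=GPS_MINUS_UTC_2012):
    """Convert Unix time to GPS seconds for a given leap-second offset."""
    return int(unix_seconds) - GPS_EPOCH_UNIX + gps_minus_utc


# Example: midnight UTC on the day the script was posted.
unix_example = calendar.timegm((2012, 2, 24, 0, 0, 0))
gps_example = unix_to_gps(unix_example)
print(unix_example, "->", gps_example)
```

The leap-second offset changes over time, so a fixed constant like this is only valid for the date range noted in the comment.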

Attachment 1: PSD_Comparison.png
  6316   Fri Feb 24 18:59:04 2012   Jenne   Update   Computers   PyNDS and a Plot

Quote:

Power Spectral Density plot using PyNDS, comparing 5 fast data channels for ETMX.

 Is there any stuff to install, etc?  Y'know, for those of us who don't really know how to use computers and stuff....

  6341   Wed Feb 29 17:32:11 2012   Mike   Update   Computers   PyNDS and a Plot

Quote:

Quote:

Power Spectral Density plot using PyNDS, comparing 5 fast data channels for ETMX.

 Is there any stuff to install, etc?  Y'know, for those of us who don't really know how to use computers and stuff....

 No new stuff for these computers.  Everything should be installed already.

  6431   Tue Mar 20 17:50:44 2012   Suresh   Update   Computers   Beam Scan machine fixed

There was something wrong with the Beam Scan PC.  The  mouse and screen were not responding and the PC was asking for drivers for any new hardware that we plugged in.  We called in the services of Junaid and co. since we do not have a Win98 Second Edition installation disk in the lab.   Junaid came with the disk, we changed the screen and the mouse and installed everything. 

We tried to get the network going on the PC so that we could update stuff easily over the net.  This didn't succeed. For now, we still have to depend on a Win98se CD to get drivers if any new hardware is connected to this machine.

For future reference, some notes:

1)  We will get a copy of Win98SE for the lab from Junaid

2) We have to use a USB mouse from Dell. We have several spares of this. The drivers for these are present in the machine. 

 

 

The Beam Scan is working okay now.  We will proceed with the beam profile measurements.  

  6434   Wed Mar 21 19:12:27 2012   steve   Update   Computers   AC power back on both ends

Quote:

Quote:

ETMY sus damping was disabled. Green locking laser and associated electronics turned off. Computers and power supplies turned off at rack 1Y4

The electricians are picking up ac power from the 1Y4 manual disconnect box and installing a conduit line to the ISCT-ETMY east end optical table.

There will be no more daisy chaining this way. 

 The power is back on at ETMY . c1iscey has not been restarted.

Now I'm turning ac power off at ETMX for the same job to be done.

 The power was turned back on at 4pm. It took some time for Suresh to restart the computers. We have damping but things are not perfect yet. Auto BURT did not work well.

  6463   Wed Mar 28 21:15:53 2012   rana   Omnistructure   Computers   Wireless router for GC

I installed a NETGEAR Wireless Router (WPN824N) today on the 131 network. The admin password for it as well as the wireless access password are in the usual places.

The SSID is 40EARTH. I have set it to allow WPA as well as WPA2 access, so the speed is only 54 Mbps for now. In a year or so, we can turn off the WPA support and up the speed.

  6465   Thu Mar 29 13:23:05 2012   Jenne   Omnistructure   Computers   Wireless router for GC

Quote:

I installed a NETGEAR Wireless Router (WPN824N) today on the 131 network. The admin password for it as well as the wireless access password are in the usual places.

The SSID is 40EARTH. I have set it to allow WPA as well as WPA2 access, so the speed is only 54 Mbps for now. In a year or so, we can turn off the WPA support and up the speed.

 This router was confiscated by the GC guys this morning around 10am.  They barged in and said that someone at the 40m had connected a new router, and we had magically taken down half of the GC network.  The cable was plugged in to the wrong port on the back of the router.

Junaid / Christian said that they would "secure" the router, and then reinstall it.  Apparently just having a password didn't satisfy them.  This was the compromise, versus them just taking the router and never bringing it back.

 

Attachment 1: IMG_0079.JPG
  6467   Thu Mar 29 19:13:56 2012   Jamie   Omnistructure   Computers   Wireless router for GC

I retrieved the newly "secured" router from Junaid.  It had apparently been hooked up to the GC network via its LAN port, but its LAN services had not been shut off.  It was therefore offering a competing DHCP server, which will totally hose a network.  A definite NONO.

The new SSID is "40mWiFi", it's WPA2, and the password is pasted to the bottom of the unit (the unit is back in its original spot on the office computer rack).

  6479   Tue Apr 3 12:42:19 2012   Mike J.   Update   Computers   Hysteresis Model

Here's my first hysteresis model in Simulink. It's based on the equation y = Amplitude*sin(frequency*t + phase) - (hysteresis/frequency^2) as a solution to y'' + frequency^2*y + hysteresis = 0. All values in the model are variables that should be manipulated through the model workspace or external code.
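
A quick numerical check of a closed-form solution to y'' + frequency^2*y + hysteresis = 0. Direct substitution shows the constant term has to be -hysteresis/frequency^2 for the residual to vanish, so the sketch evaluates both signs. The parameter values and function names are arbitrary choices for illustration and are not taken from the Simulink model.

```python
import math


def y(t, amp, omega, phase, hyst, sign=-1.0):
    """Candidate solution amp*sin(omega*t + phase) + sign*hyst/omega**2."""
    return amp * math.sin(omega * t + phase) + sign * hyst / omega**2


def residual(t, amp, omega, phase, hyst, sign=-1.0, dt=1e-4):
    """y'' + omega^2*y + hyst, with y'' from a central finite difference."""
    args = (amp, omega, phase, hyst, sign)
    ypp = (y(t + dt, *args) - 2.0 * y(t, *args) + y(t - dt, *args)) / dt**2
    return ypp + omega**2 * y(t, *args) + hyst


params = dict(amp=1.0, omega=2.0 * math.pi, phase=0.3, hyst=5.0)
times = [0.1 * k for k in range(20)]

worst_minus = max(abs(residual(t, sign=-1.0, **params)) for t in times)
worst_plus = max(abs(residual(t, sign=+1.0, **params)) for t in times)
print("max |residual| with -hyst/omega^2:", worst_minus)
print("max |residual| with +hyst/omega^2:", worst_plus)
```

With the minus sign the residual is zero to within finite-difference error; with the plus sign it sits at about twice the hysteresis constant.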

Attachment 1: hysteresis1.mdl
Model {
  Name			  "hysteresis1"
  Version		  7.6
  MdlSubVersion		  0
  GraphicalInterface {
    NumRootInports	    0
    NumRootOutports	    0
    ParameterArgumentNames  ""
    ComputedModelVersion    "1.9"
    NumModelReferences	    0
... 734 more lines ...
  6485   Wed Apr 4 21:43:16 2012   Mike J.   Update   Computers   Better Hysteresis Model

A better hysteresis model based on the simple harmonic oscillator equation. Useless variables have been removed and output can now be saved to workspace for plotting. The model is at "/users/mjenson/matlab/SHO_hyst.mdl".

Attachment 1: SHO_hyst.png
  6487   Thu Apr 5 01:07:08 2012   Mike J.   Update   Computers   Hysteresis Plots

Here are the hysteresis plots from the most recent model, which uses a modified harmonic oscillator equation y'' = -(Frequency)^2*y - Hysteresis.  The hysteresis constant seems to change both the amplitude and equilibrium point of the pendulums, which is akin to changing the length of a pendulum without changing the frequency. This does not make sense. Perhaps the hysteresis value should be moved to the "spring" constant for the pendula and not restricted to a position-biasing value.
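
The observation that the constant changes both the amplitude and the equilibrium point can be reproduced outside Simulink. The sketch below integrates y'' = -(Frequency)^2*y - Hysteresis with a velocity-Verlet step. It assumes the pendulum starts at rest at y0 = 1, and the integrator, step size and parameter values are illustrative choices, not the model's. For a constant term, the motion is an ordinary oscillation about -Hysteresis/Frequency^2, so a fixed starting point gives an amplitude of |y0 + Hysteresis/Frequency^2|.

```python
import math


def simulate(omega, hyst, y0=1.0, v0=0.0, dt=1e-4, n_periods=3):
    """Integrate y'' = -omega^2*y - hyst with velocity Verlet; return y samples."""
    steps = int(n_periods * 2.0 * math.pi / omega / dt)
    y, v = y0, v0
    a = -omega**2 * y - hyst
    samples = [y]
    for _ in range(steps):
        y += v * dt + 0.5 * a * dt * dt
        a_new = -omega**2 * y - hyst
        v += 0.5 * (a + a_new) * dt
        a = a_new
        samples.append(y)
    return samples


def centre_and_amplitude(samples):
    """Midpoint and half-range of the oscillation."""
    lo, hi = min(samples), max(samples)
    return 0.5 * (lo + hi), 0.5 * (hi - lo)


omega = 2.0 * math.pi
for hyst in (0.0, 10.0):
    centre, amplitude = centre_and_amplitude(simulate(omega, hyst))
    print("hyst=%5.1f  centre=%+.4f  amplitude=%.4f" % (hyst, centre, amplitude))
```

The oscillation frequency is unchanged in both runs; only the centre and the swing about it move, consistent with the constant acting as a position bias.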

SHO_hyst_plot.png

  6494   Fri Apr 6 11:32:09 2012   Jenne   Update   Computers   RAID array is rebuilding....

Suresh reported to Den, who reported to me (although no elogs were made.....) that something was funny with the FB.  I went to look at it, and it's actually the RAID array rebuilding itself.  I have called in our guru, Jamie, to have a look-see.

  6495   Fri Apr 6 14:39:21 2012   Jamie   Update   Computers   RAID array is rebuilding....

The RAID (JetStor SATA 416S) is indeed resyncing itself after a disk failure.  There is a hot spare, so it's stable for the moment.  But we need a replacement disk:

    RAID disks:  1000.2GB Hitachi HDT721010SLA360

Do we have spares?  If not we should probably buy some, if we can.  We want to try to keep a stock of the same model number.

Other notes:

The RAID has a web interface, but it was for some reason not connected.  I connected it to the martian network at 192.168.113.119.

Viewing the RAID event log on the web interface silences the alarm.

I retrieved the manual from Alex, and placed it in the COMPUTER MANUALS drawer in the filing cabinet.

  6498   Fri Apr 6 16:35:37 2012   Den   Update   Computers   c1ioo

c1ioo computer can not connect to the framebuilder and everything is red in the status for this machine, C1:FEC-33_CPU_METER is not moving.

EDIT by KI:

 We rebooted the c1ioo machine, but none of the front-end models came back. According to dmesg, it looked like they failed the burt process for some reason.

Then we restarted each front-end model one by one, and each time, immediately after restarting it, we hit the 'BURT' button in the GDS screen.

Everything came back to normal operation.

  6502   Fri Apr 6 20:24:31 2012   Mike J.   Update   Computers   Sensoray

The Sensoray device is currently viewing Monitor 4 and is plugged into Pianosa.  The user interface is run at /home/controls/Downloads/sdk_2253_1.2.2_linux/python demo.py. It can preview and capture the video stream; however, the captured files are terrible. I believe it has something to do with the bitrate, since videos captured at lower bitrates are not as bad as those captured at higher bitrates, but I am not certain.

  6503   Fri Apr 6 20:38:41 2012   Mike J.   Update   Computers   Sensoray

 Turns out that the "MPEG-4 VES" video format is just bad for captured video.  Everything except "MP4" and "MPEG-TS" works for streaming, and "MP4" and "MPEG-TS" seem to be the only captured formats that can be viewed properly.

  6505   Sat Apr 7 01:45:02 2012   Mike J.   Update   Computers   Even Better Hysteresis Model and Plots

 The new hysteresis model is slightly based on the SHO equation, but with the force being out of phase with the position by an amount of hysteresis {x(t)=Amp*sin(freq*t), F(t)=Amp*sin(freq*t+Hyst)}. The new model can be found at /users/mjenson/matlab/hyst_v_3.mdl.  Pictures are: new hysteresis model, x(t) subsystem in new model[xh''(t) only lacks -1 multiplier and includes hysteresis variable], new plots.
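
A phase lag between force and position traces out an ellipse in the (x, F) plane, and the enclosed area is the work done per cycle. For x(t)=Amp*sin(freq*t) and F(t)=Amp*sin(freq*t+Hyst) the standard result is pi*Amp^2*|sin(Hyst)|. The sketch below checks that by sampling the loop and applying the shoelace formula. The sample count and parameter values are arbitrary, and none of this is taken from hyst_v_3.mdl.

```python
import math


def loop_points(amp, freq, hyst, n=2000):
    """Sample (x, F) over one period of the phase-shifted pair."""
    period = 2.0 * math.pi / freq
    points = []
    for k in range(n):
        t = period * k / n
        points.append((amp * math.sin(freq * t),
                       amp * math.sin(freq * t + hyst)))
    return points


def enclosed_area(points):
    """Area of the closed polygon through the points (shoelace formula)."""
    total = 0.0
    for (x1, f1), (x2, f2) in zip(points, points[1:] + points[:1]):
        total += x1 * f2 - x2 * f1
    return abs(total) / 2.0


amp, freq = 1.0, 2.0 * math.pi
for hyst in (0.0, 0.3, math.pi / 2.0):
    area = enclosed_area(loop_points(amp, freq, hyst))
    expected = math.pi * amp**2 * abs(math.sin(hyst))
    print("Hyst=%.3f  area=%.5f  pi*Amp^2*|sin(Hyst)|=%.5f" % (hyst, area, expected))
```

With zero lag the loop collapses to a line with no enclosed area, and the area is largest at a quarter-cycle lag.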

 hyst_v_3.png   hyst_v_3-x(t).png   hyst_v3.png

  6507   Sat Apr 7 02:01:29 2012   Mike J.   Update   Computers   Projector Cable Management

I replaced the projector video and power cables with longer ones, and zip-tied them to the ceiling and wall so they don't block the image.

projector_cables.jpg

  6513   Mon Apr 9 20:02:19 2012   Mike J.   Update   Computers   Sensoray

The highest resolution available is 720x480 pixels. Bit depth of captured images and video is most likely 16 bits per pixel. Video may be captured raw as well, which will be necessary for image subtraction/enhancement, however it cannot currently be played raw. A captured image is shown below, along with MP4 video.

out_0.jpg

 

  6517   Tue Apr 10 23:56:44 2012   rana   Update   Computers   Sensoray

Now that Mike has got the Sensoray working, Jenne/Suresh should grab some new images of the ETM cage as Keiko did so that we can analyze them for another mode matching diagnostic.

  6518   Wed Apr 11 12:25:11 2012   Ryan   Update   Computers   Updating aLIGO Conlog

Over the next few days, I will be working on upgrading the aLIGO Conlog install to include new bugfixes distributed by Patrick T.  The currently running conlog *should* not be affected, but please let me know if it is (ryan.fisher@ligo.org).

  6530   Thu Apr 12 22:04:17 2012   Mike J.   Update   Computers   New Hysteresis Model & Plots

The new hysteresis model uses a triangle wave with offset zero points as the position function and a sinusoidal force function, creating a loop similar to this. Model is at /users/mjenson/matlab/ferro_hyst.mdl.

ferro_hyst.png   hyst_combo.png

  6539   Tue Apr 17 10:55:50 2012   Ryan   Update   Computers   Updating aLIGO Conlog

Quote:

Over the next few days, I will be working on upgrading the aLIGO Conlog install to include new bugfixes distributed by Patrick T.  The currently running conlog *should* not be affected, but please let me know if it is (ryan.fisher@ligo.org).

 The upgrade to the aLIGO Conlog is completed.  The conlog is once again running on megatron in a screen session. (see http://nodus.ligo.caltech.edu:8080/40m/6396)

  6585   Mon Apr 30 18:46:34 2012   rana   Update   Computers   megatron

Last week I found that megatron was still off after the power outage. Apparently, the power recovery checklist had not been followed during the recovery.

Please remember to use the checklist and post the checklist results after each power outage. Megatron is now on and functioning.

  6592   Tue May 1 17:42:15 2012   Mike J.   Update   Computers   Sensoray

I've upgraded the Sensoray GUI so it can now switch the video channel it receives, thanks to the videoswitch script.

V4L2_Capture_Demo_r01.png

  6645   Tue May 15 23:40:46 2012   Mike J.   Update   Computers   Image Subtraction

I acquired 2 raw frames of MC2 using "/users/mjenson/sensoray/sdk_2253_1.2.2_linux/capture -n -s 720x480 -f 1", one while the laser was off the mode cleaner and another while it was on:

mc2_1.bmp mc2_2.bmp

I then used "/users/mjenson/sensoray/sdk_2253_1.2.2_linux/imsub/display-image.py" to generate bitmaps of the raw images, which I then subtracted using the Python Imaging Library to generate a new image:

mc2_1-mc2_2.bmp

It doesn't look all that different, but the first image didn't have that much lit up in it to begin with. I should be able to write a script that does all of this without needing to generate new files in between acquisition and subtraction.
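
The entry does not say which Python Imaging Library call was used for the subtraction, so the sketch below does the arithmetic by hand rather than guess. It treats each frame as a list of rows of 8-bit grayscale values and clamps the difference to 0..255; the function name and the tiny example frames are made up for illustration.

```python
def subtract_frames(frame_a, frame_b):
    """Pixel-wise frame_a - frame_b for 8-bit grayscale frames, clamped to 0..255.

    Frames are lists of rows; both must have the same dimensions.
    """
    if len(frame_a) != len(frame_b) or any(
            len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)):
        raise ValueError("frames must have the same dimensions")
    return [[max(0, min(255, a - b)) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


# Hypothetical 2x3 frames: a bright column appears when the laser is on,
# while the steady background (e.g. OSEM light) is nearly identical.
laser_on = [[10, 200, 12],
            [11, 250, 13]]
laser_off = [[10, 20, 12],
             [12, 25, 13]]

difference = subtract_frames(laser_on, laser_off)
print(difference)
```

Doing the subtraction in memory like this is also one way to avoid writing intermediate files between acquisition and subtraction.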

  6646   Wed May 16 11:53:45 2012   Jenne   Update   Computers   Image Subtraction

Quote:

It doesn't look all that different, but the first image didn't have that much lit up in it to begin with.

 This is totally cool!  You can see that the OSEM lights are almost entirely gone in the subtracted image.

Can you switch to trying with one of the *TM*F cameras?  (ITMXF, ITMYF, ETMYF, ETMXF)  They tend to have more background, so there should be a more dramatic subtraction.  Den or Suresh should be able to lock one of the arms for you.

  6662   Tue May 22 20:24:06 2012   Jamie   Update   Computers   rossa is now running Ubuntu 10.04

Now same as pianosa and rosalba.  I'll upgrade allegra on Friday.

  6677   Thu May 24 16:13:05 2012   yuta   Update   Computers   ASS scripts on new ubuntu machines

[Den, Yuta]

Background:
 ASS and many other scripts don't work on new ubuntu machines.

What we did:
1. Installed C-shell on rossa and rosalba (Ubuntu machines).
  sudo apt-get install csh

2. Found out that
  /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

runs, but
  /opt/rtcds/caltech/c1/scripts/medmrun /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

doesn't run. It gives us the following error messages.

ezcawrite: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory
ezcaswitch: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory

Result:
 ASS scripts run on rossa and rosalba, but not with medmrun.
 At least ASS scripts run on pianosa(ubuntu machine) with medmrun. So we decided to wait for JAMIE to fix it.

  6683   Fri May 25 16:58:54 2012   Jamie   Configuration   Computers   .bashrc for workstations

I have setup a shared .bashrc for all the workstations that is symlinked to the normal location on all machines:

controls@rossa:~ 0$ ls -al /home/controls/.bashrc 
lrwxrwxrwx 1 controls controls 23 2012-05-25 15:37 /home/controls/.bashrc -> /users/controls/.bashrc
controls@rossa:~ 0$ 

This should help simplify maintenance considerably.  Editing that file on one machine will edit it for all.  Just edit this one file!  Don't try to get fancy and add extra files!

I also added a bunch of aliases that had previously been missing.  This should help with some of the problems that people had been having.

NOTE: PLEASE DO NOT CHANGE THE DEFAULT SHELL!  We are using bash, because that's what the sites are now using and we want to be as compatible as possible.

You can of course still write scripts in csh/tcsh or use tcsh in a shell if you wish.   Just don't change the default shell for the controls user.

  6684   Fri May 25 17:50:38 2012   Jamie   Update   Computers   ASS scripts on new ubuntu machines

Quote:

[Den, Yuta]

Background:
 ASS and many other scripts don't work on new ubuntu machines.

What we did:
1. Installed C-shell on rossa and rosalba (Ubuntu machines).
  sudo apt-get install csh

2. Found out that
  /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

runs, but
  /opt/rtcds/caltech/c1/scripts/medmrun /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

doesn't run. It gives us the following error messages.

ezcawrite: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory
ezcaswitch: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory

Result:
 ASS scripts run on rossa and rosalba, but not with medmrun.
 At least ASS scripts run on pianosa(ubuntu machine) with medmrun. So we decided to wait for JAMIE to fix it.

Apparently the environment was not being properly inherited by the scripts launched from medmrun.  We modified the medmrun script so that it executes things with an interactive shell ("bash -i -c ...") and this fixed the problem (by ensuring that it sources all the interactive environment configs (i.e. ~/.bashrc)).  I'm still not sure why we were seeing different behavior on pianosa, but at least the solution we have now should be robust.

As a reminder, all scripts launched from MEDM should use medmrun:

/opt/rtcds/caltech/c1/scripts/medmrun
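
The underlying mechanism is that a child process only sees the variables present in the environment it was launched with. The sketch below shows that in isolation. The variable name DEMO_LIB_PATH and its value are hypothetical stand-ins; the entry does not say which setting from ~/.bashrc the scripts were missing when libca.so could not be found.

```python
import os
import subprocess
import sys

# Child program: report whether it can see the (hypothetical) variable.
CHILD = "import os; print(os.environ.get('DEMO_LIB_PATH', 'unset'))"


def run_child(env):
    """Run the child with the given environment and return what it printed."""
    result = subprocess.run([sys.executable, "-c", CHILD], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Environment without the variable, like a launcher that never sourced
# the interactive shell configuration.
bare_env = {k: v for k, v in os.environ.items() if k != "DEMO_LIB_PATH"}

# Environment with the variable set, like one where the config was sourced.
full_env = dict(bare_env, DEMO_LIB_PATH="/hypothetical/epics/lib")

print("without config:", run_child(bare_env))
print("with config:   ", run_child(full_env))
```

Launching through an interactive shell, as medmrun now does, is one way to make sure the second situation applies to every script started from MEDM.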
  6685   Fri May 25 17:52:08 2012   Jamie   Update   Computers   allegra now running Ubuntu 10.04

The last of the control room machines is now upgraded.

  6703   Tue May 29 15:29:16 2012   Jamie   Update   Computers   latest pynds installed on all new control room machines

The DASWG lscsoft package repositories have a lot of useful analysis software.  It is all maintained for Debian "squeeze", but it's mostly installable without modification on Ubuntu 10.04 "lucid" (which is based on Debian squeeze).  Basically the only thing needed to access the lscsoft repositories is to add the following repository file:

controls@rossa:~ 0$ cat /etc/apt/sources.list.d/lscsoft.list 
deb http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze contrib
deb-src http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze contrib

deb http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze-proposed contrib
deb-src http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze-proposed contrib
controls@rossa:~ 0$ 

A simple "apt-get update" then makes all the lscsoft packages available.

lscsoft includes the nds2 client packages (nds2-client-lib) and pynds (python-pynds).  Unfortunately the python-pynds debian squeeze package currently depends on libboost-python1.42, which is not available in Ubuntu lucid.  Fortunately, pynds itself does not require the latest version and can use what's in lucid.  I therefore rebuilt the pynds package on one of the control room machines:

$ apt-get install dpkg-dev devscripts debhelper            # these are packages needed to build a debian/ubuntu package
$ apt-get source python-pynds                              # this downloads the source of the package, and prepares it for a package build
$ cd python-pynds-0.7
$ debuild -uc -us                                          # this actually builds the package
$ ls -al ../python-pynds_0.7-lscsoft1+squeeze1_amd64.deb
-rw-r--r-- 1 controls controls 69210 2012-05-29 11:57 python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I then copied the package into a common place:

/ligo/apps/debs/python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I then installed it on all the control room machines as such:

$ sudo apt-get install libboost-python1.40.0 nds2-client-lib python-numpy   # these are the dependencies of python-pynds
$ sudo dpkg -i /ligo/apps/debs/python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I did this on all the control room machines.

It looks like the next version of pynds won't require us to jump through these extra hoops and should "just work".

  6722   Thu May 31 00:56:13 2012   Jamie   Metaphysics   Computers   Please remember to check in code changes

I know it's really hard to remember, but our future selves will thank us dearly if we remember to commit all of our code changes to the svn with nice log messages.  At the moment there's a LOT of modified stuff in the userapps working directory that needs to be committed:

controls@pianosa:/opt/rtcds/userapps/release 0$ svn status | grep '^M'
M       cds/c1/models/c1rfm.mdl
M       sus/c1/medm/templates/SUS_SINGLE.adl
M       sus/c1/models/c1mcs.mdl
M       sus/c1/models/c1sus.mdl
M       sus/c1/models/c1scx.mdl
M       sus/c1/models/c1scy.mdl
M       isc/c1/models/c1pem.mdl
M       isc/c1/models/c1ioo.mdl
M       isc/c1/models/ADAPT_XFCODE_MCL.c
M       isc/c1/models/c1oaf.mdl
M       isc/c1/models/c1gcv.mdl
M       isc/common/medm/OAF_OVERVIEW.adl
M       isc/common/medm/OAF_DOF_BLRMS.adl
M       isc/common/medm/OAF_OVERVIEW_BAK.adl
M       isc/common/medm/OAF_ADAPTATION_MICH.adl
controls@pianosa:/opt/rtcds/userapps/release 0$ 

This doesn't even include things that haven't even been added yet.  It doesn't take much time.  Just copy and paste what you elog about the changes.
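
For anyone who would rather script the check, here is a minimal sketch that sorts `svn status` output into modified files and files not yet under version control. It assumes only the first status column is populated, as in the listing above. The sample text is made up (two paths are borrowed from the listing, the rest are hypothetical), and the helper name is not part of any existing script.

```python
# Hypothetical `svn status` output: "M" = modified, "?" = not under
# version control, "A" = scheduled for addition.
SAMPLE_STATUS = """\
M       cds/c1/models/c1rfm.mdl
?       isc/c1/models/some_new_file.c
M       sus/c1/models/c1sus.mdl
A       isc/c1/medm/SOME_NEW_SCREEN.adl
"""


def paths_with_status(status_text, code):
    """Return the paths whose first status column equals the given code."""
    paths = []
    for line in status_text.splitlines():
        if line[:1] == code:
            paths.append(line[1:].strip())
    return paths


modified = paths_with_status(SAMPLE_STATUS, "M")
unversioned = paths_with_status(SAMPLE_STATUS, "?")
print("modified, needs commit:", modified)
print("not yet added:         ", unversioned)
```

In practice the text would come from running `svn status` in the working directory rather than from a hard-coded string.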

  6737   Fri Jun 1 02:33:40 2012   Jenne   Update   Computers   c1sus and c1iscex - bad fb connections

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

  6738   Fri Jun 1 08:01:46 2012   steve   Update   Computers   c1sus and c1iscex are down

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 

 

Attachment 1: compdown.png
  6740   Fri Jun 1 09:50:50 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

  6742   Fri Jun 1 14:40:24 2012   Jamie   Update   Computers   c1sus and c1iscex - bad fb connections

Quote:

This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

I managed to recover c1sus.  It required stopping all the models, and then restarting them one-by-one:

$ rtcds stop all     # <-- this does the right thing to stop all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.
