40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 41 of 337  Not logged in ELOG logo
ID Date Authorup Type Category Subject
  6630   Wed May 9 08:21:42 2012 JamieUpdateCDSNo signals for DTT from SUS

Quote:

 c1iscey is much less happy - neither the IOP nor the scy model are willing to talk to fb.  I might give up on them after another few minutes, and wait for some daytime support, since I wanted to do DRMI stuff tonight.

Yeah, giving up now on c1iscey (Jamie....ideas are welcome).  I can lock just fine, including the Yarm, I just can't save data or see data about ETMY specifically.  But I can see LSC data, so I can lock, and I can now take spectra of corner optics.

 This is the mx_stream issue reported previously.  The symptom is that all models on a single front end loose contact with the frame builder, as opposed to *all* models on all front end loosing contact with the frame builder.  That indicates that the problem is a common fb communication issue on the single front end, and that's all handled with mx_stream.

ssh'ing into c1iscey and running "sudo /etc/init.d/mx_stream restart" fixed the problem.

  6640   Fri May 11 08:07:30 2012 JamieUpdateCDSFB

Quote:

Already for the second time today all computers loose connection to the framebuilder. When I ssh to framebuilder DAQD process was not running. I started it

controls@fb ~ 130$ sudo /sbin/init q

Just to be clear, "init q" does not start the framebuilder.  It just tells the init process to reparse the /etc/inittab.  And since init is supposed to be configured to restart daqd when it dies, it restarted it after the reloading of /etc/inittab.  You and Alex must have forgot to do that after you modified the inittab when you're were trying to fix daqd last week.

daqd is known to crash without reason.  It usually just goes unnoticed because init always restarts it automatically.  But we've known about this problem for a while.

Quote:

But I do not know what causes this problem. May be this is a memory issue. For FB

Mem:   7678472k total,  7598368k used,    80104k free

Practically all memory is used. If more is needed and swap is off, DAQD process may die.

This doesn't really mean anything, since the computer always ends up using all available memory.  It doesn't indicate a lack of memory.  If the machine is really running out of memory you would see lots of ugly messages in dmesg.

  6657   Tue May 22 11:32:02 2012 JamieUpdateCDSMEDM suspension screens using macro

Very nice, Yuta!  Don't forget to commit your changes to the SVN.  I took the liberty of doing that for you.  I also tweaked the file a bit, so we don't have to specify IFO and SYS, since those aren't going to ever change.  So the arguments are now only: OPTIC=MC1,DCU_ID=36.  I updated the sitemap accordingly.

Yuta, if you could go ahead and modify the calls to these screens in other places that would be great.  The WATCHDOG, LSC_OVERVIEW, MC_ALIGN screens are ones that immediately come to mind.

And also feel free to make cool new ones.  We could try to make simplified version of the suspension screens now being used at the sites, which are quite nice.

  6658   Tue May 22 11:45:12 2012 JamieConfigurationCDSPlease remember to commit SVN changes

Hey, folks.  Please remember to commit all changes to the SVN in a timely manor.  If you don't, multiple commits will get lumped together and we won't have a good log of the changes we're making.  You might also end up just loosing all of your work.  SVN COMMIT when you're done!  But please don't commit broken or untested code.

pianosa:release 0> svn status | grep -v '^?'
M       cds/c1/models/c1rfm.mdl
M       sus/c1/models/c1mcs.mdl
M       sus/c1/models/c1scx.mdl
M       sus/c1/models/c1scy.mdl
M       isc/c1/models/c1lsc.mdl
M       isc/c1/models/c1pem.mdl
M       isc/c1/models/c1ioo.mdl
M       isc/c1/models/ADAPT_XFCODE_MCL.c
M       isc/c1/models/c1oaf.mdl
M       isc/c1/models/c1gcv.mdl
M       isc/common/medm/OAF_OVERVIEW.adl
M       isc/common/medm/OAF_DOF_BLRMS.adl
M       isc/common/medm/OAF_OVERVIEW_BAK.adl
M       isc/common/medm/OAF_ADAPTATION_MICH.adl
pianosa:release 0>

  6659   Tue May 22 11:47:43 2012 JamieUpdateCDSMEDM suspension screens using macro

Actually, it looks like we're not quite done here.  All the paths in the SUS_SINGLE screen need to be updated to reflect the move.  We should probably make a macro that points to /opt/rtcds/caltech/c1/screens, and update all the paths accordingly.

  6662   Tue May 22 20:24:06 2012 JamieUpdateComputersrossa is now running Ubuntu 10.04

Now same as pianosa and rosalba.  I'll upgrade allegra on Friday.

  6683   Fri May 25 16:58:54 2012 JamieConfigurationComputers.bashrc for workstations

I have setup a shared .bashrc for all the workstations that is symlinked to the normal location on all machines:

controls@rossa:~ 0$ ls -al /home/controls/.bashrc 
lrwxrwxrwx 1 controls controls 23 2012-05-25 15:37 /home/controls/.bashrc -> /users/controls/.bashrc
controls@rossa:~ 0$ 

This should help simplify maintenance considerably.  Editing that file on one machine will edit it for all.  Just edit this one file!  Don't try to get fancy and add extra files!

I also added a bunch of aliases that had previously been missing.  This should help with some of the problems that people had been having.

NOTE: PLEASE DO NOT CHANGE THE DEFAULT SHELL!  We are using bash, because that's what the sites are now using and we want to be as compatible as possible.

You can of course still write scripts in csh/tcsh or use tcsh in a shell if you wish.   Just don't change the default shell for the controls user.

  6684   Fri May 25 17:50:38 2012 JamieUpdateComputersASS scripts on new ubuntu machines

Quote:

[Den, Yuta]

Background:
 ASS and many other scripts don't work on new ubuntu machines.

What we did:
1. Installed C-shell on rossa and rosalba(Ubuntu machine).
  sudo apt-get insall csh

2. Find out that
  /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

runs, but
  /opt/rtcds/caltech/c1/scripts/medmrun /opt/rtcds/caltech/c1/scripts/AutoDither/alignY

doesn't run. It gives us the following error messages.

ezcawrite: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory
ezcaswitch: error while loading shared libraries: libca.so: cannot open shared object file: No such file or directory

Result:
 ASS scripts run on rossa and rosalba, but not with medmrun.
 At least ASS scripts run on pianosa(ubuntu machine) with medmrun. So we decided to wait for JAMIE to fix it.

Apparently the environment was not being properly inherited by the scripts launched from medmrun.  We modified the medmrum script so that it executes things with an interactive shell ("bash -i -c ...") and this fixed the problem (by assuring that it sources all the interactive environment configs (i.e. ~/.bashrc)).  I'm still not sure why we were seeing different behavior on pianosa, but at least the solution we have now should be robust.

As a reminder, all scripts launched from MEDM should use medmrun:

/opt/rtcds/caltech/c1/scripts/medmrun
  6685   Fri May 25 17:52:08 2012 JamieUpdateComputersallegra now running Ubuntu 10.04

The last of the control room machines is now upgraded.

  6703   Tue May 29 15:29:16 2012 JamieUpdateComputerslatest pynds installed on all new control room machines

The DASWG lscsoft package repositories have a lot of useful analysis software.  It is all maintained for Debian "sqeeze", but it's mostly installable without modification on Ubuntu 10.04 "lucid" (which is based on Debian squeeze).  Basically the only thing that needs to access the lscsoft repositories is to add the following repository file:

controls@rossa:~ 0$ cat /etc/apt/sources.list.d/lscsoft.list 
deb http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze contrib
deb-src http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze contrib

deb http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze-proposed contrib
deb-src http://www.lsc-group.phys.uwm.edu/daswg/download/software/debian/ squeeze-proposed contrib
controls@rossa:~ 0$ 

A simple "apt-get update" then makes all the lscsoft packages available.

lscsoft includes the nds2 client packages (nds2-client-lib) and pynds (python-pynds).  Unfortunately the python-pynds debian squeeze package currently depends on libboost-python1.42, which is not available in Ubuntu lucid.  Fortunately, pynds itself does not require the latest version and can use what's in lucid.  I therefore rebuilt the pynds package on one of the control room machines:

$ apt-get install dpkg-dev devscripts debhelper            # these are packages needed to build a debian/ubuntu package
$ apt-get source python-pynds                              # this downloads the source of the package, and prepares it for a package build
$ cd python-pynds-0.7
$ debuild -uc -us                                          # this actually builds the package
$ ls -al ../python-pynds_0.7-lscsoft1+squeeze1_amd64.deb
-rw-r--r-- 1 controls controls 69210 2012-05-29 11:57 python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I then copied the package into a common place:

/ligo/apps/debs/python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I then installed it on all the control room machines as such:

$ sudo apt-get install libboost-python1.40.0 nds2-client-lib python-numpy   # these are the dependencies of python-pynds
$ sudo dpkg -i /ligo/apps/debs/python-pynds_0.7-lscsoft1+squeeze1_amd64.deb

I did this on all the control room machines.

It looks like the next version of pynds won't require us to jump through these extra hoops and should "just work".

  6716   Wed May 30 18:08:40 2012 JamieUpdateLSCc1lsc: add error point pick-offs, moved ctrl pick-offs after feedforward

I made some modifications to the c1lsc model in order to extract both the error and control signals.

I added pick-offs for the error signals right before IFO DOF filter modules.  These are then sent with GOTOs to outputs.

I also modified things on the control side.  The OAF stuff was picking off control signals before feedforward in/outs.  After discussing with Jenne we decided that it would make sense for the OAF to be looking at the control signals after feedforward.  It also makes sense to define the control signal after the feedforward.  These control signals are then sent with GOTOs to another set of outputs.

Finally, I moved the triggers to after the control signal pickoffs, and right before the output matrix.  The final chain looks like (see attachment):

input matrix --> power norm --> ERR pickoff --> DOF filters --> FF out --> FF in --> CTRL pickoff --> trigger --> output matrix

The error pickoff outputs in the top level of the model are left terminated for the moment.  Eventually I will be hooking these into the new c1cal calibration model.

The model was recompiled, installed, and restarted.  Everything came up fine.

Attachment 1: LSCchain.png
LSCchain.png
  6717   Wed May 30 18:16:44 2012 JamieUpdateLSCskeleton of new c1cal calibration model created

[Jamie, Xavi Siemens, Chris Pankow]

We built the skeleton of a new calibration model for the LSC degrees of freedom.  I named it "c1cal".  It will run on the c1lsc FE machine, in CPU slot 4, and has been given DCUID 50.

Right now there's not much in the model, just inputs for DARM_ERR and DARM_CTRL, filters for each input, and the sum of the two channels that is h(t).

Tomorrow we'll extract all the needed signals from c1lsc, and see if we can generate something resembling a calibrated signal for one of the IFO DOFs.

 

  6722   Thu May 31 00:56:13 2012 JamieMetaphysicsComputersPlease remember to check in code changes

I know it's really hard to remember, but our future selves will thank us dearly if we remember to commit all of our code changes to the svn with nice log messages.  At the moment there's a LOT of modified stuff in the userapps working directory that needs to be committed:

controls@pianosa:/opt/rtcds/userapps/release 0$ svn status | grep '^M'
M       cds/c1/models/c1rfm.mdl
M       sus/c1/medm/templates/SUS_SINGLE.adl
M       sus/c1/models/c1mcs.mdl
M       sus/c1/models/c1sus.mdl
M       sus/c1/models/c1scx.mdl
M       sus/c1/models/c1scy.mdl
M       isc/c1/models/c1pem.mdl
M       isc/c1/models/c1ioo.mdl
M       isc/c1/models/ADAPT_XFCODE_MCL.c
M       isc/c1/models/c1oaf.mdl
M       isc/c1/models/c1gcv.mdl
M       isc/common/medm/OAF_OVERVIEW.adl
M       isc/common/medm/OAF_DOF_BLRMS.adl
M       isc/common/medm/OAF_OVERVIEW_BAK.adl
M       isc/common/medm/OAF_ADAPTATION_MICH.adl
controls@pianosa:/opt/rtcds/userapps/release 0$ 

This doesn't even include things that haven't even been added yet.  It doesn't take much time.  Just copy and paste what you elog about the changes.

  6728   Thu May 31 10:31:19 2012 JamieUpdateIOOMC beam spot oscillation

Quote:

This is a common occurrence when diagnostic scripts are written without the ability to handle exceptions (e.g. ctrl-c, terminal gets closed, etc.).

The first thing to do is make sure that the "new" script you are writing doesn't already exist (hint: look in the old scripts directory).

If you are writing a script that touches things in the interferometer, it must always return the settings to the initial state on abnormal termination:

http://linuxdevcenter.com/pub/a/linux/lpt/44_12.html

This is very good advice.  However, "trap" is bash-specific.  tcsh has a different method that uses a function called "onint".  Here's a description of the difference.

A couple notes about bash traps:

  • You can give a name instead of a number for the signal.  So instead of trap 'do stuff' 1 you can say trap 'do stuff' SIGHUP
  • The easiest signal to use is EXIT, which covers all your bases (ie. anything that would cause the script to exit prematurely.
  • You can define a function that gets executed in the trap

So the easiest way to use it is something like the following:

#!/bin/bash   # define cleanup function  function cleanup {      # do cleanup stuff, like reset EPICS records to defaults      ....  }  # set the trap on EXIT  trap cleanup EXIT  # the rest of your script below here
...
  6734   Thu May 31 22:13:08 2012 JamieUpdateCDSc1lsc: added remaining SHMEM senders for ERR and CTRL, c1oaf model updated appropriately

All the ERR and CTRL outputs in c1lsc now go to SHMEM senders.  I renamed the the CTRL output SHMEM senders to be more generic, since they aren't specifically for OAF anymore.  See attached image from c1lsc.

c1oaf was updated so that SHMEM receivers pointed to the newly renamed senders.

c1lsc and c1oaf were rebuilt, installed, and restarted and are now running.

Attachment 1: lsc-shmem-out.png
lsc-shmem-out.png
  6740   Fri Jun 1 09:50:50 2012 JamieUpdateComputersc1sus and c1iscex - bad fb connections

Quote:

Something bad happened to c1sus and c1iscex ~20 min ago.  They both have "0x2bad" 's.  I restarted the daqd on the framebuilder, and then rebooted c1sus, and nothing changed.  The SUS screens are all zeros (the gains seem to be set correctly, but all of the signals are 0's).

If it's not fixed when I get in tomorrow, I'll keep poking at it to make it better.

 This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

  6742   Fri Jun 1 14:40:24 2012 JamieUpdateComputersc1sus and c1iscex - bad fb connections

Quote:

This is at least partially related to the mx_stream issue I reported previously.  I restarted mx_stream on c1iscex and that cleared up the models on that machine.

Something else is happening with c1sus.  Restarting mx_stream on c1sus didn't help.  I'll try to fix it when I get over there later.

I managed to recover c1sus.  It required stopping all the models, and the restarting them one-by-one:

$ rtcds stop all     # <-- this does the right to stop all the models with the IOP stopped last, so they will all unload properly.

$ rtcds start iop

$ rtcds start c1sus c1mcs c1rfm

I have no idea why the c1sus models got wedged, or why restarting them in this way fixed the issue.

  6743   Fri Jun 1 14:56:08 2012 JamieUpdateSUSOplevs all different, messed up

For some reason the state of the oplevs is completely different for almost every suspension.  They have different sets of filters in the bank, and different filters engaged.  wtf?  How did this happen?  Is this correct?  Do we expect that the state of the oplevs should be different on all the different suspensions?  I wouldn't have thought so.

I discovered this because the PRM is unstable with the oplevs engaged.  I don't think it was yesterday.  Is something hidden changing the oplev settings?

Attachment 1: oplevs.png
oplevs.png
  6755   Tue Jun 5 14:47:28 2012 JamieUpdateCDSnew c1tst model for testing RCG code

I made a new model, c1tst, that we can use for debugging the FREQUENT RCG bugs that we keep encountering.  It's a bare model that runs on c1iscey.  Don't do any thing important in here, and don't leave it in some crappy state.  Clean if up when you're done.

  6768   Wed Jun 6 18:04:22 2012 JamieUpdateComputer Scripts / Programshacked ezca tools

Quote:

Currently, ezca tools are flakey and fails too much.
So, I hacked ezca tools just like Yoichi did in 2009 (see elog #1368).

For now,

/ligo/apps/linux-x86_64/gds-2.15.1/bin/ezcaread
/ligo/apps/linux-x86_64/gds-2.15.1/bin/ezcastep
/ligo/apps/linux-x86_64/gds-2.15.1/bin/ezcaswitch
/ligo/apps/linux-x86_64/gds-2.15.1/bin/ezcawrite

are wrapper scripts that repeats ezca stuff until it succeeds (or fails more than 5 times).

Of course, this is just a temporary solution to do tonight's work.
To stop this hack, run /users/yuta/scripts/ezhack/stophacking.cmd. To hack, run /users/yuta/scripts/ezhack/starthacking.cmd.

Original binary files are located in /ligo/apps/linux-x86_64/gds-2.15.1/bin/ezcabackup/ directory.
Wrapper scripts live in /users/yuta/scripts/ezhack directory.

I wish I could alias ezca tools to my wrapper scripts so that I don't have to touch the original files. However, alias settings doesn't work in our scripts.
Do you have any idea?

I didn't like this solution, so I hacked up something else.  I made a new single wrapper script to handle all of the utils.  It then executes the correct command based on the zeroth argument (see below).

I think moved all the binaries to give them .bin suffixes, and the made links to the new wrapper script.  Now everything should work as expected, with this new retry feature.

controls@rosalba:/ligo/apps/linux-x86_64/gds-2.15.1/bin 0$ for pgm in ezcaread ezcawrite ezcaservo ezcastep ezcaswitch; do mv $pgm{,.bin}; ln ezcawrapper $pgm; done
controls@rosalba:/ligo/apps/linux-x86_64/gds-2.15.1/bin 0$ cat ezcawrapper
#!/bin/bash

retries=5

pgm="$0"
run="${pgm}.bin"

if ! [ -e "$run" ] ; then
    cat <&2
This is the ezca wrapper script.  It should be hardlinked in place of
the ezca commands (ezcaread, ezcawrite, etc.), and executing the
original binaries (that have been moved to *.bin) with $retries
failure retries.
EOF
    exit -1
fi

if [ -z "$@" ] || [[ "$1" == '-h' ]] ; then
    "$run"
    exit
fi

for try in $(seq 1 "$retries") ; do
    if "$run" "$@"; then
	exit
    else
	echo "retrying ($try/$retries)..." >&2
    fi
done
echo "$(basename $pgm) failed after $retries retries." >&2
exit 1

  6769   Wed Jun 6 18:22:52 2012 JamieUpdateComputer Scripts / Programshacked ezca tools

Quote:

I didn't like this solution, so I hacked up something else.  I made a new single wrapper script to handle all of the utils.  It then executes the correct command based on the zeroth argument (see below).

I think moved all the binaries to give them .bin suffixes, and the made links to the new wrapper script.  Now everything should work as expected, with this new retry feature.

Yuta and I added a feature such that it will not retry if the environment variables EZCA_NORETRY is set, e.g.

$ EZCA_NORETRY=true ezcaread FOOBAR

  6772   Wed Jun 6 22:04:19 2012 JamieUpdateComputersFailed attempt to get pianosa to support dual monitors

I've spent an inordinate amount of time trying to figure out how to get pianosa to support dual monitors.  It apparently has some special nvidia graphics that are not well supported in Ubuntu (10.04 at least).  I've tried installing a newer kernel (3.0.0) and installing the special nvidia compile-from-source Ubuntu packages, but nothing is working.  And now unfortunately he's not giving me any video at all.  I'm sick of this today, so I'll try again tomorrow.

In the future we should avoid these bullshit nvidia cards like the plague.

  6773   Wed Jun 6 22:23:53 2012 JamieUpdateComputersFailed attempt to get pianosa to support dual monitors

I managed to get pianosa working again with just a single monitor.  I'm done trying to configure it for dual, though.

  6774   Wed Jun 6 22:25:05 2012 JamieUpdateComputerszita now running ubuntu 10.04

I'll finish the rest of the config to turn it into a full-fledged workstation tomorrow.

  6783   Thu Jun 7 10:39:21 2012 JamieUpdateComputerspianosa dual monitor working

I decided to remove what I thought was the problematic extra nVidia video card, since there are already two DVI outputs build-in.  The card turned out to not even be nVidia, so I don't know what was going on there.

I futzed with the BIOS to configure the primary video card, which is some new Intel card.  The lucid (10.04) support for it is lacking, but it was easy enough to pull in new drivers from the appropriate Ubuntu PPA repository:

controls@pianosa:~ 0$ sudo apt-add-repository ppa:f-hackenberger/x220-intel-mesa
controls@pianosa:~ 0$ sudo apt-add-repository ppa:glasen/intel-driver
controls@pianosa:~ 0$ sudo apt-get update
...
controls@pianosa:~ 0$ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be upgraded:
  libdrm-intel1 libdrm-nouveau1 libdrm-radeon1 libdrm2 libgl1-mesa-dri libgl1-mesa-glx libglu1-mesa xserver-xorg-video-intel
8 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,212kB of archives.
After this operation, 25.2MB disk space will be freed.
Do you want to continue [Y/n]?
...
controls@pianosa:~ 0$

After a reboot, both monitors came up fine.

http://www.subcritical.org/running_ubuntu_lts_on_sandy_bridge/

  6784   Thu Jun 7 12:49:50 2012 JamieUpdateComputerszita network configured

Added zita to the martian host table, and configured his network accordingly:

zita            A       192.168.113.217

https://wiki-40m.ligo.caltech.edu/Martian_Host_Table

  6787   Thu Jun 7 17:49:09 2012 JamieUpdateCDSc1sus in weird state, running models but unresponsive otherwise

Somehow c1sus was in a very strange state.  It was running models, but EPICS was slow to respond.  We could not log into it via ssh, and we could not bring up test points.  Since we didn't know what else to do we just gave it a hard reset.

Once it came it, none of the models were running.  I think this is a separate problem with the model startup scripts that I need to debug.  I logged on to c1sus and ran:

rtcds restart all

(which handles proper order of restarts) and everything came up fine.

Have no idea what happened there to make c1sus freeze like that.  Will keep an eye out.

  6801   Tue Jun 12 11:25:05 2012 JamieUpdateComputersrtcds: command not found

Quote:

We can't compile any changes to the LSC or the GCV models since Jamie's new script / program isn't found.  I don't know where it is (I can't find it either), so I can't do the compiling by hand, or point explicitly to the script.  The old way of compiling models in the wiki is obsolete, and didn't work :(

Sorry about that.  I had modified the path environment that pointed to the rtcds util.  The rtcds util is now in /opt/rtcds/caltech/c1/scripts/rtcds, which is in the path.  Starting a new shell should make it available again.

  6803   Tue Jun 12 13:49:32 2012 JamieConfigurationComputer Scripts / Programstconvert

A nicer, better maintained version of tconvert is now supplied by the lalapps package.  It's called lalapps_tconvert.  I installed lalapps on all the workstations and aliased tconvert to point to lalapps_tconvert.

  6837   Wed Jun 20 01:02:20 2012 JamieUpdateGreen LockingRF amp removed from X arm ALS setup

I very badly forgot to log about this in the crush of surfs.

I removed Koji's proto-beatbox RF comparator amp from the X arm ALS setup.  I was investigating hacking it onto one of the discriminator channels in the new beatbox, now that Yuta/Koji's Yuta/Koji's phase tracker is making the coarse beatbox path obsolete.  Upon further reflection we decided to just go ahead and stuff the beatbox board for the X arm, and use the proto-beatbox to test some faster ECL comparators.  This will be done first thing in the morning.

In the meantime the old amp is in my cymac mess on the far left of the electronics bench.

  6850   Thu Jun 21 20:07:18 2012 JamieUpdateGreen LockingImproved beatbox returns

I've reinstalled the beatbox in the 1X2 rack.  This improved version has the X and Y arm channels stuffed, but just one of the DFD channels (fine) each.

I hooked up the beat PD signals for X and Y to the RF inputs, and used the following two delay lines:

  • X: 140' light-colored cable on spool
  • Y: 30m black cable

The following channel --> c1ioo ADC --> c1gcv model connections were made:

  • X I  --> SR560 whitening --> ADC 22 -->  X fine I
  • X Q --> SR560 whitening --> ADC 23 --> X fine Q
  • Y I --> SR560 whitening --> ADC 24 --> Y fine I
  • Y Q --> SR560 whitening --> ADC 25 --> Y fine Q

The connections to the course inputs on the ALS block were grounded.  I then recompiled, reinstalled, and restarted c1gcv.  Functioning fine so far.

 

  6856   Fri Jun 22 19:52:47 2012 JamieUpdateComputersottavia reconfigured as CDS workstation

ottavia has been reinstalled with Ubuntu 10.04 LTS, and has been configured as a CDS workstation.

I have been maintaining a script that takes a stock 10.04 install and configures it as a workstation.  I've attached it here, but it lives at:

/users/controls/workstation-setup.sh

The script is designed to be idempotent, i.e. it can be run on a machine that has already been configured and it will either have no affect or update.

Attachment 1: workstation-setup.sh
#!/bin/bash

# This is a CDS workstation setup script for Ubuntu 10.04 It is
# idempotent, so running it on an already configured workstation is
# fine.

########################################
### controls

if [[ $(id -u controls) != 1001 ]] ; then
... 111 more lines ...
  6857   Fri Jun 22 20:00:14 2012 JamieOmnistructureElectronicstwo RG-405 cables ran from 1X2 rack to control room

[Yaakov, Eric, Jenne, Yuta]

Two of our surfs, Yaakov and Eric, pulled two unused RG-405/SMA cables that had been running from 1X2 to (mysteriously) 1Y2 racks.  They left the 1X2 end where it was and pulled the 1Y2 end and rerouted it to the control room.  We labeled both ends appropriately.

The end at 1X2 is now plugged into a splitter that is combining the RF input monitor outputs for the X and Y beatbox channels, so that we can watch the beat signals with the HP8591 in the control room.

  6867   Mon Jun 25 11:27:13 2012 JamieSummaryComputersc1ioo is down

Quote:

Quote:

I tried to restart c1ioo becuase I can't live without him.

I couldn't ssh or ping c1ioo, so I did hardware reboot.
c1ioo came back, but now ADC/DAC stats are all red.

c1ioo was OK until 3am when I left the control room last night. I don't know what happened, but StripTool from zita tells me that MC lock went off at around 4pm.

 c1ioo was still all red on the CDS status screen, so I tried a couple of things.

mxstreamrestart (which aliases on the front ends to sudo /etc/init.d/mx_stream restart) didn't help

sudo shutdown -r now didn't change anything either....c1ioo came back with red everywhere and 0x2bad on the IOP

eventually doing as Jamie did for c1sus in elog 6742, rtcds stop all, then rtcds start all fixed everything.  Interestingly, when I tried rtcds start iop, I got the error
Cannot start/stop model 'iop' on host c1ioo, so I just tried rtcds start all, and that worked fine....started with c1x03, then c1ioo, then c1gcv.

There seems to be a problem with how models are coming up on boot since the upgrade.  I think the IOP isn't coming up correctly for some reason, which is then preventing the rest of the models from starting since they depend on the IOP.

The simple way to fix this is to run the following:

ssh c1ioo
rtcds restart all

The "restart" command does the smart thing, by stopping all the models in the correct order (IOP last) and then restarting them also in the correct order (IOP first).

  6883   Wed Jun 27 15:10:34 2012 JamieUpdateComputer Scripts / Programs40m summary webpages move

I have moved the summary pages stuff that Duncan set up to a new directory that it accessible to the nodus web server and is therefore available from the outside world:

/users/public_html/40-summary

which is available at:

https://nodus.ligo.caltech.edu:30889/40m-summary/

I updated the scripts, configurations, and crontab appropriately:

/users/public_html/40m-summary/bin/c1_summary_page.sh
/users/public_html/40m-summary/share/c1_summary_page.ini

 

  6897   Fri Jun 29 22:56:35 2012 JamieUpdateVACinput telescope beam clipping on Faraday

[Yuta, Koji, Jamie]

We went into ITMX chamber to inspect the situation there.  We looked for clipping and flipping at PR2, and found none, although we noticed that the beam at PR2 looks a little clipped.

We then went back into the BS chamber and took a closer look at the beam incident on PRM, and the situation with PR3.  The PRM incident beam looked a little clipped, which we expected from the PR2 observation.  But the beam looks well centered on PRM and PR3.  As best I could tell the beam is reflecting off the front surface of PR3, as expected.

Looking at the beam around MMT1 and PJ2 (the second PZT on BS), we could tell that the beam incident and reflected off of MMT1 looked round, where as the beam incident on PJ2 looked clipped.  Using my tallness super power I was able to reach into the IMC chamber and confirm that the beam going from MMT1 to MMT2 clips fairly badly on the edge of the Faraday.   Koji speculates that this is the result of a misalignment of the PSL output beam into the MC.  In any event, it's not clear how this would be the cause of our PRC woes.

We decided to close up for the night, and let Yuta work on aligning PRMI.  We need to figure out what the heck to do now.

We've been watching the input power reduce, from 18 mW initially when we first went in, to about 5mW now.  It seems to be leveling out now.  It's unclear what would have been causing it.  Drift of the input polarization?

  6907   Tue Jul 3 17:56:35 2012 JamieUpdateGreen LockingLaseroptik dichroic optics received

We have received the dichroic optics from Laseroptik.  The coatings are:

HR:

  • 532nm: T(s+p) > 97%
  • 1064nm:  R(p) > 99.9%

AR:

  • 532nm: R(s+p) < 1%
  • 1064nm: R(p) < 2%

We got two sets with these coatings:

  • 6x: 50 x 9.5mm, 2 degree wedge
  • 8x: 25 x 6.35mm, 2 degree wedge
  • 1x: 25 x 3mm, witness
  6909   Tue Jul 3 19:04:59 2012 JamieUpdateGreen LockingLaseroptik dichroic optics received

I put them in the "visible optics" drawer of the newish, metal optics cabinet with the thin drawers down the Y arm.

  6911   Wed Jul 4 17:33:04 2012 JamieUpdateCDStiming, possibly leap second, brought down CDS

I got a call from Koji and Yuta that something was wrong with the CDS system.  I somehow had an immediate suspicion that it had something to do with the recent leap second.

It took a while for nodus to respond, and once he finally let me in I found a bunch of the following in his dmesg, repeated and filling the buffer:

Jul  3 22:41:34 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) 0.998366 s
Jul  3 22:46:20 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) -1.000847 s

Looking at date on all the front end systems, including fb, I could tell that they all looked a second fast, which is what you would expect if they had missed the leap second.  Everything syncs against nodus, so given nodus's problems above, that might explain everything.

I stopped daqd and nds on fb, and unloaded the mx drivers, which seemed to be showing problems.  I also stopped nodus's xntp:

  sudo /etc/init.d/xntpd stop

His ntp config file is in /etc/inet/ntp.conf, which is definitely the WRONG PLACE, given that the ntp server is not, as far as I can tell, being controlled by inetd.  (nodus is WAY out of date and desperately needs an overhaul.  it's nearly impossible to figure out what the hell is going on in there).  I found an old elog of Rana's that mentioned updating his config to point him to the caltech NTP server, which is now listed in the config, so I tried manually resyncing against that:

  sudo ntpdate -s -b -u 131.215.239.14

Unfortunately that didn't seem to have any effect.  This was making me wonder if the caltech server is off?  Anyway, I tried resyncing against the global NTP pool:

  sudo ntpdate -s -b -u pool.ntp.org

This seemed to work: the clock came back in sync with others that are known good.  Once nodus time was good I reloaded the mx drivers on fb and restarted daqd and nds.  They seemed come up fine.  At this point front ends started coming back on their own.  I went and restarted all the models on the machines that didn't (c1iscey and c1ioo).  Currently everything is looking ok.

I'm worried that there is still a problem with one of the NTP servers that nodus is sync'ing against, and that the problem might come back.  I'll check in again later tonight.

  6917   Thu Jul 5 10:49:38 2012 JamieUpdateCDSfront-end/fb communication lost, likely again due to timing offsets

All the front-ends are showing 0x4000 status and have lost communication with the frame builder.  It looks like the timing skew is back again.  The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds!  I'm looking into it now.

  6919   Thu Jul 5 12:06:35 2012 JamieUpdateComputersNDS2 client now working on Ubuntu machines

What I did

NDS2 was not working on any of the machines, so the first thing I did was simply to install the newest version. I downloaded the latest tarball (0.9.1) from the LDAS Wiki, unzipped and installed it

/users/zach $ tar -xvf nds2-client-0.9.1.tar

/users/zach $ cd nds2-client-0.9.1

/users/zach $ sudo ./configure --prefix=/cvs/cds/caltech/apps/linux64 --with-matlab=/cvs/cds/caltech/apps/linux64/matlab/bin/matlab

/users/zach $ sudo make

/users/zach $ sudo make install

No no, this is totally unnecessary.  NDS2 was already installed on every machine from the official packaged releases (apt-get install nds2-client), and it's known to work fine. We use it with pynds all the time. If the matlab component is not working we should figure out the right way to fix it with the existing packages.

In general, please only manually install software as a very last resort.  Manually installed software doesn't get maintained, where as the officially packaged stuff is being actively maintained by the collaboration. If there is a problem with the distributed packaging we should report it and get it fixed (and hint I was the one who built the original Debian packaging for nds2, so I know how to fix all the issues).  I'm trying to bring the 40m out of the dark days of complete chaos, where random software was installed in random locations.

Even with the new version, it still didn't work. 

That's because this wasn't the problem!

Solution: The main problem was that the cyrus-sasl-gssapi authentication protocol was not installed on these machines, so that even with a kerberos ticket the datalink could not be established. Using information from the LDAS Wiki, I used aptitude to install it as:

$ sudo aptitude install lscsoft-auth

This group installs both the SASL protocol and the package python-kerberos

I also needed to update the kerberos config file for each machine, which is located at /etc/krb5.conf. I found that ottavia had a nice one with many realms, so I copied that one over to the other machines. In any case where there was an old config file overwritten, it is now /etc/krb5.conf.old.

Finally, the matlab path for NDS2 was still set to the old 2010a directory (/cvs/cds/caltech/apps/linux64/lib/matlab2010a) that was created by the NDS2 install when Rana originally did it. The new install I made above created the appropriate 2010b mexa64 files, so I changed the matlab path within matlab to this one:

>> rmpath /cvs/cds/caltech/apps/linux64/lib/matlab2010a

>> addpath /cvs/cds/caltech/apps/linux64/lib/matlab2010b

>> savepath

This sounds like it's more likely the issues. You did the right thing by going to apt to fix the authentication packages.  It's curious to me that you did that here, whereas you went totally out of band for the nds2 client stuff.  Why?

The matlab mex files are the other problem.  But there is also a nds2-client-matlab Debian/Ubuntu package for that as well.  The problem is that the package just distributes the source, and it needs to be compiled.  I'll help figure out a good way to do that.

  • I would like to extend this to the 32-bit machines, but I have to figure out the best way to install the proper NDS2 client without interfering with the 64-bit version. I think it is just a matter of specifying the matlabroot in the .../linux/ instead of .../linux64/

Again, this is handled by the packaging!  Just use apt and the right architecture is installed automatically.

But what 32 bit machines are you referring to?  I think basically everything is 64 bit nowadays.

  • It would be nice to find a way that the nice tool gps('MM/DD/YYYY XX:XX:XX UTC'), which calls the ligotool executable tconvert, can be automatically usable when calling NDS2 functions. Right now, there seems to be an issue preventing that: even though tconvert can be run in the terminal, gps() returns an error and even directly running unix('tconvert now') or !tconvert returns the same error. I have emailed Peter Shawhan to see if he has any advice. 

We are now using lalapps_tconvert for tconvert.  We're not using that ligotools crap anymore.  I've aliased that to tconvert on the command line, but maybe matlab isn't getting the message.  I'll try to think of a more robust solution (e.g. make a wrapper script).

  6920   Thu Jul 5 12:27:05 2012 JamieUpdateCDSfront-end/fb communication lost, likely again due to timing offsets

Quote:

All the front-ends are showing 0x4000 status and have lost communication with the frame builder.  It looks like the timing skew is back again.  The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds!  I'm looking into it now.

This was indeed another leap second timing issue.  I'm guessing nodus resync'd from whatever server is posting the wrong time, and it brought everything out of sync again.  It really looks like the caltech server is off.  When I manually sync form there the time is off by a second, and then when I manually sync from the global pool it is correct.

I went ahead and updated nodus's config (/etc/inet/ntp.conf) to point to the global pool (pool.ntp.org).  I then restarted the ntp daemon:

  nodus$ sudo /etc/init.d/xntpd stop
  nodus$ sudo /etc/init.d/xntpd start

That brought nodus's time in sync.

At that point all I had to do was resync the time on fb:

  fb$ sudo /etc/init.d/ntp-client restart

When I did that daqd died, but it immediately restarted and everything was in sync.

  6951   Tue Jul 10 10:50:02 2012 JamieConfigurationGeneralSeismometer repositioning

Quote:

Today I REPOSITIONED THE SEISMOMETERS in order to triangulate noise sources (as Rana suggested).

I re-levelled all of them, locked them, and turned them on. They should be located out of sight, but just in case:

GUR 1 IS DOWN THE X-ARM, behind the interferometer.

GUR 2 IS BETWEEN THE TWO ARMS, BEHIND THE CABLE TRAP THAT RUNS PARALLEL TO THE X-ARM.

STS 1 IS DOWN THE Y-ARM behind the interferometer.

I'll wait a day for them to stabilize (continuing to reset STS-1 every hour or so) and then begin taking data tomorrow morning, depending on the condition of the signal.

Ideally, I'd like a few days' worth of data, so I'll update when I've changed the configuration back to the way it was prior.  

 Highlighting good, ALL CAPS LESS SO!

  6997   Fri Jul 20 17:11:50 2012 JamieUpdateCDSAll custom MEDM screens moved to cds_users_apps svn repo

Since there are various ongoing requests for this from the sites, I have moved all of our custom MEDM screens into the cds_user_apps SVN repository.  This is what I did:

For each system in /opt/rtcds/caltech/c1/medm, I copied their "master" directory into the repo, and then linked it back in to the usual place, e.g.:

a=/opt/rtcds/caltech/c1/medm/${model}/master
b=/opt/rtcds/userapps/trunk/${system}/c1/medm/${model}
mv $a $b
ln -s $b $a

Before committing to the repo, I did a little bit of cleanup, to remove some binary files and other known superfluous stuff.  But I left most things there, since I don't know what is relevant or not.

Then committed everything to the repo.

 

  7005   Mon Jul 23 18:16:13 2012 JamieUpdateSUSDQ block added to sus_single_control library parts

I have added a DQ block to the sus_single_control library part.  This means that all sus models will automatically generate DQ channels based on what is specified in this doc block:

#DAQ Channels

SUSPOS_IN1
SUSPIT_IN1
SUSYAW_IN1
SUSSIDE_IN1
ULSEN_OUT
URSEN_OUT
LRSEN_OUT
LLSEN_OUT
SDSEN_OUT
OLPIT_IN1
OLYAW_IN1
OLSUM_IN1

So for instance, for BS will have the following DQ channels:

C1:SUS-BS_SUSPOS_IN1_DQ
C1:SUS-BS_SUSPIT_IN1_DQ
...

etc.  The channels names modified by the activateDQ.py script after install are still modified appropriately.

This is now the place where we should be maintaining DQ channels.

  7007   Mon Jul 23 18:41:15 2012 JamieUpdateGreen LockingALS_END.mdl model added for end station green ALS channels

The end sus models (c1scx and c1scy) both contain some ALS stuff.  This stuff could maybe be moved to their own models, but whatever.

The stuff at X and Y were identical, but were code copies (BAD!).  I made a new library part for the ALS end controls: ${userapps}/isc/c1/model/ALS_END.mdl

It contains just some filter modules for the ALS end laser control, and a monitor of the ALS end REFL PD DC.  I also added a DQ block for the recorded channels (see screen shot).

When I added this new part to c1scx and c1scy I made it so the channel names would be more sensible.  Instead of "GCX" and "GCY", they are now "ALS-X" and "ALS-Y".  They will now all show up under the ALS subsystem.

 

 

Attachment 1: alsend.png
alsend.png
  7008   Mon Jul 23 18:57:52 2012 JamieUpdateCDSc1scx and c1scy models recompiled and restarted

After the changes listed in 7005 and 7007, I have rebuilt, installed, and restarted the c1scx and c1scy models.  Everything seems to have come back up ok.

Running into some daqd troubles because of a change to c1ioo, but will report on the new ALS channels when I can.

  7011   Mon Jul 23 19:50:43 2012 JamieUpdateCDSc1gcv model renamed to c1als

I decided to rename the c1gcv model to be c1als.  This is in an ongoing effort to rename all the ALS stuff as ALS, and get rid of the various GC{V,X,Y} named stuff.

Most of what was in the c1gcv model was already in a subsystem with and ALS top names, but there were a couple of channels that were outside of that that had funky names, namely the "GCV_GREEN" channels.  This fixes that, and make things more consistent and simple.

Of course this required a bunch of other little changes:

  • rename model in userapps svn
  • target/fb/master had to be modified to point to the new chans/daq/C1ALS.ini channel file and gds/param/tpchn_c1als.par testpoint file
  • rename RFM channels appropriately, and fix in receiver models (c1scx, c1scy, c1mcs)
  • move custom medm screens in userapps svn (isc/c1/medm/c1als), and link to it at medm/c1als/master
  • moved old medm/c1gcv directory into a subdirectory of medm/c1als
  • update all medm screens that point to c1gcv stuff (mostly just ALS screens)

The above has been done.  Still todo:

  • FIX SCRIPTS!  There are almost certainly scripts that point to GC{V,X,Y} channels.  Those will have to be fixed as we come across them.
  • Fix the c1sc{x,y}/master/C1SC{X,Y}_GC{X,Y}_SLOW.adl screens.  I need to figure out a more consistent place for those screens.
  • Fix the C1ALS_COMPACT screen
  • ???

 

  7013   Mon Jul 23 20:34:38 2012 JamieOmnistructureComputersCHECK IN YOUR CHANGES TO THE SVN

I'm seeing LOTS OF STUFF NOT CHECKED INTO THE SVN!!!  both modified things that haven't been updated, and things that looked like they haven't been checked in at all.

Please check in your stuff to the SVN!  We need the record!

Look through EVERYTHING that you think you might have touched, or even care about, and make sure it's checked in.

  7043   Fri Jul 27 14:27:14 2012 JamieUpdateCDSnew c1tst model for testing RCG code

Quote:

 

 I wanted to test biquad form in this model. I added biquad=1 flag to cdsParameters, compiled, installed and restarted it. After that c1iscey suspended.

The same thing as we had several month ago

controls@c1iscey /opt/rtcds/caltech/c1/target/c1tst/c1tstepics 0$ cat iocC1.log

Starting iocInit
iocRun: All initialization complete
sh: iniChk.pl: command not found
Failed to load DAQ configuration file

I have fixed the iniChk.pl issue (which actually fixed a separate model startup-on-boot issue that we had been having).  However, that is completely unrelated to the system freeze.  I'll discuss that in a separate post.

ELOG V3.1.3-