40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 44 of 344  Not logged in ELOG logo
ID Date Authorup Type Category Subject
  6784   Thu Jun 7 12:49:50 2012 JamieUpdateComputerszita network configured

Added zita to the martian host table, and configured his network accordingly:

zita            A       192.168.113.217

https://wiki-40m.ligo.caltech.edu/Martian_Host_Table

  6787   Thu Jun 7 17:49:09 2012 JamieUpdateCDSc1sus in weird state, running models but unresponsive otherwise

Somehow c1sus was in a very strange state.  It was running models, but EPICS was slow to respond.  We could not log into it via ssh, and we could not bring up test points.  Since we didn't know what else to do we just gave it a hard reset.

Once it came it, none of the models were running.  I think this is a separate problem with the model startup scripts that I need to debug.  I logged on to c1sus and ran:

rtcds restart all

(which handles proper order of restarts) and everything came up fine.

Have no idea what happened there to make c1sus freeze like that.  Will keep an eye out.

  6801   Tue Jun 12 11:25:05 2012 JamieUpdateComputersrtcds: command not found

Quote:

We can't compile any changes to the LSC or the GCV models since Jamie's new script / program isn't found.  I don't know where it is (I can't find it either), so I can't do the compiling by hand, or point explicitly to the script.  The old way of compiling models in the wiki is obsolete, and didn't work :(

Sorry about that.  I had modified the path environment that pointed to the rtcds util.  The rtcds util is now in /opt/rtcds/caltech/c1/scripts/rtcds, which is in the path.  Starting a new shell should make it available again.

  6803   Tue Jun 12 13:49:32 2012 JamieConfigurationComputer Scripts / Programstconvert

A nicer, better maintained version of tconvert is now supplied by the lalapps package.  It's called lalapps_tconvert.  I installed lalapps on all the workstations and aliased tconvert to point to lalapps_tconvert.

  6837   Wed Jun 20 01:02:20 2012 JamieUpdateGreen LockingRF amp removed from X arm ALS setup

I very badly forgot to log about this in the crush of surfs.

I removed Koji's proto-beatbox RF comparator amp from the X arm ALS setup.  I was investigating hacking it onto one of the discriminator channels in the new beatbox, now that Yuta/Koji's Yuta/Koji's phase tracker is making the coarse beatbox path obsolete.  Upon further reflection we decided to just go ahead and stuff the beatbox board for the X arm, and use the proto-beatbox to test some faster ECL comparators.  This will be done first thing in the morning.

In the meantime the old amp is in my cymac mess on the far left of the electronics bench.

  6850   Thu Jun 21 20:07:18 2012 JamieUpdateGreen LockingImproved beatbox returns

I've reinstalled the beatbox in the 1X2 rack.  This improved version has the X and Y arm channels stuffed, but just one of the DFD channels (fine) each.

I hooked up the beat PD signals for X and Y to the RF inputs, and used the following two delay lines:

  • X: 140' light-colored cable on spool
  • Y: 30m black cable

The following channel --> c1ioo ADC --> c1gcv model connections were made:

  • X I  --> SR560 whitening --> ADC 22 -->  X fine I
  • X Q --> SR560 whitening --> ADC 23 --> X fine Q
  • Y I --> SR560 whitening --> ADC 24 --> Y fine I
  • Y Q --> SR560 whitening --> ADC 25 --> Y fine Q

The connections to the course inputs on the ALS block were grounded.  I then recompiled, reinstalled, and restarted c1gcv.  Functioning fine so far.

 

  6856   Fri Jun 22 19:52:47 2012 JamieUpdateComputersottavia reconfigured as CDS workstation

ottavia has been reinstalled with Ubuntu 10.04 LTS, and has been configured as a CDS workstation.

I have been maintaining a script that takes a stock 10.04 install and configures it as a workstation.  I've attached it here, but it lives at:

/users/controls/workstation-setup.sh

The script is designed to be idempotent, i.e. it can be run on a machine that has already been configured and it will either have no affect or update.

Attachment 1: workstation-setup.sh
#!/bin/bash

# This is a CDS workstation setup script for Ubuntu 10.04 It is
# idempotent, so running it on an already configured workstation is
# fine.

########################################
### controls

if [[ $(id -u controls) != 1001 ]] ; then
... 111 more lines ...
  6857   Fri Jun 22 20:00:14 2012 JamieOmnistructureElectronicstwo RG-405 cables ran from 1X2 rack to control room

[Yaakov, Eric, Jenne, Yuta]

Two of our surfs, Yaakov and Eric, pulled two unused RG-405/SMA cables that had been running from 1X2 to (mysteriously) 1Y2 racks.  They left the 1X2 end where it was and pulled the 1Y2 end and rerouted it to the control room.  We labeled both ends appropriately.

The end at 1X2 is now plugged into a splitter that is combining the RF input monitor outputs for the X and Y beatbox channels, so that we can watch the beat signals with the HP8591 in the control room.

  6867   Mon Jun 25 11:27:13 2012 JamieSummaryComputersc1ioo is down

Quote:

Quote:

I tried to restart c1ioo becuase I can't live without him.

I couldn't ssh or ping c1ioo, so I did hardware reboot.
c1ioo came back, but now ADC/DAC stats are all red.

c1ioo was OK until 3am when I left the control room last night. I don't know what happened, but StripTool from zita tells me that MC lock went off at around 4pm.

 c1ioo was still all red on the CDS status screen, so I tried a couple of things.

mxstreamrestart (which aliases on the front ends to sudo /etc/init.d/mx_stream restart) didn't help

sudo shutdown -r now didn't change anything either....c1ioo came back with red everywhere and 0x2bad on the IOP

eventually doing as Jamie did for c1sus in elog 6742, rtcds stop all, then rtcds start all fixed everything.  Interestingly, when I tried rtcds start iop, I got the error
Cannot start/stop model 'iop' on host c1ioo, so I just tried rtcds start all, and that worked fine....started with c1x03, then c1ioo, then c1gcv.

There seems to be a problem with how models are coming up on boot since the upgrade.  I think the IOP isn't coming up correctly for some reason, which is then preventing the rest of the models from starting since they depend on the IOP.

The simple way to fix this is to run the following:

ssh c1ioo
rtcds restart all

The "restart" command does the smart thing, by stopping all the models in the correct order (IOP last) and then restarting them also in the correct order (IOP first).

  6883   Wed Jun 27 15:10:34 2012 JamieUpdateComputer Scripts / Programs40m summary webpages move

I have moved the summary pages stuff that Duncan set up to a new directory that it accessible to the nodus web server and is therefore available from the outside world:

/users/public_html/40-summary

which is available at:

https://nodus.ligo.caltech.edu:30889/40m-summary/

I updated the scripts, configurations, and crontab appropriately:

/users/public_html/40m-summary/bin/c1_summary_page.sh
/users/public_html/40m-summary/share/c1_summary_page.ini

 

  6897   Fri Jun 29 22:56:35 2012 JamieUpdateVACinput telescope beam clipping on Faraday

[Yuta, Koji, Jamie]

We went into ITMX chamber to inspect the situation there.  We looked for clipping and flipping at PR2, and found none, although we noticed that the beam at PR2 looks a little clipped.

We then went back into the BS chamber and took a closer look at the beam incident on PRM, and the situation with PR3.  The PRM incident beam looked a little clipped, which we expected from the PR2 observation.  But the beam looks well centered on PRM and PR3.  As best I could tell the beam is reflecting off the front surface of PR3, as expected.

Looking at the beam around MMT1 and PJ2 (the second PZT on BS), we could tell that the beam incident and reflected off of MMT1 looked round, where as the beam incident on PJ2 looked clipped.  Using my tallness super power I was able to reach into the IMC chamber and confirm that the beam going from MMT1 to MMT2 clips fairly badly on the edge of the Faraday.   Koji speculates that this is the result of a misalignment of the PSL output beam into the MC.  In any event, it's not clear how this would be the cause of our PRC woes.

We decided to close up for the night, and let Yuta work on aligning PRMI.  We need to figure out what the heck to do now.

We've been watching the input power reduce, from 18 mW initially when we first went in, to about 5mW now.  It seems to be leveling out now.  It's unclear what would have been causing it.  Drift of the input polarization?

  6907   Tue Jul 3 17:56:35 2012 JamieUpdateGreen LockingLaseroptik dichroic optics received

We have received the dichroic optics from Laseroptik.  The coatings are:

HR:

  • 532nm: T(s+p) > 97%
  • 1064nm:  R(p) > 99.9%

AR:

  • 532nm: R(s+p) < 1%
  • 1064nm: R(p) < 2%

We got two sets with these coatings:

  • 6x: 50 x 9.5mm, 2 degree wedge
  • 8x: 25 x 6.35mm, 2 degree wedge
  • 1x: 25 x 3mm, witness
  6909   Tue Jul 3 19:04:59 2012 JamieUpdateGreen LockingLaseroptik dichroic optics received

I put them in the "visible optics" drawer of the newish, metal optics cabinet with the thin drawers down the Y arm.

  6911   Wed Jul 4 17:33:04 2012 JamieUpdateCDStiming, possibly leap second, brought down CDS

I got a call from Koji and Yuta that something was wrong with the CDS system.  I somehow had an immediate suspicion that it had something to do with the recent leap second.

It took a while for nodus to respond, and once he finally let me in I found a bunch of the following in his dmesg, repeated and filling the buffer:

Jul  3 22:41:34 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) 0.998366 s
Jul  3 22:46:20 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) -1.000847 s

Looking at date on all the front end systems, including fb, I could tell that they all looked a second fast, which is what you would expect if they had missed the leap second.  Everything syncs against nodus, so given nodus's problems above, that might explain everything.

I stopped daqd and nds on fb, and unloaded the mx drivers, which seemed to be showing problems.  I also stopped nodus's xntp:

  sudo /etc/init.d/xntpd stop

His ntp config file is in /etc/inet/ntp.conf, which is definitely the WRONG PLACE, given that the ntp server is not, as far as I can tell, being controlled by inetd.  (nodus is WAY out of date and desperately needs an overhaul.  it's nearly impossible to figure out what the hell is going on in there).  I found an old elog of Rana's that mentioned updating his config to point him to the caltech NTP server, which is now listed in the config, so I tried manually resyncing against that:

  sudo ntpdate -s -b -u 131.215.239.14

Unfortunately that didn't seem to have any effect.  This was making me wonder if the caltech server is off?  Anyway, I tried resyncing against the global NTP pool:

  sudo ntpdate -s -b -u pool.ntp.org

This seemed to work: the clock came back in sync with others that are known good.  Once nodus time was good I reloaded the mx drivers on fb and restarted daqd and nds.  They seemed come up fine.  At this point front ends started coming back on their own.  I went and restarted all the models on the machines that didn't (c1iscey and c1ioo).  Currently everything is looking ok.

I'm worried that there is still a problem with one of the NTP servers that nodus is sync'ing against, and that the problem might come back.  I'll check in again later tonight.

  6917   Thu Jul 5 10:49:38 2012 JamieUpdateCDSfront-end/fb communication lost, likely again due to timing offsets

All the front-ends are showing 0x4000 status and have lost communication with the frame builder.  It looks like the timing skew is back again.  The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds!  I'm looking into it now.

  6919   Thu Jul 5 12:06:35 2012 JamieUpdateComputersNDS2 client now working on Ubuntu machines

What I did

NDS2 was not working on any of the machines, so the first thing I did was simply to install the newest version. I downloaded the latest tarball (0.9.1) from the LDAS Wiki, unzipped and installed it

/users/zach $ tar -xvf nds2-client-0.9.1.tar

/users/zach $ cd nds2-client-0.9.1

/users/zach $ sudo ./configure --prefix=/cvs/cds/caltech/apps/linux64 --with-matlab=/cvs/cds/caltech/apps/linux64/matlab/bin/matlab

/users/zach $ sudo make

/users/zach $ sudo make install

No no, this is totally unnecessary.  NDS2 was already installed on every machine from the official packaged releases (apt-get install nds2-client), and it's known to work fine. We use it with pynds all the time. If the matlab component is not working we should figure out the right way to fix it with the existing packages.

In general, please only manually install software as a very last resort.  Manually installed software doesn't get maintained, where as the officially packaged stuff is being actively maintained by the collaboration. If there is a problem with the distributed packaging we should report it and get it fixed (and hint I was the one who built the original Debian packaging for nds2, so I know how to fix all the issues).  I'm trying to bring the 40m out of the dark days of complete chaos, where random software was installed in random locations.

Even with the new version, it still didn't work. 

That's because this wasn't the problem!

Solution: The main problem was that the cyrus-sasl-gssapi authentication protocol was not installed on these machines, so that even with a kerberos ticket the datalink could not be established. Using information from the LDAS Wiki, I used aptitude to install it as:

$ sudo aptitude install lscsoft-auth

This group installs both the SASL protocol and the package python-kerberos

I also needed to update the kerberos config file for each machine, which is located at /etc/krb5.conf. I found that ottavia had a nice one with many realms, so I copied that one over to the other machines. In any case where there was an old config file overwritten, it is now /etc/krb5.conf.old.

Finally, the matlab path for NDS2 was still set to the old 2010a directory (/cvs/cds/caltech/apps/linux64/lib/matlab2010a) that was created by the NDS2 install when Rana originally did it. The new install I made above created the appropriate 2010b mexa64 files, so I changed the matlab path within matlab to this one:

>> rmpath /cvs/cds/caltech/apps/linux64/lib/matlab2010a

>> addpath /cvs/cds/caltech/apps/linux64/lib/matlab2010b

>> savepath

This sounds like it's more likely the issues. You did the right thing by going to apt to fix the authentication packages.  It's curious to me that you did that here, whereas you went totally out of band for the nds2 client stuff.  Why?

The matlab mex files are the other problem.  But there is also a nds2-client-matlab Debian/Ubuntu package for that as well.  The problem is that the package just distributes the source, and it needs to be compiled.  I'll help figure out a good way to do that.

  • I would like to extend this to the 32-bit machines, but I have to figure out the best way to install the proper NDS2 client without interfering with the 64-bit version. I think it is just a matter of specifying the matlabroot in the .../linux/ instead of .../linux64/

Again, this is handled by the packaging!  Just use apt and the right architecture is installed automatically.

But what 32 bit machines are you referring to?  I think basically everything is 64 bit nowadays.

  • It would be nice to find a way that the nice tool gps('MM/DD/YYYY XX:XX:XX UTC'), which calls the ligotool executable tconvert, can be automatically usable when calling NDS2 functions. Right now, there seems to be an issue preventing that: even though tconvert can be run in the terminal, gps() returns an error and even directly running unix('tconvert now') or !tconvert returns the same error. I have emailed Peter Shawhan to see if he has any advice. 

We are now using lalapps_tconvert for tconvert.  We're not using that ligotools crap anymore.  I've aliased that to tconvert on the command line, but maybe matlab isn't getting the message.  I'll try to think of a more robust solution (e.g. make a wrapper script).

  6920   Thu Jul 5 12:27:05 2012 JamieUpdateCDSfront-end/fb communication lost, likely again due to timing offsets

Quote:

All the front-ends are showing 0x4000 status and have lost communication with the frame builder.  It looks like the timing skew is back again.  The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds!  I'm looking into it now.

This was indeed another leap second timing issue.  I'm guessing nodus resync'd from whatever server is posting the wrong time, and it brought everything out of sync again.  It really looks like the caltech server is off.  When I manually sync form there the time is off by a second, and then when I manually sync from the global pool it is correct.

I went ahead and updated nodus's config (/etc/inet/ntp.conf) to point to the global pool (pool.ntp.org).  I then restarted the ntp daemon:

  nodus$ sudo /etc/init.d/xntpd stop
  nodus$ sudo /etc/init.d/xntpd start

That brought nodus's time in sync.

At that point all I had to do was resync the time on fb:

  fb$ sudo /etc/init.d/ntp-client restart

When I did that daqd died, but it immediately restarted and everything was in sync.

  6951   Tue Jul 10 10:50:02 2012 JamieConfigurationGeneralSeismometer repositioning

Quote:

Today I REPOSITIONED THE SEISMOMETERS in order to triangulate noise sources (as Rana suggested).

I re-levelled all of them, locked them, and turned them on. They should be located out of sight, but just in case:

GUR 1 IS DOWN THE X-ARM, behind the interferometer.

GUR 2 IS BETWEEN THE TWO ARMS, BEHIND THE CABLE TRAP THAT RUNS PARALLEL TO THE X-ARM.

STS 1 IS DOWN THE Y-ARM behind the interferometer.

I'll wait a day for them to stabilize (continuing to reset STS-1 every hour or so) and then begin taking data tomorrow morning, depending on the condition of the signal.

Ideally, I'd like a few days' worth of data, so I'll update when I've changed the configuration back to the way it was prior.  

 Highlighting good, ALL CAPS LESS SO!

  6997   Fri Jul 20 17:11:50 2012 JamieUpdateCDSAll custom MEDM screens moved to cds_users_apps svn repo

Since there are various ongoing requests for this from the sites, I have moved all of our custom MEDM screens into the cds_user_apps SVN repository.  This is what I did:

For each system in /opt/rtcds/caltech/c1/medm, I copied their "master" directory into the repo, and then linked it back in to the usual place, e.g.:

a=/opt/rtcds/caltech/c1/medm/${model}/master
b=/opt/rtcds/userapps/trunk/${system}/c1/medm/${model}
mv $a $b
ln -s $b $a

Before committing to the repo, I did a little bit of cleanup, to remove some binary files and other known superfluous stuff.  But I left most things there, since I don't know what is relevant or not.

Then committed everything to the repo.

 

  7005   Mon Jul 23 18:16:13 2012 JamieUpdateSUSDQ block added to sus_single_control library parts

I have added a DQ block to the sus_single_control library part.  This means that all sus models will automatically generate DQ channels based on what is specified in this doc block:

#DAQ Channels

SUSPOS_IN1
SUSPIT_IN1
SUSYAW_IN1
SUSSIDE_IN1
ULSEN_OUT
URSEN_OUT
LRSEN_OUT
LLSEN_OUT
SDSEN_OUT
OLPIT_IN1
OLYAW_IN1
OLSUM_IN1

So for instance, for BS will have the following DQ channels:

C1:SUS-BS_SUSPOS_IN1_DQ
C1:SUS-BS_SUSPIT_IN1_DQ
...

etc.  The channels names modified by the activateDQ.py script after install are still modified appropriately.

This is now the place where we should be maintaining DQ channels.

  7007   Mon Jul 23 18:41:15 2012 JamieUpdateGreen LockingALS_END.mdl model added for end station green ALS channels

The end sus models (c1scx and c1scy) both contain some ALS stuff.  This stuff could maybe be moved to their own models, but whatever.

The stuff at X and Y were identical, but were code copies (BAD!).  I made a new library part for the ALS end controls: ${userapps}/isc/c1/model/ALS_END.mdl

It contains just some filter modules for the ALS end laser control, and a monitor of the ALS end REFL PD DC.  I also added a DQ block for the recorded channels (see screen shot).

When I added this new part to c1scx and c1scy I made it so the channel names would be more sensible.  Instead of "GCX" and "GCY", they are now "ALS-X" and "ALS-Y".  They will now all show up under the ALS subsystem.

 

 

Attachment 1: alsend.png
alsend.png
  7008   Mon Jul 23 18:57:52 2012 JamieUpdateCDSc1scx and c1scy models recompiled and restarted

After the changes listed in 7005 and 7007, I have rebuilt, installed, and restarted the c1scx and c1scy models.  Everything seems to have come back up ok.

Running into some daqd troubles because of a change to c1ioo, but will report on the new ALS channels when I can.

  7011   Mon Jul 23 19:50:43 2012 JamieUpdateCDSc1gcv model renamed to c1als

I decided to rename the c1gcv model to be c1als.  This is in an ongoing effort to rename all the ALS stuff as ALS, and get rid of the various GC{V,X,Y} named stuff.

Most of what was in the c1gcv model was already in a subsystem with and ALS top names, but there were a couple of channels that were outside of that that had funky names, namely the "GCV_GREEN" channels.  This fixes that, and make things more consistent and simple.

Of course this required a bunch of other little changes:

  • rename model in userapps svn
  • target/fb/master had to be modified to point to the new chans/daq/C1ALS.ini channel file and gds/param/tpchn_c1als.par testpoint file
  • rename RFM channels appropriately, and fix in receiver models (c1scx, c1scy, c1mcs)
  • move custom medm screens in userapps svn (isc/c1/medm/c1als), and link to it at medm/c1als/master
  • moved old medm/c1gcv directory into a subdirectory of medm/c1als
  • update all medm screens that point to c1gcv stuff (mostly just ALS screens)

The above has been done.  Still todo:

  • FIX SCRIPTS!  There are almost certainly scripts that point to GC{V,X,Y} channels.  Those will have to be fixed as we come across them.
  • Fix the c1sc{x,y}/master/C1SC{X,Y}_GC{X,Y}_SLOW.adl screens.  I need to figure out a more consistent place for those screens.
  • Fix the C1ALS_COMPACT screen
  • ???

 

  7013   Mon Jul 23 20:34:38 2012 JamieOmnistructureComputersCHECK IN YOUR CHANGES TO THE SVN

I'm seeing LOTS OF STUFF NOT CHECKED INTO THE SVN!!!  both modified things that haven't been updated, and things that looked like they haven't been checked in at all.

Please check in your stuff to the SVN!  We need the record!

Look through EVERYTHING that you think you might have touched, or even care about, and make sure it's checked in.

  7043   Fri Jul 27 14:27:14 2012 JamieUpdateCDSnew c1tst model for testing RCG code

Quote:

 

 I wanted to test biquad form in this model. I added biquad=1 flag to cdsParameters, compiled, installed and restarted it. After that c1iscey suspended.

The same thing as we had several month ago

controls@c1iscey /opt/rtcds/caltech/c1/target/c1tst/c1tstepics 0$ cat iocC1.log

Starting iocInit
iocRun: All initialization complete
sh: iniChk.pl: command not found
Failed to load DAQ configuration file

I have fixed the iniChk.pl issue (which actually fixed a separate model startup-on-boot issue that we had been having).  However, that is completely unrelated to the system freeze.  I'll discuss that in a separate post.

  7045   Fri Jul 27 14:30:49 2012 JamieUpdatedigital noisebiquad key is working

What is "DQF"?  Is that the biquad?  And what is the difference between DF1 and DF2?  Why don't you just write out the name, so it's more clear.

  7046   Fri Jul 27 16:32:17 2012 JamieUpdateCDSRCG bug exposed by simple c1tst model

As Den mentioned in 7043, attempting to run the c1tst model was causing the entire c1iscey machine to crash.  Alex came over this morning and we spend a couple of hours trying to debug what was going on.

c1tst is the simplest possible model you can have: 1 ADC and 2 filter modules.  It compiles just fine, but when you tried to load it the machine would completely freeze.

We eventually tracked this down to a non-empty filter file for one of the filter modules.  It turns out that the model was crashing when it attempted to load the filter file.  Once we completely deleted all the filters in the module, the model would run.  But then if you added back a filter to the filter file and tried to "load coefficients", the model/machine would immediately crash again.

So it has something to do with the loading of the filter coefficients from the filter file.  We tried different filters and it didn't seem to make a difference.  Alex thought it might have something to do with zeros in some of the second-order sections, but that wasn't it either.

There's speculation that it might be related to a very similar bug that Joe reported at LLO a month ago: https://bugzilla.ligo-wa.caltech.edu/bugzilla/show_bug.cgi?id=398

Things we tried, none of which worked:

  • adding a DAC
  • turning on/off biquad
  • disabling the float denormalization fix

This is a real mystery.  Alex and I are continuing to investigate.

  7049   Mon Jul 30 12:38:45 2012 JamieUpdateCDSMove to RCG 2.5 tag release

I moved the RCG to the advLigoRTS-2.5 tag:

controls@c1iscey ~ 0$ ls -al /opt/rtcds/rtscore/release
lrwxrwxrwx 1 controls 1001 19 Jul 30 12:02 /opt/rtcds/rtscore/release -> tags/advLigoRTS-2.5
controls@c1iscey ~ 0$ 

There are only very minor differences between what we were running on the the 2.5 branch.  I have NOT rebuilt all the models yet.

I was hoping that there was something in the tagged release that might fix this hard-crash-on-filter-load issue that we're seeing in the c1tst model.  It didn't.  Still have no idea what's going on there.

  7057   Tue Jul 31 15:17:58 2012 JamieUpdateCDSc1ifo medm screens checked into CSD userapps svn

I moved the medm/c1ifo directory into the CDS userapps svn at cds/c1/medm/c1ifo, and then linked it back into the medm directory:

controls@rossa:~ 0$ ls -al /opt/rtcds/caltech/c1/medm/c1ifo
lrwxrwxrwx 1 controls controls 56 2012-07-31 11:53 /opt/rtcds/caltech/c1/medm/c1ifo -> /opt/rtcds/caltech/c1/userapps/release/cds/c1/medm/c1ifo
controls@rossa:~ 0$

I then committed whatever useful was in there.  We need to remember to commit when we make changes.

  7060   Tue Jul 31 18:59:59 2012 JamieUpdateCDSupdated medm paths for MISC (IFO,CDS,VIDEO), IFO alignment scripts updated accordingly

In an attempt to clean up the medm situation, I did a bunch of further rearrangement and cleanup.  Instead of having c1ifo, I moved a bunch of stuff into a MISC directory, including all of the CDS, IFO, and VIDEO screens:

controls@rossa:~ 0$ ls -1 /opt/rtcds/caltech/c1/medm/MISC | grep -v _BAK
CDS_BIO_STATUS.adl
CDS_FE_STATUS.adl
CDS_IPC_ERR.adl
help
ifoalign
IFO_ALIGN.adl
ifoalign.orig
IFO_CONFIGURE.adl
IFO_CONFIGURE.txt
IFO_OVERVIEW.adl
IFO_OVERVIEW_AIDAN.adl
IFO_STATE.adl
snap
VIDEO.adl
controls@rossa:~ 0$

I updated the sitemap and the relevant screens accordingly.

I also updated the IFO_ALIGN script infrastructure a bit.  The new IFO ALIGN location is /opt/rtcds/caltech/c1/medm/MISC/ifoalign.  The scripts are now called simply:

/opt/rtcds/caltech/c1/medm/MISC/ifoalign/misalign_soft.csh
/opt/rtcds/caltech/c1/medm/MISC/ifoalign/restore_soft.csh
/opt/rtcds/caltech/c1/medm/MISC/ifoalign/save_soft.csh

These run Jenne's new soft restore stuff, and store burt snapshots for optic alignment in /opt/rtcds/caltech/c1/medm/MISC/ifoalign/burt.

This has all been checked into the svn.

  7062   Tue Jul 31 20:59:35 2012 JamieUpdateCDSfixing up MEDM snapshots

I've started to try to fix all the old non-working medm snapshots.  I've made a new snapshot directory, and some new snapshot scripts to handle taking the snapshots, and view old ones.

Snapshots are now in:  /opt/rtcds/caltech/c1/medm/snap

There is one main script: /opt/rtcds/caltech/c1/medm/snap/snapcommands.  This is linked by:

  • updatesnap: update a snapshot
  • viewsnap: view most recent snapshot
  • viewold: view old snapshots

These commands take either the path to the screen in question, or it's relative path to the medm directory (/opt/rtcds/caltech/c1/medm).  The snapshots for a specific screen are stored in a directory specific to the screen, in a place relative to the snap directory that mimics the screens relative path to the overall medm directory.  So for instance, the snap directory for: 

/opt/rtcds/caltech/c1/medm/c1lsc/master/C1LSC_OVERVIEW.adl

is:

controls@rossa:~ 0$ find /opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/2012-08-01-02:21:34-UTC.png
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/2012-08-01-02:21:04-UTC.png
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/2012-08-01-02:21:23-UTC.png
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/2012-08-01-03:41:34-UTC.png
/opt/rtcds/caltech/c1/medm/snap/c1lsc/master/C1LSC_OVERVIEW/current.png
controls@rossa:~ 0$ 

The "current.png" is a link to the most recent snapshot, and is what you get when you "viewsnap".

Below is an example medm snippet of what could be used in screens to enable this functionality.  I've fixed up a couple of the screens, but there are a lot more that need to be updated.

"shell command" {
    object {
        x=666
        y=597
        width=40
        height=40
    }
    command[0] {
        label="Settings and Procedures"
        name="emacs"
        args="./help/IFO_ALIGN.txt &"
    }
    command[1] {
        label="View Snapshot"
        name="/opt/rtcds/caltech/c1/medm/snap/viewsnap"
        args="MISC/IFO_ALIGN.adl &"
    }
    command[2] {
        label="View old snapshots"
        name="/opt/rtcds/caltech/c1/medm/snap/viewold"
        args="MISC/IFO_ALIGN.adl &"
    }
    command[3] {
        label="Update Snapshot"
        name="/opt/rtcds/caltech/c1/medm/snap/updatesnap"
        args="MISC/IFO_ALIGN.adl &"
    }
    clr=14
    bclr=30
}
  7067   Wed Aug 1 11:50:49 2012 JamieUpdateCDSadded input monitors to LSC_TRIGGER library part

I added an EPICS monitor to the input of the LSC_TRIGGER part, to allow monitoring the signal used for the trigger.  I then added the monitors to the C1LSC_TRIG_MTRX screen (see below).  This should hopefully aid in setting the trigger levels.

Attachment 1: trigmtrx.png
trigmtrx.png
  7073   Wed Aug 1 18:20:58 2012 JamieUpdateLSCYarm recovered

[Jenne, Jamie]

We recovered lock and alignment of the Y arm.  TRY_OUT is now at ~0.9, after tweaking {I,E}TMY pit/yaw and PZT2.  YARM_GAIN is 0.1.

I saved ITMY, ETMY, and PZT2 alignments in the IFO_ALIGN screen with the new alignment save/restore stuff I got working.

Working on getting Yarm ASS working now...

  7079   Thu Aug 2 21:40:55 2012 JamieUpdateSimulationsETMX simplant model revived

I revived the ETMX simplant model, c1spx.  It's running on cpu4 on c1iscex, and interfaces with C1scx via SHMEM.

The channel names for the simplant suspensions will be SUP, so the channel from this model will C1:SUP-ETMX_.

Next I'll try to get the ITMX and LSC ("LSP") simplant models running so we can run a "full" cavity simulation.

Sasha has been working on LSP, so we should be ready to do something with that soon.  In the mean time she's going to fix up the SPX MEDM screens, since some channel names have been changed since it was last run.

  7093   Mon Aug 6 19:37:50 2012 JamieUpdateCDSdaqd and CDS network problems today

For some reason this afternoon we've been experiencing a lot of problems with the framebuilder, and with the CDS network in general.  The framebuilder has been very unresponsive, although the daqd logs seem to indicate that things are ok.  All models will loose contact with fb for very long stretches.  Attempts to kill/restart daqd don't seem to fix the problem.

These problems seem to be associated with the general CDS network issues as well.  The network seems to become very slow, and the workstations all become very slow.  The later I assume is because of the network and that so much of the work we do is on network mounted filesystems (/opt/rtcds, /ligo, etc.).

My current speculation is that daqd on fb is doing something stupid, like trying to read or write a bunch of stuff from /frames, which is also network mounted, and that clogs up the entire network.  Some serious network debugging is going to be needed to figure out what's going on, though.

I'm afraid daqd is caught in some bad state now, though.  It's not responding to anything, and every attempt to kill it seems to bring it back into the bad state.  Hopefully I can get Alex to help me figure out what's going on tomorrow.   Maybe it will clear up on it's own tonight...

  7094   Mon Aug 6 19:54:53 2012 JamieUpdateCDSdaqd and CDS network problems today

It looks like daqd is indeed caught in some bad state.  It seems to die at some point after making GPS corrections to minute trender:

...
[Mon Aug  6 19:45:13 2012] Minute trender made GPS time correction; gps=1028342727; gps%60=27
tail: `fb/logs/daqd.log' has been replaced;  following end of new file
263596
MX endpoint opened
startup file interpreter thread tid=140334118615312
calling yyparse(5, 6)
[Mon Aug  6 19:50:08 2012] ->5: #set avoid_reconnect
[Mon Aug  6 19:50:08 2012] ->5: set thread_stack_size=102400
[Mon Aug  6 19:50:08 2012] new threads will be created with the stack of size 102400K
[Mon Aug  6 19:50:08 2012] ->5: set allow_tpman_connect_fail
[Mon Aug  6 19:50:08 2012] ->5: #set dcu_status_check=5
[Mon Aug  6 19:50:08 2012] ->5: #set symm_gps_offset=-1
[Mon Aug  6 19:50:08 2012] ->5: #set symm_gps_offset=31535998
[Mon Aug  6 19:50:08 2012] ->5: ##set symm_gps_offset=347155213
[Mon Aug  6 19:50:08 2012] ->5: #set symm_gps_offset=378691215
[Mon Aug  6 19:50:08 2012] ->5: #set symm_gps_offset=378691212
[Mon Aug  6 19:50:08 2012] ->5: #set symm_gps_offset=315964799
[Mon Aug  6 19:50:08 2012] ->5: set symm_gps_offset=315964801
[Mon Aug  6 19:50:08 2012] ->5: set debug=0
[Mon Aug  6 19:50:08 2012] ->5: set log=2
[Mon Aug  6 19:50:08 2012] ->5: set zero_bad_data=0
[Mon Aug  6 19:50:08 2012] ->5: set dcu_status_check=9
[Mon Aug  6 19:50:08 2012] ->5: set controller_dcu=33
[Mon Aug  6 19:50:08 2012] ->5: set master_config="/opt/rtcds/caltech/c1/target/fb/master"
[Mon Aug  6 19:50:10 2012] finished configuring data channels
[Mon Aug  6 19:50:10 2012] ->5: configure channels begin end
Unable to find GDS node 90 system c1x00 in INI files
Unable to find GDS node 92 system c1tst2 in INI files
Unable to find GDS node 95 system c1x10 in INI files
[Mon Aug  6 19:50:10 2012] ->5: tpconfig "/opt/rtcds/caltech/c1/target/gds/param/testpoint.par"
[Mon Aug  6 19:50:10 2012] ->5: set gps_leaps = 820108813
[Mon Aug  6 19:50:10 2012] ->5: set detector_name="CIT"
[Mon Aug  6 19:50:10 2012] ->5: set detector_prefix="C1"
[Mon Aug  6 19:50:10 2012] ->5: set detector_longitude=-90.7742403889
[Mon Aug  6 19:50:10 2012] ->5: set detector_latitude=30.5628943337
[Mon Aug  6 19:50:10 2012] ->5: set detector_elevation=.0
[Mon Aug  6 19:50:10 2012] ->5: set detector_azimuths=1.1,4.7123889804
[Mon Aug  6 19:50:10 2012] ->5: set detector_altitudes=1.0,2.0
[Mon Aug  6 19:50:10 2012] ->5: set detector_midpoints=2000.0, 2000.0
[Mon Aug  6 19:50:10 2012] ->5: set num_dirs = 10
[Mon Aug  6 19:50:10 2012] ->5: set frames_per_dir=225
[Mon Aug  6 19:50:10 2012] ->5: set full_frames_per_file=1
[Mon Aug  6 19:50:10 2012] ->5: set full_frames_blocks_per_frame=16
[Mon Aug  6 19:50:10 2012] ->5: set frame_dir="/frames/full", "C-R-", ".gwf"
[Mon Aug  6 19:50:10 2012] ->5: set trend_num_dirs=10
[Mon Aug  6 19:50:10 2012] ->5: set trend_frames_per_dir=1440
[Mon Aug  6 19:50:10 2012] ->5: set trend_frame_dir= "/frames/trend/second", "C-T-", ".gwf"
[Mon Aug  6 19:50:10 2012] ->5: set raw-minute-trend-dir="/frames/trend/minute_raw"
[Mon Aug  6 19:50:10 2012] ->5: set nds-jobs-dir="/opt/rtcds/caltech/c1/target/fb"
[Mon Aug  6 19:50:10 2012] ->5: set minute-trend-num-dirs=10
[Mon Aug  6 19:50:10 2012] ->5: set minute-trend-frames-per-dir=24
[Mon Aug  6 19:50:10 2012] ->5: set minute-trend-frame-dir="/frames/trend/minute", "C-M-", ".gwf"
[Mon Aug  6 19:50:10 2012] ->5: start main 10
[Mon Aug  6 19:50:12 2012] main started
[Mon Aug  6 19:50:12 2012] ->5: start profiler
[Mon Aug  6 19:50:12 2012] ->5: # comment out this block to stop saving data
[Mon Aug  6 19:50:12 2012] frame saver started
[Mon Aug  6 19:50:12 2012] ->5: start frame-saver
[Mon Aug  6 19:50:13 2012] ->5: sync frame-saver
[Mon Aug  6 19:50:13 2012] ->5: start trender
[Mon Aug  6 19:50:13 2012] trender started
[Mon Aug  6 19:50:13 2012] trend frame saver started
[Mon Aug  6 19:50:13 2012] ->5: start trend-frame-saver
[Mon Aug  6 19:50:14 2012] ->5: sync trend-frame-saver
[Mon Aug  6 19:50:14 2012] minute trend frame saver started
[Mon Aug  6 19:50:14 2012] ->5: start minute-trend-frame-saver
[Mon Aug  6 19:50:14 2012] Done creating ADC structures
[Mon Aug  6 19:50:15 2012] ->5: sync minute-trend-frame-saver
[Mon Aug  6 19:50:15 2012] raw minute trend frame saver started
[Mon Aug  6 19:50:15 2012] ->5: start raw_minute_trend_saver
[Mon Aug  6 19:50:15 2012] ->5: #frame-writer "225.225.225.1" broadcast="131.215.113.0" all
[Mon Aug  6 19:50:15 2012] ->5: #sleep 5
[Mon Aug  6 19:50:15 2012] producer started
[Mon Aug  6 19:50:15 2012] ->5: start producer
[Mon Aug  6 19:50:15 2012] ->5: start epics dcu
[Mon Aug  6 19:50:15 2012] MX receiver thread started
[Mon Aug  6 19:50:15 2012] edcu started
[Mon Aug  6 19:50:15 2012] ->5: start epics server "C0:DAQ-DC0_" "C1:DAQ-DC0_"
[Mon Aug  6 19:50:15 2012] epics server started
[Mon Aug  6 19:50:15 2012] ->5: start listener 8087
[Mon Aug  6 19:50:15 2012] ->5: start listener 8088 1
[Mon Aug  6 19:50:15 2012] ->5: sleep 60
[Mon Aug  6 19:50:15 2012] Epics server started
[Mon Aug  6 19:50:15 2012] EDCU has 2553 channels configured; first=0

[Mon Aug  6 19:50:18 2012] Minute trender made GPS time correction; gps=1028343032; gps%60=32
...

The "tail:..." line indicates that the log was moved and replaced, which indicates a daqd restart.  As far as I know this was not manually triggered.

After the restart the same thing happens again.  About once every five minutes.

  7095   Mon Aug 6 20:08:45 2012 JamieUpdateCDSdaqd and CDS network problems today

When daqd is caught in this state it can not be killed.  It's in "uninterruptable sleep" ('D' state in the top output below).  This usually indicates that it's waiting for the kernel, usually due to some missing or hung IO.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                      
28038 controls  20   0 4430m 2.0g  19m D    0 27.1   0:15.00 daqd                                                                                         

The memory footprint also seems to be getting big.  It's clearly trying to do something stupid that it can't handle.

  7096   Mon Aug 6 20:22:50 2012 JamieUpdateCDSdaqd segfaulting after five minutes

I tried running daqd manually, and sure enough it segfaults after about five minutes (see log below).  I've uncommented it from /etc/inittab on fb and I'm leaving it off for now until we can figure out what's going on.

controls@fb /opt/rtcds/caltech/c1/target/fb 0$ /opt/rtcds/caltech/c1/target/fb/daqd -c /opt/rtcds/caltech/c1/target/fb/daqdrc
263596
MX endpoint opened
startup file interpreter thread tid=139790943115536
calling yyparse(5, 6)
[Mon Aug  6 20:15:27 2012] ->5: #set avoid_reconnect
[Mon Aug  6 20:15:27 2012] ->5: set thread_stack_size=102400
[Mon Aug  6 20:15:27 2012] new threads will be created with the stack of size 102400K
[Mon Aug  6 20:15:27 2012] ->5: set allow_tpman_connect_fail
[Mon Aug  6 20:15:27 2012] ->5: #set dcu_status_check=5
[Mon Aug  6 20:15:27 2012] ->5: #set symm_gps_offset=-1
[Mon Aug  6 20:15:27 2012] ->5: #set symm_gps_offset=31535998
[Mon Aug  6 20:15:27 2012] ->5: ##set symm_gps_offset=347155213
[Mon Aug  6 20:15:27 2012] ->5: #set symm_gps_offset=378691215
[Mon Aug  6 20:15:27 2012] ->5: #set symm_gps_offset=378691212
[Mon Aug  6 20:15:27 2012] ->5: #set symm_gps_offset=315964799
[Mon Aug  6 20:15:27 2012] ->5: set symm_gps_offset=315964801
[Mon Aug  6 20:15:27 2012] ->5: set debug=0
[Mon Aug  6 20:15:27 2012] ->5: set log=2
[Mon Aug  6 20:15:27 2012] ->5: set zero_bad_data=0
[Mon Aug  6 20:15:27 2012] ->5: set dcu_status_check=9
[Mon Aug  6 20:15:27 2012] ->5: set controller_dcu=33
[Mon Aug  6 20:15:27 2012] ->5: set master_config="/opt/rtcds/caltech/c1/target/fb/master"
[Mon Aug  6 20:15:30 2012] finished configuring data channels
[Mon Aug  6 20:15:30 2012] ->5: configure channels begin end
GDS server NODE=19 HOST=c1iscex DCUID=19
GDS server NODE=20 HOST=c1sus DCUID=20
GDS server NODE=21 HOST=c1sus DCUID=21
GDS server NODE=22 HOST=c1lsc DCUID=22
GDS server NODE=25 HOST=c1iscex DCUID=61
GDS server NODE=28 HOST=c1ioo DCUID=28
GDS server NODE=33 HOST=c1ioo DCUID=33
GDS server NODE=34 HOST=c1ioo DCUID=34
GDS server NODE=36 HOST=c1sus DCUID=36
GDS server NODE=38 HOST=c1sus DCUID=38
GDS server NODE=39 HOST=c1sus DCUID=39
GDS server NODE=40 HOST=c1lsc DCUID=40
GDS server NODE=42 HOST=c1lsc DCUID=42
GDS server NODE=45 HOST=c1iscex DCUID=45
GDS server NODE=46 HOST=c1iscey DCUID=46
GDS server NODE=47 HOST=c1iscey DCUID=47
GDS server NODE=48 HOST=c1lsc DCUID=48
GDS server NODE=50 HOST=c1lsc DCUID=50
GDS server NODE=51 HOST=c1ioo DCUID=51
GDS server NODE=60 HOST=c1lsc DCUID=60
GDS server NODE=61 HOST=c1iscex DCUID=61
GDS server NODE=62 HOST=c1sus DCUID=62
Unable to find GDS node 90 system c1x00 in INI files
GDS server NODE=91 HOST=c1lsc DCUID=60
Unable to find GDS node 92 system c1tst2 in INI files
Unable to find GDS node 95 system c1x10 in INI files
TP: node = 19, host = c1iscex, dup = 0, prog = 0x31002013, vers = 1
Initialized TP interface node=19, host=c1iscex
TP: node = 20, host = c1sus, dup = 0, prog = 0x31002014, vers = 1
Initialized TP interface node=20, host=c1sus
TP: node = 21, host = c1sus, dup = 0, prog = 0x31002015, vers = 1
Initialized TP interface node=21, host=c1sus
TP: node = 22, host = c1lsc, dup = 0, prog = 0x31002016, vers = 1
Initialized TP interface node=22, host=c1lsc
TP: node = 25, host = c1iscex, dup = 0, prog = 0x31002019, vers = 1
Initialized TP interface node=25, host=c1iscex
TP: node = 28, host = c1ioo, dup = 0, prog = 0x3100201c, vers = 1
Initialized TP interface node=28, host=c1ioo
TP: node = 33, host = c1ioo, dup = 0, prog = 0x31002021, vers = 1
Initialized TP interface node=33, host=c1ioo
TP: node = 34, host = c1ioo, dup = 0, prog = 0x31002022, vers = 1
Initialized TP interface node=34, host=c1ioo
TP: node = 36, host = c1sus, dup = 0, prog = 0x31002024, vers = 1
Initialized TP interface node=36, host=c1sus
TP: node = 38, host = c1sus, dup = 0, prog = 0x31002026, vers = 1
Initialized TP interface node=38, host=c1sus
TP: node = 39, host = c1sus, dup = 0, prog = 0x31002027, vers = 1
Initialized TP interface node=39, host=c1sus
TP: node = 40, host = c1lsc, dup = 0, prog = 0x31002028, vers = 1
Initialized TP interface node=40, host=c1lsc
TP: node = 42, host = c1lsc, dup = 0, prog = 0x3100202a, vers = 1
Initialized TP interface node=42, host=c1lsc
TP: node = 45, host = c1iscex, dup = 0, prog = 0x3100202d, vers = 1
Initialized TP interface node=45, host=c1iscex
TP: node = 46, host = c1iscey, dup = 0, prog = 0x3100202e, vers = 1
Initialized TP interface node=46, host=c1iscey
TP: node = 47, host = c1iscey, dup = 0, prog = 0x3100202f, vers = 1
Initialized TP interface node=47, host=c1iscey
TP: node = 48, host = c1lsc, dup = 0, prog = 0x31002030, vers = 1
Initialized TP interface node=48, host=c1lsc
TP: node = 50, host = c1lsc, dup = 0, prog = 0x31002032, vers = 1
Initialized TP interface node=50, host=c1lsc
TP: node = 51, host = c1ioo, dup = 0, prog = 0x31002033, vers = 1
Initialized TP interface node=51, host=c1ioo
TP: node = 60, host = c1lsc, dup = 0, prog = 0x3100203c, vers = 1
Initialized TP interface node=60, host=c1lsc
TP: node = 61, host = c1iscex, dup = 0, prog = 0x3100203d, vers = 1
Initialized TP interface node=61, host=c1iscex
TP: node = 62, host = c1sus, dup = 0, prog = 0x3100203e, vers = 1
Initialized TP interface node=62, host=c1sus
TP: node = 91, host = c1lsc, dup = 0, prog = 0x3100205b, vers = 1
Initialized TP interface node=91, host=c1lsc
[Mon Aug  6 20:15:30 2012] ->5: tpconfig "/opt/rtcds/caltech/c1/target/gds/param/testpoint.par"
[Mon Aug  6 20:15:30 2012] ->5: set gps_leaps = 820108813
[Mon Aug  6 20:15:30 2012] ->5: set detector_name="CIT"
[Mon Aug  6 20:15:30 2012] ->5: set detector_prefix="C1"
[Mon Aug  6 20:15:30 2012] ->5: set detector_longitude=-90.7742403889
[Mon Aug  6 20:15:30 2012] ->5: set detector_latitude=30.5628943337
[Mon Aug  6 20:15:30 2012] ->5: set detector_elevation=.0
[Mon Aug  6 20:15:30 2012] ->5: set detector_azimuths=1.1,4.7123889804
[Mon Aug  6 20:15:30 2012] ->5: set detector_altitudes=1.0,2.0
[Mon Aug  6 20:15:30 2012] ->5: set detector_midpoints=2000.0, 2000.0
[Mon Aug  6 20:15:30 2012] ->5: set num_dirs = 10
[Mon Aug  6 20:15:30 2012] ->5: set frames_per_dir=225
[Mon Aug  6 20:15:30 2012] ->5: set full_frames_per_file=1
[Mon Aug  6 20:15:30 2012] ->5: set full_frames_blocks_per_frame=16
[Mon Aug  6 20:15:30 2012] ->5: set frame_dir="/frames/full", "C-R-", ".gwf"
[Mon Aug  6 20:15:30 2012] ->5: set trend_num_dirs=10
[Mon Aug  6 20:15:30 2012] ->5: set trend_frames_per_dir=1440
[Mon Aug  6 20:15:30 2012] ->5: set trend_frame_dir= "/frames/trend/second", "C-T-", ".gwf"
[Mon Aug  6 20:15:30 2012] ->5: set raw-minute-trend-dir="/frames/trend/minute_raw"
[Mon Aug  6 20:15:30 2012] ->5: set nds-jobs-dir="/opt/rtcds/caltech/c1/target/fb"
[Mon Aug  6 20:15:30 2012] ->5: set minute-trend-num-dirs=10
[Mon Aug  6 20:15:30 2012] ->5: set minute-trend-frames-per-dir=24
[Mon Aug  6 20:15:30 2012] ->5: set minute-trend-frame-dir="/frames/trend/minute", "C-M-", ".gwf"
[Mon Aug  6 20:15:30 2012] ->5: start main 10
Allocated move buffer size 11616356 bytes
[Mon Aug  6 20:15:32 2012] main started
[Mon Aug  6 20:15:32 2012] ->5: start profiler
[Mon Aug  6 20:15:32 2012] ->5: # comment out this block to stop saving data
[Mon Aug  6 20:15:32 2012] frame saver started
[Mon Aug  6 20:15:32 2012] ->5: start frame-saver
[Mon Aug  6 20:15:33 2012] ->5: sync frame-saver
[Mon Aug  6 20:15:33 2012] ->5: start trender
[Mon Aug  6 20:15:33 2012] trender started
[Mon Aug  6 20:15:33 2012] trend frame saver started
[Mon Aug  6 20:15:33 2012] ->5: start trend-frame-saver
[Mon Aug  6 20:15:34 2012] ->5: sync trend-frame-saver
[Mon Aug  6 20:15:34 2012] minute trend frame saver started
[Mon Aug  6 20:15:34 2012] ->5: start minute-trend-frame-saver
[Mon Aug  6 20:15:34 2012] Done creating ADC structures
[Mon Aug  6 20:15:35 2012] ->5: sync minute-trend-frame-saver
[Mon Aug  6 20:15:35 2012] raw minute trend frame saver started
[Mon Aug  6 20:15:35 2012] ->5: start raw_minute_trend_saver
[Mon Aug  6 20:15:35 2012] ->5: #frame-writer "225.225.225.1" broadcast="131.215.113.0" all
[Mon Aug  6 20:15:35 2012] ->5: #sleep 5
[Mon Aug  6 20:15:35 2012] producer started
[Mon Aug  6 20:15:35 2012] ->5: start producer
[Mon Aug  6 20:15:35 2012] ->5: start epics dcu
[Mon Aug  6 20:15:35 2012] MX receiver thread started
[Mon Aug  6 20:15:35 2012] edcu started
[Mon Aug  6 20:15:35 2012] ->5: start epics server "C0:DAQ-DC0_" "C1:DAQ-DC0_"
[Mon Aug  6 20:15:35 2012] epics server started
[Mon Aug  6 20:15:35 2012] ->5: start listener 8087
[Mon Aug  6 20:15:35 2012] ->5: start listener 8088 1
[Mon Aug  6 20:15:35 2012] ->5: sleep 60
Creating C1:DAQ-DC0_PEM_SLOW_STATUS
Creating C1:DAQ-DC0_PEM_SLOW_CRC_CPS
Creating C1:DAQ-DC0_PEM_SLOW_CRC_SUM
Creating C1:DAQ-DC0_C1X01_STATUS
Creating C1:DAQ-DC0_C1X01_CRC_CPS
Creating C1:DAQ-DC0_C1X01_CRC_SUM
Creating C1:DAQ-DC0_C1X02_STATUS
Creating C1:DAQ-DC0_C1X02_CRC_CPS
Creating C1:DAQ-DC0_C1X02_CRC_SUM
Creating C1:DAQ-DC0_C1SUS_STATUS
Creating C1:DAQ-DC0_C1SUS_CRC_CPS
Creating C1:DAQ-DC0_C1SUS_CRC_SUM
Creating C1:DAQ-DC0_C1OAF_STATUS
Creating C1:DAQ-DC0_C1OAF_CRC_CPS
Creating C1:DAQ-DC0_C1OAF_CRC_SUM
Creating C1:DAQ-DC0_C1ALS_STATUS
Creating C1:DAQ-DC0_C1ALS_CRC_CPS
Creating C1:DAQ-DC0_C1ALS_CRC_SUM
Creating C1:DAQ-DC0_C1X03_STATUS
Creating C1:DAQ-DC0_C1X03_CRC_CPS
Creating C1:DAQ-DC0_C1X03_CRC_SUM
Creating C1:DAQ-DC0_C1IOO_STATUS
Creating C1:DAQ-DC0_C1IOO_CRC_CPS
Creating C1:DAQ-DC0_C1IOO_CRC_SUM
Creating C1:DAQ-DC0_C1MCS_STATUS
Creating C1:DAQ-DC0_C1MCS_CRC_CPS
Creating C1:DAQ-DC0_C1MCS_CRC_SUM
Creating C1:DAQ-DC0_C1RFM_STATUS
Creating C1:DAQ-DC0_C1RFM_CRC_CPS
Creating C1:DAQ-DC0_C1RFM_CRC_SUM
Creating C1:DAQ-DC0_C1PEM_STATUS
Creating C1:DAQ-DC0_C1PEM_CRC_CPS
Creating C1:DAQ-DC0_C1PEM_CRC_SUM
Creating C1:DAQ-DC0_C1X04_STATUS
Creating C1:DAQ-DC0_C1X04_CRC_CPS
Creating C1:DAQ-DC0_C1X04_CRC_SUM
Creating C1:DAQ-DC0_C1LSC_STATUS
Creating C1:DAQ-DC0_C1LSC_CRC_CPS
Creating C1:DAQ-DC0_C1LSC_CRC_SUM
Creating C1:DAQ-DC0_C1SCX_STATUS
Creating C1:DAQ-DC0_C1SCX_CRC_CPS
Creating C1:DAQ-DC0_C1SCX_CRC_SUM
Creating C1:DAQ-DC0_C1X05_STATUS
Creating C1:DAQ-DC0_C1X05_CRC_CPS
Creating C1:DAQ-DC0_C1X05_CRC_SUM
Creating C1:DAQ-DC0_C1SCY_STATUS
Creating C1:DAQ-DC0_C1SCY_CRC_CPS
Creating C1:DAQ-DC0_C1SCY_CRC_SUM
Creating C1:DAQ-DC0_C1ASS_STATUS
Creating C1:DAQ-DC0_C1ASS_CRC_CPS
Creating C1:DAQ-DC0_C1ASS_CRC_SUM
Creating C1:DAQ-DC0_C1CAL_STATUS
Creating C1:DAQ-DC0_C1CAL_CRC_CPS
Creating C1:DAQ-DC0_C1CAL_CRC_SUM
Creating C1:DAQ-DC0_C1MCC_STATUS
Creating C1:DAQ-DC0_C1MCC_CRC_CPS
Creating C1:DAQ-DC0_C1MCC_CRC_SUM
Creating C1:DAQ-DC0_C1MCP_STATUS
Creating C1:DAQ-DC0_C1MCP_CRC_CPS
Creating C1:DAQ-DC0_C1MCP_CRC_SUM
Creating C1:DAQ-DC0_C1LSP_STATUS
Creating C1:DAQ-DC0_C1LSP_CRC_CPS
Creating C1:DAQ-DC0_C1LSP_CRC_SUM
Creating C1:DAQ-DC0_C1SPX_STATUS
Creating C1:DAQ-DC0_C1SPX_CRC_CPS
Creating C1:DAQ-DC0_C1SPX_CRC_SUM
Creating C1:DAQ-DC0_C1SUP_STATUS
Creating C1:DAQ-DC0_C1SUP_CRC_CPS
Creating C1:DAQ-DC0_C1SUP_CRC_SUM
[Mon Aug  6 20:15:35 2012] Epics server started
[Mon Aug  6 20:15:35 2012] EDCU has 2553 channels configured; first=0

Symmetricom status: LOCKED
Starting at gps 1028344552 prev_gps 1028344552 frac 312500000 f 314094022
[Mon Aug  6 20:15:38 2012] Minute trender made GPS time correction; gps=1028344552; gps%60=52
Segmentation fault (core dumped)

  7097   Mon Aug 6 20:27:59 2012 JamieUpdateSimulationsMore work on getting simplant models running: c1lsp and c1sup

I'm trying to get more of the simplant models running so that we (me and Sasha Surf) can get a full real-time cavity simplant running.  As I reported last week, c1spx is running again on c1iscex.

The two new simplant models are c1sup, which holds the simplant for ITMX, and c1lsp, which holds the IFO simplant, specifically the one we're working on for XARM.

Here's the relevant info:

model  host     dcuid  cpu
c1spx  c1iscex  61     4
c1sup  c1sus    62     6
c1lsp  c1lsc    60     6

c1spx and c1sup will be running the sus_single_plant parts for ETMX and ITMX simplant.  All the simplant suspension channels will be names "SUP" (as opposed to "SUS" for control).

c1lsp is now running, but c1sup won't run for unknown reasons.  The c1sup model is not very complicated, and in fact is more-or-less identical to c1spx.  It compiles and installs and even loads, but it completely unresponsive after loading.  Unfortunately I've had enough CDS bullshit for today, so I'll try to figure out what's going on tomorrow.

  7101   Tue Aug 7 11:46:24 2012 JamieUpdateCDSAlex working on daqd

Alex is apparently working on daqd (remotely).  I'll report back when I find out more.

  7102   Tue Aug 7 14:17:07 2012 JamieUpdateCDSdaqd running again; related to c1sup issue

So daqd's problem was apparently the bad/non-running c1sup model.  The c1sup model, which I reported on attempting to get running in 7097, was not running because there were no available CPUs on the c1sus FE machine.  This was due to my stupid undercounting of the number of CPUs.  Anyway, for reasons I don't understand, this was causing daqd to segfault.  Removing c1sup from c1sus "fixed" the problem.

Alex agreed that daqd should definitely not be segfaulting in this circumstance.  It's still unclear exactly what daqd was looking at that was causing it to crash.

I'm going to move c1sup to c1iscex, which has a lot of spare CPUs.

  7103   Tue Aug 7 14:34:01 2012 JamieUpdateCDSjk. daqd still segfaulting

Quote:

So daqd's problem was apparently the bad/non-running c1sup model.  The c1sup model, which I reported on attempting to get running in 7097, was not running because there were no available CPUs on the c1sus FE machine.  This was due to my stupid undercounting of the number of CPUs.  Anyway, for reasons I don't understand, this was causing daqd to segfault.  Removing c1sup from c1sus "fixed" the problem.

Alex agreed that daqd should definitely not be segfaulting in this circumstance.  It's still unclear exactly what daqd was looking at that was causing it to crash.

I'm going to move c1sup to c1iscex, which has a lot of spare CPUs.

I spoke too soon.  It's still segfaulting, but at a different place. Alex and I are looking into it.

But another mystery solved is the cause of all the network slowness: the daqd core dump.  When daqd segfaults it dumps it's core, which can typically be >4G, to /opt/rtcds/caltech/c1/target/fb/core.  This is of course an NFS mount from linux1, so it's dumping 4G on the network, which not surprisingly clogs the network.

  7105   Tue Aug 7 15:04:23 2012 JamieUpdateCDSdaqd problem was root-owned files and directories

Apparently the last problem was because of root-owned frame directories that daqd was trying to write to.  During debugging Alex had run daqd as root, but it's supposed to run as controls.  All the /frame directories are supposed to be owned by controls.  When daqd was run as root, it created new frame directories owned by root, which controls couldn't write to when I restarted daqd the proper way.  Once we chown'd the directories daqd started running again.

Alex also put in a "fix" for the core dump problem.  He touched an empty core file owned by root:

-rw-r--r-- 1 root root 0 Aug  7 14:38 /opt/rtcds/caltech/c1/target/fb/core

This will prevent any dying daqd process owned by controls from dumping it's core at that location.  Personally I think this is a horribly hacky "solution" that doesn't actually fix any of the issues that were causing the segfaults to begin with, but it might prevent some of the network slow down we see when the core does dump.  It's mostly just masking the problem, though, so I'm tempted to remove it so we all feel the pain when daqd starts shitting all over the network again.

  7175   Tue Aug 14 11:46:22 2012 JamieUpdateCDSIPC senders know nothing about rates of IPC receivers, we need to filter signals appropriately

Quote:

When signals are transmitted between the models running at different rates, no AI or AA filters are automatically applied. We need to fix our models.

ai.png

This is known, but we just haven't fully groked it yet.  We need to look closely at every place we have IPCs between models running at different rates.  The sender has no information about receivers, so it can't reasonably do anything to pre-filter the signal on it's own.

So for transmission from:

  • faster -> slower models: add anti-aliasing on the sender side
  • slower -> faster models: add anti-imaging on the receiver side
  7182   Tue Aug 14 17:47:44 2012 JamieUpdateCDSc1sus machine replaced

Rolf and Alex came back over with a replacement machine for c1sus.   We removed the old machine, removed it's timing, dolphin, and PCIe extension cards and put them in the new machine.  We then installed the new machine and booted it and it came up fine.  The BIOS in this machine is slightly different, and it wasn't having the same failure-to-boot-with-no-COM issue that the previous one was.  The COM ports are turned off on this machine (as is the USB interface).

Unfortunately the problem we were experiencing with the old machine, that unloading certain models was causing others to twitch and that dolphin IPC writes were being dropped, is still there.  So the problem doesn't seem to have anything to do with hardware settings...

After some playing, Rolf and Alex determined that for some reason the c1rfm model is coming up in a strange state when started during boot.  It runs faster, but the IPC errors are there.  If instead all models are stopped, the c1rfm model is started first, and then the rest of the models are started, the c1rfm model runs ok.  They don't have an explanation for this, and I'm not sure how we can work around it other than knowing the problem is there and do manual restarts after boot.  I'll try to think of something more robust.

A better "fix" to the problems is to clean up all of our IPC routing, a bunch of which we're currently doing very inefficient right now.  We're routing things through c1rfm that don't need to be, which is introducing delays.  It particular, things that can communicate directly over RFM or dolphin should just do so.  We should also figure out if we can put the c1oaf and c1pem models on the same machine, so that they can communicate directly over shared memory (SHMEM).  That should cut down on overhead quite a bit.  I'll start to look at a plan to do that.

 

  7592   Tue Oct 23 00:51:41 2012 JamieUpdateAdaptive Filteringmicrophone noise

Quote:

I will do some experiments on acoustic noise canceling during my stay.
Now I am planning to cancel acoustic noise from PMC and see how the acoustic noise work and how we should place microphones.a

First, I measured the noise in microphones and its circuit.
mic_noise2.png
-blue, green, red, solid lines; microphone signals
-blue, green, red, dashed lines; un-coherent noise in signals
-yellow, black, solid lines; circuit noise (signal input is open, not connected to the microphones)

We can see the acoustic signal above 1 Hz, and the circuit does not seem to limit its sensitivity. But I do not know why yellow and black is so different. I will check it tomorrow.

Hi, Ayaka.  It would be good if you could give a little bit more detail about this plot:

  • What exactly are the "signals"?  Are you making a sound somehow?  If so, what is producing the sound?  What is it's spectrum?
  • Are the blue/green/red traces from three different microphones?
  • Coherence usually implies a comparison between two signals.  Is something being compared in the dashed traces?
  • Are the yellow and black traces from different amplifiers?
  • What are the units of the Y axis?

 

  7818   Wed Dec 12 20:22:03 2012 JamieUpdateComputer Scripts / Programsilluminators fixed and added to VIDEO screen

I fixed the illuminator setup.  ETMY was not hooked up, and the screen wasn't configured quite right.  The ITMX illuminator still needs to be hooked up to the vertex switch.

I made an updated illuminator script that works more like the videoswitch scripts, with a saner interface, and is located here:

/opt/rtcds/caltech/c1/scripts/general/illuminator

I also fixed up the illuminator MEDM interface a bit and added it to the VIDEO screen:

video.png

While I was at it, I cleaned up the sitemap a bit:

sitemap.png

I hope everyone won't be too confused.

  7842   Mon Dec 17 20:22:07 2012 JamieUpdateGeneralVent plan: WE VENT TOMORROW

We will vent tomorrow (12/18)

After lengthy discussion, I have determined that we should vent now, with the primary goal of this vent to replace the input steering PZTs with the new active tip-tilts.  Since we still don't understand exactly what's going on with the PRC, it's unclear what we would do to attack the problem.  We need to do more modeling and measurements first.  We should limit the goal of this vent to replacing the PZTs, and then close up and do more measurements with better modeling and improved input point in hand.

There is limited time this week before everyone leaves for two-week holidays on Thursday or Friday.  The reason to not vent would be that we don't want to leave the IFO at air during the two week holiday.  People seem to think that this is not a problem, so we don't gain anything by waiting.  Therefore we vent now and do what we can before people take off.

The goal for this week is to replace PZT1 only:

Now: Jenne and Manasa are doing vent prep.  Manasa is lowering the input power and preparing the mode cleaner.  Jenne centered IPPOS and IPANG.  This will allow us to check how the input alignment changed during the vent.

12/18: Steve is going to start the vent as soon as he gets back from the dentist, at around 10am.  He will regulate the vent such that we are ready to lock the mode cleaner by 4pm.  At that time we will lock the MC and recheck the input alignment with IPPOS/ANG, with the idea being to see if anything moves during the vent.

12/19: First thing in the morning we take off the access connector ONLY.  The access connector is all we'll need to replace PZT1.    Put light door on the OMC chamber side immediately, since we won't need to get in there at all.  We won't need the light access connector.

For the rest of Wednesday we'll remove PZT1 and install the new active TT.  We'll be using the current out-of-vac cable for PZT1 for the new TT.  We should only have to modify the rack end of the cable to accommodate the coil driver.  This should be a small modification.  Given that we have no wiring diagrams we'll have to pin it out in situ.

12/20: Hopefully finish up TT installation.

Jenne leaves 12/20, Manasa and Jamie leave 12/21.  We will either leave light doors on access connector holes, or possibly Rana, Koji, and Steve will replace access connector on Friday so that we can pump down to 1 Torr or something so that we leave it there over the holiday.

After we return from vacation:

PZT2/TT2 installation.  This will be less straight forward since the new TT has a bigger foot print than PZT2 and will block the PRM optical levers.  We'll need one additional steering mirror to redirect the oplev around the TT.  See elog 7815.

Once the new TTs are installed, we'll reevaluate where we're at.  If PRC modeling has progressed and we have an idea of something to work on with the PRC, we can.  Otherwise, we'll just button up, pump down, and get on with some better PRC measurements.

  7852   Tue Dec 18 16:37:17 2012 JamieUpdateAlignmentPost vent, pre door removal alignment

[Jenne, Manasa, Jamie]

Now that we're up to air we relocked the mode cleaner, tweaked up the alignment, and looked at the spot positions:

mcspot_post_vent.pdf

The measurements from yesterday were made before the input power was lowered.  It appears that things have not moved by that much, which is pretty good.

We turned on the PZT1 voltages and set them back to their nominal values as recorded before shut-down yesterday.  Jenne had centered IPPOS before shutdown (IPANG was unfortunately not coming out of the vacuum).  Now we're at the following value: (-0.63, 0.66).  We need to calibrate this to get a sense of how much motion this actually is, but this is not insignificant.

 

  7863   Thu Dec 20 12:14:19 2012 JamieUpdateGeneralproblem with in-vac wiring for TTs

Nic and I discovered a problem with the in-vac wiring from the feed-thru to the top of the table.  Pin 13 at the top of the stack, which is one of the coil pins on the tip-tilt quadrapus cables, is *the* shield braid on the cable that goes to the feed-thru.  This effectively shorts one of our coil signals.

There are three solutions as we see it:

* swap pin 13 for something else at the top of the stack, and then swap it back somewhere else outside of the vacuum.

* swap *all* the pins at the top of the table to be the mirror.  We would then need to mirror our cables on the outside, but that's less of an issue.

* make a mirror adapter that sits at the top.  This would obviously need to be cleaned/baked.

None of these solutions is particularly good or fast.

ELOG V3.1.3-