ID   Date   Author   Type   Category   Subject
  7897   Mon Jan 14 12:08:39 2013   jamie   Update   Alignment   TT2 connections

Quote:

Quote:

Was the connection between the feedthrough (atmosphere side) and the connector on the optical table confirmed to be OK?

We had a similar situation for the TT1. We found that we were using the wrong feedthrough connector (see TT1 elog).

 The major problem that Manasa and I found was that we weren't getting voltage along the cable between the rack and the chamber (all out-of-vac stuff).  We used a function generator to put voltage across 2 pins, then a DMM to try to measure that voltage on the other end of the cable.  No go.  Jamie and I will look at it again today.

Everything was fine.  Apparently these guys just forgot that the cable from the rack to the chamber flips its pins.  There was also a small problem with the patch cable from the coil driver that had flipped pins.  This was fixed.  The coil driver signals are now getting to the TTs.

Investigating why the pitch/yaw seems to be flipped...

  7901   Tue Jan 15 19:26:35 2013   jamie   Update   Alignment   Adjustment of active TTs and input alignment

[Jamie, Manasa, Jenne]

We started by verifying that the tip-tilts were getting the correct signals at the correct coils, and were hanging properly without touching.

We started with TT2.  It was not hanging freely.  One of the coils was in much further than the others, and the mirror frame was basically sitting on the back side yaw dampers.  I backed out the coil to match the others, and backed off all of the dampers, both in back and the corner dampers on the front.

Once the mirror was freely suspended, we borrowed the BS oplev to verify that the mirror was hanging vertically.  I adjusted the adjustment screw on the bottom of the frame to make it level.  Once that was done, we verified our EPICS control.  We finally figured out that some of the coils have polarity flipped relative to the others, which is why we were seeing pitch as yaw and vice-versa.  At that point we were satisfied with how TT2 was hanging, and went back to TT1.
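The pitch/yaw swap caused by flipped coil polarity can be illustrated with a minimal sketch. It assumes a four-coil layout (upper-left, upper-right, lower-left, lower-right) and a fault in which one diagonal pair is reversed; the coil names and the specific pair are hypothetical and not taken from the TT wiring.

```python
# Hypothetical four-coil layout: upper-left, upper-right, lower-left, lower-right.
COILS = ("UL", "UR", "LL", "LR")

# Ideal drive patterns: pitch pushes top against bottom, yaw pushes left against right.
PITCH = {"UL": +1, "UR": +1, "LL": -1, "LR": -1}
YAW = {"UL": +1, "UR": -1, "LL": +1, "LR": -1}


def apply_polarity(drive, polarity):
    """Return the force pattern actually produced when each coil has the given sign."""
    return {coil: drive[coil] * polarity[coil] for coil in COILS}


# Assumed wiring fault: one diagonal pair of coils has reversed polarity.
flipped = {"UL": +1, "UR": -1, "LL": -1, "LR": +1}

print(apply_polarity(PITCH, flipped) == YAW)   # a pitch request moves the mirror in yaw
print(apply_polarity(YAW, flipped) == PITCH)   # a yaw request moves the mirror in pitch
```

With this assumed fault, both comparisons print True: the two degrees of freedom are exchanged rather than simply inverted.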

Given how hard it is to look at TT1, I just made sure all the dampers were backed out and touched the mirror frame to verify that it was freely swinging.  I leveled TT1 with the lower frame adjustment screw by looking at the spot position on MMT1.  Once it was level, we adjusted the EPICS biases in yaw to get it centered in yaw on MMT1.

I then adjusted the screws on MMT1 to get the beam centered at MMT2, and did the same at MMT2 to get the beam centered vertically at TT2.

I put the target at PRM and the double target at BS.  I loosened TT2 from its base so that I could push it around a bit.  Once I had it in a reasonable position, with a beam coming out at PR3, I adjusted MMT1 to get the beam centered through the PRM target.  I went back and checked that we were still centered at MMT1.  We then adjusted the pitch and yaw of TT2 to get the transmitted beam through the BS targets as clear as possible.

At this point we stopped and closed up.  Tomorrow first thing AM we'll get our beams at the ETMs, try to finalize the input alignment, and see if we can do some in-air locking.

The plan is still to close up at the end of the week.

  7919   Fri Jan 18 15:08:13 2013   jamie   Update   Alignment   alignment of temporary half PRC

[jenne, jamie]

Jenne and I got the half PRC flashing.  We could see flashes in the PRM and PR2 face cameras.

We took out the mirror in the REFL path on the AP that diverts the beam to the REFL RF pds so that we could get more light on the REFL camera.  Added an ND filter to the REFL camera so as not to saturate.

  7949   Mon Jan 28 21:32:38 2013   jamie   Update   Alignment   tweaking of alignment into half PRC

[Koji, Jamie]

We tweaked up the alignment of the half PRC a bit.  Koji started by looking at the REFL and POP DC powers as a function of TT2 and PRM alignment. 
He found that the reflected beam for good PRC transmission was not well overlapped at REFL.  When the beam was well overlapped at REFL, there was clipping in the REFL path on the AS table.

We started by getting good overlap at REFL, and then went to the AS table to tweak up all the beams on the REFL pds and cameras.
This made the unlocked REFL DC about 40 counts. This was about 10mV (=0.2mA) at the REFL55 PD.
This surprised Koji, since earlier we had found a REFL DC of 160 as the maximum of the day for a particular combination of PRM pitch and TT2 pitch. So something may be wrong somewhere.

We then moved to the ITMX table where we cleaned up the POP path.  We noticed that the lens in the POP path is a little slow, so the beam is too big on the POP PD and on the POP camera (and on the camera pick-off mirror as well).
We moved the currently unused POP55 and POP22/110 RFPDs out of the way so we could move the POP RF PD and camera back closer to the focus.  Things are better, but we still need to get a better focus, particularly on the POP PD.

We found two irides on the oplev path. They are too big, and one of them is too close to the POP beam. Since it does not make sense to have two irides so close together, we pulled that one off its post.

Other things we noticed:

  • The POP beam is definitely clipping in the vacuum, looks like on two sides.
  • We can probably get better layout on the POP table, so we're not hitting mirrors at oblique angles and can get beams on grid paths.

After the alignment work on the tables, we started locking the cavity. We already saw an improvement of the POPDC power from 1000 cnt to 2500 cnt without any realignment.
Once PRM was tweaked a little (0.01ish in pitch and yaw), a maximum POPDC of 6000 was achieved. But the POP camera still shows a non-Gaussian beam shape, and the Faraday camera shows bright scattering of the beam. It seems that the scattering at the Faraday is not from the main beam but from the halo leaking from the cavity (i.e. unlocking the cavity made the scattering disappear).


Tomorrow Jenne and I will go into BS to tweak the alignment of the TEMP PRC flat mirror, and into ITMX to see if we can clean up the POP path.

  8778   Thu Jun 27 23:18:46 2013   jamie   Update   Computer Scripts / Programs   WARNING: Matlab upgraded

Quote:

I moved the old matlab directory from /cvs/cds/caltech/apps/linux64/matlab_o to /cvs/cds/caltech/apps/linux64/matlab_oo

and moved the previously current matlab dir from /cvs/cds/caltech/apps/linux64/matlab to /cvs/cds/caltech/apps/linux64/matlab_o.

And have installed the new Matlab 2013a into /cvs/cds/caltech/apps/linux64/matlab.

Since I'm not sure how well the new Matlab/Simulink plays with the CDS RCG, I've left the old one and we can easily revert by renaming directories.

Be careful with this.  If Matlab starts re-saving models in a new file format that is unreadable by the RCG, then we won't be able to rebuild models until we do an svn revert.  Or the bigger danger, that the RCG *thinks* it reads the file and generates code that does something unexpected.

Of course this all may be an attempt to drive home the point that we need an RCG test suite.

  8951   Thu Aug 1 15:06:59 2013   jamie   Update   CDS   New model for endtable PZTs

Quote:

I have made a new model for the endtable PZT servo, and have put it in c1iscex. Model name is c1asx. Yesterday, Koji helped me start the model up. The model seems to be running fine now (there were some problems initially, I will post a more detailed elog about this in a bit) but some channels, which are computer generated, don't seem to exist (they show up as white blocks on the MEDM GDS_TP screen). I am attaching a screenshot of the said screen and the names of the channels. More detailed elog about what was done in making the model to follow.

 

C1ASX_GDS_TP.png

 

Channel Names:

C1:DAQ-DC0_C1ASX_STATUS (this is the channel name for the two leftmost white blocks)

C1:DAQ_DC0_C1ASX_CRC_CPS

C1:DAQ-DC0_C1ASX_CRC_SUM

I don't know what's going on here (why the channels are white), and I don't yet have a suggestion of where to look to fix it but...

Is there a reason that you're making a new model for this?  You could just use an existing model at c1iscex, like the c1scx, and put your stuff in a top-names block.  Then you wouldn't have to worry about all of the issues with adding and integrating a new model.

  9086   Wed Aug 28 19:47:28 2013   jamie   Configuration   CDS   front end IPC configuration

Quote:

It's hard to believe that c1lsc -> c1sus only has 4 channels. We actuate ITMX/Y/BS/PRM/SRM for the length control.
In addition to these, we control the angles of ITMX/Y/BS/PRM (and SRM in future) via c1ass model on c1lsc.
So there should be at least 12 connections (and more as I ignored MCL).

Koji was correct that I missed some connections from c1lsc to c1sus.  I corrected the graph in the original post.

Also, I should have noted that the graph doesn't actually include everything that we now have.  I left out all the simplant stuff, which adds extra connections between c1lsc and c1sus, mostly because the sus simplant is being run on c1lsc only because there was no space on c1sus.  That should be corrected, either by moving c1rfm to c1lsc, or by adding a new core to c1sus.

I also spoke to Rolf today about the possibility of getting a OneStop fiber and dolphin card for c1ioo.  The dolphin card and cable we should be able to order no problem.  As for the OneStop, we might have to borrow a new fiber-supporting card from India, then send our current card to OneStop for fiber-supporting modifications.  It sounds kind of tricky.  I'll post more as I figure things out.

Rolf also said that in newer versions of the RCG, the performance of RFM direct memory access (DMA) has improved considerably, which considerably reduces the model run-time delay involved in using the RFM.  In other words, the long-awaited RCG upgrade might alleviate some of our IPC woes.

We need to upgrade the RCG to the latest release (2.7).

  9087   Wed Aug 28 23:09:55 2013   jamie   Configuration   CDS   code to generate host IPC graph
Attachment 1: hosts.png
Attachment 2: 40m-ipcs-graph.py
#!/usr/bin/env python

# ipc connections: (from, to, number)
ipcs = [
    ('c1scx', 'c1lsc', 1),
    ('c1scy', 'c1lsc', 1),
    ('c1oaf', 'c1lsc', 8),

    ('c1scx', 'c1ass', 1),
    ('c1scy', 'c1ass', 1),
... 96 more lines ...
  9194   Thu Oct 3 08:57:00 2013   jamie   Update   Computer Scripts / Programs   pianosa can't find Jamie PPA

Quote:

Message on 'pianosa':

Failed to fetch http://ppa.launchpad.net/drgraefy/nds2-client/ubuntu/dists/lucid/main/binary-amd64/Packages.gz  404  Not Found

Sorry, that was an experiment to see if I could set up a general-use repository for the NDS packages.  I've removed it, and did an update/upgrade.

  9266   Wed Oct 23 17:30:17 2013   jamie   Update   SUS   ETMY sensors compared to ETMX

c1scy has been running slow (compared to c1scx, which does basically the exact same thing *) for many moons now.  We've looked at it but never been able to identify a reason why it should run slower.  I suspect there may be some BIOS setting that's problematic.

The RCG build process is totally convoluted, and really bad at reporting errors.  In fact, you need to be careful because the errors it does print are frequently totally misleading.  You have to look at the error logs for the full story.  The rtcds utility is ultimately just executing the "standard" build instructions.  The build directory is:

    /opt/rtcds/caltech/c1/rtbuild

The build/error logs are:

    <model>.log     <model>_error.log 
I'll add a command to rtcds to view the last logs.
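Until such a command exists, a small helper can print the tail of both logs for a model. This is a minimal sketch, assuming the log files sit directly in the build directory quoted above; the function names `build_log_paths` and `tail` are made up for illustration and are not part of rtcds.

```python
from collections import deque
from pathlib import Path

# Build directory quoted in this entry.
BUILD_DIR = Path("/opt/rtcds/caltech/c1/rtbuild")


def build_log_paths(model, build_dir=BUILD_DIR):
    """Return (build log, error log) paths for a model such as 'c1scy'."""
    build_dir = Path(build_dir)
    return build_dir / f"{model}.log", build_dir / f"{model}_error.log"


def tail(path, n=20):
    """Return the last n lines of a text file, or an empty list if it is missing."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    for log in build_log_paths("c1scy"):
        print(f"==> {log} <==")
        print("\n".join(tail(log)))
```

Reading the error log this way, rather than trusting the console output of the build, follows the advice above that the printed errors are often misleading.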

(*) the phrase "basically the exact same thing" is LIGO code for "empirically not at all the same"
  9278   Thu Oct 24 12:00:11 2013   jamie   Update   CDS   fb acquisition of slow channels

Quote:

 

 While that would be good - it doesn't address the EDCU problem at hand. After some verbal emailing, Jamie and I find that the master file in target/fb/ actually doesn't point to any of the EDCU files created by any of the FE machines. It is only using the C0EDCU.ini as well as the *_SLOW.ini files that were last edited in 2011 !!!

So....we have not been adding SLOW channels via the RCG build process for a couple years. Tomorrow morning, Jamie will edit the master file and fix this unless I get to it tonight. There are a bunch of old .ini files in the daq/ dir that can be deleted too.

I took a look at the situation here so I think I have a better idea of what's going on (it's a mess, as usual):

The framebuilder looks at the "master" file

    /opt/rtcds/caltech/c1/target/fb/master

which lists a bunch of other files that contain lists of channels to acquire.  It looks like there might have been some notion to just use 

    /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini

as the master slow channels file.  Slow channels from all over the place have been added to this file, presumably by hand.  Maybe the idea was to just add slow channels manually as needed, instead of recording them all by default.  The full slow channels lists are in the

    /opt/rtcds/caltech/c1/chans/daq/C1EDCU_<model>.ini

files, none of which are listed in the fb master file.

There are also these old slow channel files, like

    /opt/rtcds/caltech/c1/chans/daq/SUS_SLOW.ini

There's a perplexing breakdown of channels spread out between these files and C0EDCU.ini:

controls@fb ~ 0$ grep MC3_URS /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini
[C1:SUS-MC3_URSEN_OVERFLOW]
[C1:SUS-MC3_URSEN_OUTPUT]
controls@fb ~ 0$ grep MC3_URS /opt/rtcds/caltech/c1/chans/daq/MCS_SLOW.ini
[C1:SUS-MC3_URSEN_INMON]
[C1:SUS-MC3_URSEN_OUT16]
[C1:SUS-MC3_URSEN_EXCMON]
controls@fb ~ 0$

Why some of these channels are in one file and some in the other, I have no idea.  If the fb finds the same channel listed more than once it will fail to start, so at least we've been diligent about keeping disparate lists in the different files.
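Since a repeated channel stops the fb from starting, it may help to check for duplicates before adding more files to the master list. The following is a minimal sketch, assuming channel names appear as `[...]` section headers as in the grep output above and that a `[default]` section may be present in each file (an assumption); the function names are hypothetical.

```python
import re
from collections import defaultdict

SECTION = re.compile(r"^\[([^\]]+)\]\s*$")


def channels(ini_text):
    """Return the channel names that appear as [section] headers in .ini text."""
    names = []
    for line in ini_text.splitlines():
        match = SECTION.match(line.strip())
        # Assumed: a [default] section holds settings, not a channel.
        if match and match.group(1).lower() != "default":
            names.append(match.group(1))
    return names


def duplicates(files):
    """Map each channel listed more than once to the files that list it.

    `files` maps a file name to the text of that file.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        for chan in channels(text):
            seen[chan].append(name)
    return {chan: names for chan, names in seen.items() if len(names) > 1}


# The five channels from the grep output above: no overlap between the two files.
files = {
    "C0EDCU.ini": "[C1:SUS-MC3_URSEN_OVERFLOW]\n[C1:SUS-MC3_URSEN_OUTPUT]\n",
    "MCS_SLOW.ini": "[C1:SUS-MC3_URSEN_INMON]\n[C1:SUS-MC3_URSEN_OUT16]\n"
                    "[C1:SUS-MC3_URSEN_EXCMON]\n",
}
print(duplicates(files))
```

For the channels shown in the grep output this prints an empty dict, which matches the observation that the lists have been kept disjoint.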

So I guess the question is if we want to automatically record all slow channels by default, in which case we add in the C1EDCU_<model>.ini files, or if we want to keep just adding them in by hand, in which case we keep the status quo.  In either case we should probably get rid of the *_SLOW.ini files (by maybe integrating their channels in C0EDCU.ini), since they're old and just confusing things.

In the meantime, I added C1:FEC-45_CPU_METER to C0EDCU.ini, so that we can keep track of the load there.

 

  9282   Thu Oct 24 17:26:35 2013   jamie   Update   CDS   new dataviewer installed; 'cdsutils avg' now working.

I installed a new version of dataviewer (2.3.2), and at the same time fixed the NDSSERVER issue we were having with cdsutils.  They should both be working now.

The problem turned out to be that I had set up our dataviewer to use the NDSSERVER environment, whereas by default it uses the LIGONDSIP variable.  Why we have two different environment variables that mean basically exactly the same thing, who knows.
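A wrapper script that has to cope with either variable could resolve the server in one place. This is a minimal sketch only: the preference order (NDSSERVER first, then LIGONDSIP) and the function name `nds_server` are assumptions, and it is not how dataviewer or cdsutils actually read their configuration.

```python
import os


def nds_server(env=None):
    """Pick the NDS server from the environment.

    NDSSERVER is tried first and LIGONDSIP second (assumed order).
    Returns None if neither variable is set to a non-empty value.
    """
    env = os.environ if env is None else env
    for var in ("NDSSERVER", "LIGONDSIP"):
        value = env.get(var, "").strip()
        if value:
            return value
    return None
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise without touching the real environment.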

  9285   Thu Oct 24 23:12:21 2013   jamie   Update   CDS   new dataviewer installed; no longer works on Ubuntu 10 workstations

Quote:

I installed a new version of dataviewer (2.3.2), and at the same time fixed the NDSSERVER issue we were having with cdsutils.  They should both be working now.

The problem turned out to be that I had set up our dataviewer to use the NDSSERVER environment, whereas by default it uses the LIGONDSIP variable.  Why we have two different environment variables that mean basically exactly the same thing, who knows.

 Dataviewer seems to run fine on Chiara (Ubuntu 12), but not on Rossa or Pianosa (Ubuntu 10), or Megatron, which I assume is also something medium-old.

We get the error:

controls@megatron:~ 0$ dataviewer
Can't find hostname `fb:8088'
Can't find hostname `fb:8088'; gethostbyname(); error=1
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Error in obtaining chan info.
Can't find hostname `fb:8088'
Can't find hostname `fb:8088'; gethostbyname(); error=1

Sadface :(   We also get the popup saying "Couldn't connect to fb:8088"

  9287   Thu Oct 24 23:30:57 2013   jamie   Update   CDS   new dataviewer installed; no longer works on Ubuntu 10 workstations

Quote:

Quote:

I installed a new version of dataviewer (2.3.2), and at the same time fixed the NDSSERVER issue we were having with cdsutils.  They should both be working now.

The problem turned out to be that I had set up our dataviewer to use the NDSSERVER environment, whereas by default it uses the LIGONDSIP variable.  Why we have two different environment variables that mean basically exactly the same thing, who knows.

 Dataviewer seems to run fine on Chiara (Ubuntu 12), but not on Rossa or Pianosa (Ubuntu 10), or Megatron, which I assume is also something medium-old.

We get the error:

controls@megatron:~ 0$ dataviewer
Can't find hostname `fb:8088'
Can't find hostname `fb:8088'; gethostbyname(); error=1
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Warning: Not all children have same parent in XtManageChildren
Error in obtaining chan info.
Can't find hostname `fb:8088'
Can't find hostname `fb:8088'; gethostbyname(); error=1

Sadface :(   We also get the popup saying "Couldn't connect to fb:8088"

Sorry, that was a goof on my part.  It should be working now.

  9393   Fri Nov 15 10:49:55 2013   jamie   Update   CDS   Can't talk to AUXEY?

Please just try rebooting the vxworks machine.  I think there is a key on the card or crate that will reset the device.  These machines are "embedded", so they're designed to be hard reset.  Don't worry, just restart the damn thing and see if that fixes the problem.

  9531   Tue Jan 7 23:08:01 2014   jamie   Update   CDS   /frames is full, causing daqd to die

Quote:

The daqd process is segfaulting and restarting itself every 30 seconds or so.  It's pretty frustrating. 

Just for kicks, I tried an mxstream restart, clearing the testpoints, and restarting the daqd process, but none of these things changed anything.  

Manasa found an elog from a year ago (elog 7105 and preceding), but I'm not sure that it's a similar / related problem.  Jamie, please help us

The problem is not exactly the same as what's described in 7105, but the symptoms are so similar I assumed they must have a similar source.

And sure enough, /frames is completely full:

controls@fb /opt/rtcds/caltech/c1/target/fb 0$ df -h /frames/
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              13T   13T     0 100% /frames
controls@fb /opt/rtcds/caltech/c1/target/fb 0$

So the problem in both cases was that it couldn't write out the frames.  Unfortunately daqd is apparently too stupid to give us a reasonable error message about what's going on.
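Because daqd does not report the condition itself, a quick check of how full the frames filesystem is can rule this cause in or out. A minimal sketch using the Python standard library follows; the 99% threshold and the function names are arbitrary choices for illustration.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_full(path, threshold=99.0):
    """True if the filesystem holding `path` is at or above `threshold` percent."""
    return percent_used(path) >= threshold


if __name__ == "__main__":
    # On fb the path of interest would be "/frames"; "." is used so this runs anywhere.
    print(f"{percent_used('.'):.2f}% used")
```

The number can differ slightly from the `Use%` column of `df`, which is computed from used and available space and so accounts for reserved blocks.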

So why is /frames full?  Apparently the wiper script is either not running, or is failing to do its job.  My guess is that this is a side effect of the linux1 raid failure we had over xmas.

  9533   Tue Jan 7 23:13:47 2014   jamie   Update   CDS   /frames is full, causing daqd to die

Quote:

So why is /frames full?  Apparently the wiper script is either not running, or is failing to do its job.  My guess is that this is a side effect of the linux1 raid failure we had over xmas.

It actually looks like the wiper script has been running fine.  There is a log from Tuesday morning:

controls@fb ~ 0$ cat /opt/rtcds/caltech/c1/target/fb/wiper.log

Tue Jan  7 06:00:02 PST 2014

Directory disk usage:
/frames/trend/minute_raw 385289132k
/frames/trend/second 100891124k
/frames/full 12269554048k
/frames/trend/minute 1906772k
Combined 12757641076k or 12458633m or 12166Gb

/frames size 13460088620k at 94.78%
/frames is below keep value of 95.00%
Will not delete any files
df reported usage 97.72%
controls@fb ~ 0$

So now I'm wondering if something else has been filling up the frames today.  Has anything changed today that might cause more data than usual to be written to frames?
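The numbers in the log show why nothing was deleted at 06:00: the decision is based on the summed usage of the four tracked directories, not on the figure reported by df. A minimal sketch reproducing that arithmetic from the logged values (the comparison itself is inferred from the log text, not read from the wiper script):

```python
# Directory usage in kilobytes, copied from the 06:00 wiper log above.
usage_k = {
    "/frames/trend/minute_raw": 385289132,
    "/frames/trend/second": 100891124,
    "/frames/full": 12269554048,
    "/frames/trend/minute": 1906772,
}
frames_size_k = 13460088620
keep_percent = 95.00

combined_k = sum(usage_k.values())
percent = 100.0 * combined_k / frames_size_k

print(f"Combined {combined_k}k at {percent:.2f}%")
print("delete" if percent > keep_percent else "will not delete")
```

The combined figure comes out at 94.78%, just under the 95.00% keep value, even though df reported 97.72% at the same time. That gap leaves little headroom if a day's worth of frames is larger than usual.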

I'm manually running the wiper script now to clear up some /frames.  Hopefully that will solve the problem temporarily.

  9535   Tue Jan 7 23:50:27 2014   jamie   Update   CDS   /frames space cleared up, daqd stabilized

The wiper script is done and deleted a whole bunch of stuff to clean up some space:

controls@fb ~ 0$ /opt/rtcds/caltech/c1/target/fb/wiper.pl --delete

Tue Jan  7 23:09:21 PST 2014

Directory disk usage:
/frames/trend/minute_raw 385927520k
/frames/trend/second 125729084k
/frames/full 12552144324k
/frames/trend/minute 2311404k
Combined 13066112332k or 12759875m or 12460Gb

/frames size 13460088620k at 97.07%
/frames above keep value of 95.00%
Frame area size is 12401156668k
/frames/full size 12552144324k keep 11781098835k
/frames/trend/second size 125729084k keep 24802313k
/frames/trend/minute size 2311404k keep 620057k
Deleting some full frames to free 771045488k
- /frames/full/10685/C-R-1068567600-16.gwf
- /frames/full/10685/C-R-1068567616-16.gwf
...
controls@fb ~ 0$ df -h /frames
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              13T   12T  826G  94% /frames
controls@fb ~ 0$
So it cleaned up 826G of space.  It looks like the fb is stabilized for the moment.  On site folks should confirm...
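The keep values in the wiper output are consistent with simple fixed fractions of the frame area. The sketch below reproduces them from the logged sizes; the fractions are inferred from the printed numbers and are assumptions, not values read from wiper.pl.

```python
# Values in kilobytes, copied from the 23:09 wiper output above.
frames_size_k = 13460088620
minute_raw_k = 385927520
full_k = 12552144324

# Fractions inferred from the printed numbers (assumptions, not read from wiper.pl).
KEEP_TOTAL = 0.95      # "keep value of 95.00%"
KEEP_FULL = 0.95       # share of the frame area for /frames/full
KEEP_SECOND = 0.002    # share for /frames/trend/second
KEEP_MINUTE = 0.00005  # share for /frames/trend/minute

# Frame area: the kept fraction of the disk, less the minute_raw trends.
frame_area_k = KEEP_TOTAL * frames_size_k - minute_raw_k
full_keep_k = KEEP_FULL * frame_area_k
second_keep_k = KEEP_SECOND * frame_area_k
minute_keep_k = KEEP_MINUTE * frame_area_k
to_free_k = full_k - full_keep_k

print(int(frame_area_k), int(full_keep_k), int(second_keep_k),
      int(minute_keep_k), int(to_free_k))
```

Under these assumptions the frame area agrees with the logged 12401156668k to within 1k, and the three keep values and the amount to free match the log after truncation to whole kilobytes.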

 


  9618   Mon Feb 10 18:03:41 2014   jamie   Update   CDS   12 core c1sus replacement

I have configured one of the spare Supermicro X8DTU-F chassis as a dual-CPU, 12-core CDS front end machine.  This is meant to be a replacement for c1sus.  The extra cores are so we can split up c1rfm and reduce the over-cycle problems we've been seeing related to RFM IPC delays.

I pulled the machine fresh out of the box, and installed the second CPU and additional memory that Steve purchased.  The machine seems to be working fine.  After assigning it a temporary IP address, I can boot it from the front-end boot server on the martian network.  It comes up cleanly with both CPUs recognized, with /proc/cpuinfo showing all 12 cores and free showing 12 GB memory.

The plan is:

  1. pull the old c1sus machine from the rack
  2. pull OneStop, Dolphin, RFM cards from c1sus chassis
  3. install OneStop, Dolphin, RFM cards into new c1sus
  4. install new c1sus back in rack
  5. power everything on and have it start back up with no problems

Obviously all of this needs to be done at a time when it won't interfere with locking work.  fwiw, I am around tomorrow (Tuesday, 2/11), but will likely be leaving for LHO on Wednesday.

  9727   Fri Mar 14 10:31:10 2014   jamie   Update   Green Locking   ALS Slow servo settings

Quote:

 

Q and I have started to...

 Ha!

  9798   Fri Apr 11 10:30:48 2014   jamie   Update   LSC   CARM and DARM both on IR signals!!!!!!!!!

Quote:

[EricQ, Jenne]

We're still working, but I'm really excited, so here's our news:  We are currently holding the IFO on all IR signals.  No green, no ALS is being used at all!!!!  

 Phenomenal!!  Well done, guys!

  9822   Thu Apr 17 11:00:54 2014   jamie   Update   CDS   failed attempt to get Dolphin working on c1ioo

I've been trying to get c1ioo on the Dolphin network, but have not yet been successful.

Background: if we can put the c1ioo machine on the fast Dolphin IPC network, we can essentially eliminate latencies between the c1als model and the c1lsc model, which are currently connected via a Rube Goldberg-esque c1lsc->dolphin->c1sus->rfm->c1ioo configuration.

Rolf gave us a Dolphin host adapter card, and we purchased a Dolphin fiber cable to run from the 1X2 rack to the 1X4 rack where the Dolphin switch is.

Yesterday I installed the dolphin card into c1ioo.  Unfortunately, c1ioo, which is a Sun Fire X4600 and therefore different from the rest of the front end machines, doesn't seem to be recognizing the card.  The /etc/dolphin_present.sh script, which is supposed to detect the presence of the card by grep'ing for the string 'Stargen' in the lspci output, returns nothing.
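The detection described here amounts to a substring search over the lspci output. A minimal Python sketch of the same check follows, for poking at the problem interactively; it is not the contents of /etc/dolphin_present.sh, and the function names are hypothetical.

```python
import subprocess


def dolphin_present(lspci_output):
    """True if any line of lspci output mentions 'Stargen'."""
    return any("Stargen" in line for line in lspci_output.splitlines())


def read_lspci():
    """Return the output of lspci, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    print("card detected" if dolphin_present(read_lspci()) else "no Stargen device found")
```

A negative result only says that no device identifying itself as Stargen is visible on the PCI bus; it does not distinguish an unsupported slot from a card that is not a Dolphin adapter at all.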

I've tried moving the card to different PCIe slots, as well as swapping it out with another Dolphin host adapter that we have.  Neither worked.

I looked at the Dolphin host adapter installed in c1lsc and it's quite different, presumably a newer or older model.  Not sure if that has anything to do with anything.

I'm contacting Rolf to see if he has any other ideas.

  9824   Thu Apr 17 16:59:45 2014   jamie   Update   CDS   slightly more successful attempt to get Dolphin working on c1ioo

So it turns out that the card that Rolf had given me was not a Dolphin host adapter after all.  He did have an actual host adapter board on hand, though, and kindly let us take it.  And this one works!

I installed the new board in c1ioo, and it recognized it.  Upon boot, the dolphin configuration scripts managed to automatically recognize the card, load the necessary kernel modules, and configure it.  I'll describe below how I got everything working.

However, at some point mx_stream stopped working on c1ioo.  I have no idea why, and it shouldn't be related to any of this dolphin stuff at all.  But given that mx_stream stopped working at the same time the dolphin stuff started working, I didn't take any chances and completely backed out all the dolphin stuff on c1ioo, including removing the dolphin host adapter from the chassis altogether.  Unfortunately that didn't fix any of the mx_stream issues, so mx_stream continues to not work on c1ioo.  I'll follow up in a separate post about that.  In the meantime, here's what I did to get dolphin working on c1ioo:

c1ioo Dolphin configuration

To get the new host recognized on the Dolphin network, I had to make a couple of changes to the dolphin manager setup on fb.  I referenced the following page:

https://cdswiki.ligo-la.caltech.edu/foswiki/bin/view/CDS/DolphinHowTo

Below are the two patches I made to the dolphin ("dis") config files on fb:

--- /etc/dis/dishosts.conf.bak    2014-04-17 09:31:08.000000000 -0700
+++ /etc/dis/dishosts.conf    2014-04-17 09:28:27.000000000 -0700
@@ -26,6 +26,8 @@
 ADAPTER:  c1sus_a0 8 0 4
 HOSTNAME: c1lsc
 ADAPTER:  c1lsc_a0 12 0 4
+HOSTNAME: c1ioo
+ADAPTER:  c1ioo_a0 16 0 4
 
 # Here we define a socket adapter in single mode.
 #SOCKETADAPTER: sockad_0 SINGLE 0

--- /etc/dis/networkmanager.conf.bak    2014-04-17 09:30:40.000000000 -0700
+++ /etc/dis/networkmanager.conf    2014-04-17 09:30:48.000000000 -0700
@@ -39,7 +39,7 @@
 # Number of nodes in X Dimension. If you are using a single ring, please
 # specify number of nodes in ring.
 
--dimensionX 2;
+-dimensionX 3;
 
 # Number of nodes in Y Dimension.

I then had to restart the DIS network manager to see these changes take effect:

$ sudo /etc/init.d/dis_networkmgr restart

I then rebooted c1ioo one more time, after which c1ioo showed up in the dxadmin GUI.
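The two patches have to stay in step: each host added to dishosts.conf means one more node in the ring given by -dimensionX. A minimal sketch of a consistency check is shown below. It assumes a single ring, as in the config comment, and the sample text is reconstructed from the diff context (the `HOSTNAME: c1sus` line is inferred, since the diff only shows its ADAPTER line).

```python
def host_names(dishosts_text):
    """Return the HOSTNAME entries of a dishosts.conf-style text, skipping comments."""
    names = []
    for line in dishosts_text.splitlines():
        line = line.strip()
        if line.startswith("HOSTNAME:"):
            names.append(line.split(":", 1)[1].strip())
    return names


def dimension_x(networkmanager_text):
    """Return the integer after '-dimensionX', or None if it is not set."""
    for line in networkmanager_text.splitlines():
        parts = line.strip().rstrip(";").split()
        if len(parts) == 2 and parts[0] == "-dimensionX":
            return int(parts[1])
    return None


# Sample reconstructed from the patches above; the c1sus HOSTNAME line is inferred.
dishosts = """\
HOSTNAME: c1sus
ADAPTER:  c1sus_a0 8 0 4
HOSTNAME: c1lsc
ADAPTER:  c1lsc_a0 12 0 4
HOSTNAME: c1ioo
ADAPTER:  c1ioo_a0 16 0 4
"""
networkmanager = "-dimensionX 3;\n"

print(len(host_names(dishosts)) == dimension_x(networkmanager))
```

For the patched files this prints True: three hosts and a ring of three nodes.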

At this point I tried adding a dolphin IPC connection between c1als and c1lsc to see if it worked.  Unfortunately everything crashed every time I tried to run the models (including models on other machines!).  The problem was that I had forgotten to tell the c1ioo IOP (c1x03) to use PCIe RFM (i.e. Dolphin).  This is done by adding the following flag to the cdsParameters block in the IOP:

pciRfm=1

Once this was added and the IOP was rebuilt/installed/restarted, it came back up fine.  The c1als model with the dolphin output also came up fine.

However, at this point I ran into the c1ioo mx_stream problem and started backing everything out.

 

  9825   Thu Apr 17 17:15:54 2014   jamie   Update   CDS   mx_stream not starting on c1ioo

While trying to get dolphin working on c1ioo, the c1ioo mx_stream processes mysteriously stopped working.  The mx_stream process itself just won't start now.  I have no idea why, or what could have happened to cause this change.  I was working on PCIe dolphin stuff, but have since backed out everything that I had done, and still the c1ioo mx_stream process will not start.

mx_stream relies on the open-mx kernel module, but that appears to be fine:

controls@c1ioo ~ 0$ /opt/open-mx/bin/omx_info  
Open-MX version 1.3.901
 build: root@fb:/root/open-mx-1.3.901 Wed Feb 23 11:13:17 PST 2011

Found 1 boards (32 max) supporting 32 endpoints each:
 c1ioo:0 (board #0 name eth1 addr 00:14:4f:40:64:25)
   managed by driver 'e1000'
   attached to numa node 0

Peer table is ready, mapper is 00:30:48:d6:11:17
================================================
  0) 00:14:4f:40:64:25 c1ioo:0
  1) 00:30:48:d6:11:17 c1iscey:0
  2) 00:25:90:0d:75:bb c1sus:0
  3) 00:30:48:be:11:5d c1iscex:0
  4) 00:30:48:bf:69:4f c1lsc:0
controls@c1ioo ~ 0$ 

However, trying to start mx_stream now fails:

controls@c1ioo ~ 0$ /opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x03 c1ioo c1als -d fb:0
c1x03
mmapped address is 0x7f885f576000
mapped at 0x7f885f576000
send len = 263596
OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
mx_connect failed
controls@c1ioo ~ 1$ 

I'm not quite sure how to interpret this error message.  The "00:00:00:00:00:00" has the form of a 48-bit MAC address that would be used for a hardware identifier, a la the second column of the OMX "peer table" above, although of course all zeros is not an actual address.  So there's some disconnect between mx_stream and the actual omx configuration stuff that's running underneath.
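One thing that can be checked from the output alone is whether the destination given to mx_stream (`-d fb:0`) appears in the peer table at all. A minimal sketch that parses the table printed by omx_info follows; the row format is taken from the output above, and the function name `peers` is hypothetical.

```python
import re

# One peer-table row: index, MAC address, then "host:board".
PEER_ROW = re.compile(
    r"^\s*\d+\)\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+(\S+)\s*$", re.IGNORECASE
)


def peers(omx_info_output):
    """Map peer name (e.g. 'c1ioo:0') to MAC address from omx_info output."""
    table = {}
    for line in omx_info_output.splitlines():
        match = PEER_ROW.match(line)
        if match:
            mac, name = match.groups()
            table[name] = mac
    return table


# Peer table copied from the omx_info output above.
OUTPUT = """\
Peer table is ready, mapper is 00:30:48:d6:11:17
================================================
  0) 00:14:4f:40:64:25 c1ioo:0
  1) 00:30:48:d6:11:17 c1iscey:0
  2) 00:25:90:0d:75:bb c1sus:0
  3) 00:30:48:be:11:5d c1iscex:0
  4) 00:30:48:bf:69:4f c1lsc:0
"""

table = peers(OUTPUT)
print("fb:0" in table)  # prints False: the mx_stream destination is not a known peer
```

The table lists the five front ends but no entry for fb, which would be consistent with the "Peer Not Found in the Table" error.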

Again, I have no idea what happened.  I spoke to Rolf and he's going to try to help sort this out tomorrow.

Attachment 1: c1ioo_no_mx_stream.png
  9831   Fri Apr 18 19:05:17 2014   jamie   Update   CDS   mx_stream not starting on c1ioo

Quote:

To fix open-mx connection to c1ioo, had to restart the mx mapper on fb machine. Command is /opt/mx/sbin/mx_start_mapper, to be run as root. Once this was done, omx_info on c1ioo computer showed fb:0 in the table and mx_stream started back up on its own. 

Thanks so much Rolf (and Keith)!

  9881   Wed Apr 30 17:07:19 2014   jamie   Update   CDS   c1ioo now on Dolphin network

The c1ioo host is now fully on the dolphin network!

After the mx stream issue from two weeks ago was resolved and determined to not be due to the introduction of dolphin on c1ioo, I went ahead and re-installed the dolphin host adapter card on c1ioo.  The Dolphin network configuration changes I made during the first attempt (see previous log in thread) were still in place.  Once I rebooted the c1ioo machine, everything came up fine:

dolphin.png

We then tested the interface by making a cdsIPCx-PCIE connection between the c1ioo/c1als model and the c1lsc/c1lsc model for the ALS-X beat note fine phase signal.  We then locked both ALS X and Y, and compared the signals against the existing ALS-Y beat note phase connection that passes through c1sus/c1rfm via an RFM IPC:

ALSXonDolphin.pdf

The signal is perfectly coherent and we've gained ~25 degrees of phase at 1kHz.  EricQ calculates that the delay for this signal has changed from:

122 us -> 61 us 
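As a rough cross-check of these numbers, a pure time delay τ contributes a phase lag of 360° · f · τ at frequency f. A minimal sketch, treating the two paths as pure delays (an idealization) and using the 16384 Hz rate listed for these IPC channels:

```python
def phase_lag_deg(freq_hz, delay_s):
    """Phase lag in degrees of a pure time delay at a given frequency."""
    return 360.0 * freq_hz * delay_s


old_delay = 122e-6  # s, via c1sus/c1rfm (RFM IPC)
new_delay = 61e-6   # s, via Dolphin (PCIE IPC)

gain = phase_lag_deg(1000.0, old_delay) - phase_lag_deg(1000.0, new_delay)
cycle = 1.0 / 16384  # s, one model cycle at 16384 Hz

print(f"phase recovered at 1 kHz: {gain:.1f} deg")
print(f"one 16384 Hz cycle: {cycle * 1e6:.1f} us")
```

This gives about 22 degrees at 1 kHz, in the same range as the ~25 degrees read off the measurement, and shows that the 61 us saved is about one 16384 Hz cycle.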

I then went ahead and made the needed modifications for ALS-Y as well, and removed ALS->LSC stuff in the c1rfm model.

Next up: move the RFM card from the c1sus machine to the c1lsc machine, and eliminate c1sus/c1rfm model entirely.

  9882   Wed Apr 30 17:45:34 2014   jamie   Update   CDS   c1ioo now on Dolphin network

For reference, here are the new IPC entries that were made for the ALS X/Y phase between c1als and c1lsc:

controls@fb ~ 0$ egrep -A5 'C1:ALS-(X|Y)_PHASE' /opt/rtcds/caltech/c1/chans/ipc/C1.ipc
[C1:ALS-Y_PHASE]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=114
desc=Automatically generated by feCodeGen.pl on 2014_Apr_17_14:27:41
--
[C1:ALS-X_PHASE]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=115
desc=Automatically generated by feCodeGen.pl on 2014_Apr_17_14:28:53
controls@fb ~ 0$ 

After all this IPC cleanup is done we should go through and clean out all the defunct entries from the C1.ipc file.
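For that cleanup, the entries can be read with a standard INI parser and compared against a list of channels still in use. The following is a minimal sketch using the two entries shown above; the `defunct` helper and the `in_use` set are hypothetical, and a real in-use list would have to come from the models themselves.

```python
import configparser

# Two entries copied from the egrep output above.
IPC_TEXT = """\
[C1:ALS-Y_PHASE]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=114
desc=Automatically generated by feCodeGen.pl on 2014_Apr_17_14:27:41

[C1:ALS-X_PHASE]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=115
desc=Automatically generated by feCodeGen.pl on 2014_Apr_17_14:28:53
"""


def parse_ipc(text):
    """Parse C1.ipc-style text into {channel: {key: value}}."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case, e.g. ipcType
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def defunct(entries, in_use):
    """Return entry names that are not in the `in_use` set, in sorted order."""
    return sorted(name for name in entries if name not in in_use)


entries = parse_ipc(IPC_TEXT)
print(entries["C1:ALS-X_PHASE"]["ipcNum"])

# Hypothetical in-use set, for illustration only; both channels are in fact in use.
print(defunct(entries, in_use={"C1:ALS-X_PHASE"}))
```

Restricting the delimiter to `=` keeps the colons in the channel names and timestamps from being misread as key separators.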

  9883   Wed Apr 30 18:06:06 2014 jamieUpdateCDSPOP QPD signals now on dolphin

The POP QPD X/Y/SUM signals, which are acquired in c1ioo, are now being broadcast over dolphin.  c1ass was modified to pick them up there as well:

c1ioo-POPQPD.png  c1ass-POPQPD.png

Here are the new IPC entries:

controls@fb ~ 0$ egrep -A5 'C1:IOO-POP' /opt/rtcds/caltech/c1/chans/ipc/C1.ipc
[C1:IOO-POP_QPD_SUM]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=116
desc=Automatically generated by feCodeGen.pl on 2014_Apr_30_17:33:22
--
[C1:IOO-POP_QPD_X]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=117
desc=Automatically generated by feCodeGen.pl on 2014_Apr_30_17:33:22
--
[C1:IOO-POP_QPD_Y]
ipcType=PCIE
ipcRate=16384
ipcHost=c1ioo
ipcNum=118
desc=Automatically generated by feCodeGen.pl on 2014_Apr_30_17:33:22
controls@fb ~ 0$ 

Both c1ioo and c1ass were rebuilt/installed/restarted, and everything came up fine.

The corresponding cruft was removed from c1rfm, which was also rebuilt/installed/restarted.

  9890   Thu May 1 10:23:42 2014 jamieUpdateCDSc1ioo dolphin fiber nicely routed

Steve and I nicely routed the dolphin fiber from c1ioo in the 1X2 rack to the dolphin switch in the 1X4 rack.  I shut down c1ioo before removing the fiber, but all the dolphin-connected models still crashed.  After the fiber was run, I brought back c1ioo and restarted all wedged models.  Everything is green again:

green.png

  9903   Fri May 2 11:14:47 2014 jamieUpdateCDSc1ioo dolphin fiber nicely routed

Quote:

This C1IOO business seems to be wiping out the MC2_TRANS QPD servo settings each day.   What kind of BURT is being done to recover our settings after each of these activities?

(also we had to do mxstream restart on c1sus twice so far tonight -- not unusual, just keeping track)

I don't see how the work I did would affect this stuff, but I'll look into it.  I didn't touch the MC2 trans QPD signals.  Also nothing I did has anything to do with BURT.  I didn't change any channels, I only swapped out the IPCs.

  9910   Mon May 5 19:34:54 2014 jamieUpdateCDSc1ioo/c1ioo control output IPCs changed to PCIE Dolphin

Now that c1ioo is on the Dolphin network, I changed the c1ioo MC{1,2,3}_{PIT,YAW} and MC{L,F} outputs to go out over the Dolphin network rather than the old RFM network.

Two models, c1mcs and c1oaf, are ultimately the consumers of these outputs.  Now they are picking up the new PCIE IPC channels directly, rather than from any sort of RFM/PCIE proxy hops.  This should improve the phase for these channels a bit, as well as reduce complexity and clutter.  More stuff was removed from c1rfm as well, moving us toward the goal of getting rid of that model entirely.

c1ioo, c1mcs, and c1rfm were all rebuilt/installed/restarted, and all came back fine.  The mode cleaner relocked once we reenabled the autolocker.

c1oaf, on the other hand, is not building.  It's not building even before the changes I attempted, though.  I tried reverting c1oaf back to what is in the SVN (which also corresponds to what is currently running) and it doesn't compile either:

controls@c1lsc ~ 2$ rtcds build c1oaf
buildd: /opt/rtcds/caltech/c1/rtbuild
### building c1oaf...
Cleaning c1oaf...
Done
Parsing the model c1oaf...
YARM_BLRMS_SEIS_CLASS TP
YARM_BLRMS_SEIS_CLASS_EQ TP
YARM_BLRMS_SEIS_CLASS_QUIET TP
YARM_BLRMS_SEIS_CLASS_TRUCK TP
YARM_BLRMS_S_CLASS EpicsOut
YARM_BLRMS_S_CLASS_EQ EpicsOut
YARM_BLRMS_S_CLASS_QUIET EpicsOut
YARM_BLRMS_S_CLASS_TRUCK EpicsOut
YARM_BLRMS_classify_seismic FunctionCall
Please check the model for missing links around these parts.
make[1]: *** [c1oaf] Error 1
make: *** [c1oaf] Error 1
controls@c1lsc ~ 2$ 

I've been trying to debug it but have had no success.  For the time being I'm shutting off the c1oaf model, since it's now looking for bogus signals on RFM, until we can figure out what's wrong with it. 

Attachment 1: ioo-ipc.png
ioo-ipc.png
  9911   Mon May 5 19:51:56 2014 jamieUpdateCDSc1oaf model broken because of broken BLRMS block

I finally tracked down the problem with the c1oaf model to the BLRMS part:

/opt/rtcds/userapps/release/cds/common/models/BLRMS.mdl

blrms-hot-mess.png  sddefault.jpg

Note that this is pulling from a cds/common location, so presumably this is a part that's also being used at the sites.

Either there was an svn up that pulled in something new and broken, or the local version is broken, or who knows what.

We'll have to figure out what's going on here, but in the meantime, as I already mentioned, I'm leaving the c1oaf model off for now.

 RXA: also...we updated Ottavia to Ubuntu 12 LTS...but now it has no working network connection. Needs help.  (which of course has nothing whatsoever to do with this point )

  9916   Tue May 6 10:31:58 2014 jamieUpdateCDSc1ioo dolphin fiber

Quote:

I put label  at the dolphin fiber end at 1X2 today.   After this I had to reset it, but it failed.

 If by "fail" you're talking about the c1oaf model being off-line, I did that yesterday (see log 9910).  That probably has nothing to do with whatever you did today, Steve.

  9922   Wed May 7 16:31:12 2014 jamieUpdateCDScdsutils updated to version 226
controls@pianosa:~ 0$ cd /opt/rtcds/cdsutils/trunk/
controls@pianosa:/opt/rtcds/cdsutils/trunk 0$ svn update
...
At revision 226.
controls@pianosa:/opt/rtcds/cdsutils/trunk 0$ make
echo "__version__ = '226'" >lib/cdsutils/_version.py
echo "__version__ = '226'" >lib/ezca/_version.py
...
controls@pianosa:/opt/rtcds/cdsutils/trunk 0$ make ligo-install
python ./setup.py install --prefix=/ligo/apps/linux-x86_64/cdsutils-226
...
controls@pianosa:/opt/rtcds/cdsutils/trunk 0$ ln -sfn cdsutils-226 /ligo/apps/linux-x86_64/cdsutils
controls@pianosa:/opt/rtcds/cdsutils/trunk 0$ exit
...
controls@pianosa:~ 0$ cdsutils --version
cdsutils 226
controls@pianosa:~ 0$ 

  9926   Wed May 7 23:30:21 2014 jamieUpdateCDScdsutils should be working now

Should be fixed now.  There were python2.6 compatibility issues, which only show up on these old distros (e.g. ubuntu 10.04).

controls@pianosa:~ 0$ cdsutils read C1:LSC-DARM_GAIN
0.0
controls@pianosa:~ 0$ cdsutils --version
cdsutils 230
controls@pianosa:~ 0$ 
  9931   Thu May 8 15:55:43 2014 jamieUpdateCDSpython issues

Quote:

On pianosa: The ezca.Ezca class somehow initializes with its prefix set to "C1:", even though the docstring says the default is None. This makes existing scripts act wonky, because they're looking for channels like "C1:C1:FO-BLAH".

In ligo/apps/linux-x86_64, I ran ln -sfn cdsutils-old cdsutils to get the old version back for now, so I don't have to edit all of our up/down scripts.

Also, Chiara can't find the epics package when I try to load Ezca. It exists in '/usr/lib/pymodules/python2.6/epics/__init__.pyc' on pianosa, but there is no corresponding 2.7 folder on chiara.

I just pushed a fix to ezca to allow for having a truly empty prefix even if the IFO env var is set:

controls@pianosa:~ 0$ ipython
Python 2.6.5 (r265:79063, Feb 27 2014, 19:43:51) 
Type "copyright", "credits" or "license" for more information.

IPython 0.10 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: import ezca

In [2]: ezca.Ezca()
Out[2]: Ezca(prefix='C1:')

In [3]: ezca.Ezca(ifo=None)
Out[3]: Ezca(prefix='')

In [4]: ezca.Ezca(ifo=None).read('C1:LSC-DARM_GAIN')
Out[4]: 0.0

This is in cdsutils r232, which I just installed at the 40m.  I linked it in as well, so it's now the default version.  You will have to make a modification to any python scripts utilizing the Ezca object, but now it's a much smaller change (just in the invocation line):

-ca = ezca.Ezca()
+ca = ezca.Ezca(ifo=None)
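
To make the double-prefix symptom concrete, here is a toy model of the behavior shown in the IPython session above.  It is not the ezca implementation: channel_prefix and full_name are hypothetical stand-ins, and the only things taken from the session are that no argument yields the prefix 'C1:' when the IFO env var is set and that ifo=None yields an empty prefix.

```python
_UNSET = object()  # sentinel: distinguishes "argument not given" from ifo=None

def channel_prefix(ifo=_UNSET, env_ifo=None):
    """Toy model: fall back to the environment's IFO only if ifo is not given."""
    if ifo is _UNSET:
        ifo = env_ifo
    return ifo + ":" if ifo else ""

def full_name(prefix, channel):
    """Channel name that would actually be looked up."""
    return prefix + channel

# Script that already spells out the full channel name, with IFO=C1 in the env.
doubled = full_name(channel_prefix(env_ifo="C1"), "C1:LSC-DARM_GAIN")
fixed = full_name(channel_prefix(ifo=None, env_ifo="C1"), "C1:LSC-DARM_GAIN")

print(doubled)  # prefix applied on top of an already-complete name
print(fixed)    # explicit ifo=None leaves the name untouched
```

Scripts that pass short names (without the leading "C1:") are the case the default prefix is meant for; scripts that pass full names need the ifo=None form.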

 

  10155   Tue Jul 8 17:59:12 2014 jamieOmnistructureElectronicsJamie 1811 power supply fixed!

I finally made good on the LIFE TIME WARRANTY on the ancient, Jamie-made 1811 power supply with the faulty switch:

20140708_165010.jpg

Back to fully working form.  Hopefully I'll still be around the next time it breaks in 16 years.

  10156   Tue Jul 8 18:20:15 2014 jamieOmnistructureElectronicsJamie 1811 power supply fixed!

 Placed in PD cabinet in Y arm, next to the OTHER Jamie-made 1811 power supply from 1998.

  10177   Thu Jul 10 17:33:26 2014 jamieOmnistructureComputer Scripts / Programsubuntu12 software installed, gds 2.16.3.2

Rana wanted the latest GDS installed (for newest DTT), so I made an ubuntu 12 install directory into which I installed

  • gds-2.16.3.2
  • root_v5.34.03

I installed this stuff in

/ligo/apps/ubuntu12

which is the "official" location for stuff compiled specifically for ubuntu12.

Given that the workstations are diverging in OS (some ubuntu10, some ubuntu12), we're going to have to start supporting different software packages for the different versions, thus the new ubuntu12 directory.  This will be a pain in the butt, and will certainly lead to different versions of things for different machines, different features, etc.  We should really try to keep things at the same OS.

In any event, if you want to enable the GDS on an ubuntu 12 machine, source the ubuntu12 ligoapps-user-env.sh file:

controls@ottavia|~ > . /ligo/apps/ubuntu12/ligoapps-user-env.sh

  10426   Fri Aug 22 18:00:08 2014 jamieOmnistructureCDSubuntu12 awgstream installed

I installed awgstream-2.16.14 in /ligo/apps/ubuntu12.  As with all the ubuntu12 "packages", you need to source the ubuntu12 ligoapps environment script:

controls@pianosa|~ > . /ligo/apps/ubuntu12/ligoapps-user-env.sh
controls@pianosa|~ > which awgstream
/ligo/apps/ubuntu12/awgstream-2.16.14/bin/awgstream
controls@pianosa|~ > 

I tested it on the SRM LSC filter bank.  In one terminal I ran camonitor on C1:SUS-SRM_LSC_OUTMON.  In another terminal I ran the following:

controls@pianosa|~ > seq 0 .1 16384  | awgstream C1:SUS-SRM_LSC_EXC 16384 -
Channel = C1:SUS-SRM_LSC_EXC
File    = -
Scale   =          1.000000
Start   = 1092790384.000000
controls@pianosa|~ > 

The camonitor output was:

controls@pianosa|~ > camonitor C1:SUS-SRM_LSC_OUTMON
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:44:50.997418 0  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:49.155525 218.8  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:49.393404 628.4  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:49.629822 935.6  
...
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:58.210810 15066.8  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:58.489501 15476.4  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:58.747095 15886  
C1:SUS-SRM_LSC_OUTMON          2014-08-22 17:52:59.011415 0 

In other words, it seems to work.
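
For reference, the expected shape of that excitation can be worked out from the command line alone.  This is just arithmetic on the seq arguments and the 16384 rate passed to awgstream; it assumes seq emits both endpoints.

```python
RATE_HZ = 16384.0   # sample rate given to awgstream
STEP = 0.1          # increment used by seq
LAST = 16384        # final value emitted by seq

# seq 0 .1 16384 emits 0, 0.1, ..., 16384 (both endpoints included).
n_samples = int(round(LAST / STEP)) + 1
duration_s = n_samples / RATE_HZ
slope_per_s = STEP * RATE_HZ   # ramp rate in counts per second

print("samples:  %d" % n_samples)
print("duration: %.3f s" % duration_s)
print("slope:    %.1f counts/s" % slope_per_s)
```

That is a linear ramp lasting about 10 s at roughly 1638 counts/s, which matches the camonitor trace: the output climbs from 17:52:49 until it drops back to 0 at 17:52:59.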

  10478   Tue Sep 9 14:25:46 2014 jamieUpdateLSCFiguring out where to do DARM->AS55

Quote:

I have made a rudimentary lockloss plotting script, that I have put in ..../scripts/LSC/LockLossData, but I'm not satisfied with it yet.  Somehow it's not catching the lockloss, even though it's supposed to run when the ALS watch/down scripts run.  I'll need to look into this when I'm not so sleepy.

We developed a fairly sophisticated lockloss script at the sites, which you could try using as well.  It's at:

USERAPPS/sys/common/scripts/lockloss

It requires a reasonably up-to-date install of cdsutils, and the tconvert utility.  It uses guardian at the sites to determine when locklosses happen, but you can use it without guardian by just feeding it a specific time to plot.  It also accepts a list of channels to plot, one per line.

  10623   Fri Oct 17 15:17:31 2014 jamieUpdateCDSDaqd "fixed"?

I very tentatively declare that this particular daqd crapfest is "resolved" after Jenne rebooted fb and daqd has been running for about 40 minutes now without crapping itself.  Wee hoo.

I spent a while yesterday trying to figure out what could have been going on.  I couldn't find anything.  I found an elog that said a previous daqd crapfest was finally only resolved by rebooting fb after a similar situation, i.e. there had been an issue that was resolved, daqd was still crapping itself, we couldn't figure out why so we just rebooted, daqd started working again.

So, in summary, totally unclear what the issue was, or why a reboot solved it, but there you go.

  10624   Fri Oct 17 16:54:11 2014 jamieUpdateCDSDaqd "fixed"?

Quote:

I very tentatively declare that this particular daqd crapfest is "resolved" after Jenne rebooted fb and daqd has been running for about 40 minutes now without crapping itself.  Wee hoo.

I spent a while yesterday trying to figure out what could have been going on.  I couldn't find anything.  I found an elog that said a previous daqd crapfest was finally only resolved by rebooting fb after a similar situation, i.e. there had been an issue that was resolved, daqd was still crapping itself, we couldn't figure out why so we just rebooted, daqd started working again.

So, in summary, totally unclear what the issue was, or why a reboot solved it, but there you go.

Looks like I spoke too soon.  daqd seems to be crapping itself again:

controls@fb /opt/rtcds/caltech/c1/target/fb 0$ ls -ltr logs/old/ | tail -n 20
-rw-r--r-- 1 4294967294 4294967294    11244 Oct 17 11:34 daqd.log.1413570846
-rw-r--r-- 1 4294967294 4294967294    11086 Oct 17 11:36 daqd.log.1413570988
-rw-r--r-- 1 4294967294 4294967294    11244 Oct 17 11:38 daqd.log.1413571087
-rw-r--r-- 1 4294967294 4294967294    13377 Oct 17 11:43 daqd.log.1413571386
-rw-r--r-- 1 4294967294 4294967294    11481 Oct 17 11:45 daqd.log.1413571519
-rw-r--r-- 1 4294967294 4294967294    11985 Oct 17 11:47 daqd.log.1413571655
-rw-r--r-- 1 4294967294 4294967294    13219 Oct 17 13:00 daqd.log.1413576037
-rw-r--r-- 1 4294967294 4294967294    11150 Oct 17 14:00 daqd.log.1413579614
-rw-r--r-- 1 4294967294 4294967294     5127 Oct 17 14:07 daqd.log.1413580231
-rw-r--r-- 1 4294967294 4294967294    11165 Oct 17 14:13 daqd.log.1413580397
-rw-r--r-- 1 4294967294 4294967294     5440 Oct 17 14:20 daqd.log.1413580845
-rw-r--r-- 1 4294967294 4294967294    11352 Oct 17 14:25 daqd.log.1413581103
-rw-r--r-- 1 4294967294 4294967294    11359 Oct 17 14:28 daqd.log.1413581311
-rw-r--r-- 1 4294967294 4294967294    11195 Oct 17 14:31 daqd.log.1413581470
-rw-r--r-- 1 4294967294 4294967294    10852 Oct 17 15:45 daqd.log.1413585932
-rw-r--r-- 1 4294967294 4294967294    12696 Oct 17 16:00 daqd.log.1413586831
-rw-r--r-- 1 4294967294 4294967294    11086 Oct 17 16:02 daqd.log.1413586924
-rw-r--r-- 1 4294967294 4294967294    11165 Oct 17 16:05 daqd.log.1413587101
-rw-r--r-- 1 4294967294 4294967294    11086 Oct 17 16:21 daqd.log.1413588108
-rw-r--r-- 1 4294967294 4294967294    11097 Oct 17 16:25 daqd.log.1413588301
controls@fb /opt/rtcds/caltech/c1/target/fb 0$

The times all indicate when the daqd log was rotated, which happens every time the process restarts.  The restarts aren't happening at regular intervals, though: it's been 30 minutes since the last one.  I wonder if it is somehow correlated with actual interaction with the NDS process.  Does some sort of data request cause it to crash?
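
The numeric suffix on each rotated log appears to be a Unix timestamp (1413570846 is 2014-10-17 11:34:06 PDT, matching that file's mtime), so the spacing between restarts can be pulled straight from the file names.  A minimal sketch under that assumption, using the last six names from the listing above:

```python
# Last six rotated logs from the listing above.
LOGS = [
    "daqd.log.1413585932",
    "daqd.log.1413586831",
    "daqd.log.1413586924",
    "daqd.log.1413587101",
    "daqd.log.1413588108",
    "daqd.log.1413588301",
]

def restart_gaps_minutes(names):
    """Minutes between consecutive restarts, from the timestamp suffixes."""
    stamps = sorted(int(name.rsplit(".", 1)[1]) for name in names)
    return [(later - earlier) / 60.0
            for earlier, later in zip(stamps, stamps[1:])]

gaps = restart_gaps_minutes(LOGS)
for gap in gaps:
    print("%.1f min" % gap)
```

For these six logs the gaps range from about a minute and a half to about 17 minutes, which is the irregularity noted above.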

 

  10628   Tue Oct 21 17:44:28 2014 jamieOmnistructureComputer Scripts / Programsnew version of cdsutils (351) installed

I just installed cdsutils r351 at /ligo/apps/linux-x86_64/cdsutils.  It should be available on all workstations.

It includes a bunch of bug fixes and feature improvements, including the step stuff that Rana was complaining about.

  10854   Mon Jan 5 20:17:26 2015 jamieConfigurationCDSGDS upgraded to 2.16.14

I upgraded the GDS and ROOT installations in /ligo/apps/ubuntu12 on the control room workstations:

  • GDS 2.16.14
  • ROOT 5.34.18 (dependency of GDS)

My cursory tests indicate that they seem to be working:

2015-01-05-200025_1694x996_scrot.png

Now that the control room environment has become somewhat uniform at Ubuntu 12, I modified the /ligo/cdscfg/workstationrc.sh file to source the ubuntu12 configuration:

controls@nodus|apps > cat /ligo/cdscfg/workstationrc.sh
# CDS WORKSTATION ENVIRONMENT
source /ligo/apps/ligoapps-user-env.sh
source /ligo/apps/ubuntu12/ligoapps-user-env.sh
source /opt/rtcds/rtcds-user-env.sh
controls@nodus|apps > 

This should make all the newer versions available everywhere on login.

  10878   Thu Jan 8 09:24:40 2015 jamieUpdateComputer Scripts / ProgramsELOG 3.0
Quote:

I've installed the very fresh ELOG 3.0, for nothing else than the new built in text editor which has a LATEX capable equation editor built right in. 

Check out this sweet limerick: 

\int_{1}^{\sqrt[3]{3}}t^2 dt\, \textbf{cos}(\frac{3\pi}{9}) = \textbf{ln}(\sqrt[3]{e})

\int \omega \epsilon \varepsilon \Gamma

  10906   Thu Jan 15 18:10:19 2015 jamieUpdateComputer Scripts / ProgramsInstalled kerberos on Rossa
Quote:

I have installed kerberos on Rossa, so that I don't have to type my name and password every time I do an svn checkin, since I'm making some modifications and want to be sure that everything is checked in before and afterwards. 

I ran sudo apt-get install krb5-user.  I didn't put in a default_realm when it prompted me to during install, so I went into the /etc/krb5.conf file and changed the default_realm line to read default_realm = LIGO.ORG

Now we can use kinit, but we must (as usual) remember to kdestroy our credentials when we're done.

As a reminder, to use:

> kinit albert.einstein

Password for albert.einstein@LIGO.ORG: (type your pw here)

When you're finished, run

> kdestroy

The end.

WARNING: since the workstations are all shared user, if you forget to kdestroy the next user can commit under your user ID.  It might be good to set the timeout to be something much shorter than 24 hours, like maybe 1, or 2.

  11077   Thu Feb 26 13:55:59 2015 jamieUpdateComputer Scripts / ProgramsFB IO load
We should use "ionice" to throttle the rsync. Use something like "ionice -c 3 rsync ..." to set the priority such that the rsync process will only work when there is no other IO contention. See "man ionice" for other options.
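
If the rsync is launched from a script, the same idea can be sketched as a small wrapper.  The rsync arguments below are placeholders, not the actual backup command, and how strictly the idle class is honored depends on the kernel's IO scheduler.

```python
def idle_io_command(cmd):
    """Prefix a command with ionice's idle scheduling class (-c 3)."""
    return ["ionice", "-c", "3"] + list(cmd)

# Placeholder rsync invocation; substitute the real source and destination.
rsync_cmd = ["rsync", "-a", "/path/to/source/", "/path/to/dest/"]
wrapped = idle_io_command(rsync_cmd)

print(" ".join(wrapped))
# To actually run it: subprocess.check_call(wrapped)
```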
  11409   Tue Jul 14 11:57:27 2015 jamieSummaryCDSCDS upgrade: left running in semi-stable configuration
Quote:

There remains a pattern to some of the restarts, the following times are all reported as restart times. (There are others in between, however.)

daqd: Tue Jul 14 00:02:48 PDT 2015
daqd: Tue Jul 14 01:02:32 PDT 2015
daqd: Tue Jul 14 03:02:33 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015
daqd: Tue Jul 14 10:02:03 PDT 2015

Before the upgrade, we suffered from hourly crashes too:

daqd_start Sun Jun 21 00:01:06 PDT 2015
daqd_start Sun Jun 21 01:03:47 PDT 2015
daqd_start Sun Jun 21 02:04:04 PDT 2015
daqd_start Sun Jun 21 03:04:35 PDT 2015
daqd_start Sun Jun 21 04:04:04 PDT 2015
daqd_start Sun Jun 21 05:03:45 PDT 2015
daqd_start Sun Jun 21 06:02:43 PDT 2015
daqd_start Sun Jun 21 07:04:42 PDT 2015
daqd_start Sun Jun 21 08:04:34 PDT 2015
daqd_start Sun Jun 21 09:03:30 PDT 2015
daqd_start Sun Jun 21 10:04:11 PDT 2015

So, this isn't necessarily new behavior, just something that remains unfixed. 

That's interesting, that we're still seeing those hourly crashes.

We're not writing out the full set of channels, though, and we're getting more failures than just those at the hour, so we're still suffering.
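
One way to separate the hourly crashes from the rest is to flag restarts that land within a few minutes after the top of the hour.  A rough sketch using the post-upgrade times quoted above; the 5-minute window and the helper names are arbitrary choices for illustration.

```python
import re

# Post-upgrade restart times quoted above.
RESTARTS = """\
daqd: Tue Jul 14 00:02:48 PDT 2015
daqd: Tue Jul 14 01:02:32 PDT 2015
daqd: Tue Jul 14 03:02:33 PDT 2015
daqd: Tue Jul 14 05:02:46 PDT 2015
daqd: Tue Jul 14 06:01:57 PDT 2015
daqd: Tue Jul 14 07:02:19 PDT 2015
daqd: Tue Jul 14 08:02:44 PDT 2015
daqd: Tue Jul 14 09:02:24 PDT 2015
daqd: Tue Jul 14 10:02:03 PDT 2015
""".splitlines()

def seconds_past_hour(line):
    """Seconds since the top of the hour for the HH:MM:SS in a log line."""
    match = re.search(r"\b\d{2}:(\d{2}):(\d{2})\b", line)
    if match is None:
        return None
    return int(match.group(1)) * 60 + int(match.group(2))

def near_top_of_hour(lines, window_s=300):
    """Lines whose timestamp falls within window_s after the hour."""
    flagged = []
    for line in lines:
        offset = seconds_past_hour(line)
        if offset is not None and offset <= window_s:
            flagged.append(line)
    return flagged

hourly = near_top_of_hour(RESTARTS)
offsets = [seconds_past_hour(line) for line in RESTARTS]
print("%d of %d restarts within 5 min of the hour" % (len(hourly), len(RESTARTS)))
print("offsets (s): min %d, max %d" % (min(offsets), max(offsets)))
```

Running the same filter over the full restart log, rather than this pre-selected list, would show how many of the failures fall outside the hourly pattern.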

  11410   Tue Jul 14 13:55:28 2015 jamieUpdateCDSrunning test on daqd, please leave undisturbed

I'm running a test with daqd right now, so please do not disturb for the moment.

I'm temporarily writing frames into a tmpfs, which is a filesystem that exists purely in memory.  There should be ZERO IO contention for this filesystem, so if the daqd failures are due to IO then all problems should disappear.  If they don't, then we're dealing with some other problem.

There will be no data saved during this period.
