40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 91 of 341  Not logged in ELOG logo
ID Date Author Type Categoryup Subject
  6801   Tue Jun 12 11:25:05 2012 JamieUpdateComputersrtcds: command not found

Quote:

We can't compile any changes to the LSC or the GCV models since Jamie's new script / program isn't found.  I don't know where it is (I can't find it either), so I can't do the compiling by hand, or point explicitly to the script.  The old way of compiling models in the wiki is obsolete, and didn't work :(

Sorry about that.  I had modified the path environment that pointed to the rtcds util.  The rtcds util is now in /opt/rtcds/caltech/c1/scripts/rtcds, which is in the path.  Starting a new shell should make it available again.

  6807   Tue Jun 12 17:46:09 2012 JenneUpdateComputersrtcds: command found

Quote:

Quote:

We can't compile any changes to the LSC or the GCV models since Jamie's new script / program isn't found.  I don't know where it is (I can't find it either), so I can't do the compiling by hand, or point explicitly to the script.  The old way of compiling models in the wiki is obsolete, and didn't work :(

Sorry about that.  I had modified the path environment that pointed to the rtcds util.  The rtcds util is now in /opt/rtcds/caltech/c1/scripts/rtcds, which is in the path.  Starting a new shell should make it available again.

 Added TRX and TRY and POY11_I_ERR and POX11_I_ERR to the c1lsc.mdl using a new-style DAQ Channels block, recompiled, installed, started the model, all good.  Restarted the daqd on the framebuilder, and everything is green.  I can go back and get recorded data using dataviewer (for the last few minutes since I started fb), so it all looks good.

Note on the new DAQ Channels block:  Put the text block (from CDS_PARTS) at the same level as the channel you want to save, and name it exactly as it is in the model.  The code-generator will add the _DQ for you.  i.e. if you define a channel "TRY_OUT_DQ" in the lsc model, you'll end up with a channel "C1:LSC-TRY_OUT_DQ_DQ".

  6830   Mon Jun 18 17:28:03 2012 yutaSummaryComputersbugs in CDS_PARTS/simLinkParts/Fcn

Fcn module in CDS_PARTS is used to include a user defined function in a model.
We should be able to use this by entering desired function, but I found some bugs.

BUG1: Fcn doen't work without ";"

If you put ";" after the function, we can compile.

 sin(u[1]);

But if you put without ";", like

 sin(u[1])

you get the following error message when compiling.

controls@c1ioo
~ 0$ rtcds make c1gcv
### building c1gcv...
Cleaning c1gcv...
Done
Parsing the model c1gcv...
Done
Building EPICS sequencers...
Done
Building front-end Linux kernel module c1gcv...
echo >> target/c1gcvepics/README.making_changes
echo 'Built on date' `date` >> target/c1gcvepics/README.making_changes
make[1]: Leaving directory `/opt/rtcds/caltech/c1/rtbuild'

make[1]: Entering directory `/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv'
make -C /lib/modules/2.6.34.1/build SUBDIRS=/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv modules
make[2]: Entering directory `/usr/src/linux-2.6.34.1-cs'
  CC [M]  /opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv/c1gcv.o
make[2]: Leaving directory `/usr/src/linux-2.6.34.1-cs'
make[1]: Leaving directory `/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv'
/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv/c1gcv.c: In function 'feCode':
/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv/c1gcv.c:615: error: expected expression before ';' token
make[3]: *** [/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv/c1gcv.o] Error 1
make[2]: *** [_module_/opt/rtcds/caltech/c1/rtbuild/src/fe/c1gcv] Error 2
make[1]: *** [default] Error 2
make: *** [c1gcv] Error 1


BUG2: sindeg doesn't work properly

sindeg should work as cosine with input in degrees.
I made a simple model to test this(below).
model_sindegbug.png


Output of the filter module C1:ALS-BEATY_FINE_PHASE goes to "PHASE_in"
sindeg of this goes to C1:ALS-BEATY_FINE_I_ERR
cosdeg of this goes to C1:ALS-BEATY_FINE_Q_ERR

If you sweep the phase input, you should get sin and cos, but you get the following.
cosdeg (C1:ALS-BEATY_FINE_Q_ERR) looks OK, but sindeg (C1:ALS-BEATY_FINE_I_ERR) looks funny. It looks like ~20000 is its period.

dv_sindegbug.png

  6854   Fri Jun 22 13:37:17 2012 JenneUpdateComputersfb lost connection

...Perhaps related to the fact that Jamie is copying a lot of stuff over the network to back up Ottavia before converting her to Ubuntu, perhaps totally independent. 

After restarting the daqd, c1lsc was the only computer whose mx_stream came up on its own.  I restarted c1sus. c1ioo, c1iscey, c1iscex by hand.

  6856   Fri Jun 22 19:52:47 2012 JamieUpdateComputersottavia reconfigured as CDS workstation

ottavia has been reinstalled with Ubuntu 10.04 LTS, and has been configured as a CDS workstation.

I have been maintaining a script that takes a stock 10.04 install and configures it as a workstation.  I've attached it here, but it lives at:

/users/controls/workstation-setup.sh

The script is designed to be idempotent, i.e. it can be run on a machine that has already been configured and it will either have no affect or update.

Attachment 1: workstation-setup.sh
#!/bin/bash

# This is a CDS workstation setup script for Ubuntu 10.04 It is
# idempotent, so running it on an already configured workstation is
# fine.

########################################
### controls

if [[ $(id -u controls) != 1001 ]] ; then
... 111 more lines ...
  6861   Sat Jun 23 19:57:22 2012 yutaSummaryComputersc1ioo is down

I tried to restart c1ioo becuase I can't live without him.

I couldn't ssh or ping c1ioo, so I did hardware reboot.
c1ioo came back, but now ADC/DAC stats are all red.

c1ioo was OK until 3am when I left the control room last night. I don't know what happened, but StripTool from zita tells me that MC lock went off at around 4pm.

  6865   Mon Jun 25 10:35:59 2012 JenneSummaryComputersc1ioo is down

Quote:

I tried to restart c1ioo becuase I can't live without him.

I couldn't ssh or ping c1ioo, so I did hardware reboot.
c1ioo came back, but now ADC/DAC stats are all red.

c1ioo was OK until 3am when I left the control room last night. I don't know what happened, but StripTool from zita tells me that MC lock went off at around 4pm.

 c1ioo was still all red on the CDS status screen, so I tried a couple of things.

mxstreamrestart (which aliases on the front ends to sudo /etc/init.d/mx_stream restart) didn't help

sudo shutdown -r now didn't change anything either....c1ioo came back with red everywhere and 0x2bad on the IOP

eventually doing as Jamie did for c1sus in elog 6742, rtcds stop all, then rtcds start all fixed everything.  Interestingly, when I tried rtcds start iop, I got the error
Cannot start/stop model 'iop' on host c1ioo, so I just tried rtcds start all, and that worked fine....started with c1x03, then c1ioo, then c1gcv.

  6866   Mon Jun 25 11:23:14 2012 JenneUpdateComputerstdsavg not working

Quote:

LSCoffsets script, and any others depending on tdsavg will not work until this is fixed.

 LSCoffsets is working again. 

tdsavg (now, but didn't used to) needs "LIGONDSIP=fb" to be specified.  Jamie just put this in the global environment, so tdsavg should just work like normal again.

Also, the rest of the LSCoffsets script (really the subcommand offset2) was tsch syntax, so I created offset3 which is bash syntax.

Now we can use LSCoffsets again.

  6867   Mon Jun 25 11:27:13 2012 JamieSummaryComputersc1ioo is down

Quote:

Quote:

I tried to restart c1ioo becuase I can't live without him.

I couldn't ssh or ping c1ioo, so I did hardware reboot.
c1ioo came back, but now ADC/DAC stats are all red.

c1ioo was OK until 3am when I left the control room last night. I don't know what happened, but StripTool from zita tells me that MC lock went off at around 4pm.

 c1ioo was still all red on the CDS status screen, so I tried a couple of things.

mxstreamrestart (which aliases on the front ends to sudo /etc/init.d/mx_stream restart) didn't help

sudo shutdown -r now didn't change anything either....c1ioo came back with red everywhere and 0x2bad on the IOP

eventually doing as Jamie did for c1sus in elog 6742, rtcds stop all, then rtcds start all fixed everything.  Interestingly, when I tried rtcds start iop, I got the error
Cannot start/stop model 'iop' on host c1ioo, so I just tried rtcds start all, and that worked fine....started with c1x03, then c1ioo, then c1gcv.

There seems to be a problem with how models are coming up on boot since the upgrade.  I think the IOP isn't coming up correctly for some reason, which is then preventing the rest of the models from starting since they depend on the IOP.

The simple way to fix this is to run the following:

ssh c1ioo
rtcds restart all

The "restart" command does the smart thing, by stopping all the models in the correct order (IOP last) and then restarting them also in the correct order (IOP first).

  6912   Wed Jul 4 18:25:44 2012 ZachUpdateComputersNDS2 client now working on Ubuntu machines

After plenty of work, NDS2 can now be used to get site data within MATLAB using the following machines:

  • allegra
  • megatron
  • ottavia
  • pianosa
  • rosalba
  • rossa

What I did

NDS2 was not working on any of the machines, so the first thing I did was simply to install the newest version. I downloaded the latest tarball (0.9.1) from the LDAS Wiki, unzipped and installed it

/users/zach $ tar -xvf nds2-client-0.9.1.tar

/users/zach $ cd nds2-client-0.9.1

/users/zach $ sudo ./configure --prefix=/cvs/cds/caltech/apps/linux64 --with-matlab=/cvs/cds/caltech/apps/linux64/matlab/bin/matlab

/users/zach $ sudo make

/users/zach $ sudo make install

 

Even with the new version, it still didn't work.

Solution: The main problem was that the cyrus-sasl-gssapi authentication protocol was not installed on these machines, so that even with a kerberos ticket the datalink could not be established. Using information from the LDAS Wiki, I used aptitude to install it as:

$ sudo aptitude install lscsoft-auth

This group installs both the SASL protocol and the package python-kerberos

 

I also needed to update the kerberos config file for each machine, which is located at /etc/krb5.conf. I found that ottavia had a nice one with many realms, so I copied that one over to the other machines. In any case where there was an old config file overwritten, it is now /etc/krb5.conf.old.

Finally, the matlab path for NDS2 was still set to the old 2010a directory (/cvs/cds/caltech/apps/linux64/lib/matlab2010a) that was created by the NDS2 install when Rana originally did it. The new install I made above created the appropriate 2010b mexa64 files, so I changed the matlab path within matlab to this one:

>> rmpath /cvs/cds/caltech/apps/linux64/lib/matlab2010a

>> addpath /cvs/cds/caltech/apps/linux64/lib/matlab2010b

>> savepath

 

Now everything works fine on all these machines. As in Rana's original post, you get data in the following way:

$ kinit albert.einstein %then enter password

$ matlab -nosplash -nodesktop

>> d = NDS2_GetData({'H1:LSC-NPTRX_OUT16.mean'},963968415,6000,'nds.ligo.caltech.edu:31200')

 

d = 


            name: 'H1:LSC-NPTRX_OUT16.mean' 

            chan_type: 'm-trend'             

            rate: 0.0167       

            data_type: 'real_8'     

            signal_gain: 1   

            signal_offset: 0     

            signal_slope: 1     

            signal_units: ''   

            start_gps_sec: 963968415     

            duration_sec: 6000             

            data: [100x1 double]           

            exists: 1

 

>> quit % since you've seen that the data is really here

$ kdestroy % so that no one uses your credentials

 

Some thoughts

  • I would like to extend this to the 32-bit machines, but I have to figure out the best way to install the proper NDS2 client without interfering with the 64-bit version. I think it is just a matter of specifying the matlabroot in the .../linux/ instead of .../linux64/
  • It would be nice to find a way that the nice tool gps('MM/DD/YYYY XX:XX:XX UTC'), which calls the ligotool executable tconvert, can be automatically usable when calling NDS2 functions. Right now, there seems to be an issue preventing that: even though tconvert can be run in the terminal, gps() returns an error and even directly running unix('tconvert now') or !tconvert returns the same error. I have emailed Peter Shawhan to see if he has any advice.

 

 

  6919   Thu Jul 5 12:06:35 2012 JamieUpdateComputersNDS2 client now working on Ubuntu machines

What I did

NDS2 was not working on any of the machines, so the first thing I did was simply to install the newest version. I downloaded the latest tarball (0.9.1) from the LDAS Wiki, unzipped and installed it

/users/zach $ tar -xvf nds2-client-0.9.1.tar

/users/zach $ cd nds2-client-0.9.1

/users/zach $ sudo ./configure --prefix=/cvs/cds/caltech/apps/linux64 --with-matlab=/cvs/cds/caltech/apps/linux64/matlab/bin/matlab

/users/zach $ sudo make

/users/zach $ sudo make install

No no, this is totally unnecessary.  NDS2 was already installed on every machine from the official packaged releases (apt-get install nds2-client), and it's known to work fine. We use it with pynds all the time. If the matlab component is not working we should figure out the right way to fix it with the existing packages.

In general, please only manually install software as a very last resort.  Manually installed software doesn't get maintained, where as the officially packaged stuff is being actively maintained by the collaboration. If there is a problem with the distributed packaging we should report it and get it fixed (and hint I was the one who built the original Debian packaging for nds2, so I know how to fix all the issues).  I'm trying to bring the 40m out of the dark days of complete chaos, where random software was installed in random locations.

Even with the new version, it still didn't work. 

That's because this wasn't the problem!

Solution: The main problem was that the cyrus-sasl-gssapi authentication protocol was not installed on these machines, so that even with a kerberos ticket the datalink could not be established. Using information from the LDAS Wiki, I used aptitude to install it as:

$ sudo aptitude install lscsoft-auth

This group installs both the SASL protocol and the package python-kerberos

I also needed to update the kerberos config file for each machine, which is located at /etc/krb5.conf. I found that ottavia had a nice one with many realms, so I copied that one over to the other machines. In any case where there was an old config file overwritten, it is now /etc/krb5.conf.old.

Finally, the matlab path for NDS2 was still set to the old 2010a directory (/cvs/cds/caltech/apps/linux64/lib/matlab2010a) that was created by the NDS2 install when Rana originally did it. The new install I made above created the appropriate 2010b mexa64 files, so I changed the matlab path within matlab to this one:

>> rmpath /cvs/cds/caltech/apps/linux64/lib/matlab2010a

>> addpath /cvs/cds/caltech/apps/linux64/lib/matlab2010b

>> savepath

This sounds like it's more likely the issues. You did the right thing by going to apt to fix the authentication packages.  It's curious to me that you did that here, whereas you went totally out of band for the nds2 client stuff.  Why?

The matlab mex files are the other problem.  But there is also a nds2-client-matlab Debian/Ubuntu package for that as well.  The problem is that the package just distributes the source, and it needs to be compiled.  I'll help figure out a good way to do that.

  • I would like to extend this to the 32-bit machines, but I have to figure out the best way to install the proper NDS2 client without interfering with the 64-bit version. I think it is just a matter of specifying the matlabroot in the .../linux/ instead of .../linux64/

Again, this is handled by the packaging!  Just use apt and the right architecture is installed automatically.

But what 32 bit machines are you referring to?  I think basically everything is 64 bit nowadays.

  • It would be nice to find a way that the nice tool gps('MM/DD/YYYY XX:XX:XX UTC'), which calls the ligotool executable tconvert, can be automatically usable when calling NDS2 functions. Right now, there seems to be an issue preventing that: even though tconvert can be run in the terminal, gps() returns an error and even directly running unix('tconvert now') or !tconvert returns the same error. I have emailed Peter Shawhan to see if he has any advice. 

We are now using lalapps_tconvert for tconvert.  We're not using that ligotools crap anymore.  I've aliased that to tconvert on the command line, but maybe matlab isn't getting the message.  I'll try to think of a more robust solution (e.g. make a wrapper script).

  6921   Thu Jul 5 13:12:12 2012 ZachUpdateComputersNDS2 client now working on Ubuntu machines

From my conversations with JZ and Leo, it seemed there was no package that generated the appropriate mex files. It was clear that the right ones weren't there from the absence of a /cvs/cds/caltech/apps/linux64/lib/matlab2010b directory. I'm sorry if I screwed anything up with pynds, but I have repeatedly asked for help with NDS2+matlab and no one has done anything.

It would be nice to do it via apt if there indeed is a versioned package that can make the mexs. Sorry again if I jumped the gun, but I didn't think anyone was going to do anything.

I guess the only 32-bit machine I can think of is mafalda.

About tconvert, I think the best solution is to make a new wrapper M-file. gps was just a convenient remnant of mDV, but all that we need is some matlab function that can output a GPS time given a date/time string. We can use whatever command-line utility you want.

  6923   Thu Jul 5 16:49:35 2012 JenneUpdateComputersc1sus is funny

I was trying to use a new BLRMs c-code block that the seismic people developed, instead of Mirko's more clunky version, but putting this in crashed c1sus.

I reverted to a known good c1pem.mdl, and Jamie and I did a reboot, but c1sus is still funny - none of the models are actually running. 

rtcds restart all - all the models are happy again, c1sus is fine.

But, we still need to figure out what was wrong with the c-code block.

Also, the BLRMS channels are listed in a Daq Channels block inside of the (new) library part, so they're all saved with the new CDS system which became effective as of the upgrade.  (I made the Mirko copy-paste BLRMS into a library part, including a DAQ channels block before trying the c-code.  This is the known-working version to which I reverted, and we are currently running.)

  6924   Fri Jul 6 01:12:02 2012 JenneUpdateComputersc1sus is fine

Quote:

I was trying to use a new BLRMs c-code block that the seismic people developed, instead of Mirko's more clunky version, but putting this in crashed c1sus.

I reverted to a known good c1pem.mdl, and Jamie and I did a reboot, but c1sus is still funny - none of the models are actually running. 

rtcds restart all - all the models are happy again, c1sus is fine.

But, we still need to figure out what was wrong with the c-code block.

Also, the BLRMS channels are listed in a Daq Channels block inside of the (new) library part, so they're all saved with the new CDS system which became effective as of the upgrade.  (I made the Mirko copy-paste BLRMS into a library part, including a DAQ channels block before trying the c-code.  This is the known-working version to which I reverted, and we are currently running.)

 The reason I started looking at BLRMS and c1sus today was that the BLRMS striptool was totally wacky.  I finally figured out that the pemepics hadn't been burt restored, so none of the channels were being filtered.  It's all better now, and will be even better soon when Masha finishes updating the filters (she'll make her own elog later)

  6927   Fri Jul 6 08:38:04 2012 steveUpdateComputerstime out

Why do we have these timing blanks?

Attachment 1: timingblanks.png
timingblanks.png
  6928   Fri Jul 6 09:00:34 2012 not ZachUpdateComputersNDS2 client now working on Ubuntu machines

Quote:

From my conversations with JZ and Leo, it seemed there was no package that generated the appropriate mex files. It was clear that the right ones weren't there from the absence of a /cvs/cds/caltech/apps/linux64/lib/matlab2010b directory. I'm sorry if I screwed anything up with pynds, but I have repeatedly asked for help with NDS2+matlab and no one has done anything.

It would be nice to do it via apt if there indeed is a versioned package that can make the mexs. Sorry again if I jumped the gun, but I didn't think anyone was going to do anything.

There is a package that provides the mex source, but it doesn't actually provide the mex binaries.  The problem is that the binary depends on the matlab version, so you can't possibly provide binaries for every version.

The solution is to just build the binaries from the source package.  We should put together a nice script that builds the binaries from the source, and installs them in the directory of your choosing.  If we get something nice working, we can probably get them to include it with the package, to make it easier in the future.

Here's what's included in the source package:

controls@pianosa:~ 0$ sudo apt-get install nds2-client-matlab
...
controls@pianosa:~ 0$ dpkg -L nds2-client-matlab | sort
/.
/usr
/usr/share
/usr/share/doc
/usr/share/doc/nds2-client-matlab
/usr/share/doc/nds2-client-matlab/changelog.Debian.gz
/usr/share/doc/nds2-client-matlab/changelog.gz
/usr/share/doc/nds2-client-matlab/copyright
/usr/share/matlab
/usr/share/matlab/NDS2_GetChannels.m
/usr/share/matlab/NDS2_GetData.m
/usr/share/matlab/NDS_GetChannels.m
/usr/share/matlab/NDS_GetData.m
/usr/share/matlab/NDS_GetMinuteTrend.m
/usr/share/matlab/NDS_GetSecondTrend.m
/usr/share/matlab/src
/usr/share/matlab/src/NDS2_GetChannels.c
/usr/share/matlab/src/NDS2_GetData.c
/usr/share/matlab/src/NDS_GetChannels.c
/usr/share/matlab/src/NDS_GetData.c
/usr/share/matlab/src/nds_mex_utils.c
/usr/share/matlab/src/nds_mex_utils.h
controls@pianosa:~ 0$ 
  6944   Mon Jul 9 11:27:27 2012 JenneUpdateComputersc1oaf has been down for several days - BURT restore wasn't done correctly on startup

The c1oaf model hasn't been running for a few days (since the leap second problems we were having last week).  I had looked into it, but finally figured it out (with Jamie's help) today. 

The BURT restore has to be given to the model during startup, but for whatever reason it wasn't BURT restoring until *after* the model had already failed to start.  The symptoms were:  no 'heartbeat' for the oaf model, no connection to the fb, NO SYNC on the GDS screen, 0x4000.  the BURT restore button was green, which threw me off the scent, but that's just because it did, in fact, get set, just way too late.

I ended up looking in the dmesg of the lsc computer, and the last set of stuff was several lines of "[3354303.626446] c1oaf: Epics burt restore is 0".  Nothing else was written after that.  Jamie pointed out that this meant the BURT restore wasn't getting sent before the model unloaded itself and decided not to run.

The solution:  restart the model, and manually click the BURT restore button as soon as you're able (after everything comes back from being white).  We used to have to do this, but then there was a "fix", which apparently isn't super robust and failed for the oaf (even though it used to work just fine).  Bugzilla report submitted.

  6993   Thu Jul 19 15:55:21 2012 ranaUpdateComputersused 'aptitude' to install software (TexLive) on rosalba

I used APTITUDE to install texlive-full on rosalba so that I could work on a paper. pdflatex was not found on pianosa, rosalba, megatron, etc. I used this command:

sudo aptitude install texlive-full

While this was installing, I read a bunch of forums which claim that its better to bypass the apt-get and use the 
TexLive installer instead so that we can use its own package updater 'tlmgr'. Otherwise, the standard Ubuntu distributions
are years out of date (e.g. doesn't have RevTex 4.1).



  7013   Mon Jul 23 20:34:38 2012 JamieOmnistructureComputersCHECK IN YOUR CHANGES TO THE SVN

I'm seeing LOTS OF STUFF NOT CHECKED INTO THE SVN!!!  both modified things that haven't been updated, and things that looked like they haven't been checked in at all.

Please check in your stuff to the SVN!  We need the record!

Look through EVERYTHING that you think you might have touched, or even care about, and make sure it's checked in.

  7138   Fri Aug 10 09:47:19 2012 MashaUpdateComputersFE Status

The c1lsc and c1sus screens are red in the front-end status. I restarted the frame builder, and hit the "global diag reset" button, but to no avail. Yesterday, the only thing Den and I did to c1sus was install a new c1pem model. I got rid of the changes and switched to the old one (I ran rtcds build, install, restart), but the status is still the same.

  7143   Fri Aug 10 11:08:26 2012 jamieUpdateComputersFE Status

Quote:

The c1lsc and c1sus screens are red in the front-end status. I restarted the frame builder, and hit the "global diag reset" button, but to no avail. Yesterday, the only thing Den and I did to c1sus was install a new c1pem model. I got rid of the changes and switched to the old one (I ran rtcds build, install, restart), but the status is still the same.

 The issue you're seeing here is stalled mx_stream processes on the front ends.  On the troublesome front ends you can log in and restart the mx_streams with the "mxstreamrestart" command.

  7243   Tue Aug 21 15:25:41 2012 JenneUpdateComputerspyepics installed

I installed pyepics version 3 (http://cars9.uchicago.edu/software/python/pyepics3/overview.html) in ..../scripts/pylibs    .   I also added an "epics.conf" file to /etc/ld.so.conf.d/ , which points to the place in /ligo/apps/epics/base/lib/linux-x86_64/ where the DLLs live.  All .conf files in /etc/ld.so.conf.d/   get included in the path, so python should always automatically be able to use epics now, after you "import epics" in a script.

This is supposed to give us direct channel access to all epics channels, rather than using Yuta's wrapper scripts for ezca stuff.  I was going to write a tdsavg equivalent using camonitor, since it's unclear whether tds tools are being supported anymore.

However, I'm not getting it to connect to the server that serves epics, so I can't get the values of any channels.  All of the info in the link above assumes that you automatically get a connection, and I'm out of ideas right now of things to try.  Does anyone else have any ideas?

  7259   Thu Aug 23 17:17:49 2012 MashaSummaryComputersCode Folder Status

I cleaned up my directory (/users/masha) today. A lot of the files are just code that I experimented with, but the important files for training the classification neural network are in "neural_network_classification". The "EarthquakeData" subdirectory contains my entire dataset. Files of the form "GenerateRNNInput" are used to create input vector sets to the network, while files of the "*NeuralNetworkClassification* actually run the code that generates the neural network vectors for the classification code block in the c1pem model.

Also, the folder "feed_c", which can also be found in Den's directory, contains the neural network controller code we played around with.

  7362   Fri Sep 7 15:31:52 2012 Mike J.UpdateComputersSensoray back up

Video Capture with the Sensoray works again. Pianosa just needed mplayer installed for it to play properly.

Attachment 1: output_5.mp4
  7364   Fri Sep 7 17:24:16 2012 Mike J.UpdateComputersSensoray Video Capture

To capture video with the Sensoray, open the GUI (python ./demo.py), simply press "Save," enter a filename, and hit "Stop" when you wish to stop recording. If you want to change the video format, there is a dropdown menu labelled "Format." I recommend MP4 for standard video, and nv12 for RAW video.

  7365   Fri Sep 7 17:34:53 2012 JenneUpdateComputersSensoray Video Capture

Quote:

To capture video with the Sensoray, open the GUI (python ./demo.py), simply press "Save," enter a filename, and hit "Stop" when you wish to stop recording. If you want to change the video format, there is a dropdown menu labelled "Format." I recommend MP4 for standard video, and nv12 for RAW video.

 I also installed mplayer on rossa, so we can play the videos there.

Even though Mike won't admit it, the video stuff is all in /users/sensoray/ .  I opened the demo.py from there, and it also works.

  7495   Sun Oct 7 12:11:00 2012 AidanUpdateComputersRebooted cymac0

I rebooted cymac0 a couple of times. When I first got here it was just frozen. I rebooted it and then ran a model (x1ios). The machine froze the second time I ran ./killx1ios. I've rebooted it again.

  7498   Mon Oct 8 09:45:28 2012 jamieUpdateComputersRebooted cymac0

Quote:

I rebooted cymac0 a couple of times. When I first got here it was just frozen. I rebooted it and then ran a model (x1ios). The machine froze the second time I ran ./killx1ios. I've rebooted it again.

For context, there's a is stand-alone cymac test system running at the 40m.  It's not hooked up to anything, except for just being on the martian network (it's not currently mounting any 40m CDS filesystems, for instance).  The machine is temporarily between the 1Y4 and 1Y5 racks.

  7552   Mon Oct 15 22:24:45 2012 JenneUpdateComputersLots of new White :(

Evan and I are starting to lock, and there is lots of new, unfortunate white stuff on several different screens.

C1:TIM-PACIFIC_STRING is gone, C1:IFO-STATE (MC state) is gone, C1:LSC-PZT..._requests are gone (all 4 of them), C1:PSL-FSS_FASTSWEEPTEST from the FSS screen is gone (although I'm not sure that that one is newly gone), lots of the WF AA lights on the LSC screen are gone.

Those are the things I find in a few minutes of not really looking around.

EDIT:  IPPOS is also gone, so I can't see how my current alignment relates to old alignments.

  7570   Wed Oct 17 19:35:58 2012 KojiUpdateComputersRe: Lots of new White :(

Solved. The power code of c1iscaux was loose.
Has anyone worked around the back side of 1Y3?


I looked into the problem. I went around the channel lists for each slow machines and found the variables are supported by c1iscaux

controls@pianosa:/cvs/cds/caltech/target/c1iscaux 0$ cd /cvs/cds/caltech/target/c1iscaux
controls@pianosa:/cvs/cds/caltech/target/c1iscaux 0$ grep C1:IF *
C1IFO_STATE.db:grecord(ai,"C1:IFO-STATE")

It seemed that the machine was not responding to ping. I went to 1Y3 and found the crate was off. Actually this is not correct.
The key was on but the power was off. I looked at the back and found the power code was loose from its inlet.
Once the code was pushed in and the crate was keyed, the white boxes got back online.

Just in case I burtrestored these slow channels by the snapshot at 6:07am on Sunday.

  7574   Thu Oct 18 08:00:40 2012 jamieUpdateComputersRe: Lots of new White :(

Quote:

Solved. The power code of c1iscaux was loose.
Has anyone worked around the back side of 1Y3?


I looked into the problem. I went around the channel lists for each slow machines and found the variables are supported by c1iscaux

controls@pianosa:/cvs/cds/caltech/target/c1iscaux 0$ cd /cvs/cds/caltech/target/c1iscaux
controls@pianosa:/cvs/cds/caltech/target/c1iscaux 0$ grep C1:IF *
C1IFO_STATE.db:grecord(ai,"C1:IFO-STATE")

It seemed that the machine was not responding to ping. I went to 1Y3 and found the crate was off. Actually this is not correct.
The key was on but the power was off. I looked at the back and found the power code was loose from its inlet.
Once the code was pushed in and the crate was keyed, the white boxes got back online.

Just in case I burtrestored these slow channels by the snapshot at 6:07am on Sunday.

I was working around 1Y2 and 1Y3 when I wired the DAC in the c1lsc IO chassis in 1Y3 to the tip-tilt electronics in 1Y2.  I had to mess around in the back of 1Y3 to get it connected.  I obviously did not intend to touch anything else, but it's certainly possible that I did.

  7577   Fri Oct 19 00:55:35 2012 JenneUpdateComputersc1lsc is down (at least all of the models)

When Evan and I were dithering the BS and ITMY (see his elog), I noticed that c1lsc was acting weird.  the IOP was the only one with the blinky heartbeat.  The IOP was all green lights, but all the other models had red for the fb connection, as well as the rightmost indicator (I don't know what that one is for).  I logged on to c1lsc and ran 'rtcds restart all'.  The script didn't get anywhere beyond saying it was beginning to stop the 1st model (sup, the bottom one on the lsc list).  Then all of the cpus went white.  I can still ping c1lsc, but I can't ssh to it.

I'm not sure what to do here Jamie.  Heelp. 

  7746   Mon Nov 26 18:56:34 2012 JenneHowToComputersData logging suggestions

We've been talking for a while about how we want to store data.  I'm not in love with keeping it on the elog, although I think we should always be able to reference and go back and forth between the elogs and the data.

I have made a new folder: /data    EDIT: nevermind.  I want it to be on the file system just like /users, but I don't know how to do that.  Right now the folder is just on Ottavia. Jamie will help me tomorrow.

In this folder, we will save all of the data which goes into the elog. 

I propose that we should have a common format for the names of the data files, so that we can easily find things.

My proposal is that one begins ones elog regarding the data to be saved, and submit it immediately after putting in the first ~sentence or so. One should then make a new folder inside the data folder with a title "elog#####_Anything_Else_You_Want" Then, data (which was originally saved in ones own users folder) should be copied into the /data/elog#####_AnythingElse/ folder. Also in that folder should be any Matlab scripts used to create the plots that you post in the elog.  One should then edit the elog to continue making a regular, very thorough elog, including the path to the data.  Elog should include all of the information about the measurement, state of the IFO (or whatever you were measuring), etc. 

Riju will be alpha-testing this procedure tonight.  EDIT: nevermind...see previous edit.

  7749   Tue Nov 27 00:26:00 2012 jamieOmnistructureComputersUbuntu update seems to have broken html input to elog on firefox

 After some system updates this evening, firefox can no longer handle the html input encoding for the elog.  I'm not sure what happened.  You can still use the "ELCode" or "plain" input encodings, but "HTML" won't work.  The problem seems to be firefox 17.  ottavia and rosalba were upgraded, while rossa and pianosa have not yet been.

I've installed chromium-browser (debranded chrome) on all the machines as a backup.  Hopefully the problem will clear itself up with the next update.  In the mean time I'll try to figure out what happened.

To use chromium: Appliations -> Internet -> Chromium

  7757   Wed Nov 28 17:40:28 2012 jamieOmnistructureComputerselog working again on firefox 17

Koji and I figured out what the problem is.  Apparently firefox 17.0 (specifically it's user-agent string) breaks fckeditor, which is the javascript toolbox the elog uses for the wysiwyg text editor.  See https://support.mozilla.org/en-US/questions/942438.

The suspect line was in elog/fckeditor/editor/js/fckeditorcode_gecko.js.  I hacked it up so that it stopped whatever crappy conditional user agent crap it was doing.  It seems to be working now.

Edit by Koji: In order to make this change work, I needed to clear the cache of firefox from Tool/Clear Recent History menu.

  7786   Tue Dec 4 20:38:51 2012 jamieOmnistructureComputersnew (beta) version of nds2 installed on control room machines

I've installed the new nds2 packages on the control room machines.

These new packages include some new and improved interfaces for python, matlab, and octave that were not previously available. See the documentation in:

  /usr/share/doc/nds2-client-doc/html/index.html

for details on how to use them.  They all work something like:

  conn = nds2.connection('fb', 8088)
  chans = conn.findChannels()
  buffers = conn.fetch(t1, t2, {c1,...})
  data = buffers(1).getData()

NOTE: the new interface for python is distinct from the one provided by pynds.  The old pynds interface should continue to work, though.

To use the new matlab interface, you have to first issue the following command:

   javaaddpath('/usr/lib/java')

I'll try to figure out a way to have that included automatically.

The old Matlab mex functions (NDS*_GetData, NDS*_GetChannel, etc.) are now provided by a new and improved package.  Those should now work "out of the box".

  7788   Tue Dec 4 23:08:46 2012 DenOmnistructureComputersnew (beta) version of nds2 installed on control room machines

Quote:

I've installed the new nds2 packages on the control room machines.

 I've tried new nds2 Java interface in Matlab. Using findChannels method of the connection class I see only slow, DQ and trend channels. I could even download data online using iterate method. When it will be possible to do the same with fast non-DQ channels?

>> conn = nds2.connection('fb', 8088);
>> conn.iterate({'C1:LSC-XARM_OUT'})
??? Java exception occurred:
java.lang.RuntimeException: No such channel.
    at nds2.nds2JNI.connection_iterate__SWIG_0(Native Method)
    at nds2.connection.iterate(connection.java:91)

  7791   Wed Dec 5 09:42:46 2012 ranaOmnistructureComputersnew (beta) version of NDS2 installed on control room machines

NDS2 is not designed for non DQ channels - it gets data from the frames, not through NDS1.

For getting the non-DQ stuff, I would just continue using our NDS1 compatible NDS mex files (this is what is used in mDV).

  7793   Wed Dec 5 16:54:29 2012 jamieOmnistructureComputersnew (beta) version of NDS2 installed on control room machines

Quote:

NDS2 is not designed for non DQ channels - it gets data from the frames, not through NDS1.

For getting the non-DQ stuff, I would just continue using our NDS1 compatible NDS mex files (this is what is used in mDV).

The NDS2 protocol is not for non-DQ, but the NDS2 client is capable of talking both the NDS1 and NDS2 protocols.

fb:8088 is an NDS1 server, so the client is talking NDS1 to fb.  It should therefore be capable of getting online data.

It doesn't seem to be seeing the online channels, though, so I'll work with Leo to figure out what's going on there.

The old mex functions, which like I said are now available, aren't capable of getting online data.

  7805   Mon Dec 10 16:28:13 2012 jamieOmnistructureComputersprogressive retrieval of online data now possible with the new NDS2 client

Leo fixed an issue with the new nds2-client packages that was preventing it from retrieving online data.  It's working now from matlab, python, and octave.

Here's an example of a dataviewer-like script in python:

#!/usr/bin/python

import sys
import nds2
from pylab import *

# channels are command line arguments
channels = sys.argv[1:]

conn = nds2.connection('fb', 8088)

fig = figure()
fig.show()
for bufs in conn.iterate(channels):
    fig.clf()
    for buf in bufs:
        plot(buf.data)
    draw()

  7859   Wed Dec 19 20:18:51 2012 ranaUpdateComputersWe are Changing the Passwerdz next week----

Be Prepared

http://xkcd.com/936/

  7920   Sat Jan 19 15:05:37 2013 JenneUpdateComputersAll front ends but c1lsc are down

Message I get from dmesg of c1sus's IOP:

[   44.372986] c1x02: Triggered the ADC
[   68.200063] c1x02: Channel Hopping Detected on one or more ADC modules !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[   68.200064] c1x02: Check GDSTP screen ADC status bits to id affected ADC modules
[   68.200065] c1x02: Code is exiting ..............
[   68.200066] c1x02: exiting from fe_code()

Right now, c1x02's max cpu indicator reads 73,000 micro seconds.  c1x05 is 4,300usec, and c1x01 seems totally fine, except that it has the 02xbad.

c1x02 has 0xbad (not 0x2bad).  All other models on c1sus, c1ioo, c1iscex and c1iscey all have 0x2bad.

Also, no models on those computers have 'heartbeats'.

C1x02 has "NO SYNC", but all other IOPs are fine.

I've tried rebooting c1sus, restarting the daqd process on fb, all to no avail.  I can ssh / ping all of the computers, but not get the models running.  Restarting the models also doesn't help.

upside_down_cat-t2.jpg

c1iscex's IOP dmesg:

[   38.626001] c1x01: Triggered the ADC
[   39.626001] c1x01: timeout 0 1000000
[   39.626001] c1x01: exiting from fe_code()

c1ioo's IOP has the same ADC channel hopping error as c1sus'.

 

  7922   Sat Jan 19 18:23:31 2013 ranaUpdateComputersAll front ends but c1lsc are down

After sshing into several machines and doing 'sudo shutdown -r now', some of them came back and ran their processes.

After hitting the reset button on the RFM switch, their diagnostic lights came back. After restarting the Dolphin task on fb:

"sudo /etc/init.d/dis_networkmgr restart"

the Dolphin diagnostic lights came up green on the FE status screen.

iscex still wouldn't come up. The awgtpman tasks on there keep trying to start but then stop due to not finding ADCs.

 

Then power cycled the IO Chassis for EX and then awtpman log files changed, but still no green lights. Then tried a soft reboot on fb and now its not booting correctly.

Hardware lights are on, but I can't telnet into it. Tried power cycling it once or twice, but no luck.

Probably Jamie will have to hook up a keyboard and monitor to it, to find out why its not booting.

FE.png

P.S. The snapshot scripts in the yellow button don't work and the MEDM screen itself is missing the time/date string on the top.

  7963   Wed Jan 30 13:50:27 2013 JenneUpdateComputersc1iscex still down

[Koji, Jenne]

We noticed that the iscex computer is still down, but the IOP is (was) running.  When we sat down to look at it, c1x01 was 'breathing', had a non-zero CPU_METER time, and the error was 0x4000, which I've never seen before.  The fb connection was still red though.  Also, it is claiming that its sync source is 1pps, not TDS like it usually is. 

Since things were different, Koji restarted the 2 other models running on iscex, with no resulting change.  We then did a 'rtcds restart all', and the IOP is no longer breathing, and the error message has changed to 0xbad.  The sync source is still 1pps.

Moral of the story:  c1iscex is still down, but temporarily showed signs of life that we wanted to record.

  7970   Thu Jan 31 10:23:39 2013 JamieUpdateComputersc1iscex still down

Quote:

[Koji, Jenne]

We noticed that the iscex computer is still down, but the IOP is (was) running.  When we sat down to look at it, c1x01 was 'breathing', had a non-zero CPU_METER time, and the error was 0x4000, which I've never seen before.  The fb connection was still red though.  Also, it is claiming that its sync source is 1pps, not TDS like it usually is. 

Since things were different, Koji restarted the 2 other models running on iscex, with no resulting change.  We then did a 'rtcds restart all', and the IOP is no longer breathing, and the error message has changed to 0xbad.  The sync source is still 1pps.

Moral of the story:  c1iscex is still down, but temporarily showed signs of life that we wanted to record.

There's definitely a timing issue with this machine.  I looked at it a bit yesterday.  I'll try to get to it by the end of the week.

  8036   Fri Feb 8 12:43:26 2013 yutaUpdateComputersvideocapture.py now supports movie capturing

I updated /opt/rtcds/caltech/c1/scripts/general/videoscripts.py so that it supports movie capturing. It saves captured images (bmp) and movies (mp4) in /users/sensoray/SensorayCaptures/ directory.
I also updated /opt/rtcds/caltech/c1/scripts/pylibs/pyndslib.py because /usr/bin/lalapps_tconvert is not working and now /usr/bin/tconvert works.
However, tconvert doesn't run on ottavia, so I need Jamie to fix it.

videocapture.py -h:
Usage:
    videocapture.py [cameraname] [options]

Example usage:
    videocapture.py MC2F -s 320x240 -t off
       (Camptures image of MC2F with the size of 320x240, without timestamp on the image. MUST RUN ON PIANOSA!)
    videocapture.py AS -m 10
       (Camptures 10 sec movie of AS with the size of 720x480. MUST RUN ON PIANOSA!)


Options:
  -h, --help          show this help message and exit
  -s SIZE             specify image size [default: 720x480]
  -t TIMESTAMP_ONOFF  timestamp on or off [default: on]
  -m MOVLENGTH        specity movie length (in sec; takes movie if specified) [default: 0]

  8062   Mon Feb 11 18:44:34 2013 JamieUpdateComputerspasswerdz changed

Quote:

Be Prepared

http://xkcd.com/936/

Password for nodus and all control room workstations has been changed.  Look for new one in usual place.

We will try to change the password on all the RTS machines soon.  For the moment, though, they remain with the old passwerd.

  8088   Fri Feb 15 15:21:07 2013 JamieUpdateComputersc1iscex IO-chassis dead

I appears that the c1iscex IO-chassis is either dead or in a very bad state.  The PCIe interface card in the IO-chassis is showing four red lights, where it's supposed to be showing a dozen or so green lights.  Obviously this is going to prevent anything from running.

We've had power issues with this chassis before, so possibly that's what we're running into now.  I'll pull the chassis and diagnose asap.

 

  8140   Fri Feb 22 20:28:17 2013 JamieUpdateComputerslinux1 dead, then undead

At around 2:30pm today something brought down most of the martian network.  All control room workstations, nodus, etc. were unresponsive.  After poking around for a bit I finally figured it had to be linux1, which serves the NFS filesystem for all the important CDS stuff.  linux1 was indeed completely unresponsive.

Looking closer I noticed that the Fibrenetix FX-606-U4 SCSI hardware RAID device connected to linux1 (see #1901), which holds cds network filesystem, was showing "IDE Channel #4 Error Reading" on it's little LCD display.  I assumed this was the cause of the linux1 crash.

I hard shutdown linux1, and powered off the Fibrenetix device.  I pulled the disk from slot 4 and replaced it with one of the spares we had in the control room cabinets.  I powered the device back up and it beeped for a while.  Unfortunately the device was requiring a password to access it from the front panel, and I could find no manual for the device in the lab, nor does the manufacturer offer the manual on it's web site.

Eventually I was able to get linux1 fully rebooted (after some fscks) and it seemed to mount the hardware RAID (as /dev/sdc1) fine.  The brought the NFS back.  I had to reboot nodus to get it recovered, but all the control room and front-end linux machines seemed to recover on their own (although the front-ends did need an mxstream restart).

The remaining problem is that the linux1 hardware RAID device is still currently unaccessible, and it's not clear to me that it's actually synced the new disk that I put in it.  In other words I have very little confidence that we actually have an operational RAID for /opt/rtcds.  I've contacted the LDAS guys (ie. Dan Kozak) who are managing the 40m backup to confirm that the backup is legit.  In the mean time I'm going to spec out some replacement disks onto which to copy /opt/rtcds, and also so that we can get rid of this old SCSI RAID thing.

Attachment 1: FX-606-U4_1205.pdf
FX-606-U4_1205.pdf
  8141   Sat Feb 23 00:34:28 2013 yutaUpdateComputerscrontab in op340m deleted and restored (maybe)

I accidentally overwrote crontab in op340m with an empty file.
By checking /var/cron in op340m, I think I restored it.

But somehow, autolockMCmain40m does not work in cron job, so it is currently running by nohup.

What I did:
  1. I ssh-ed op340m to edit crontab to change MC autolocker to usual power mode. I used "crontab -e", but it did not show anything. I exited emacs and op340m.
  2. Rana found that the file size of crontab went 0 when I did "crontab -e".
  3. I found my elog #6899 and added one line to crontab

55 * * * *  /opt/rtcds/caltech/c1/scripts/general/scripto_cron /opt/rtcds/caltech/c1/scripts/MC/autolockMCmain40m >/cvs/cds/caltech/logs/scripts/mclock.cronlog 2>&1

  4. It didn't run correctly, so Rana used his hidden power "nohup" to run autolockMCmain40m in background.
  5. Koji's hidden magic "/var/cron/log" gave me inspiration about what was in crontab. So, I made a new crontab in op340m like this;

34 * * * *  /opt/rtcds/caltech/c1/scripts/general/scripto_cron /opt/rtcds/caltech/c1/scripts/MC/autolockMCmain40m >/cvs/cds/caltech/logs/scripts/mclock.cronlog 2>&1
55 * * * * /opt/rtcds/caltech/c1/scripts/general/scripto_cron /opt/rtcds/caltech/c1/scripts/PSL/FSS/RCthermalPID.pl >/cvs/cds/caltech/logs/scripts/RCthermalPID.cronlog 2>&1
07 * * * * /opt/rtcds/caltech/c1/scripts/general/scripto_cron /opt/rtcds/caltech/c1/scripts/PSL/FSS/FSSSlowServo >/cvs/cds/caltech/logs/scripts/FSSslow.cronlog 2>&1
00 * * * * /opt/rtcds/caltech/c1/burt/autoburt/burt.cron >> /opt/rtcds/caltech/c1/burt/burtcron.log
13 * * * * /cvs/cds/caltech/conlog/bin/check_conlogger_and_restart_if_dead
14,44 * * * * /opt/rtcds/caltech/c1/scripts/SUS/rampdown.pl > /dev/null 2>&1


  6. It looks like some of them started running, but I haven't checked if they are working or not. We need to look into them.

Moral of the story:
  crontab needs backup.

ELOG V3.1.3-