40m QIL Cryo_Lab CTN SUS_Lab CAML OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 257 of 355  Not logged in ELOG logo
ID Date Author Type Categorydown Subject
  1033   Wed Oct 8 12:35:56 2008 josephbConfigurationComputersNew Network diagram for the 40m
Attached is a pdf of the new network diagram for the 40m after having removed all of the old hubs.
Attachment 1: 40m_network_10-07-08.pdf
40m_network_10-07-08.pdf
  1038   Fri Oct 10 00:34:52 2008 robOmnistructureComputersFEs are down

The front-end machines are all down. Another cosmic-ray in the RFM, I suppose. Whoever comes in first in the morning should do the all-boot described in the wiki.
  1039   Fri Oct 10 10:20:42 2008 AlbertoOmnistructureComputersFEs are down

Quote:

The front-end machines are all down. Another cosmic-ray in the RFM, I suppose. Whoever comes in first in the morning should do the all-boot described in the wiki.


Yoichi and I went along the arms turning off and on all the FE machines. Then, from the control room we rebooted them all following the procedures in the wiki. Everything is now up again.

I restored the full IFO, re-locked the mode cleaner.
  1040   Fri Oct 10 13:57:33 2008 AlbertoOmnistructureComputersProblems in locking the X arm
This morning for some reason that I didn't clearly understand I could not lock the Xarm. The Y arm was not a problem and the Restore and Align script worked fine.

Looking at the LSC medm screen something strange was happening on the ETMX output. Even if the Input switch for c1:LSC-ETMX_INMON was open, there still was some random output going into c1:LSC-ETMX_INMON, and it was not a residual of the restor script running. Probably something bad happened this monring when we rebooted all the FE computers for the RFM network crash that we had last night.

Restarting the LSC computer didn't solve the problem so I decided to reboot the scipe25 computer, corresponding to c1dcuepics, that controls the LSC channels.

Somehow rebooting that machine erased all the parameters on almost all medm screens. In particular the mode cleaner mirrors got a kick and took a while to stop. I then burtrestored all the medm screen parameters to yesterday Thursday October 9 at 16:00. After that everything came back to normal. I had to re-lock the PMC and the MC.

Burtrestoring c1dcuepics.snap required to edit the .snap file because of a bug in burtrestore for that computer wich adds an extra return before the final quote symbol in the file. That bug should be fixed sometime.

The rebooting apparently fixed the problem with ETMX on the LSC screen. The strange output is not present anymore and I was able to easily lock the X arm. I then run the Align and the Restore full IFO scripts.
  1041   Fri Oct 10 20:03:35 2008 YoichiConfigurationComputersmedm, dataviewer, dtt on 64 bit linux
I compiled EPICS (base, medm and ezca) and dataviewer for 64 bit linux.
These are installed in /cvs/cds/caltech/apps/linux64/.
I also configured cshrc.40m to make it possible to run the 32bit dtt on 64bit machines.
64bit ligotools is also installed to /cvs/cds/caltech/apps/linux64/ligotools although I haven't tested it extensively.

With those essential tools available for 64bit linux, Joe and I decided to install 64bit CentOS to the new linux machine.
It is named allegra.
Now, medm, dataviewer, dtt, awg, foton and ezca commands all work on rosalba and allegra.
I put some notes on how to make things work on 64bit in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/Building_LIGO_softwares_for_64_bit_linux

I compiled dtt (actually the whole GDS tree) for 64bit linux and the build process finished normally.
But somehow dtt does not work properly. It starts on my laptop but does not retrieve data. It crashes on rosalba.
So I had to retreat to 32bit.
  1047   Tue Oct 14 19:18:18 2008 YoichiUpdateComputersBootFest
Rana, Yoichi

Most of the FE computers failed around the lunch time.
We power cycled those machines and now all of them are up and running.
I confirmed that the both arms lock.
Now the IFO is in "Restore last auto-alignment" status.
  1055   Fri Oct 17 16:43:10 2008 YoichiConfigurationComputersmcup/down scripts on linux
I made some changes to /cvs/cds/caltech/medm/c1/ioo/cmd/medmMCup and medmMCdown so that those can be run from medm on linux machines.
  1059   Mon Oct 20 15:02:00 2008 YoichiConfigurationComputers/cvs/cds restored
I moved missing files in /cvs/cds restored by Alan and Stuart to the original locations.
I confirmed autoburt runs, and dtt, which had also been having trouble running, runs ok now.

I found an interesting piece of evidence on allegra, our new 64bit linux machine.
In the Trash of controls Desktop on that machine, there is /cvs/cds/vw/ directory.
I remember that when I last time emptied the trash bin on the machine (yesterday), it took somewhat long time.
Too bad that I did not pay attention to what was actually in the Trash, but now I have a feeling that in the Trash were
missing /cvs/cds/* directories.
While emptying the Trash, I encountered several errors saying permission denied or something like that, and skipped those files.
Sometimes, when you move something from NFS mounted directories to the Trash, you get this kind of errors.
So my guess is that someone accidentally (or intentionally) moved /cvs/cds/* except for "caltech" to the Trash of allegra.
And I completely removed them carelessly.
  1060   Mon Oct 20 16:18:00 2008 AlanConfigurationComputers/cvs/cds restored

Quote:
I moved missing files in /cvs/cds restored by Alan and Stuart to the original locations.
I confirmed autoburt runs, and dtt, which had also been having trouble running, runs ok now.

I found an interesting piece of evidence on allegra, our new 64bit linux machine.
In the Trash of controls Desktop on that machine, there is /cvs/cds/vw/ directory.
I remember that when I last time emptied the trash bin on the machine (yesterday), it took somewhat long time.
Too bad that I did not pay attention to what was actually in the Trash, but now I have a feeling that in the Trash were
missing /cvs/cds/* directories.
While emptying the Trash, I encountered several errors saying permission denied or something like that, and skipped those files.
Sometimes, when you move something from NFS mounted directories to the Trash, you get this kind of errors.
So my guess is that someone accidentally (or intentionally) moved /cvs/cds/* except for "caltech" to the Trash of allegra.
And I completely removed them carelessly.


In the meantime, I have re-started the nightly backup for /frames/minute-trends
but NOT YET for /cvs/cds ,
since I fear that we'll find another problem and will need to go back to the June 27 backup.
Let's wait a few days for the dust to settle,
and if everyone feels confident that /cvs/cds is ok,
I'll restart the backup of that.

How I restored the files, for the record:

Stuart mounted /archive/backup onto an accessible computer (garrak.ligo.caltech.edu ) and I logged on to controls@nodus and ran this command:

/cvs/cds/caltech/scripts/backup/rsync --rsync-path=/usr/bin/rsync --rsh=/usr/bin/ssh --compress --verbose --archive --hard-links --exclude=caltech/ ajw@garrak.ligo.caltech.edu:/backup/40m/cvs /cvs/cds/recover_20081020

I had to type in my GC password, and it ran for ~20 minutes (would have been much longer had I asked for /cvs/cds/caltech as well!).

you can view the backups by logging on to garrak.ligo.caltech.edu with your GC account:
/backup/40m/cvs/cds/
/archive/frames/trend/minute-trend/40m
  1065   Tue Oct 21 18:19:42 2008 YoichiConfigurationComputersLISO and Eagle installed
I installed LISO, a circuit simulation software, into the control room linux machines.
I also installed a PCB CAD called Eagle to serve as a graphical editor for LISO.
I put a brief explanation in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/LISO

As a demonstration, I made a model of the FSS PC path and did a stability analysis of the op-amps.

The first attachment is the schematic of the model.
You can find the model in /cvs/cds/caltech/apps/linux/eagle/projects/liso-examples/FSS

The second attachment shows the stability analysis plot of the first two op-amps when AD829s are used.
The op-amp model is for the uncompensated AD829. The graph includes the bode plots of the open-loop transfer function of each op-amp.
If the phase delay is more than 360deg (in the plot it is 0 deg because the phase is wrapped within +/-180 deg) at the unity gain frequency,
the op-amp is unstable.
It is clear from the plot that this circuit is unstable. This is consistent with what I experienced when I replaced the chips to AD829 without
compensation.
Unfortunately, I don't have an op-amp model for phase compensated AD829. So I can't make a plot with compensation caps.

The third attachment is the stability analysis of the same circuit with AD797. It also shows that the circuit is unstable at 200MHz, though I
observed oscillation at 50MHz.

Finally, I did an estimate of frequency noise contribution from the noise of AD829.
First I estimated the voltage noise at the output of the board caused by the first AD829 using LISO's noise command.
Then I converted it into the input equivalent noise at the stage right after the mixer by calculating the transfer function
of the circuit using LISO.
Within the control bandwidth of the FSS, this input equivalent noise appears at the mixer output with the opposite sign.
Since we know the calibration factor from the mixer output voltage to the frequency noise, we can convert this into the frequency noise.
The final attachment is the estimated contribution of the AD829 to the frequency noise. As expected, it is negligible.
Attachment 1: FSS_PC_Path.pdf
FSS_PC_Path.pdf
Attachment 2: AD829Stability.png
AD829Stability.png
Attachment 3: AD797Stability.png
AD797Stability.png
Attachment 4: FreqNoiseByAD829.png
FreqNoiseByAD829.png
  1066   Wed Oct 22 09:42:41 2008 AlbertoDAQComputersc1iool0 rebooted and MC autolocker restarted
This morning I found the MC unlocked. The MC-Down script didn't work because of network problems in communicating with scipe7, a.k.a. c1iool0. Telneting to the computer was also impossible so I power cycled it from its key switch. The first time it failed so I repeated it a second time and then it worked.
Yoichi then restarted c1iovme. It was also necessary to restart the MC autolocker script according to the following procedure:
- ssh into op440m
- from op440m, ssh into op340m
- restart /cvs/cds/caltech/scripts/scripts/MC/autolockMCmain40
  1067   Wed Oct 22 12:37:47 2008 josephbUpdateComputersNetwork spreadsheet
Attached in open office format as well as excel format is spreadsheet containing all the devices with IP addresses at the 40m. Please contact me with any corrections.
Attachment 1: 40m_network_10-15-08.ods
Attachment 2: 40m_network_10-15-08.xls
  1070   Wed Oct 22 20:50:30 2008 AlbertoOmnistructureComputersGPS
Today I measured the GPS clock frequency at the output of CLOCK_MON in a board on the same crate where the c1iool0 computer is located. The monitor was connected with a BNC cable to the 10MHz reference input of the frequency counter on top of that rack, where it was used to check the 166MHz coming from one of the Marconi.

The frequency was supposed to be 10MHz but I actually measured 8 MHz. I tracked down the GPS input cable to the board and it turned out to come from one of the 1Y7 rack. Here it was connected to a board with a display that was showing corrupted digits, plus some leds on the front panel were red.

I'm not sure the GPS reference is working properly.
  1076   Thu Oct 23 18:51:19 2008 AlbertoMetaphysicsComputerseLog
I checked it and the latest version of the elog software, the 2.7.5 (we have the 2.6.5) has, among new nice features, the very good ability to fit the entries into the screen width without showing kilometric lines like we see now. Should we upgrade it?
  1088   Fri Oct 24 20:54:41 2008 ranaConfigurationComputerslinux2
I have removed linux2 and its cables from the control room and put it into 1Y3 along with op340m.

When Joe next comes in we can ask him to Cat6 it to the rest of the world, although it already
seems to me that the CDS hub/switch next Alberto's desk is too full and that we need to purchase
a 48 port device for there.
  1098   Tue Oct 28 12:01:01 2008 josephbConfigurationComputerslinux2

Quote:
I have removed linux2 and its cables from the control room and put it into 1Y3 along with op340m.

When Joe next comes in we can ask him to Cat6 it to the rest of the world, although it already
seems to me that the CDS hub/switch next Alberto's desk is too full and that we need to purchase
a 48 port device for there.


Note I still need to remove a fair bit of cabling no longer in use from the Martian network switch next to Alberto's desk. There's actually about 8-10 cables there which show no connectivity and are not being used. So there's really about 33% of the ports open in the control room hub, it just doesn't look like it.

As for linux2, I'll probably just connect it to the 1Y2 or 1Y6 Hubs when I get the chance.
  1101   Thu Oct 30 11:07:25 2008 YoichiUpdateComputersWireless bridges arrived
Five wireless bridges for the GPIB-Ethernet converters arrived.
One of them had a broken AC adapter. We have to send it back.
I configured the rest of the bridges for the 40MARS wireless network.
One of them was installed to the SR785.
I put the remaining ones in the top drawer of the cabinet, on which the label printers are sitting.
You can use those to connect any network device with a LAN port to the 40MARS network.
  1119   Thu Nov 6 22:07:56 2008 ranaConfigurationComputersELOG compile on Solaris
From the ELOG web pages:

Solaris:

Martin Huber reports that under Solaris 7 the following command line is needed to compile elog:

gcc -L/usr/lib/ -ldl -lresolv -lm -ldl -lnsl -lsocket elogd.c -o elogd

With some combinations of Solaris servers and client-side browsers there have also been problems with ELOG's keep-alive feature. In such a case you need to add the "-k" flag to the elogd command line to turn keep-alives off.
  1126   Mon Nov 10 11:32:49 2008 robUpdateComputersc1iscex rebooted

it was running a few cycles late
  1157   Fri Nov 21 21:28:32 2008 ranaSummaryComputersc0daqawg restart
A few minutes after restarting fb0 for the Guralp channels, the DAQAWG lights went red on the DAQ screens.
Why?? I chose revival procedure #3 for c0daqawg from the Wiki and it came back in a couple minutes.
  1159   Mon Nov 24 16:43:34 2008 ranaConfigurationComputersAlex and Jay took away some computers from the racks
I was over at Wilson house and saw Jay and Alex bring in 3 rackmount computers. One was a Sun 4600 and
then there were 2 3U black boxes. I got the impression that these were the data concentrators or
data collectors or framebuilder test boxes. They said that they got these from the 40m and no one was
in the lab to oppose them except for Bob and he didn't put up much of a fight.

Everything looks green on the DAQ Detail and RFM network screens so perhaps everything is OK. Beware.
  1170   Wed Dec 3 12:49:11 2008 jenneUpdateComputerssomething sketchy with NDS ... or something
Never mind...I had forgotten that you have to run mdv_config every time you open matlab, not just every time you boot a computer.

I am not able to get channels using get_data from the mDV toolbox on Allegra, Megatron or Rosalba.

The error I get while running the "hello_world" test program is:
hello_world
setting up configuration...
added paths for nds
added paths for qscan
couldn't add path for matapps_SDE
couldn't add path for matapps_path
couldn't add path for framecache
couldn't add path for ligotools_matlab
added paths for home_pwd
fetching channels for C...
Warning: get_channel_list() failed.
??? Error using ==> NDS_GetChannels
Failed to get channel list.

Error in ==> fetch_nds at 47
eval(['CONFIG.chl.' server ' = NDS_GetChannels(ab);']);

Error in ==> get_data at 100
out = fetch_nds(channels,dtype,start_time,duration);

Error in ==> hello_world at 6
aa = get_data('C1:LSC-DARM_ERR', 'raw', gps('now - 1 hour'), 32);
  1175   Thu Dec 4 16:29:20 2008 josephbConfigurationComputersError message on Frame Builder Raid Array
The Fibrenetix FX-606-U4 RAID connected to the frame builder in 1Y7 is showing the following error message: IDE Channel #4 Error Reading
  1177   Fri Dec 5 01:41:33 2008 YoichiConfigurationComputersMEDM screen snapshot now works on linux machines
As a part of my "make everything work on linux" project, I modified 'updatesnap' script so that linux machines can update MEDM screen snapshots.
Now, all 'updatesnap' in the subsystem directories (like medm/c1/lsc/cmd/updatesnap) are sym-link to /cvs/cds/caltech/medm/c1/cmd/updatesnap.
This script will take a window snapshot to a PNG file, and move the old snapshot to archive folders with date information added to the filename.
For compatibility, it also saves JPEG snapshot. Right now, most of 'view snapshot' menus in MEDM screens are calling 'sdtimage' command, which cannot display PNG files. I installed Imagemagick to op440m. We should change MEDM files to use 'display' command instead of 'sdtimage' so that it can show PNG files.
I've already changed some MEDM screens, but there are so many remaining to be modified.

PNG is better than JPEG for crisp images like screen shots. JPEG performs a sort of spacial Fourier transformations and low-pass filtering to compress the information. If it is used with sharp edges like boundaries of buttons on an MEDM screen, it naturally produces spacial aliasing (ghost images).

I also created several sym-links on the apps/linux/bin directory to mimic the Solaris-only commands, such as 'sdtimage', 'nedit' and 'dtterm'.
For example, nedit is symbolic linked to gedit. Many MEDM buttons/menus, which used to be incompatible with linux, now work fine on the linux machines.
  1181   Fri Dec 5 20:40:38 2008 YoichiHowToComputersElog multi-keyword search
The current Elog search allows you to look for only one keyword in the text.
You cannot search for two keywords by simply separating them with a white space.
That is, a search term "abc def" matches a literal "abc def", not a text containing "abc" and "def".
This is extremely annoying. However, there are still some ways to search for multiple keywords.
The Elog search fields are treated as regular expressions.
In order to match a text containing "abc" and "def", you can use a search term "abc.*def".
A period (.) means "any character", and an asterisk (*) means "any number of repetition of the preceding character".
Therefore, ".*" matches "any number of any character" i.e. anything.
The search term "abc.*def" works fine when you know "abc" appears first in the text you are looking for.
If you don't know the order of appearance of the keywords, you have two choices: either to use,
"(abc.*def)|(def.*abc)"
or
"(abc|def).*(abc|def)"
The vertical bar (|) means "or". Parentheses are used for grouping.
The first example does exactly what you want. However, you have to list all the permutations of your keywords
separated by |. If you have more than two keywords, it can be a very very long search word.
(The length of the search word is O(n!), where n is the number of keywords).
In the second example, the length of the keyword is O(n). However, it can also match a text containing two "abc".
This means the search result may contain some garbages (entries containing only "abc").
I guess in most cases we can tolerate this.

To automatically construct a multiple keyword search term for the Elog, I wrote a bash script called elogkeywd
and it is installed in the control room machines.
You can type
elogkeywd keyword1 keyword2 keyword3
to generate a regular expression for searching a text containing "keyword1", "keyword2" and "keyword3".
The generated expression is of the second type shown above. You can then copy-and-paste the result to
the Elog search field.
The script takes any number of keywords. However, there seems to be a limit on the number of characters you can type
into the search field of the Elog. I found the practical limit is about 3 keywords.
  1200   Sun Dec 21 14:18:04 2008 YoichiUpdateComputersRFM network bypass box's power supply is dead
I restarted the front-end computers by power cycling them one-by-one.
After issuing startup commands, most of them started normally at least by looking
at the output from telnet/ssh.
However, the status monitors of the FE computers on the EPICS screen are still red.
I noticed that all the LEDs on the VMIC 5594 RFM network bypass box are off.
According to the labels, fb40m, c0daqctrl, c0dcu are connected to the box.
This means (I believe) c1dcuepics cannot access the RFM network. So we have no control over
the FE computers through EPICS.

I pushed the reset button on the box, power cycled it, but nothing changed.
I checked the fuse and it was OK. Then I found that the power supply was dead.
It is a small AC adapter supplying +5VDC with a 5-pin DIN like connector.
We have to find a replacement.
  1201   Mon Dec 22 13:48:22 2008 YoichiUpdateComputersRFM network bypass box's power supply is dead
As a temporary fix, I cut the cable of the power supply and connected it to the Sorensen power supply +5V on the rack.
Now, the RFM bypass box is powered up, but some LEDs are red, which looks like a bad sign.
I restarted all the FE computers, but this time I got errors during the execution of the startup commands in the VxWorks machines.
The errors are "General Protection Fault" or "Invalid Opcode".
The linux machines do not show errors but still the status lights in EPICS are red.
We need Alex's help. He did not answer the phone, so Alberto left a voice mail.
  1202   Tue Dec 23 10:35:40 2008 YoichiUpdateComputersRFM network breakdown mostly fixed
Rana, Rolf, Alberto, Yoichi

The source of the problem was the RFM bypass box, as expected.
Rana pointed out that the long cable I used to bring the 5V from the Sorensen to the box
may cause a large voltage drop considering that the box is sucking ~3A.
So we connected the cable to another power supply (5V/5A linear power supply).
Then the LEDs on the bypass box turned green from red, and everything started to work.

A weired thing is that when I connected the cable to the wrong terminals of the power supply which
have lower current supply capabilities, the supply voltage dropped to 3V, but still the LEDs on the bypass box
turned green. This means the bypass box can live with 3V.
I noticed that there is a long cable from the Sorensen to the cross connect on the side of the rack, where I
connected my cable to the bypass box. This long cable had somewhat large resistance (1 or 2 Ohms) and dropped
the supply voltage to less than 3V ?
Anyway, the bypass box is now on a temporary power supply. Alberto was assigned a task to find a replacement power
supply.

There are two remaining problems.
c1susvme1 fails to start often claiming a DMA error on a Pentek. After several attempts, you can start the machine,
but after a while (1 hour ?) it fails again.
op340m is not responding to ssh login. It responds to ping.
We hooked up a monitor and keyboard (USB because the machine does not have a PS/2 port) to it and rebooted.
At the boot, it briefly displays a message "No keyboard, try TTYa", but after that no display signal.
Steve found me a serial cable. I will try to login to the machine using the serial port.

  1203   Wed Dec 24 10:33:24 2008 YoichiUpdateComputersSeveral fixes. Test point problem remains.
Yesterday, I fixed several remaining problems from the power failure.

I found a LEMO cable connecting the timing board to the Penteks was lose on the c1susvme1 crate.
After I pushed it in, the DMA error has not occured on c1susvme1.

I logged into op340m using a Null Modem Cable.
The computer was failing to boot because there were un-recoverable disk errors by the automatic fsck.
I run fsck manually and corrected some errors. After that, op340m booted normally and now it is working fine.
Here is the serial communication parameters I used to communicate with op340m:
>kermit      (I used kermit command for serial communication.)
>set modem type none
>set line /dev/ttyS0     (ttyS0 should be the device name of your serial port)
>set speed 9600
>set parity none
>set stop-bits 1
>set flow-control none
>connect

After fixing op340m, the MC locked.
Then I reset the HV amps. for the steering PZTs.
Somehow, the PZT1 PIT did not work. But after moving the slider back and forth several times, it started to work.

I reset the mechanical shutters around the lab.

I went ahead to align the mirrors. The X-arm locked but the alignment script did not improve
the arm power.
I found that test points are not available. (diag said test point management not available).
Looks like test point manager is not running. Called Rolf, but could not reach him.
I'm not even sure on which machine, the tp manager is supposed to be run.
Is it c0daqawg ?
  1204   Wed Dec 24 12:46:54 2008 YoichiUpdateComputersTest points are back
Rob told me how to restart the test point manager.
It runs on fb40m and actually there is an instruction on how to do that in the Wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/Computer_Restart_Procedures#fb40m

I couldn't find the page because when I put a keyword in the search box on the upper right
corner of the Wiki page and hit "enter", it only searches for titles. To do a full text
search, you have to click on the "Text" button.

Anyway, now the test points are back.
  1206   Mon Dec 29 21:38:57 2008 YoichiUpdateComputersSnapshots of MEDM screens
I wrote scripts to take snapshots of MEDM screens in the background.
These scripts work even on a computer without a physical display attached.
You don't need to have X running.
So now the scripts run on nodus every 5 minutes from cron.
The screen shots are saved in /cvs/cds/caltech/statScreen/images/

There is a wiki page for the scripts.
http://lhocds.ligo-wa.caltech.edu:8000/40m/captureScreen.sh

Someone has to make a nice web page summarizing the captured images.
  1207   Mon Dec 29 21:51:02 2008 YoichiConfigurationComputersWeb server on nodus
The apache on nodus has been solely serving for the svn web access.
I changed the configuration and all files under /cvs/cds/caltech/users/public_html/ can be seen under
https://nodus.ligo.caltech.edu:30889/

The page is not password protected, but you can add a protection by putting an appropriate .htaccess
in your directory.
For the standard LVC password, put the following in your .htaccess
AuthType Basic  
AuthName "LVC password"
AuthUserFile /cvs/cds/caltech/apache/etc/LVC.auth
Require valid-user
  1221   Fri Jan 9 17:30:10 2009 KakeruUpdateComputersSnapshots of MEDM screens
I wrote a web page which shows snapshots of MEDM screens generated by Yoich's script (e-log #1206).
https://nodus.ligo.caltech.edu:30889/medm/screenshot.html
This page refreshes itself every 5 minutes automatically.

The .html file is generated by /cvs/cds/caltech/statScreen/bin/genHtml.pl
This script generates the .html file contains snapshots listed on /cvs/cds/caltech/statScreen/etc/medmScreens.txt every 5 minutes with cron.
When you wont to display other screens, please edit this .txt file and wait 5 minutes!


To make thumbnails, I wrote /cvs/cds/caltech/statScreen/bin/genThumbnail.pl
This script reads /cvs/cds/caltech/statScreen/etc/medmScreens.txt, too.
(Sometimes, it makes thumbnails with larger storage...)


Quote:
I wrote scripts to take snapshots of MEDM screens in the background.
These scripts work even on a computer without a physical display attached.
You don't need to have X running.
So now the scripts run on nodus every 5 minutes from cron.
The screen shots are saved in /cvs/cds/caltech/statScreen/images/

There is a wiki page for the scripts.
http://lhocds.ligo-wa.caltech.edu:8000/40m/captureScreen.sh

Someone has to make a nice web page summarizing the captured images.
  1224   Tue Jan 13 11:10:42 2009 robConfigurationComputersconlogger restarted
unknown how long it's been down.
  1231   Fri Jan 16 11:28:54 2009 YoichiUpdateComputersLab. laptop needs wireless lan driver update
One of the lab. laptops (belladonna) cannot connect to the network now.
I guess this was caused by someone clicked the update icon and unknowingly updated the kernel, which resulted in the wireless lan driver malfunctioning.
It was using a Windows driver through ndiswrapper.
Someone has to fix it.
  1235   Fri Jan 16 18:33:54 2009 YoichiSummaryComputersc1lsc rebooted to fix 16Hz glitches
Kakeru, Yoichi

There were 16Hz harmonics in the PD3 and PD4 channels even when there is no light falling on it.
Actually, even when the connection to the ADC was removed, the 16Hz noise was still there.

Rob suggested that this might be digital problem, because data is sent to the daq computer very 1/16 of a second.

We restarted c1lsc and the problem went away.
  1238   Mon Jan 19 15:10:37 2009 YoichiHowToComputersloadLIGOData a GUI for mDV
I installed loadLIGOData, a product of my weekend project, in /cvs/cds/caltech/apps/loadLIGOData.
This is a Matlab GUI for getting data from nds servers. It uses a modified version of mDV to retrieve data.
You can choose and download LIGO data into Matlab quickly.
I also wrote a GUI to plot the downloaded data easily.
With this GUI, you can plot multiple channel data in a single figure, which is useful to identify the cause for a lock loss etc.
You can change the time axis labels to UTC or Local time in stead of GPS second.

You can run it by typing loadLIGOData in a terminal of a linux machine.
A brief explanation of how to use it is written here:
http://lhocds.ligo-wa.caltech.edu:8000/40m/loadLIGOData

At this moment, data from test points cannot be retrieved properly (of course there is no way to go back to the past for test points.
But still we should be able to get data in real time.). I'll try to find a solution.
Attachment 1: loadLIGOData.png
loadLIGOData.png
Attachment 2: plotLigoData.png
plotLigoData.png
  1239   Mon Jan 19 18:21:41 2009 ranaUpdateComputersloadLIGOData a GUI for mDV
The tool is very nice; I looked at the seismic trend for 16 days (attached).
However, it gives some kind of error when trying to get Hanford or Livingston data.
Attachment 1: a.png
a.png
  1240   Tue Jan 20 15:28:42 2009 YoichiUpdateComputersloadLIGOData a GUI for mDV

Quote:
The tool is very nice; I looked at the seismic trend for 16 days (attached).
However, it gives some kind of error when trying to get Hanford or Livingston data.


I fixed it.
You have to click "Load channels" button when you select a new site.
I plotted one minute of MC_F signals from H1, H2, L1 and 40m.
Looks like L1 MC was swinging a lot.
Attachment 1: MC_F.png
MC_F.png
  1255   Wed Jan 28 12:51:32 2009 YoichiUpdateComputersMegatron is dying
For the past three days, Megatron has been making a huge noise. Sounds like a fan is failing.
There is an LED with "!" sign on the front panel. It is now orange. Looks like some kind of warning.
We can login to the machine. "top" shows the CPU load is almost zero.
Shall we try rebooting it ?
  1258   Thu Jan 29 16:50:53 2009 josephb, albertoConfigurationComputersMegatron fixed
The warning light on megatron and the fans at full speed were fixed by not just power cycling, but completely unplugging megatron from power, waiting for a minute, and then reconnecting the power cables.

Apparently, the Sunfire X4600s at Hanford have also had intermittent problems, which required unplugging the machines completely. In their case, they were unable to access the machine normally, nor could they access the the Intergrated Lights Out Manager (ILOM).

Here, we could interact normally with megatron, but were unable to connect to the ILOM. We were able to get BIOS, but unable to change any of the setting for the ILOM connection. Since the ILOM is a seperate processor and effectively always on, even when the power light is off, a normal shutdown won't reset it. Hence the need to completely disconnect the power supply.
  1261   Fri Jan 30 17:30:31 2009 Alberto, JosephbConfigurationComputersNew computer Ottavia set up
Alberto, Joseph,

Today we installed the computer that some time ago Joe bought for his GigE cameras. It was baptized "OTTAVIA".

Ottavia is black, weighs about 20 lbs and it's all her sister, Allegra (who also pays for bad taste in picking names). She runs an Intel Core 2 Quad and has 4GB of RAM. We expect much from her.

Some typical post-natal operations were necessary.

1) Editing of the user ID
  • By means of the command "./usermod -u 1001 controls" we set the user ID of the user controls to 1001, as it is supposed to be.

2) Connection to the Martian network
  • Ottavia was given IP address 131.215.113.097 by editing the file /etc/sysconfig/networ-scripts/ifcfg-eth0 (we also edited the netmask and the gateway address as in the Wiki)
  • In linux1, which serves as name server, in the directory /var/named/chroot/var/named, we modified both the IP-to-name and name-to-IP register files 131.215.113.in-addr.arpa.zone and 131.215.11in-addr.martian.zone.
  • We set the file /etc/resolv.conf so that the OS knows who is the name server.

3) Mounting of the /cvs/cds path
  • We created locally the empty directories /cvs/cds
  • We edited the files /etc/fstab adding the line "linux1:/home/cds /cvs/cds nfs rw,bg,soft 0 0"
  • We implemented the common variables of the controls environment by sourcing the cshrc.40m: in the file /home/controls/.cshrc we added the two lines "source /cvs/cds/caltech/cshrc.40m" and "setenv PATH ${PATH}:/cvs/cds/caltech/apps/linux64/matlab/bin/"
  1268   Tue Feb 3 15:01:38 2009 AlbertoFrogsComputersmegatron slow?

I notice that Megatron is slower than any other computer in running code that invokes optickle or looptickle (i.e. three times slower than Ottavia). Even without the graphics.

Has anyone ever experienced that?

  1275   Thu Feb 5 16:21:07 2009 JenneFrogsComputersBelladonna connects to the wireless Martian network again

Symptoms:  Belladonna could not (for a while) connect to the wireless network, since there was a driver problem for the wireless card.  This (I believe) started when Yoichi was doing updates on it a while back.

The system: Belladonna is a Dell Inspirion E1505 laptop, with a Broadcom Corporation Dell Wireless 1390 WLAN Mini-PCI Card (rev 01)

Result:  Belladonna now can talk to it's wireless card, and is connected to the Martian network.  (MEDM and Dataviewer both work, so it must be on the network.)

 

What I did:

0.  Find a linux forum with the following method:  http://www.thelinuxpimp.com/main/index.php?name=News&file=article&sid=749

The person who wrote this has the exact same laptop, with the same wireless card.

1.  Get a new(er) version of ndiswrapper, which "translates" the Windows Driver for the wireless card to Linux-ese.  Belladonna previously was using ndiswrapper-1.37.

$wget http://nchc.dl.sourceforge.net/sourceforge/ndiswrapper/ndiswrapper-1.42.tar.gz

2.  Put the ndiswrapper in /home/controls/Drivers, and installed it.

$ndiswrapper -i bcmwl5.inf  3.  Get and put the Windows driver in /home/controls/Drivers/WiFi

$wget http://ftp.us.dell.com/network/R140747.EXE
4. Unzip the driver $unzip -a R140747.EXE

5.  Make Fedora use ndiswrapper

$ndiswrapper -m

$modprobe ndiswrapper

6. Change some files to make everything work:

/etc/sysconfig/wpa_supplicant      CHANGE FROM: DRIVERS="-Dndiswrapper"     CHANGE TO: DRIVERS="-Dwext"

/etc/sysconfig/network-scripts/ifcfg-wlan0      CHANGE FROM: BOOTPROTO=none      CHANGE TO: BOOTPROTO=dhcp

/etc/rc.d/init.d/wpa_supplicant        CHANGE FROM: daemon $prog -c $conf $INTERFACES $DRIVERS -B        CHANGE TO: daemon $prog -c$conf $INTERFACES $DRIVERS -B

6.  Restart things

$service wpa_supplicant restart
$service network restart

7.  Restart computer (since it wasn't working after 1-6, so give a restart a try)

8. Success!!!  MEDM and Dataviewer work without any wired internet connection => wireless card is all good again!

  1286   Mon Feb 9 17:09:51 2009 YoichiUpdateComputersA bunch of updates for the network GPIB stuff.
During the work on ISS, we noticed that netgpibdata.py is very unreliable for SR785.
The problem was caused by flakiness of the "DUMP" command of SR785, which dumps the data from the analyzer to the client.
So I decided to use other GPIB commands to download data from SR785. The new method is a bit slower but much more reliable.

I also rewrote netgpibdata.py and related modules using a new class "netGPIB".
This class is provided by netgpib.py module in the netgpibdata directory. If you use this class for your python program, all technical details and dirty tricks are hidden in the class methods. So you can concentrate on your job.
Since python can also be used interactively, you can use this class for a quick communication with an GPIB instrument.

Here is an example.
>ipython #start interactive python
>>import netgpib #Import the module
>>g=netgpib.netGPIB('teofila',10) #Create a netGPIB object. 'teofila' is the hostname of
#the GPIB-Ethernet converter. 10 is the GPIB address.
>>g.command('ACTD0') #Send a GPIB command "ACTD0". This is an SR785 command meaning "Change active display to 0".
>>ans=g.query('DFMT?') #If you expect a response from the instrument, use query command.
#For SR785, "DFMT?" will return the current display format (0 for single, 1 for dual).
>>g.close() #Close the connection when you are done.

Sometimes, SR785 gets stuck to a weird state and netgpibdata.py may not work properly. I wrote resetSR785.py command to reset it remotely.
Wait for 30sec after you issue this command before doing anything.

I wrote two utility commands to perform measurements with SR785 automatically.
TFSR785.py commands SR785 to perform a transfer function measurement.
SPSR785.py will execute spectrum measurements.
You can control various parameters (bandwidth, resolution, window, etc) with command-line options.
Run those commands with '-h' for help.
It is recommended to use those commands even when you are in front of the analyzer, because they save various measurement parameters (input coupling, units, average number, etc) into a parameter file along with the measured data. Those parameters are useful but recording them for each measurement by hand is a pain.
  1294   Wed Feb 11 15:01:47 2009 josephbConfigurationComputersAllegra

So after having broke Allegra by updating the kernel, I was able to get it running again by copying the xorg.conf.backup file over xorg.conf in /etc/X11.  So at this point in time, Allegra is running with generic video drivers, as opposed to the ATI specific and proprietary drivers.

  1303   Sat Feb 14 16:15:19 2009 robConfigurationComputersc1susvme1

c1susvme1 is behaving weirdly.  I've restarted it several times but its computation time is hanging out around 260 usec, making it useless for suspension control and locking.  I also found a PS/2 keyboard plugged in, which doesn't work, so I unplugged it.  It needs to be plugged into a PS/2 keyboard/mouse Y-splitter cable. 

  1307   Mon Feb 16 00:43:46 2009 ranaUpdateComputersmedm directory wiped on nodus
I accidentally did an 'rm -rf' on the medm directory in nodus, instead of on my laptop as was intended.

I then did an svn checkout. So everything should be current as of the last update, but I am sure that
we have not done a checkin on all of the latest screen enhancements. So...we may have to revert to the
Sunday morning tar to get the latest changes back.
  1310   Mon Feb 16 15:54:07 2009 YoichiUpdateComputersmedm directory wiped on nodus

Quote:
I accidentally did an 'rm -rf' on the medm directory in nodus, instead of on my laptop as was intended.

I then did an svn checkout. So everything should be current as of the last update, but I am sure that
we have not done a checkin on all of the latest screen enhancements. So...we may have to revert to the
Sunday morning tar to get the latest changes back.


Indeed, some changes to the medm directory I made were lost.
It was my fault not to check-in those changes.
I asked Alan to restore the directory from the daily rsync backup.
However, the backup job executed this morning have already overwritten the previous (good) backup with the current (bad) medm directory, which Rana restored from the svn. Alan will ask Stuart and Phil if there is still older backup remaining somewhere.

Anyway, I realized that we should stop the backup cron job whenever you think you made a mistake on /cvs/cds/ directory to prevent unwanted overwriting.
The procedure is:
(1) Login to fb40m
(2) Type 'crontab -e'. Emacs will open up in the terminal.
(3) Comment out the backup job (insert # at the beginning of the line containing /cvs/cds/caltech/scripts/backup/rsync.backup ).
(4) Save the file (Ctrl-x Ctrl-s) and exit (Ctrl-x Ctrl-c).

I will post this information on the wiki.
  1311   Mon Feb 16 16:26:29 2009 robUpdateComputersmedm directory wiped on nodus

Quote:

Quote:
I accidentally did an 'rm -rf' on the medm directory in nodus, instead of on my laptop as was intended.

I then did an svn checkout. So everything should be current as of the last update, but I am sure that
we have not done a checkin on all of the latest screen enhancements. So...we may have to revert to the
Sunday morning tar to get the latest changes back.


Indeed, some changes to the medm directory I made were lost.
It was my fault not to check-in those changes.
I asked Alan to restore the directory from the daily rsync backup.
However, the backup job executed this morning have already overwritten the previous (good) backup with the current (bad) medm directory, which Rana restored from the svn. Alan will ask Stuart and Phil if there is still older backup remaining somewhere.

Anyway, I realized that we should stop the backup cron job whenever you think you made a mistake on /cvs/cds/ directory to prevent unwanted overwriting.
The procedure is:
(1) Login to fb40m
(2) Type 'crontab -e'. Emacs will open up in the terminal.
(3) Comment out the backup job (insert # at the beginning of the line containing /cvs/cds/caltech/scripts/backup/rsync.backup ).
(4) Save the file (Ctrl-x Ctrl-s) and exit (Ctrl-x Ctrl-c).

I will post this information on the wiki.


We should change the rsync script so that it does not delete stuff. Maybe it can keep deleted stuff for 6 months or something.
ELOG V3.1.3-