40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 81 of 341  Not logged in ELOG logo
ID Date Author Type Categoryup Subject
  897   Fri Aug 29 11:01:49 2008 josephbConfigurationComputersAttempt to change a channel gain in ICS-110B
As noted earlier by Rana, I was playing around with the /cvs/cds/caltech/chans/daq/C1IOOF.ini file with help from Rob. I had made a backup before hand and saved it as C1IOOF.ini.Aug-28-2008. (I have since been informed that C1IOOF.ini.082808 would have been prefered as a name).

We had been trying to up the gain in the C1: PSL-ISS_INMONPD_F in order to do a very low power PMC sweep, in an attempt to get clean modes for fitting. Initially we pressed the reconfig button on the C0DAQ_DETAIL screen, but all that seemed to do was change the Config File CRC. We proceeded to reboot fb40m remotely. However, any change to the ini file (even an extra space at the end of the file) caused a 0x2000 status for C1IOVME16k on the C0DAQ_DETAIL screen. At the time I presumed it was comparing the CRC of the ini-file to something else.

Digging around on in Alex's webspace at http://www.ligo.caltech.edu/~aivanov/ , I found the NDS Access page, which indicated that 0x2000 was a conflict between the front-end and frame builder .ini files.

"There is also status bit 0x2000 which gets added when the DCU configuration is different in front-end and frame builder. That is you can change and .ini file an then reload DAQ configuration with Epics button, which reconfigures the front-end, but leaves frame builders with invalid old configuration. They will detect this change and set the status to 0x2000 to indicate this condition. You will have to restart frame builders to pick up new .ini file and set status back to zero for the affected DCU."

It was when I was going to try reseting the c1iovme via the C0DAQ_RFMNETOWRK medm screen that we realized the EPICS controls were not responding properly. The .ini file was returned to its original form, and mass reboots commenced.
  898   Fri Aug 29 11:05:11 2008 josephbSummaryComputersc1asc was down this morning
I had to manually reboot c1asc this morning, as for some unknown reason its status was red, and the fiber lights on the board were status:red, sig det:amber, own data: nothing. Shut the crate down, turned it back on, heard a beep, then followed wiki reboot instructions. Seems to be working now.
  899   Fri Aug 29 12:41:26 2008 josephb, EricConfigurationComputersMore front ends moved to new network
Used Cat6 cables to finish moving all the front ends in 1Y4 and 1Y5 over to the new GigE network switches, specifically to the switch in 1Y6. This included the ones labeled c1susvme2, c1sosvme, and c1dscl1epics0.
  900   Fri Aug 29 12:43:44 2008 josephbSummaryComputersc1susvme1 down
Around noon today, c1susvme was having problems. The C0DAQ_RFMNETWORK light was red. The status light was off, the sig det light was amber and the own data light was green. I could also ssh in, but could not not run startup. I switched off the watchdogs for c1susvme2 (the watchdogs for c1susvme1 had already been tripped), and manually power cycled the crate.

However, when c1susvme1 when it came back up it had not mounted the usual cvs/cds/ directories. c1susvme2 did however. c1susvme1 has been on the new network for awhile, while c1susvme2 was switch over today. So apparently switching networks doesn't help this particular problem.

I did a remote reboot of c1susvme1, and it came up with the correct files mounted. Both machines ran their approriate startup.cmd files and are currently green.
  917   Wed Sep 3 19:09:56 2008 YoichiDAQComputersc1iovme power cycled
When I tried to measure the sideband power of the FSS using the scan of the reference cavity, I noticed that the RC trans. PD signal was not
properly recorded by the frame builder.
Joe restarted c1iovme software wise. The medm screen said c1iovme is running fine, and actually some values were recorded by the FB.
Nonetheless, I couldn't see flashes of the RC when I scanned the laser frequency.
I ended up power cycling the c1iovme and run the restart script again. Now the signals recorded by c1iovme look fine.
Probably, the DAQ boards were not properly initialized only by the software reset.
I will re-try the sideband measurement tomorrow morning.
  922   Thu Sep 4 11:33:25 2008 josephb, Eric, JenneConfigurationComputersAttempt to increase gain for C1:PSL-ISS_INMONPD_F via 110B
We were attempting to increase the gain on the channel C1:PSL-ISS_INMONPD_F in preparation to do a scan of the PMC at very low input power.

We started by adding a line to the C1:IOOF.ini file in /cvs/cds/caltech/chans/daq/ under that channel that said "gain=10.0". Before touching anything, the channel was outputting around 4000 counts.

We hit the reconfig button for c1iovme16k, then rebooted c1iovme (which turned out to do nothing) and then the framebuilder, in a method consistent with the wiki. This turned out to put the channel in an odd state, where it was showing very rapid, random spikes, virtually but still around 4000ish counts. We returned the file back to its original format, hit reconfig, and then rebooted the framebuilder. The channel however, was still behaving in the same broken way.

After poking around the PSL table, looking at some direct outputs, we came back and rebooted c1iovme and the framebuilder again, which fixed the channel, such that it was reading out correctly. Taking this as a sign that maybe we should reboot the framebuilder, then c1iovme to get the channel to load changes, we changed the file again to have "gain=10.0". Upon reboot of the framebuilder, the channel was still reading out fine, but at the same level. So we continued with the reboot of c1iovme. This still had no effect on the channel output.

The ini file has been set back at this point, however since Yoichi is working, I'm holding off doing a reconfig and reboot on the framebuilder until later.
  925   Thu Sep 4 16:24:56 2008 ranaConfigurationComputersAttempt to increase gain for C1:PSL-ISS_INMONPD_F via 110B

Quote:
We were attempting to increase the gain on the channel C1:PSL-ISS_INMONPD_F in preparation to do a scan of the PMC at very low input power.

According to the Wikipedia, certain esoteric mathematical
operations lead to the result that 4000 x 10 > 32768.
  932   Fri Sep 5 09:56:14 2008 josephb, EricConfigurationComputersFunny channels, reboots, and ethernet connections
1) Apparently the I00-ICS type channels had gotten into a funny state last night, where they were showing just noise, exactly when Rana changed the accelerometer gains and did major reboots. A power cycle of the c1ioo crate and appropriate restarts fixed this.

2) c1asc looks like it was down all night. When I walked out to look at the terminal, it claimed to be unable to read the input file from the command line I had entered the previous night ( < /cvs/cds/caltech/target/c1asc/startup.cmd). In addition we were unable to telnet in, suggesting an ethernet breakdown and inability to mount the appropriate files. So we have temporarily run a new cat6 cable from the c1asc board to the ITMX prosafe switch (since there's a nice knee high cable tray right there). One last power cycle and we were able to telnet in and get it running.
  941   Thu Sep 11 11:29:14 2008 josephbConfigurationComputersFinal netgear switch in place in 1Y2
I've placed the final (of 4) Netgear prosafe 24 port switch at the very top of 1Y2. At that location, there are no holes left to screw into, so it has 4 rubber feet and is sitting on the top most signal generator. It has been plugged in and connected to the control room hub with a labeled cat6 ethernet cable.

Its IP address has been set to 131.215.113.253, and has the usual controls password if using the "Smart Wizard Discovery Tool" which comes on the Netgear CD. The CD can be found in the Equipment manuals filing cabinet under Netgear. This program unfortunately only runs on a window PC.

To Do: Fix the C1:ASC ethernet connection which is currently coming straight out the front door and connected to the 1X4 switch (again through the front door).
  948   Mon Sep 15 14:00:52 2008 josephbConfigurationComputers1Y9 Hub and C1asc
The 1Y9 switch is now using a labeled Cat6 cable in cable trays to connect to the main switch in the offices. In addition, the c1asc cable which had been coming out the door was fixed last Friday, and is now labeled, going out the top and connects to the hub in 1Y2.

Note: Do not connect new ethernet cable from switch to switch without disconnecting the old cable to the rest of the network - this tends to make the Ethernet network unhappy with white flashing alarms.
  961   Thu Sep 18 01:14:23 2008 robSummaryComputersEPICS BAD

Somehow the EPICS system got hosed tonight. We're pretty much dead in the water till we can get it sorted.

The alignment scripts were not working: the SUS_[opt]_[dof]_COMM CA clients were having consistent network failures.
I figured it might be related to the network work going on recently--I tried rebooting the c1susaux (the EPICS VME
processor in 1Y5 which controls all the vertex angle biases and watchdogs). This machine didn't come back after
multiple attempts at keying the crate and pressing the reset button. All the other cards in the crate are displaying
red FAIL lights. The MEDM screens which show channels from this processor are white. It appears that the default
watchdog switch position is OFF, so the suspensions are not receiving any control signals. I've left the damping loops
off for now. I'm not sure what's going on, as there's no way to plug in a monitor and see why the processor is not coming up.

A bit later, the c1psl also stopped communicating with MEDM, so all the screens with PSL controls are also white. I didn't try
rebooting that one, so all the switches are still in their nominal state.
  963   Thu Sep 18 12:16:01 2008 YoichiUpdateComputersEPICS BACK

Quote:

Somehow the EPICS system got hosed tonight. We're pretty much dead in the water till we can get it sorted.


The problem was caused by the installation of a DNS server into linux1 by Joe.
Joe removed /etc/hosts file after running the DNS server (bind). This somehow prevented proper boot of
frontend computers.
Joe and I confirmed that putting back /etc/hosts file resolved the problem.
Right now, the DNS server is also running on linux1.

We are not sure why /etc/hosts file is still necessary. My guess is that the NFS server somehow reads /etc/hosts
when he decides which computer to allow mounting. We will check this later.

Anyway, now the computers are mostly running fine. The X-arm locks.
The Y-arm doesn't, because one of the digital filters for the Y-arm lock fails to be loaded to the frontend.
I'm working on it now.
  964   Thu Sep 18 13:05:05 2008 YoichiUpdateComputersEPICS BACK

Quote:

The Y-arm doesn't, because one of the digital filters for the Y-arm lock fails to be loaded to the frontend.
I'm working on it now.


Rob told me that the filter "3^2:20^2" is switched on/off dynamically by the front end code for the LSC.
Therefore, the failure to manually load it was not actually a problem.
The Y-arm did not lock just because the alignment was bad.
Now the Y-arm alignment is ok and the arm locks.
  965   Thu Sep 18 14:36:54 2008 josephbConfigurationComputersName server and Epics
The problems Rob was experiencing last night was due to part of the setup (or rather testing of the setup) of the new nameserver running on linux1.

The name server was setup on linux1 by doing the following:

1) Installed xorg-x11-xauth via yum which was necessary to get remote x windows to work in linux1

2) Installed xorg-x11-fonts-Type1 in order to get the gui system-config-* programs to work

3) Ran system-config-bind, which created a default set of nameserver files. I unfortunately didn't understand the gui all that well, so I manually edited and added files to these base ones. The base files were generated in /var/named/chroot/etc/ and /var/named/chroot/var/named.

4) I added martian.zone and 113.215.131.in-addr.arpa.zone, named.conf.local, and edited named.conf so it loaded named.conf.local. The martian.zone file acts a forward look up (i.e. give it a name and it returns an IP number like 131.215.113.20). The 113.215.131.in-addr.arpa.zone acts as a reverse look up (i.e. give it an IP number like 131.215.113.20 and it tells you the name). The file named.conf.local merely points to these two files.

Note: One can add or change IP lookup by simply updating these two files. The format should be obvious from the files.

5) I specifically ssh'd in as root to linux1 (using su wasn't sufficient) and then typed "service named start" (without quotes). You can also use "restart" or "stop" instead of "start". This started the name server, giving an [Ok] message.

6) I edited the /etc/resolve.conf file on linux1 so that it pointed to itself first ("nameserver 127.0.0.1" at the top of the file). I also added the line "search martian", which allows one to simply use linux1 as opposed to linux1.martian.

I also edited the /etc/resolve/conf file on linux2, and it seems to resolve names fine.

7) And here is where I broke things. As a test, I moved /etc/hosts to /etc/hosts.bak, and then tested to see if names were being resolved correctly. By using the command host, I determined they were in fact working. I also tested with ssh.

However, something basic didn't like me moving the hosts file. Apparently when a front-end machine needed to reboot, it wouldn't come back up, without any ability to SSH or telnet into them.

With Yoichi and I did quite a bit of debugging this morning and determined the nameserver itself isn't conflicting, merely the lack of the host file was the source of the problem. One theory is that services don't know to go to DNS to resolve host names. I think by modifying the /etc/nsswitch.conf file to include dns as an option for services and other programs, it might work without the host file, however, I'm going to leave that to tomorrow morning which is less likely to interfere with current operations.

As it stands, things are working with the nameserver running and the host file in place.
  966   Thu Sep 18 18:38:14 2008 YoichiHowToComputersHow to compile an SNL code for VxWorks
Dave Barker guided me through how to compile an SNL code into a Motorola 162 CPU object.

Here is the procedure:

(1) You need an account at LHO and a password for ops account at LHO. Contact Dave if you don't have these.

(2) Copy your code (say Particle.st) to the LHO gateway machine.
scp Particle.st username@lhocds.ligo-wa.caltech.edu:/cvs/cds/lho/target/t0sandbox0
(3) Login to lhocds.ligo-wa.caltech.edu
ssh username@lhocds.ligo-wa.caltech.edu
(4) Login to control0
ssh ops@control0
(5) Change directory to the sandbox dir.
cd /cvs/cds/lho/target/t0sandbox0
(6) Prepare for the compilation
setup epics
(7) Edit makefile in the directory. You have to modify a few lines at the end of the file.
There are comments for how to do it in the file.

(8) Compile
make Particle.o
(9) Copy the object file to the 40m target directory
scp Particle.o controls@nodus.ligo.caltech.edu:/cvs/cds/caltech/target/c1psl/

That is it.
  969   Fri Sep 19 00:18:14 2008 ranaUpdateComputerssvn is old
linux2:mDV>ssh nodus
Password:
Last login: Fri Sep 19 00:11:44 2008 from gwave-69.ligo.c
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
nodus:~>c
nodus:caltech>cd apps/
nodus:apps>cd mDV
nodus:mDV>svn update
svn: This client is too old to work with working copy '.'; please get a newer Subversion client
nodus:mDV>whoami
controls
nodus:mDV>uname -a
SunOS nodus 5.9 Generic_118558-39 sun4u sparc SUNW,A70 Solaris
nodus:mDV>pwd
/cvs/cds/caltech/apps/mDV
nodus:mDV>
Frown
  972   Fri Sep 19 09:49:42 2008 YoichiUpdateComputerssvn is old
The problem below is fixed now.
The cause was .svn/entries and .svn/format had wrong version number "9" where it had to be "8".
I changed those files in all the sub-directories. Now svn up runs fine.
I don't know how this version discrepancy happened.



Quote:
linux2:mDV>ssh nodus
Password:
Last login: Fri Sep 19 00:11:44 2008 from gwave-69.ligo.c
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
nodus:~>c
nodus:caltech>cd apps/
nodus:apps>cd mDV
nodus:mDV>svn update
svn: This client is too old to work with working copy '.'; please get a newer Subversion client
nodus:mDV>whoami
controls
nodus:mDV>uname -a
SunOS nodus 5.9 Generic_118558-39 sun4u sparc SUNW,A70 Solaris
nodus:mDV>pwd
/cvs/cds/caltech/apps/mDV
nodus:mDV>
Frown
  973   Fri Sep 19 11:21:45 2008 josephbConfigurationComputersNameserver and Rosalba
I tried modifying the nsswitch.conf file to include going to dns in addition to local files for everything (services, network, etc) and then moving the /etc/hosts file to /etc/hosts.bak. Unfortunately, this still didn't allow front-ends to reboot properly. So I'm not sure what is using the hosts file, but whatever it is, is apparently important. After the test I placed the hosts file back and reverted the nsswitch.conf file.

I also noticed that Rosalba was having problems connecting to the network. This apparently was because I had shut down the dhcp server on the NAT router, as had been discussed at the meeting on Wednesday.

To fix this, I modified the /etc/sysconfig/network-scripts/ifcfg-eth1 file to fix rosalba's ip as 131.215.113.24 (which doesn't seem to be in use). I also updated rosalba's /etc/resolv.conf file to point at linux1's name server, and two additional name servers as well, and added the "search martian" line. I modified the /etc/sysconfig/network-scripts/ifcfg-eth0 file so the built in network card doesn't come up automatically, since its currently not plugged into anything. Lastly, I added rosalba and its IP to linux1's name server files.
  974   Fri Sep 19 11:48:14 2008 steveUpdateComputers old hubs can make one happy
Joseph finds a XIX century bottle neck hub: CentreCOM 3624TR 10Base-T
and happily replaces it with Netgear GS724T 1000Base-T
Attachment 1: P1020934.jpg
P1020934.jpg
  977   Mon Sep 22 16:51:27 2008 YoichiHowToComputersNetwork GPIB
I was able to make the wireless connected GPIB interface work with SR785.
Now you can download data from SR785 through network, wherever it is located.
Say good bye to floppy disks.

I wrote an installation note in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/GPIB

I wrote a new script called "netgpibdata.py" which works similarly as "getgpibdata.py".
It is in the 40m svn. Instructions on how to use it is on the above mentioned wiki page.
  983   Tue Sep 23 00:47:24 2008 YoichiHowToComputersNetwork GPIB

Quote:

I wrote a new script called "netgpibdata.py" which works similarly as "getgpibdata.py".
It is in the 40m svn. Instructions on how to use it is on the above mentioned wiki page.


netgpibdata.py is now installed on the controls machines (/cvs/cds/caltech/scripts/general/netgpibdata/netgpibdata.py).
You can use it like,
netgpibdata.py -i 131.215.113.106 -d AG4395A -a 10 -f spectrum01

In this example, data from Agilent 4395A analyzer at GPIB address 10 connected to the GPIB-LAN box with the IP address 131.215.113.106
is downloaded and saved to spectrum01.dat. The measurement parameters are saved to spectrum01.par.
  990   Thu Sep 25 03:12:13 2008 ranaSummaryComputersconlog and linux1
It would be nice to have conlog from outside. Right now its on linux1 and so its unavailable. To
test it for speed we ran the command line conlog on linux1, linux2, and nodus.

It was slightly faster on nodus than linux1, implying that its not a network speed issue. It was
phenomenally slower on linux2.

I used the command '/sbin/lspci -vvv' to check what network cards are installed where. As it turns
out, linux2 has a GigE card, but linux1, our NFS server, has only a 100 Mbit card:
01:08.0 Ethernet controller: Intel Corporation 82562EZ 10/100 Ethernet Controller (rev 01)
        Subsystem: Intel Corporation Unknown device 304a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min, 14000ns max), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 209
        Region 0: Memory at ff8ff000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at bc00 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

We (Joe) need to buy a GigE card for linux1 and to also set up conlog and conlogger to run on Nodus.
  992   Thu Sep 25 14:03:08 2008 josephbConfigurationComputers 

Quote:

We (Joe) need to buy a GigE card for linux1 and to also set up conlog and conlogger to run on Nodus.


A spare Intel Pro 1000/GT desktop adapter (gigabit ethernet card) has been added to Linux1 and is now using that card to connect to the network.

This was after a slight scare when I somehow reset the bios on Linux1 during the first reboot after adding the card.
After some debugging and discussion with Yoichi, the bios was fixed and the computer works again, with its new faster network connection.

Although we both noted that Linux1 is a rather old machine, with only half a gig of Ram and reaching about 80% capacity on its 58 gigabyte hard drive (raid). Might be worth upgrading in general.

Need to figure out how to install conlog/conlogger programs next...
  997   Fri Sep 26 14:10:21 2008 YoichiConfigurationComputersLab laptops maintenance
The linux laptops were unable to write to the NFS mounted directories.
That was because the UID of the controls account on those compters was different from linux1 and other control room computers.
I changed the UID of the controls account on the laptops. Of course it required not only editing /etc/password but also dealing with
numerous errors caused by the sudden change of the UID. I had to chown all the files/directories in the /home/controls.
I also had to remove /tmp/gconf-controls because it was assigned the old UID.

Whenever we add a new machine, we have to make sure the controls account has the same UID/GID as other machines, that is 1001/1001.


I did some cleanups of the laptop environment.
I made dataviewer work on the laptops *locally*. We no longer have to ssh -X to other computers to run dataviewer.
The trick was to install grace using Fedora package by
sudo yum install grace
Then i modified /usr/local/stow_pkgs/dataviewer/dataviewer to change the option to dc3 from "-s fb" to "-s fb40m".
  1006   Mon Sep 29 13:33:39 2008 josephbConfigurationComputersGigabit network finished and conlog available on Nodus
The last 100 Mb unmounted hub has been removed (or at least of the ones I could find). We should be on a fully gigabit network with Cat6 cables and lots and lots of labels.

In other news, the pearl script that runs the web interface on linux1 for the conlog has been copied to /cvs/cds/caltech/apache/cgi-bin/ and is now being pointed to by the apache server on Nodus.

https://nodus.ligo.caltech.edu:30889/cgi-bin/conlog_web.pl
  1015   Wed Oct 1 12:05:58 2008 AlbertoConfigurationComputers"StochMon" added to the Alarm Handler
John, Alberto,

we added the four channels of the RF Amplitude Monitor (aka StochMon) to the Alarm HAndler. First we modified the 40m.alh file just copying some lines and switching the name of the channels to the ones we wanted. Than we also added a few lines to the database file ioo.db in order to define the alrm levels. So far I used just test values for the thresholds of green, yellow and red states and need to update to some reasonable ones. To do that I need to calibrate those EPICS channels. I have the old data saved and I'm now trying to figure out how to properly change the database file.
  1016   Wed Oct 1 12:09:25 2008 AlbertoConfigurationComputers"StochMon" added to the Alarm Handler

Quote:
John, Alberto,

we added the four channels of the RF Amplitude Monitor (aka StochMon) to the Alarm HAndler. So far I used just test values for the thresholds of green, yellow and red states and need to update to some reasonable ones. To do that I need to calibrate those EPICS channels. I have the old data saved and I'm now trying to figure out how to properly change the database file.


John, Yoichi, Alberto

We restarted the C1iool0 computer both directly by the main key and remotely via telnet. We had to do it a couple of times and in one occasion the computer didn't restart properly and had connection problem with the newtowrk. We had to call Alex that did just the same thing, but used the CTRL+X command to reboot. It worked and the Alarm Handler now includes the StochMon.
  1031   Tue Oct 7 12:17:57 2008 AlbertoConfigurationComputersTime reset on MEDM
Yoichi, Alberto

I noticed the MEDM screen time was about 7 minutes ahead of the right time. The time on MEDM is read on channel C0:TIM-PACIFIC_STRING which takes it from the C1VCU-EPICS computer. Yoichi found that that computer did not have the right time because one of the startup scripts, ntpd, which are contained in the directory /etc/init.d/ for some reason did not start. So restring it by typing ./ntpd start updated the time on that computer and fixed the problem.
  1033   Wed Oct 8 12:35:56 2008 josephbConfigurationComputersNew Network diagram for the 40m
Attached is a pdf of the new network diagram for the 40m after having removed all of the old hubs.
Attachment 1: 40m_network_10-07-08.pdf
40m_network_10-07-08.pdf
  1038   Fri Oct 10 00:34:52 2008 robOmnistructureComputersFEs are down

The front-end machines are all down. Another cosmic-ray in the RFM, I suppose. Whoever comes in first in the morning should do the all-boot described in the wiki.
  1039   Fri Oct 10 10:20:42 2008 AlbertoOmnistructureComputersFEs are down

Quote:

The front-end machines are all down. Another cosmic-ray in the RFM, I suppose. Whoever comes in first in the morning should do the all-boot described in the wiki.


Yoichi and I went along the arms turning off and on all the FE machines. Then, from the control room we rebooted them all following the procedures in the wiki. Everything is now up again.

I restored the full IFO, re-locked the mode cleaner.
  1040   Fri Oct 10 13:57:33 2008 AlbertoOmnistructureComputersProblems in locking the X arm
This morning for some reason that I didn't clearly understand I could not lock the Xarm. The Y arm was not a problem and the Restore and Align script worked fine.

Looking at the LSC medm screen something strange was happening on the ETMX output. Even if the Input switch for c1:LSC-ETMX_INMON was open, there still was some random output going into c1:LSC-ETMX_INMON, and it was not a residual of the restor script running. Probably something bad happened this monring when we rebooted all the FE computers for the RFM network crash that we had last night.

Restarting the LSC computer didn't solve the problem so I decided to reboot the scipe25 computer, corresponding to c1dcuepics, that controls the LSC channels.

Somehow rebooting that machine erased all the parameters on almost all medm screens. In particular the mode cleaner mirrors got a kick and took a while to stop. I then burtrestored all the medm screen parameters to yesterday Thursday October 9 at 16:00. After that everything came back to normal. I had to re-lock the PMC and the MC.

Burtrestoring c1dcuepics.snap required to edit the .snap file because of a bug in burtrestore for that computer wich adds an extra return before the final quote symbol in the file. That bug should be fixed sometime.

The rebooting apparently fixed the problem with ETMX on the LSC screen. The strange output is not present anymore and I was able to easily lock the X arm. I then run the Align and the Restore full IFO scripts.
  1041   Fri Oct 10 20:03:35 2008 YoichiConfigurationComputersmedm, dataviewer, dtt on 64 bit linux
I compiled EPICS (base, medm and ezca) and dataviewer for 64 bit linux.
These are installed in /cvs/cds/caltech/apps/linux64/.
I also configured cshrc.40m to make it possible to run the 32bit dtt on 64bit machines.
64bit ligotools is also installed to /cvs/cds/caltech/apps/linux64/ligotools although I haven't tested it extensively.

With those essential tools available for 64bit linux, Joe and I decided to install 64bit CentOS to the new linux machine.
It is named allegra.
Now, medm, dataviewer, dtt, awg, foton and ezca commands all work on rosalba and allegra.
I put some notes on how to make things work on 64bit in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/Building_LIGO_softwares_for_64_bit_linux

I compiled dtt (actually the whole GDS tree) for 64bit linux and the build process finished normally.
But somehow dtt does not work properly. It starts on my laptop but does not retrieve data. It crashes on rosalba.
So I had to retreat to 32bit.
  1047   Tue Oct 14 19:18:18 2008 YoichiUpdateComputersBootFest
Rana, Yoichi

Most of the FE computers failed around the lunch time.
We power cycled those machines and now all of them are up and running.
I confirmed that the both arms lock.
Now the IFO is in "Restore last auto-alignment" status.
  1055   Fri Oct 17 16:43:10 2008 YoichiConfigurationComputersmcup/down scripts on linux
I made some changes to /cvs/cds/caltech/medm/c1/ioo/cmd/medmMCup and medmMCdown so that those can be run from medm on linux machines.
  1059   Mon Oct 20 15:02:00 2008 YoichiConfigurationComputers/cvs/cds restored
I moved missing files in /cvs/cds restored by Alan and Stuart to the original locations.
I confirmed autoburt runs, and dtt, which had also been having trouble running, runs ok now.

I found an interesting piece of evidence on allegra, our new 64bit linux machine.
In the Trash of controls Desktop on that machine, there is /cvs/cds/vw/ directory.
I remember that when I last time emptied the trash bin on the machine (yesterday), it took somewhat long time.
Too bad that I did not pay attention to what was actually in the Trash, but now I have a feeling that in the Trash were
missing /cvs/cds/* directories.
While emptying the Trash, I encountered several errors saying permission denied or something like that, and skipped those files.
Sometimes, when you move something from NFS mounted directories to the Trash, you get this kind of errors.
So my guess is that someone accidentally (or intentionally) moved /cvs/cds/* except for "caltech" to the Trash of allegra.
And I completely removed them carelessly.
  1060   Mon Oct 20 16:18:00 2008 AlanConfigurationComputers/cvs/cds restored

Quote:
I moved missing files in /cvs/cds restored by Alan and Stuart to the original locations.
I confirmed autoburt runs, and dtt, which had also been having trouble running, runs ok now.

I found an interesting piece of evidence on allegra, our new 64bit linux machine.
In the Trash of controls Desktop on that machine, there is /cvs/cds/vw/ directory.
I remember that when I last time emptied the trash bin on the machine (yesterday), it took somewhat long time.
Too bad that I did not pay attention to what was actually in the Trash, but now I have a feeling that in the Trash were
missing /cvs/cds/* directories.
While emptying the Trash, I encountered several errors saying permission denied or something like that, and skipped those files.
Sometimes, when you move something from NFS mounted directories to the Trash, you get this kind of errors.
So my guess is that someone accidentally (or intentionally) moved /cvs/cds/* except for "caltech" to the Trash of allegra.
And I completely removed them carelessly.


In the meantime, I have re-started the nightly backup for /frames/minute-trends
but NOT YET for /cvs/cds ,
since I fear that we'll find another problem and will need to go back to the June 27 backup.
Let's wait a few days for the dust to settle,
and if everyone feels confident that /cvs/cds is ok,
I'll restart the backup of that.

How I restored the files, for the record:

Stuart mounted /archive/backup onto an accessible computer (garrak.ligo.caltech.edu ) and I logged on to controls@nodus and ran this command:

/cvs/cds/caltech/scripts/backup/rsync --rsync-path=/usr/bin/rsync --rsh=/usr/bin/ssh --compress --verbose --archive --hard-links --exclude=caltech/ ajw@garrak.ligo.caltech.edu:/backup/40m/cvs /cvs/cds/recover_20081020

I had to type in my GC password, and it ran for ~20 minutes (would have been much longer had I asked for /cvs/cds/caltech as well!).

you can view the backups by logging on to garrak.ligo.caltech.edu with your GC account:
/backup/40m/cvs/cds/
/archive/frames/trend/minute-trend/40m
  1065   Tue Oct 21 18:19:42 2008 YoichiConfigurationComputersLISO and Eagle installed
I installed LISO, a circuit simulation software, into the control room linux machines.
I also installed a PCB CAD called Eagle to serve as a graphical editor for LISO.
I put a brief explanation in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/LISO

As a demonstration, I made a model of the FSS PC path and did a stability analysis of the op-amps.

The first attachment is the schematic of the model.
You can find the model in /cvs/cds/caltech/apps/linux/eagle/projects/liso-examples/FSS

The second attachment shows the stability analysis plot of the first two op-amps when AD829s are used.
The op-amp model is for the uncompensated AD829. The graph includes the bode plots of the open-loop transfer function of each op-amp.
If the phase delay is more than 360deg (in the plot it is 0 deg because the phase is wrapped within +/-180 deg) at the unity gain frequency,
the op-amp is unstable.
It is clear from the plot that this circuit is unstable. This is consistent with what I experienced when I replaced the chips to AD829 without
compensation.
Unfortunately, I don't have an op-amp model for phase compensated AD829. So I can't make a plot with compensation caps.

The third attachment is the stability analysis of the same circuit with AD797. It also shows that the circuit is unstable at 200MHz, though I
observed oscillation at 50MHz.

Finally, I did an estimate of frequency noise contribution from the noise of AD829.
First I estimated the voltage noise at the output of the board caused by the first AD829 using LISO's noise command.
Then I converted it into the input equivalent noise at the stage right after the mixer by calculating the transfer function
of the circuit using LISO.
Within the control bandwidth of the FSS, this input equivalent noise appears at the mixer output with the opposite sign.
Since we know the calibration factor from the mixer output voltage to the frequency noise, we can convert this into the frequency noise.
The final attachment is the estimated contribution of the AD829 to the frequency noise. As expected, it is negligible.
Attachment 1: FSS_PC_Path.pdf
FSS_PC_Path.pdf
Attachment 2: AD829Stability.png
AD829Stability.png
Attachment 3: AD797Stability.png
AD797Stability.png
Attachment 4: FreqNoiseByAD829.png
FreqNoiseByAD829.png
  1066   Wed Oct 22 09:42:41 2008 AlbertoDAQComputersc1iool0 rebooted and MC autolocker restarted
This morning I found the MC unlocked. The MC-Down script didn't work because of network problems in communicating with scipe7, a.k.a. c1iool0. Telneting to the computer was also impossible so I power cycled it from its key switch. The first time it failed so I repeated it a second time and then it worked.
Yoichi then restarted c1iovme. It was also necessary to restart the MC autolocker script according to the following procedure:
- ssh into op440m
- from op440m, ssh into op340m
- restart /cvs/cds/caltech/scripts/scripts/MC/autolockMCmain40
  1067   Wed Oct 22 12:37:47 2008 josephbUpdateComputersNetwork spreadsheet
Attached in open office format as well as excel format is spreadsheet containing all the devices with IP addresses at the 40m. Please contact me with any corrections.
Attachment 1: 40m_network_10-15-08.ods
Attachment 2: 40m_network_10-15-08.xls
  1070   Wed Oct 22 20:50:30 2008 AlbertoOmnistructureComputersGPS
Today I measured the GPS clock frequency at the output of CLOCK_MON in a board on the same crate where the c1iool0 computer is located. The monitor was connected with a BNC cable to the 10MHz reference input of the frequency counter on top of that rack, where it was used to check the 166MHz coming from one of the Marconi.

The frequency was supposed to be 10MHz but I actually measured 8 MHz. I tracked down the GPS input cable to the board and it turned out to come from one of the 1Y7 rack. Here it was connected to a board with a display that was showing corrupted digits, plus some leds on the front panel were red.

I'm not sure the GPS reference is working properly.
  1076   Thu Oct 23 18:51:19 2008 AlbertoMetaphysicsComputerseLog
I checked it and the latest version of the elog software, the 2.7.5 (we have the 2.6.5) has, among new nice features, the very good ability to fit the entries into the screen width without showing kilometric lines like we see now. Should we upgrade it?
  1088   Fri Oct 24 20:54:41 2008 ranaConfigurationComputerslinux2
I have removed linux2 and its cables from the control room and put it into 1Y3 along with op340m.

When Joe next comes in we can ask him to Cat6 it to the rest of the world, although it already
seems to me that the CDS hub/switch next Alberto's desk is too full and that we need to purchase
a 48 port device for there.
  1098   Tue Oct 28 12:01:01 2008 josephbConfigurationComputerslinux2

Quote:
I have removed linux2 and its cables from the control room and put it into 1Y3 along with op340m.

When Joe next comes in we can ask him to Cat6 it to the rest of the world, although it already
seems to me that the CDS hub/switch next Alberto's desk is too full and that we need to purchase
a 48 port device for there.


Note I still need to remove a fair bit of cabling no longer in use from the Martian network switch next to Alberto's desk. There's actually about 8-10 cables there which show no connectivity and are not being used. So there's really about 33% of the ports open in the control room hub, it just doesn't look like it.

As for linux2, I'll probably just connect it to the 1Y2 or 1Y6 Hubs when I get the chance.
  1101   Thu Oct 30 11:07:25 2008 YoichiUpdateComputersWireless bridges arrived
Five wireless bridges for the GPIB-Ethernet converters arrived.
One of them had a broken AC adapter. We have to send it back.
I configured the rest of the bridges for the 40MARS wireless network.
One of them was installed to the SR785.
I put the remaining ones in the top drawer of the cabinet, on which the label printers are sitting.
You can use those to connect any network device with a LAN port to the 40MARS network.
  1119   Thu Nov 6 22:07:56 2008 ranaConfigurationComputersELOG compile on Solaris
From the ELOG web pages:

Solaris:

Martin Huber reports that under Solaris 7 the following command line is needed to compile elog:

gcc -L/usr/lib/ -ldl -lresolv -lm -ldl -lnsl -lsocket elogd.c -o elogd

With some combinations of Solaris servers and client-side browsers there have also been problems with ELOG's keep-alive feature. In such a case you need to add the "-k" flag to the elogd command line to turn keep-alives off.
  1126   Mon Nov 10 11:32:49 2008 robUpdateComputersc1iscex rebooted

it was running a few cycles late
  1157   Fri Nov 21 21:28:32 2008 ranaSummaryComputersc0daqawg restart
A few minutes after restarting fb0 for the Guralp channels, the DAQAWG lights went red on the DAQ screens.
Why?? I chose revival procedure #3 for c0daqawg from the Wiki and it came back in a couple minutes.
  1159   Mon Nov 24 16:43:34 2008 ranaConfigurationComputersAlex and Jay took away some computers from the racks
I was over at Wilson house and saw Jay and Alex bring in 3 rackmount computers. One was a Sun 4600 and
then there were 2 3U black boxes. I got the impression that these were the data concentrators or
data collectors or framebuilder test boxes. They said that they got these from the 40m and no one was
in the lab to oppose them except for Bob and he didn't put up much of a fight.

Everything looks green on the DAQ Detail and RFM network screens so perhaps everything is OK. Beware.
  1170   Wed Dec 3 12:49:11 2008 jenneUpdateComputerssomething sketchy with NDS ... or something
Never mind...I had forgotten that you have to run mdv_config every time you open matlab, not just every time you boot a computer.

I am not able to get channels using get_data from the mDV toolbox on Allegra, Megatron or Rosalba.

The error I get while running the "hello_world" test program is:
hello_world
setting up configuration...
added paths for nds
added paths for qscan
couldn't add path for matapps_SDE
couldn't add path for matapps_path
couldn't add path for framecache
couldn't add path for ligotools_matlab
added paths for home_pwd
fetching channels for C...
Warning: get_channel_list() failed.
??? Error using ==> NDS_GetChannels
Failed to get channel list.

Error in ==> fetch_nds at 47
eval(['CONFIG.chl.' server ' = NDS_GetChannels(ab);']);

Error in ==> get_data at 100
out = fetch_nds(channels,dtype,start_time,duration);

Error in ==> hello_world at 6
aa = get_data('C1:LSC-DARM_ERR', 'raw', gps('now - 1 hour'), 32);
ELOG V3.1.3-