40m Log, Page 80 of 339
ID   Date   Author   Type   Category   Subject
  658   Fri Jul 11 00:30:24 2008   rob   Metaphysics   Computers   strange SUS controllers

rob, johnnieM

We were hampered early tonight by the fact that someone sneakily turned off the HP RF Amplifier on the AS table.

After that, we were hampered further by mode cleaner strangeness. It would occasionally spontaneously unlock & blow its watchdogs. It never made it through the ontoMCL script (putting DC-CARM onto the MCL). After some investigation, we found that c1susvme1 and c1susvme2 were running stochastically late (SYNC_FE != 0), even though their computation times never got above 61. Also, the end SUS controllers were never late.

Weird.

After rebooting the vertex SUS controllers and the c1lsc, things appear to be working again.
  667   Mon Jul 14 12:43:07 2008   John   Summary   Computers   Restarted fb40m, tpman and c1ass
  682   Wed Jul 16 16:28:14 2008   josephb   Configuration   Computers   Fixed IP address on Switch
Realized today that the change I made back on June 30th to the switch was to the wrong switch. I had disabled the DHCP setting and mislabeled the switch in the control room (which seems to not have affected anything).

I've turned DHCP back on and labeled it correctly using the Netgear "Smartwizard discovery" program.
  695   Fri Jul 18 17:06:20 2008   Jenne   Update   Computers   Computers down for most of the day, but back up now
[Sharon, Alex, Rob, Alberto, Jenne]

Sharon and I have been having trouble with the C1ASS computer the past couple of days. She has been corresponding with Alex, who has been rebooting the computers for us. At some point this afternoon, as a result of this work, or other stuff (I'm not totally sure which) about half of the computers' status lights on the MEDM screen were red. Alberto and Sharon spoke to Alex, who then fixed all of them except C1ASC. Alberto and I couldn't telnet into C1ASC to follow the restart procedures on the Wiki, so Rob helped us hook up a monitor and keyboard to the computer and restart it the old fashioned way.

It seems like C1ASC has some confusion as to what its IP address is, or some other computer is now using C1ASC's IP address.

As of now, all the computers are back up.
  700   Fri Jul 18 19:43:55 2008   Yoichi   DAQ   Computers   PSL fast channels cannot be read by dataviewer
At this moment only the PSL fast channels have trouble.
Rob restarted fb40m and c1IOVME, but this had no effect.
  724   Wed Jul 23 16:31:02 2008   Alberto   Configuration   Computers   Megatron connected
Joe, Rana, Alberto,

We found out the password for Megatron, so we could log in and set a new one; it is now the same as that for controls.
The IP address is 131.215.113.59.

We had to switch to another LAN port to actually connect it.
  725   Wed Jul 23 17:19:48 2008   Alberto   Configuration   Computers   Megatron connected
We changed the IP address. The new one is 131.215.113.95.

Joe, Alberto


Quote:
Joe, Rana, Alberto,

we found out the password for Megatron so we could log in and set a new one so that now it's the same as that for controls.
The IP address is 131.215.113.59.

We had to switch to another LAN ports to actually connect it.
  742   Sat Jul 26 15:09:57 2008   Aidan   Update   Computers   Reboot of op440m

I was reviewing the PSL Overview screen this afternoon and op440m completely froze when I center-clicked on the REF CAVITY TRANSMISSION indicator. It was unresponsive to any keyboard or mouse control. The moon button had no effect to shut the machine down.

Called Alberto in and we logged into op440m from rosalba. From there we logged in as 'root' and ran the shutdown command '/usr/sbin/shutdown -i S -g 1'. The medm screens started disappearing from the op440m display and we were eventually asked to enter System Maintenance Mode. From here we selected RUN LEVEL 5: "state 5: Shut the machine down so that it is safe to remove the power". Following this the machine turned itself off.

We powered it back on, logged back in as controls and restarted the medm screens. Everything seems to be running fine now.
Aidan.
  744   Sun Jul 27 20:49:21 2008   rana   Configuration   Computers   NTP
After Aidan did whatever he did on op440m, I had to restart ntpd. I noticed it didn't actually do
anything so I restarted it by hand with the '-l' option to make a logfile. Essentially, the
problem is that NTPD is not allowed access to the outside world's NTP servers by our NAT router;
this should be fixed.

So for now I set all of the .conf files to point to rana and nodus' IP addresses. According to the
log files, that is successful. Rosalba and Mafalda, however, seem to have correct time but are
looking at rhel.ntp.pool.org and time.nist.gov, respectively. Maybe these have special rules?

For reference, the linux machines' conf files are /etc/ntp.conf
and the solaris machines' conf files are /etc/inet/ntp.conf
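
To check which time servers a given machine is configured to use, the `server` lines of its conf file can be listed. The following is only a minimal sketch in Python: the helper name and the sample addresses are made up for illustration, and it assumes nothing beyond the standard `server <host>` directive of ntp.conf.

```python
def ntp_servers(conf_text):
    """Return the hosts named on 'server' lines of ntp.conf-style text."""
    servers = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "server":
            servers.append(fields[1])
    return servers


# Hypothetical file contents, for illustration only.
sample = """
# time sources
server 192.0.2.10
server 192.0.2.11 iburst
driftfile /var/lib/ntp/drift
"""
print(ntp_servers(sample))
```

On a real machine the text would come from /etc/ntp.conf (linux) or /etc/inet/ntp.conf (solaris) instead of the sample string.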

I also logged into dcuepics (aka scipe25) and did as instructed.
  777   Thu Jul 31 16:11:22 2008   josephb   Configuration   Computers   Matlab on Megatron
Matlab now works on megatron.

I did a few things:

1) Added to the PATH environment variable. Did this in .bash_profile in the /home/controls directory by adding the line

PATH=$PATH:/cvs/cds/caltech/apps/linux64/matlab/bin/
export PATH

This probably should be somewhere further up the line, but I was too lazy to figure it out.

2) Fixed a gateway mistake I had made earlier, so that megatron could use the NAT router and see the outside world, which made yum work.

3) Removed the i386 based libXp and openmotif packages.

4) Installed the x86_64 based libXp and openmotif packages.

Edit: Forgot that I also added the following line to the /etc/fstab file in order to mount the shared code. This was stolen directly from Rosalba's /etc/fstab file. This was so that it could see the matlab code.
linux1:/home/cds/ /cvs/cds nfs rw,bg,soft 0 0
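
For reference, the six whitespace-separated fields of that fstab line can be unpacked as below. This is just an illustrative sketch; `FstabEntry` and `parse_fstab_line` are hypothetical names, not part of any existing tool.

```python
from collections import namedtuple

# Field order follows the standard fstab layout.
FstabEntry = namedtuple(
    "FstabEntry", "device mount_point fs_type options dump fsck_pass"
)


def parse_fstab_line(line):
    """Split one fstab line into its six named fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(device, mount_point, fs_type,
                      options.split(","), int(dump), int(fsck_pass))


entry = parse_fstab_line("linux1:/home/cds/ /cvs/cds nfs rw,bg,soft 0 0")
print(entry)
```

Read this way, the line says: export /home/cds/ from linux1, mount it at /cvs/cds over NFS, read-write, with the bg and soft mount options.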
  779   Fri Aug 1 10:45:46 2008   josephb   Configuration   Computers   Megatron now running tcsh
At Rana's request, I've remotely switched Megatron over to using tcsh. I had to ssh -X in order to use the "/sbin/system-config-users" program, which is a graphical UI for modifying users. I had to go to preferences and uncheck hide system users, which then allowed me to see the controls user (at the bottom of the list), and edit it.

I also created a .tcshrc file in the /home/controls directory and copied the information from the .bashrc file, and also moved the matlab path definition into the PATH environment variable.

Does anyone know if sourcing /cvs/cds/caltech/cshrc.40m would be usable on a 64 bit machine, or does a new one need to be made for Megatron and/or Rosalba?
  780   Fri Aug 1 11:51:15 2008   justing   Omnistructure   Computers   added /cvs/cds/site directory
I added a /cvs/cds/site directory. This is the same as is discussed here. Right now it just has the text file 'cit' in it, but eventually the other scripts should be added. I'll probably use it in the next version of mDV.
  815   Fri Aug 8 12:21:57 2008   josephb   Configuration   Computers   Switched X end ethernet connections over to new switch
In 1X4, I've switched the ethernet connections from c1iscex and c1auxex over to the new Prosafe 24 port switches. They also use the new cat6 cables, and are labeled.

At the moment, everything seems to be working as normally as it was before. In addition:

I can telnet into c1auxex (and can do the same to c1auxey which I didn't touch).
I can't telnet into c1iscex (but I couldn't do that before, nor can I telnet into c1iscey either, and I think these are computers which once running don't let you in).
  822   Mon Aug 11 11:36:11 2008   josephb, Steve   Configuration   Computers   c1susvme1 minor problems
Around 11 am c1susvme1 started having issues. Namely, C1:SUS-PRM_FE_SYNC was railing at some large value like 16384 (2^14). I presume this means the computer was running catastrophically late.

I turned off the BS and ITM watch dogs (the PRM was already off), tried hitting reset and sshing in, and running startup, but this didn't help. I then turned off the c1susvme2 associated watch dogs (MC1-3, SRM) and went out to do a hard reboot by switching the crate power off. c1susvme2 came back up fine, was restarted and associated watch dogs turned back on. However, c1susvme1 came back up without mounting /cvs/cds/.

As a test, I replaced the ethernet connection with a CAT6 cable to the Prosafe switch in 1Y6, and then ran reboot on c1susvme1. When it came back up, it had mounted properly, and I was able to run the ./startup.cmd file. At this point it seems to be happy. The new cable is in the trays coming in from the top of the 1Y4 and 1Y6 and appropriately labeled.

Edit: Apparently ITMX and ITMY became excited after the reboot (perhaps I turned the watchdogs back on too early? Although that was after the DAQ light was listed as green for c1susvme). Steve noticed this when the alarms went off again (I had turned them off after the reboot seemed successful), and he damped them. Interestingly, the BS remained unexcited.
  823   Mon Aug 11 12:42:04 2008   josephb   Configuration   Computers   Continuing saga of c1susvme1
Coming back after lunch around 12:30pm, c1susvme1's status was again red. After switching off watchdogs, a reboot (ssh, su, reboot) and restarting startup.cmd, c1susvme1 is still reporting a max sync value (16384), occasionally dropping down to about 16377. The error light cycles between green and red as well.

At this point, I'm under the impression further reboots are not going to solve the problem.

Currently leaving the watchdogs associated with c1susvme1 off for the moment, at least until I get an idea of how to proceed.
  824   Mon Aug 11 13:59:23 2008   josephb   Configuration   Computers
While poking around the crate, I noticed an error light on one of the c1susvme2 related boards was lit, while the corresponding light on the c1susvme1 was not. This confuses me as the c1susvme1 is the one having problems.

As a quick sanity check, I unplugged the ethernet connection from the c1susvme1 labeled board, and confirmed I couldn't log into it, and then plugged it back in, restarted it, and re-ran the startup script. This time c1susvme1 seemed to come up fine. Re-enabling the watchdogs doesn't seem to kick anything, and in fact seems to be bringing everything into line properly.

However, the error light on the c1susvme2 clk drvr board is still on, and I'm not sure what that's trying to tell us. Open to suggestions.
  825   Mon Aug 11 15:07:49 2008   josephb   Configuration   Computers   Procyon aka fb40m switched to new switch
I've connected Procyon to the Prosafe 24 port switch with a new, labeled Cat6 cable. Quick tests with dataviewer show that it's working.
  827   Tue Aug 12 12:05:36 2008   Yoichi   Update   Computers   HP color printer is back
I restarted the HP printer server (a little box connected to the HP color laser) so that we can use the HP LaserJet 2550.
After this treatment, the printer spat out a bunch of pages from suspended jobs; many of these were black and white.
I think people should use the black-and-white printer for these kinds of jobs, because the color printer is slow and troublesome.
  852   Tue Aug 19 13:34:58 2008   josephb   Configuration   Computers   Switched c1pem1, c0daqawg, c0daqctrl over to new switches
Moved the Ethernet connections for c1pem1, c0daqawg, and c0daqctrl over to the Netgear Prosafe switch in 1Y6, using new cat6 cables.
  858   Wed Aug 20 11:42:49 2008   John   Summary   Computers   pdftk
I've installed pdftk on all the control room machines.

http://www.pdfhacks.com/pdftk/
  859   Wed Aug 20 11:50:10 2008   John   Summary   Computers   StripTools on op540m

To restart the striptools on op540m:

cd /cvs/cds/caltech/scripts/general/

./startstrip.csh
  889   Tue Aug 26 19:07:37 2008   Yoichi   HowTo   Computers   Reading data from Agilent 4395A analyzer through GPIB from *Linux* machine
I succeeded in reading data from the Agilent 4395A analyzer, whose floppy drive is crappy, through GPIB from a Linux machine using
an Agilent 82357B USB-GPIB interface.
I installed the Linux GPIB driver on one of the lab laptops (the silver DELL one currently sitting on the 4395A analyzer).
I wrote an initialization script for the USB-GPIB interface and a small python script for reading data from the analyzer.

[Usage]

1. Connect the USB-GPIB interface to the laptop and the analyzer.
2. Run /usr/local/bin/initGPIB command (it takes about 10sec to complete).
3. Run /usr/local/bin/getgpibdata.py > data.txt to save data from the analyzer to a text file.

The data format is explained in the comments of getgpibdata.py
This method is way faster than the unreliable floppy. The data is transferred in a few seconds.

I'm now writing a wiki page on this
http://lhocds.ligo-wa.caltech.edu:8000/40m/GPIB

I will install the same thing into the other DELL laptop soon.
Let me know if you have trouble with this.
  890   Wed Aug 27 10:55:35 2008   Yoichi   HowTo   Computers   Annoying behavior of the touch pads of the lab laptops is fixed
I was sick of the stupid touch pad behavior of the lab laptops, i.e. firefox goes back and forth in the history when the cursor is moved.
It was caused by firefox misinterpreting the horizontal scroll signal as a back/forward command.
I stopped it by going to about:config in firefox and setting mousewheel.horizscroll.withnokey.action to 0 and
mousewheel.horizscroll.withnokey.sysnumlines to true.
  894   Thu Aug 28 19:02:25 2008   rana, josephb, rob   Summary   Computers   big boot
This afternoon Joe did something with an .ini file (look for his detailed elog entry) and the computers went bad.
RFM network screen not active - filter modules not working.

We went around and booted every machine as has been done before. The correct order for a memory corruption
fixing big boot is the following:

    [1] RESET the RFM switches near the FB racks.
    [2] Power cycle c1dcuepics.
    [3] Power cycle all other crates with real time CPUs:
    c1iscey, daqctrl, daqawg, c1susvme1, c1susvme2, c1sosvme, c1iovme, c1lsc, c1asc, & c1iscex
    [4] Start up all FEs as described in Wiki.
    [5] Burt restore everyone (losepics, iscepics, assepics, omcepics?)
  897   Fri Aug 29 11:01:49 2008   josephb   Configuration   Computers   Attempt to change a channel gain in ICS-110B
As noted earlier by Rana, I was playing around with the /cvs/cds/caltech/chans/daq/C1IOOF.ini file with help from Rob. I had made a backup beforehand and saved it as C1IOOF.ini.Aug-28-2008. (I have since been informed that C1IOOF.ini.082808 would have been preferred as a name.)

We had been trying to increase the gain of C1:PSL-ISS_INMONPD_F in order to do a very low power PMC sweep, in an attempt to get clean modes for fitting. Initially we pressed the reconfig button on the C0DAQ_DETAIL screen, but all that seemed to do was change the Config File CRC. We proceeded to reboot fb40m remotely. However, any change to the ini file (even an extra space at the end of the file) caused a 0x2000 status for C1IOVME16k on the C0DAQ_DETAIL screen. At the time I presumed it was comparing the CRC of the ini file to something else.

Digging around in Alex's webspace at http://www.ligo.caltech.edu/~aivanov/ , I found the NDS Access page, which indicated that 0x2000 was a conflict between the front-end and frame builder .ini files.

"There is also status bit 0x2000 which gets added when the DCU configuration is different in front-end and frame builder. That is you can change and .ini file an then reload DAQ configuration with Epics button, which reconfigures the front-end, but leaves frame builders with invalid old configuration. They will detect this change and set the status to 0x2000 to indicate this condition. You will have to restart frame builders to pick up new .ini file and set status back to zero for the affected DCU."
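
The quoted behaviour amounts to a flag bit being OR-ed into the DCU status word, so testing for it is a bitmask check. A minimal sketch in Python follows; the constant and function names are hypothetical, and nothing is assumed about what the other status bits mean.

```python
# Bit quoted from the NDS Access page; the name is ours, not an official one.
CONFIG_MISMATCH = 0x2000


def has_config_mismatch(status):
    """True if the front-end / frame builder config mismatch bit is set."""
    return bool(status & CONFIG_MISMATCH)


for status in (0x0000, 0x2000, 0x2001):
    print(hex(status), has_config_mismatch(status))
```

Using a mask rather than comparing against 0x2000 directly keeps the check valid if other status bits happen to be set at the same time.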

It was when I was going to try resetting the c1iovme via the C0DAQ_RFMNETWORK medm screen that we realized the EPICS controls were not responding properly. The .ini file was returned to its original form, and mass reboots commenced.
  898   Fri Aug 29 11:05:11 2008   josephb   Summary   Computers   c1asc was down this morning
I had to manually reboot c1asc this morning, as for some unknown reason its status was red, and the fiber lights on the board were status:red, sig det:amber, own data: nothing. Shut the crate down, turned it back on, heard a beep, then followed wiki reboot instructions. Seems to be working now.
  899   Fri Aug 29 12:41:26 2008   josephb, Eric   Configuration   Computers   More front ends moved to new network
Used Cat6 cables to finish moving all the front ends in 1Y4 and 1Y5 over to the new GigE network switches, specifically to the switch in 1Y6. This included the ones labeled c1susvme2, c1sosvme, and c1dscl1epics0.
  900   Fri Aug 29 12:43:44 2008   josephb   Summary   Computers   c1susvme1 down
Around noon today, c1susvme was having problems. The C0DAQ_RFMNETWORK light was red. The status light was off, the sig det light was amber and the own data light was green. I could also ssh in, but could not run startup. I switched off the watchdogs for c1susvme2 (the watchdogs for c1susvme1 had already been tripped), and manually power cycled the crate.

However, when c1susvme1 came back up, it had not mounted the usual /cvs/cds/ directories. c1susvme2 did, however. c1susvme1 has been on the new network for a while, while c1susvme2 was switched over today. So apparently switching networks doesn't help this particular problem.

I did a remote reboot of c1susvme1, and it came up with the correct files mounted. Both machines ran their appropriate startup.cmd files and are currently green.
  917   Wed Sep 3 19:09:56 2008   Yoichi   DAQ   Computers   c1iovme power cycled
When I tried to measure the sideband power of the FSS using the scan of the reference cavity, I noticed that the RC trans. PD signal was not
properly recorded by the frame builder.
Joe restarted c1iovme software-wise. The medm screen said c1iovme was running fine, and actually some values were recorded by the FB.
Nonetheless, I couldn't see flashes of the RC when I scanned the laser frequency.
I ended up power cycling the c1iovme and running the restart script again. Now the signals recorded by c1iovme look fine.
Probably, the DAQ boards were not properly initialized only by the software reset.
I will re-try the sideband measurement tomorrow morning.
  922   Thu Sep 4 11:33:25 2008   josephb, Eric, Jenne   Configuration   Computers   Attempt to increase gain for C1:PSL-ISS_INMONPD_F via 110B
We were attempting to increase the gain on the channel C1:PSL-ISS_INMONPD_F in preparation to do a scan of the PMC at very low input power.

We started by adding a line to the C1IOOF.ini file in /cvs/cds/caltech/chans/daq/ under that channel that said "gain=10.0". Before touching anything, the channel was outputting around 4000 counts.

We hit the reconfig button for c1iovme16k, then rebooted c1iovme (which turned out to do nothing) and then the framebuilder, in a method consistent with the wiki. This turned out to put the channel in an odd state, where it was showing very rapid, random spikes, but still around 4000ish counts. We returned the file back to its original format, hit reconfig, and then rebooted the framebuilder. The channel, however, was still behaving in the same broken way.

After poking around the PSL table, looking at some direct outputs, we came back and rebooted c1iovme and the framebuilder again, which fixed the channel, such that it was reading out correctly. Taking this as a sign that maybe we should reboot the framebuilder, then c1iovme to get the channel to load changes, we changed the file again to have "gain=10.0". Upon reboot of the framebuilder, the channel was still reading out fine, but at the same level. So we continued with the reboot of c1iovme. This still had no effect on the channel output.

The ini file has been set back at this point, however since Yoichi is working, I'm holding off doing a reconfig and reboot on the framebuilder until later.
  925   Thu Sep 4 16:24:56 2008   rana   Configuration   Computers   Attempt to increase gain for C1:PSL-ISS_INMONPD_F via 110B

Quote:
We were attempting to increase the gain on the channel C1:PSL-ISS_INMONPD_F in preparation to do a scan of the PMC at very low input power.

According to the Wikipedia, certain esoteric mathematical
operations lead to the result that 4000 x 10 > 32768.
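
The arithmetic behind the remark can be spelled out. This is a minimal sketch, assuming the channel is stored as a signed 16-bit integer (the 32768 in the remark suggests this, but the entries do not state it); the clipping shown is only one possible way an over-range value could be handled.

```python
INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   # 32767

counts = 4000           # channel level reported before the gain change
gain = 10.0             # value written into the .ini file

scaled = counts * gain
exceeds_range = scaled > INT16_MAX

# Clipping to the representable range (assumed behaviour, for illustration).
clipped = max(INT16_MIN, min(INT16_MAX, int(scaled)))

print(scaled, exceeds_range, clipped)
```

With these numbers a gain of 8 (32000 counts) would still fit in the assumed range, while a gain of 10 does not.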
  932   Fri Sep 5 09:56:14 2008   josephb, Eric   Configuration   Computers   Funny channels, reboots, and ethernet connections
1) Apparently the IOO-ICS type channels had gotten into a funny state last night, where they were showing just noise, exactly when Rana changed the accelerometer gains and did major reboots. A power cycle of the c1ioo crate and appropriate restarts fixed this.

2) c1asc looks like it was down all night. When I walked out to look at the terminal, it claimed to be unable to read the input file from the command line I had entered the previous night ( < /cvs/cds/caltech/target/c1asc/startup.cmd). In addition we were unable to telnet in, suggesting an ethernet breakdown and inability to mount the appropriate files. So we have temporarily run a new cat6 cable from the c1asc board to the ITMX prosafe switch (since there's a nice knee high cable tray right there). One last power cycle and we were able to telnet in and get it running.
  941   Thu Sep 11 11:29:14 2008   josephb   Configuration   Computers   Final netgear switch in place in 1Y2
I've placed the final (of 4) Netgear prosafe 24 port switch at the very top of 1Y2. At that location, there are no holes left to screw into, so it has 4 rubber feet and is sitting on the top most signal generator. It has been plugged in and connected to the control room hub with a labeled cat6 ethernet cable.

Its IP address has been set to 131.215.113.253, and it has the usual controls password if using the "Smart Wizard Discovery Tool" which comes on the Netgear CD. The CD can be found in the Equipment manuals filing cabinet under Netgear. This program unfortunately only runs on a Windows PC.

To Do: Fix the C1:ASC ethernet connection which is currently coming straight out the front door and connected to the 1X4 switch (again through the front door).
  948   Mon Sep 15 14:00:52 2008   josephb   Configuration   Computers   1Y9 Hub and C1asc
The 1Y9 switch is now using a labeled Cat6 cable in cable trays to connect to the main switch in the offices. In addition, the c1asc cable which had been coming out the door was fixed last Friday, and is now labeled, going out the top and connects to the hub in 1Y2.

Note: Do not connect new ethernet cable from switch to switch without disconnecting the old cable to the rest of the network - this tends to make the Ethernet network unhappy with white flashing alarms.
  961   Thu Sep 18 01:14:23 2008   rob   Summary   Computers   EPICS BAD

Somehow the EPICS system got hosed tonight. We're pretty much dead in the water till we can get it sorted.

The alignment scripts were not working: the SUS_[opt]_[dof]_COMM CA clients were having consistent network failures.
I figured it might be related to the network work going on recently--I tried rebooting the c1susaux (the EPICS VME
processor in 1Y5 which controls all the vertex angle biases and watchdogs). This machine didn't come back after
multiple attempts at keying the crate and pressing the reset button. All the other cards in the crate are displaying
red FAIL lights. The MEDM screens which show channels from this processor are white. It appears that the default
watchdog switch position is OFF, so the suspensions are not receiving any control signals. I've left the damping loops
off for now. I'm not sure what's going on, as there's no way to plug in a monitor and see why the processor is not coming up.

A bit later, the c1psl also stopped communicating with MEDM, so all the screens with PSL controls are also white. I didn't try
rebooting that one, so all the switches are still in their nominal state.
  963   Thu Sep 18 12:16:01 2008   Yoichi   Update   Computers   EPICS BACK

Quote:

Somehow the EPICS system got hosed tonight. We're pretty much dead in the water till we can get it sorted.


The problem was caused by the installation of a DNS server into linux1 by Joe.
Joe removed /etc/hosts file after running the DNS server (bind). This somehow prevented proper boot of
frontend computers.
Joe and I confirmed that putting back /etc/hosts file resolved the problem.
Right now, the DNS server is also running on linux1.

We are not sure why the /etc/hosts file is still necessary. My guess is that the NFS server somehow reads /etc/hosts
when it decides which computers to allow to mount. We will check this later.

Anyway, now the computers are mostly running fine. The X-arm locks.
The Y-arm doesn't, because one of the digital filters for the Y-arm lock fails to be loaded to the frontend.
I'm working on it now.
  964   Thu Sep 18 13:05:05 2008   Yoichi   Update   Computers   EPICS BACK

Quote:

The Y-arm doesn't, because one of the digital filters for the Y-arm lock fails to be loaded to the frontend.
I'm working on it now.


Rob told me that the filter "3^2:20^2" is switched on/off dynamically by the front end code for the LSC.
Therefore, the failure to manually load it was not actually a problem.
The Y-arm did not lock just because the alignment was bad.
Now the Y-arm alignment is ok and the arm locks.
  965   Thu Sep 18 14:36:54 2008   josephb   Configuration   Computers   Name server and Epics
The problems Rob was experiencing last night were due to part of the setup (or rather testing of the setup) of the new nameserver running on linux1.

The name server was setup on linux1 by doing the following:

1) Installed xorg-x11-xauth via yum which was necessary to get remote x windows to work in linux1

2) Installed xorg-x11-fonts-Type1 in order to get the gui system-config-* programs to work

3) Ran system-config-bind, which created a default set of nameserver files. I unfortunately didn't understand the gui all that well, so I manually edited and added files to these base ones. The base files were generated in /var/named/chroot/etc/ and /var/named/chroot/var/named.

4) I added martian.zone and 113.215.131.in-addr.arpa.zone, named.conf.local, and edited named.conf so it loaded named.conf.local. The martian.zone file acts a forward look up (i.e. give it a name and it returns an IP number like 131.215.113.20). The 113.215.131.in-addr.arpa.zone acts as a reverse look up (i.e. give it an IP number like 131.215.113.20 and it tells you the name). The file named.conf.local merely points to these two files.

Note: One can add or change IP lookup by simply updating these two files. The format should be obvious from the files.

5) I specifically ssh'd in as root to linux1 (using su wasn't sufficient) and then typed "service named start" (without quotes). You can also use "restart" or "stop" instead of "start". This started the name server, giving an [Ok] message.

6) I edited the /etc/resolv.conf file on linux1 so that it pointed to itself first ("nameserver 127.0.0.1" at the top of the file). I also added the line "search martian", which allows one to simply use linux1 as opposed to linux1.martian.

I also edited the /etc/resolv.conf file on linux2, and it seems to resolve names fine.

7) And here is where I broke things. As a test, I moved /etc/hosts to /etc/hosts.bak, and then tested to see if names were being resolved correctly. By using the command host, I determined they were in fact working. I also tested with ssh.

However, something basic didn't like me moving the hosts file. Apparently when a front-end machine needed to reboot, it wouldn't come back up, and there was no way to SSH or telnet into it.

Yoichi and I did quite a bit of debugging this morning and determined that the nameserver itself isn't conflicting; merely the lack of the hosts file was the source of the problem. One theory is that services don't know to go to DNS to resolve host names. I think by modifying the /etc/nsswitch.conf file to include dns as an option for services and other programs, it might work without the hosts file. However, I'm going to leave that to tomorrow morning, when it is less likely to interfere with current operations.

As it stands, things are working with the nameserver running and the host file in place.
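
The relation between an IP number and the reverse zone file named in step 4 can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the naming rule; the two helper functions are hypothetical, and the zone helper assumes the zone covers the enclosing /24, as the file name 113.215.131.in-addr.arpa.zone suggests.

```python
import ipaddress


def reverse_lookup_name(ip):
    """PTR record name that a reverse lookup of this address queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_zone(ip):
    """Reverse zone for the enclosing /24: the PTR name minus the host octet."""
    return reverse_lookup_name(ip).split(".", 1)[1]


print(reverse_lookup_name("131.215.113.20"))
print(reverse_zone("131.215.113.20"))
```

The second value is the zone name that the reverse look up file above is named after.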
  966   Thu Sep 18 18:38:14 2008   Yoichi   HowTo   Computers   How to compile an SNL code for VxWorks
Dave Barker guided me through how to compile an SNL code into a Motorola 162 CPU object.

Here is the procedure:

(1) You need an account at LHO and a password for ops account at LHO. Contact Dave if you don't have these.

(2) Copy your code (say Particle.st) to the LHO gateway machine.
scp Particle.st username@lhocds.ligo-wa.caltech.edu:/cvs/cds/lho/target/t0sandbox0
(3) Login to lhocds.ligo-wa.caltech.edu
ssh username@lhocds.ligo-wa.caltech.edu
(4) Login to control0
ssh ops@control0
(5) Change directory to the sandbox dir.
cd /cvs/cds/lho/target/t0sandbox0
(6) Prepare for the compilation
setup epics
(7) Edit makefile in the directory. You have to modify a few lines at the end of the file.
There are comments for how to do it in the file.

(8) Compile
make Particle.o
(9) Copy the object file to the 40m target directory
scp Particle.o controls@nodus.ligo.caltech.edu:/cvs/cds/caltech/target/c1psl/

That is it.
  969   Fri Sep 19 00:18:14 2008   rana   Update   Computers   svn is old
linux2:mDV>ssh nodus
Password:
Last login: Fri Sep 19 00:11:44 2008 from gwave-69.ligo.c
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
nodus:~>c
nodus:caltech>cd apps/
nodus:apps>cd mDV
nodus:mDV>svn update
svn: This client is too old to work with working copy '.'; please get a newer Subversion client
nodus:mDV>whoami
controls
nodus:mDV>uname -a
SunOS nodus 5.9 Generic_118558-39 sun4u sparc SUNW,A70 Solaris
nodus:mDV>pwd
/cvs/cds/caltech/apps/mDV
nodus:mDV>
:(
  972   Fri Sep 19 09:49:42 2008   Yoichi   Update   Computers   svn is old
The problem below is fixed now.
The cause was that .svn/entries and .svn/format had the wrong version number, "9", where it had to be "8".
I changed those files in all the sub-directories. Now svn up runs fine.
I don't know how this version discrepancy happened.
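
To spot this kind of discrepancy before editing anything, the numbers stored in the .svn/format files can be listed. The sketch below is read-only and assumes each format file holds just the integer, as described above; the function name is hypothetical, and the demo runs on a throwaway directory tree rather than on the real working copy.

```python
import os
import tempfile


def svn_format_versions(root):
    """Map each .svn/format file under root to the integer it contains."""
    versions = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) == ".svn" and "format" in filenames:
            path = os.path.join(dirpath, "format")
            with open(path) as fh:
                versions[path] = int(fh.read().strip())
            dirnames[:] = []  # no need to descend further inside .svn
    return versions


# Demo on a temporary tree with one directory at "9" and one at "8".
with tempfile.TemporaryDirectory() as tmp:
    for sub, version in (("", "9"), ("sub", "8")):
        meta = os.path.join(tmp, sub, ".svn")
        os.makedirs(meta)
        with open(os.path.join(meta, "format"), "w") as fh:
            fh.write(version + "\n")
    demo_versions = sorted(svn_format_versions(tmp).values())

print(demo_versions)
```

Run against /cvs/cds/caltech/apps/mDV, a mix of values would point at the sub-directories that still carry the wrong number.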



Quote:
linux2:mDV>ssh nodus
Password:
Last login: Fri Sep 19 00:11:44 2008 from gwave-69.ligo.c
Sun Microsystems Inc.   SunOS 5.9       Generic May 2002
nodus:~>c
nodus:caltech>cd apps/
nodus:apps>cd mDV
nodus:mDV>svn update
svn: This client is too old to work with working copy '.'; please get a newer Subversion client
nodus:mDV>whoami
controls
nodus:mDV>uname -a
SunOS nodus 5.9 Generic_118558-39 sun4u sparc SUNW,A70 Solaris
nodus:mDV>pwd
/cvs/cds/caltech/apps/mDV
nodus:mDV>
:(
  973   Fri Sep 19 11:21:45 2008   josephb   Configuration   Computers   Nameserver and Rosalba
I tried modifying the nsswitch.conf file to include going to dns in addition to local files for everything (services, network, etc) and then moving the /etc/hosts file to /etc/hosts.bak. Unfortunately, this still didn't allow front-ends to reboot properly. So I'm not sure what is using the hosts file, but whatever it is, is apparently important. After the test I placed the hosts file back and reverted the nsswitch.conf file.

I also noticed that Rosalba was having problems connecting to the network. This apparently was because I had shut down the dhcp server on the NAT router, as had been discussed at the meeting on Wednesday.

To fix this, I modified the /etc/sysconfig/network-scripts/ifcfg-eth1 file to fix rosalba's IP as 131.215.113.24 (which doesn't seem to be in use). I also updated rosalba's /etc/resolv.conf file to point at linux1's name server, and two additional name servers as well, and added the "search martian" line. I modified the /etc/sysconfig/network-scripts/ifcfg-eth0 file so the built-in network card doesn't come up automatically, since it's currently not plugged into anything. Lastly, I added rosalba and its IP to linux1's name server files.
  974   Fri Sep 19 11:48:14 2008   steve   Update   Computers   old hubs can make one happy
Joseph finds a XIX century bottle neck hub: CentreCOM 3624TR 10Base-T
and happily replaces it with Netgear GS724T 1000Base-T
Attachment 1: P1020934.jpg
  977   Mon Sep 22 16:51:27 2008 YoichiHowToComputersNetwork GPIB
I was able to make the wireless-connected GPIB interface work with the SR785.
Now you can download data from the SR785 over the network, wherever it is located.
Say goodbye to floppy disks.

I wrote an installation note in the wiki.
http://lhocds.ligo-wa.caltech.edu:8000/40m/GPIB

I wrote a new script called "netgpibdata.py" which works similarly to "getgpibdata.py".
It is in the 40m svn. Instructions on how to use it are on the above-mentioned wiki page.
  983   Tue Sep 23 00:47:24 2008 YoichiHowToComputersNetwork GPIB

Quote:

I wrote a new script called "netgpibdata.py" which works similarly to "getgpibdata.py".
It is in the 40m svn. Instructions on how to use it are on the above-mentioned wiki page.


netgpibdata.py is now installed on the controls machines (/cvs/cds/caltech/scripts/general/netgpibdata/netgpibdata.py).
You can use it like this:
netgpibdata.py -i 131.215.113.106 -d AG4395A -a 10 -f spectrum01

In this example, data from the Agilent 4395A analyzer at GPIB address 10, connected to the GPIB-LAN box with the IP address 131.215.113.106,
is downloaded and saved to spectrum01.dat. The measurement parameters are saved to spectrum01.par.
  990   Thu Sep 25 03:12:13 2008 ranaSummaryComputersconlog and linux1
It would be nice to have conlog from outside. Right now it's on linux1 and so it's unavailable. To
test it for speed we ran the command line conlog on linux1, linux2, and nodus.

It was slightly faster on nodus than on linux1, implying that it's not a network speed issue. It was
phenomenally slower on linux2.

I used the command '/sbin/lspci -vvv' to check what network cards are installed where. As it turns
out, linux2 has a GigE card, but linux1, our NFS server, has only a 100 Mbit card:
01:08.0 Ethernet controller: Intel Corporation 82562EZ 10/100 Ethernet Controller (rev 01)
        Subsystem: Intel Corporation Unknown device 304a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min, 14000ns max), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 209
        Region 0: Memory at ff8ff000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at bc00 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

We (Joe) need to buy a GigE card for linux1 and to also set up conlog and conlogger to run on Nodus.
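
The check above can be condensed to just the controller summary lines. This is a minimal sketch: the function name ethernet_controllers is made up for illustration, and the interface name eth0 in the closing comment is an assumption.

```shell
# Print only the Ethernet controller summary lines from lspci output on stdin.
# (Hypothetical helper name; it is just a case-insensitive grep.)
ethernet_controllers() {
    grep -i 'Ethernet controller'
}

# Typical use on each machine:
#   /sbin/lspci | ethernet_controllers
# The negotiated link speed of one interface (eth0 assumed) can be read with:
#   ethtool eth0 | grep Speed
```

A "10/100" in the model string, as on linux1 above, indicates a card that cannot do gigabit.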
  992   Thu Sep 25 14:03:08 2008 josephbConfigurationComputers 

Quote:

We (Joe) need to buy a GigE card for linux1 and to also set up conlog and conlogger to run on Nodus.


A spare Intel Pro 1000/GT desktop adapter (gigabit ethernet card) has been added to Linux1, which is now using that card to connect to the network.

This was after a slight scare when I somehow reset the BIOS on Linux1 during the first reboot after adding the card.
After some debugging and discussion with Yoichi, the BIOS was fixed and the computer works again, with its new faster network connection.

We both noted, though, that Linux1 is a rather old machine, with only half a gigabyte of RAM and its 58 gigabyte hard drive (RAID) at about 80% capacity. It might be worth upgrading in general.

Need to figure out how to install conlog/conlogger programs next...
  997   Fri Sep 26 14:10:21 2008 YoichiConfigurationComputersLab laptops maintenance
The linux laptops were unable to write to the NFS mounted directories.
That was because the UID of the controls account on those computers was different from the one on linux1 and the other control room computers.
I changed the UID of the controls account on the laptops. Of course it required not only editing /etc/passwd but also dealing with
numerous errors caused by the sudden change of the UID. I had to chown all the files/directories in /home/controls.
I also had to remove /tmp/gconf-controls because it was assigned the old UID.

Whenever we add a new machine, we have to make sure the controls account has the same UID/GID as other machines, that is 1001/1001.
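
A minimal sketch of how this could be checked on a new machine. The function name check_ids is hypothetical; the account name controls and the 1001/1001 values come from the note above.

```shell
# Hypothetical helper: report whether an account has the expected UID/GID.
# Usage: check_ids USER EXPECTED_UID EXPECTED_GID
# Returns 0 on a match, 1 on a mismatch, 2 if the account does not exist.
check_ids() {
    user=$1; want_uid=$2; want_gid=$3
    have_uid=$(id -u "$user") || return 2
    have_gid=$(id -g "$user") || return 2
    if [ "$have_uid" = "$want_uid" ] && [ "$have_gid" = "$want_gid" ]; then
        echo "OK: $user is $have_uid/$have_gid"
    else
        echo "MISMATCH: $user is $have_uid/$have_gid, expected $want_uid/$want_gid"
        return 1
    fi
}

# On a newly added machine:
#   check_ids controls 1001 1001
```

If the IDs differ, the files already owned by the old UID (for example under /home/controls) need a chown after the account is changed, as described above.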


I did some cleanups of the laptop environment.
I made dataviewer work on the laptops *locally*. We no longer have to ssh -X to other computers to run dataviewer.
The trick was to install grace from the Fedora package with
sudo yum install grace
Then I modified /usr/local/stow_pkgs/dataviewer/dataviewer to change the option to dc3 from "-s fb" to "-s fb40m".
  1006   Mon Sep 29 13:33:39 2008 josephbConfigurationComputersGigabit network finished and conlog available on Nodus
The last 100 Mb unmounted hub has been removed (or at least the last of the ones I could find). We should be on a fully gigabit network with Cat6 cables and lots and lots of labels.

In other news, the Perl script that runs the web interface on linux1 for the conlog has been copied to /cvs/cds/caltech/apache/cgi-bin/ and is now being pointed to by the apache server on Nodus.

https://nodus.ligo.caltech.edu:30889/cgi-bin/conlog_web.pl
  1015   Wed Oct 1 12:05:58 2008 AlbertoConfigurationComputers"StochMon" added to the Alarm Handler
John, Alberto,

We added the four channels of the RF Amplitude Monitor (aka StochMon) to the Alarm Handler. First we modified the 40m.alh file, just copying some lines and switching the channel names to the ones we wanted. Then we also added a few lines to the database file ioo.db in order to define the alarm levels. So far I have used just test values for the thresholds of the green, yellow and red states and need to update them to some reasonable ones. To do that I need to calibrate those EPICS channels. I have the old data saved and I'm now trying to figure out how to properly change the database file.