ID   Date   Author   Type   Category   Subject
  2544   Mon Jan 25 11:42:24 2010   josephb   Update   Computers   Megatron and BO board

I was talking with Vladimir on Friday about the Binary Output board.  We looked at the manual for it (Contec DO-32L-PE) and realized it's an opto-coupler-isolated, open-collector output.  He mentioned they had the right kind of 50-channel breakout board for testing in Riccardo's lab.

This morning I borrowed the 50-channel breakout board from Riccardo's lab and, along with some resistor loads, tested the BO board.  It seems to be working, and I can control the output channels' on/off state.

  2551   Thu Jan 28 09:14:51 2010   Alberto   Configuration   Computers   c1iscey, c1iscex, c1lsc, c1asc rebooted

This morning the LSC scripts weren't running properly. I had to reboot c1iscey, c1iscex, c1lsc, and c1asc.

I burtrestored to Monday January 25 at 12:00. 

  2558   Tue Feb 2 10:38:30 2010   josephb   Update   Computers   Megatron BO test

Last night, I connected megatron's BO board to the analog dewhitening board.  However, I was unable to lock the Y arm (although once I disconnected the cable and reconnected it back to the xy220, the Y arm did lock).

So either A) I've got the wrong cable, or B) I've got the wrong logic being sent to the analog dewhitening filters.

During testing, I ran into an odd continuous on/off cycle on one of the side filter modules (on megatron).  It turns out I had forgotten to connect a ground input to the control filter bank (which allows you to both set switches and read them out), so it was reading a random value.  Grounding it in the model fixed the problem (after re-making).

 

 

  2570   Thu Feb 4 12:29:04 2010   josephb   Update   Computers   Martian IP switch over notes

What is the change:

We will be moving the 131.215.113.XXX ip addresses of the martian network over to a 192.168.XXX.YYY scheme.

What will users notice:

Computer names (e.g. linux1, scipe25/c1dcuepics) will not change.  The domain name, martian, will not change (i.e. linux1.martian).  What will change is the underlying IP address associated with each host name.  linux1 will no longer be 131.215.113.20 but something like 192.168.0.20.  If everything is done properly, that should be it.  There should be no impact on, or need to change anything in, EPICS, for example.  However, any custom scripts with hard-coded IP addresses rather than hostnames would need to be updated, if they exist.

What needs to be done:

Each computer and router will need to be accessed, either remotely or directly, and the configuration files controlling its IP address (and/or DNS lookup locations) modified.  It then needs to be rebooted so the configuration changes take effect.  I'll be making an updated list of computers this week (tracked down via their physical ethernet cables), and next week, probably on Thursday, we simply go down the list one by one.

LINUX

For a linux machine, this means checking the /etc/hosts file and making sure it doesn't have old information.  It should look like:

127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

Then change the /etc/sysconfig/network-scripts/ifcfg-eth0 file (or ethX file depending on the ethernet card in question).  The IPADDR, NETWORK, and GATEWAY lines will need to be changed.  You can change the hostname (although I don't plan on it) by modifying the /etc/sysconfig/network file.
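
For reference, a minimal ifcfg-eth0 sketch (the addresses below are placeholders for illustration, not the final assignments):

DEVICE=eth0
BOOTPROTO=static
IPADDR=192.168.113.20
NETMASK=255.255.255.0
NETWORK=192.168.113.0
GATEWAY=192.168.113.2
ONBOOT=yes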

The /etc/resolv.conf file will need to be updated with the new DNS server location (i.e. 131.215.113.20 to 192.168.0.20 for example).
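
Something like the following for /etc/resolv.conf (again a sketch; the nameserver entry should be whatever address linux1 ends up with):

search martian
nameserver 192.168.0.20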

SOLARIS

Similarly to Linux, the /etc/hosts file will need to be updated and/or simplified.  The /etc/defaultrouter file will need to be updated with the new router IP.  /etc/defaultdomain will need to be updated.  /etc/resolv.conf will need to be updated with the new DNS server.
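
As a rough sketch of the Solaris side (placeholder addresses only):

/etc/defaultrouter:   192.168.113.2
/etc/defaultdomain:   martian
/etc/resolv.conf:     nameserver 192.168.0.20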

vxWorks

Looking at the vxWorks machines, the command bootChange can be used to view or edit the IP configuration.

The following is an example from c1iscey.

-> bootChange

'.' = clear field;  '-' = go to previous field;  ^D = quit

boot device          : eeE0
processor number     : 0
host name            : linux1
file name            : /cvs/cds/vw/pIII_7751/vxWorks
inet on ethernet (e) : 131.215.113.79:ffffff00
inet on backplane (b):
host inet (h)        : 131.215.113.20
gateway inet (g)     :
user (u)             : controls
ftp password (pw) (blank = use rsh):
flags (f)            : 0x0
target name (tn)     : c1iscey
startup script (s)   :
other (o)            :

value = 0 = 0x0

By updating the host name (the machine it mounts /cvs/cds from, i.e. linux1), inet on ethernet (the IP of c1iscey itself), and host inet (linux1's IP address), we should be able to change all the vxWorks machines.
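
For example, after the change the relevant c1iscey entries should end up looking something like this (a sketch; later entries suggest the final subnet is 192.168.113.xxx):

inet on ethernet (e) : 192.168.113.79:ffffff00
host inet (h)        : 192.168.113.20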

LINUX1

The DNS server running on linux1 will need to be updated with the new IPs and domain information.  The hosts file on linux1 will also need to be updated with all the new IP addresses.

This will need to be handled carefully: the last time I tried to get by without the hosts file on linux1, it broke NFS mounting from other machines.  However, as long as the hosts file on linux1 is kept in sync with the DNS server files, everything should work.
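
As a sketch of the paired entries that need to stay in sync (placeholder address; the zone file names are the ones referenced in later entries):

/etc/hosts on linux1:
192.168.113.20   linux1.martian   linux1

/var/named/chroot/var/named/martian.zone:
linux1   IN   A   192.168.113.20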

  2580   Mon Feb 8 17:00:36 2010   josephb   Update   Computers   Megatron ETMY model updated (tst.mdl)

I've added the control logic for the outputs going to the Contec Digital Output board.  This includes outputs from the QPD filters (2 filters per quadrant, 8 in total), as well as outputs going to the coil input sensor whitening.

At this point, the ETMY controls should have everything the end station FE normally does.  I'm hoping to do some testing with this tomorrow afternoon with a fully locked IFO.

  2583   Tue Feb 9 17:18:45 2010   josephb   Summary   Computers   Locking Y arm successful with fully replaced front-end for ITMY

We were able to lock the Y-arm using Megatron and the RCG generated code, with nothing connected to c1iscey.

All relevant cables were disconnected from c1iscey and plugged into the appropriate I/O ports, including the digital output.  It turns out the logic for the digital output is opposite to what I expected, so I added bitwise XOR operators in the tst.mdl model just before the signal goes out to the DO board.  Once that was added, the Y arm locked within 10 seconds or so (compared to the previous 30 minutes spent trying to figure out why it wouldn't lock).

  2591   Thu Feb 11 18:33:54 2010   josephb, alex   Update   Computers   Status of the IP change over

A few machines have still not been changed over, including a few laptops, mafalda, ottavia, and c0rga.

All the front ends have been changed over.

fb40m died during a reboot and was replaced with a spare Sun Blade 1000 that Larry had.  We had to swap in our old hard drive and memory.

All the front ends, belladonna, aldabella, and the control room machines have been switched over. Nodus was changed over after we realized we hosed the elog and svn by switching linux1's IP.

At this point, 90% of the machines seem to be working, although c0daqawg seems to be having some issues with its startup.cmd code.

  2593   Thu Feb 11 19:20:44 2010   rana   Update   Computers   Status of the IP change over

After Joe left:

  1. Turned on op440m and returned him his keyboard and mouse.
  2. Damped MC2.
  3. Opened PSL shutter - locked PMC, FSS,
  4. Started StripTool displays on op540m.
  5. op340m doesn't respond to ping from anyone.
  6. started FSS  SLOW and RCPID scripts on op540 - need to kill and restart on op430m.
  7. ASS wouldn't come up - it doesn't know who linux1 is.
  8. MC autolocker wouldn't run on op540m because of a perl module issue, started it on op440m - it needs to be killed and restarted on op430m.
  9. probably mafalda, linux2, and op430m need some attention - they are all in the same rack.

As of 7:18 PM, the MC is locked and the PSL seems normal + all suspensions are damped and the ELOG is back up as well as the SVN.

  2594   Fri Feb 12 11:44:11 2010   josephb   Update   Computers   Status of the IP change over

Quote:

After Joe left:

  1. Turned on op440m and returned him his keyboard and mouse.
  2. Damped MC2.
  3. Opened PSL shutter - locked PMC, FSS,
  4. Started StripTool displays on op540m.
  5. op340m doesn't respond to ping from anyone.
  6. started FSS  SLOW and RCPID scripts on op540 - need to kill and restart on op430m.
  7. ASS wouldn't come up - it doesn't know who linux1 is.
  8. MC autolocker wouldn't run on op540m because of a perl module issue, started it on op440m - it needs to be killed and restarted on op430m.
  9. probably mafalda, linux2, and op430m need some attention - they are all in the same rack.

As of 7:18 PM, the MC is locked and the PSL seems normal + all suspensions are damped and the ELOG is back up as well as the SVN.

5) op340m has had its hosts table and other network files updated.  I also removed its outdated hosts.deny file which was causing some issues with ssh.

6) On op340m I started FSSSlowServo, with "nohup ./FSSSlowServo", after killing it on op540m.

I also killed RCthermalPID.pl and started it with "nohup ./RCthermalPID.pl" on op540m.

7) c1ass is fixed now.  There was a typo in the resolv.conf file (namerserver -> nameserver) which has been fixed.  It is now using the DNS server running on linux1 for all its host name needs.

8) I killed the autolockMCmain40m process running on op440m, modified the script to run on op340m, logged into op340m, went to /cvs/cds/caltech/scripts/MC, and ran "nohup ./autolockMCmain40m".

9) Linux2 does not look like it has been connected for a while, and it wasn't connected when we started the IP change over yesterday.  Is it supposed to still be in use?  If so, I can hook it up fairly easily.  op340m, as noted earlier, has been switched over.  Mafalda has been switched over.

10) c0rga has now been switched over. 

11) aldabella, the vacuum laptop, has had its startup environment variables tweaked (in the /home/controls/.cshrc file) so that it looks on the 192.168.113 network instead of 131.215.113 (a guess at what this tweak might look like is sketched after item 13).  This should mean Steve will not have any more trouble starting up his vacuum control screen.

12) Ottavia has been switched over.

13) At this time, only the GPIB devices and a few laptops remain to get switched over.
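
For item 11, a guess at what that .cshrc tweak looks like, assuming the vacuum screens talk to EPICS over Channel Access (the actual variable names weren't recorded in this entry):

setenv EPICS_CA_ADDR_LIST 192.168.113.255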

  2595   Fri Feb 12 11:56:02 2010   josephb   Update   Computers   Nodus slow ssh fixed

Koji pointed out that logging into Nodus was being abnormally slow.  I tracked it down to the fact that we had forgotten to update the address of the DNS server running on linux1 in the resolv.conf file on nodus.  Basically it was looking for a DNS server which didn't exist, and thus was timing out before going to the next one.  SSHing into nodus should now be more responsive.

  2597   Fri Feb 12 13:56:16 2010   josephb   Update   Computers   Finishing touches on IP switch over

The GPIB interfaces have been updated to the new 192.168.113.xxx addresses, with Alberto's help.

Spare ethernet cables have been moved into a cabinet halfway down the x-arm.

The illuminators have a white V error on the alarm handler, but I'm not sure why.  I can turn them on and off using the video screen controls (except for the X arm illuminator, which has no computer control - you just walk out and turn it on).

There's a laptop or two I haven't tracked down yet, but that should be it on IPs. 

At some point, a find and replace on 131.215.xxx.yyy addresses to 192.168.xxx.yyy should be done on the wiki.  I also need to generate an up to date ethernet IP spreadsheet and post it to the wiki.

 

  2599   Fri Feb 12 15:59:16 2010   josephb   Update   Computers   Testpoints not working

Non-testpoint channels seem to be working in Dataviewer; however, testpoints are not.  The tpman process is not running on fb40m.  My rudimentary attempts to start it have failed.

# /usr/controls/tpman &
13929
# VMIC RFM 5565 (0) found, mapped at 0x2868c90
VMIC RFM 5579 (1) found, mapped at 0x2868c90
Could not open 5565 reflective memory in /dev/daqd-rfm1
16 kHz system
Spawn testpoint manager
no test point service registered
Test point manager startup failed; -1

It looks like it may be an issue with the reflected memory (although the cables are plugged in and I see the correct lights lit on the RFM card in the back of fb40m).

The fact that this is an RFM error is confirmed by running /usr/install/rfm2g_solaris/vmipci/sw-rfm2g-abc-005/util/diag/rfm2g_util and entering 3 (which should be the device number).

Interestingly, device number 4 works, and appears to be the correct RFM network (i.e. changing the ETMY lscPos offset changes the corresponding value in memory).

So, my theory is that when Alex put the cards back in, the device number (PCI slot location?) was changed, and now the tpman code doesn't know where to look for it.

Edit: It doesn't look like the PCI slot location is it, given there are 4 slots and it's in #3 currently (or #2, I suppose, depending on which way you count).  Neither matches the number 4.  So I don't know how that device number gets set.

 

 

  2603   Sat Feb 13 18:58:31 2010   josephb, alex   Update   Computers   fb40m testpoints fixed

I received an e-mail from Alex indicating he found the testpoint problem and fixed it today:

Quote from Alex: "After we swapped the frame builder computer it has reconfigured all device files and I needed to create some symlinks on /dev/ to make tpman work again. I test the testpoints and they do work now."

 

  2610   Wed Feb 17 12:45:19 2010   josephb   Update   Computers   Updated Megatron and its firewall

I updated the IP address on the Cisco Linksys wireless N router, which we're using to keep megatron separated from the rest of the network.  I then went in and updated megatron's resolv.conf and hosts files.  It is now possible to ssh into megatron again from the control machines.

  2626   Mon Feb 22 11:46:55 2010   josephb   Update   Computers   fb40m

I fixed the JetStor 416S RAID array IP address by plugging my laptop into its ethernet port, setting my IP to be on the same subnet, and using the web interface.  (After finally tracking down the password, I placed it in the usual place.)

After this change, I powered up the fb40m2 machine and rebooted the fb40m machine.  This seems to have made all the associated lights green.

Dataviewer is working: it is recording from the point at which I fixed the JetStor RAID array and rebooted fb40m, and it can also go back in time to before the IP switch over.

  2627   Mon Feb 22 12:48:31 2010   josephb, alex, koji   Update   Computers   FE machines now coming up

Even after bringing up the fb40m, I was unable to get the front ends to come up, as they would error out with an RFM problem.

We proceeded to reboot everything I could get my hands on, although it's likely daqawg and daqctrl were the issue: on the C0DAQ_DETAIL screen their status had been showing as 0xbad, but after the reboot it showed up as 0x0.  They had originally come up before the frame builder had been fixed, so this might have been the culprit.  In the course of rebooting, I also found that c1omc and c1lsc had been turned off, and turned them back on.

After this set of reboots, we're now able to bring the front ends up one by one.

  2628   Mon Feb 22 13:08:27 2010   josephb   Update   Computers   Minor tweaks to c1omc

While working on c1omc, I created a .cshrc file in the controls home directory, and had it source the cshrc.40m file so that useful shortcuts like "target" and "c" work, among other things.  I also fixed the resolv.conf file so that it correctly uses linux1 as its name server (speeding up ssh login times).

  2637   Wed Feb 24 12:08:31 2010   Koji   Update   Computers   RFM goes red -> recovered by the nuclear option

Most of the RFM went red this morning. I took the nuclear option and it seemed to be recovered.

  2646   Sun Feb 28 23:47:52 2010   rana   Update   Computers   rosalba

Since Rosalba wanted to update ~500 packages, I let it do it.  This, of course, stopped the X server from running.  I downloaded and installed the newest Nvidia driver and it's mostly OK.

The main problem with auto-update on our workstations is that we've updated some packages by hand, i.e. not using the standard CentOS yum.  That means the auto-update doesn't work right.  From now on, if you want to install a fancier package than what CentOS distributes, you should commit to handling the system maintenance for these workstations in the future.  It's not that we can't have new programs; we just have to pay the price.

 

At 23:45 PST, I also started a slow triangle wave on the AOM drive amplitude. This is to see if there's a response in the FSS-FAST which might imply a coupling from intensity noise to frequency noise via absorbed power and the dn/dT effect in the coatings.

It's a 93-second-period triangle modulating the RC power from 100% down to 50%.

  2649   Mon Mar 1 22:38:12 2010   rana   Update   Computers   RC sensitivity to RIN

The overnight triangle wave I ran on the AOM drive turns out to have produced no signal in the FAST feedback to the PZT.

The input power to the cavity was ~10 mW (I'm totally guessing). The peak-peak amplitude of the triangle wave was 50% of the total power.

The spectral density of the fast signal at the fundamental frequency (~7.9 mHz) is ~0.08 V/rHz.  The FAST calibration is ~5 MHz/V.  So, since we see no signal, we can place an upper limit on the amount of frequency shift of (5 MHz/V) * (0.08 V/rHz) * sqrt(0.0001 Hz) = 4 kHz.

Roughly, this means that the RIN -> Hz coefficient must be less than 4 kHz / 5 mW, or ~1 Hz/uW.

For comparison, the paper on reference cavities by the Hansch group lists a coefficient of ~50 Hz/uW.  However, they have a finesse of 400000 while we only have a finesse of 8000-10000.  So our null result means that our RC mirrors' absorption is perhaps less than theirs.  Another possibility is that their coating design has a higher thermo-optic coefficient.  This is possible, since they probably have much lower transmission mirrors.  It would be interesting to know how the DC thermo-optic coefficient scales with transmission for the standard HR coating designs.

Attachment 1: Untitled.png
  2695   Mon Mar 22 16:57:45 2010   josephb, daisuke, alex   Configuration   Computers   Megatron test points working again

We changed the pointer on /cvs/cds/caltech/target/gds/bin/awgtpman from

/opt/gds/awgtpman    to

/cvs/cds/caltech/target/gds/bin/awgtpman.091215.

Then we killed the megatron frame builder and testpoint manager (daqd, awgtpman), restarted them, and hit the DAQ reload button on the GDS_TP screen.

This did not fix everything.  However, it did seem to fix the problem where it needed an rtl_epics directory under the root directory, which did not exist.  Alex continued to poke around.  When he next spoke up, he claimed to have found a problem in the daqdrc file - specifically, the /cvs/cds/caltech/target/fb/daqdrc file.

set gds_server = "megatron" "megatron" 10 "megatron" 11;

He said this needed to be:

set gds_server = "megatron" "megatron"  11 "megatron" 12;


However, during this, I had looked at the file and found Dataviewer working while it still had the 10 and 11.  Doing a diff against a backup of daqdrc shows that Alex also changed

set controller_dcu=10 to set controller_dcu=12, and commented out the previous line.

He also changed set debug=2 to set debug=0.

In a quick test, we changed the 11 and 12 back to 10 and 11, and everything seemed to work fine.  So I'm not sure what that line actually does.  However, the set controller_dcu line seems to be important, and probably needs to be set to the dcu_id of an actually running module (it probably doesn't matter which one, but at least one that is up).  Anyway, I set the gds_server line back to 11 and 12, just in case there's numerology going on.

I'll add this information to the wiki.
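
For reference, the relevant daqdrc lines as they ended up, based on the description above (a sketch; the rest of the file is untouched):

set gds_server = "megatron" "megatron" 11 "megatron" 12;
set controller_dcu=12;
set debug=0;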

  2729   Mon Mar 29 15:26:47 2010   Mott   HowTo   Computers   New script for controlling the AG4395A

I just put a script called AG4395A_Run.py in the /cvs/cds/caltech/scripts/general/netgpibdata/ directory to control the network analyzer.  A section has been added to the wiki with the other GPIB script sections (http://lhocds.ligo-wa.caltech.edu:8000/40m/netgpib_package#AG4395A_Run.py).

  2734   Tue Mar 30 11:16:05 2010   josephb   HowTo   Computers   ezca update information (CDS SVN)

I'd like to try installing an updated multi-threaded ezca extension later this week, provided by Keith Thorne, which allows for 64-bit builds of the GDS ezca tools.  The code can be found in the LDAS CVS under gds, as well as in the CDS subversion repository, located at

https://redoubt.ligo-wa.caltech.edu/websvn/

It's under gds/epics/ in that repository.  The directions are fairly simple:

1) To install ezca with multi-threading in an existing EPICS installation:
-copy ezca_2010mt.tar.gz to (EPICS_DIR)/extensions/src
-cd (EPICS_DIR)/extensions/src
-tar -xzf ezca_2010mt.tar.gz
-modify (EPICS_DIR)/extensions/Makefile to point 'ezca' at 'ezca_2010mt'
-cd ezca_2010mt
-set EPICS_HOST_ARCH appropriately
-make
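
As a concrete version of those steps (a sketch only - it assumes EPICS lives under /cvs/cds/epics and a 64-bit Linux target; adjust the paths and EPICS_HOST_ARCH to the actual installation):

cd /cvs/cds/epics/extensions/src
cp /path/to/ezca_2010mt.tar.gz .
tar -xzf ezca_2010mt.tar.gz
(edit ../Makefile so the 'ezca' entry points at ezca_2010mt)
cd ezca_2010mt
setenv EPICS_HOST_ARCH linux-x86_64
make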


 

 

  2738   Wed Mar 31 03:45:49 2010   Mott   HowTo   Computers   New script for controlling the AG4395A

 

I took data for the 2-NPRO PLL using the script I wrote for the AG4395A, but it is very noisy above about 1 MHz.  I assume this has something to do with the script (since I am fairly confident we don't have 600 dB of response in the PZT), so I will go in tomorrow to understand more carefully what is going on.  I may need to include a bit more latency in the script to allow the NA to settle a bit more.  I am attaching the spectrum just to show the incredibly high noise level.

Attachment 1: noisy_spec.png
  2744   Wed Mar 31 16:55:05 2010   josephb   Update   Computers   2 computers from Alex and Rolf brought to 40m

I went over to Downs today and was able to secure two 8-core machines, along with mounting rails.  These are very thin-looking 1U chassis computers.  I was told by Rolf that the big black box computers might be done tomorrow afternoon.  Alex also kept one of the 8-core machines, since he needed to replace a hard drive on it and also wanted to keep it for some further testing, although he didn't specify for how long.

I also put in a request with Alex and Rolf for the RCG system to produce code which includes memory location hooks for plant models automatically, along with a switch to flip from the real to simulated inputs/outputs.

 

  2772   Mon Apr 5 13:52:45 2010   Alberto   Update   Computers   Front-ends down. Rebooted

This morning, at about 12:00, Koji found all the front-ends down.

At 1:45pm I rebooted ISCEX, ISCEY, SOSVME, SUSVME1, SUSVME2, LSC, ASC, and ISCAUX.

Then I burtrestored ISCEX, ISCEY, and ISCAUX to April 2nd, 23:07.

The front-ends are now up and running again.

  2786   Sun Apr 11 13:51:04 2010   Alberto   Omnistructure   Computers   Where are the laptops?

I can't find the DELL laptop anywhere in the lab. Does anyone know where it is?

Also one of the two netbooks is missing.

  2787   Sun Apr 11 19:05:34 2010   Koji   Omnistructure   Computers   Where are the laptops?

One dell is in the clean room for the suspension work.

Quote:

I can't find the DELL laptop anywhere in the lab. Does anyone know where it is?

Also one of the two netbooks is missing.

 

  2791   Mon Apr 12 17:37:52 2010   josephb   Update   Computers   Y end simulated plant progress

Currently, the Y end plant is yep.mdl.  Compiling it properly (for the moment at least) requires running the normal makefile, then commenting out the line in the makefile which does the parsing of the mdl, and rerunning it after modifying the /cds/advLigo/src/fe/yep/yep.c file.

The modifications to the yep.c file are to change the six lines that look like:

"plant_mux[0] = plant_gndx"  into lines that look like "plant_mux[0] = plant_delayx".  You also have to add initialization of the plant_delayx type variables to zero in the if(feInt) section, near where plant_gndx is set to zero.

This is necessary to get the position feedback within the plant model to work properly.

 

#NOTE by Koji

CAUTION:
This entry means that Makefile was modified not to parse the mdl file.
This affects making any of the models on megatron.

Attachment 1: YEP.png
Attachment 2: YEP_PLANT.png
  2798   Tue Apr 13 12:49:35 2010   josephb   Update   Computers   Y end simulated plant progress

Quote:

Currently, the Y end plant is yep.mdl.  Compiling it properly (for the moment at least) requires running the normal makefile, then commenting out the line in the makefile which does the parsing of the mdl, and rerunning it after modifying the /cds/advLigo/src/fe/yep/yep.c file.

The modifications to the yep.c file are to change the six lines that look like:

"plant_mux[0] = plant_gndx"  into lines that look like "plant_mux[0] = plant_delayx".  You also have to add initialization of the plant_delayx type variables to zero in the if(feInt) section, near where plant_gndx is set to zero.

This is necessary to get the position feedback within the plant model to work properly.

 

#NOTE by Koji

CAUTION:
This entry means that Makefile was modified not to parse the mdl file.
This affects making any of the models on megatron.

 To prevent this confusion in the future, at Koji's suggestion I've created a Makefile.no_parse_mdl in /home/controls/cds/advLIGO on megatron.  The normal makefile is the original one (with correct parsing now).  So the correct procedure is:

1) "make yep"

2) Modify yep.c code

3) "make -f Makefile.no_parse_mdl yep"

  2808   Mon Apr 19 13:23:03 2010   josephb   Configuration   Computers   yum update fixed on control room machines

I went to Ottavia and tried running yum update.  It was having dependency issues with mjpegtools, which was an rpmforge-provided package.  In order to get it to update, I moved the rpmforge priority above that of epel (a lower number wins: epel went from 10 to 20, rpmforge from 20 to 10).  This resolved the problem and the updates proceeded (all 434 of them).  yum update on Ottavia now reports nothing needs to be done.

I went to Rosalba and found the rpmfusion repositories enabled.  In each file, only the first of the 3 repositories was enabled.

I then added priority settings to all the repositories on Rosalba.  I set CentOS-Base and related repos to priority=1.  I set the CentOS-Media.repo priority to 1 (although it is disabled - just to head off future problems).  I set all epel-related priorities to 20.  I set all rpmforge-related priorities to 10.  I set all rpmfusion-related priorities to 30, and left the first repo in rpmfusion-free-updates and rpmfusion-nonfree-updates enabled.  All other rpmfusion testing repositories were disabled by me.

I then had to downgrade expat by hand to expat-1.95.8-8.3.el5_4.2.x86_64 (the rpmforge version).  I also removed and reinstalled x264.x86_64.  Lastly, I removed and reinstalled lyx.  yum update was then run and completed successfully.

I installed yum-priorities on Allegra and made all CentOS-Base repositories priority 1.  I similarly made the still-disabled CentOS-Media priority 1.  I made all epel-related repos priority 20.  I made all lscsoft repos priority=50 (not sure why it's on Allegra and none of the other machines).  I made all rpmforge priorities 10.  I then ran "yum update", which updated 416 packages.
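
For reference, once the yum-priorities plugin is installed, the priority is just an extra line in each .repo stanza; a sketch (repo name illustrative):

[rpmforge]
enabled=1
priority=10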

 

So basically all the Centos control room machines are now using the following order for repositories:

CentOS-Base > rpmforge > epel > (rpmfusion - rosalba only) > lscsoft (allegra only)

I'm not sure if rpmfusion and lscsoft are necessary, but I've left them for now.  This should mean "yum update" will have far fewer problems in the future.

 

  2832   Thu Apr 22 15:44:34 2010   josephb   Update   Computers

I updated the default FILTER.adl file located in /home/controls/cds/advLigo/src/epics/util/ on megatron.  I moved the yellow ! button up slightly, and fixed the time string in the upper right.

Attachment 1: New_example_CDS_filter.png
  2844   Mon Apr 26 11:29:37 2010   josephb   Update   Computers   Updated bitwise.pm in RCG SVN plus other fixes

To fix a problem one of the models was having, I checked the CVS version of the Bitwise.pm file into the SVN (located in /home/controls/cds/advLigoRTS/src/epics/util/lib), which adds left and right bit-shifting functionality.  The yec model now builds with the SVN checkout.

Also, while trying to get things to work, I discovered the cdsRfmIO piece (used to read and write to the RFM card) now only accepts 8 bit offsets.  This means we're going to have to change virtually all of the RFM memory locations for the various channels, rather than using the values from its previous incarnation, since most were 4 bit numbers.  It also means it's going to eat up roughly twice as much space, as far as I can tell.

It turns out the problem we were having getting it to compile was nicely answered by Koji's elog post.  The shmem_daq value was not set equal to 1.  This caused it to look for myrinet header files which did not exist, causing compile-time errors.  The model now compiles on megatron.

[Edit by KA: 4 bit and 8 bit would mean "bytes". I don't recall which e-log of mine Joe is referring to.]

 

  2878   Tue May 4 14:57:53 2010   josephb   Update   Computers   Ottavia has moved

Ottavia was moved this afternoon from the control room into the lab, adjacent to Mafalda in 1Y3 on the top shelf.  It has been connected to the camera hub, as well as the normal network.  Its cables are clearly labeled.  Note the camera hub cable should be plugged into the lower ethernet port. Brief tests indicate everything is connected and it can talk to the control room machines.

The space where Ottavia used to be is now temporarily available as a good place to setup a laptop, as there is keyboard, mouse, and an extra monitor available.  Hopefully this space may be filled in with a new workstation in the near future.

  2891   Thu May 6 19:23:54 2010   Frank   Summary   Computers   svn problems

I tried to commit something this afternoon and got the following error message:

Command: Commit 
Adding: C:\Caltech\Documents\40m-svn\nodus\frank 
Error: Commit failed (details follow): 
Error: Server sent unexpected return value (405 Method Not Allowed) in response to  
Error: MKCOL request for '/svn/!svn/wrk/d2523f8e-eda2-d847-b8e5-59c020170cec/trunk/frank' 
Finished!:  

Has anyone had this before?  What's wrong?

  2971   Fri May 21 16:41:38 2010   Alberto, Jo   Update   Computers   It's a boy!

Today the new Dell computer for the GSCS (General SURF Computing Side) arrived.

We put it together and hooked it up to a monitor. And guess what? It works!

I'm totally impressed by how the windows get blurred on Windows 7 when you move them around.  Good job, Microsoft!  Totally worth 5 years of R&D.

  2998   Thu May 27 08:22:57 2010   Aidan   Update   Computers   Restarted the elog this morning
  3061   Wed Jun 9 21:05:44 2010   rana   Summary   Computers   op540m is not to be used

This is a reminder (mainly for Steve, who somehow doesn't believe these things) that op540m is not to be used for your general pleasure.

No web, no dataviewer, no DTT. Using these things often makes the graphical X-Windows crash. I have had to restart the StripTool, our seismic BLRMS and our Alarms many times because someone uses op540m, makes it crash, and then does not restart the processes.

Stop breaking op540m, Steve!

  3080   Wed Jun 16 11:31:19 2010   josephb   Summary   Computers   Removed scaling fonts from medm on Allegra

Because it was driving me crazy while working on the new medm screens for the simulated plant, I went and removed the aliased font entries in /usr/share/X11/fonts/misc/fonts.alias that are associated with medm.  Specifically, I removed the lines starting with widgetDM_.  I made a backup in the same directory, called fonts.alias.bak, with the old lines.
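
Roughly the equivalent of the following, done as root in /usr/share/X11/fonts/misc (a sketch of what was done by hand):

cp fonts.alias fonts.alias.bak
grep -v '^widgetDM_' fonts.alias.bak > fonts.alias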

Medm now behaves the same on op440m, rosalba, and allegra - i.e. it can't find the widgetDM_ scalable fonts and defaults to a legible fixed font.

  3081   Wed Jun 16 18:12:16 2010   nancy   Configuration   Computers   40MARS

I added my laptop's MAC address to the martian network at port 13 today.

 

  3083   Wed Jun 16 18:44:07 2010   Alberto   Configuration   Computers   40MARS

Quote:

I added my laptop's MAC address to the martian network at port 13 today.

 

No personal laptops are allowed on the martian network. Only access to the General Computing Side is permitted.

Please disconnect it.

  3106   Wed Jun 23 15:15:53 2010   josephb   Summary   Computers   40m computer security issue from last night and this morning

The following is not 100% accurate, but represents my understanding of the events currently.  I'm trying to get a full description from Christian and will hopefully be able to update this information later today.

 

Last night around 7:30 pm, Caltech detected evidence of a computer virus located behind a Linksys router with a MAC address matching our NAT router, at the IP 131.215.114.177.  We did not initially recognize the MAC address as the router's because the labeled MAC address was off by a digit, so we were looking for another old router for a while.  In addition, pings to 131.215.114.177 were not working from inside or outside of the martian network, but the router was clearly working.

However, about 5 minutes after Christian and Mike left, I found I could ping the address.  When I placed the address into a web browser, it brought us to the control interface for our NAT router (but only from the martian side; from the outside world it wasn't possible to reach it).

They turned on logging on the router (which had been off by default) and monitored the traffic for a short time.  Some unusual IP addresses showed up, and Mike said something about a warning about someone trying to IP-spoof coming up.  Something about a file-sharing port showing up was briefly mentioned as well.

The outside IP address was changed to 131.215.115.189, and DHCP, which apparently was on, was turned off.  The password was changed and is in the usual place we keep router passwords.

Update: Christian said Mike has written up a security report, and that he'll talk to him tomorrow and forward the relevant information to me.  He notes there is possibly an infected laptop/workstation still at large.  This could also be a personal laptop that was accidentally connected to the martian network.  Since it was found to be set to DHCP, it's possible a laptop was connected to the wrong side and the user might not have realized it.

 

  3115   Thu Jun 24 13:02:59 2010   Jenne   Update   Computers   Some lunchtime reboots

[Jenne, Megan, Frank]

We rebooted c1iovme, c1susvme1, and c1susvme2 during lunch.  Frank is going to write a thrilling elog about why c1iovme needed some attention.

c1susvme1 & 2 have had their overflow numbers on the DAQ_RFMnetwork screen red at 16384 for the past few days.  While we were booting computers anyway, we rebooted the suses.  Unfortunately, they're still red.  I'm too hungry right now to deal with it... more to follow.

  3120   Fri Jun 25 12:09:27 2010   kiwamu   Update   Computers   GPIB controller of HP8591E

I've just stolen a GPIB controller, a small yellow box, from the spectrum analyzer HP8591E.

The controller is going to be used for driving the old spectrum analyzer HP3563A for a while.

Gopal and I will be developing and testing GPIB program code for the HP3563A via the controller.

Once we get a new GPIB controller, this one will go back to its original place, i.e. the HP8591E.

 

--- GPIB controller ----

name: teofila

address: 131.215.113.106

  3159   Tue Jul 6 17:05:30 2010   Megan and Joe   Update   Computers   c1iovme reboot

We rebooted c1iovme because the lines stopped responding to inputs on C1:IOO-MC_DRUM1.  This fixed the problem.

  3172   Wed Jul 7 22:22:49 2010   Jenne   Update   Computers   Some channels not being recorded!!!

[Rana, Jenne]

We discovered to our great dismay that several important channels (namely C1:IOO-MC_L, but also everything on c1susvme2) are not being recorded, and haven't been since May 17th.  This corresponds to the same day that some other upgrade computers were installed.  Coincidence?

We've rebooted pretty much every FE computer and the FrameBuilder and DAQ_CONTROL approximately 18 times each (plus or minus some number).  No matter what we do, or what channels we comment out of the C1SUS2.ini file, we get a Status on the DAQ_Detail screen for c1susvme2 of 0x1000.  Except sometimes it is 0x2000.  Anyhow, it's bad, and we can't make it good again. 

I have emailed Joe about fixing this (with some assistance from Alberto, since we all know how much he likes doing the Nuclear Reboot option for the computers :)

  3176   Thu Jul 8 14:11:16 2010   josephb   Update   Computers   Some channels not being recorded!!!

This has been fixed, thanks to some help from Alex. It doesn't correspond to new computers being put in, but rather corresponds to a dcu_id change I had made in the new LSC model.

The fundamental problem is that way back when I built the new LSC model using "lsc" as the name instead of something like "tst", I forgot to go to the current frame builder master file (/cvs/cds/caltech/chans/daq/master) and comment out the C1LSC.ini line.  Initially there was no conflict with c1susvme, because the model initially used dcu_id 13.  The dcu_id was eventually changed from 13 to 10, and that's when it conflicted with the c1susvme2 dcu_id, which was also 10.  I checked this against the wiki edits to my dcu_id list page, and I apparently updated the list on May 20th when it changed from 13 to 10, so the time frame fits.  Apparently it was previously conflicting with C0GDS.ini or C1EXC.ini, which both seem to have dcu_id = 13 set, although the C1EXC file is all commented out.  The C0GDS.ini file seems to be LSC and ASC test points only.

The solution was to comment out the C1LSC.ini file line in the /cvs/cds/caltech/chans/daq/master file and restart the framebuilder with the fixed file.
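
For reference, the fix amounts to commenting out a single line in /cvs/cds/caltech/chans/daq/master, something like this (a sketch; the exact path follows the style of the other entries in that file):

#/cvs/cds/caltech/chans/daq/C1LSC.ini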

Quote:

[Rana, Jenne]

We discovered to our great dismay that several important channels (namely C1:IOO-MC_L, but also everything on c1susvme2) are not being recorded, and haven't been since May 17th.  This corresponds to the same day that some other upgrade computers were installed.  Coincidence?

We've rebooted pretty much every FE computer and the FrameBuilder and DAQ_CONTROL approximately 18 times each (plus or minus some number).  No matter what we do, or what channels we comment out of the C1SUS2.ini file, we get a Status on the DAQ_Detail screen for c1susvme2 of 0x1000.  Except sometimes it is 0x2000.  Anyhow, it's bad, and we can't make it good again. 

I have emailed Joe about fixing this (with some assistance from Alberto, since we all know how much he likes doing the Nuclear Reboot option for the computers :)

 

  3178   Thu Jul 8 15:19:27 2010   josephb, koji   Configuration   Computers   Added Zonet camera to IP table on linux1

We gave the Zonet camera the IP 192.168.113.26 and the name Zonet1.

We did this by modifying the /var/named/chroot/var/named/113.168.192.in-addr.arpa.zone and martian.zone files on linux1 as root.
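
The added records look roughly like this (a sketch; the TTL/serial bookkeeping in the real zone files is omitted, and named needs a reload to pick the changes up):

martian.zone:
Zonet1        IN   A     192.168.113.26

113.168.192.in-addr.arpa.zone:
26            IN   PTR   Zonet1.martian.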

  3179   Thu Jul 8 15:43:58 2010   rana   Update   Computers   Some channels not being recorded!!!

Quote:

This has been fixed, thanks to some help from Alex. It doesn't correspond to new computers being put in, but rather corresponds to a dcu_id change I had made in the new LSC model.

Just as I expected, since these hanuman didn't actually check MC_L after doing this stuff, MC_L was only recording ZERO.  Joe and I reset and restarted c1susvme2 and then verified (for real this time) that the channel was visible in both the Dataviewer real-time display as well as in the trend.

3-monkeys-ComicPosition.gif

The lesson here is that you NEVER trust that the problem has been fixed until you check for yourself.  Also, we must always specify a very precise test that must be used when we ask for help debugging some complicated software problem.

 

  3185   Fri Jul 9 11:09:14 2010   josephb   Update   Computers   Fb40m and a few other machines turned off briefly just before 11am

I turned off fb40m2 and fb40m temporarily while we added an extra power strip to the (new) 1X6 rack at the bottom in the back.  This is to allow for the addition of the 4600 computer given to us by Rolf (which needs a good name) into the rack above the fb machine.  fb40m2 was unfortunately plugged into the main power connectors, so we unplugged two of its cables and put them into the new strip.  While trying to undo some of the rat's nest of cables in the back, I also briefly powered down and unplugged c0dcu1, the PEM crate, and the myrinet bypass box.

I am in the process of bringing those machines back up and restoring the network.

Also this morning, Megatron was moved from the end station into the (new) 1X3 rack, along with its router.  This is to allow for the installation of the new end computer and IO chassis.

 
