40m Log
ID   Date   Author   Type   Category   Subject
  1686   Fri Jun 19 13:38:42 2009   Alberto   Configuration   Computers   elog rebooted

Today I found the elog down, so I rebooted it following the instructions in the wiki.

  1688   Fri Jun 19 14:30:47 2009   Alberto   Configuration   Computers   elog rebooted

Quote:

Today I found the elog down, so I rebooted it following the instructions in the wiki.

 I have the impression that Nodus has been rebooted since last night, hasn't it?

  1689   Sun Jun 21 00:08:26 2009   rana   Configuration   Computers   elog rebooted

 

nodus:log>dmesg

Sun Jun 21 00:06:43 PDT 2009
Mar  6 15:46:32 nodus sshd[26490]: [ID 800047 auth.crit] fatal: Timeout before authentication for 131.215.114.93
Mar 10 11:11:32 nodus sshd[22775]: [ID 800047 auth.crit] fatal: Timeout before authentication for 131.215.114.93
Mar 11 13:27:37 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to 131.215.115.52 port 7000: Connection refused
Mar 11 13:27:37 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to nodus port 7000: failed.
Mar 11 13:31:40 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to 131.215.115.52 port 7000: Connection refused
Mar 11 13:31:40 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to nodus port 7000: failed.
Mar 11 13:31:45 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to 131.215.115.52 port 7000: Connection refused
Mar 11 13:31:45 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to nodus port 7000: failed.
Mar 11 13:34:58 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to 131.215.115.52 port 7000: Connection refused
Mar 11 13:34:58 nodus sshd[7768]: [ID 800047 auth.error] error: connect_to nodus port 7000: failed.
Mar 12 16:09:23 nodus sshd[22785]: [ID 800047 auth.crit] fatal: Timeout before authentication for 131.215.114.93
Mar 14 20:14:42 nodus sshd[13563]: [ID 800047 auth.crit] fatal: Timeout before authentication for 131.215.114.93
Mar 25 19:47:19 nodus sudo: [ID 702911 local2.alert] controls : 3 incorrect password attempts ; TTY=pts/2 ; PWD=/cvs/cds ; USER=root ; COMMAND=/usr/bin/rm -rf kamioka/
Mar 25 19:48:46 nodus su: [ID 810491 auth.crit] 'su root' failed for controls on /dev/pts/2
Mar 25 19:49:17 nodus last message repeated 2 times
Mar 25 19:51:14 nodus sudo: [ID 702911 local2.alert] controls : 1 incorrect password attempt ; TTY=pts/2 ; PWD=/cvs/cds ; USER=root ; COMMAND=/usr/bin/rm -rf kamioka/
Mar 25 19:51:22 nodus su: [ID 810491 auth.crit] 'su root' failed for controls on /dev/pts/2
Jun  8 16:12:17 nodus su: [ID 810491 auth.crit] 'su root' failed for controls on /dev/pts/4

nodus:log>uptime
 12:06am  up 150 day(s), 11:52,  1 user,  load average: 0.05, 0.07, 0.07

  1699   Wed Jun 24 14:43:19 2009   rob   Update   Computers   tdsresp on linux => pzresp

 

tdsresp is broken on our Linux control room machines.  I made a little perl replacement, called pzresp, which uses the DiagTools.pm perl module.  It's in the $SCRIPTS/general directory, and so it is in the path on all the machines.  I also edited the cshrc.40m file so that on Linux machines tdsresp points to this perl replacement.

I've patched DiagTools.pm to circumvent the tdsdmd bug described here.  I also added a function to DiagTools.pm called diagRespNoLog, which is just like diagResp but without that pesky log file.
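For reference, the cshrc.40m change amounts to something like the following sketch (the uname test and the exact alias form are my assumptions, not the actual file contents):

# in cshrc.40m -- on Linux boxes, point tdsresp at the perl replacement
if ( `uname -s` == "Linux" ) then
    alias tdsresp '$SCRIPTS/general/pzresp'
endif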

 


Here's the output from the tdsresp binary on CentOS:
allegra:~>tdsresp 941.54 10000 100 10 C1:LSC-ITMX_EXC C1:LSC-PD1_Q C1:LSC-PD1_I
nan nan nan nan nan nan nan
nan nan nan nan nan nan nan
nan nan nan nan nan nan nan
*** glibc detected *** tdsresp: free(): invalid next size (fast): 0x089483e8 ***
======= Backtrace: =========
/lib/libc.so.6[0xa800f1]
/lib/libc.so.6(cfree+0x90)[0xa83bc0]
/usr/lib/libstdc++.so.6(_ZdlPv+0x21)[0xf7f36571]
tdsresp[0x8057fbb]
tdsresp[0x805b394]
/lib/libc.so.6(__libc_start_main+0xdc)[0xa2ce8c]
tdsresp(__gxx_personality_v0+0x169)[0x804ddd1]
======= Memory map: ========
00242000-00249000 r-xp 00000000 fd:00 15400987                           /lib/librt-2.5.so
00249000-0024a000 r--p 00006000 fd:00 15400987                           /lib/librt-2.5.so
0024a000-0024b000 rw-p 00007000 fd:00 15400987                           /lib/librt-2.5.so
009f9000-00a13000 r-xp 00000000 fd:00 15400963                           /lib/ld-2.5.so
00a13000-00a14000 r--p 00019000 fd:00 15400963                           /lib/ld-2.5.so
00a14000-00a15000 rw-p 0001a000 fd:00 15400963                           /lib/ld-2.5.so
00a17000-00b55000 r-xp 00000000 fd:00 15400974                           /lib/libc-2.5.so
00b55000-00b57000 r--p 0013e000 fd:00 15400974                           /lib/libc-2.5.so
00b57000-00b58000 rw-p 00140000 fd:00 15400974                           /lib/libc-2.5.so
00b58000-00b5b000 rw-p 00b58000 00:00 0 
00b5d000-00b70000 r-xp 00000000 fd:00 15400984                           /lib/libpthread-2.5.so
00b70000-00b71000 r--p 00012000 fd:00 15400984                           /lib/libpthread-2.5.so
00b71000-00b72000 rw-p 00013000 fd:00 15400984                           /lib/libpthread-2.5.so
00b72000-00b74000 rw-p 00b72000 00:00 0 
00b76000-00b78000 r-xp 00000000 fd:00 15400981                           /lib/libdl-2.5.so
00b78000-00b79000 r--p 00001000 fd:00 15400981                           /lib/libdl-2.5.so
00b79000-00b7a000 rw-p 00002000 fd:00 15400981                           /lib/libdl-2.5.so
00b7c000-00ba1000 r-xp 00000000 fd:00 15400975                           /lib/libm-2.5.so
00ba1000-00ba2000 r--p 00024000 fd:00 15400975                           /lib/libm-2.5.so
00ba2000-00ba3000 rw-p 00025000 fd:00 15400975                           /lib/libm-2.5.so
00bca000-00bdd000 r-xp 00000000 fd:00 15401011                           /lib/libnsl-2.5.so
00bdd000-00bde000 r--p 00012000 fd:00 15401011                           /lib/libnsl-2.5.so
00bde000-00bdf000 rw-p 00013000 fd:00 15401011                           /lib/libnsl-2.5.so
00bdf000-00be1000 rw-p 00bdf000 00:00 0 
00dca000-00dd5000 r-xp 00000000 fd:00 15400986                           /lib/libgcc_s-4.1.2-20080825.so.1
00dd5000-00dd6000 rw-p 0000a000 fd:00 15400986                           /lib/libgcc_s-4.1.2-20080825.so.1
08048000-080b7000 r-xp 00000000 00:17 6455328                            /cvs/cds/caltech/apps/linux/tds/bin/tdsresp
080b7000-080ba000 rw-p 0006e000 00:17 6455328                            /cvs/cds/caltech/apps/linux/tds/bin/tdsresp
080ba000-080bb000 rw-p 080ba000 00:00 0 
0893d000-0896b000 rw-p 0893d000 00:00 0                                  [heap]
f5e73000-f5e74000 ---p f5e73000 00:00 0 
f5e74000-f6874000 rw-p f5e74000 00:00 0 
f692d000-f6931000 r-xp 00000000 fd:00 15400995                           /lib/libnss_dns-2.5.so
f6931000-f6932000 r--p 00003000 fd:00 15400995                           /lib/libnss_dns-2.5.so
f6932000-f6933000 rw-p 00004000 fd:00 15400995                           /lib/libnss_dns-2.5.so
f6956000-f6a12000 rw-p f6a31000 00:00 0 
f6a74000-f6a7d000 r-xp 00000000 fd:00 15400997                           /lib/libnss_files-2.5.so
f6a7d000-f6a7e000 r--p 00008000 fd:00 15400997                           /lib/libnss_files-2.5.so
f6a7e000-f6a7f000 rw-p 00009000 fd:00 15400997                           /lib/libnss_files-2.5.so
f6a7f000-f6a80000 ---p f6a7f000 00:00 0 
f6a80000-f7480000 rw-p f6a80000 00:00 0 
f7480000-f7481000 ---p f7480000 00:00 0 
f7481000-f7e83000 rw-p f7481000 00:00 0 
f7e83000-f7f63000 r-xp 00000000 fd:00 6236924                            /usr/lib/libstdc++.so.6.0.8
f7f63000-f7f67000 r--p 000df000 fd:00 6236924                            /usr/lib
Abort

 
  1702   Thu Jun 25 17:27:42 2009   rana   Update   Computers   tdsresp on linux
I downloaded the tdsresp.pl from LLO and put it into the various TDS/bin paths. Also updated the LLO specific path stuff. It runs.
  1722   Wed Jul 8 11:13:36 2009   Alberto   Omnistructure   Computers   wireless router disconnected

Once again, this morning I found the wireless router disconnected from the LAN cable. No martian WiFi was available.

I wonder who has been doing that, and for what reason.

  1733   Sun Jul 12 20:06:44 2009   Jenne   DAQ   Computers   All computers down

I popped by the 40m, and was dismayed to find that all of the front end computers are red (only framebuilder, DAQcontroller, PEMdcu, and c1susvme1 are green... all the rest are RED).

 

I keyed the crates, and did the telnet.....startup.cmd business on them, and on c1asc I also pushed the little reset button on the physical computer and tried the telnet....startup.cmd stuff again.  Utter failure. 

 

I have to pick someone up from the airport, but I'll be back in an hour or two to see what more I can do.

  1735   Mon Jul 13 00:34:37 2009   Alberto   DAQ   Computers   All computers down

Quote:

I popped by the 40m, and was dismayed to find that all of the front end computers are red (only framebuilder, DAQcontroller, PEMdcu, and c1susvme1 are green... all the rest are RED).

 

I keyed the crates, and did the telnet.....startup.cmd business on them, and on c1asc I also pushed the little reset button on the physical computer and tried the telnet....startup.cmd stuff again.  Utter failure. 

 

I have to pick someone up from the airport, but I'll be back in an hour or two to see what more I can do.

 I think the problem was caused by a failure of the RFM network: the RFM MEDM screen showed frozen values even while I was power cycling the FE computers. So I tried the following things:

- resetting the RFM switch
- power cycling the FE computers
- rebooting the framebuilder

but none of them worked.  The FEs didn't come back. Then I reset C1DCU1 and power cycled C1DAQCTRL.

After that, I could restart the FEs by power cycling them again. They all came up except for C1DAQADW. Neither the remote reboot nor the power cycling could bring it up.

After every attempt to restart it, its lights on the DAQ MEDM screen turned green for only a fraction of a second and then went red again.

So far every attempt to reanimate it has failed.
  1736   Mon Jul 13 00:53:50 2009   Alberto   DAQ   Computers   All computers down

Quote:

Quote:

I popped by the 40m, and was dismayed to find that all of the front end computers are red (only framebuilder, DAQcontroller, PEMdcu, and c1susvme1 are green... all the rest are RED).

 

I keyed the crates, and did the telnet.....startup.cmd business on them, and on c1asc I also pushed the little reset button on the physical computer and tried the telnet....startup.cmd stuff again.  Utter failure. 

 

I have to pick someone up from the airport, but I'll be back in an hour or two to see what more I can do.

 I think the problem was caused by a failure of the RFM network: the RFM MEDM screen showed frozen values even while I was power cycling the FE computers. So I tried the following things:

- resetting the RFM switch
- power cycling the FE computers
- rebooting the framebuilder

but none of them worked.  The FEs didn't come back. Then I reset C1DCU1 and power cycled C1DAQCTRL.

After that, I could restart the FEs by power cycling them again. They all came up except for C1DAQADW. Neither the remote reboot nor the power cycling could bring it up.

After every attempt to restart it, its lights on the DAQ MEDM screen turned green for only a fraction of a second and then went red again.

So far every attempt to reanimate it has failed.

 

After Alberto's bootfest, which was more successful than mine, I tried power cycling the AWG crate one more time.  No success.  Just as Alberto had, I got the DAQ screen's AWG lights to flash green, then go back to red.  At Alberto's suggestion, I also gave the physical reset button another try.  Another round of flash-green-back-red ensued.

When I was in a few hours ago, while everything was hosed, all the other computers' 'lights' on the DAQ screen were solid red, but the two AWG lights were flashing between green and red, even though I was power cycling the other computers and not touching the AWG at the time.  Those are the lights which are now solid red, except for a quick flash of green right after a reboot.

I poked around in the history of the current and old elogs, and haven't found anything referring to this crazy blinking between good and bad for the AWG computers.  I don't know if this happens when tpman goes funky (which is referred to a lot in the annals of the elog, in the same entries as the AWG needing rebooting) and no one mentions it, or if this is a new problem.  Alberto and I have decided to get Alex/someone involved in this, because we've exhausted our ideas.

  1737   Mon Jul 13 15:14:57 2009   Alberto   Update   Computers   DAQAWG

Today Alex came over, performed his magic rituals on the DAQAWG computer and fixed it. Now it's up and running again.

I asked him what he did, but he's not sure what fixed it. He couldn't remember exactly, but he said he poked around, did something somewhere somehow, maybe tinkered with tpman, and eventually the computer came up again.

Now everything is fine.

  1743   Tue Jul 14 14:54:19 2009   steve   Configuration   Computers   fb40m2 in 1Y6

Alex and Steve,

A SunFire X4600 (not MEGATRON 2; it is fb40m2) and a JetStor RAID array (16 x 1 TB drives) were installed on side rails at the bottom of 1Y6.

We also cleaned up the fibres and cabling in 1Y7.

  1746   Wed Jul 15 08:59:30 2009   steve   Update   Computers   fb40m

The fb40m just went out of order with status indicator number 8

It recovered on its own five minutes later.

  1752   Wed Jul 15 17:18:24 2009   Jenne   DAQ   Computers   DAQAWG gone, now back

Yet again, the DAQAWG flipped out for an unknowable reason.  Following the order of restart activities listed on the wiki, I keyed the crate (nothing really happened), then hit the physical reset button (nothing happened), and then did the 'telnet....vmeBusReset'; a couple of minutes later, it was all good again.
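For the record, that last step amounts to roughly this (the hostname is a placeholder; the wiki procedure has the real target):

# reset the VME bus from the crate CPU's vxWorks shell
telnet <awg-crate-host>   # substitute the host from the wiki procedure
-> vmeBusReset            # typed at the vxWorks "->" prompt; the crate reboots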

  1756   Thu Jul 16 09:49:52 2009   Alan   Update   Computers   fb40m

Quote:

The fb40m just went out of order with status indicator number 8

It recovered on its own five minutes later.

 Backup script restarted, backup of trend frames and /cvs/cds is up-to-date.

 

  1766   Tue Jul 21 01:11:30 2009   Dmass   AoG   Computers   Alarms going off

I came into the 40m to briefly sign things out and then swiftly return them, and the alarms were going off on op540m at 1 am.

The cat and donkey(?) were making much noise.

  1771   Tue Jul 21 18:46:47 2009   steve   Configuration   Computers   computers are down

All suspensions are kicked up. SUS dampings and oplev servos are turned off.

c1iscey and c1lsc are down. c1asc and c1iovme are up-down.

  1772   Wed Jul 22 01:57:19 2009   Alberto   Configuration   Computers   computers are down

Quote:

All suspensions are kicked up. SUS dampings and oplev servos are turned off.

c1iscey and c1lsc are down. c1asc and c1iovme are up-down.

 The computers and the RFM network are up and working again. A boot fest was necessary. Then I restored all the parameters with burtgooey.

The mode cleaner alignment is in a bad state. The autolocker can't get it locked. I don't know what caused it to move so far from the good state it was in until this afternoon.  I tried tuning the periscope, but the cavity alignment is so bad that it's taking more time than expected. I'll continue working on that tomorrow morning.

  1773   Wed Jul 22 09:04:10 2009   Alberto   Configuration   Computers   computers are down

Quote:

Quote:

All suspensions are kicked up. SUS dampings and oplev servos are turned off.

c1iscey and c1lsc are down. c1asc and c1iovme are up-down.

 The computers and the RFM network are up and working again. A boot fest was necessary. Then I restored all the parameters with burtgooey.

The mode cleaner alignment is in a bad state. The autolocker can't get it locked. I don't know what caused it to move so far from the good state it was in until this afternoon.  I tried tuning the periscope, but the cavity alignment is so bad that it's taking more time than expected. I'll continue working on that tomorrow morning.

 I now suspect that after the reboot the MC mirrors didn't really go back to their original places, even though the MC sliders were at the same positions as before.

  1776   Wed Jul 22 11:14:11 2009   Alberto   Configuration   Computers   computers are down

Quote:

Quote:

Quote:

All suspensions are kicked up. SUS dampings and oplev servos are turned off.

c1iscey and c1lsc are down. c1asc and c1iovme are up-down.

 The computers and the RFM network are up and working again. A boot fest was necessary. Then I restored all the parameters with burtgooey.

The mode cleaner alignment is in a bad state. The autolocker can't get it locked. I don't know what caused it to move so far from the good state it was in until this afternoon.  I tried tuning the periscope, but the cavity alignment is so bad that it's taking more time than expected. I'll continue working on that tomorrow morning.

 I now suspect that after the reboot the MC mirrors didn't really go back to their original places, even though the MC sliders were at the same positions as before.

 Alberto, Rob,

We diagnosed the problem: it was related to sticky sliders. After a reboot of C1:IOO, the actual output of the DAC no longer corresponds to the values read on the sliders. To update the actual output, it is necessary to change the slider values, i.e. to jiggle them a bit.

  1777   Wed Jul 22 11:18:49 2009   rob   Configuration   Computers   sticky sliders

Quote:

Quote:

Quote:

Quote:

All suspensions are kicked up. SUS dampings and oplev servos are turned off.

c1iscey and c1lsc are down. c1asc and c1iovme are up-down.

 The computers and the RFM network are up and working again. A boot fest was necessary. Then I restored all the parameters with burtgooey.

The mode cleaner alignment is in a bad state. The autolocker can't get it locked. I don't know what caused it to move so far from the good state it was in until this afternoon.  I tried tuning the periscope, but the cavity alignment is so bad that it's taking more time than expected. I'll continue working on that tomorrow morning.

 I now suspect that after the reboot the MC mirrors didn't really go back to their original places, even though the MC sliders were at the same positions as before.

 Alberto, Rob,

We diagnosed the problem: it was related to sticky sliders. After a reboot of C1:IOO, the actual output of the DAC no longer corresponds to the values read on the sliders. To update the actual output, it is necessary to change the slider values, i.e. to jiggle them a bit.

 I've updated the slider twiddle script to include the MC alignment biases.  We should run this script whenever we reboot all the hardware, and add any new sticky sliders you find to the end of the script.  It's at

 

/cvs/cds/caltech/scripts/Admin/slider_twiddle
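The script's trick is just to nudge each sticky channel and put it back, so EPICS re-sends the value to the DAC. A minimal sketch for a single channel, assuming the standard EPICS caget/caput command-line tools (the channel name and offset size here are examples, not the script's actual contents):

#!/bin/csh
# nudge a slider and restore it so its value actually reaches the DAC
set chan = C1:SUS-MC1_PIT_BIAS            # example channel only
set val  = `caget -t $chan`               # read the current slider value
caput $chan `echo "$val + 0.01" | bc -l`  # jiggle it a bit
caput $chan $val                          # put it back where it was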

  1780   Wed Jul 22 18:04:14 2009   rob   Omnistructure   Computers   weird noise coming from Gigabit switch

in the rack next to the printer.  It sounds like a fan is hitting something.

  1781   Wed Jul 22 20:11:26 2009   pete   Update   Computers   RCG front end

I compiled and ran a simple (i.e. empty) front end controller on scipe12 at Wilson House.  I hooked a signal into the ADC and watched it in the auto-generated medm screens. 

There were a couple of gotchas:

1. Add an entry for SYS to the /etc/setup_shmem.rtl line in the file /etc/rc.local, where SYS.mdl is the model file (see the sketch after this list).

2. If necessary, do a BURT restore; or, in the case of a mockup, set the BURT Restore bit (in SYS_GDS_TP.adl) to 1.
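As an illustration of gotcha 1, the rc.local edit has roughly this shape (a sketch; "mdp mdc" stand in for whatever systems are already listed on the line):

# in /etc/rc.local -- append the new system's name (lowercase of SYS.mdl)
/etc/setup_shmem.rtl mdp mdc sys &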

 

  1785   Fri Jul 24 11:04:11 2009   Alberto   Configuration   Computers   elog restarted

This morning I found the elog down. I restarted it using the procedure in the wiki.

  1805   Wed Jul 29 12:14:40 2009   pete   Update   Computers   RCG work

Koji, Pete 

Yesterday, Jay brought over the IO box for megatron, and got it working.  We plan to firewall megatron this afternoon, with the help of Jay and Alex, so we can set up GDS there and play without worrying about breaking things.  In the meantime, we went to Wilson House to get some breakout boards so we can take transfer functions with the 785, for an ETMX controller.  We put in a sine wave, and all looks good on the auto-generated epics screens, with an "empty" system (no filters on). Next we'll load in filters and take transfer functions.

Unfortunately we promised to return the breakout boards by 1pm today.  This is because, according to denizens of Wilson House, Osamu "borrowed" all their breakout boards and these were the last two!  If we can't locate Osamu's cache, they expect to have more in a day or two.

Here is the transfer function of the through filter running at 16 kHz sampling. It looks fine except that the DC gain is ~0.8. Koji is going to characterize the digital downsampling filter in order to compare it with the generated code and the filter coefficients.


Attachment 1: TF090729_1.png
Attachment 2: TF090729_1.png
  1809   Wed Jul 29 19:31:17 2009   rana   Configuration   Computers   elog restarted

Just now found it dead. Restarted it. Is our elog backed up in the daily backups?

  1819   Mon Aug 3 13:47:42 2009   pete   Update   Computers   RCG work

Alex has firewalled megatron.  We have started a framebuilder there and added testpoints.  Now it is possible to take transfer functions with the shared memory MDC+MDP sandbox system.  I have also copied filters into MDC (the controller) and made a really ugly medm master screen for the system, which I will show to no one.

  1826   Tue Aug 4 13:40:17 2009   pete   Update   Computers   RCG work - rate

Koji, Pete

 

Yesterday we found that the channel C1:MDP-POS_EXC looked distorted in dataviewer and had what appeared to be doubled frequency components.  This was because the dcu_rate in the file /caltech/target/fb/daqdrc was set to 16K while the .adl file was set to 32K.  When daqdrc was corrected, the problem was fixed.  I am going to recompile and run all these models at 16K.  Once the 40m moves over to the new front end system, we may find it advantageous to use the faster speeds, but maybe it's a good idea to get everything working at 16K first.
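The fix is a one-line rate change in daqdrc so the framebuilder agrees with the model, along these lines (a sketch; the dcu number here is an example):

# in /caltech/target/fb/daqdrc -- acquisition rate must match the model rate
set dcu_rate 9=16384;    # example dcu number; 16K sampling = 16384 Hz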

  1827   Tue Aug 4 15:48:25 2009   Jenne   Update   Computers   mini boot fest

Last night Rana noticed that the overflows on the ITM and ETM coils were a crazy huge number.  Today I rebooted c1dcuepics, c1iovme, c1sosvme, c1susvme1 and c1susvme2 (in that order).  Rob helped me burt restore losepics and iscepics, which needs to be done whenever you reboot the epics computer.
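For reference, the restore can also be done from the command line with the BURT write-back tool; a sketch assuming the usual autoburt snapshot layout (the paths are my assumption):

# write back EPICS settings from the latest autoburt snapshots
burtwb -f /cvs/cds/caltech/burt/autoburt/latest/losepics.snap
burtwb -f /cvs/cds/caltech/burt/autoburt/latest/iscepics.snap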

Unfortunately this didn't help the overflow problem at all.  I don't know what to do about that.

  1828   Tue Aug 4 16:12:27 2009   rob   Update   Computers   mini boot fest

Quote:

Last night Rana noticed that the overflows on the ITM and ETM coils were a crazy huge number.  Today I rebooted c1dcuepics, c1iovme, c1sosvme, c1susvme1 and c1susvme2 (in that order).  Rob helped me burt restore losepics and iscepics, which needs to be done whenever you reboot the epics computer.

Unfortunately this didn't help the overflow problem at all.  I don't know what to do about that.

 

Just start by resetting them to zero.  Then you have to figure out what's causing them to saturate by watching time series and looking at spectra.

  1829   Tue Aug 4 17:51:25 2009   pete   Update   Computers   RCG work

Koji, Peter

 

We put a simple pendulum into the MDP model, and everything communicates.  We're still having some kind of TP or DAQ problem, so we're still in debugging mode.  We went back to 32K in the .adl's, and when driving MDP, the MDC-ETMX_POS_OUT is nasty: it follows the sine wave envelope but goes to zero 16 times per second.

 

The breakout boards have arrived.  The plan is to fix this DAQ problem, then demonstrate the model MDC/MDP system.  Then we'll switch to the "external" system (called SAM) and match the control TF to the model.  Then we'd like to hook up ETMX and run the system isolated from the rest of the IFO.  Finally, we'd like to tie it into the IFO using reflective memory.

  1831   Wed Aug 5 07:33:04 2009   steve   DAQ   Computers   fb40m is down
  1832   Wed Aug 5 09:25:57 2009   Alberto   DAQ   Computers   fb40m is up

FB40m up and running again after restarting the DAQ.

  1837   Wed Aug 5 15:57:05 2009   Alberto   Configuration   Computers   PMC MEDM screen changed

I added a clock to the PMC medm screen.

I made a backup of the original file in the same directory and named it *.bk20090805

  1839   Wed Aug 5 17:41:54 2009   pete   Update   Computers   RCG work - daq fixed

The DAQ on megatron was nuts.  Alex and I discovered that there was no GDS installation for site_letter=C (i.e. Caltech), so the default M was being used (for MIT).  Apparently we are the first Caltech installation.  We added the appropriate line to the RCG Makefile and recompiled and reinstalled (at 16K).  Now DV looks good on MDP and MDC, and I made a transfer function that replicates the bounce-roll filter.  So DTT works too.

  1854   Fri Aug 7 13:42:12 2009   ajw   Omnistructure   Computers   backup of frames restored

Ever since July 22, the backup script that runs on fb40m has failed to ssh to ldas-cit.ligo.caltech.edu to back up our trend frames and /cvs/cds.

This was a new failure mode which the scripts didn't catch, so I only noticed it when fb40m was rebooted a couple of days ago.

Alex fixed the problem (the RAID array was configured with the wrong IP address, which conflicted with the outside world), and I modified the script ( /cvs/cds/caltech/scripts/backup/rsync.backup ) to handle the new directory structure Alex made.

Now the backup is current and the automated script should keep it so, at least until the next time fb40m is rebooted...
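In outline, the script rsyncs the trend frames and /cvs/cds over ssh to the archive host, something like this sketch (not the actual contents of rsync.backup; the local and remote paths are assumptions):

# mirror trend frames and /cvs/cds to the LDAS archive host
rsync -a --delete /frames/trend/ ldas-cit.ligo.caltech.edu:/archive/40m/trend/
rsync -a --delete /cvs/cds/ ldas-cit.ligo.caltech.edu:/archive/40m/cvs/cds/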

 

  1856   Fri Aug 7 16:00:17 2009   pete   Update   Computers   RCG work. MDC MDP open loop transfer function

Today I was able to make a low-frequency transfer function with DTT on megatron.  There seems to have been a timing problem; perhaps Alex fixed it, or it is intermittent.

I have attached the open loop transfer function for the un-optimized system, which is at least stable to step impulses with the current filters and gains.  The next step is to optimize, transfer this knowledge to the ADC/DAC version, and hook it up to isolated ETMX.

Attachment 1: tf_au_natural.pdf
  1870   Sun Aug 9 16:32:18 2009   rana   Update   Computers   RCG work. MDC MDP open loop transfer function

This is very nice. We have, for the first time, a real time plant with which we can test our changes of the control system. From my understanding, we have a control system with the usual POS/PIT/YAW matrices and filter banks. The outputs go to a separate real-time system which is running something similar and where we have loaded the pendulum TF as a filter. Cross-couplings, AA & AI filters, and saturations to come later.

The attached plot is just the same as what Peter posted earlier, but with more resolution. I drove at the input to the SUSPOS filter bank and measured the open loop with the loop closed. The loop wants an overall gain of -0.003 or so to be stable.

Attachment 1: a.png
  1879   Mon Aug 10 17:36:32 2009   pete   Update   Computers   RCG work. PIT, YAW, POS in MDP/MDC system

I've added the PIT and YAW dofs to the MDC and MDP systems.  The pendulum frequencies in MDP are 0.8, 0.5, and 0.6 Hz for POS, PIT, and YAW, respectively.  The three dofs are linear, uncoupled, and stable, but there is no modeled noise in the system (yet) and some gains may need bumping up in the presence of noise.  The MDC filters are identical for each dof (3:0.0 and Cheby). The PIT and YAW transfer functions look pretty much like the one Rana recently took of POS, but of course with the different pendulum frequencies.  I've attached the one for YAW.

Attachment 1: mdcmdpyaw.jpg
  1881   Mon Aug 10 17:49:10 2009   pete   Update   Computers   RCG work - plans

Pete, Koji

 

We discussed a preliminary game plan for this project.  The thing I really want to see is an ETMX RCG controller hooked into the existing frontend via reflective memory, with the 40m behaving normally on this hybrid system; my list is geared toward that.  I suspect the list may cause controversy.

+ copy the MDC filters into SAM, and make sure everything looks good there with DTT and SR785.

+ get interface / wiring boards from Wilson House, to go between megatron and the analog ETMX system

+ test tying the ETMX pendulum and bare-bones SAM together (use existing watchdogs, and "bare-bones" needs defining)

+ work some reflective memory magic and create the hybrid frontend

 

In parallel with the above, the following should also happen:

+ MEDM screen design

+ add non-linear bits to the ETMX MDP/MDC model system

+ make game plan for the rest of the RCG frontend

  1887   Tue Aug 11 23:17:21 2009   rana   Summary   Computers   Nodus rebooted / SVN down

Looks like someone rebooted nodus at ~3 PM today but did not elog it. Also the SVN is not running. Why?

  1890   Wed Aug 12 10:35:17 2009   jenne   Summary   Computers   Nodus rebooted / SVN down

Quote:

Looks like someone rebooted nodus at ~3 PM today but did not elog it. Also the SVN is not running. Why?

 The nodus business was me... my bad.  Nodus and the elog were both having a bad day (we couldn't ssh into nodus from op440m, which doesn't depend on the name server), so I called Alex, and he fixed things, although I think all he did was reboot.  I then restarted the elog per the instructions on the wiki.

 

  1892   Wed Aug 12 13:35:03 2009   josephb, Alex   Configuration   Computers   Tested old Framebuilder 1.5 TB raid array on Linux1

Yesterday, Alex attached the old framebuilder 1.5 TB RAID array to linux1 and tested to make sure it would work there.

This morning he started copying the current /cvs/cds structure to it, but realized that at the rate it was going the copy would take roughly 5 hours, so he stopped.

The copy is currently planned for this coming Friday morning.

  1893   Wed Aug 12 15:02:33 2009   Alberto   Configuration   Computers   elog restarted

In the last hour or so the elog crashed. I have restarted it.

  1901   Fri Aug 14 10:39:50 2009   josephb   Configuration   Computers   Raid update to Framebuilder (specs)

The RAID array servicing the framebuilder was finally switched over to a JetStor SATA 16-bay RAID array. Each bay contains a 1 TB drive.  The RAID is configured such that 13 TB is available, with the rest used for fault protection.

The old Fibrenetix FX-606-U4, a 5-bay RAID array with only 1.5 TB of space, has been moved over to linux1 and will be used to store /cvs/cds/.

This upgrade extends the data lookback for all channels from 3-4 days to about 30 days.  Final copying of old data occurred on August 5, 2009, and the switchover happened on that date.

  1902   Fri Aug 14 14:19:25 2009   Koji   Summary   Computers   nodus rebooted

nodus was rebooted by Alex at Fri Aug 14 13:53. I launched elogd.

cd /export/elog/elog-2.7.5/
./elogd -p 8080 -c /export/elog/elog-2.7.5/elogd.cfg -D

  1903   Fri Aug 14 14:33:51 2009   Jenne   Summary   Computers   nodus rebooted

Quote:

nodus was rebooted by Alex at Fri Aug 14 13:53. I launched elogd.

cd /export/elog/elog-2.7.5/
./elogd -p 8080 -c /export/elog/elog-2.7.5/elogd.cfg -D

 It looks like Alex also rebooted all of the control room computers.  Or something.  The alarm handler and strip tool aren't running.....after I fix susvme2 (which was down when I got in earlier today), I'll figure out how to restart those.

  1904   Fri Aug 14 15:20:42 2009   josephb   Summary   Computers   Linux1 now has 1.5 TB raid drive

Quote:

Quote:

nodus was rebooted by Alex at Fri Aug 14 13:53. I launched elogd.

cd /export/elog/elog-2.7.5/
./elogd -p 8080 -c /export/elog/elog-2.7.5/elogd.cfg -D

 It looks like Alex also rebooted all of the control room computers.  Or something.  The alarm handler and strip tool aren't running.....after I fix susvme2 (which was down when I got in earlier today), I'll figure out how to restart those.

 Alex switched the mount point for /cvs/cds on linux1 to the 1.5 TB RAID array after he finished copying the data from the old drives.  This required a reboot of linux1, with all the resulting /cvs/cds mount points on the other computers becoming stale.  The easiest fix he found was to reboot all the control room machines.  In addition, a reboot fest should probably happen in the near future for all the front end machines, since they will also have stale mount points from linux1.

The 1.5 TB RAID array is now mounted on /home of linux1, which was the old mount point of the ~300 GB drive.  The old drive is now at /oldhome on linux1.

 

  1905   Fri Aug 14 15:29:43 2009   Jenne   Update   Computers   c1susvme2 was unmounted from /cvs/cds

When I came in earlier today, I noticed that c1susvme2 was red on the DAQ screens.  Since the vme computers always seem to be happier as a set, I hit the physical reset buttons on sosvme, susvme1, and susvme2.  I then telnetted or sshed in to each computer in turn, as appropriate.  sosvme and susvme1 came back just fine.  However, I couldn't cd to /cvs/cds/caltech/target/c1susvme2 while sshed in to susvme2.  I could cd to /cvs/cds, but an ls there came back totally blank.  There was nothing at all in the folder. 

Yoichi showed me how to do 'df' to figure out what filesystems are mounted, and it looked as though the filesystem was mounted.  But then Yoichi tried to unmount the filesystem, and it claimed that it wasn't mounted at all.  We then remounted the filesystem, and things were good again.  I was able to continue the regular restart procedure, and the computer is back up again.
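The sequence was essentially this sketch (the server:path argument comes from the host's mount tables and is an assumption here):

df                                  # filesystem looked mounted...
umount /cvs/cds                     # ...but unmounting claimed it was not
mount linux1:/home/cds /cvs/cds     # remount from the fileserver (path assumed)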

Recap: c1susvme2 mysteriously got unmounted from /cvs/cds!  But it's back, and the computers are all good again.

  1906   Fri Aug 14 15:32:50 2009   Yoichi   HowTo   Computers   nodus boot procedure
The restart procedures for the various processes running on nodus are explained here:

http://lhocds.ligo-wa.caltech.edu:8000/40m/Computer_Restart_Procedures#nodus

Please go through those steps when you reboot nodus or notice that it has been rebooted, and then elog it.
I did these steps this time.
  1910   Sat Aug 15 10:36:02 2009   Alan   HowTo   Computers   nodus boot procedure

Quote:
The restart procedures for the various processes running on nodus are explained here:

http://lhocds.ligo-wa.caltech.edu:8000/40m/Computer_Restart_Procedures#nodus

Please go through those steps when you reboot nodus or notice that it has been rebooted, and then elog it.
I did these steps this time.


fb40m was also rebooted. I restarted the ssh-agent for backup of minute-trend and /cvs/cds.