ATF eLog
ID   Date   Author   Type   Category   Subject
  444   Mon Nov 16 11:33:16 2009   Aidan   Computing   DAQ   Unable to restart framebuilder ...

I can't get the framebuilder restarted because the network topology is now totally different and the old commands now seem obsolete. The 'restart fb0' button in the medm screens doesn't work anymore either. We need a description of how to restart the DAQ on the wiki. We even have a shiny new page to put the information on.

https://vcs.ligo.caltech.edu:448/wiki/wiki?DAQ/Restarting_The_DAQ

 

  445   Mon Nov 16 11:50:27 2009   Aidan   Computing   DAQ   Unable to restart framebuilder ...

Quote:

I can't get the framebuilder restarted because the network topology is now totally different and the old commands now seem obsolete. The 'restart fb0' button in the medm screens doesn't work anymore either. We need a description of how to restart the DAQ on the wiki. We even have a shiny new page to put the information on.

https://vcs.ligo.caltech.edu:448/wiki/wiki?DAQ/Restarting_The_DAQ

 

The frame builder is running now. It's not at all obvious whether this is the result of something I did that took a particularly long time to initialize, or whether someone else logged in and got things going.

 

  446   Tue Nov 17 12:13:23 2009   Frank   Computing   DAQ   Unable to restart framebuilder ...

 Quote:

Quote:

I can't get the framebuilder restarted because the network topology is now totally different and the old commands now seem obsolete. The 'restart fb0' button in the medm screens doesn't work anymore either. We need a description of how to restart the DAQ on the wiki. We even have a shiny new page to put the information on.

https://vcs.ligo.caltech.edu:448/wiki/wiki?DAQ/Restarting_The_DAQ

 

The frame builder is running now. It's not at all obvious whether this is the result of something I did that took a particularly long time to initialize, or whether someone else logged in and got things going.

 

As we figured out yesterday, everything is working fine and all the old scripts are working too. The changes to the network and the computers do not affect the scripts: the names and configuration of the machines are the same as before. Only the IP addresses are different, and the primary location of the directories mounted on every machine has changed. These changes are transparent, so you don't see them when working at the user level.

The problem was caused by one of the channels configured for the fb. We don't know why: the channel exists and its configuration seems to be OK, but adding this channel to the fb causes an error in the fb, and removing it again makes everything work fine. It's a testpoint, so maybe something is wrong there and we have to recreate the list of testpoints. When those files are edited manually, things sometimes go wrong...

  455   Thu Nov 19 16:06:09 2009   Alberto   Computing   General   Elog debugging output - Down time programmed today to make changes

We want the elog process to run in verbose mode so that we can see what's going on. The idea is to track the events that trigger the elog crashes.

Following an entry on the Elog Help Forum, I added this line to the elog starting script start-elog-nodus:

./elogd -p 8080 -c /cvs/cds/caltech/elog/elog-2.7.5/elogd.cfg -D -v > elogd.log 2>&1

This replaces the old line, which did not have the part with the -v argument.

With the -v argument, the verbose output should be written into a file called elogd.log in the elog's directory on Nodus.

I haven't restarted the elog yet because someone might be using it. I'm planning to do it later on today.

So be aware that:

We'll be restarting the elog today at 6.00pm PT. During this time the elog might not be accessible for a few minutes.
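
For reference, the "> elogd.log 2>&1" part of the command above is ordinary shell redirection: standard output goes to elogd.log and standard error is sent to the same place. Below is a minimal Python sketch of the same idea. It uses a stand-in child process rather than elogd itself, and the helper name run_logged is made up for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_logged(cmd, log_path):
    """Run cmd with stdout and stderr both written to log_path.

    Equivalent in spirit to the shell form:  cmd > log_path 2>&1
    """
    with open(log_path, "w") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


# Stand-in child process (NOT elogd): writes one line to each stream.
child = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]

with tempfile.TemporaryDirectory() as tmp:
    log_file = os.path.join(tmp, "elogd.log")
    exit_code = run_logged(child, log_file)
    with open(log_file) as fh:
        captured = fh.read()  # contains both lines; their order is not guaranteed
```

Both streams end up in the one file, which is what makes a single log useful for tracking down a crash.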

  456   Thu Nov 19 18:50:22 2009   Alberto   Computing   General   Elog debugging output - Down time programmed today to make changes

Quote:

We want the elog process to run in verbose mode so that we can see what's going on. The idea is to track the events that trigger the elog crashes.

Following an entry on the Elog Help Forum, I added this line to the elog starting script start-elog-nodus:

./elogd -p 8080 -c /cvs/cds/caltech/elog/elog-2.7.5/elogd.cfg -D -v > elogd.log 2>&1

This replaces the old line, which did not have the part with the -v argument.

With the -v argument, the verbose output should be written into a file called elogd.log in the elog's directory on Nodus.

I haven't restarted the elog yet because someone might be using it. I'm planning to do it later on today.

So be aware that:

We'll be restarting the elog today at 6.00pm PT. During this time the elog might not be accessible for a few minutes.

 

 I tried applying the changes but they didn't work. It seems that nodus doesn't like the command syntax.

I have to go through the problem...

The elog is up again.

  459   Sat Nov 21 16:03:58 2009   Frank   Computing   DAQ   FE error messages

For a couple of days now we have been getting error messages from the FE that look like this:

100_0370.JPG

The performance of the DAQ seems to be unaffected by this. Everything is working fine as far as I could test it. Has anyone seen this before?

  460   Sat Nov 21 17:52:27 2009   Koji   Computing   DAQ   FE error messages

Yes, I get the same message on megatron at the 40m. At each "startSYS" (SYS is the three-letter system name), I get something like the following. I assume this is normal.

adc card on bus 7; device 0 prim 7
ACPI: PCI Interrupt 0000:07:00.0[A] -> Link [LNKD] -> GSI 19 (level, low) -> IRQ 16
pci0 = 0xfdfffc00
pci2 = 0xfdfff800
ADC I/O address=0xfdfff800  0xffffc200000e2800
BCR = 0x242e0
RAG = 0x117d8
BCR = 0x84260
SSC = 0x16
IDBC = 0x1f
dac card on bus f; device 4
ACPI: PCI Interrupt 0000:0f:04.0[A] -> Link [LNKD] -> GSI 19 (level, low) -> IRQ 16
pci0 = 0xfe0ffc00
dac pci2 = 0xfe0ff800
DAC I/O address=0xfe0ff800  0xffffc200000f8800
DAC BCR = 0x30080
DAC BCR after init = 0x30080
DAC CSR = 0xffff
DAC BOR = 0x3417

Quote:

For a couple of days now we have been getting error messages from the FE that look like this:

100_0370.JPG

The performance of the DAQ seems to be unaffected by this. Everything is working fine as far as I could test it. Has anyone seen this before?

 

  461   Mon Nov 23 14:28:51 2009   Mott   Computing   General   slight change to network topology

I put the fb in the DMZ of the router down in the lab, to see if I can get mDV working without running ndsproxy. Previously, ws1 was in the DMZ, so any connections to the network (i.e. ssh) went to ws1. Because of this change we cannot ssh directly into ws1 for the moment; logging into the external IP will bring you to fb, from which you can log into ws1 if need be. I will check if mDV is actually working and, if so, figure out how to allow direct login to ws1.

  462   Mon Nov 23 15:50:07 2009   Mott   Computing   General   slight change to network topology

It looks like both mDV and ligoDV can now access channels in the lab. I will take a look at getting a direct login port to ssh into ws1, but for the time being the access is through fb.

Right now, ssh'ing to 131.215.115.216 on port 22 gets you to fb. Unfortunately, it doesn't appear to be possible to use port forwarding with DMZ enabled, at least with the router we have, so we cannot have direct access to any machine except fb.

  463   Mon Nov 23 17:01:01 2009   Frank   Computing   DAQ   FE error messages

Is this a new feature? I haven't seen this before.

Quote:

Yes, I get the same message on megatron at the 40m. At each "startSYS" (SYS is the three-letter system name), I get something like the following. I assume this is normal.

adc card on bus 7; device 0 prim 7
ACPI: PCI Interrupt 0000:07:00.0[A] -> Link [LNKD] -> GSI 19 (level, low) -> IRQ 16
pci0 = 0xfdfffc00
pci2 = 0xfdfff800
ADC I/O address=0xfdfff800  0xffffc200000e2800
BCR = 0x242e0
RAG = 0x117d8
BCR = 0x84260
SSC = 0x16
IDBC = 0x1f
dac card on bus f; device 4
ACPI: PCI Interrupt 0000:0f:04.0[A] -> Link [LNKD] -> GSI 19 (level, low) -> IRQ 16
pci0 = 0xfe0ffc00
dac pci2 = 0xfe0ff800
DAC I/O address=0xfe0ff800  0xffffc200000f8800
DAC BCR = 0x30080
DAC BCR after init = 0x30080
DAC CSR = 0xffff
DAC BOR = 0x3417

Quote:

For a couple of days now we have been getting error messages from the FE that look like this:

100_0370.JPG

The performance of the DAQ seems to be unaffected by this. Everything is working fine as far as I could test it. Has anyone seen this before?

 

 

  470   Mon Dec 7 14:01:18 2009   Aidan   Computing   fubar   Front-end is down ...

I tried to make a change to the front-end in Simulink and compile it on fb0 - which is supposedly now our front-end machine. For some reason it won't build the .rtl file that is the front-end itself. When I tried to revert to the backed-up model I had the same problem. It doesn't look like anyone has tried to rebuild the front-end since late September and there have been some changes to the network since then.

I'm going to track down Alex and sort this out.

  472   Mon Dec 7 17:18:40 2009   Aidan   Computing   DAQ   Front-end is back up (Alex's comments)

Quote:

I tried to make a change to the front-end in Simulink and compile it on fb0 - which is supposedly now our front-end machine. For some reason it won't build the .rtl file that is the front-end itself. When I tried to revert to the backed-up model I had the same problem. It doesn't look like anyone has tried to rebuild the front-end since late September and there have been some changes to the network since then.

I'm going to track down Alex and sort this out.

 

Some piece of the code wasn't getting generated properly. /cvs/cds/advLigo/src/fe/atf/atffe.rtl was missing, so the makes did not go as planned. This was fixed by a reboot of fb0 with minor complications:

I tried to reboot fb0 via ssh, and it wouldn't boot. It kept giving errors (on the monitor on top of the rack) saying something about not being able to find fb1. We power cycled the bottom box (just fb0?), and still nothing. Alex got on the case and fixed it. Apparently the problem is a trash file that gets generated if the RCG model is compiled on a computer that is not running RTLinux. That one was my fault.

Here are his comments:

Email 1:

Hi Aidan,

Seems to compile fine in ~/advLigo. I did make clean-atf and then make atf. What kind of an error message were you getting?
-alex

My reply:

Hi Alex,
I just tried it again and this is what happens:
$ make install-daq-atf
Installing ... (this all works okay, until)
/bin/cp: cannot stat 'src/fe/atf/atffe.rtl' No such file or directory
Of course when you try 'startatf' the log.txt file in target/c2atf/ just says:
/cvs/cds/caltech/target/c2atf/atffe.rtl: command not found
Thoughts?
Aidan.

Email 2:

OK, I have it fixed. Have you tried building this on some other computer?
advLigo/src/fe/atf/GNUmakefile got created somehow, and it does get
created if you try building on a computer not running the RTLinux
system. I have put some code into the build script to clean this file up
and the build is working fine now.

-alex

  480   Fri Dec 11 19:33:54 2009   Frank   Computing   DAQ   Beckhoff (Laser) channel to EPICS channel translation moved to 10.0.1.12

As the IBM laptop converting the laser (Beckhoff) channels to EPICS channels has stopped working very often within the last two months, I decided to move this part to the new OPC server I set up a couple of weeks ago for the environmental monitoring. I first wanted to test the reliability of the new computer before moving everything. Since this afternoon everything is running on this new computer. Its name is opc-server and its IP address is 10.0.1.12. It's running WinXP (it has to), and if someone needs access to it, please use the monitor on top of the rack and the KVM switch (it's labeled). It also runs a VNC server on port 5900 for remote access (only from inside the network; it is not available from the outside, as that would be a security risk).

  483   Sat Dec 12 18:02:47 2009   Dmass   Computing   DAQ   framebuilding funniness

The framebuilder wouldn't start. Now fixed. This is what I did:

  • I rebooted fb1 via power cycling.
  • Rebooted fb0 the same way
  • did a startatf.
  • did a killatf
  • Couldn't telnet into fb (or fb0) via port 8088 or 8087
  • used the "restart fb0" button in the FE diagnostic

The process seemed to respawn some minutes after this.

  499   Thu Dec 17 19:52:22 2009   Aidan   Computing   DAQ   framebuilding funniness - "hold please"

Quote:

The framebuilder wouldn't start. Now fixed. This is what I did:

  • I rebooted fb1 via power cycling.
  • Rebooted fb0 the same way
  • did a startatf.
  • did a killatf
  • Couldn't telnet into fb (or fb0) via port 8088 or 8087
  • used the "restart fb0" button in the FE diagnostic

The process seemed to respawn some minutes after this.

 

 

I encountered the same problem this evening. I tried to set another channel to acquire in the frame builder. I uncommented C2:ATF-GENERIC_GEN7_IN_DAQ and set acquire = 1.

When I restarted FB0 from the FE diagnostic screen I just got a message from FB0 saying that the process was spawning too fast and it was going to wait 5 minutes and try again. This repeated many times.

INIT: Id "fb" respawning too fast: disabled for 5 minutes.

Right now it is still trying to start fb again.

  500   Thu Dec 17 20:40:35 2009   Aidan   Computing   DAQ   Restarting DAQ - something else to avoid

I had some issues restarting the FE. It turned out there was a typo in C2ATF.ini, so one of the data rates was not 2^N. I fixed that and the FE started nicely again. I even locked up the PMC again.
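
A quick way to catch this kind of typo before restarting is to scan the ini file for data rates that are not a power of two. The following is only a sketch: it assumes the file parses as a plain ini file with one section per channel (as in the C2ATF.ini excerpts in this log), the sample text and the 8912 typo are made up, and bad_datarates is a hypothetical helper name.

```python
import configparser

# Made-up sample in the style of C2ATF.ini; the second datarate has a typo.
SAMPLE = """\
[C2:ATF-GENERIC_GEN1_IN1_DAQ]
acquire=1
datarate=8192
[C2:ATF-GENERIC_GEN2_IN1_DAQ]
acquire=1
datarate=8912
"""


def bad_datarates(ini_text):
    """Return {channel: raw datarate} for rates that are not 2^N."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    bad = {}
    for name in parser.sections():
        if not parser.has_option(name, "datarate"):
            continue
        raw = parser.get(name, "datarate")
        try:
            rate = int(raw)
        except ValueError:
            bad[name] = raw  # not even an integer
            continue
        # A positive power of two has exactly one bit set.
        if rate <= 0 or rate & (rate - 1):
            bad[name] = raw
    return bad


suspects = bad_datarates(SAMPLE)
for channel, rate in suspects.items():
    print(f"{channel}: datarate={rate} is not a power of two")
```

For a real file one would read the text from disk and pass it to bad_datarates; an empty result means every datarate found is 2^N.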

 

  501   Thu Dec 17 23:49:38 2009   Dmass   Computing   DAQ   Rebuilt ATF.mdl, added channel

I redrew some of the architecture of atf.mdl to better reflect what I was doing in lab, and am now using channels 23-30 for the Mach Zehnder. I also added the channels to the framebuilder with the following naming scheme:

MZ_ACx_IN - the xth AC coupled channel

MZ_DCx_IN - the input of the xth DC coupled channel

MZ_ACplusDCx - the xth recombined channel with optimal SNR filter applied. These correspond to channels 1, 3, 5, 7 of the output matrix associated with these channels.

Dataviewer made me type their names in by hand, though DTT was fine with them. I suspect it's a naming thing. Everything seems to be kosher in terms of the DAQ.

  514   Mon Dec 28 16:20:34 2009   Frank   Computing   DAQ   booting VME crates from fb1

Short introduction on how to set up a VME crate to boot its OS from fb1:

I copied most of the stuff from Peter's Sun workstations to a directory on fb1 (/caltech/target/vme/). For new projects we should create a clean structure of all the important directories we need; so far the layout is historic and only meant for a first test. I created a link "vme" in the root directory pointing to this directory.

To boot vxWorks from fb1, fb1 had to be configured as a remote shell server. The package can be installed using "yum install rsh-server". rsh can be enabled using the command "chkconfig rsh on", and "service xinetd restart" starts the service. The remote machine that should be able to access the computer as root has to be on a list of trusted computers. This list is located in the file "/root/.rhosts", which has to be created when setting this up for the first time. Put the name of the trusted remote machine in there.

Setting up the VME CPU is easy: connect a terminal to the serial port (9600 8N1). After applying power it shows something like this:


VEND DEV  REV BASE0    BASE1    BASE2    BASE3    BASE4    BASE5
---- ---- --- -------- -------- -------- -------- -------- --------
1011    9  22 30000000 20000000 ........ ........ ........ ........

                            VxWorks System Boot


Copyright 1984-1996  Wind River Systems, Inc.

CPU: Heurikon Baja4700
Version: 5.3.1
BSP version: 1.1/1
Creation date: Feb 25 1998, 16:45:50

Press any key to stop auto-boot...

Now press any key and then "c" to edit the configuration, and enter all the information required:

host name            : host name of the machine you boot the OS from
file name            : file name of the OS, including the path
inet on ethernet (e) : IP address:netmask (netmask in hex)
host inet (h)        : IP address of the host named above
user (u)             : should be root
target name (tn)     : network name of your VME card, same as added to the .rhosts file!
startup script (s)   : script started when booting

example:

boot device          : dec
processor number     : 0
host name            : fb1
file name            : /vme/baja/vxWorks
inet on ethernet (e) : 10.0.1.150:ffffff00
host inet (h)        : 10.0.1.11
user (u)             : root
flags (f)            : 0x0
target name (tn)     : tcs_vme1
startup script (s)   : /vme/acav/startup.cmd

That's it. After rebooting, it should now boot vxWorks from fb1:

                            VxWorks System Boot

Copyright 1984-1996  Wind River Systems, Inc.

CPU: Heurikon Baja4700
Version: 5.3.1
BSP version: 1.1/1
Creation date: Feb 25 1998, 16:45:50

Press any key to stop auto-boot...
 0
auto-booting...

boot device          : dec
processor number     : 0
host name            : fb1
file name            : /vme/baja/vxWorks
inet on ethernet (e) : 10.0.1.150:ffffff00
host inet (h)        : 10.0.1.11
user (u)             : root
flags (f)            : 0x0
target name (tn)     : tcs_vme1
startup script (s)   : /vme/acav/startup.cmd

Mapping RAM base to A32 space at 0x40000000... done.
Waiting for VME /SYSFAIL signal to clear... done.
Attaching network interface dec0...
DEC 21140 Ethernet driver ver. 02t - Copyright 1996-1998 Artesyn Technologies
done.
Attaching network interface lo0... done.
Loading... 795028
Starting at 0x80010000...

Mapping RAM base to A32 space at 0x40000000... done.
No longer waiting for sysFail to clear. D.Barker 14th Sept 1998
Attaching network interface dec0...
DEC 21140 Ethernet driver version 02n - Copyright 1996-1997 Heurikon Corp.
done.
Attaching network interface lo0... done.
Mounting NFS file systems from host fb1 for target tcs_vme1:
...done
Loading symbol table from fb1:/vme/baja/vxWorks.sym ...done


 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
      ]]]]]]]]]]]  ]]]]     ]]]]]]]]]]       ]]              ]]]]         (R)
 ]     ]]]]]]]]]  ]]]]]]     ]]]]]]]]       ]]               ]]]]
 ]]     ]]]]]]]  ]]]]]]]]     ]]]]]] ]     ]]                ]]]]
 ]]]     ]]]]] ]    ]]]  ]     ]]]] ]]]   ]]]]]]]]]  ]]]] ]] ]]]]  ]]   ]]]]]
 ]]]]     ]]]  ]]    ]  ]]]     ]] ]]]]] ]]]]]]   ]] ]]]]]]] ]]]] ]]   ]]]]
 ]]]]]     ]  ]]]]     ]]]]]      ]]]]]]]] ]]]]   ]] ]]]]    ]]]]]]]    ]]]]
 ]]]]]]      ]]]]]     ]]]]]]    ]  ]]]]]  ]]]]   ]] ]]]]    ]]]]]]]]    ]]]]
 ]]]]]]]    ]]]]]  ]    ]]]]]]  ]    ]]]   ]]]]   ]] ]]]]    ]]]] ]]]]    ]]]]
 ]]]]]]]]  ]]]]]  ]]]    ]]]]]]]      ]     ]]]]]]]  ]]]]    ]]]]  ]]]] ]]]]]
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]       Development System
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]]
 ]]]]]]]]]]]]]]]]]]]]]]]]]]]       VxWorks version 5.3.1
 ]]]]]]]]]]]]]]]]]]]]]]]]]]       KERNEL: WIND version 2.5
 ]]]]]]]]]]]]]]]]]]]]]]]]]       Copyright Wind River Systems, Inc., 1984-1997

                               CPU: Heurikon Baja4700.  Processor #0.
                              Memory Size: 0x4000000.  BSP version 1.1/1.
                             WDB: Ready.

Executing startup script /vme/acav/startup.cmd ...
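
The "inet on ethernet (e)" value in the boot parameters above packs the IP address and a hexadecimal netmask into one string (10.0.1.150:ffffff00). As a sanity check before typing it into the boot prompt, the value can be decoded and compared against the host address. This is a small sketch using Python's standard ipaddress module; the function name parse_inet_on_ethernet is made up for illustration.

```python
import ipaddress


def parse_inet_on_ethernet(value):
    """Decode 'a.b.c.d:hexmask' (VxWorks boot-line style) into an IPv4Interface."""
    addr, _, hexmask = value.partition(":")
    if not hexmask:
        # No mask given: treat it as a single host address.
        return ipaddress.ip_interface(addr)
    mask = ipaddress.IPv4Address(int(hexmask, 16))  # ffffff00 -> 255.255.255.0
    return ipaddress.ip_interface(f"{addr}/{mask}")


iface = parse_inet_on_ethernet("10.0.1.150:ffffff00")
boot_host = ipaddress.ip_address("10.0.1.11")  # "host inet (h)" from the example

print("target address :", iface.ip)
print("netmask        :", iface.netmask)
print("network        :", iface.network)
print("host reachable on same subnet:", boot_host in iface.network)
```

If the host address does not fall inside the decoded network, the crate would need a gateway to reach fb1, which the boot parameters shown here do not set.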

  521   Sun Jan 10 20:01:41 2010   Dmass   Computing   DAQ   Problems with transfer functions

As usual, my problems in understanding the results of a measurement I made resulted from not understanding all the pieces of my setup. I was naively assuming that the DAC -> AI path drove some non-differential amplifier. This is wrong. It seems that the AI board has a 3rd order Butterworth topology (as shown in linked document).

I will now be making my filter measurements in analog and the world will be a better place for it.

 

  527   Thu Jan 14 01:53:11 2010   Dmass   Computing   DAQ   DAQ Changes

I noticed that the CPU load was almost at capacity (14/15) for the frontend, which was running (unnecessarily) at 64k. I also noticed that when you tried to do a killatf on the frontend, it would spit out:

CA.Client.Exception...........

....................................
    Warning: "Identical process variable names on multiple servers"
    Context: "Channel: "C2:ATF-ACCoup_AC2_Name03", Connecting to: fb0:5064, Ignored: 10.0.0.10:5064"
    Source File: ../cac.cpp line 1224
    Current Time: Tue Jan 12 2010 17:13:12.568973000
 
I can't restart the frontend without restarting the framebuilder.

I changed the clock rate in the model to 32k, and changed the associated .ini file (C2ATF.ini). I built the model (make atf, etc) without issue, but ran into the same problem when I tried to kill the old code. I restarted the computer as a workaround, and the frontend wouldn't mount /dev/sdb1 (the local drive for the frontend machine). It wanted me to run fsck, so I did.

The fsck spat out:

"Inodes that were part of a corrupted orphan linked list found. fix?       INODE 19400785 was part of the orphaned inode list. FIXED"

I just told it to fix everything as it spat out a series of requests like this. Afterwards, the code would still run just fine, though you could not restart it. Any time you sent killatf to fb, it gave the error and you had to kill the kill process (with ctrl-c).

I emailed Alex about this.

For now, we have a working system, though you have to reboot the computer to restart the frontend code (e.g. if you want to change the .ini file).

 

  528   Thu Jan 14 02:01:43 2010   Dmass   Computing   DAQ   Optimal Filters

I calculated the optimal filters for recombining the AC and DC channels, as well as the filter needed to recover the signal.

For the AC coupling I have set up, the optimal filter is just the AC coupled filter, repeated, in the digital part of the AC path.

This is the setup:

               _______ AC couple ___ Digitization ___ AC couple ___
              |                                                    |
---point1-----|                                                   (+)------> Recombination Filter ---point2
              |___________________ Digitization ___________________|

The recombination filter I got using MATLAB's roots function on an algebraic expression:

If A is the AC couple filter, then the recombination filter is B = 1/(1+A*A)

Attached is a plot of the signal recovery, the Transfer function of point2/point1
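
The algebra behind B = 1/(1+A*A) can be checked numerically: the AC path applies A twice, the DC path is unity, and the sum times B should give a flat response from point1 to point2. The sketch below assumes, purely for illustration, that A is a single-pole high-pass with a 1 Hz corner and that digitization is ideal (unity); the actual AC coupling filter is not specified in this entry.

```python
def ac_couple(f, f0):
    """Assumed single-pole high-pass response at frequency f, corner f0 (both in Hz)."""
    s = 1j * f  # the 2*pi factors cancel in the ratio below
    return s / (s + f0)


def recombination(f, f0):
    """Recombination filter B = 1 / (1 + A*A)."""
    a = ac_couple(f, f0)
    return 1.0 / (1.0 + a * a)


def point2_over_point1(f, f0):
    """Transfer function of the whole chain in the diagram."""
    a = ac_couple(f, f0)
    summed = a * a + 1.0  # AC path (A applied twice) plus DC path (unity)
    return summed * recombination(f, f0)


corner_hz = 1.0  # illustrative value only
freqs_hz = [0.01, 0.1, 1.0, 10.0, 100.0]
tf = [point2_over_point1(f, corner_hz) for f in freqs_hz]
for f, h in zip(freqs_hz, tf):
    print(f"{f:7.2f} Hz  |H| = {abs(h):.6f}")
```

With a different A (for example a higher-order filter), only ac_couple needs to change; the identity itself does not depend on the filter shape.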

Attachment 1: ACCoupRecomboTF.pdf

  530   Thu Jan 14 14:52:44 2010   Dmass   Computing   DAQ   Optimal Filters

Quote:

I calculated the optimal filters for recombining the AC and DC channels, as well as the filter needed to recover the signal.

For the AC coupling I have set up, the optimal filter is just the AC coupled filter, repeated, in the digital part of the AC path.

This is the setup:

               _______ AC couple ___ Digitization ___ AC couple ___
              |                                                    |
---point1-----|                                                   (+)------> Recombination Filter ---point2
              |___________________ Digitization ___________________|

The recombination filter I got using MATLAB's roots function on an algebraic expression:

If A is the AC couple filter, then the recombination filter is B = 1/(1+A*A)

Attached is a plot of the signal recovery, the Transfer function of point2/point1

 

I guess zpk or foton doesn't like negative poles.

I tried: zpk([11.3;11.3;11.4;11.4],[1.2928+i*1.0486;1.2928-i*1.0486;-1.2761+i*1.6634;-1.2761-i*1.6634],1,"n"), which made foton choke.

The poles were obtained as described above.

I am unsure if this is just a zpk problem which can be bypassed by having some routine calculate the SOS filter to shove into the frontend. I hope this is the case.

 

Acausal filters are bad
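
To see which of the attempted poles are the "negative" ones, the list can be split by the sign of the real part. This sketch only does the bookkeeping; whether a given sign means stable or unstable depends on the tool's sign convention, which is not verified here. The function names are made up for illustration.

```python
# Pole list from the zpk attempt above.
poles = [
    complex(1.2928, 1.0486),
    complex(1.2928, -1.0486),
    complex(-1.2761, 1.6634),
    complex(-1.2761, -1.6634),
]


def split_by_real_part(values):
    """Return (negative_real, non_negative_real) lists."""
    negative = [p for p in values if p.real < 0]
    non_negative = [p for p in values if p.real >= 0]
    return negative, non_negative


def is_conjugate_closed(values, tol=1e-9):
    """True if every root has its complex conjugate in the list (needed for a real filter)."""
    return all(any(abs(p.conjugate() - q) < tol for q in values) for p in values)


negative, non_negative = split_by_real_part(poles)
print("negative real part    :", negative)
print("non-negative real part:", non_negative)
print("conjugate pairs intact:", is_conjugate_closed(poles))
```

Half of the poles land on each side, so no single sign convention makes the whole set acceptable as typed.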

 

  532   Fri Jan 15 16:46:14 2010   Frank   Computing   General   monitor utility

I installed a PC monitor utility on fb1 as a test. The utility provides the following EPICS channels:

C2:PCMON-FB1_CPU
C2:PCMON-FB1_CPUIDLE
C2:PCMON-FB1_CPUNICE
C2:PCMON-FB1_CPUSYSTEM
C2:PCMON-FB1_CPUUSER
C2:PCMON-FB1_LOAD15MIN
C2:PCMON-FB1_LOAD1MIN
C2:PCMON-FB1_LOAD5MIN
C2:PCMON-FB1_MEM
C2:PCMON-FB1_MEMAV
C2:PCMON-FB1_MEMBUFF
C2:PCMON-FB1_MEMFREE
C2:PCMON-FB1_MEMSHRD
C2:PCMON-FB1_MEMUSED
C2:PCMON-FB1_SWAPAV
C2:PCMON-FB1_SWAPCACH
C2:PCMON-FB1_SWAPFREE
C2:PCMON-FB1_SWAPUSED
C2:PCMON-FB1_WD
C2:PCMON-FB1_HBT
C2:PCMON-FB1_HBTMOD
C2:PCMON-FB1_LOAD
C2:PCMON-FB1_CNT
C2:PCMON-FB1_BOOTTIME
C2:PCMON-FB1_IPADDR
C2:PCMON-FB1_MACHINE
C2:PCMON-FB1_RELEASE
C2:PCMON-FB1_SYSNAME
C2:PCMON-FB1_TIME
C2:PCMON-FB1_UPTIME
C2:PCMON-FB1_VERSION

All those channels are now available on fb1. If the tool is useful we can install it on the other machines as well to monitor e.g. CPU load, memory usage or free disk space. An medm screen is available, but not for everybody, as it has to be run from the command line with macro information to generate the channel names within the screen.

medm.xwd1263602599.ps
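
The channel names above follow one pattern, C2:PCMON-<HOST>_<FIELD>, which is what lets a single screen serve any machine once the host name is supplied as a macro. Here is a small sketch of that name generation, assuming the same pattern would be used on other machines; the helper name pcmon_channels is made up, and fb0 is only an example since the utility is installed on fb1 alone so far.

```python
# A few of the field suffixes from the channel list above.
PCMON_FIELDS = ("CPU", "LOAD1MIN", "MEMUSED", "UPTIME")


def pcmon_channels(host, fields=PCMON_FIELDS, prefix="C2:PCMON"):
    """Build monitor channel names for one machine from its host name."""
    return [f"{prefix}-{host.upper()}_{field}" for field in fields]


fb1_channels = pcmon_channels("fb1")
fb0_channels = pcmon_channels("fb0")  # hypothetical: not installed there yet

for name in fb1_channels:
    print(name)
```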

  533   Sat Jan 16 23:16:45 2010   Aidan   Computing   DAQ   Restarted front end ... and then it broke

I switched some channels to acquire for the fiber stuff. That is, after saving the current version of C2ATF.ini as C2ATF.ini.old100116, I edited the file to set the following channels to acquire: C2:ATF-GENERIC_GENx_IN1_DAQ and C2:ATF-GENERIC_GENx_OUT_DAQ for x = 1:7. Unfortunately, I meant to reboot the framebuilder and I accidentally restarted the whole front-end. I rebooted fb0 and did a startatf but now the front-end is not responding - at least, all of the EPICS displays corresponding to the DAQ inputs are showing 0 rather than fluctuating values.

It's been a really long day and I'm too tired to try fixing this right now.

  535   Mon Jan 18 17:53:13 2010   Aidan   Computing   DAQ   Front end restarted - "burtrestore = 1" was the trick

Got the front-end going again. It turned out to be straightforward. I tried the following:

  1. Restored C2ATF.ini from the backed up version
  2. Logged into fb0 and rebooted: [controls@fb0 ~]$ sudo reboot
  3. Once fb0 was back up I logged back in as controls and started the front end: [controls@fb0 ~]$ startatf
  4. The EPICS channel displays went from white to showing all zeros. 
  5. On the C2ATF_GDS_TP.adl screen (the FE Diagnostic screen) I switched burtrestore from 0 to 1 and this brought all the channels back up to full operation

  536   Tue Jan 19 09:17:09 2010   Aidan   Computing   DAQ   New fiber channels in frame

I added the following channels to C2ATF.ini

 

# The following channels were added by Aidan Brooks on 18th Jan 2010
# They are just C2:ATF-GENERIC_GENx_IN1_DAQ and GENx_OUT_DAQ for x =
# 1:7
# GEN1_IN1 - BNC CH9 on Anti-Alias Chassis
[C2:ATF-FIBER_PD_OUTLOOP_DC_IN1]
datatype=4
chnnum=10241
acquire=1
datarate=8192
[C2:ATF-FIBER_PD_OUTLOOP_DC_OUT]
chnnum=10243
acquire=1
datarate=8192
datatype=4
# GEN2_IN1 - BNC CH10 on Anti-Alias Chassis
[C2:ATF-FIBER_PD_INLOOP_DC_IN1]
datarate=8192
acquire=1
datatype=4
chnnum=10244
[C2:ATF-FIBER_PD_INLOOP_DC_OUT]
datatype=4
datarate=8192
chnnum=10246
acquire=1
# GEN3_IN1 - BNC CH11 on Anti-Alias Chassis
[C2:ATF-FIBER_REF_BEAM_PWR_IN1]
chnnum=10247
datatype=4
acquire=1
datarate=8192
[C2:ATF-FIBER_REF_BEAM_PWR_OUT]
datatype=4
datarate=8192
acquire=1
chnnum=10249
# GEN4_IN1 - BNC CH12 on Anti-Alias Chassis
[C2:ATF-FIBER_TRANS_BEAM_PWR_IN1]
chnnum=10250
datatype=4
datarate=8192
acquire=1
[C2:ATF-FIBER_TRANS_BEAM_PWR_OUT]
datarate=8192
datatype=4
acquire=1
chnnum=10252
# GEN5_IN1 - BNC CH13 on Anti-Alias Chassis
[C2:ATF-FIBER_PD_OUTLOOP_AC_IN1]
datarate=8192
acquire=1
datatype=4
chnnum=10253
[C2:ATF-FIBER_PD_OUTLOOP_AC_OUT]
datatype=4
datarate=8192
acquire=1
chnnum=10255
# GEN6_IN1 - BNC CH14 on Anti-Alias Chassis
[C2:ATF-FIBER_PD_INLOOP_AC_IN1]
datatype=4
chnnum=10256
acquire=1
datarate=8192
[C2:ATF-FIBER_PD_INLOOP_AC_OUT]
datarate=8192
datatype=4
chnnum=10258
acquire=1
# GEN7_IN1 - BNC CH15 on Anti-Alias Chassis
[C2:ATF-FIBER_DBL_TRANS_PWR_IN1]
chnnum=10259
datarate=8192
datatype=4
acquire=1
[C2:ATF-FIBER_DBL_TRANS_PWR_OUT]
chnnum=10261
datarate=8192
datatype=4
acquire=1
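
With this many hand-edited blocks it is easy to drop a key or reuse a channel number. A possible consistency check is sketched below. It assumes that the four keys seen in the listing (datatype, chnnum, acquire, datarate) are all needed for each channel and that chnnum should be unique; neither requirement is confirmed in this log. The third channel in the sample is invented to show a failure, and check_channels is a hypothetical helper name.

```python
import configparser

REQUIRED_KEYS = ("datatype", "chnnum", "acquire", "datarate")  # assumption

# First two blocks are from the listing above; the third is made up.
SAMPLE = """\
[C2:ATF-FIBER_PD_OUTLOOP_DC_IN1]
datatype=4
chnnum=10241
acquire=1
datarate=8192
[C2:ATF-FIBER_PD_OUTLOOP_DC_OUT]
chnnum=10243
acquire=1
datarate=8192
datatype=4
[C2:ATF-EXAMPLE_BROKEN]
chnnum=10243
acquire=1
datarate=8192
"""


def check_channels(ini_text):
    """Return a list of human-readable problems found in the channel blocks."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    problems = []
    seen = {}  # chnnum -> first channel that used it
    for name in parser.sections():
        missing = [k for k in REQUIRED_KEYS if not parser.has_option(name, k)]
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
        if parser.has_option(name, "chnnum"):
            num = parser.get(name, "chnnum")
            if num in seen:
                problems.append(f"{name}: chnnum {num} already used by {seen[num]}")
            else:
                seen[num] = name
    return problems


problems = check_channels(SAMPLE)
for line in problems:
    print(line)
```

The key order inside a block does not matter to the parser, which matches the mixed ordering in the listing above.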

  561   Thu Jan 28 21:08:27 2010   Dmass   Computing   DAQ   PD Trend Confusion...

I see the following periodically in the trend of my photodetectors. The x axis is data point number (sample rate = 4096) and the y axis is counts. The signal goes from -3500 to -9300 as the Mach Zehnder goes through fringes. I don't know what the jumps to zero are.

  • Digital problem - downsampling filters doing crazy ish?
  • Photodetectors output going bye bye (unlikely)?
Attachment 1: confused.pdf

  565   Fri Jan 29 17:53:03 2010   Dmass   Computing   fubar   Mental Computing Broken

I was attempting to replicate the pwelch command using the Numerical Recipes formula for Welch's periodogram PSD estimate. Koji helped me figure out that the calculation was missing a factor of bandwidth: even though it said explicitly it was a PSD, it was a power spectrum. He also helped me figure out some stupid errors I was making in indexing.

I have recreated the algorithm now, and am going to use this, along with TFestimate to do frequency domain subtraction. I think this could still be a good thing even with the MZWino code, but it will be good to at least compare.
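
The missing factor of bandwidth can be made concrete. A windowed, averaged periodogram normalized by (sum of window)^2 gives a power spectrum (units^2); dividing by the equivalent noise bandwidth, ENBW = fs * sum(w^2) / (sum(w))^2, turns it into a PSD (units^2/Hz). The sketch below is a plain-Python illustration of that relationship, not the code referred to in this entry: it uses a naive DFT, a periodic Hann window, 50% overlap and no detrending, and the test signal and sample rate are arbitrary.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_half(x):
    """Naive O(N^2) DFT, bins 0..N/2 only; fine for short segments."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n // 2 + 1)
    ]


def welch(x, fs, nseg):
    """Return one-sided (power_spectrum, psd) averaged over 50%-overlapping segments."""
    w = hann(nseg)
    s1 = sum(w)
    s2 = sum(v * v for v in w)
    enbw = fs * s2 / (s1 * s1)  # equivalent noise bandwidth in Hz
    starts = range(0, len(x) - nseg + 1, nseg // 2)
    acc = [0.0] * (nseg // 2 + 1)
    for start in starts:
        seg = [x[start + i] * w[i] for i in range(nseg)]
        for k, v in enumerate(dft_half(seg)):
            acc[k] += abs(v) ** 2
    ps = []
    for k, total in enumerate(acc):
        fold = 1.0 if k in (0, nseg // 2) else 2.0  # fold in negative frequencies
        ps.append(fold * total / len(starts) / (s1 * s1))
    psd = [p / enbw for p in ps]  # this division is the "factor of bandwidth"
    return ps, psd


fs = 4096.0  # Hz, illustrative
nseg = 64
k0 = 8       # sine sits exactly on bin 8
amp = 3.0
x = [amp * math.sin(2 * math.pi * k0 * n / nseg) for n in range(512)]

ps, psd = welch(x, fs, nseg)
print("power spectrum at bin 8:", ps[k0])   # amp^2 / 2 for an on-bin sine
print("PSD at bin 8           :", psd[k0])  # same value divided by ENBW
```

For the Hann window the ENBW is 1.5 * fs / nseg, so the two results differ by a fixed factor that depends on the segment length; comparing against pwelch without that factor gives exactly the kind of mismatch described above.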

  566   Sat Jan 30 04:13:53 2010   Dmass   Computing   Doubling   Frequency Domain Subtraction

Frequency Domain Subtraction results attached. I estimated the transfer function between the two by averaging over the data in the normal way (50% overlap, Hann window), applied it to the data, and used Welch's periodogram method to estimate the PSD.

 I am not sure why the black curve seems to want a lot more averaging. Perhaps there is a typo somewhere in my code? I will post the code, and a better description of my algorithm later.

Attachment 1: 19b1MZSubtrFreqDom.pdf

  572   Wed Feb 3 03:06:23 2010   Dmass   Computing   DAQ   NDS server down?

I was unable to run get_data in MATLAB:

Warning: C2:ATF-MZ_DC4_IN, at channel list index 1, could not be found.
Warning: Some of the requested channels could not be found in the
server-provided channel list

I telnet'ed into fb0, and restarted it. I won't reboot the frontend without first checking that others aren't doing stuff with it. This did not fix the problem.

 

  573   Wed Feb 3 17:15:12 2010   Dmass   Computing   DAQ   NDS INFO FOR NOT THE 40m

If you want to run get_data on your local machine, but the data you want is not on the 40m, do the following:

  • Get the contents of /svn/trunk/mDV from the 40m.
  • Get ligotools (Google will tell you more).
  • Change mdv_config2 (in .../mDV/) so that the ligotools path agrees with what you have on your machine:
    paths.ligotools_matlab = '/ligoapps/ligotools/bin';  (something like this - I recommend this file structure so that we can all svn happily together)
  • Run mdv_config2 in MATLAB.
  • You are ready to use get_data.

NOTE: If you do not run mdv_config2, this will try to get data from the 40m NDS server, which will fail.

NOTE2: The only other change that is made to the mdv_config file is changing the server from nodus to 131.215.115.216:8088

  574   Thu Feb 4 00:59:23 2010   rana   Computing   fubar   ELOG restarted: no more .ps files

I restarted the ELOG on NODUS just now. Our attempt to set up error logging worked - it turns out ELOG was choking on the .ps file attachment.

So for the near future: NO MORE .PS files! Use PDF - move into the 20th century at least.

MATLAB can directly make either PNG or PDF files for you; you can also use various other conversion tools on the web.

Of course, it would be nice if nodus could handle .ps, but it's a Solaris machine and I don't feel like debugging this. Eventually, we'll give him away and make the new nodus a Linux box, but that day is not today.

To restart the elog:  http://lhocds.ligo-wa.caltech.edu:8000/40m/How_To

  596   Sat Feb 13 13:29:22 2010 KojiComputingDoublingMATLAB MZ SUBTRACTION CODE

  597   Sun Feb 14 04:31:09 2010 DmassComputingDoublingMATLAB MZ SUBTRACTION CODE

Quote:

Do you detrend x1-x4 before you calculate phi532 and phi1064? I thought detrending changes the DC values and thus leads to wrong results for the phases.

Quote:

I have diagrammed my MZ subtraction algorithm. I am redoing my code now that I have realized a few things so that another person can read it. I will update the guide with the associated MATLAB code names, and continue to put everything in the SVN.

 

x1,x2,x3,x4 are the time series of my PDs during a segment of free running MZ data.

n1,...,n4 is a (detrended) time series of my noise in each PD due to some source.

 

 oops, I do not. Changing diagram to reflect this.

  598   Tue Feb 16 03:32:58 2010 dmassComputingDoublingMatlab MZ Subtraction Code Update

I have added all my subtraction code to the SVN, and will try to keep it under version control. It should also now be comprehensible to anyone who wants to read it. It is in this directory on the 40m svn:

/svn/trunk/mDV/extra/C2/dmass/mzdata/

I have included a .png diagram of the code.

  599   Tue Feb 16 11:39:38 2010 FrankComputingGeneralfb0 error

We had a failure of one of the hard disks over the weekend, so fb0 will be down until we have replaced that disk...

  602   Tue Feb 16 19:28:25 2010 FrankComputingGeneralfb0 is now working again

It turned out that one of the hard disks had failed, so I had to replace it. The device contains all "full frame" data, nothing else. Unfortunately I didn't have a spare disk of that size, so I replaced it with a smaller one. The new volume is therefore empty and no past (full) data is available. Trend data is OK.

So if nobody needs the full data from the last couple of weeks, I will send the disk back to WD to get it replaced. If the data is/was important, we can wait until I get a new disk of the same size (I ordered one today). If it shows up in the next two days or so, we could try to copy most of the full data to the new one, but only if really required, as it takes ~6h or so to duplicate the disk and I would like to avoid setting everything up. So if anyone needs the old full data, please let me know. If I don't hear anything within the next two days, I will send the broken disk back to WD. Again: any TREND data is good, only FULL data is broken.

  603   Wed Feb 17 01:27:24 2010 FrankComputingGeneralfb0 is now working again

Quote:

It turned out that one of the hard disks had failed, so I had to replace it. The device contains all "full frame" data, nothing else. Unfortunately I didn't have a spare disk of that size, so I replaced it with a smaller one. The new volume is therefore empty and no past (full) data is available. Trend data is OK.

So if nobody needs the full data from the last couple of weeks, I will send the disk back to WD to get it replaced. If the data is/was important, we can wait until I get a new disk of the same size (I ordered one today). If it shows up in the next two days or so, we could try to copy most of the full data to the new one, but only if really required, as it takes ~6h or so to duplicate the disk and I would like to avoid setting everything up. So if anyone needs the old full data, please let me know. If I don't hear anything within the next two days, I will send the broken disk back to WD. Again: any TREND data is good, only FULL data is broken.

 No data I have taken needs to be recovered. If it costs us *very* little effort/money, I would like to have the last few days, but I can retake it all too.

  604   Wed Feb 17 03:40:10 2010 DmassComputingDoublingMATLAB MZ SUBTRACTION CODE

I have updated the 40m SVN with all the code needed to run the MZ subtraction. The data files are about 250 megs total. I will upload them to the elog if that is acceptable; if not, I will upload the downsampled data (which should be about 16 megs).

I put the transfer function inside the subtraction function (because I was doing it wrong before). Now f_domainsubtraction.m should be useful as a general frequency domain subtraction tool.

Attachment 1: MZ_Code_Diagram.png
  616   Mon Feb 22 13:14:26 2010 FrankComputingDAQfb0 replacement disks arrived

The new disk for the framebuilder arrived. So I will try to move the data from the last days before the crash to the disk which is currently installed. As I was asked to copy only the last few days, the space on this smaller disk is sufficient. Once that is done I will partition the new, larger disk and copy everything to it. Is there any time when I can't shut down the frontend? Any plans for working with the RT stuff so far? If not, I'll try to do this late today or tomorrow afternoon, depending on the work on the cavity stuff...

  618   Mon Feb 22 15:44:47 2010 FrankComputingDAQday(s) for data rescue?

Which days do we need to copy? It's a lot more data than expected, 64 GB per day (!), so please let me know which days need to be copied. We should time it well, as the framebuilder will delete the old data almost instantly to free space for the new data. So if we delete all current data, how long we can access the old stuff before it is deleted automatically depends on how much of it we copy.
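
The trade-off can be put into numbers with a little arithmetic. This is a rough sketch in Python: the 64 GB/day rate is from this entry, but the disk capacity in the usage note is hypothetical, and the code assumes the framebuilder deletes the oldest frames first once the volume is full.

```python
def retention_days(capacity_gb, rate_gb_per_day=64.0, reserved_gb=0.0):
    """Days of new full-frame data that fit into the free space."""
    if rate_gb_per_day <= 0:
        raise ValueError("rate must be positive")
    return max(capacity_gb - reserved_gb, 0.0) / rate_gb_per_day


def access_window_days(capacity_gb, copied_days, rate_gb_per_day=64.0):
    """Days until new data fills the space left after copying old days back.

    Assumes each copied day occupies rate_gb_per_day and that the oldest
    (i.e. the copied) data is deleted first once the volume is full.
    """
    return retention_days(
        capacity_gb,
        rate_gb_per_day,
        reserved_gb=copied_days * rate_gb_per_day,
    )
```

For a hypothetical 640 GB volume, an empty disk holds 10 days of full frames; copying 3 old days back leaves 7 days before the copied data starts to be deleted.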

  630   Wed Mar 3 15:49:09 2010 FrankComputingDAQhard disk replaced

I've replaced the temporary hard disk for the full frames with a new, larger one. As nobody wants data from the busted one, I will send it to the manufacturer for an exchange now...

  647   Sat Mar 6 15:02:37 2010 DmassComputingDAQTest Points Busted

Test Points are broken... When I try to access general test points (only 2 at a time), there are 8 Hz combs on them. DAQ channels seem to be fine to call with DTT. Spectra of several test points that I have tried to look at have all had this 8 Hz comb on them.

When this happened before, I emailed Alex and he said "oh that's weird, it seems test points break when you try to access more than 3 of them at once". In the interest of time, I will work around this by using DAQ channels, but may look into it after the LV meeting.

  649   Mon Mar 8 13:53:42 2010 DmassComputingDAQDon't do this to the front end...

Channel 9 of the ADC was going through a filter called "temp" to channel 9 of the DAC and railing it.

  650   Mon Mar 8 21:06:35 2010 ZachComputingDAQDon't do this to the front end...

That was us. Didn't think about the slow loop railing when the cavity's not locked or when there are long term length shifts in temperature... We'll make sure to shut this off when we're not using it in the future and possibly use an output amplifier if we need to use it continuously over several days, etc.

Quote:

Channel 9 of the ADC was going through a filter called "temp" to channel 9 of the DAC and railing it.

 

  684   Fri Mar 19 15:20:47 2010 DmassComputingDAQFoton Broken?

I am trying to change the filters in my front end model via foton. I edit them, and click save. This does not work...

  • When I click "load coefficients" in the medm screen, nothing happens.
  • When I click save again, it says "no changes need to be saved"
  • When I switch out of the filter bank I edit, and switch back, it is the same
  • When I close Foton and reopen it, none of my changes have been saved

 

  717   Tue Apr 6 21:32:42 2010 FrankComputingGeneralspare hard disks

Got twenty 250 GB hard disks today (the same as the ones we already use in both the frontend and the second framebuilder / fileserver). There is already a spare disk in the frontend which (for unknown reasons) we don't use for a simple mirror of the main disk (as we already do in the other computer). So we should create a simple RAID and use that disk for it. Before we do that we should rearrange the order, as that disk is, I think, sdc; sda is the boot disk, sdb holds full frames and sdd trend data. So it would be nice to have sda/sdb as the RAID and then sdc+sdd for the data.

  723   Fri Apr 9 21:19:33 2010 FrankComputingDAQfb0 online again

i changed the order of the drives. Now sda/sdb are the boot drives (mirrored), sdc for full data and sdd for trend data.

Added four more disks to fb0 in order to speed up the copy process. As sda is a spare disk from Alex, this drive is not identical to sdb, so I don't know if we can create a mirror.
I think we should create the mirror with two identical drives in order to avoid problems. So I have already copied everything to sdb, and I will copy the content of sdb to the other four disks.
Two of them we will keep as backups; the other two will be the boot disks for the second RT system for Peter's lab.

 

  724   Tue Apr 13 00:19:00 2010 Mott, FrankComputingDAQfb0 boot disk copies

Alex and I made four copies of the master boot disk of the frontend computer. I used one of those disks for the PSL frontend computer I'm currently setting up.
So we have three spares now, which we keep in our lab and which can be used to set up new frontends or as a backup in case of a fatal failure (like the one that happened some weeks ago).
We will also create a mirror on fb0 in the next few days to reduce the risk of such a failure.

  729   Mon Apr 19 19:09:42 2010 FrankComputingDAQtodays DAQ odyssey - part 1

When we started setting up a second frontend last week, I thought it might be as simple as making another copy of the main disk and changing some config files, and off we go. What Alex and I learned is that it is way more complicated, with a touch of impossible. But let's start from the beginning:

The idea was to set up a second RT frontend computer with its own framebuilder, either in the same network or in a different subnetwork. The reason we should have separate framebuilders is that if we have only one and the first of the frontends (whichever we define as the first) is down, the whole thing is down. So with more than one model running on different machines, the one we define as the "main" or first one has to be alive at all times. If it is not, the others don't work anymore.

Setting it up with independent framebuilders in the same network is impossible: broadcast messages on the network prevent both from working at the same time while still using tools like DTT, which listen to only one broadcaster on the network. The minimum requirement is physically separate networks.

OK, that's fine with us; we had decided to split our networks into separate subnetworks anyway. But then we can't use the existing workstations and the installed tools for more than one network, because of the broadcasting stuff. Using the same workstations requires logging into the corresponding frontend/framebuilder and starting all tools locally, which is not nice but still works.


That said, we decided to set everything up like that, and the next thing we realized is that the stuff we currently have installed uses the CVS stuff mounted from one central source. But the frontend code we had was not designed for that: e.g. important parameters are not set in the MATLAB model, and main configuration files are simply overwritten when compiling one of the frontend codes. So we had to add a couple of things in the MATLAB file, like the gds_node_id. An example of the current cdsparameters:

site=C3
rate=64K
dcuid=10
gds_node_id=2
shmem_daq=1
specific_cpu=3
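
To compare the parameters of several models side by side, a key=value block like this one is easy to read programmatically. The sketch below is in Python and is not part of the RCG tooling; the function names are hypothetical, and reading "64K" as 64 × 1024 samples per second is an assumption.

```python
EXAMPLE = """\
site=C3
rate=64K
dcuid=10
gds_node_id=2
shmem_daq=1
specific_cpu=3
"""


def parse_cdsparameters(text):
    """Parse 'key=value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def rate_hz(rate):
    """Convert a rate such as '64K' to an integer.

    Assumption: 'K' means a factor of 1024, so '64K' becomes 65536.
    """
    rate = rate.strip().upper()
    if rate.endswith("K"):
        return int(rate[:-1]) * 1024
    return int(rate)
```

With the example above, parse_cdsparameters(EXAMPLE)["gds_node_id"] gives the string "2", which is the value that matters for the test point files discussed further down.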

We had to hack plenty of things, not all of which I can remember, but e.g. we had to add the right node ID to the testpoint.par file in /opt/apps/Linux/gds/param/ as "L-node" (yes, we have to use LHO, not Caltech, here), e.g.

[L-nodex]  (here x=gds_node_id of atf model)
hostname=ip of frontend running atf model
system=atf

[L-nodey] (here y=gds_node_id of psl model)
hostname=ip of frontend running psl model
system=psl

The testpoints are created and written to a file named tpchn_Cx.par, where x is again the gds_node_id, so a model with ifo=C1 and node-id=2 creates a file tpchn_C2.par !!! So this C2 does not correspond to the IFO set in the model. E.g. two different models, IFO=C2 and C3, both running on different frontends (!) but both with node-id=1 (the default if you don't specify it in the model), each overwrite the other's file every time one of them is recompiled !!!! So be careful. Also, a link named tpchn_Lx.par has to point to tpchn_Cx.par (the LHO thing again), and this file also has to be added to the list in the fb master file.
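
The overwrite problem can be checked for before compiling. This is a minimal sketch in Python, not part of the frontend build: the helper names and the model dictionaries are hypothetical; only the tpchn_Cx.par naming rule and the default node id of 1 come from the description above.

```python
from collections import defaultdict


def tpchn_filename(gds_node_id=1):
    """Test point file name; it depends on the node id, not on the IFO."""
    return f"tpchn_C{gds_node_id}.par"


def find_collisions(models):
    """Return {file name: [model names]} for files claimed by several models.

    `models` is a list of dicts with a 'name' and optionally a
    'gds_node_id'; a missing node id falls back to the default of 1.
    """
    owners = defaultdict(list)
    for model in models:
        node_id = model.get("gds_node_id", 1)
        owners[tpchn_filename(node_id)].append(model["name"])
    return {fname: names for fname, names in owners.items() if len(names) > 1}
```

For two models with IFO=C2 and C3 and no explicit node id, find_collisions reports that both write tpchn_C1.par; giving them distinct gds_node_id values makes the result empty.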

The gds stuff is configured in diag_Cx.conf; x is arbitrary, but independent for each system. It looks like:

&nds * *  10.0.1.10 8088 *
&chn * *  10.0.1.10 822087685 1
&leap * * 10.0.1.10 822087686 1
&ntp * *  10.0.1.10 * *
&err 0 *  10.0.1.10 5353 *

containing the ip address for the corresponding machine for all the services.
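
Since every service line has to point at the right machine, a quick consistency check on the file can catch a leftover address from a copied config. The sketch below is in Python with hypothetical function names; it assumes, purely from the example above, that the fourth whitespace-separated field of each '&' line is the host address, and it does not interpret the other fields.

```python
EXAMPLE = """\
&nds * *  10.0.1.10 8088 *
&chn * *  10.0.1.10 822087685 1
&leap * * 10.0.1.10 822087686 1
&ntp * *  10.0.1.10 * *
&err 0 *  10.0.1.10 5353 *
"""


def parse_diag_conf(text):
    """Return a list of (service, host) pairs, one per '&' line.

    Assumption: the host is the fourth field, as in the example above.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("&"):
            continue  # skip blank lines and anything that is not a service line
        entries.append((fields[0][1:], fields[3]))
    return entries


def hosts_used(text):
    """Set of distinct host addresses referenced by the file."""
    return {host for _, host in parse_diag_conf(text)}
```

If hosts_used returns more than one address for a system that lives on a single machine, one of the lines probably still points at the other frontend.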

The same goes for AWG, whose settings can be found in awg.par (again, use LHO!):

[L1-awg0]
hostname=10.0.1.10

In the end it didn't work, because the second frontend computer, even though it is almost identical (it's the same model, but with a slightly newer version of the mainboard/BIOS), is not capable of running the RTcore in real time. So this is the end of part 1 of this odyssey: with only one computer we can run stuff on, we had to use the same machine (fb0) for both models. Read part 2 of this odyssey, which will be posted soon, to find out why it took another four hours to get it (basically) working.

 

 
