40m QIL Cryo_Lab CTN SUS_Lab CAML OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 43 of 357  Not logged in ELOG logo
ID Date Author Type Categoryup Subject
  4075   Mon Dec 20 10:06:36 2010 kiwamuUpdateCDSETMY damped

  Last Saturday I succeeded in damping the ETMY suspension eventually.

This means now ALL the suspensions are happily damped. 

It looked like some combination of gains and control filters had made unstabie conditions.

 2010Dec18.png

  I actually was playing with the on/off switches of the control filters and the gain values just for fun.

Then finally I found it worked when the chebyshev filters were off. This is the same situation as Yuta told me about two months before.

Other things like the input and the output matrix looked nothing is wrong, except for the sign flips at ULSEN and SDSEN as I mentioned in the last entry (see here).

So we still should take a look at the analog filters in order to make sure why the signs are flipped.

  4097   Fri Dec 24 09:01:33 2010 josephbUpdateCDSBorrowed ADC

Osamu has borrowed an ADC card from the LSC IO chassis (which currently has a flaky generation 2 Host interface board).  He has used it to get his temporary Dell test stand running daqd successfully as of yesterday.

This is mostly a note to myself so I remember this in the new year, assuming Osamu hasn't replaced the evidence by January 7th.

  4101   Sat Jan 1 19:13:40 2011 ranaUpdateCDSc1pem now recording data

 I found that there was no PEM data nor any other data (no SUS or otherwise. No testpoints, no DAQ).

I went through the procedure that Jenne has detailed in the Wiki but it didn't work.

1) Firstly, the 'telnet fb 8088' step doesn't work. It says "Connected to fb.martian" but then just hangs. To replicate the effect of this step I tried ssh'ing to fb and doing a 'pkill daqd'. That works to restart the daqd process.

2) The wiki instructions had a problem. In the GUI step, it should say 'Save' after the Acquire bit has been set to 1. Even so, this works to get the .ini file right and the DTT can see the correct channel list, but none of the channels are available. There are just 'Unable to obtain measurement data'.

3) I tried running 'startc1pem', but no luck. I also tried rebooting c1sus from the command line. That worked so far as to come back up with all the right processes running, but still no data. The actual /frames directory shows that there are frames, but we just can't see the data. I also tried to get data usind the DTT-NDS2 method, but still no luck. (*** ITMX and ITMY both came back with all their filters off; worth checking if their BURTs are working correctly.)

Using DataViewer, however, I AM able to see the data (although the channel name is RED). In fact, I am able to see the trend data ever since I changed the Acquire bit to 1. Plot attached as evidence. Why does DTT not work anymore???

Attachment 1: Untitled.png
Untitled.png
  4126   Sat Jan 8 21:12:12 2011 ranaUpdateCDSMegatron is back

 I started reverting Megatron into a standard Ubuntu workstation after Joe/Osamu's attempt to steal it for their real time mumbo jumbo.

First, I installed a hard drive that was sitting around on top of it. That whole area is still a mess; I'm not surprised that we have so many CDS problems in such a chaotic state. There's another drive sitting around there called 'RT Linux' which I didn't use yet.

Second, I removed the ethernet cables and installed a monitor/keyboard/mouse on it.

Then I popped in the Ubuntu 10.04 LTS DVD, wiped the existing CentOS install and started the standard graphical installation of Ubuntu.

megatron.jpg

Megatron's specs attached: 

Attachment 2: sysinfo.text
  4129   Mon Jan 10 16:39:36 2011 josephb, alex, rolfUpdateCDSNew Time server for frame builder and 1PPS

Alex and Rolf came over today with a Tempus LX  GPS network timing server.  This has an IRIG-B output and a 1PPS output.  It can also be setup to act as an NTP server (although we did not set that up).

This was placed at waist height in the 1X7 rack.  We took the cable running to the presumably roof mounted antenna from the VME timing board and connected it to this new timing server.  We also moved the source of the 1PPS signal going to the master timer sequencer (big blue box in 1X7 with fibers going to all the front ends) to this new time server.  This system is currently working, although it took about 5 minutes to actually acquire a timing signal from the GPS satellites.  Alex says this system should be more stable, with no time jumps. 

I asked Rolf about the new timing system for the front ends, he had no idea when that hardware would be available to the 40m.

Currently, all the front ends and the frame builder agree on the time.  Front ends are running so the 1 PPS signal appears to be working as well.

  4130   Mon Jan 10 16:47:08 2011 josephb, alex, rolfUpdateCDSFixed c1lsc dolphin reflected memory

While Alex and Rolf were visiting, I pointed out that the Dolphin card was not sending any data, not even a time stamp, from the c1lsc machine.

After some poking around, we realized the IOP (input/output processor) was coming up before the Dolphin driver had even finished loading. 

We uncommented the line

#/etc/dolphin_wait

in the /diskless/root/etc/rc.local file on the frame builder.  This waits until the dolphin module is fully loaded, so it can hand off a correct pointer to the memory location that the Dolphin card reads and writes to.  Previously, the IOP had been receiving a bad pointer since the Dolphin driver had not finished loading.

So now the c1lsc machine can communicate with c1sus via Dolphin and from there the rest of the network via the traditional Ge Fanuc RFM.

  4132   Tue Jan 11 11:19:13 2011 josephbSummaryCDSStoring FE harddrives down Y arm

Lacking a better place, I've chosen the cabinet down the Y arm which had ethernet cables and various VME cards as a location to store some spare CDS computer equipment, such as harddrives.  I've added (or will add in 5 minutes) a label "FE COMPUTER HARD DRIVES" to this cabinet.

  4134   Tue Jan 11 13:32:52 2011 josephb, kiwamuUpdateCDSUpdated some DAQ channel names

[Joe, Kiwamu]

We modified the activateDAQ.py script which lives in /opt/rtcds/caltech/c1/chans/daq/ and updates the C1SUS.ini, C1MCS.ini, C1RMS.ini, C1SCX.ini and C1SCY.ini files.  These files contain the DAQ channels for all the optics.

It has been modified so that channels like C1:SUS-ITMX_ULSEN_OUT_DAQ become C1:SUS-ITMX_SENSOR_UL.  Similarly the oplev signals go from C1:SUS-ITMX_OLPIT_OUT to C1:SUS-ITMX_OPLEV_PERROR.

After some debugging, we ran the script successfully and checked the output was correct.  We then restarted the frame builder (telnet fb 8088 and then shutdown) and also hit the DAQ reload button for all the front ends.

I tested in dataviewer that I could go back several years as well as going back just 1 hour in the history and see data for C1:SUS-ITMX_SENSOR_LL as well as C1:SUS-ITMX_OPLEV_YERROR.  I also tested realtime is also working for these channels.

 

The contents of the script are below.

 

inputfiles=["C1SUS.ini","C1RMS.ini","C1MCS.ini","C1SCX.ini","C1SCY.ini"]
prefix="[C1:SUS-"
optics=["BS_","ITMX_","ITMY_","PRM_","SRM_","MC1_","MC1_","MC2_","MC3_","ETMX_"]
#channels=["SUSPOS_IN1","SUSPIT_IN1","SUSYAW_IN1","SUSSIDE_IN1","ULSEN_OUT","URSEN_OUT","LRSEN_OUT","LLSEN_OUT","SDSEN_OUT","OL_SUM_IN1","OLPIT_IN1","OLYAW_IN1"]
channels_dict = {'SUSPOS_IN1':'SUSPOS_IN1_DAQ',
'SUSPIT_IN1':'SUSPIT_IN1_DAQ',
'SUSYAW_IN1':'SUSYAW_IN1_DAQ',
'SUSSIDE_IN1':'SUSSIDE_IN1_DAQ',
'ULSEN_OUT':'SENSOR_UL',
'URSEN_OUT':'SENSOR_UR',
'LRSEN_OUT':'SENSOR_LR',
'LLSEN_OUT':'SENSOR_LL',
'SDSEN_OUT':'SENSOR_SIDE',
'OLPIT_OUT':'OPLEV_PERROR',
'OLYAW_OUT':'OPLEV_YERROR',
'OL_SUM_OUT':'OPLEV_SUM'}

suffix="_DAQ]\n"

## set datarate
datarate=2048

## read the ini files
for inputfile in inputfiles:
    print inputfile
    outputfile=inputfile
    ifile = open(inputfile,'r')
    lines = ifile.readlines()
    ifile.close()

    for k in range(len(lines)):
        for op in optics:
            for ch in channels_dict:
                if (prefix+op+ch+suffix) in lines[k]:
                    lines[k]=prefix + op + channels_dict[ch] + "]\n"
                    lines[k+1]=lines[k+1].lstrip("#").rstrip(lines[k+1].split("=")[1])+"1\n"
                    lines[k+2]=lines[k+2].lstrip("#")
                    lines[k+3]=lines[k+3].lstrip("#").rstrip(lines[k+3].split("=")[1])+str(datarate)+"\n"
                    lines[k+4]=lines[k+4].lstrip("#")
    ofile = open(outputfile,'w')
    for k in range(len(lines)):
        ofile.write(lines[k])
        #print lines[k]
    ofile.close()

  4136   Tue Jan 11 16:04:17 2011 josephbUpdateCDSScript to update web views of models for all installed front ends

I wrote a new script that is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/ called  webview_simlink_update.m. 

This m-file when run in matlab will go to the /opt/rtcds/caltech/c1/target directory and for each c1 front end, generate the corresponding webview files for that system and place them in the AutoUpdate directory. 

Afterwards the files can be moved on Nodus to the /users/public_html/FE/ directory with:

mv /opt/rtcds/caltech/c1/scripts/AutoUpdate/*slwebview* /users/public_html/FE/

This was run today, and the files can be viewed at:

https://nodus.ligo.caltech.edu:30889/FE/

Long term, I'd like to figure out a way of automating this to produce automatically updated screens without having to run it manually.  However, simulink seems to stubbornly require an X window to work.

  4144   Wed Jan 12 17:50:21 2011 josephbUpdateCDSWorked on c1lsc, MC2 screens

[josephb, osamu, kiwamu]

We worked over by the 1Y2 rack today, trying to debug why we didn't get any signal to the c1lsc ADC.

We turned off the power to the rack several times while examining cards, including the whitening filter board, AA board, and the REFL 33 demod board.  I will note, I incorrectly turned off power in the 1Y1 rack briefly. 

We noticed a small wire on the whitening filter board on the channel 5 path.  Rana suggested this was to part of a fix for the channels 4 and 5 having too much cross talk.  A trace was cut and this jumper added to fix that particular problem.

We confirmed would could pass signals through each individual channel on the AA and whitening filter boards.  When we put them back in, we did noticed a large offset when the inputs were not terminated.  After terminating all inputs, values at the ADC were reasonable, measuring on from 0 to about -20 counts.  We applied a 1 Hz, 0.1 Vpp signal and confirmed we saw the digital controls respond back with the correct sine wave.

We examined the REFL 33 demod board and confirmed it would work for demodulating 11 MHZ, although without tuning, the I and Q phases will not be exactly 90 degrees apart.

The REFL 33  I and Q outputs have been connected to the whitening board's 1 and 2 inputs, respectively.  Once Kiwamu  adds approriate LO and PD signals to the REFL 33 demod board he should be able to see the resulting I and Q signals digitally on the PD1 I and Q channels.

 

In an unrelated fix, we examined the suspensions screens, specifically the Dewhitening lights.  Turns out the lights were still looking at SW2 bit 7 instead of SW2 bit 5.  The actual front end models were using the correct bit (21 which corresponds to the 9th filter bank), so this was purely a display issue.  Tomorrow I'll take a look at the binary outputs and see why the analog filters aren't actually changing.

 

 

 

  4146   Wed Jan 12 22:33:24 2011 kiwamuUpdateCDSMC2 dewhitening are healthy except for UR

I briefly checked the MC2 analog dewhitening filters.

It turned out that the switching of the dewhitening filters from epics worked correctly except for the UR path.

I couldn't get a healthy transfer function for the UR path probably because the UR monitor at the front panel on either the AI filter or the dewhitening filter maybe broken.

Need a check again.

Quote:  #4144

Tomorrow I'll take a look at the binary outputs and see why the analog filters aren't actually changing.

  4150   Thu Jan 13 14:21:13 2011 josephbUpdateCDSWebview of front end model files automated

After Rana pointed me to Yoichi's MEDM snapshot script, I learned how to use Xvfb, which is what Yoichi used to write screens without a real screen.  With this I wrote a new cron script, which I added to Mafalda's cron tab to be run once a day at 6am.

The script is called webview_update.cron and is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/.

#!/bin/bash
DISPLAY=:6
export DISPLAY
#Check if Xvfb server is already running
pid=`ps -eaf|grep vfb | grep $DISPLAY | awk '{print $2}'`
if [ $pid ]; then
        echo "Xvfb already running [pid=${pid}]" >/dev/null
else
# Start Xvfb
echo "Starting Xvfb on $DISPLAY"
Xvfb $DISPLAY -screen 0 1600x1200x24 >&/dev/null &
fi
pid=$!
echo $pid > /opt/rtcds/caltech/c1/scripts/AutoUpdate/Xvfb.pid
sleep 3

#Running the matlab process
/cvs/cds/caltech/apps/linux/matlab/bin/matlab -display :6 -logfile /opt/rtcds/caltech/c1/scripts/AutoUpdate/webview.log -r webview_simlink_update

  4152   Thu Jan 13 16:41:07 2011 josephbUpdateCDSChannel names for LSC updated

I renamed most of the filter banks in the c1lsc model.  The input filters are now labeled based on the RF photodiode's name, plus I or Q.  The last set of filters in the OM subsystem (output matrix) have had the TO removed, and are now sensibly named ETMX, ETMY, etc.

We also removed the redundant filter banks between the LSCMTRX and the LSC_OM_MTRX.  There is now only one set, the DARM, CARM, etc ones.

The webview of the LSC model can be found here.

  4160   Fri Jan 14 20:39:20 2011 ranaUpdateCDSUpdated some DAQ channel names

I like this activateDAQ script, but someone (Jenne with Joe's help) still needs to add the PEM channels - we still cannot see any seismic trends.

  4171   Thu Jan 20 00:39:22 2011 kiwamuHowToCDSDAQ setup : another trick

Here is another trick for the DAQ setup when you add a DAQ channel associated with a new front end code.

 

 Once you finish setting up the things properly according to this wiki page (this page ), you have to go to 

      /cvs/cds/rtcds/caltech/c1/target/fb

and then edit the file called master

This file contains necessary path where fb should look at, for the daqd initialization.

Add your path associated with your new front end code on this file, for example:

        /opt/rtcds/caltech/c1/chans/daq/C1LSC.ini

       /opt/rtcds/caltech/c1/target/gds/param/tpchn_c1lsc.par

After editing the file, restart the daqd on fb by the usual commands:

             telnet fb 8088

             shutdown

  4173   Thu Jan 20 04:03:02 2011 kiwamuUpdateCDSc1scy error

 I found that c1scy was not running due to a daq initialization error.

 I couldn't figure out how to fix it, so I am leaving it to Joe.


 Here is the error messages in the dmesg on c1iscey
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 
 
Before I found this fact, I rebooted c1iscey in order to recover the synchronization with fb.
The synchronization had been lost probably because I shutdowned the daqd on fb.
  4175   Thu Jan 20 10:15:50 2011 josephbUpdateCDSc1scy error

This is caused by an insufficient number of active DAQ channels in the C1SCY.ini file located in /opt/rtcds/caltech/c1/chans/daq/.  A quick look (grep -v # C1SCY.ini) indicates there are no active channels.  Experience tells me you need at least 2 active channels.

Taking a look at the activateDAQ.py script in the daq directory, it looks like the C1SCY.ini file is included, by the loop over optics is missing ETMY.  This caused the file to improperly updated when the activateDAQ.py script was run.  I have fixed the C1SCY.ini file (ran a modified version of the activate script on just C1SCY.ini).

I have restarted the c1scy front end using the startc1scy script and is currently working.

Quote:
 Here is the error messages in the dmesg on c1iscey
[   39.429002] c1scy: Invalid num daq chans = 0
[   39.429002] c1scy: DAQ init failed -- exiting
 

 

  4179   Thu Jan 20 18:20:55 2011 josephbUpdateCDSc1iscex computer and c1sus computer swapped

Since the 1U sized computers don't have enough slots to hold the host interface board, RFM card, and a dolphin card, we had to move the 2U computer from the end to middle to replace c1sus.

We're hoping this will reduce the time associated with reads off the RFM card compared to when its in the IO chassis.  Previous experience on c1ioo shows this change provides about a factor of 2 improvement, with 8 microseconds per read dropping to 4 microseconds per read, per this elog.

So the dolphin card was moved into the 2U chassis, as well as the RFM card.  I had to swap the PMC to PCI adapter on the RFM card since the one originally on it required an external power connection, which the computer doesn't provide.  So I swapped with one of the DAC cards in the c1sus IO chassis.

But then I forgot to hit submit on this elog entry..............

  4183   Fri Jan 21 15:26:15 2011 josephbUpdateCDSc1sus broken yesterday and now fixed

[Joe, Koji]
Yesterday's CDS swap of c1sus and c1iscex left the interfometer in a bad state due to several issues.

The first being a need to actually power down the IO chassis completely (I eventually waited for a green LED to stop glowing and then plugged the power back in) when switching computers.  I also plugged and plugged the interface cable from the IO chassis and computer while powered down.  This let the computer actually see the IO chassis (previously the host interface card was glowing just red, no green lights).

Second, the former c1iscex computer and now new c1sus computer only has 6 CPUs, not 8 like most of the other front ends.  Because it was running 6 models (c1sus, c1mcs, c1rms, c1rfm, c1pem, c1x02) and 1 CPU needed to be reserved for the operating system, 2 models were not actually running (recycling mirrors and PEM).  This meant the recycling mirrors were left swinging uncontrolled.

To fix this I merged the c1rms model with the c1sus model.  The c1sus model now controls BS, ITMX, ITMY, PRM, SRM.  I merged the filter files in the /chans/ directory, and reactivated all the DAQ channels.  The master file for the fb in the /target/fb directory had all references to c1rms removed, and then the fb was restarted via "telnet fb 8088" and then "shutdown".

My final mistake was starting the work late in the day.

So the lesson for Joe is, don't start changes in the afternoon.

Koji has been helping me test the damping and confirm things are really running.  We were having some issues with some of the matrix values.  Unfortunately I had to add them by hand since the previous snapshots no longer work with the models.

  4184   Fri Jan 21 17:59:27 2011 josephb, alexUpdateCDSFixed Dolphin transmission

The orientation of the Dolphin cards seems to be opposite on c1lsc and c1sus.  The wide part is on top on c1lsc and on the bottom on c1sus.  This means, the cable is plugged into the left Dolphin port on c1lsc and into the right Dolphin port on c1sus.  Otherwise you get a wierd state where you receive but not transmit.

  4200   Tue Jan 25 15:20:38 2011 josephbUpdateCDSUpdated c1rfm model plus new naming convention for RFM/Dolphin

After sitting down for 5 minutes and thinking about it, I realized the names I had been using for internal RFM communication were pretty bad.  It was because looking at a model didn't let you know where the RFM connection was coming from or going to.  So to correct my previous mistakes, I'm instituting the following naming convention for reflected memory, PCIE reflected memory (dolphin) and shared memory names.  These don't actually get used anywhere but the models, and thus don't show up as channel names anywhere else.  They are replaced by raw hex memory locations in the actual code through the use of the IPC file (/opt/rtcds/caltech/c1/chans/ipc/C1.ipc).  However it will make understanding the models easier for anyone looking at them or modifying them.

 

The new naming convention for RFM and Dolphin channels is as follows.

SITE:Sending Model-Receiving Model_DESCRIPTION_HERE

The description should be unique to that data being transferred and reused if its the same data.  Thus if its transfered to another model, its easy to identify it as the same information.

The model should be the .mdl file name, not the subsystem its a part of.  So SCX is used instead of SUS.  This is to make it easier to track where data is going.

In the unlikely case of multiple models receiving, it should be of the form SITE:Sending Model-Receiving Model 1-Receiving Model 2_DESCRIPTION_HERE.  Seperate models by dashes and description by underscores.

Example:

C1:LSC-RFM_ETMX_LSC

This channel goes from the LSC model (on c1lsc) to the RFM model (on c1sus).  It transfers ETMX LSC position feedback.  The second LSC may seem redundant until we look at the next channel in the chain.

C1:RFM-SCX_ETMX_LSC

This channel goes from the RFM model to the SCX model (on c1iscex). It contains the same information as the first channel, i.e. ETMX LSC position feedback.

 

I have updated all the models that had RFM and SHMEM connections, as well as adding all the LSC communciation connections to c1rfm.  This includes c1sus, c1rfm, c1mcs, c1ioo, c1gcv, c1lsc, c1scx, c1scy.  I have not yet built all the models since I didn't finish the updates until this afternoon.  I will build and test the code tomorrow morning.

 

 

 

  4203   Tue Jan 25 22:49:13 2011 KojiUpdateCDSFront End multiple crash

STATUS:

  • Rebooted c1lsc and c1sus. Restarted fb many times.
  • c1sus seems working.
  • All of the suspensions are damped / Xarm is locked by the green
  • Thermal control for the green is working
  • c1lsc is frozen
  • FB status: c1lsc 0x4000, c1scx/c1scy 0x2bad
  • dataviewer not working

1. DataViewer did not work for the LSC channels (liek TRX)

2. Rebooted LSC. There was no instruction for the reboot on Wiki. But somehow the rebooting automatically launched the processes.

3. However, rebooting LSC stopped C1SUS processes working

4. Rebooted C1SUS. Despite the rebooting description on wiki, none of the FE processes coming up.

5. Probably, I was not enough patient to wait for the completion of dorphine_wait? Rebooted C1SUS again.

6. Yes. That was true. This time I wait for everything going up automatically. Now all of c1pemfe,c1rfmfe,c1mcsfe,c1susfe,c1x02fe are running.
FB status for c1sus processes all green.

7. burtrestored c1pemfe,c1rfmfe,c1mcsfe,c1susfe,c1x02fe with the snapshot on Jan 25 12:07, 2010.

8. All of the OSEM filters are off, and the servo switches are incorrectly on. Pushing many buttons to restore the suspensions.

9. I asked Suresh to restore half of the suspensions.

10. The suspensions were restored and damped. However, c1lsc is still freezed.

11. Rebooting c1lsc freezed the frontends on c1sus. We redid the process No. 5 to No.10

12. c1x04 seems working. c1lsc, however, is still frozen. We decided to leave C1LSC in this state.

 

  4206   Wed Jan 26 10:58:48 2011 josephbUpdateCDSFront End multiple crash

Looking at dmesg on c1lsc, it looks like the model is starting, but then eventually times out due to a long ADC wait. 

[  114.778001] c1lsc: cycle 45 time 23368; adcWait 14; write1 0; write2 0; longest write2 0
[  114.779001] c1lsc: ADC TIMEOUT 0 1717 53 181

I'm not sure what caused the time out, although there about 20 messages indicating a failed time stamp read from c1sus (its sending TRX information to c1lsc via the dolphin connection) before the time out.

Not seeing any other obvious error messages, I killed the dead c1lsc model by typing:

sudo rmmod c1lscfe

I then tried starting just the front end model again by going to the /opt/rtcds/caltech/c1/target/c1lsc/bin/ directory and typing:

sudo insmod c1lscfe.ko

This started up just the FE again (I didn't use the restart script because the EPICs processes were running fine since we had non-white channels).  At the moment, c1lsc is now running and I see green lights and 0x0 for FB0 status  on the C1LSC_GDS_TP screen.

At this point I'm not sure what caused the timeout.  I'll be adding some more trouble shooting steps to the wiki though.  Also, c1scx, c1scy are probably in need of restart to get them properly sync'd to the framebuilder.

I did a quick test on dataviewer and can see LSC channels such as C1:LSC-TRX_IN1, as well other channels on C1SUS such as BS sensors channels.

Quote:

STATUS:

  • Rebooted c1lsc and c1sus. Restarted fb many times.
  • c1sus seems working.
  • All of the suspensions are damped / Xarm is locked by the green
  • Thermal control for the green is working
  • c1lsc is frozen
  • FB status: c1lsc 0x4000, c1scx/c1scy 0x2bad
  • dataviewer not working 

 

  4207   Wed Jan 26 12:03:45 2011 KojiUpdateCDSFront End multiple crash

This is definitely a nice magic to know as the rebooting causes too much hustles.

Also, you and I should spend an hour in the afternoon to add the suspension swtches to the burt requests.

Quote:

I killed the dead c1lsc model by typing:

sudo rmmod c1lscfe

I then tried starting just the front end model again by going to the /opt/rtcds/caltech/c1/target/c1lsc/bin/ directory and typing:

sudo insmod c1lscfe.ko

This started up just the FE again

 

  4208   Wed Jan 26 12:04:31 2011 josephbUpdateCDSExplanation of why c1sus and c1lsc models crash when the other one goes down

So apparently with the current Dolphin drivers, when one of the nodes goes down (say c1lsc), it causes all the other nodes to freeze for up to 20 seconds.

This 20 seconds can force a model to go over the 60 microseconds limit and is sufficiently long enough to force the FE to time out.  Alex and Rolf have been working with the vendors to get this problem fixed, as having all your front ends go down because you rebooted a single computer is bad.

[40184.120912] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[40184.120914] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782
[44472.627835] c1mcs: ADC TIMEOUT 0 7718 38 7782
[44472.627849] c1sus: ADC TIMEOUT 0 7718 38 7782
[44472.644677] c1rfm: cycle 1945 time 17872; adcWait 15; write1 0; write2 0; longest write2 0
[44472.644682] c1x02: cycle 7782 time 17849; adcWait 12; write1 0; write2 0; longest write2 0
[44472.646898] c1rfm: ADC TIMEOUT 0 8133 5 7941

The solution for the moment is to start the computers at exactly the same time, so the dolphin is up before the front ends, or start the models by hand after the computer is up and dolphin running, but after they have timed out.  This is done by:

sudo rmmod c1SYSfe

sudo insmod /opt/rtcds/caltech/c1/target/c1SYS/bin/c1SYSfe.ko

 

Alex and Rolf have been working with the vendors to get this fixed, and we may simply need to update our Dolphin drivers.  I'm trying to get in contact with them and see if this is the case.

  4212   Thu Jan 27 15:16:43 2011 josephbUpdateCDSUpdated generate_master_screens.py

I modified the generate_master_screens.py script in /opt/rtcds/caltech/c1/medm/master/ to handle changing the MCL (and MC_L) listings to ALS for the two ETM suspension screens and associated sub-screens.

The relevant added code is:

custom_optic_channels = ['ETMX',
{'MCL':'ALS','MC_L':'ALS'},
'ETMY',
{'MCL':'ALS','MC_L':'ALS'}]

 

for index in range(len(custom_optic_channels)/2):
   if optic == custom_optic_channels[index*2]:
     for swap in custom_optic_channels[index*2+1]:
       sed_command = start_sed_string + swap + "/" + custom_optic_channels[index*2+1][swap] + middle_sed_string + optic + file
       os.system(sed_command)

When run, it generates the correctly named C1:SUS-ETMX_ALS channels, and replaces MCL and MC_L with ALS in the matrix screens.

 

  4220   Fri Jan 28 12:15:58 2011 josephbUpdateCDSUpdating conlog channel list/ working on "HealthCheck" script

I've updated the scan_adls script (currently located in /cvs/cds/caltech/conlog/bin) to look at new location of our medm screens.  I made a backup of the old conlog channel list as /cvs/cds/caltech/conlog/data/conlog_channels.old-2011-01-28.

I then ran the update_chanlist script in the same directory, which calls the scan_adl script.  After about 5 minutes it finished updating the channel list.  I restarted the conlogger just to be sure, and checked that our new model channels showed up in the conlog (which they do).

I have added a cron job to the op340m cron tab to once a day run the update_conlog script at 7am.

Next, I'm working on a HealthCheck script which looks at the conlog channel list and checks to see if channels are actually changing over short time scales, and then spit back a report on possibly non-functioning channels to the user.

  4241   Wed Feb 2 15:07:20 2011 josephbUpdateCDSactivateDAQ.py now includes PEM channels

[Joe, Jenne]

We modified the activateDAQ.py script to handle the C1PEM.ini file (defining the PEM channels being recorded by the frame builder) in addition to all the optics channels.  Jenne will be modifying it further so as to rename more channels.

  4246   Thu Feb 3 16:45:28 2011 josephbUpdateCDSGeneral CDS updates

Updated the FILTER.adl file to have the yellow button moved up, and replaced the symbol in the upper right with a white A with black background.  I made a backup of the filter file called FILTER_BAK.adl.  These are located in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util.

I also modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ to make the startc1SYS scripts it makes take in an argument.  If you type in:

sudo startc1SYS 1

it automatically writes 1 to the BURT RESTORE channel so you don't have to open the GDS_TP screen and by hand put a 1 in the box before the model times out.

The scripts also points to the correct burtwb and burtrb files so it should stop complaining about not finding them when running the scripts, and actually puts a time stamped burt snapshot in the /tmp directory when the kill or start scripts are run.  The Makefile was also backed up to Makefile_bak.

 

  4249   Fri Feb 4 13:31:16 2011 josephbUpdateCDSFE start scripts moved to scripts/FE/ from scripts/

All start and kill scripts for the front end models have been moved into the FE directory under scripts:  /opt/rtcds/caltech/c1/scripts/FE/.  I modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ to update and place new scripts in that directory. 

This was done by using

sed -i 's[scripts/start$${system}[scripts/FE/start$${system}[g' Makefile

sed -i 's[scripts/kill$${system}[scripts/FE/kill$${system}[g' Makefile

  4262   Tue Feb 8 16:04:58 2011 josephbUpdateCDSHard coded decimation filters need to be fixed

[Joe, Rana]

Filter definitions for the decimation filters to epics readback channels (like _OUT16) can be found in the fm10Gen.c code (in /opt/rtcds/caltech/c1/core/advLigoRTS/src/include/drv).

At the moment, the code is broken for systems running at 32k, 64k as they look to be defaulting to the 16k filter.  I'd like to also figure out the notation and plot the actual filter used for the 16k.

Rana has suggested a 2nd order, 2db ripple low pass Cheby1 filter at 1 Hz.

 

  51 #if defined(SERVO16K) || defined(SERVOMIXED) || defined(SERVO32K) || defined(SERVO64K) || defined(SERVO128K) || defined(SERVO256K)
  52 static double sixteenKAvgCoeff[9] = {1.9084759e-12,
  53                                      -1.99708675982420, 0.99709029700517, 2.00000005830747, 1.00000000739582,
  54                                      -1.99878510620232, 0.99879373895648, 1.99999994169253, 0.99999999260419};
  55 #endif
  56
  57 #if defined(SERVO2K) || defined(SERVOMIXED) || defined(SERVO4K)
  58 static double twoKAvgCoeff[9] = {7.705446e-9,
  59                                  -1.97673337437048, 0.97695747524900,  2.00000006227141,  1.00000000659235,
  60                                  -1.98984125831661,  0.99039139954634,  1.99999993772859,  0.99999999340765};
  61 #endif
  62
  63 #ifdef SERVO16K
  64 #define avgCoeff sixteenKAvgCoeff
  65 #elif defined(SERVO32K) || defined(SERVO64K) || defined(SERVO128K) || defined(SERVO256K)
  66 #define avgCoeff sixteenKAvgCoeff
  67 #elif defined(SERVO2K)
  68 #define avgCoeff twoKAvgCoeff
  69 #elif defined(SERVO4K)
  70 #define avgCoeff twoKAvgCoeff
  71 #elif defined(SERVOMIXED)
  72 #define filterModule(a,b,c,d) filterModuleRate(a,b,c,d,16384)
  73 #elif defined(SERVO5HZ)
  74 #else
  75 #error need to define 2k or 16k or mixed
  76 #endif

  4265   Wed Feb 9 15:26:22 2011 josephbUpdateCDSUpdated c1scx with lockin, c1gcv for green transmission pd

Updated the c1scx model to have two Lockin demodulators (C1:SUS-ETMX_LOCKIN1 and C1:SUS-ETMX_LOCKIN2).  There is a matrix C1:SUS-ETMX_INMUX which directs signals to the inputs of LOCKIN1 and LOCKIN2.  Currently only the GREEN_TRX signal is the only signal going in to this matrix, the other 3 are grounds.  The actual clocks themselves had to be at the top level (they don't work inside blocks) and thus named C1:SCX-ETMX_LOCKIN1_OSC and C1:SCX-ETMX_LOCKIN2_OSC.

 

There is a signal (IPC name is C1:GCV-SCX_GREEN_TRX) going from the c1gcv model to the c1scx model, which will contain the output from Jenne's green transmission PD which will eventually be placed. I've placed a filter bank on it in the c1gcv model as a monitor point, and it corresponds to C1:GCV-GREEN_TRX.

 

The suspension control screens were modified to have a screen for the Matrix feeding signals into the two lockin demodulators.  The green medm screen was also modified to have readbacks for the GREEN_TRX and GREEN_TRY channels.

 

So on the board, the top channel (labeled 1, corresponds to code ADC_0_0) is MCL.

Channel 2 (ADC_0_1) is assigned to frequency divided green signal.

Channel 3 (ADC_0_2) is assigned to the beat PD's DC output.

Channel 4 (ADC_0_3) is assigned to the green power transmission for the x-arm.

Channel 5 (ADC_0_4) is assigned to the green power transmission for the y-arm.

  4270   Thu Feb 10 14:07:18 2011 josephbUpdateCDSUpdating dolphin drivers to eliminate timeouts when one dolphin card is shutdown

[Joe,Alex]

Alex came over and we installed the new Dolphin drivers so that the front ends using the Dolphin PCIe RFM network don't pause for a long time when one of the other nodes in the network go down.  Generally this pause would cause the code to time out and quit.  Now you can take c1lsc or c1sus down without having the other have problems.

We did note on reboot however, that the Dolphin_wait script sometimes (not always) seems to hang.  Since this is run at boot up, to ensure the dolphin card has had enough to allocate memory space for data to be written/read from by the IOP process, it means nothing else in the startup script gets run if it does happen.  In this case, running "pkill dolphin_wait" may be necessary.

Note that you may still have problems if you hit the power button to force a shutdown (i.e. holding it for 4 seconds for immediate power off), but as long as you do a "reboot" or "shutdown -r now" type command, it should come down gracefully. 

 

What was done:

Alex grabbed the code from his server, and put it /home/controls/DIS/ on fb.

He ran the following commands in that directory to build the code.

./configure  '--with-adapter=DX' '--prefix=/opt/DIS'

make

sudo make install

He proceeded to modify the /diskless/root/etc/rc.local to have the line:

insmod /lib/modules/2.6.34.1/kernel/drivers/dis/dis_kosf.ko

In that same file he commented out

cd /root

and

exec /bin/bash/

He then modified the run levels in /diskless/root/etc/inittab. Level 0, level 3, and level 6 were changed:

l0:0:wait/etc/rc.halt

l3:3:wait:etc/rc.level3

l6:6:wait:/etc/rc.reboot

Then he created the scripts he was refering to:

rc.level3 is just:

exec /bin/bash

rc.halt is:

/opt/DIS/sbin/dxtool prepare-shutdown 0
sleep 3
halt -p

rc.reboot is:

reboot

 

Basically rc.halt calls a special code which prepares the Dolphin RFM card to shutdown nicely.  This is why just hitting the power button for 4 seconds will cause problems for the rest of the dolphin network.

We then checked out of svn the latest dolphin.c in  /opt/rtcds/caltech/c1/core/advLigoRTS/src/fe

 

The Dolphin RFM cards have a new numbering scheme.  4 is reserved for special  broadcasts to everyone, so the Dolphin node IDs now start at 8.  So we needed to change the c1lsc and c1sus Dolphin node IDs.

To change them we went to /etc/dis/dishosts.conf on the fb machine, and changed the following lines:

 

HOSTNAME: c1sus
ADAPTER:  c1sus_a0 4 0 4
HOSTNAME: c1lsc
ADAPTER:  c1lsc_a0 8 0 4

to

HOSTNAME: c1sus
ADAPTER:  c1sus_a0 8 0 4
HOSTNAME: c1lsc
ADAPTER:  c1lsc_a0 12 0 4

The FE models for the c1lsc and c1sus machines were recompiled and then the computers were rebooted.  After having them come back up, we tested that there was no time out by shutting down c1lsc and watching c1sus. We then reveresed and shutdown c1sus while watching c1lsc.  No problems occured.  Currently they are up and communicating fine.

 

  4282   Mon Feb 14 01:19:18 2011 ranaUpdateCDSToday's CDS problems

This is just a listing of  CDS problems I still notice today:

  1. MC2-MCL button was left ON due to BURT failure. This, of course, screws up our Green locking investigations because of the unintended feedback. Please fix the BURT/button issue.

  2. The GCV - FB0 status is RED. I guess this means there's something wrong? Its really a bad idea to have a bunch of whited out or falsely red indicators. No one will ever use these or trust these in the future.
  3. MC1/2/3 Lockins are all white. Also, the MODE switches for the dewhitening are all white.
  4. Is the MC SIDE coil dewhitening filter synced with anything? It doesn't seem to switch anything. Maybe the dewhite indicators at the top right of the SUS screens can be made to show the state of the binary output instead of just the digital filter
  5. MC WFS is all still broken. We need a volunteer to take this on - align beams, replace diodes, fix code/screens.
  4283   Mon Feb 14 01:40:14 2011 KojiOmnistructureCDSName of the green related channels

I propose to use C1:ALS-xxx_xxx for the names of the green related channels, instead of GCV, GCX, GCY, GFD...

Like C1:SUS or C1:LSC, we name the channels by the subsystems first, then probably we can specify the place.

We can keep the names of the processes as they are now.

  4285   Mon Feb 14 07:58:43 2011 SureshUpdateCDSToday's CDS problems

 

  I am concentrating on the RF system just now and will be attending to the RF PDs one by one.  Also plan to work on some of the simpler CDS problems when I overlap with Joe.  Will be available for helping out with the beam alignment.

 

 

  4291   Mon Feb 14 18:27:39 2011 josephbUpdateCDSBegan updating to latest CDS svn, reverted to previous state

[Joe, Alex]

This morning I began the process of bringing our copy of the CDS code up to date to the version installed at Livingston. The motivation was to get fixes to various parts, among others such as the oscillator part.   This would mean cleaning up front end model .mdl files without having to pass clk, sin, cos channels for every optic through 3 layers of simulink boxes.

I also began the process of using a similar startup method, which involved creating /etc/init.d/ start and stop scripts for the various processes which get run on the front ends, including awgtpman and mx_streams.  This allows the monitor software called monit to remotely restart those processes or provide a web page with a real time status of those processes.  A cleaner rc.local file utilizing sub-scripts was also adapted.

I did some testing of the new codes on c1iscey.  This testing showed a problem with the timing part of the code, with cycles going very long.  We think it has something to do with the code not accounting for the fact that we do not have IRIG-B timing cards in the IO chassis providing GPS time, which the sites do have.  We rely on the computer clock and ntpd.

At the moment, we've reverted to svn revision 2174 of the CDS code, and I've put the previously working version of the c1scy and c1x05 (running on the c1iscey computer) back. Its from the /opt/rtcds/caltech/c1/target/c1x05/c1x05_11014_163146 directory.  I've put the old rc.local file back in /diskless/root/etc/ directory on the fb machine.  Currently running code on the other front end computers was not touched.

  4292   Mon Feb 14 21:59:35 2011 ranaUpdateCDSUpdated some DAQ channel names

Although Joe and Kiwamu claim that they have inserted the correct DAQ names for the OPLEVs (e.g. PERROR and YERROR) back in Jan. 11, when I look today, I see that these channels are missing!

I want my PERROR/YERRORs back!

 

  4300   Tue Feb 15 11:56:17 2011 josephbUpdateCDSUpdated some DAQ channel names
That is my fault for not running the activateDAQ.py script after a round of rebuilds. I have run the script this morning, and confirmed that the oplev channels are showing up in dataviewer.

Quote:

Although Joe and Kiwamu claim that they have inserted the correct DAQ names for the OPLEVs (e.g. PERROR and YERROR) back in Jan. 11, when I look today, I see that these channels are missing!

I want my PERROR/YERRORs back!

 

 

  4302   Tue Feb 15 15:06:25 2011 josephbUpdateCDSCDS todo list for tomorrow morning

Currently, there is a test directory called /opt/rtcds/caltech/c1/new_core where we have the latest svn checkout.  Tomorrow (after everything works), it will become the core directory.

1) Modify on the fb machine the /diskless/root/etc/ld.so.cache file.  This is done by logging into fb, going to /etc/ld.so.conf.d/, modifying epics-x86_64.conf to only have .10 stuff , and running sudo /sbin/ldconfig.  Copy the newly generated /etc/ld.so.cache file to /diskless/root/etc/.

2) Modify the rc.local file on the fb machine in /diskless/root/etc/ to take advantage of the new subscripts and init.d/ start scripts.

3) Add the no_rfm_dma to all the iop models (c1x01,c1x02,c1x03,c1x04,c1x05).

4) Rebuild all front end models with new code.  Install.

5) Build awgtpman and mx_streams with new code.

6) Rerun activateDaq.py (to fix channel names from all the rebuilt code).

7) Double check Burt request files have the switch fix.

8) Restart the front ends.

9)Restart the frame builder.

9) Check channels, exitations, RFM connections.

10) Check Monit is working.

  4308   Wed Feb 16 12:16:14 2011 josephbUpdateCDSFixed Optical level SUM channel names

[Joe,Valera]

Valera pointed out the OPLEV SUM channels were incorrect.  We changing the optical level sum channel to _OPLEV_SUM when it should have been OL_SUM.  This has been fixed in the activateDAQ.py script.

  4309   Wed Feb 16 23:54:34 2011 ranaUpdateCDSToday's CDS problems
  1. ETMX cannot load his filter coefficients. Even if I change the file, the load button doesn't work. I tried changing the lockin filters but they won't load.
  2. The ETMX filter modules appear to have 2048 for some of the modules and 16384 for some of the others. How come the make script doesn't make them all 16384?
  3. There are a bunch of kill/start scripts in the scripts/ directory instead of in the scripts/FE/ directory. Did this get reverted after a new code exchange?
  4. I tried restarting the code using c1startscx, but that doesn't work correctly. It cannot find the burtrb and burtwb files even though they are in the normal path.
  5. Kiwamu was using a bunch of cockamamie filters I found.
  6. I can't get any minute trend data. I tried on rossa, rosalba, and op440m.
MC damp dataviewer diaggui AWG c1lsc c1ioo c1sus c1iscex c1iscey RFM The Dolphins Sim.Plant Frame builder TDS Cabling
                             
  4311   Thu Feb 17 11:20:04 2011 josephbUpdateCDSstart scripts no longer need sudo

I've modified the rc.local file to run the IOC codes as controls, which means they no longer write root permission log files on startup.

The awgtpman, which was the other permission issue with the start scripts, is started by a run script now.  This new version seems to be content to keep the permissions of the current log file, which is set to controls.

This should prevent the issue of sudo wiping your path environment variable for just that command. (Try "sudo which burtwb" versus "which burtwb" for example).  This apparently a security feature of sudo.

If you should happen to use sudo to run a start script, the easiest solution to fix the permissions is just got to the target directory (type "target") and run "sudo chown controls:controls -R *" on one of the workstations (the front ends don't handle the groups properly at the moment).

This should allow the scripts to properly use burtrb and burtwb to write and backup burt files.

  4312   Thu Feb 17 11:49:48 2011 josephbUpdateCDSFront end start/stop scripts go to /scripts/FE again

I modified the core/advLigoRTS/Makefile to once again place the startc1SYS and killc1SYS scripts in the scripts/FE/ directory.

It had been reverted in the SVN update.

  4313   Thu Feb 17 11:51:14 2011 josephbUpdateCDSLockin filter names too long - broke loading

Problem:

Could not load filters into the C1:SUS-ETMX_LOCKIN1_SIG filter bank.

Reason:

Apparently the filter bank name was too long.  I'm not sure why this isn't caught by the real time code generator, I'm planning on asking Alex and Rolf about it today.

Solution:

Reduce the name of the components.  Basically LOCKIN1 needs to become something like LOCK1 or LIN1.

 

In related news, it looks like the initial filters are hard coded to be 2048 Hz.  Given that they start out empty they won't cause things to break immediately, and if you're editing the file you can update the rate as you add the filter.  I'll also bring this up with Alex and Rolf and see if the RCG can't be more intelligent about its filter generation.

 

  4320   Thu Feb 17 23:56:53 2011 josephbUpdateCDSDaqd was rebuilt, now reverted.

As one of the trouble shooting steps for the daqd (i.e. framebuilder) I rebuilt the daqd executable.  My guess is somewhere in the build code is some kind of GPS offset to make the time correct due to our lack of IRIG-B signal.

The actual daqdrc file was left untouched when I did the new install, so the symmetricom gps offset is still the same, which confuses me.

I'll take a look at the SVN diffs tomorrow to see what changed in that code that could cause a 300000000 or so offset to the GPS time.

 

 

  4321   Fri Feb 18 00:13:55 2011 kiwamuUpdateCDSRe:Daqd was rebuilt, now reverted.

THANK YOU, JOE !!! 

Quote:

As one of the trouble shooting steps for the daqd (i.e. framebuilder) I rebuilt the daqd executable.

  4323   Fri Feb 18 13:41:22 2011 josephbUpdateCDSCDS fixes

I talked to Alex today and had two things fixed:

First the maximum length of filter names (in the foton C1SYS.txt files in /chans) has been increased to 40, from 20.  This does not increase EPICS channel name length (which is longer than 20 anyways).

This should prevent running into the case where the model doesn't complain when compiled, but we can't load filters.

Additionally, we modified the feCodeGen.pl script in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util/ to correctly generate names for filters in all cases.  There was a problem where the C1 was being left off the file name when in the simulink .mdl file the filter was located in a box which had "top_names"  set.

  4325   Fri Feb 18 17:52:25 2011 josephb, valeraUpdateCDSc1ass updated

We updated the c1ass model to include the BS.  We removed the dither excitation of the PZTs.  PZT control goes to epics. To do this, modified the /cvs/cds/caltech/target/c1iscaux/PZT_AI.db file.  We basically have it sum both the existing epics slider and our new output from c1ass.

More importantly we updated the color scheme.

We compiled and tested the Dolphin and RFM which work.

I should note we can't figure out why testpoints are not working properly with just this model.  Alex and Joe spent well over an hour trying to debug it to no success.  Current workaround is to add what channels you want from c1ass to the DAQ recording.  Other testpoints on other models appear to be working.

Attachment 1: c1ass_updated.png
c1ass_updated.png
  4344   Wed Feb 23 11:53:30 2011 josephbUpdateCDSUpdated mx drivers

Alex and I updated the open mx drivers from 1.3.3 to 1.3.901 (1.4 release candidate).  We downloaded the drivers from: http://open-mx.gforge.inria.fr/

We put them in /root/open-mx-1.3.901 on the fb machine (and thus get mounted by all the front ends.).  We did configure and make and make install.

We did a quick check with /opt/mx/bin/mx_info on the fb machine at this point and realized the MAC addresses and host names were all messed up, including two listings for c1iscex with different mac addresses (neither of which was c1iscex).

We then brought all the front ends mx_streams down, brought the fb down, then cleared all the peer names with mx_hostname.  We then brought the fb up, and the front end mx_stream processes.

/opt/mx/mx_info now shows a clean and correct set of hostnames and mac addresses.  Testpoints and trends are working.

ELOG V3.1.3-