ID | Date | Author | Type | Category | Subject |
4772
|
Tue May 31 14:29:00 2011 |
jzweizig | Update | CDS | frames |
There seems to be something strange going on with the 40m frame builder.
Specifically, there is a gap in the frames in /frames/full near the start of
each 100k second subdirectory. For example, frames for the following times are missing:
990200042-990200669
990300045-990300492
990400044-990400800
990500032-990500635
990600044-990600725
990700037-990700704
990800032-990800677
990900037-990900719
To summarize, after writing the first two frames in a data directory, the next ~10 minutes of frames are usually missing. To make matters worse (for
the nds2 frame finder, at least) the first frame after the gap (and all successive frames) start at an arbitrary time, usually not aligned to a 16-second boundary. Is there something about the change of directories that is causing the frame builder to crash? Or is the platform/cache disk too slow to complete the directory switch-over without loss of data?
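For reference, here is a rough Python sketch of how one might scan a /frames/full subdirectory for such gaps. The frame file naming (<prefix>-<GPS start>-<duration>.gwf) and the example directory are assumptions, not taken from the frame builder code:
import glob, os, re

# list the frame files in one 100k-second subdirectory and report any gap
# between the end of one frame (start + duration) and the start of the next
def find_gaps(subdir):
    frames = []
    for f in glob.glob(os.path.join(subdir, '*.gwf')):
        m = re.search(r'-(\d+)-(\d+)\.gwf$', os.path.basename(f))
        if m:
            frames.append((int(m.group(1)), int(m.group(2))))
    frames.sort()
    for (start, dur), (nxt, _) in zip(frames, frames[1:]):
        if start + dur != nxt:
            print('gap: %d-%d' % (start + dur, nxt))

find_gaps('/frames/full/9902')   # subdirectory name assumed to be the leading digits of the GPS time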
|
4773
|
Tue May 31 15:45:37 2011 |
Jamie | Update | CDS | c1iscey IOchassis powered off for some reason. repowered. |
We found that both of the c1iscey models (c1x05 and c1scy) were unresponsive, and weren't coming back up even after a reboot. We then found that the c1iscey IO chassis was actually powered off. Steve accepts some sort of responsibility, since he was monkeying around down there for some reason. After power-up and reboot, everything is running again. |
4776
|
Wed Jun 1 11:31:50 2011 |
josephb | Update | CDS | MC1 LR digital reading close to zero, readback ~0.7 volts |
There appears to be a bad cable connection somewhere on the LR sensor path for the MC1 optic.
The channel C1:SUS-MC1_LRPDMon is reading back 0.664 volts, but the digital sensor channel, C1:SUS-MC1_LRSEN_INMON, is reading about -16. This should be closer to +1000 or so.
We've temporarily turned off the LRSEN filter module output while this is being looked into.
I briefly went out and checked the cables around the whitening and AA boards for the suspension sensors; even after wiggling them and making sure everything was plugged in solidly, nothing changed. There was one semi-loose connection, but it wasn't on the MC1 board; I pushed it all the way in anyway. The monitor point on the AA board looks correct for the LR channels, although ITMX LR struck me as being very low at about -0.05 Volts.
According to data viewer, the MC1 LR sensor channel went bad roughly two weeks ago, around 00:40 on 5/18 UTC, or 17:40 on 5/17 PDT.
UPDATE:
It appears the AA board (or possibly the SCSI cable connected to it) is the problem in the chain. |
4781
|
Thu Jun 2 16:31:41 2011 |
Jamie | Update | CDS | acquired SUS channel name suffixes changed from _DAQ to _DQ |
CDS changed the suffix for all acquired channel names from _DAQ to _DQ. When we rebuilt the sus models, as described in the previous log, the channel names were changed and consequently the channel files were completely rewritten.
To fix the issue, the latest archived channel file was copied back into the chans directory, and the suffixes were changed, as so:
cd /opt/rtcds/caltech/c1/chans
cp archive/C1SUS_110602_155403.ini C1SUS.ini
sed -i 's/DAQ/DQ/g' C1SUS.ini
We then restarted the models and the framebuilder. |
4790
|
Mon Jun 6 18:29:01 2011 |
Jamie, Joe | Update | CDS | COMPLETE FRONT-END REBUILD (WITH PROBLEMS (fixed)) |
Today Joe and I undertook a FULL rebuild of all front end systems with the head of the 2.1 branch of the RCG. Here is the full report of what we did:
- checked out advLigoRTS/branches/branch-2.1, r2457 into core/branches/branch-2.1
- linked core/release to branches/branch-2.1
- linked in models to core/release/src/epics/simLink using Joe's new script (userapps/release/cds/c1/scripts/link_userapps)
- remove unused/non-up-to-date models:
c1dafi.mdl c1lsp.mdl c1gpv.mdl c1sup_vertex_plant_shmem.mdl
- modified core/release/Makefile so that it can find models:
--- Makefile    (revision 2451)
+++ Makefile    (working copy)
@@ -346,7 +346,7 @@
 #MDL_MODELS = x1cdst1 x1isiham x1isiitmx x1iss x1lsc x1omc1 x1psl x1susetmx x1susetmy x1susitmx x1susitmy x1susquad1 x1susquad2 x1susquad3 x1susquad4 x1x12 x1x13 x1x14 x1x15 x1x16 x1x20 x1x21 x1x22 x1x23
 #MDL_MODELS = $(wildcard src/epics/simLink/l1*.mdl)
-MDL_MODELS = $(shell cd src/epics/simLink; ls m1*.mdl | sed 's/.mdl//')
+MDL_MODELS = $(shell cd src/epics/simLink; ls c1*.mdl | sed 's/.mdl//')
 World: $(MDL_MODELS)
 showWorld:
- removed channel files for models that we know will be renumbered
- For this rebuild, we are also building modified sus models that now use libraries, so the channel numbering is changing.
- make World
- this makes all the models
- make installWorld
- this installs all the models
- Run activateDQ.py script to activate all the relevant channels
- this script was modified to handle the new "_DQ" channels
- make/install new awgtpman:
cd src/gds
make
cp awgtpman /opt/rtcds/caltech/c1/target/gds/bin
- turn off all watchdogs
- test restart one front end: c1iscex
BIG PROBLEM
The c1iscex models (c1x01 and c1scx) did not come back up. c1x01 was running long on every cycle, until the model crashed and brought down the computer. After many hours, and with Alex's help, we managed to track down the issue to a patch from Rolf at r2361. The code included in that patch should have been wrapped in an "#ifndef RFM_DIRECT_READ". This was fixed and committed to branches/branch-2.1 at r2460 and to trunk at r2461.
- update to core/branches/branch-2.1 to r2460
- make World && make installWorld with the new fixed code
- restarted all computers
- restart frame builder
- burt restored to 8am this morning
- turned on all watchdogs
Everything is now green, and things seem to be working. Mode cleaner is locked. X arm locked.
|
4800
|
Thu Jun 9 16:18:03 2011 |
josephb | Update | CDS | Second trends only go back 12 days |
While answering a quick question by Kiwamu, I noticed we only had second trends going back to 990500000 GPS time, May 27th 2011.
Trends (I thought) were intended to be kept forever, and certainly longer than full data, which currently goes back several months.
Jamie will need to look into this. |
4801
|
Thu Jun 9 18:25:22 2011 |
kiwamu | HowTo | CDS | look back at a channel which doesn't exist any more |
For some purposes I looked back at the data of some channels which don't exist any more. Here I explain how to do it.
If this method is not listed on the wiki, I will put this instruction on a wiki page.
(How to)
(1) Edit an "ini" file which is not associated with the real-time control (e.g. IOP_SLOW.ini)
(2) In the file, write a channel name which you are interested in. The channel name should be bracketed like the other existing channels.
example: [C1:LSC-REFL11_I_OUT_DAQ]
(3) Define the data rate. If you want to look at the full data, write
datarate = 2048
just below each channel name (see the combined example after this list).
Or if you want to look at only the trends, don't write anything.
(4) Save the ini file and restart fb. If necessary hit "DAQ Reload" button on a C1:AAA_GDS_TP.adl screen to make the indicators green.
(5) Now you should be able to look at the data for example by dataviewer.
(6) After you finish the job, don't forget to clean up the entries that you put in the ini file, because they will always show up in the channel list in dtt and are just confusing.
Also don't forget to restart fb to reflect the change.
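Putting steps (2) and (3) together, the added block in the ini file would look something like this (channel name taken from the example above; include the datarate line only if you want the full-rate data):
[C1:LSC-REFL11_I_OUT_DAQ]
datarate = 2048
|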
4803
|
Fri Jun 10 12:02:10 2011 |
rana | Update | CDS | Second trends only go back 12 days |
Quote: |
While answering a quick question by Kiwamu, I noticed we only had second trends going back to 990500000 GPS time, May 27th 2011.
Trends (I thought) were intended to be kept forever, and certainly longer than full data, which currently goes back several months.
Jamie will need to look into this.
|
Our concept is to keep second trends for 1-2 months and minute trends forever. The scheme that Alan worked out many years ago was such that we could look back to 1998, and the minute trends would be backed up somehow.
If it's not working, we need to get Alan's help to recover the previous configuration. |
4809
|
Mon Jun 13 15:33:55 2011 |
Jamie, Joe | Update | CDS | Dolphin fiber between 1Y3 and 1X4 appears to be dead |
The fiber that connects the Dolphin card in the c1lsc machine (in the 1Y3 rack) to the Dolphin switch in the 1X4 rack appears to have died spontaneously this morning. This was indicated by loss of Dolphin communication between c1lsc and c1sus.
We definitively tracked it down to the fiber by moving the c1lsc machine over to 1X4 and testing the connection with a short cable. This worked fine. Moving it back to using the fiber again failed.
Unfortunately, we have no replacement Dolphin fiber. As a workaround, we are stealing a long computer->IO chassis cable from Downs and moving the c1lsc machine to 1X4.
This will be a permanent reconfiguration. The original plan was to have the c1lsc machine also live in 1X4. The new setup will put the computer farther from the RF electronics, and more closely mimic the configuration at the sites, both of which are good things. |
4811
|
Mon Jun 13 18:40:08 2011 |
Jamie, Joe | Update | CDS | Snags in the repair of LSC CDS |
We've run into a problem with our attempts to get the LSC controls back up and running.
As reported previously, the Dolphin fiber connection between c1lsc and c1sus appears to be dead. Since we have no replacement fiber, the solution was to move the c1lsc machine into the 1X4 rack, which would allow us to use one of the many available short Dolphin cables, and then use a long fiber PCIe extension cable to connect c1lsc to its IO chassis. However, the long PCIe extension cable we got from Downs does not appear to be working with our setup. We also tested the cable with c1sus, and it does not seem to work there either.
We've run out of options today. Tomorrow we're going to head back to Downs to see if we can find a cable that at least works with the test-stand setup they have there. |
4812
|
Mon Jun 13 19:26:42 2011 |
Jamie, Joe | Configuration | CDS | SUS binary IO chassis 2 and 3 moved from 1X5 to 1X4 |
While prepping 1X4 for installation of c1lsc, we removed some old VME crates that were no longer in use. This freed up lots of space in 1X4. We then moved the SUS binary IO chassis 2 and 3, which plug into the 1X4 cross-connect, from 1X5 into the newly freed space in 1X4. This makes the cable run from these modules to the cross-connect much cleaner. |
4815
|
Tue Jun 14 09:25:17 2011 |
Jamie | Update | CDS | Dolphin fiber tested with c1sus, still bad |
The bad Dolphin fiber was still bad when tested with a connection between c1sus and the Dolphin switch.
I'm headed over to Downs to see if we can resolve the issue with the PCIe extension fiber. |
4816
|
Tue Jun 14 12:23:44 2011 |
Jamie, Joe | Update | CDS | WE ARE ALL GREEN! LSC back up and running in new configuration. |
After moving the c1lsc computer to 1X4 and then connecting c1lsc to its IO chassis in 1Y3 by a fiber PCIe extension cable, everything is back up and running and the status screen is all green. c1lsc is now directly connected to c1sus via a short copper Dolphin cable.
After lunch we will do some more extensive testing of the system to make sure everything is working as expected. |
4837
|
Mon Jun 20 09:28:19 2011 |
Jamie | Update | CDS | Shutting down low-voltage DC power in 1X1/1X2 racks |
In order to install the BO module in 1X2, I need to shut down all DC power to the 1X1 and 1X2 racks. |
4838
|
Mon Jun 20 10:45:43 2011 |
Jamie | Update | CDS | Power restored to 1X1/1X2 racks. IOO binary output module installed. |
All power has been restored to the 1X1 and 1X2 racks. The modecleaner is locked again.
I have also hooked up the binary output module in 1X2, which was never actually powered. This controls the whitening filters for MC WFS. Still needs to be tested. |
4843
|
Mon Jun 20 17:58:00 2011 |
rana | Update | CDS | Gateway program killed |
There was a rogue, undocumented, gateway process running on NODUS since ~4 PM. This guy was broadcasting channels back into the Martian and causing lockups in the IOO controls. I did a kill -9 on its process.
Someone will pay for this. |
4867
|
Thu Jun 23 21:34:21 2011 |
kiwamu | Update | CDS | no foton on the CentOS machines |
For some reason, foton's default sample rate is NOT correct when it runs on the CentOS machines.
It tries to set the sample rate to 2048 Hz instead of 16384 Hz until you specify the frequency.
To avoid an accidental change of the sample rate,
running foton on CentOS is forbidden until further notice.
Run foton only on Pianosa.
Additionally, I added an alias to cshrc.40m so that people cannot run foton on CentOS (csh and tcsh, technically speaking).
Below is an example of the raw output when I typed foton on a CentOS machine.
rossa:caltech>foton
DO NOT use foton on CentOS
|
4871
|
Thu Jun 23 22:53:02 2011 |
kiwamu | Update | CDS | ran activateDQ.py |
I found that some DQ channels (e.g. SENSOR_UL etc.) for C1SUS hadn't been activated, so I ran activateDQ.py.
Then I restarted daqd on fb as usual. So far the DQ channels look working fine. |
4881
|
Fri Jun 24 22:35:23 2011 |
rana | Configuration | CDS | dataviewer broken on pianosa |
When I try to get minute trend, it says "word too long". |
4886
|
Sun Jun 26 16:17:22 2011 |
rana | Update | CDS | diagonalization of MC input matrix |
I have updated the scripts/SUS/peakFit/ directory so that it now finds the SUS input matrix coefficients in addition to just finding the free-swinging peaks.
Procedure:
- Get OSEM sensors data via NDS2 from a time when the optics have been kicked and then left free swinging.
- Downsample the data to 64 Hz and save.
- Make power spectra with a 1 mHz resolution (i.e. we need a few hours of data) and ~10 averages.
- Use the fminsearch Lorentzian peak fitter -> save the peak frequencies
- Make Transfer Function estimate matrix at the peak frequencies between all OSEMs (this makes a 5x4 complex matrix)
- The matrix should be real, so make sure it's mostly real and then take the real part only
- Normalize (height of biggest peak for each f_DOF should be 1)
- Add a Butterfly mode vector. This makes the sensing matrix go from 5x4 to 5x5. (Butterfly a.k.a. Pringle)
- Invert
- Normalize so that the biggest element in each Sensor2DOF column is 1.
- Load values into MEDM screen and then verify by another free swinging data run.
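As a rough numpy illustration of the linear-algebra steps above (this is not the actual peakFit code; the OSEM ordering and the butterfly sign pattern here are assumptions):
import numpy as np

# sens: 5x4 complex matrix of transfer-function estimates between the OSEMs
# (assumed ordering UL, UR, LR, LL, SD) at the POS/PIT/YAW/SIDE peak frequencies
def build_input_matrix(sens):
    A = np.real(sens)                    # matrix should be real, so keep only the real part
    A = A / np.abs(A).max(axis=0)        # biggest peak for each DOF column -> 1
    butterfly = np.array([[1.0, -1.0, 1.0, -1.0, 0.0]]).T   # butterfly (pringle) vector, signs assumed
    A = np.hstack([A, butterfly])        # 5x4 -> 5x5
    M = np.linalg.inv(A)                 # invert to get the sensor-to-DOF matrix
    return M / np.abs(M).max(axis=0)     # biggest element in each Sensor2DOF column -> 1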
The attached PDF shows how much rejection of the unwanted DOFs we get between the existing diagonal input matrix and this new empirical matrix. Previously, the decoupling was only a factor of a few for some of the modes. Now the decoupling is more like orders of magnitude (at least according to this calculation). It will be worse when we load it and then try another free swinging run. However, the fact that the suppression can be this good means that the variation in the coefficients at the ~hours time scale is at least this small (~< 0.1%)
That's the basic procedure, but there are a lot of important but mainly technical details:
- Free swinging data must be taken with the angle bias ON. Otherwise, we are not measuring the correct sensing gain (i.e. the magnets are not in their nominal place within the OSEM-LED beam)
- Data must be checked so that the shadow sensor outputs are in their linear regime: if they are exploring the cubic part, then the fundamental is being suppressed.
- Instead of just using the peak frequency, I average a few points around the peak to get better SNR before inversion. I think this will make the results more stable.
- All previous input matrix diagonalization efforts (Buckley, Sakata & Kawamura, Black, Barton, Gonzalez, Adhikari & Lawrence, Saulson,...) for the past ~15 years have been using the spectra's peak height data. Today's technique uses the TF and so is more precise. The coherent transfer function is always better than just using the magnitude data.
- This method is now fairly automatic - there's no human intervention in fudging values, choosing peak heights, frequencies, etc.
- We'll have to rerun this, of course, after the mirrors are aligned and after the OSEM whitening fiasco is cleaned up somewhat.
I'll set the optics to be aligned and then swing tonight. |
Attachment 1: inMatDiag.pdf
|
|
4888
|
Sun Jun 26 22:38:20 2011 |
rana | Update | CDS | MC1 LR dead for > 1 month; now revived temporarily |
Since the MC1 LRSEN channel wasn't working, my input matrix diagonalization wasn't working today. So I decided to fix it somehow.
I went to the rack and traced the signal: first at the LEMO monitor on the whitening card, secondly at the 4-pin LEMO cable which goes into the AA chassis.
The signal existed at the input to the AA chassis but not on the screen. So I pressed down the jumper wire (where the AA filter used to be) for the channel corresponding to the MC1 LRSEN channel.
It has now come back and looks like the other sensors. As you can see from this plot and Joe's entry from a couple weeks ago, this channel had been dead since May 17th.
The ELOG reveals that Kiwamu caught Steve doing some (un-elogged) fooling around there. Burnt Toast -> Steve.
993190663 = free swinging ringdown restarted again |
Attachment 1: lrsen.png
|
|
4889
|
Mon Jun 27 00:23:11 2011 |
rana | Update | CDS | ETMX SIDE problem |
The slow readback of the ETMX side also seems to have something flaky and bi-stable going on. This is not an issue for damping, but it disables the SIDE watchdog for ETMX and makes it unsafe if we accidentally use the wrong damping sign. |
Attachment 1: etmx-side.png
|
|
4918
|
Thu Jun 30 06:54:07 2011 |
josephb | Update | CDS | Modified the automated scripts for producing model webviews |
Dave Barker pointed out last week that the webview of our simulink model files, generated from the installed models (i.e. in /opt/rtcds/caltech/c1/target/<system name>/simLink/), was not handling libraries properly. Essentially, the generated web pages couldn't see inside library parts.
This was caused by two problems. The first was that the userapps directories were not in the matlab path when the slwebview call was done, so it couldn't even find the libraries. The second was that the slwebview code by default doesn't follow libraries and links, and needs a special argument to be told to do so.
I added the following lines to the webview_simlink_update.m file:
addpath('/opt/rtcds/caltech/c1/core/trunk/src/epics/simLink/lib')
for sub = {'cds','isc','isi','sus','psl'}
    for spath = {'common/models','c1/models/lib'}
        addpath(['/opt/rtcds/caltech/c1/userapps/release/' sub{1} '/' spath{1}]);
    end
end
I also changed the following:
temp = slwebview(final_files{x},'viewFile',false);
became
temp = slwebview(final_files{x},'viewFile',false,'FollowLinks','on','FollowModelReference','on');
After confirming these changes worked, I have sent a corrected version to Dave and Keith.
The webview results can be found at: https://nodus.ligo.caltech.edu:30889/FE/
|
4961
|
Tue Jul 12 10:18:05 2011 |
Jamie | Update | CDS | C1:DAQ-FB0_C1???_STATUS indicators red, restored after controller restarts |
Yesterday I found the C1:DAQ-FB0_C1???_STATUS lights to be red for the SUS, MCS, SCX, and SCY controllers. I know this has something to do with model communication with the framebuilder, but I unfortunately don't remember exactly what it is. I decided to try restarting the affected models to see if that cleared up the problem. It did. After restarting c1scx, c1scy, c1sus, and c1mcs everything came back up green.
We need some better documentation about what all of these status indicators mean. |
5006
|
Wed Jul 20 20:04:54 2011 |
Jamie | Update | CDS | C1:DAQ-FB0_C1XXX_STATUS sometimes unexplainably goes red |
I have been noticing this happening occasionally, but I don't understand what is causing it:
[attached screenshot of the red C1:DAQ-FB0_C1SCX_STATUS indicator]
The channel in question above is C1:DAQ-FB0_C1SCX_STATUS. This channel is (I believe) reporting some status of the front end model communication with the frame builder, but I'm not sure exactly what.
Usually this problem goes away when I restart the model or the frame builder, but it didn't work this time. Tomorrow I will figure out what this channel means, why it's sporadically going red, and how to correct it. |
5030
|
Mon Jul 25 13:01:24 2011 |
kiwamu | Update | CDS | c1ioo Make problem |
[Suresh / Kiwamu]
HELP US Jamieeeeeeee !! We are unable to compile c1ioo.
It looks like something wrong with Makefile.
We ran make c1ioo -- this was successful every time. However make install-c1ioo doesn't run.
Below are the error messages we got.
make install-target-c1ioo
make[1]: Entering directory `/opt/rtcds/caltech/c1/core/branches/branch-2.1'
Please make c1ioo first
Then we looked at the Makefile and tried to find what was wrong, and found the line (36th from the top) saying
if test $(site)no = no; then echo Please make $$system first; exit 1; fi;\
We thought the lack of a site-name specification caused the error.
So then we tried to compile it again with the site name specified, by typing
export site=c1
in the terminal window.
It got a little bit further, but it still doesn't run all the way through the make.
|
5031
|
Mon Jul 25 13:09:39 2011 |
Jamie | Update | CDS | c1ioo Make problem |
> It looks like something wrong with Makefile.
Sorry, this was my bad. I was making a patch to the makefile to submit back upstream and I forgot to revert my changes. I've reverted them now, so everything should be back to normal. |
5049
|
Wed Jul 27 15:49:13 2011 |
jamie | Configuration | CDS | dataviewer now working on pianosa |
Not exactly sure what the problem was, but I updated to the head of the SVN and rebuilt and it seems to be working fine now. |
5060
|
Fri Jul 29 12:39:26 2011 |
jamie | Update | CDS | c1iscex mysteriously crashed |
c1iscex was behaving very strangely this morning. Steve earlier reported that he was having trouble pulling up some channels from the c1scx model. I went to investigate and noticed that indeed some channels were not responding.
While I was in the middle of poking around, c1iscex stopped responding altogether, and became completely unresponsive. I walked down there and did a hard reset. Once it rebooted, and I did a burt restore from early this morning, everything appeared to be working again.
The fact that problems were showing up before the machine crashed worries me. I'll try to investigate more this afternoon. |
5094
|
Tue Aug 2 16:43:23 2011 |
jamie | Update | CDS | NDS2 server on mafalda restarted for access to new channels |
In order to get access to new DQ channels from the NDS2 server, the NDS2 server needs to be told about the new channels and restarted. The procedure is as follows:
ssh mafalda
cd /users/jzweizig/nds2-mafalda
./build_channel_history
./install_channel_list
pkill nds2
# wait a few seconds for the process to quit and release the server port
./start_nds2
This procedure needs to be run every time new _DQ channels are added.
We need to set this up as a proper service, so the restart procedure is more elegant.
An additional comment from John Z.:
The --end-gps parameter in ./build_channel_history seems to be causing
some trouble. It should work without this parameter, but there is a
directory with a gps time of 1297900000 (evidently a test for GPS1G)
that might screw up the channel list generation. So, it appears that
the end time requires a time for which data already exists. This
wouldn't seem to be a big deal, but it means that it has to be modified
by hand before running. I haven't fixed this yet, but I think that I
can probably pick out the most recent frame and use that as an end-time
point. I'll see if I can make that work... |
5136
|
Mon Aug 8 00:12:58 2011 |
rana | Update | CDS | diagonalization of MC input matrix |
I've finally completed the SUS/peakFit/ scripts which find the new input matrix for the SUS. MC1, MC2, MC3, and ITMX have been matrix'd.
I tried to do the BS, but it came out with very funny matrix elements. Also the BS is missing its DAQ channels again (JAMIE !) so we can't diagnose it with the free swinging method.
To continue, we have to get some good data and try this again. Right now there are some weird issues with a lot of the optics. I've also set the damping gains for the optics with the new matrices.
Ex.
new_matrix = findMatrix('ITMX')
writeSUSinmat('ITMX', new_matrix)
writeSUSinmat.m: this script writes the values to the MEDM SUS input matrix. To do the writing, I used the low-level 'caput' command instead of ezcawrite, since the ezca libraries are getting deprecated.
caput doesn't really have good diagnostics, so I use matlab to check the return status and then display it to the terminal. You can just rerun it if it gives you an error.
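The actual script is MATLAB, but the idea of checking the return status is roughly this (a Python sketch; the channel name is only an illustrative placeholder, not necessarily the real matrix-element channel):
import subprocess

def caput_checked(channel, value):
    # run the EPICS caput command-line tool and report a failure to the terminal
    ret = subprocess.call(['caput', channel, str(value)])
    if ret != 0:
        print('caput failed for %s (status %d) -- just rerun it' % (channel, ret))
    return ret

caput_checked('C1:SUS-ITMX_INMATRIX_1_1', 1.0)   # hypothetical channel name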
A couple of normalization notes:
1) The POS/PIT/YAW rows are scaled so that the mean of abs(FACE elements) = 1. Previously, I had the max element = 1.
2) The SIDE row is scaled so that the SIDE element = +1.
3) I then normalized the ROWS according to the geometrical factors that Jamie has calculated and almost put into the elog.
All these scripts have been added to the SVN. I've removed the large binary data files from the directory though. You can just rsync them in to your laptop if you want to run this stuff remotely. |
5137
|
Mon Aug 8 00:58:26 2011 |
rana | Update | CDS | diagonalization of MC input matrix |
Besides the purpose of correctly tuning the suspensions, my hidden goal in the input matrix diagonalization has been to figure out what the 'true' sensing noise of the OSEMs is so that we can accurately predict the noise impact on the OAF.
The attached plot shows the DOFs of ITMX calibrated into microns or microrad as per Jamie's ethereal input matrix calculations.
The main result is in the ratio of POS to BUTTER. It tells us that even at nighttime (when this data was taken) we should be able to get some reduction in the arms at 1 Hz.
Whether we can get anything down to 0.1 Hz depends on how the arm control signal compares to the POS signal here. I leave it to Jenne to overlay those traces using a recent Arm lock. |
Attachment 1: null.png
|
|
5143
|
Mon Aug 8 19:45:27 2011 |
jamie | Update | CDS | activateDQ script run; SUS channels being acquired again |
> Also the BS is missing its DAQ channels again (JAMIE !) so we can't diagnose it with the free swinging method.
I'm not sure why the BS channels were not being acquired. I reran the activateDQ script, which seemed to fix everything. The BS DQ channels are now there.
I also noticed that for some reason there were SUS-BS-ASC{PIT,YAW}_IN1_DQ channels, even though they had their acquire flags set to 0. This means that they were showing up like test point channels, but not being written to frames by the frame builder. This is pretty unusual, so I'm not sure why they were there. I removed them. |
5162
|
Wed Aug 10 00:21:10 2011 |
jamie | Update | CDS | updates to peakFit scripts |
I updated the peakFit routines to make them a bit more user friendly:
- modified so that any subset of optics can be processed at a time, instead of just all
- broke out tweakable fit parameters into a separate parameters.m file
- added a README that describes use
These changes were committed to the 40m svn. |
5211
|
Fri Aug 12 16:50:37 2011 |
Yoichi | Configuration | CDS | FE Status screen rearranged |
I rearranged FE_STATUS.adl so that I have space to add c1ffc to the screen.
So please be aware that the FE monitors are no longer in their original positions
on the screen. |
5214
|
Fri Aug 12 17:27:49 2011 |
Yoichi | Summary | CDS | Toggle button for RCG |
Bottom line: I made an RCG block to realize a toggle button easily.
Read on if you need such a button, or if you want to know how to
write a new RCG block with C.
-----------------
When I was making MEDM screens for FFC, I wanted to have a toggle
button to enable/disable the FFC path.
I wanted to have something like the ON/OFF buttons of the filter bank
screens, the one changes its state every time I click on it.
However, I could not find an easy way to realize that.
From MEDM, I can send a value to an EPICS channel using a "Message Button".
This value is always the same, say 1.
In the RCG model, I used a cdsEpicsMomentary block so that whenever the channel
gets 1, it stays at 1 for a while and turns back to 0 in a second or so.
This generates a pulse of 1 when I click on a message button on a MEDM screen.
Then I needed a block that keeps its internal state (0 or 1) and flips its state
whenever it receives a pulse of 1.
Since I couldn't find such a block in the current RCG library, I implemented one
using the cdsFunctionCall block. It allows you to implement a block with C code.
There is a good explanation of how to use this block in the CDS_PARTS library.
Here is basically what I did.
(1) Drag and drop the cdsFunctionCall block into my model.
(2) In the "Block Properties", I put the following line in the Description field.
inline cdsToggle /opt/rtcds/caltech/c1/userapps/release/cds/common/src/cdsToggle.c
This means to call a function cdsToggle(), whose code is in the file indicated above.
(3) The contents of the source code are very simple.
void cdsToggle(double *in, int inSize, double *out, int outSize){
    static double x = 0;   /* current toggle state */
    static double y = 0;   /* last input value, used for edge detection */
    if (*in != y){
        y = *in;
        if (y == 1){                  /* rising edge of the 1-pulse */
            x = (x == 1) ? 0 : 1;     /* flip the state */
            *out = x;
        }
    }
}
The function prototype is always the same. *in and *out are pointers to the arrays of doubles
for the input and output signals of the block. In Simulink, the signals have to be
multiplexed so that the RCG can know how many signals are handed to or returned from the function.
In order to keep the internal state of my custom block, I used the "static" keyword in the
declaration of the variables. The rest of the code should be obvious (see the short behavioral sketch after these steps).
(4) Just compile the model as usual. The RCG will automatically include the source code and put
a call to the function in the proper place.
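For reference, here is the same toggle logic in a few lines of Python, just to show the intended behavior (the output flips on each rising edge of the 1-pulse); this is only a sketch, not anything that runs in the front end:
def make_toggle():
    state = {'x': 0.0, 'y': 0.0}    # x = toggle state, y = last input (for edge detection)
    def step(inp):
        if inp != state['y']:
            state['y'] = inp
            if inp == 1:
                state['x'] = 0.0 if state['x'] == 1 else 1.0
        return state['x']
    return step

toggle = make_toggle()
print([toggle(v) for v in [0, 1, 0, 0, 1, 1, 0, 1]])   # -> [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0]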
I made the block a library so that people can use it.
/opt/rtcds/caltech/c1/userapps/trunk/cds/common/models/cdsToggle.mdl
is the one.
For the usage of it, please have a look at
/opt/rtcds/caltech/c1/userapps/trunk/isc/c1/models/c1lsc |
5312
|
Sat Aug 27 15:47:59 2011 |
rana | Update | CDS | OSEM noise / nullstream and what does it mean for satellites |
In the previous elog of mine, I looked at the nullstream (aka butterfly mode) to find out if the intrinsic OSEM noise is limiting the displacement noise of the interferometer or possibly the Wiener FF performance.
The conclusion was that it is not, above ~0.2 Hz. Due to the fortuitous breaking of the ITMX magnet, we also have a chance to check the 'bright noise': what the noise is with no magnet to occlude the LED beam.
As expected, the noise spectra with no magnets are less than the calculated nullstream. The attached plot shows the comparison of the LL OSEM (all the bright spectra look basically alike) with the damped
optic spectra from a few weeks ago.
From 0.1 - 10 Hz, the motion is cleanly larger than the noise. Below ~0.2 Hz, it's possible that the common mode rejection of the short cavity lengths is ruined by this. We should try to see if the low frequency
noise in the PRC/SRC is explainable with our current knowledge of seismicity and the 2-dimensional two-point correlation functions of the ground.
So, the question is, "Should we try to upgrade the satellite boxes to improve the OSEM sensing noise?" |
Attachment 1: Untitled.png
|
|
5315
|
Sun Aug 28 22:49:40 2011 |
Suresh | Update | CDS | fb down |
I recompiled c1ioo after making some changes and restarted fb (about 9:45 - 10 PM PDT), but it failed to restart. It responds to ping, but does not allow ssh or telnet. The screen output is:
allegra:~>ssh fb
ssh: connect to host fb port 22: Connection refused
allegra:~>telnet fb 8087
Trying 192.168.113.202...
telnet: connect to address 192.168.113.202: Connection refused
telnet: Unable to connect to remote host: Connection refused
allegra:~>
I am not able to connect to c1ioo either....
|
5316
|
Mon Aug 29 00:49:00 2011 |
kiwamu | Update | CDS | Re : fb down |
Fb is in a bad situation. It needs a MANUAL fsck to fix the file system.
HELP US, Jamieeeeeeeeeeee !!!
When Suresh and I connected a display and tried to see what was going on, the fb computer was in a file system check.
This was because Suresh did a hardware reboot by pressing a power button on the front panel.
Since the file check was taking a long time and didn't seem to be progressing, we pressed the reset button and then the power button again.
Actually the reset button didn't seem to work; it just made some indicator lights flash.
After the second reboot, the boot message said that it needs a manual fsck to fix the file system. This may be because we interrupted the file check.
We are leaving it to Jamie because the fsck command would do something bad if unfamiliar persons, like us, ran it.
In addition, the boot message was also saying that line 37 in /etc/fstab was bad.
We logged into the machine in safe mode and found that line 37 of fstab was an empty line.
We tried erasing this empty line, but failed for some reason. We were able to edit the file with vi, but weren't able to save it. |
5317
|
Mon Aug 29 12:05:32 2011 |
jamie | Update | CDS | Re : fb down |
fb was requiring a manual fsck on its disks because it was detecting filesystem errors. The errors had to do with the filesystem timestamps being in the future. It turned out that fb's system date was set to something in 2005. I'm not sure what caused the date to be so off (motherboard battery problem?), but I did determine, after I got the system booting, that the NTP client on fb was misconfigured and was therefore incapable of setting the system date. It seems that it was configured to query a non-existent ntp server. Why the hell it would have been set like this I have no idea.
In any event, I did a manual check on /dev/sdb1, which is the root disk, and postponed a check on /dev/sda1 (the RAID mounted at /frames) until I had the system booting. /dev/sda1 is being checked now, since there are filesystems errors that need to be corrected, but it will probably take a couple of hours to complete. Once the filesystems are clean I'll reboot fb and try to get everything up and running again. |
5319
|
Mon Aug 29 18:16:10 2011 |
jamie | Update | CDS | Re : fb down |
fb is now up and running, although the /frames raid is still undergoing an fsck which is likely to take another day. Consequently there is no daqd and no frames are being written to disk. It's running and providing the diskless root to the rest of the front end systems, so the rest of the IFO should be operational.
I burt restored the following (which I believe is everything that was rebooted), from Saturday night:
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1lscepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1susepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1iooepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1assepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1mcsepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1gcvepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1gfdepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1rfmepics.snap
/opt/rtcds/caltech/c1/burt/autoburt/snapshots/2011/Aug/27/23:07/c1pemepics.snap
|
5323
|
Tue Aug 30 11:28:56 2011 |
jamie | Update | CDS | framebuilder back up |
The fsck on the framebuilder (fb) raid array (/dev/sda1) completed overnight without issue. I rebooted the framebuilder and it came up without problem.
I'm now working on getting all of the front-end computers and models restarted and talking to the framebuilder.
5324
|
Tue Aug 30 11:42:29 2011 |
jamie | Update | CDS | testpoint.par file found to be completely empty |
The testpoint.par file, located at /opt/rtcds/caltech/c1/target/gds/param/testpoint.par, which tells GDS processes where to find the various awgtpman processes, was completely empty. The file was there but was just 0 bytes. Apparently the awgtpman processes themselves also consult this file when starting, which means that none of the awgtpman processes would start.
This file is manipulated in the "install-daq-%" target in the RCG Makefile, ultimately being written with output from the src/epics/util/updateTestpointPar.pl script, which creates a stanza for each front-end model. Rebuilding and installing all of the models properly regenerated this file.
I have no idea what would cause this file to get truncated, but apparently this is not the first time: elog #3999. I'm submitting a bug report with CDS.
|
5325
|
Tue Aug 30 14:33:52 2011 |
jamie | Update | CDS | all front-ends back up and running |
All the front-ends are now running. Many of them came back on their own after the testpoint.par was fixed and the framebuilder was restarted. Those that didn't just needed to be restarted manually.
The c1ioo model is currently in a broken state: it won't compile. I assume that this was what Suresh was working on when the framebuilder crash happened. This model needs to be fixed. |
5408
|
Wed Sep 14 20:04:05 2011 |
jamie | Update | CDS | Update to frame builder wiper.pl script for GPS 1000000000 |
I have updated the wiper.pl script (/opt/rtcds/caltech/c1/target/fb/wiper.pl) that runs on the framebuilder (in crontab) to delete old frames in case of file system overloading. The point of this script is to keep the file system from overloading by deleting the oldest frames. As it was, it was not sorting the frame file names numerically, which would have caused it to delete the post-GPS-1000000000 frames first. This issue was identified at LHO, and below is the patch that I applied to the script (a short illustration of the sorting problem follows the patch).
--- wiper.pl.orig 2011-04-11 13:54:40.000000000 -0700
+++ wiper.pl 2011-09-14 19:48:36.000000000 -0700
@@ -1,5 +1,7 @@
#!/usr/bin/perl
+use File::Basename;
+
print "\n" . `date` . "\n";
# Dry run, do not delete anything
$dry_run = 1;
@@ -126,14 +128,23 @@
if ($du{$minute_trend_frames_dir} > $minute_frames_keep) { $do_min = 1; };
+# sort files by GPS time split into prefixL-T-GPS-sec.gwf
+# numerically sort on 3rd field
+sub byGPSTime {
+ my $c = basename $a;
+ $c =~ s/\D+(\d+)\D+(\d+)\D+/$1/g;
+ my $d = basename $b;
+ $d =~ s/\D+(\d+)\D+(\d+)\D+/$1/g;
+ $c <=> $d;
+}
+
# Delete frame files in $dir to free $ktofree Kbytes of space
# This one reads file names in $dir/*/*.gwf sorts them by file names
# and progressively deletes them up to $ktofree limit
sub delete_frames {
($dir, $ktofree) = @_;
# Read file names; Could this be inefficient?
- @a= <$dir/*/*.gwf>;
- sort @a;
+ @a = sort byGPSTime <$dir/*/*.gwf>;
$dacc = 0; # How many kilobytes we deleted
$fnum = @a;
$dnum = 0;
@@ -145,6 +156,7 @@
if ($dacc >= $ktofree) { last; }
$dnum ++;
# Delete $file here
+ print "- " . $file . "\n";
if (!$dry_run) {
unlink($file);
}
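A quick illustration in Python of why the plain string sort had to be replaced with a numeric sort on the GPS field (the file names here are invented to match the usual <prefix>-<GPS>-<duration>.gwf convention):
names = ['C-R-999999984-16.gwf', 'C-R-1000000000-16.gwf']
print(sorted(names))                                        # string sort: the GPS 1000000000 file sorts first, so it would be deleted first
print(sorted(names, key=lambda n: int(n.split('-')[2])))    # numeric sort on the GPS field gives the correct oldest-first order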
|
5424
|
Thu Sep 15 20:16:15 2011 |
jamie | Update | CDS | New c1oaf model installed and running |
[Jamie, Jenne, Mirko]
New c1oaf model installed
We have installed the new c1oaf (online adaptive feed-forward) model. This model is now running on c1lsc. It's not really doing anything at the moment, but we wanted to get the model running, with all of its interconnections to the other models.
c1oaf has interconnections to both c1lsc and c1pem via the following routes:
c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf
Therefore c1lsc, c1pem, and c1rfm also had to be modified to receive/send the relevant signals.
As always, when adding PCIx senders and receivers, we had to compile all the models multiple times in succession so that the /opt/rtcds/caltech/c1/chans/ipc/C1.ipc would be properly populated with the channel IPC info.
Issues:
There were a couple of issues that came up when we installed and re/started the models:
c1oaf not being registered by frame builder
When the c1oaf model was started, it had no C1:DAQ-FB0_C1OAF_STATUS channel, which it is supposed to have. In the daqd log (/opt/rtcds/caltech/c1/target/fb/logs/daqd.log.19901) I found the following:
Unable to find GDS node 22 system c1oaf in INI files
It turns out this channel is actually created by the frame builder, and it could not find the channel definition file for the new model, so it was failing to create the channels for it. The frame builder "master" file (/opt/rtcds/caltech/c1/target/fb/master) needs to list the c1oaf daq ini files:
/opt/rtcds/caltech/c1/chans/daq/C1OAF.ini
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1oaf.par
These were added, and the framebuilder was restarted. After which the C1:DAQ-FB0_C1OAF_STATUS appeared correctly.
SHMEM errors on c1lsc and c1oaf
This turned out to be because of an oversight in how we wired up the skeleton c1oaf model. For the moment the c1oaf model has only the PCIx sends and receives. I had therefore grounded the inputs to the SHMEM parts that were meant to send signals to C1LSC. However, this made the RCG think that these SHMEM parts were actually receivers, since it's the grounding of the inputs to these parts that actually tells the RCG that the part is a receiver. I fixed this by adding a filter module to the input of all the senders.
Once this was all fixed, the models were recompiled, installed, and restarted, and everything came up fine.
All model changes were of course committed to the cds_user_apps svn as well. |
5426
|
Thu Sep 15 21:56:01 2011 |
Mirko | Update | CDS | c1oaf check, possible shmem problem |
After Jamie installed the c1oaf model ( entry 5424 ) I went and checked the intermodel communication.
Remember the config is:
c1lsc ->SHMEM-> c1oaf
c1oaf ->SHMEM-> c1lsc
c1pem ->SHMEM-> c1rfm ->PCIE-> c1oaf
I checked at least one of every communications type.
-All signals reach their destinations.
-c1lsc_to_c1oaf_via_shmem is more noisy, adding noise to the signal. c1lsc runs at 16 kHz and c1oaf at 2 kHz, but that should actually smooth things out.
|
Attachment 1: c1lsc_to_c1oaf_via_shmem.png
|
|
Attachment 2: c1oaf_to_c1lsc_via_shmem_fixed_sine_inj_at_100Hz.png
|
|
Attachment 3: c1oaf_to_c1lsc_via_shmem_white_noise_inj.png
|
|
Attachment 4: c1pem_to_c1oaf_via_rfm.png
|
|
5486
|
Tue Sep 20 17:45:30 2011 |
kiwamu | Update | CDS | daqd is restarting by itself? |
[Jenne / Kiwamu]
Fb was sick. Dataviewer and Fourier Tools didn't work for a while.
About 10 minutes later they became healthy again. No idea what exactly was going on.
One thing we found was that during the sickness of fb, it looks like daqd was restarting by itself. Is this normal??
Here are the bottom lines of restart.log. Apparently daqd was restarting although we didn't command it to do so.
daqd_start Tue Sep 20 02:41:17 PDT 2011
daqd_start Tue Sep 20 13:18:12 PDT 2011
daqd_start Tue Sep 20 17:33:00 PDT 2011
|
5535
|
Sat Sep 24 01:38:14 2011 |
kiwamu | Update | CDS | c1scx and c1x01 restarted |
[Koji / Kiwamu]
The c1scx and c1x01 realtime processes became frozen. We restarted them around 1:30 by sshing and running the kill/start scripts. |
5561
|
Wed Sep 28 02:42:04 2011 |
kiwamu | Update | CDS | some DAQ channel lost in c1sus : fb, c1sus and c1pem restarted |
Somehow some DAQ channels for C1SUS have disappeared from the DAQ channel list.
Indeed there are only a few DAQ channels listed in the C1SUS.ini file.
I ran activateDQ.py and restarted daqd.
Everything looks okay. C1SUS and C1PEM were restarted because they became frozen.
|