40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log, Page 37 of 344  Not logged in ELOG logo
ID Date Author Type Categoryup Subject
  3828   Fri Oct 29 18:37:33 2010 yutaSummaryCDSdaqd and current CDS status

Background:
  Before Joe left(~ 1 hour ago), fb was working for a while. But after he left, daqd core dumped.
  This is maybe because we started c1sus and c1rms again for a delay measurement, just before he left.

What I did:
  I restarted IOP(c1x02) and FE models.
  Now it seems OK (we can use dataviewer and diaggui), but daqd reports bunch of errors like;

CA.Client.Exception...............................................
    Warning: "Identical process variable names on multiple servers"
    Context: "Channel: "C1:SUS-ITMX_TO_COIL_0_3_INMON", Connecting to: 192.168.113.85:42367, Ignored: c1susdaq:42367"
    Source File: ../cac.cpp line 1208
    Current Time: Fri Oct 29 2010 18:07:39.132686519
..................................................................


tp node xx invalid  (xx is 38 to 36)

Current CDS status:

MC damp dataviewer diaggui AWG c1ioo c1sus c1iscex RFM Sim.Plant  ...
                   ... 

   Please add other stuff you need.

Below is an example of how the color code works:

06-25-2009.NN_24ThreatLevel.GJH2L69BK.1.jpg

  3829   Sat Oct 30 05:27:53 2010 yutaSummaryCDSCDS time delay measurement

Motivation:
  We want to know the time delay of CDS in the IOP scheme.

Setup:
delaysetup.png

What I did:
1. Plugged out SCSI cable from ADC card 2 and DAC card 0 on C1SUS machine.
   ADC card 2 is ADC 0
   DAC card 0 is DAC 0

2. Measured tranfer function between ADC and DAC by SR785 and compared with the downsampling filter in IOP with 65534Hz(=4x16384Hz) sampling frequency.

  As ADC_0_0 corresponds to PRM ULSEN input and DAC_0_0 corrsponds to ULCOIL output, we turned all the filters off and set gains to 0 or 1 so that TF between ULSEN to ULCOIL will be ideally 1. (see this wiki page for channel assigns)

  The filter coefficients for the down sampling filter was found in;
    /cvs/cds/rtcds/caltech/c1/core/advLigoRTS/src/fe/controller.c
  It was named feCoeff4x.

static double feCoeff4x[9] =
        {0.014805052402446,
        -1.71662585474518,    0.78495484219691,   -1.41346289716898,   0.99893884152400,
        -1.68385964238855,    0.93734519457266,    0.00000127375260,   0.99819981588176};


3. Calculated the time delay dt using the following formula;
  dt = [pm - pc]/f/360deg    (pm: measured phase, pc: calculated phase from feCoeff4x, f: frequency)

4. Measured TF between the SCSI cables to estimate the effect of the cables and others.
  Disconnected SCSI cables from ADC and DAC, and connected A aad B(see setup).
  I measured both when input coupling of SR785 is DC and AC and see what happens.

Result:
  [time delay of the CDS]  (left, middle)
    The time delay gets larger with frequency. The time delay seems to be -175 usec at DC.
    However, the gain seems a little different from my expectation(feCoeff4x). So, there are maybe other filters I don't know.
    I neglected TF of upsampling this time.

  [cable and other effect]  (right)
    The effect to the time delay measurement was tiny by a factor of 10^4 to 10^3 (few nsec).
    But the total cable length was about 5 m and assuming signal speed is 0.6c, delay will be about 30 nsec.
    I don't know what's happening.

CDSdelay.png

Plan:
  - make a model that does not go through IOP and see the delay caused by IOP

By the way:
  fb daqd is still running for hours!
  Every FEs are running(c1sus,rms,mcs).

  3830   Sat Oct 30 14:35:43 2010 KojiSummaryCDSCDS time delay measurement

Unsatisfactory.

Neglecting the digital anti-imaging filter makes the discrepancy. You must take into account your digital filter twice.

I attached the slides I made during my visit for March LVC '09. P.5 would be useful.

Quote:

Result:
  [time delay of the CDS]  (left, middle)
    The time delay gets larger with frequency. The time delay seems to be -175 usec at DC.
    However, the gain seems a little different from my expectation(feCoeff4x). So, there are maybe other filters I don't know.
    I neglected TF of upsampling this time.

 

Attachment 1: CDS_system_investigation_090323.pdf
CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf CDS_system_investigation_090323.pdf
  3834   Mon Nov 1 11:34:08 2010 josephbUpdateCDSCA.Client.Exception spam fixed

Problem:

CA.Client.Exception...............................................
    Warning: "Identical process variable names on multiple servers"
    Context: "Channel: "C1:SUS-ITMX_TO_COIL_0_3_INMON", Connecting to: 192.168.113.85:42367, Ignored: c1susdaq:42367"
    Source File: ../cac.cpp line 1208
    Current Time: Fri Oct 29 2010 18:07:39.132686519

The above exception gets thrown for each channel sent to the framebuilder when you start the frame builder.  Its the 192.168.113.xxx (main martian) and 192.168.114.xxx (DAQ) networks sending the same information.  This makes it hard to see real errors when starting the frame builder.

Solution:

Configure EPICS channel access to only use one network.  This is done by modifying the /etc/bash/bashrc and the /diskless/root/etc/bash/bashrc files with the following two lines:

export EPICS_CA_AUTO_ADDR_LIST=NO
export EPICS_CA_ADDR_LIST="192.168.113.255"

The first line tells the computers not to automatically search all attached network devices.  The second tells it to use the 192.168.113.xxx network for EPICS channel broadcasts.

  3836   Mon Nov 1 14:53:30 2010 josephb, alex, rolfUpdateCDSAlex updated FB and mx_streams code

Problem:

The framebuilder was being flaky.  MX_streams would go down, prevent testpoints from working and so forth.

Solution:

Send Alex up North for a week to fix the code. 

Alex came back and installed updates to the frame builder and the mx_streams code (Myrinet Express over Generic Ethernet Hardware) used by the front ends to talk to the frame builder.  Instead of 1 stream per model, there's now just 1 per front end handling all communications.

Alex did an SVN update and we now have the latest CDS code.

Self restarting codes:

The frame builder code (daqd) and nds pipe have been added to the fb machine's inittab.  Specifically it calls a script called /opt/rtcds/caltech/c1/target/fb/start_daqd.inittab and /opt/rtcds/caltech/c1/target/fb/start_nds.inittab.

The addition to the /etc/inittab file on fb is:

daq:345:respawn:/opt/rtcds/caltech/c1/target/fb/start_daqd.inittab 

nds:345:respawn:/opt/rtcds/caltech/c1/target/fb/start_nds.inittab

When these codes die they should automatically restart.

Self starting codes at boot up:

The front ends now start the mx_stream script (which lives in /opt/rtcds/caltech/c1/target/fb/ directory) at boot up.  They call it with the approriate command line options for that front end. It can be found in the /etc/rc.local file.

They look like: mx_stream -s "c1x02 c1sus c1mcs c1rms" -d fb:0

As always, the front end codes to be started are defined in the /etc/rtsystab file (or on fb, in the /diskless/root/etc/rtsystab file).

However, if it does go down you would need to restart it manually, although it seems more robust now and doesn't seem to go down every time we restart the frame builder.

All the usual front end IOCs and modules should be started and loaded on boot up as well.

 

Current CDS status:

MC damp dataviewer diaggui AWG c1ioo c1sus c1iscex RFM Sim.Plant Frame builder  ...
                     ... 
  3837   Mon Nov 1 15:35:41 2010 josephbUpdateCDSFront end USR and CPU times now recorded by DAQ

Problem:

We have no record of how long the CPUs are taking to perform a cycle's worth of computation

Solution:

I added the following channels to the various slow DAQ configuration files in /opt/rtcds/caltech/c1/chans/daq/

IOO_SLOW.ini:[C1:FEC-34_USR_TIME]
IOO_SLOW.ini:[C1:FEC-34_CPU_METER]
IOP_SLOW.ini:[C1:FEC-20_USR_TIME]
IOP_SLOW.ini:[C1:FEC-20_CPU_METER]
IOP_SLOW.ini:[C1:FEC-33_USR_TIME]
IOP_SLOW.ini:[C1:FEC-33_CPU_METER]
MCS_SLOW.ini:[C1:FEC-36_USR_TIME]
MCS_SLOW.ini:[C1:FEC-36_CPU_METER]
RMS_SLOW.ini:[C1:FEC-37_USR_TIME]
RMS_SLOW.ini:[C1:FEC-37_CPU_METER]
SUS_SLOW.ini:[C1:FEC-21_USR_TIME]
SUS_SLOW.ini:[C1:FEC-21_CPU_METER]

 

Notes:

To restart the daqd code, simply kill the running process.  It should restart automatically.  If it appears not to have started, check the /opt/rtcds/caltech/c1/target/fb/restart.log file and the /opt/rtcds/caltech/c1/target/fb/logs/daqd.log.xxxx files.  If you made a mistake in the DAQ channels and its complaining, fix the error and then restart init on the fb machine by running "sudo /sbin/init q"

  3838   Mon Nov 1 15:47:15 2010 yutaSummaryCDSCDS time delay measurement

Background:
  I measured CDS time delay last week, but because of my lack of understanding the system, it was incorrect.
  IOP has an anti-aliasing filter before downsampling from 64kHz(65536Hz) to 16kHz(16384Hz) and also has an anti-imaging filter before upsampling from 16kHz to 64kHz.
  So, I should have take feCoeff4x into account twice.
downupsampling.png

Result:
  TF agreed well with 2-time feCoeff4x and CDS time delay was -123.5 usec.
CDSdelay2.png


Plan:
 - make AWG(, diaggui TF measurement, tdssine) work
 - check input/output filter switching (using tdssine & tdsdmd)
 - measure openloop TF of MC suspension damping
 - divide it with my expectation and see if there are any filters I don't know

Quote:

Unsatisfactory.

Neglecting the digital anti-imaging filter makes the discrepancy. You must take into account your digital filter twice.

I attached the slides I made during my visit for March LVC '09. P.5 would be useful.

 

  3839   Mon Nov 1 16:43:24 2010 KojiSummaryCDSCDS time delay measurement

Um, Beautiful.

Actually, 123.5usec is almost exactly twice of 1/16384Hz.
Because of the loop, we have 1/16384Hz delay. I wonder where we do have the delay.

In order to understand the behaviour of the system can I ask you to test the following things?

1) What are the delay without IOPs with fsampl of 16k, 32k, 64k?

2) What are the delay with IOP with fsampl of 32k, 64k?

Quote:

Result:
  TF agreed well with 2-time feCoeff4x and CDS time delay was -123.5 usec.
CDSdelay2.png

 

  3841   Mon Nov 1 19:32:08 2010 yutaSummaryCDSfb crashed? during c1ioo and c1mcs connection at ASC

Frame builder died again!!

Background:
  We want to do angle to length measurement to optimize the beam position and increase visibility of MC locking.
  In order to do A2L measurement, we need excitation point, but AWG is currently not working.
  The better way is to use LOCKIN stuff like we had for OMC and put it to C1IOO WFS.
  A software oscillator in LOCKIN shakes the suspension, and demodulate the length signal.
  We can choose whatever DOF to shake, whatever signal to demodulate. It would be useful not just for A2L.

What I did:

  I started to put C1IOO WFS signal into C1SUS MC suspension RT model, but after compiling new c1mcs, fb crashed.
  Looks like daqd and mx_streams are running, but DAQ is not working(red).
  I don't know how to restart in a new way!

  3842   Mon Nov 1 23:31:05 2010 yutaUpdateCDSchecked input hardware filter in single frequency

Background:
  For input filter, we have analog whitening filter and also digital whitening filter. They have the same TF and when analog one is off, digital one should be on and vice versa.
  I made a python script that checks the switching automatically.

Method:
  Excite the suspension in a single frequency and see sensor inputs(XXSEN_IN1).
  Calculate the magnitude in the excitation frequency and compare it when digital whitening is off and on.
  When digital whitening is off, analog should be on, so sensor inputs should gone though the analog filter. That means the signal is multiplied by the TF of that filter, which makes the difference.

  We currently don't have excitation and I thought I have to wait, but instead of putting some extra excitation, I found that 60Hz line noise is useful.

Script:
  The script is /cvs/cds/caltech/users/yuta/scripts/WDWchecker.py
  For every sensor input, it;
    0. Stores current filter switching(XXSEN_SW1R)
    1. turns OFF the digital filter(FM1, using ezcaswitch)
    2. tdsdmd XXSEN_IN1 in 60Hz
    3. turns ON the digital filter
    4. tdsdmd XXSEN_IN1 in 60Hz
    5. divides mag(2.) by mag(4.) and calculate the analog filter gain in 60Hz
    6. Restores the filter switching in the state before the checking

Result:
  The results are;

C1:SUS-BS_ULSEN_IN1: 22.2 dB
C1:SUS-BS_URSEN_IN1: 18.7 dB
C1:SUS-BS_LRSEN_IN1: 22.7 dB
C1:SUS-BS_LLSEN_IN1: 16.0 dB
C1:SUS-BS_SDSEN_IN1: 21.5 dB
C1:SUS-ITMX_ULSEN_IN1: 16.9 dB
C1:SUS-ITMX_URSEN_IN1: 16.3 dB
C1:SUS-ITMX_LRSEN_IN1: 17.5 dB
C1:SUS-ITMX_LLSEN_IN1: 17.1 dB
C1:SUS-ITMX_SDSEN_IN1: 6.2 dB
C1:SUS-ITMY_ULSEN_IN1: 15.5 dB
C1:SUS-ITMY_URSEN_IN1: 16.5 dB
C1:SUS-ITMY_LRSEN_IN1: 17.4 dB
C1:SUS-ITMY_LLSEN_IN1: 16.3 dB
C1:SUS-ITMY_SDSEN_IN1: 18.0 dB
C1:SUS-PRM_ULSEN_IN1: 0.1 dB
C1:SUS-PRM_URSEN_IN1: 10.3 dB
C1:SUS-PRM_LRSEN_IN1: 13.1 dB
C1:SUS-PRM_LLSEN_IN1: -32.3 dB
C1:SUS-PRM_SDSEN_IN1: 14.6 dB
C1:SUS-SRM_ULSEN_IN1: 17.3 dB
C1:SUS-SRM_URSEN_IN1: 13.5 dB
C1:SUS-SRM_LRSEN_IN1: 1.6 dB
C1:SUS-SRM_LLSEN_IN1: 16.7 dB
C1:SUS-SRM_SDSEN_IN1: 18.3 dB

C1:SUS-MC1_ULSEN_IN1: 17.0 dB
C1:SUS-MC1_URSEN_IN1: 18.6 dB
C1:SUS-MC1_LRSEN_IN1: 14.9 dB
C1:SUS-MC1_LLSEN_IN1: 27.0 dB
C1:SUS-MC1_SDSEN_IN1: 16.6 dB
C1:SUS-MC2_ULSEN_IN1: 19.8 dB
C1:SUS-MC2_URSEN_IN1: 14.0 dB
C1:SUS-MC2_LRSEN_IN1: 20.8 dB
C1:SUS-MC2_LLSEN_IN1: 16.1 dB
C1:SUS-MC2_SDSEN_IN1: 17.3 dB
C1:SUS-MC3_ULSEN_IN1: 15.5 dB
C1:SUS-MC3_URSEN_IN1: 17.3 dB
C1:SUS-MC3_LRSEN_IN1: 18.2 dB
C1:SUS-MC3_LLSEN_IN1: 18.7 dB
C1:SUS-MC3_SDSEN_IN1: 16.8 dB


  Whitening filter has 18dB gain at 60Hz. (It's 3Hz pole, 30Hz zero, 100Hz zero and 0dB at DC)
  So, from the result, at least MC suspensions look like they have correct switching.
  But some channels doesn't look ok.
  We have to check those.

Plan:
   - check ITMX_SDSEN, PRM_ULSEN, PRM_LLSEN, SRM_LRSEN input filters
   - check the script and see if the script can really check. maybe the script needs some adjustments (# of averaging, multiple frequency, ......)
   - make AWG(, tdssine) work
   - check output hardware filter

By the way:
  fb is back. I don't know why. With help from Joe, I just compiled c1mcs again and again changing number of RFM channels.

  3844   Tue Nov 2 11:34:53 2010 josephb, alexUpdateCDSdiagconfd running, excitations back in dtt

Problem:

Diagnostic test tools was starting with errors.

Cause:

After the reboot of the frame builder machine yesterday by Alex, the diagconfd daemon was not getting started by xinetd.  There was a sequence error in the startup where xinetd was being called before mounting drives from linux1.

Important Note:

If you do not see the "nds" line you would not have diagnostic tests enabled in the DTT:

[controls@rosalba apps]$ diag -i | grep nds
nds * * 192.168.113.202 8088 * 192.168.113.202

Solution:

Alex changed /etc/xinetd.d/diagconfd file to point to /opt/apps/gds/bin/diagconfd instead of /opt/apps/bin/diagconf.  He also ensured xinetd started after mounting from linux1.

Alex's Suggestion:
My feeling is we should get rid of this feature and have an NDS address
entry box in the "Online" tab in the DTT with the default "nds". I
mentioned this to Jim Batch and he greed with me, so maybe he is going to
implement this. So maybe you guys want to request the same thing too, send
the request to Rolf and Jim, so we can have the last demon exercised.

  3845   Tue Nov 2 13:51:40 2010 josephb, yutaUpdateCDSRFM slowdown problem

Problem:

Each RFM memory location which needs to be read by a front end model slows the model significantly.

With no RFM memory locations to be read (replaced with grounds), the c1mcs model runs around 25 microseconds per cycle.

With 1 RFM memory location (MC_L), it runs around 29-33 microseconds.

With 3 RFM memory locations (MC_L, MC1_PIT, MC1_YAW), it runs around 45 microseconds.

With 7 RFM memory locations, the code generally doesn't run at all, going past the 62 microsecond maximum required to be able to keep up with the 16 kHz sample rate.

Last night Yuta somehow got it running with 7 RFM memory locations, but in that case, all the odd numbered RFM channels (1,3,5 as counted by the ipc file) did not work.  It was running at around 55 microseconds in that case.

The c1ioo code which is writing the data to the RFM card is experiencing no such slow down.

Current CDS status:

MC damp dataviewer diaggui AWG c1ioo c1sus c1iscex RFM Sim.Plant Frame builder  ...
                     ... 
  3846   Tue Nov 2 15:24:18 2010 josephbUpdateCDSc1ioo and c1mcs only sending MC_L, MC1_PIT, MC1_YAW

In order to have the c1mcs model run, we're running with only 3 RFM channels between c1ioo and c1mcs at the moment.  This leaves the model at around 45 microseconds, and at least lets us damp.

Alex and I still need to track down why the RFM read calls are taking so much time to execute.

  3849   Wed Nov 3 02:23:11 2010 yutaSummaryCDSchecking whitening filter board

Summary:
  Last night, I found that some of the input channels have wrong hardware filter switching(see elog #3842).
  So, to check the whitening board(D000210), I swapped the one with ok switching and bad switching.
  During the checking, I somehow broke the board.
  I fixed it, and now the status is the same as last night (or, at least look like the same).

What I did:
  1. Switching for SRM_LRSEN looked bad and every input channel for MC3 looked OK.
     So, I unplugged the whitening board for SRM (1X5-1-5B) and plugged it into MC3's place(1X5-1-8B).

  2. Ran WDWchecker.py for MC3. The switching seemed OK for every input channel, which means the whitening board was not the wrong one.

  3. Swapped back the whitening board as it was.

  4. Found MC3_ULSEN_OUT and MC3_LLSEN_OUT was keep showing negative value(they should be positive).

  5. Check the board and found that one of LT1125 for UL/LL was wrong (broken virtual ground).

  6. Replaced LT1125 and put the board back to 1X5-1-8B.

  7. Checked the board with WDWchecker.py and dataviewer 5-hour minute trend.
      The input signal came back to normal value(Attachment #1), MC3 damping working, input filter switching seems working

before LT1125 replacement after LT1125 replacement
C1:SUS-MC3_ULSEN_IN1: -2.4 dB [!]
C1:SUS-MC3_URSEN_IN1: 16.9 dB
C1:SUS-MC3_LRSEN_IN1: 15.4 dB
C1:SUS-MC3_LLSEN_IN1: -1.1 dB [!]
C1:SUS-MC3_SDSEN_IN1: 18.4 dB
C1:SUS-MC3_ULSEN_IN1: 18.2 dB
C1:SUS-MC3_URSEN_IN1: 17.6 dB
C1:SUS-MC3_LRSEN_IN1: 16.6 dB
C1:SUS-MC3_LLSEN_IN1: 17.1 dB
C1:SUS-MC3_SDSEN_IN1: 16.2 dB


Result:
  The whitening board seems OK.
  The wrong one is either wiring or RT model. Or, the checking script.

Attachment 1: MC3SEN.png
MC3SEN.png
  3852   Wed Nov 3 09:32:00 2010 ranaSummaryCDSEurocard board swapping

When swapping Eurocard boards, it is safest to first turn down the power supplies and then do the swapping.

Otherwise, it is sometimes the case that people plug in the board slowly and/or asymmetrically. This can cause the opamp to see one of the power supply rails without the other (e.g. +15 but not -15).

This can destroy some opamps. The true danger is that there may be damage to the board which you do not notice for several months, thereby leaving a timebomb for the next person.

Don't be an electronics terrorist!

  3853   Wed Nov 3 15:13:55 2010 josephbUpdateCDSTemporary RFM slow read work around

Problem:

Each RFM read in the c1mcs model is adding ~7 microseconds to the cycle time.  Adding too many pushes it over the 62 microsecond limit.

RFM writes do not have this problem.

Temporary Solution:

The fastest fix was to create a new front end model, called c1rfm, which does nothing but read in the MC1, MC2, MC3 PIT and YAW signals from the c1ioo machine, and then passes them along to the c1mcs model via shared memory, which is fast.

This means the data being sent is 2 cycles slow, one to go over the RFM, and one to go over shared memory.  It is running at 16384 cycles/second, so it shouldn't have much impact at the frequencies we use those channels for.

MC_L is still being sent directly to the c1mcs front end code via RFM.

Current Status:

The c1mcs model is running at  30-33 microseconds for CPU time.

The c1rfm model is running at 45-47 microseconds for CPU time.

All 7 channels, MC_L, MC1_PIT, MC1_YAW, MC2_PIT, MC2_YAW, MC3_PIT, MC3_YAW are responding.

The c1rfm model was added to the /diskless/root/etc/rtsystab file on the fb machine so that it automatically starts on the reboot of c1sus.

The USR and CPU time channels for c1rfm were added to the MCS_SLOW.ini file in /opt/rtcds/caltech/c1/chans/daq/ so that the framebuilder records them, namely:

[C1:FEC-38_USR_TIME]
[C1:FEC-38_CPU_METER]

The framebuilder was restarted to take these new channels into account.

Future:

Finish implementing and debugging the "round robin" RFM reader so as to not require a seperate model to be doing RFM reads in parallel.

Look into improving read speed by either merging timestamps and data into a single  or reading time stamp once every tenth or hundreth cycle, although this at best provides a factor of 2 improvement.

Check to see if RFM card being on the IO chassis or directly in the computer chassis makes a difference.

Get Alex and Rolf to improve RFM read speed.

  3855   Wed Nov 3 17:01:01 2010 josephbSummaryCDSComparison of RFM read times

Problem:

RFM reads are slow.  Rolf has said it should take 2-3 microseconds per read. 

c1sus is taking about 7 microseconds per read, twice as slow as Rolf's claim.

Hypothesis:

The RFM card is in the IO chassis, and is sharing the PCIe bus with 4 ADC cards, 3 DAC cards, 4 BO cards, and a BIO card.  Its possible this crowded bus is causing the reads to take even longer.

Test Results:

Compare read times between the c1sus computer, which has its RFM card in the IO chassis, to c1ioo, which has its RFM card in the computer.

c1ioo:

No RFM reads: 8 microseconds

3 RFM reads: 20 microseconds (~4 per read)

6 RFM reads: 32 microseconds (~4 per read)

c1sus:

No RFM read: 25 microseconds (bigger model)

1 RFM read: 33 microseconds (~8 per read)

3 RFM read: 45 microseconds (~7 per read)

6 RFM read: Over 62 microseconds, doesn't run.

Conclusion:

It looks like moving the RFM card may help by about a factor of 2 in read speed, although its still not quite what Alex and Rolf claim it should be.

The c1mcs and c1ioo models have been reverted to their normal operations.

 

  3857   Wed Nov 3 21:19:40 2010 yutaUpdateCDSput LOCKIN to c1ioo model and checked

(Joe, Yuta)

Summary:
  LOCKIN(consists of oscillator and demodulator) is needed for A2L measurement.
  So, we put LOCKIN to c1ioo model, whose outputs goes to c1mcs ASC.
  After that, I checked the functionality of LOCKIN by directly connecting DAC output for a coil to ADC input for MCL with BNC cable.

What we did:
[Putting LOCKING to c1ioo model]
1. Copied Simulink LOCKIN stuff(cdsOsc, Product, cdsPhase ...) from /cvs/cds/caltech/cds/rward-advLigo/src/epics/simLink/omc.mdl and put it into c1ioo model.

2. Copied MEDM screen file /cvs/cds/caltech/medm/c1/omc/C1OMC_LSC_LOCKIN.adl and modified it for our use.

[Checking LOCKIN]
3. Disconnected MC2_ULCOIL input to SOS Coil driver at 1X4-1-6A and checked the signal from software oscillator at c1ioo is coming.

4. Disconnected the cable labeled "MC OUT1" at 1X2 (which is MCL signal to ADC) and put MC2_ULCOIL output directly using long BNC cable.

5. Checked the functionality of LOCKIN by StripTool.
   The cable wiring did not conflict with my expectation.
   Software mixer is working.(frequency is doubled. X_SIN has offset and X_COS doesn't)

6. Put the cables back.

Result:

  Thanks to c1rfm, c1ioo and c1sus are talking without ADC timeout.
  Also, LOCKIN is working fine.

Attachment 1: Screenshot_C1IOO_LOCKIN.png
Screenshot_C1IOO_LOCKIN.png
  3860   Thu Nov 4 15:15:43 2010 josephbUpdateCDSModified feCodeGen.pl, fmseq.pl, and suspension screens

Feature Requested:

Have the CPU_meter change change color at various alarm levels.  These alarm levels have been set at 2/3 maximum for Minor alarm (yellow) and 9/10 maximum for Major alarm (red).

Implementation:

Rather than hand code each EPICS .db file to add the alarm files each time we rebuild the front ends, I decided to modify it at the source (since it strikes me as a generally useful alarm level for all front end codes).

First, I modified the feCodeGen.pl script.

I changed

print EPICS "OUTVARIABLE FEC\_$dcuId\_CPU_METER epicsOutput.cpuMeter int ai 0 field(HOPR,\"$rate\") field(LOPR,\"0\")\n";

to

print EPICS "OUTVARIABLE FEC\_$dcuId\_CPU_METER epicsOutput.cpuMeter int ai 0 field(HOPR,\"$rate\") field(LOPR,\"0\") field(HIGH,\"$two_thirds_rate\") field(HSV,\"MINOR\") field(HIHI,\"$     nine_tenths_rate\") field(HHSV,\"MAJOR\")\n";

I added the following two lines just before it as well:

$two_thirds_rate = int($rate * 2 / 3);

$nine_tenths_rate = int($rate * 9 / 10);

However, only the first four fields were actually added to the database file.  Apparently fmseq.pl, which populated the database, was hard coded to only handle up to 4 fields.

I modified the fmseq.pl script in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util/ so as to be able to handle up to 6 field values when writing EPICS .db files.

This change was accomplished by simply changing the following line

($junk, $v_name, $v_var, $v_type, $ve_type, $v_init, $v_efield1, $v_efield2, $v_efield3, $v_efield4 ) = split(/\s+/, $_);

to 

($junk, $v_name, $v_var, $v_type, $ve_type, $v_init, $v_efield1, $v_efield2, $v_efield3, $v_efield4, $v_efield5, $v_efield6 ) = split(/\s+/, $_);

everywhere it occurred.  There were something like 10 instances of it. Also, I added the two lines

        $vardb .= "    $v_efield5\n";
        $vardb .= "    $v_efield6\n";

after each set of

         $vardb .= "    $v_efield1\n";
         $vardb .= "    $v_efield2\n";
         $vardb .= "    $v_efield3\n";
         $vardb .= "    $v_efield4\n";

Lastly, I modified the CPU_METER bar graph on the C1SUS_DEFAULTNAME.adl screen (located in /opt/rtcds/caltech/c1/medm/master/) to use alarm levels, and then ran generate_master_screens.py.

 

  3861   Thu Nov 4 15:27:33 2010 josephb, yutaUpdateCDSc1ioo test points not working

Problem:

We can't access any of the c1ioo computer testpoints.  Dataviewer and DTT both fail to read any data from them.

According to diag -l (when run on Rosalba in /opt/apps), the testpoints are not being set.  Also at some point during the day when debugging, we also somehow messed up all the front end connections to the framebuilder.

Errors reported by dataviewer:

Server error 13: no data found
datasrv: DataWriteRealtime failed: daq_send: Illegal seek
Server error 12: no such net-writer
Server error 13: no data found
datasrv: DataWriteRealtime failed: daq_send: Illegal seek
Server error 12: no such net-writer
read(); errno=0
Server error 6532: invalid channel name
Server error 16080: unknown error
datasrv: DataWriteRealtime failed: daq_send: Illegal seek

Error reported by daqd.log:

[Thu Nov  4 13:29:35 2010] About to request `C1:IOO-MC1_PIT_IN1' 10022
on node 34
[Thu Nov  4 13:29:35 2010] Requesting 1 testpoints; tp[0]=10022; tp[1]=32531

[Thu Nov  4 13:29:35 2010] About to request `C1:SUS-MC2_SUSPOS_IN1'
10094 on node 36
[Thu Nov  4 13:29:35 2010] Requesting 1 testpoints; tp[0]=10094; tp[1]=32531

[Thu Nov  4 13:29:38 2010] ETIMEDOUT: test point `C1:IOO-MC1_PIT_IN1'
(tp_num=10022) was not set by the test point manager; request failed
[Thu Nov  4 13:29:38 2010] About to clear `C1:IOO-MC1_PIT_IN1' 10022 on node 34

Attempted Fixes:

Remove all daq related files: /opt/rtcds/caltech/c1/target/gds/param/tpchn_c1ioo.par and /opt/rtcds/caltech/c1/chans/daq/C1IOO.ini.  Rebuilt the front end code.

Double checked ethernet connection between c1ioo and the daq router and fb. 

Confirmed open mx was running on c1ioo. Confirmed awgtpman was running (although at one point I did find duplicate awgtpman running for c1x03, the IOP associated with c1ioo).

Rebooted the c1ioo machine. Confirmed all necessary codes came back.

Restarted the daqd process several times on the fb machine.

Current Status:

Framebuilder is running, and c1sus testpoints are available.  c1ioo test points are not.  Waiting to hear back from Alex on possible ideas.

  3863   Thu Nov 4 17:53:29 2010 yutaUpdateCDSprimitive python script for A2L measurement

Summary:
  I wrote a python script for A2L measurement.
 Currently it is really primitive, but I tested the basic functionality of the script.

 We already have A2L script(at /cvs/cds/rtcds/caltech/c1/scripts/A2L) that uses ezlockin, but python is more stable and easy to read.

A2L measurement method:
  1. Dither a optic using software oscillator in LOCKIN and demodulate the length signal by that frequency.
  2. Change coil output gains to change the pivot of the dithering and do step 1.
  3. Coil output gain set that gives the smallest demodulated magnitude tells you where the current beam spot is.

  Say you are dithering the optic in PIT and changing the coil gains keeping UL=UR and LL=LL.
  If the coil gain set UL=UR=1.01, LL=LR=-0.99 gives you demodulated magnitude 0, that means the current beam spot is 1% upper than the center, compared to 1/2 of UL-LL length.
  You do the same thing for YAW to find horizontal position of the beam.

Description of the script:
  Currently, the script lives at /cvs/cds/caltech/users/yuta/scripts/A2L.py
  If you run;
     ./A2L.py MC1 PIT
  it gives you vertical position of the beam at MC1.

  It changes the TO_COIL matrix gain by "DELTAGAINS", turns on the oscillator, and get X_SIN, X_COS from C1IOO_LOCKIN.
  Plots DELTAGAINS vs X_SIN/X_COS and fit them by y=a+bx+cx^2.(Ideally, c=0)
  Rotates (X_SIN, X_COS) vectors to get I-phase and Q-phase.
    (I,Q)=R*(X_SIN,X_COS)
  Rotation angle is given by;
    rot=arctan(b(X_COS)/b(X_SIN))
  which gives Q 0 slope(Ideally, Q=0).
  x-intercept of DELTAGAINS vs I plot gives the beam position.

Checking the script:
  1. I used the same setup when I checked LOCKIN(see elog #3857). C1:SUS-MC2_ULCOIL output goes directly to C1:IOO-LOCKIN_SIG input.

  2. Set oscillator frequency to 18.13Hz, put 18.13Hz band-pass filter to C1:IOO-LOCKIN_SIG filter module, and put 1Hz low-pass filter to C1:IOO-LOCKIN_X_SIN/X_COS filter modules.
        Drive frequency 18.13Hz is same as the previous script(/cvs/cds/rtcds/caltech/c1/scripts/A2L/A2L_MC2).

  3. Ran the script. Checked that Q~0 and rot=-35deg.

  4. Put phase shifting filter to C1:IOO-LOCKIN_SIG filter module and checked Q~0 and rotation angle.
     fitler rot(deg)
     w/o    -35
     +90deg  45
     -90deg  56
     -45deg -80

  5. Put some noise in C1:SUS-MC2_ULCOIL by adding SUSPOS feedback signal and ran the script.(Attachment #1)
      During the measurement, the damping servo was off, so SUSPOS feedback signal can be treated as noise.

Conclusion:
  The result from the test measurement seems reasonable.
  I think I can apply it to the real measurement, if MCL signal is not so noisy.[status: yellow]

Plan:
  - add calculating coherence procedure, averaging procedure to the script
  - add setting checking procedure to the script
  - apply it to real A2L measurement

Bay the way:
  Computers in the control room is being so slow (rossa, allegra, op440m, rosalba). I don't know why.

Attachment 1: a2ltest.png
a2ltest.png
  3869   Fri Nov 5 15:20:18 2010 josephb, alexSummaryCDS40m computer slow down solved

Problem:

The 40m computers were responding sluggishly yesterday, to the point of being unusable.

Cause:

The mx_stream code running on c1iscex (the X end suspension control computer) went crazy for some reason.  It was constantly writing to a log file in /cvs/cds/rtcds/caltech/c1/target/fb/192.168.113.80.log.  In the past 24 hours this file had grown to approximately 1 Tb in size. The computer had been turned back on yesterday after having reconnected its IO chassis, which had been moved around last week for testing purposes - specifically plugging the c1ioo IO chassis in to it to confirm it had timing problems.

Current Status:

The mx_stream code was killed on c1iscex and the 1 Tb file removed.

Computers are now more usable.

We still need to investigate exactly what caused the code to start writing to the log file non-stop.

Update Edit:

Alex believes this was due to a missing entry in the /diskless/root/etc/hosts file on the fb machine.  It didn't list the IP and hostname for the c1iscex machine.  I have now added it.  c1iscex had been added to the /etc/dhcp/dhcpd.conf file on fb, which is why it was able to boot at all in the first place.  With the addition of the automatic start up of mx_streams in the past week by Alex, the code started, but without the correct ip address in the hosts file, it was getting confused about where it was running and constantly writing errors.

Future:

When adding a new FE machine, add its IP address and its hostname to the /diskless/root/etc/hosts file on the fb machine.

  3870   Fri Nov 5 16:40:22 2010 josephb, alexUpdateCDSc1ioo now has working awgtpman

Problem:

We couldn't set testpoints on the c1ioo machine.

Cause:

Awgtpman was getting into some strange race condition.  Alex added an additional sleep statement and shifted boot order slightly in the rc.local file.  This apparently is only a problem on c1ioo, which is a Sun X4600.  It was using up 100% of a single CPU for the awgtpman process.

Status:

We now have c1ioo test points working.

Future:

Need to examine the startc1ioo script and see if needs a similar modification, as that was tried at several points but yielded a similar state of non-functioning test points. For the moment, reboot of c1ioo is probably the best choice after making modifications to the c1ioo or c1x03 models.

Current CDS status:

MC damp dataviewer diaggui AWG c1ioo c1sus c1iscex RFM Sim.Plant Frame builder  TDS
                     
  3872   Fri Nov 5 21:49:12 2010 ranaSummaryCDS40m computer slow down solved

Quote:

Problem:

The 40m computers were responding sluggishly yesterday, to the point of being unusable.

Cause:

The mx_stream code running on c1iscex (the X end suspension control computer) went crazy for some reason.  It was constantly writing to a log file in /cvs/cds/rtcds/caltech/c1/target/fb/192.168.113.80.log.  In the past 24 hours this file had grown to approximatel

The moral of the story is, PUT THINGS IN THE ELOG. This wild process is one of those things where people say 'this won't effect anything', but in fact it wastes several hours of time.

  3877   Sat Nov 6 16:13:14 2010 ranaSummaryCDS40m computer slow down solved

As part of the effort to debug what was happening with the slow computers, I disabled the auto MEDM snapshots process that Yoichi/Kakeru setup some long time ago:

https://nodus.ligo.caltech.edu:30889/medm/screenshot.html

We have to re-activate it now that the MEDM screen locations have been changed. To do this, we have to modify the crontab on nodus and also the scripts that the cron is calling. I would prefer to run this cron on some linux machine since nodus starts to crawl whenever we run ImageMagick stuff.

Also, we should remember to start moving the old target/ directories into the new area. All of the slow VME controls are still not in opt/rtcds/.

  3880   Mon Nov 8 14:30:15 2010 josephb, yutaUpdateCDSAdded LIGONDSIP setting to cshrc.40m

We added the following line to the cshrc.40m file in the 64-bit linux and 32-bit linux sections:

setenv LIGONDSIP fb

This allows codes like tdsdmd to work properly on the linux machines (seemed to already work fine on the solaris op440m without this change).

  3891   Thu Nov 11 04:32:53 2010 yutaSummaryCDSfound poor contact of DAC cable, previous A2L results were wrong

(Koji, Jenne, Yuta)

We found one of DAC cables had a poor contact.
That probably caused our too much "tilt" of the beam into MC.

Story:
  From the previous A2L measurement and MC aligning, we found that the beam is somehow vertically "tilted" so much.
  We started to check the table leveling and the beam height and they looked reasonable.
  So, we proceeded to check coil balancings using optical levers.
  During the setup of optical levers, I noticed that VMon for MC1_ULCOIL was always showing -0.004 even if I put excitation to coils.
  It was because one of the DAC cables(labeled CAB_1Y4_88) had a poor contact.
  If I push it really hard, it is ok. But maybe we'd better replace the cable.

What caused a poor connection?:
  I don't know.
  A month ago, we checked that they are connected, but things change.

How to prevent it:
  I made a python script that automatically checks if 4 coils are connected or not using C1:IOO_LOCKIN oscillator.
  It is /cvs/cds/caltech/users/yuta/scripts/coilchecker.py.
  It turns off all 3 coils except for the one looking at, and see the difference between oscillation is on and off.
  The difference can be seen by demodulating SUSPOS signal by oscillating frequency.
  If I intentionally unplug CAB_1Y4_88, the result output for MC1 will be;

==RESULTS== (GPS:973512733)
MC1_ULCOIL      0.923853382664 [!]
MC1_URCOIL      38.9361304794
MC1_LRCOIL      55.4927575177
MC1_LLCOIL      45.3428533919


Plan:
 - Make sure the cable connection and do A2L and MC alignment again
 - Even if the cables are ok, it is better to do coil balancings. See the next elog.

  3895   Thu Nov 11 11:51:30 2010 KojiSummaryCDSfound poor contact of DAC cable, previous A2L results were wrong

The cause is apparent! The connectors on the cables are wrong!
Currently only 50% of the pin length goes into the connector!

Quote:

(Koji, Jenne, Yuta)

We found one of DAC cables had a poor contact.
That probably caused our too much "tilt" of the beam into MC.

  It was because one of the DAC cables(labeled CAB_1Y4_88) had a poor contact.
  If I push it really hard, it is ok. But maybe we'd better replace the cable.

What caused a poor connection?:
  I don't know.
  A month ago, we checked that they are connected, but things change.

 

  3900   Thu Nov 11 21:07:49 2010 josephbUpdateCDSPlugged c1iscex into DAQ network - still causes network slowdown

I connected the c1iscex computer to the dedicated DAQ network switch (located in 1X7).

This does not seem to have helped c1iscex stop spewing out "OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)" at the rate of ~1 Gigabyte per minute.

c1iscex is currently off until a solution can be found.

  3901   Thu Nov 11 23:35:23 2010 KojiSummaryCDSfound poor contact of DAC cable, previous A2L results were wrong

[Koji / Yuta]

There were the guys who used the PENTEK 40pin connectors into the IDC 40pin connectors.
Those connectors are not compatible at all.

==> We replaced the connectors on the cables from DAC to IDC adapters to the dewhitening board for the vertex SUSs.

In addition, I found one of the Binary OUT IDC50pin connector has no clamp.

==> We put the IDC50pin clamp on it.

bad-boys.png

PENTEK connectors were inserted. The latches are not working!

IMG_3698.jpg

the vertical pitch is different between PENTEK and IDC!
IMG_3705.jpg

Wow! Where is the clamp???

IMG_3706.jpg

Quote:

The cause is apparent! The connectors on the cables are wrong!
Currently only 50% of the pin length goes into the connector!

Quote:

(Koji, Jenne, Yuta)

We found one of DAC cables had a poor contact.
That probably caused our too much "tilt" of the beam into MC.

  It was because one of the DAC cables(labeled CAB_1Y4_88) had a poor contact.
  If I push it really hard, it is ok. But maybe we'd better replace the cable.

What caused a poor connection?:
  I don't know.
  A month ago, we checked that they are connected, but things change.

 

 

  3906   Fri Nov 12 10:49:34 2010 josephb, valeraUpdateCDSTest of ADC noise

Test:

Look at the effects of the ADC voltage range on the ADC noise floor.

ADC input was terminated with 50 ohms.  We then looked at the channel with DTT. This was at +/- 10 V range.  We used C1:SUS-PRM_SDSEN_IN1 as the test channel.

The map.c file (in /opt/rtcds/caltech/c1/core/advLigoRTS/src/fe/ ) then had two lines added at line 766.

//JCB temporary 2.5V test, remove me
  adcPtr[devNum]->BCR &= 0x84240;

This hard coded the 2.5 V range (we default to the 10 V range at the moment).

We then rebuilt the c1x02 model and reran the test.

Finally, we reverted the code change to map.c and rebuilt c1x02.

Results:
I've attached the DTT output of the two tests.

It appears the ADC is limited by 1.6 uV/rtHz.  Hence the increase in noise in counts by a factor of 4 when we drop to +/- 2.5 V from +/- 10 V.

Attachment 1: ADC_noise.pdf
ADC_noise.pdf
  3910   Fri Nov 12 19:24:56 2010 KojiUpdateCDSTest of ADC noise

[Koji Yuta]

We found one of the ADC cables were left unconnected. This left the MC suspensions uncontrollable through the whole afternoon.
Please keep the status updated and don't forget to revert the configuration...

Quote:

Test:

Look at the effects of the ADC voltage range on the ADC noise floor.

ADC input was terminated with 50 ohms.  We then looked at the channel with DTT. This was at +/- 10 V range.  We used C1:SUS-PRM_SDSEN_IN1 as the test channel.

The map.c file (in /opt/rtcds/caltech/c1/core/advLigoRTS/src/fe/ ) then had two lines added at line 766.

//JCB temporary 2.5V test, remove me
  adcPtr[devNum]->BCR &= 0x84240;

This hard coded the 2.5 V range (we default to the 10 V range at the moment).

We then rebuilt the c1x02 model and reran the test.

Finally, we reverted the code change to map.c and rebuilt c1x02.

Results:
I've attached the DTT output of the two tests.

It appears the ADC is limited by 1.6 uV/rtHz.  Hence the increase in noise in counts by a factor of 4 when we drop to +/- 2.5 V from +/- 10 V.

 

  3912   Sat Nov 13 15:53:05 2010 yutaUpdateCDSdiagonalization of MC input matrix

Motivation:
  MC is aligned from the A2L measurement, but to do the beam centering more precisely, we need coils to be balanced.
  There are several ways to balance the coils, like using oplev or WFS QPD RF channels.
  But oplev takes time to setup, especially for MC3. Also, c1ioo WFS channels were newly setup and haven't been checked yet.
  So, I decided to use OSEM sensors.
  An OSEM sensor itself is sensitive to every DOF of an optic motion, but we can diagonalize them using 4 OSEM sensors and proper input matrix.

Method:

  1. Measure transfer functions between
     ULSEN and URSEN (H_UR(f))
     ULSEN and LRSEN (H_LR(f))
     ULSEN and LLSEN (H_LL(f))

  2. Make a matrix A.

    A =  [[ 1           1           1          ]
          [ H_UR(f_pos) H_UR(f_pit) H_UR(f_yaw)]
          [ H_LR(f_pos) H_LR(f_pit) H_LR(f_yaw)]
          [ H_LL(f_pos) H_LL(f_pit) H_LL(f_yaw)]]


    where f_dof are resonant frequencies.

  3. A is

    s = Ad


   where vectors s^T=[ULSEN URSEN LRSEN LLSEN] and d^T=[POS PIT YAW].
   So,

    d = Bs = (A^TA)^(-1)A^Ts

   where A^T is transpose of A.

   B is the input matrix that diagonalizes 3 DOFs.

What I did:

  1. Measured the TFs using diaggui and exported as ASCII.

  2. Made a script that reads that TF file, calculates and sets a new input matrix B.
    /cvs/cds/caltech/users/yuta/scripts/inputmatrixoptimizer.py
   You need to set resonant frequencies to use the script.

   New input matrices for MCs are;

C1:SUS-MC1_INMATRIX
[[ 1.17649712  0.94315611  0.85065054  1.02969624]
 [ 0.55939288  1.28066594 -0.85235358 -1.3075876 ]
 [ 1.23467139 -0.74521928 -1.29394051  0.72616882]]

C1:SUS-MC2_INMATRIX
[[ 1.12630748  1.01451545  0.9013457   0.95783137]
 [ 1.03043025  0.67826036 -1.37270598 -0.91860341]
 [ 0.83546271 -1.26311029 -0.6456881   1.2557389 ]]

C1:SUS-MC3_INMATRIX
[[ 1.18212117  1.26419447  0.77744155  0.77624281]
 [ 0.79344415  0.84959646 -1.10946339 -1.247496  ]
 [ 1.00225331 -0.84807863 -1.21772132  0.93194674]]


  I ignored SIDE this time.

Result:

  Spectra of each SUSDOF_IN1_DAQ before diagonalization (INMATRIX elements all 1 or -1) were
MCspectraNov09.png

  After diagonalization, spectra are
MCspectraNov13.png

  As you can see, each SUSDOF has only single peak (and SIDE peak) after the diagonalization.
  SUSSIDE still has 4 peaks because SIDE is not included this time.

  For MC2, POS to SUSPIT and POS to SUSYAW got worse. I have to look into them.

Effect of resonant frequency drift:

  As you can compare and see from the spectra above, resonant frequencies of MC1 are somehow drifted(~0.5%) from Nov 9 to Nov 13.
  If resonant frequency you expected was wrong, calculated input matrix will be also wrong.
  The effect of 0.5% drift and wrong input matrix can be seen from this spectra. DOFs are not clearly separated.
 MC1spetra_wrongmatrix.png

Plan:

 - learn how to use diaggui from command line and fully automate this process
 - balance the coils using these diagonalized SUSPOS, SUSPIT, SUSYAW

  3915   Sun Nov 14 11:56:59 2010 valeraUpdateCDSTest of ADC noise

 

We missed a factor of 2 in the ADC calibration: the differential 16 bit ADC with +/-10 V input has 20 V per 32768 counts (1 bit is for the sign). I confirmed this calibration by directly measuring ADC counts per V.

So the ADC input voltage noise with +/-10V range around 100 Hz is 6.5e-3 cts/rtHz x 20V/32768cts =  4.0 uV/rtHz. Bummer. 

The ADC quantization noise limit is 1/sqrt(12 fs/2)=1.6e-3 cts/rtHz. Where the ADC internal sampling frequency is fs=64 kHz. If this would be the limiting digitization noise source then the equivalent ADC input voltage noise would be 1 uV/rtHz with +/-10 V range.

  3919   Mon Nov 15 11:13:12 2010 josephbUpdateCDSModified rc.local to not start mx_streams automatically

Problem:

c1iscex floods the network with about 1 gigabyte of error messages in a few seconds, writing to a log file in /opt/rtcds/caltech/c1/target/fb/logs/

Temporary change:

I commented out the following line in the rc.local file on the fb machine in the /diskless/root/etc/ directory:

#nice --20 ./mx_stream -s "$SYSTEMS" -d fb:0 >& logs/$HOSTNAME.log&

This disables the automatic start up of the mx_streams code on all the front ends.  This will prevent the network being brought to its knees by c1iscex while we debug the problem.

It also means on a reboot of the front ends, the mx_stream process needs to be started by hand until this change is reverted.

To do this, log into the front end and then change directory to /opt/rtcds/caltech/c1/target/fb

For c1sus, run:

./mx_stream -s c1x02 c1sus c1mcs c1rms c1rfm  -d fb:0

For c1ioo, run:

./mx_stream -s c1x03 c1ioo -d fb:0

 

  3926   Mon Nov 15 16:26:46 2010 josephbUpdateCDSc1iscex is now running and the network hasn't died

Problem:

c1iscex was spamming the network with error messages.

Solution:

Updated the front end codes to current standards (they were on the order of months out of date).  After fixing them up and rebuilding the codes on c1iscex, it no longer had problems connecting to the frame builder.\

Status:

I can look at test points for ETMX.  It is not currently damping however.

To Do:

Move filters for ETMX into the correct files. 

Need to add a Binary output blue and gold box to the end rack, and plug it into the binary output card.  Confirm the binary output logic is correct for the OSEM whitening, coil dewhitening, and QPD whitening boards. 

Get ETMX damped.

Figure out what we're going to do with the aux crate which is currently running y-end code at the new x-end.  Koji suggested simply swapping auxilliary crates - this may be the easiest.  Other option would be to change the IP address, so that when it PXE boots it grabs the x-end code instead of the y-end code.

Current CDS status:

MC damp dataviewer diaggui AWG c1ioo c1sus c1iscex RFM Sim.Plant Frame builder TDS
                     
  3935   Tue Nov 16 21:42:31 2010 ranaUpdateCDSScreen Time Fix
I learned today that the following python code will do a find/replace to fix the TIME string on any MEDM screen which has a whited out time field.
Previously, this field was sourced from the c1dscepics of c1losepics process. Now we have to get it from the IOO or SUS front ends

Here's the python code:

import re
o = open("output.adl","w")
data = open("test.adl").read()
o.write( re.sub("C0:TIM-PACIFIC_STRING","C1:FEC-34_TIME_STRING",data)  )
o.close()

Where 'output.adl' could be the same name as 'test.adl' if you want to
replace the existing file. Also FEC-34 just refers to which FE you're running.
It could, in principle, be any one of them.
 

The next step is to figure out how to apply this to all the files in a directory.

  3938   Wed Nov 17 10:39:20 2010 josephbUpdateCDSScreen Time Fix

An improved python code to apply a replacement to all *.adl files in a directory would be:

import re, os
files = os.listdir("./")
  for file in files:
    if ".adl" in str(file):
      data = open(file).read()
      o = open(file,"w")
      o.write( re.sub("C0:TIM-PACIFIC_STRING","C1:FEC-34_TIME_STRING",data)  )
      o.close()

Of course, this entire python script can be replaced with a single sed command:

sed -i 's/C0:TIM-PACIFIC_STRING/C1:FEC-34_TIME_STRING/g' *

A more complicated script could be written which looks for key identifiers either in the file header or inside the file to determine which front end is appropriate, using a dictionary like:

dcuid_dict = {"BS":21,"PRM":37,"SRM":37,"ITMX":21,"ITMY":21,"MC1":36,"MC2":36,"MC3":36,"ETMX":24,"ETMY":26}

and then using for loops and if statements.

 

  3940   Wed Nov 17 16:02:30 2010 josephbUpdateCDSModified feCodeGen.pl to fix filtMuxMatrix name generation

Problem:

Sometime in the last 3 weeks, probably when Alex brought his latest changes from Hanford to the 40m and did an SVN update, the code which generates the names of the filter .adl files links for the overall matrix view broke.

Fix:

I modified FE code gen to use $basename instead of the base name after the top name transform (this changes _ to - after the first 3 letters

@@ -3520,11 +3522,11 @@
 
                  my $tn = top_name_transform($basename);
                  my $basename1 = $usite . ":" . $tn . "_";
-                 my $filtername1 = $usite . $tn;
+                 my $filtername1 = $usite . $basename;

Still having problems:

The filter modules built with the matrix of filter modules run (offests/gains work), but will not load filter coefficients/filter names.  All the other filter modules outside the matrix seem to load fine.  At this point, doing a rebuild of any of the front end machines may cause the A2L filter banks to be unloadable.

 

  3941   Wed Nov 17 20:44:59 2010 yutaSummaryCDSno QPD channels on c1sus machine today

(Joe, Suresh, Yuta)

Currently, only 2 ADC cards work on c1sus machine.
No QPD inputs(e.g. MC2 trans), and no RFM.


Summary:
  We wanted to have PEM(physical environment montor) channels, so we moved a ADC card in c1sus machine.
  It ended up with destroying one of the 3 ADCs.

What we did:
  1. Moved ADC card at PCIe expansion board slot 0 to other empty slot.
     What we call PCI slot 0 was "DO NOT USE" in LIGO-T10005230-v1, so we moved it.

  2. Connected that ADC card to PEM channel box at 1X7 via SCSI cable.

  3. ADC card order is changed, so we checked ADC number assinging and re-labeled the cable.

  4. Found RFM is not working(c1sus and c1ioo not talking) and fb is in a weird state(Status: 0x4000 in GDS screens)

  5. Swapped the cabling so that ADC card 0 will be connected to timing interface card at slot1, but didn't help.
     More than that, we suffered ADC timeout.

  6. Tried ADC card swapping, slot position changing, taking out some of the ADC cards, etc.
     We found that ADC timeout doesn't happen with 2 ADC cards.
     But if we connect one of the ADC card to the timing interface card at slot 8, c1sus ADC timeouts with 2 ADC cards, too.
     So, I think that timing interface card is bad.

  7. Stopped rebooting c1sus again and again. We decided to investigate the problem tomorrow.
     We only need ADC card 0 and 1 for MC damping.(see this wiki page)
       ADC card 0: all UL/UR/LR/LL SENs
       ADC card 1: all SD SENs     
       ADC card 2: all QPDs

Result:
  We can damp optics and lock MC.
  We can't do A2L because RFM is not working.
  We can't see MC2 trans because we currently don't have ADC card 2.

  3945   Thu Nov 18 11:06:20 2010 josephbUpdateCDSc1sus and ADCs

Problem:

ADCs are timing out on c1sus when we have more than 3.

Talked with Rolf:

Alex will be back tomorrow (he took yesterday and today off), so I talked with Rolf.

He said ordering shouldn't make a difference and he's not sure why would be having a problem. However, when he loads the chassis, he tends to put all the ADCs on the same PCI bus (the back plane apparently contains multiples).  Slot 1 is its own bus, Slots 2-9 should be the same bus, and 10-17should be the same bus.

He also mentioned that when you use dmesg and see a line like "ADC TIMEOUT # ##### ######", the first number should be the ADC number, which is useful for determining which one is reporting back slow.

Plan:

Disconnect c1sus IO chassis completely, pull it out, pull out all cards, check connectors, and repopulate with Rolf's suggestions and keeping this elog in mind.

In regards to the RFM, it looks like one of the fibers had been disconnected from  the c1sus chassis RFM card (its plugged in in the middle of the chassis so its hard to see) during all the plugging in and out of the cables and cards last night.

  3946   Thu Nov 18 14:05:06 2010 josephb, yutaUpdateCDSc1sus is alive!

Problem:

We broke c1sus by moving ADC cards around.

Solution:

We pulled all the cards out, examined all contacts (which looked fine), found 1 poorly connected cable internally, going between an ADC and ADC timing interface card  (that probably happened last night), and one of the two RFM fiber cables pulled out of its RFM card.

We then placed all of the cards back in with a new ordering, tightened down everything, and triple checked all connections were on and well fit.

 

Gotcha!

Joe forgot that slot 1 and slot 2 of the timing interface boards have their last channels reserved for duotone signals.  Thus, they shouldn't be used for any ADCs or DACs that need their last channel (such as MC3_LR sensor input).  We saw a perfect timing signal come in through the MC3_LR sensor input, which prevented damping. 

We moved the ADC timing interface card out of the 1st slot  of the timing interface board and into slot 6 of the timing interface board, which resolved the problem.

Final Configuration:

 

 Timing Interface Board

Timing Interface Slot 1 (Duotone) 2 (Duotone) 3 4 5 6 7 8 9 10 11 12 13
Card None DAC interface (can't use last channel) ADC Interface ADC interface ADC interface

ADC

interface

None None None DAC interface DAC interface None None

 PCIe Chassis

Slot 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
PCIe Number Do Not Use 1 6 5 4 9 8 7 3 2 14 13 12 17 16 15 11 10
Card None ADC DAC ADC ADC ADC BO BO BO BO DAC DAC BIO RFM None None None None

Still having Issues with:

ITM West damps.  ITM South damps, but the coil gains are opposite to the other optics in order to damp properly.

We also need to look into switching the channel names for the watchdogs on ITMX/Y in addition to the front end code changes.

  3947   Thu Nov 18 14:19:01 2010 josephbUpdateCDSSwapped c1auxex and c1auxey codes

Problem:

We had not switched the c1aux crates when we renamed the arms, thus the watchdogs labeled ETMX were really watching ETMY and vice-versa.

Solution:

I used telnet to connect to c1auxey, and then c1auxex.

I used the bootChange command to change the IP address of c1auxey to 192.168.113.59 (c1auxex's IP), and its startup script.  Similarly c1auxex was changed to c1auxey and then both were rebooted.

 

c1auxey > bootChange

'.' = clear field;  '-' = go to previous field;  ^D = quit

boot device          : ei
processor number     : 0
host name            : linux1
file name            : /cvs/cds/vw/mv162-262-16M/vxWorks
inet on ethernet (e) : 192.168.113.60:ffffff00 192.168.113.59:ffffff00
inet on backplane (b):
host inet (h)        : 192.168.113.20
gateway inet (g)     :
user (u)             : controls
ftp password (pw) (blank = use rsh):
flags (f)            : 0x0
target name (tn)     : c1auxey c1auxex
startup script (s)   : /cvs/cds/caltech/target/c1auxey/startup.cmd /cvs/cds/caltech/target/c1auxex/startup.cmd
other (o)            :

value = 0 = 0x0

c1auxex > bootChange

'.' = clear field;  '-' = go to previous field;  ^D = quit

boot device          : ei
processor number     : 0
host name            : linux1
file name            : /cvs/cds/vw/mv162-262-16M/vxWorks
inet on ethernet (e) : 192.168.113.59:ffffff00 192.168.113.60:ffffff00
inet on backplane (b):
host inet (h)        : 192.168.113.20
gateway inet (g)     :
user (u)             : controls
ftp password (pw) (blank = use rsh):
flags (f)            : 0x0
target name (tn)     : c1auxex c1auxey
startup script (s)   : /cvs/cds/caltech/target/c1auxex/startup.cmd /cvs/cds/caltech/target/c1auxey/startup.cmd
other (o)            :

value = 0 = 0x0

  3948   Thu Nov 18 16:32:21 2010 yutaSummaryCDScurrent damping status for all optics c1sus handles

Summary:
   I set Q-values for each ringdown of PRM, BS, ITMX, ITMY, MC1, MC2, MC3 to ~5 using QAdjuster.py.
   Here are the results;
c1susdampings.png

  Red ringdowns indicate the second try after gain setting.

Note:

  - ITMX and ITMY are referred according to MEDM screens in this entry.
  - ITMX(south) OSEM positions are currently so bad(LL and SD are all the way in/out).
        I have to change IFO_ALIGN slider values to check the damping servo. For SIDE, I couldn't do that. I reverted the slider change after the damping checking.
  - ITMY(west) somehow has opposite coil gain sign.
       Usually for the other optics, UL,UR,LR,LL is 1,-1,1,-1. But for ITMY to damp, they are -1,1,-1,1.
  - PRM damps, but ringdown doesn't look nice. There must be something funny going on.
  - SRM doesn't have OSEMs put in now.

  3954   Fri Nov 19 12:53:50 2010 josephbUpdateCDSTestpoints on c1iscex now working

Problem:

c1iscex did not have test points working last night.

Solution:

The diag -i command indicated that :

awg 19 0 192.168.113.80 822095891 1 192.168.113.80

awg 45 0 192.168.113.80 822095917 1 192.168.113.80

The first number after the awg should be the DCUID number.  The IP address 192.168.113.80 corresponds to c1iscex.  So we had awg and testpoints setup for DCUI 19 and 45 on c1iscex.  DCUID 19 is c1x01 (the IOP), but 45 was used for a test awhile back. 

Turns out that in the testpoint.par file located in /cvs/cds/rtcds/caltech/c1/target/gds/param, there were two entries for c1scx, one with DCUID 24 and also DCUID 45.  The model at the time was running with DCUID 24.

So I changed the model DCUID to 45, deleted the [C-node24] entry in the testpoint.par file, and restarted the machine, and also did a "telnet fb 8088" and "shutdown" to restart the frame builder.

  3957   Fri Nov 19 17:12:22 2010 yutaUpdateCDSETMX damped, but with weird TO_COIL matrix

Background:
  c1iscex machine is currently being setup and RT model c1scx is running.
  But ETMX(south) didn't seem to be damped, so I checked it.

What I did:
  1. Checked the wiring. It seemed to be OK.
    Looked LEMO monitor output of SUS PD Whitening Board(D000210) with oscilloscope and they seemed to be getting some sensor signal except SDSEN.
      SDSEN is funny. C1:SUS-ETMX_SPDMon decreases slowly when PD input cable is disconnected, and increases slowly when connected.
      There might be some problem in the circuits.
    Looked LEMO monitor output of SOS Coil Driver Module(D010001) with oscilloscope and they seemed to be receiving correct signal from DAC.
      When ULCOIL offset is added, ch1 increased and so on.

  2. Checked the direction of SUSDOF motion when kicked with one coil.
    The result was;

kick (+) POS PIT YAW
ULCOIL + + +
URCOIL + - +
LRCOIL + - -
LLCOIL + + -


    This table tells you, when ULCOIL_OFFSET increases, SUSPOS increases and so on.
    If URCOIL and LLCOIL are swapped, they look correct.
    Also, they have opposite sign to the usual optics(e.g. MCs, BS, PRM).

  3. Changed TO_COIL matrix according to the table above(see Attachment #1). Changed signs of XXCOIL_GAINs.

  4. ETMX damped!

Plan:
  - Check the wiring after SOS Coil Driver Module and circuit around SDSEN
  - Check whitening and dewhitening filters. We connected a binary output cable, but didn't checked them yet.
  - Make a script for step 2
  - Activate new DAQ channels for ETMX (what is the current new fresh up-to-date latest fb restart procedure?)

Attachment 1: ETMXdamping.png
ETMXdamping.png
  3959   Sat Nov 20 01:58:56 2010 yutaHowToCDSeditting RT models and MEDM screens

(Suresh, Yuta)

If you come up with a good idea and want to add new things to current RT model;

 1. Go to simLink directory and open matlab;
    cd /cvs/cds/rtcds/caltech/c1/core/advLigoRTS/src/epics/simLink
    matlab


 2. In matlab command line, type;
    addpath lib

 3. Open a model you want to edit.
    open modelname

 4. Edit! CDS_PARTS has useful CDS parts.
    open CDS_PARTS
    There are some traps. For example, you cannot put cdsOsc in a subsystem

 5. Compile your new model. See my elog #3787.
 
 6. If you want to burt restore things;
    cd /cvs/cds/caltech/burt/autoburt/snapshots/YEAR/MONTH/DATE/TIME/
    burtgooey


 7. Edit MEDM screens
    cd /cvs/cds/rtcds/caltech/c1/medm
    medm


 8. Useful wiki page on making a new suspension MEDM screens;
    http://lhocds.ligo-wa.caltech.edu:8000/40m/How_to_make_new_suspension_medm_screens

  3960   Sat Nov 20 02:25:30 2010 yutaUpdateCDS2 LOCKINs for suspension models

(Suresh, Koji, Yuta)

Background:
  No AWG. No tdssine.
  ...... LOCKIN!

What we did:
  1. Added 2 LOCKINs for c1sus model.
   Currently, we cannot put cdsOsc in a subsystem.
   So, we put LOCKINs just for BS for a test.
   The signal going into LOCKIN can be anything. For now, we just put a matrix for selecting the signal and connected the input signals to the ground.

   See the following page for the current simlink diagram of c1sus model.
     https://nodus.ligo.caltech.edu:30889/FE/c1sus_slwebview_files/index.html

  2. Edited MEDM screens. (see Attachment #1)

Result:
  We succeeded in putting 2 LOCKINs and exciting BS.
  During the update, we might destroyed things. For example, fb status is red in GDS screens.
  We will wait for Joe to fix them.

Plan:
 - Fix cdsOsc and put LOCKINs for all the other optics
 - Come up with a good idea what to do with this LOCKIN. Remember, LOCKIN is not just a replacement for excitation points.
 - Enhance an oscillator so that we can put a random noise

Attachment 1: LockinRoll.png
LockinRoll.png
  3961   Sat Nov 20 03:37:11 2010 yutaSummaryCDSCDS time delay measurement - the ripple

(Koji, Joe, Yuta)

Motivation:
  We wanted to know more about CDS.

Setup:
  Same as in elog #3829.

What we did:

  1. Made test RT models c1tst and c1nio for c1iscex.
     c1tst has only 2 filter module(minimum limit of a model), 2 inputs, 2 outputs and it runs with IOP c1x01.
     c1nio is the same as c1tst except it runs(or, should run) without IOP.

  2. Measured the time delay of ADC through DAC using different machine, different sampling rate by measuring transfer functions.

  3. c1nio(without IOP) didn't seem to be running correctly and we couldn't measure the TF.
     "1 PPS" error appeared in GDS screen(C1:FEC-39_TIME_ERR).
     It looks like c1nio is receiving the signal as we could see in the MEDM screen, but the signal doesn't come out from the DAC.

TF we expected:
  All the filters and gains are set to 1.

  We have DA's TF when putting 64K signal out to analog world.
    D(f)=exp(-i*pi*f*Ts)*sin(pi*f*Ts)/(pi*f*Ts)  (Ts: sample time)

  We have AA filter and AI filter when downsampling and upsampling.
    A(f)=G*(1+b11/z+b12/z/z)/(1+a11/z+a12/z/z)*(1+b21/z+b22/z/z)/(1+a21/z+a22/z/z)       z=exp(i*2*pi*f*Ts)
  Coefficients can be found in /cvs/cds/rtcds/caltech/c1/core/advLigoRTS/src/fe/controller.c.

/* Coeffs for the 2x downsampling (32K system) filter */
static double feCoeff2x[9] =
        {0.053628649721183,
        -1.25687596603711,    0.57946661417301,    0.00000415782507,    1.00000000000000,
        -0.79382359542546,    0.88797791037820,    1.29081406322442,    1.00000000000000};
/* Coeffs for the 4x downsampling (16K system) filter */
static double feCoeff4x[9] =
    {0.014805052402446, 
    -1.71662585474518,    0.78495484219691,   -1.41346289716898,   0.99893884152400,
    -1.68385964238855,    0.93734519457266,    0.00000127375260,   0.99819981588176};


  For 64K system, we expect H=1.

  We also have a delay.
    S(f)=exp(-i*2*pi*f*dt)   (dt: delay time)

  So, total TF we expect is;
    H(f)=a*A(f)^2*D(f)*S(f)
  a is a constant depending on the range of ADC and DAC(I think). Currently, a=1/4.

  We may need to think about TF when upsampling.(D(f) is TF of upsampling 64K to analog)

Result:

  Example plot is attached.
  For other plots and the raw data, see /cvs/cds/caltech/users/yuta/scripts/CDSdelay2/ directory.
  As you can see, TFs are slightly different from what we expect.
  They show ripple we don't understand at near cut off frequency.

  If we ignore the ripple, here is the result of delay time at each condition;

data file    host    FE    IOP        rate    sample time    delay        delay/Ts
c1rms16K.dat    c1sus      c1rms    adcSlave    16K    61.0usec    110.4usec    1.8
c1scx16K.dat    c1iscex    c1scx    adcSlave    16K    61.0usec     85.5usec    1.4
c1tst16K.dat    c1iscex    c1tst    adcSlave    16K    61.0usec     84.3usec    1.4
c1tst32K.dat    c1iscex    c1tst    adcSlave    32K    30.5usec     53.7usec    1.8
c1tst64K.dat    c1iscex    c1tst    adcSlave    64K    15.3usec     38.4usec    2.5

  The delay time shown above does not include the delay of DA. To include, add 7.6usec(Ts/2).

  - delay time is different for different machine
  - number of filters (c1scx has full of filters for ETMX suspension, c1tst has only 2) doen't seem to effect much to delay time
  - higher the sampling rate, larger the (delay time)/(sample time) ratio

Plan:

 - figure out how to run a model without IOP
 - where do the ripples come from?
 - why we didn't see significant ripple at previous measurement on c1sus?

Attachment 1: c1tst16Kdelay.png
c1tst16Kdelay.png
  3962   Mon Nov 22 12:00:18 2010 josephbUpdateCDSUpdated Computer Restart Procedures for FB

I've updated the  Computer Restart Procedures  page in the wiki with the latest fb restart procedure.

To just restart just the daqd (frame builder) process, do:

1) telnet fb 8088

2) shutdown

The init process will take care of the rest and restart daqd automatically.

 

Background:
Plan:
  - Check the wiring after SOS Coil Driver Module and circuit around SDSEN
  - Check whitening and dewhitening filters. We connected a binary output cable, but didn't checked them yet.
  - Make a script for step 2
  - Activate new DAQ channels for ETMX (what is the current new fresh up-to-date latest fb restart procedure?)

 

ELOG V3.1.3-