40m Log, Page 253 of 350
ID   Date   Author   Type   Category   Subject
  3289   Mon Jul 26 10:02:42 2010   josephb   Update   CDS   Rerouted RFM around c1lsc, took RFM card out of c1lsc

If you're referring to just the medm screens, those can be restored from the SVN.  As we're moving to a new directory structure, starting with /opt/rtcds/caltech/c1/, the old LSC screens can all be put back in the /cvs/cds/caltech/medm/c1/lsc directory if desired.

The slow lsc aux crate, c1iscaux2, is still working, and those channels are still available.  I confirmed that one was still updating. As a quick test, I went to the SVN and pulled out the C1LSC_RFADJUST.adl file, renamed it to C1LSC_RFadjust.adl and placed it in /cvs/cds/caltech/medm/c1/lsc/, and checked it linked properly from the C1IOO_ModeCleaner.adl file.  I haven't touched the modulation depths, as I didn't want to mess with the mode cleaner, but if I get an OK, we can test that today and confirm that modulation depth control is still working.

Quote:

 I just realized that an unfortunate casualty of this LSC work was the deletion of the slow controls for the LSC which we still use (some sort of AUX processor). For example, the modulation depth slider for the MC is now in an unknown state.

 

  3296   Tue Jul 27 11:24:53 2010   josephb   HowTo   Computer Scripts / Programs   killdataviewer script

I placed a script for killing all instances of the dataviewer program on the current computer in /cvs/cds/caltech/scripts/general/.  It's called killdataviewer.  It is intended to get rid of a bunch of zombie dataviewer processes quickly.  These processes get into this bad state when the dataviewer program is closed in any way other than the graphical menu File -> Exit option.

Its contents are very simple:

#!/bin/bash

kill `ps -ef | grep dataviewer | grep -v grep | grep -v killdataviewer | awk '{print $2}'`
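
For reference, the selection logic of that pipeline can be sketched in Python. This is only an illustration, not the installed script: the function name select_pids and the sample ps -ef text are made up, and it assumes the PID is the second whitespace-separated column, as the awk step does.

```python
def select_pids(ps_output, target="dataviewer", exclude=("grep", "killdataviewer")):
    """Return PIDs of lines that mention `target` but none of the excluded words."""
    pids = []
    for line in ps_output.splitlines():
        if target not in line:
            continue  # same role as `grep dataviewer`
        if any(word in line for word in exclude):
            continue  # same role as the `grep -v ...` steps
        fields = line.split()
        if len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))  # same role as awk '{print $2}'
    return pids


# Hypothetical `ps -ef` output, for illustration only.
SAMPLE = """\
controls  4321     1  0 10:01 ?        00:00:02 dataviewer
controls  4400  4399  0 10:05 pts/1    00:00:00 grep dataviewer
controls  4410  4398  0 10:05 pts/1    00:00:00 /bin/bash ./killdataviewer
controls  4500     1  0 10:06 ?        00:00:00 medm -x sitemap.adl
"""

if __name__ == "__main__":
    print(select_pids(SAMPLE))
```

Only the real dataviewer process is selected; the grep line and the script's own line are skipped, which is why the two extra grep -v stages are in the pipeline.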

  3319   Thu Jul 29 12:31:24 2010   josephb   Update   CDS   Working DAC, working IOP - next up SUS

Ok, after a few minutes of talking to Alex, I got the correct "GUI syntax" through my head, and we now have a simple working green end control which in fact puts signals out through the DAC.

Note to self: do not put any additional filters or controls in the IOP module.  Basically just change the master block with GDS numbers, DCU_ID numbers, etc.  When using a control model, copy the appropriate ADC and ADC selector or DAC to the control model.  It will magically be connected to the IOP.

A correct example of a simple control model is attached.

Next in line is to get the adapter boxes for SUS into the new 1X5 rack and get started on SUS filter conversion and figuring out which ADC/DAC channels correspond to which inputs.

 

Attachment 1: Simple_Green_Control.png
  3342   Sat Jul 31 17:37:36 2010   josephb   Update   CDS   Cables needed for CDS test

Last Thursday, Kiwamu and I went through the cabling necessary for a full damping test of the vertex optics controlled by the sus subsystem, i.e. BS, ITMX, ITMY, PRM, SRM.  The sus IO chassis is sitting in the middle of the 1X4 rack.  The c1sus computer is the top 1U computer in that rack.

ADCs:

The hardest part is placing the 2x D-sub connectors to scsi on the lemo break out boxes connected to the 110Bs.  The breakout boxes can be seen at the very top of the picture Kiwamu took here.  These will require a minor modification to the back panel to allow the scsi cable to get out.  There are two of these boxes in the new 1X5 rack.  These would be connected by scsi to the ADC adapters in the back of the sus IO chassis in 1X4.  The connectors are currently behind the new 1X5 rack (along with some spare ADCs/DACs/BOs).

There are 3 cables going from 40 IDC to 37 D-sub (the last 3 wires, i.e. 38-40, are not used and do not need to be connected).  These plug into the blue and gold ADC adapter box, the top one shown here.  There is one spare connection which will remain unused for the moment.  The 40 IDC ends plug into the Optical Lever PD boxes in the upper right of the new 1X4 rack (as seen in the top picture here - the boards on the right). At the back of the blue and gold adapter box is a scsi adapter which goes to the back of the IO chassis and plugs into an ADC.

In the back of the IO chassis is a 4th ADC which can be left unconnected at this point.  It will eventually be plugged into the BNC breakout box for PEM signals over in the new 1X7 rack, but is unneeded for a sus test.

DACs:

There are 5 cables going from 3 SOS dewhite/anti-image boards and 2 LSC anti-image boards into 3 blue and gold DAC adapter boxes.  Currently they plug into the Pentek DACs at the bottom of the new 1X4 rack.  Ideally we should be able to simply unplug these from the Pentek DACs and plug them directly into the blue and gold adapter boxes.  However, at the time we checked, it was unclear if they would reach.  So it's possible new cables may need to be made (or 40 pin IDC extenders made). These boxes are then connected to the back of the IO chassis by SCSI cables.  One of the DAC outputs will be left unconnected for now.

Binary Output:

The Binary output adapter boxes are plugged into the IO chassis BO cards via D-sub 37 cables.  Note one has to go past the ADC/DAC adapter board in the back of IO chassis and plug directly into the Binary Output cards in the middle of the chassis.  The 50 pin IDC cables should be unplugged from XY220s and plugged into the BO adapter boxes.  It is unclear if these will reach.

Timing:

We have a short fiber cable (sitting on the top shelf of the new 1X3 rack) which we can plug into the master timing distribution (blue box located in the new 1X6 rack) and into the front of the SUS IO chassis.  It doesn't quite make it going through all the holes at the top of the racks and through the cabling trays, so I generally only plug it in for actual tests.

The IO chassis is already plugged into the c1sus chassis with an Infiniband cable.

 

In summary, plugging everything in for a SUS test requires:

  • 6x SCSI cables (3 ADC, 3 DAC) (several near bottom of new 1X3 rack)
  • 4x 37 D-sub to 37 D-sub connector (end connectors can be found behind new 1X5/1X6 area with the IO chassis stuff - Need to be made) (4 BO)
  • 3x 40 IDC to 37 D-sub connectors (end connectors can be found behind new 1X5/1X6 area - Need to be made)(ADC)
  • 5x 64 pin ribbon to 40 IDC cable (already exist, unclear if they will reach) (DAC)
  • 8x 50 pin IDC ribbon (already exist, unclear if they will reach) (BO)
  • 1x Double fiber from timing master to timing card
  • 1x Infiniband cable (already plugged in)

Tomorrow, I will finish up a channel numbering plan I started with Kiwamu on Thursday and place it in the wiki and elog.  This is for knowing which ADC/DAC/BO channel numbers correspond to which signals.  Which ADCs/DACs/BOs the cables plug into matters for the actual control model; otherwise you'll be sending signals to the wrong destinations.

WARNING: The channel numbers on the front Binary Output blue and gold adapter boxes are labeled incorrectly.  Channels 1-16 are really in the middle, and 17-32 are on the left when looking at the front of the box.  The "To Binary IO Module" is correct.

  3399   Wed Aug 11 13:28:44 2010   josephb   Update   CDS   New cdsIPCx parts, old part breaks models

While working on a test stand at LLO, I've discovered that there's a new update to the Real Time Code Generator that breaks virtually all of our current models.

Previously, we've been using the cdsIPCx part as a flexible link between models, which could be set to RFM (reflected memory), SHMEM (shared memory on the local computer), or PCIE (PCI Express, or Dolphin I think) depending on what was written in the IPC file (found in /cvs/cds/caltech/chans/ipc/C1.ipc).

Recently, three new parts were added, called cdsIPCx_SHMEM, cdsIPCx_RFM, and cdsIPCx_PCIE, and the previous cdsIPCx part was broken.  The cdsIPCx_***** part has to match the type written in the IPC file, or else the build errors out with a rather unhelpful error message which doesn't tell you which part/line is in conflict...

i.e.

***ERROR: IPCx type mis-match: I vs. ISHM
make: *** [g1lsc] Error 255

In any case, when using a current checkout (updated or checked out after ~July 30th), all cdsIPCx blocks in the simulink models need to be changed to the appropriate type.  I'm trying to get a new CDS_PARTS.mdl file I updated into the svn, but for now you can just open up the model file directly in the /advLigoRTS/src/epics/simLink/lib directory and copy it in from there (i.e. cdsIPCx_SHMEM.mdl) to your model file.  I'm also trying to get the feCodeGen.pl file changed so the error message actually tells you what channel/part is mismatching so you don't have to guess.
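
As a rough idea of what a more helpful message could look like, here is a small Python sketch. It is not the feCodeGen.pl logic: the function name, the channel names, and the two dictionaries standing in for the model parts and the IPC file are all hypothetical.

```python
def find_ipc_mismatches(model_parts, ipc_types):
    """List (channel, model_type, ipc_type) for every channel whose types disagree.

    A channel missing from ipc_types is reported with ipc_type None.
    """
    mismatches = []
    for channel, model_type in sorted(model_parts.items()):
        ipc_type = ipc_types.get(channel)
        if ipc_type != model_type:
            mismatches.append((channel, model_type, ipc_type))
    return mismatches


# Hypothetical channel names and types, for illustration only.
model_parts = {"C1:LSC-SUS_ETMX": "RFM", "C1:SUS-LSC_TEST": "SHMEM"}
ipc_types = {"C1:LSC-SUS_ETMX": "RFM", "C1:SUS-LSC_TEST": "PCIE"}

if __name__ == "__main__":
    for channel, model_type, ipc_type in find_ipc_mismatches(model_parts, ipc_types):
        print(f"IPCx type mismatch on {channel}: model uses {model_type}, IPC file says {ipc_type}")
```

The point is simply that the message names the offending channel and both types, instead of only reporting that some mismatch exists.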

An easy way to modify a file for all shared memory connections without doing a ton of cutting, pasting and renaming is:

sed -i 's/"cdsIPCx"/"cdsIPCx_SHMEM"/g' file_name_here.mdl
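
The same substitution can be sketched in Python, which may be handy if the retyping ever needs to go into a larger script. The function name is made up and the sample text is a hypothetical fragment, not real .mdl content; like the sed line, it only touches the exact quoted string "cdsIPCx".

```python
def retype_ipc_parts(mdl_text, new_type="SHMEM"):
    """Replace the quoted legacy part name "cdsIPCx" with "cdsIPCx_<new_type>"."""
    return mdl_text.replace('"cdsIPCx"', f'"cdsIPCx_{new_type}"')


# Hypothetical fragment, for illustration only; real .mdl files look different.
SAMPLE = 'Part "cdsIPCx"\nPart "cdsIPCx_RFM"\n'

if __name__ == "__main__":
    print(retype_ipc_parts(SAMPLE))
```

Because the closing quote is part of the match, parts that already carry a suffix such as cdsIPCx_RFM are left alone.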

  3408   Thu Aug 12 14:01:53 2010   josephb   Update   CDS   current status

First, awesome progress.

Second, the Binary Output parts do in fact need to be added to the c1sus model.  They don't need to appear in the IOP though.  (They are somehow automatically taken care of by the IOP code). 

It looks like in the 2004 revision in SVN, the cdsCDO32 part (located in the CDS_PARTS.mdl/IO_PARTS) was broken.  I fixed it and updated the svn with it, so a svn update should pull the corrected version.

I'm not sure what's wrong with the DAQs.  I'll try to take a closer look at the model file tonight and see if I can make suggestions.  When you write outputs on DAC_0 or DAC_1, is the C1SUS GDS TP screen showing anything?

  3440   Thu Aug 19 09:51:43 2010   josephb   Update   elog   elog dead again

I found the elog dead again this morning, and once again the script failed to kill it. I modified the script to use the following line instead of "pkill elogd":

kill `ps -ef | grep elogd | grep -v grep | grep -v start-elog-nodus | awk '{print $2}'`

Hopefully the script will be a bit more robust in the future.

Quote:

Quote:

The elog was so dead this time that the restart script didn't work.  I followed the restart-by-hand instructions instead, with success.

 Just for added interest, I tried a different method when the restart script broke.  The "start-elog-nodus" script has a line "kill elogd".  This seems not to be actually killing anything anymore, which means the elog can't restart.  So this time I went for "kill <pid number>", and then ran the startup script.  This worked.  So it's the "kill elogd" which isn't working reliably.

 

  3446   Fri Aug 20 13:09:53 2010   josephb   Update   elog   Rebooted elog

I had to restart the elog again.

At this point, I'm going to try to get one of the GC guys to install gdb on nodus and run the elog in the debugger.  That way, the next time it crashes, I'll have some error output I can send back to the developer to ask why it's crashing there.

  3464   Tue Aug 24 14:29:18 2010   josephb   Update   elog   Elog down for 1 minute

I'm going to take the elog down for one minute and restart it under gdb (using a copy of gdb stolen from fb40m since I couldn't figure out how to install an old enough version on nodus from source).  The terminal with information is running on Rosalba under the "Phase Noise" panel, so please don't close it.  Ideally, the next time the elog crashes, I'll have some output indicating why or at least the line in the code.  I can then look at the raw source code or send the line back to the developer and see if he has any ideas.

  3467   Wed Aug 25 12:18:47 2010   josephb   Update   elog   Trying new version of elog to see if it helps stability

So unfortunately, I made the start-elog-nodus script smart enough to kill the debugging run I had (although that's probably good since there might have been issues with continuing to run - just poor timing on the part of the crash).

In related news, I have gotten the latest version of the elog code to actually compile on Nodus.  I had to hack the cryptic.c file (elog/src/cryptic.c) to get it to work though.

The following was copied from the #ifdef _MSC_VER section of the code into the #else directly following that section. 

#define MAX(x,y) ((x)>(y)?(x):(y))
#define MIN(x,y) ((x)<(y)?(x):(y))
#define __alignof__(x) sizeof(x)
#define alloca(x) malloc(x)
#define mempcpy(d, s, n) ((char *)memcpy(d,s,n)+n)
#define ERANGE 34


I also removed #include <stdint.h> as the functionality it provides is covered by inttypes.h on Solaris machines, which is automatically included.

This new code was released August 5th 2010, while the old elog code we were running was 2.7.5 and was released sometime in 2008.  There are several crash fixes mentioned in the version notes, so I'm hoping this may improve stability. I'm in the process of making a copy of the elog logbooks into the elog-2.8.0 install (so as to have a backup with the original 2.7.5).  I'm also copying over all the configuration files.  In a few minutes I'm going to try switching over to the new elog.  If it doesn't work, or is worse, it's easy enough to just start up the current version.

All files are located in /cvs/cds/caltech/elog/elog-2.8.0 (the old directory is elog-2.7.5).  I've made  a new startup script called start-elog-nodus-2.8.0.  To start the new one, just run that script.  To start the old one, just go to the elog-2.7.5 directory and run the old start-elog-nodus script.

  3468   Wed Aug 25 12:40:28 2010   josephb   Update   elog   Reverted back to 2.7.5 until further testing is done

So apparently the themes/configurations didn't work so nicely on some of the logbooks with 2.8.0, so I'm reverting to 2.7.5 until I can figure out (assuming I can) how to get them to display properly.

  3471   Wed Aug 25 15:55:33 2010   josephb   Update   elog   Staying with 2.7.5 until passwords sorted out

Turns out the elog version 2.8.0 uses a different encryption method than 2.7.5.  This means the encrypted passwords stored in the elogd.cfg don't work with the new code.  elogd includes functionality to generate encrypted passwords, but unfortunately I don't know the administration passwords for some of the logbooks.  So I'm going to leave 2.7.5 running until I can get those added properly to the 2.8.0 cfg file.

  3473   Thu Aug 26 13:08:03 2010   josephb   Update   CDS   Watch dogs for Vertex optics turned off

We are in the process of doing a damping test with the real time code and have turned off the vertex optics watchdogs temporarily, including BS, ITMs, SRM, PRM, MCs.

  3498   Tue Aug 31 16:42:26 2010   josephb   Update   CDS   Temporarily reverting to CDS revision 2005

Apparently updating to the latest revision of the RCG has some issues with diaggui and awgtpman.  Alex had to do some recompiling up at Hanford which apparently took him some time.  He'll be coming by tomorrow to try to bring those codes to the front end machines.

As a temporary fix, until Alex gets here tomorrow, we're reverting to the 2005 revision in the svn of the cds code.  I'm placing it in the location it is supposed to go in the new advLIGO scheme, which is /opt/rtcds/caltech/c1/core/, which is where Keith had it at LLO.  Once we get the new codes working, we will do an svn update on that location and migrate our work to that install location, at which point I'll remove the old /cvs/cds/caltech/cds/advLigoRTS/ location.

  3507   Wed Sep 1 12:24:47 2010   josephb   Update   CDS   Trying to get up to date CDS code running

Alex, Joe:

We copied the latest x02 to c1x02 and modified the config block in it for our setup.

We removed gds_node_id.  We just have one number now, the dcuid, which is unique for each controller, simulated plant and IOP.  Set site to C1 and host to c1sus.

Alex made the latest awgtpman backwards compatible, and checked that into svn.

We installed the latest framecpp onto c1sus from www.ldas-sc.ligo.caltech.edu/packages/ using wget.

wget www.ldas-sc.ligo.caltech.edu/packages/framecpp-1.18 and then used make.

This let us compile diagd on c1sus, using the command make stand in the /advLigoRTS/build area.

We copied gds from the seiteststand over at Hanford and are trying to build that on megatron.  However, there's a bunch of packages we need for it to install properly.  Alex said he'd work on that later, possibly trying to make some portable binaries.

Checked out the latest dataviewer into /opt/rtcds/caltech/c1/core/daq; however, it's not quite working yet either.  This is another thing Alex said he'll work on later.

We are also going to test Alex and Rolf's kernel patch over on c1iscex on a CentOS base kernel (apparently they've been using Gentoo up at Hanford for the test stands...) and see how that works.

  3515   Thu Sep 2 16:45:48 2010   josephb   Update   CDS   Numbering scheme of the PCI bus

Rolf has recently written a document describing how one should fill out an IO chassis and how the numbering works out.  This can be found in the DCC at Rolf's PCIe numbering guide (T1000523).

Basically it works out that slot 1 corresponds to PCIe number 1, but slot 2 corresponds to PCIe number 6.  And so forth.  The following table gives a quick summary.

Slot          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
PCIe Number   1   6   5   4   9   8   7   3   2  14  13  12  17  16  15  11  10
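
For scripting, the table can be held as a lookup in both directions. The sketch below just transcribes the numbers above; the helper slots_in_pcie_order is a made-up convenience, and whether anything actually enumerates cards in PCIe-number order is an assumption not stated here (see T1000523 for the real rules).

```python
# Slot -> PCIe number, transcribed from the table in this entry.
SLOT_TO_PCIE = {
    1: 1, 2: 6, 3: 5, 4: 4, 5: 9, 6: 8, 7: 7, 8: 3, 9: 2,
    10: 14, 11: 13, 12: 12, 13: 17, 14: 16, 15: 15, 16: 11, 17: 10,
}

# Inverse lookup: PCIe number -> slot.
PCIE_TO_SLOT = {pcie: slot for slot, pcie in SLOT_TO_PCIE.items()}


def slots_in_pcie_order(occupied_slots):
    """Sort physical slot numbers by the PCIe number they map to."""
    return sorted(occupied_slots, key=lambda slot: SLOT_TO_PCIE[slot])


if __name__ == "__main__":
    print(slots_in_pcie_order([1, 2, 3, 4]))
```

With cards in slots 1 through 4, ordering by PCIe number gives slots 1, 4, 3, 2, which shows why filling the chassis left to right does not give the obvious card order.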

 

  3516   Thu Sep 2 17:43:30 2010   josephb   Update   CDS   One working BO output module, others not so much

 Joe and Kiwamu:

We found one bug in the RCG code, where the second input for the CDO32 part (32 binary output) was simply a repeat of the first input, and totally ignored the second input.  This was fixed in the /advLigoRTS/src/epics/util/lib/CDO32.pm file by changing 

$calcExp .= $::fromExp[0];

to

$calcExp .= $::fromExp[1];

This fix has been added to the svn.  Unfortunately, while we have a single working binary output module, the 2nd and later modules do not seem to be responding at all.  We've done the usual swapping of parts of the path in both software and hardware and can't find any bad pieces in our model files or the actual hardware.  That leaves me wondering about the C code, specifically whether the CDO32Output[1], CDO32Output[2], and so forth array entries in the code are being handled properly.  I'll try to get some thoughts on it from Alex tomorrow.

  3521   Fri Sep 3 11:23:16 2010   josephb   Configuration   Computers   rossa nvidia driver and dual monitor configuration

At LLO the machines are running CentOS 5.5. A quick login confirms this. Specifically, the kernel release is 2.6.18-194.3.1.el5.

Quote:

Why are we running CentOS 4.8 instead of 5.5 ?    What runs at LLO?     What runs in Downs?

 

  3531   Tue Sep 7 10:50:53 2010   josephb   Configuration   Computers   rossa notes

The controls group is user id 500 by default on most new machines. Unfortunately, the user ID used across the already existing machines is 1001. One method of doing this switch is in this elog.  You can also do the change of the controls ID by becoming root and using the graphical command system-config-users.  This will  let you change the user ID and group ID for controls to 1001.  This graphical interface also lets you change the login shell.

Unfortunately, I had some minor difficulty and I ended up removing the old controls and creating a new controls account with the correct values and using tcsh.  The .cshrc file has been recreated to source cshrc.40m.  The controls account now has correct permissions, although some of the preferences such as background will need to be reset.
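
A quick way to sanity-check an account after this kind of change is to look at its passwd entry. The Python sketch below parses a passwd-format line and checks the IDs; the function names, the sample entries, and the /home/controls path are hypothetical, and only the 500 and 1001 values and the tcsh shell come from this entry.

```python
def parse_passwd_entry(line):
    """Split one passwd-format line (name:passwd:uid:gid:gecos:home:shell)."""
    fields = line.strip().split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    return fields[0], int(fields[2]), int(fields[3]), fields[6]


def has_expected_ids(line, uid=1001, gid=1001):
    """True if the entry's user ID and group ID both match the expected values."""
    _, actual_uid, actual_gid, _ = parse_passwd_entry(line)
    return actual_uid == uid and actual_gid == gid


# Hypothetical entries, for illustration only.
OLD_ENTRY = "controls:x:500:500:controls:/home/controls:/bin/bash"
NEW_ENTRY = "controls:x:1001:1001:controls:/home/controls:/bin/tcsh"

if __name__ == "__main__":
    for entry in (OLD_ENTRY, NEW_ENTRY):
        print(parse_passwd_entry(entry), has_expected_ids(entry))
```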

 

 

Quote:

**** Deleted
apps/emacs
apps/linux64/firefoxold
apps/linux64/comsol      (old v. 3.5)

* running up2date on rossa

* rossa needs to be able move windows between monitors: Xinerama?

* there are permissions problems: controls on rossa can't
make and delete directories made by 'controls' elsewhere.
Some sort of user# or group issue?

 

  3546   Wed Sep 8 14:44:37 2010   josephb   Update   CDS   Updated parse_mdl_to_ipc.py and feCodeGen.pl

I updated parse_mdl_to_ipc.py to correctly work with the 3 new cdsIPCx parts, namely cdsIPCx_PCIE (for dolphin connections), cdsIPCx_RFM (for our traditional reflected memory connections), and cdsIPCx_SHMEM (local shared memory on the computer).  These parts replaced cdsIPCx a while back (see here).

The code now correctly counts each type independently with regards to ipcNum (i.e. you can have ipcNum = 0 for RFM and ipcNum = 0 for SHMEM, for example).
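
The per-type counting can be pictured with a short Python sketch. This is not the code in parse_mdl_to_ipc.py; the function name and the (name, type) input format are assumptions made for the example.

```python
from collections import defaultdict


def assign_ipc_numbers(parts):
    """Give each (name, ipc_type) part the next free ipcNum within its own type."""
    next_num = defaultdict(int)  # one independent counter per IPC type
    assigned = []
    for name, ipc_type in parts:
        assigned.append((name, ipc_type, next_num[ipc_type]))
        next_num[ipc_type] += 1
    return assigned


# Hypothetical part names, for illustration only.
PARTS = [("A", "RFM"), ("B", "SHMEM"), ("C", "RFM"), ("D", "PCIE")]

if __name__ == "__main__":
    for name, ipc_type, ipc_num in assign_ipc_numbers(PARTS):
        print(name, ipc_type, ipc_num)
```

Here A and B both get ipcNum 0 because they are of different types, while C, the second RFM part, gets ipcNum 1.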

 

I also went in and modified a few sections of the feCodeGen.pl (in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util/) so as to properly assign names to matrix adl files, matrix of filter bank adl files, and filter bank adl files.

  3553   Thu Sep 9 14:42:40 2010   josephb   Update   CDS   Updating suspensions model

I've been working on updating the suspensions model, incorporating Koji's refinements as well as trying to simplify the model and making it less cluttered. 

This had the side benefit of making an incorrect connection obvious.  I had incorrectly wired the ASCPIT input to be summed into the yaw path, and the ASCYAW input into the pitch path.  This has been corrected.

I've finished a single optic, and now I am in the process of propagating the changes to all the optics, as well as cleaning up the overall diagram using Rolf's new tags, which make things much less cluttered.  I've attached a screenshot of the PRM optic control model, and will be updating the Matlab web export once I've updated the full model.

Attachment 1: SingleSUS.png
  3564   Mon Sep 13 10:22:48 2010   josephb   Update   CDS   RCG bugs/feature request wiki page

I've started a wiki page under the Upgrade 09/New CDS section regarding known bugs and pending feature requests for the Real Time Code Generator.   It can be found at http://lhocds.ligo-wa.caltech.edu:8000/40m/Bugs_and_Pending_Feature_requests_for_the_RCG.  If you have any ideas to improve the RCG or encounter a bug in the code generation process (say a particular part doesn't work inside subsystems for example), please note them here.

Currently there are bugs with excitation points (they don't work inside of a subsystem block) and tags (they don't respect scope, and only 1 "from" tag works for each "goto" tag connected to the output of a subsystem block).

  3576   Wed Sep 15 14:34:57 2010   josephb   Summary   CDS   Plan for RFM switch over

Steps for RFM switch over:

1) Ensure the new frame builder code is working properly:

   A) Get Alex to finish compiling the frame builder and test on Megatron.

   B) Test the new frame builder code on fb40m (which is running Solaris) in a reversible way.  Change directory structure away from Data1, Data2, to use actual times.

   C) Confirm new frame builder code still records slow channels (c1dcuepics).

2) Ensure awg, tpman, and diagnostic codes (dtt) are working with the new front end code.

3) Physically move RFM cables from old front ends to the new front ends.  Remove excess connections from the network.

4) Merge the megatron/c1sus/c1iscex/c1ioo network with the main network.

   A) Update all the network settings on the machines as well as Linux1

   B) Remove the network switch separating the networks.

5) Start the new frame builder code on fb40m.

  3583   Fri Sep 17 12:11:42 2010   josephb   Update   CDS   Downs update

In doing a re-inventory prior to the IOO chassis installation, I re-discovered we had a missing interface board that goes in an IO chassis.  This board connects the chassis to the computer and lets them talk to each other.  After going to Downs we remembered Alex had taken a possibly broken interface board back to Downs for testing.

Apparently the result of that testing was that it is broken.  This was about 2.5 months ago, and unfortunately it hadn't been sent back for repairs, nor had a replacement been ordered.  It's my fault for not following up on that sooner.

I asked Rolf what the plan for the broken one was.  His response was that they were planning on repairing it, and that he'd have it sent back for repairs today.  My guess is that the turnaround time for that is on the order of 3-4 weeks (based on conversations with Gary); however, it could be longer.  This will affect when the last IO chassis (LSC) can be made fully functional.  I did however pick up the 100 foot fiber cable for going between the LSC chassis and the LSC computer (which will be located in 1X3).

As a general piece of information, according to Gary the latest part number for these cards is OSS-SHB-ELB-x4/x8-2.0 and they cost 936 dollars (latest quote).

  3584   Fri Sep 17 14:55:01 2010   josephb   Update   CDS   Took 5565 RFM card from IOVME to place in the new IOO chassis

I took the 5565 RFM card out of the IOVME machine so I could put it in the new IO chassis that will be replacing it.  It is no longer on the RFM network.  This doesn't affect the slow channels associated with the auxiliary crate.

  3588   Mon Sep 20 10:33:21 2010   josephb   Bureaucracy   Computers   Larry stopped by - GC machine had conflicting IP

Larry stopped by today and had to disconnect the m25 machine (this is the 1st GC machine on the left as you walk into the control room) because its IP was conflicting with a machine over in Downs.  Do not use 131.215.115.125 as the IP on this machine, as this is already assigned to someone else.  They couldn't figure out the root password to change it, which is why it is not currently plugged into the network, and it is not to be plugged back in until an appropriate IP is assigned.

They've asked that whoever set the machine up please contact them (extension 2974).

  3589   Mon Sep 20 11:39:45 2010   josephb   Update   CDS   Switch over

I talked with Alex this morning, discussing what he needed to do to have a frame builder running that was compatible with the new front ends.

 

1) We need a heavy duty router as a separate network dedicated to data acquisition running between the front ends and the frame builder.  Alex says they have one over at Downs, although a new one may need to be ordered to replace that one.

2) The frame builder is a Linux machine (basically we stop using the Sun fb40m and start using the Linux fb40m2 directly).

3) He is currently working on the code today.  Depending on progress today, it might be installable tomorrow.

  3590   Mon Sep 20 16:59:26 2010   josephb   Update   CDS   Megatron in 1X2 rack, to become c1ioo

[Rana, Koji, Joe]

We pulled the phase shifters in the 1X2 rack out to make room for megatron.  Megatron will be converted into c1ioo, and the 8 core, 1U computer will be used as c1lsc.  A temporary ethernet cable was run from 1X2 to 1X3 to connect megatron to the same sub-network.

The c1lsc machine was worked on today, setting it up to run the real time code, along with the correct controls accounts, passwords, .cshrc files, etc.  It needs to be moved from 1X1 to 1X4 tomorrow.

  3593   Tue Sep 21 16:05:21 2010   josephb   Update   CDS   First pass at rack diagram

I've made a first pass at a rack diagram for the 1X1 and 1X2 racks, attached as png.

Gray is old existing boards, power supplies, etc.  Blue is new CDS computers and IO chassis, and gold is for Alberto's new RF electronics.  I still need to double check on whether some of these boards will be coming out (perhaps the 2U FSS ref board?).

Attachment 1: 1X1_1X2_racks.png
  3594   Wed Sep 22 16:35:45 2010   josephb   Update   CDS   Fibers pulled, new FB install tomorrow

[Aidan, Tara, Joe]

We pulled out what used to be the LSC/ASC fiber from the 1Y3 arm rack, and then redirected it to the 1X1 rack.  This will be used as the c1ioo 1PPS timing signal.  So c1ioo is using the old c1iovme fiber for RFM communications back to the bypass switch, and the old LSC fiber for 1PPS.

The c1sus machine will be using the former sosvme fiber for communications to the RFM bypass switch.  It already had a 1 PPS timing fiber.

The c1iscex machine had a new timing fiber already put in, and will be using the c1iscey vme crate's RFM for communication.

We still need to pull up the extra blue fiber which was used to connect c1iscex directly to c1sus, and reuse it as the 1PPS signal to the new front end on the Y arm. 

Alex has said he'll come in tomorrow morning to install the new FB code.

 

  3606   Fri Sep 24 22:58:40 2010   josephb   Update   CDS   Modified front end medm screens

To startup medm screens for the new suspension front end, do the following:

1) From a control room machine, log into megatron

ssh -X megatron

2) Move to the new medm directory, and more specifically the master sub-directory

cd /opt/rtcds/caltech/c1/medm/master/

3) Run the basic sitemap in that directory

medm -x sitemap.adl

 

The new matrix of filters replacing the old ULPOS, URPOS, etc type filters is now on the screens.  This was previously hidden.  I also added the sensor input matrix entry for the side sensor.

Lastly, the C1SUS.txt filter bank was updated to place the old ULPOS type filters into the correct matrix filter bank.

 

The suspension controls still need all the correct values entered into the matrix entries (along with gains for the matrix of filter banks), as well as the filters turned on.  I hope to have some time tomorrow morning to do this, which basically involves looking at the old screens and copying the values over.  The watch dogs are still controlled by the old control screens.  This will be fixed on Monday when I finish switching the front ends over from their sub-network to the main network, at which point logging into megatron will no longer be necessary.

  3612   Mon Sep 27 17:35:13 2010   josephb   Update   CDS   Updated Suspension screens/Megatron now c1ioo/Further work on fb

The medm screens have been updated further, with the hidden matrices added in bright colors.  An example screen shot is attached.

Megatron has been renamed c1ioo and moved to the martian network.  Similarly, c1sus and c1iscex are also on the martian network.  Medm screens can be run on any of the control machines and they will work.

Currently the suspension controller is running on c1sus.

The frame builder is currently running on the fb machine *however* it is not working well.  Test points and daq channels on the new front ends tended to crash it when Alex started the mx_stream to the fb via our new DAQ network (192.168.114.XXX, accessible through the front ends or fb - has a dedicated 1 gigabit network with up to 10 gigabit for the fb).  So for the moment, we're running without front end data. Alex will be back tomorrow to work on it. 

Alex claimed to have left the frame builder in a state where it should be recording slow data; however, I can't seem to access recent trends (i.e. since we started it today).  The frame builder throws up an error, for example "Couldn't open raw minute trend file '/frames/trend/minute_raw/C1:Vac-P1_pressure'".  Realtime seems to work for slow channels however.  Remember to connect to fb, not fb40m. So it seems the fb is still in a mostly non-functional state.

Alex also started a job to convert all the old trends to the correct new data format, which should finish by tomorrow.

RA: Nice screen work. The old screens had a 'slow' slider effect when ramping the bias so that we couldn't whack the optic too hard. Is the new one instantaneous?

Attachment 1: MC1_Example_Screen.png
  3615   Tue Sep 28 10:07:29 2010   josephb   Update   CDS   Updated Suspension screens/Megatron now c1ioo/Further work on fb

Quote:

RA: Nice screen work. The old screens had a 'slow' slider effect when ramping the bias so that we couldn't whack the optic too hard. Is the new one instantaneous?

 Looking at the sliders, I apparently still need to connect them properly.  There's a mismatch between the medm screen channel name and the model name.  At the moment there is no "slow" slider effect implemented, so they are effectively instantaneous.  Talking with Alex, he suggests writing a little C code block and adding it to the model.  I can use the C code used in the filter module ramps as a starting point.
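
To make the idea concrete, here is a minimal Python sketch of a step-limited ramp. It is a stand-in for the eventual C block, not the filter module ramp code: the function name and the max_step parameter are hypothetical, and it assumes one step per update cycle.

```python
def ramp_values(current, target, max_step):
    """Return the successive setpoints from current to target, moving at most max_step per cycle."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    values = []
    while current != target:
        delta = target - current
        if abs(delta) <= max_step:
            current = target  # final partial step lands exactly on the target
        else:
            current += max_step if delta > 0 else -max_step
        values.append(current)
    return values


if __name__ == "__main__":
    # Slider moved from 0 to 10 with at most 3 units of change per cycle.
    print(ramp_values(0, 10, 3))
```

A bias slider moved from 0 to 10 would then reach the optic as a few small steps rather than one jump, which is the behavior the old screens had.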

  3619   Wed Sep 29 11:18:36 2010   josephb   Update   CDS   Apps code changes

After asking Alex specifically what he did yesterday after I left, he indicated he copied a bunch of stuff from Hanford, including the latest gds, fftw, libframe, root.  We also now have the new dtt code as well.  But those apparently were for the Gentoo build.  After asking Alex about the ezca tools this morning, he discovered they weren't compiled in the gds code he brought over.  We are in the process of getting the source over here and compiling the ezca tools.

 

Alex is indicating to me that the currently compiled new gds code may not run on CentOS 5.5, since it was compiled on Gentoo (which is what our new fb is running and apparently what they're using for the front ends at Hanford).  We may need to recompile the source on our local CentOS 5.5 control machines to get some working gds code.  We're in the process of transferring the source code from Hanford.  Apparently this latest code is not in SVN yet, because at some point he needs to merge it with work other people have been doing in parallel, and he hasn't had the time to do that merge yet.

For the moment, Alex is undoing the soft link changes he did pointing gds at the latest gds code he copied, and pointing back at the original install we had.

  3634   Fri Oct 1 11:53:42 2010 josephbConfigurationCDSAdded RCG simlink files to the 40m svn

I've added a new directory in /opt/rtcds/caltech/c1/core called rts_simlink.  This directory is now in the 40m svn.  Unfortunately, the Simulink files used to generate the front end C code live in a directory controlled by the CDS svn.  So I've copied the .mdl files from /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/simLink/ into this new directory and added them to the 40m svn.  When making changes to the Simulink files, please copy them to this new directory and check them in so we can keep a useful history of the models.

 

  3636   Fri Oct 1 16:34:06 2010 josephbUpdateCDSc1sus not booting due to fb dhcp server not running

For some reason, the dhcp server running on the fb machine, which assigns the IP address to c1sus (since it's running a diskless boot), was down.  This was preventing c1sus from coming up properly.  The symptom was an error indicating that no DHCP offers were made (seen when I plugged a keyboard and monitor in).

To check if the dhcp server is running, run ps -ef | grep dhcpd.  If it's not, it can be started with "sudo /etc/init.d/dhcpd start"
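
The same check can be sketched in Python, in case someone wants to script it. This is only an illustration: the function names are made up, and it assumes the usual 8-column `ps -ef` layout (UID PID PPID C STIME TTY TIME CMD). It matches on the executable name, so the `grep dhcpd` process itself is not counted.

```python
import subprocess


def dhcpd_lines(ps_output):
    """Return the lines of `ps -ef` output whose command is a dhcpd executable."""
    matches = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields; the 8th is the full command line.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        executable = fields[7].split()[0]
        # Compare the basename only, so /usr/sbin/dhcpd matches but "grep dhcpd" does not.
        if executable.rsplit("/", 1)[-1] == "dhcpd":
            matches.append(line)
    return matches


def dhcpd_running():
    """Run `ps -ef` on the local machine and report whether dhcpd shows up."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return bool(dhcpd_lines(result.stdout))
```

Calling dhcpd_running() on the fb machine should return True when the server is up; if it returns False, start the server with the init script as above.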

  3642   Mon Oct 4 11:20:45 2010 josephbUpdateCDSFixed Suspension binary output list and sus model

I've updated the CDS wiki page listing the wiring of the 40m suspensions with the correct binary output channels.  I previously had confused the wiring of the Auxiliary crate XY220 (watchdogs) with the SOS coil dewhitening bypasses.  So I had wound up with the wrong channels (the physical cables we plugged in were correct; only my idea of what was on channel X of each cable was wrong).  This has been corrected in the plan now.  The updated channel/cable list is at http://lhocds.ligo-wa.caltech.edu:8000/40m/Upgrade_09/CDS/Suspension_wiring_to_channels

  3644   Mon Oct 4 15:28:10 2010 josephbUpdateCDSTrying to get c1ioo booting as Gentoo.

I modified the dhcpd.conf file in /etc/dhcp on the fb machine.  I added an entry for c1ioo, listing its MAC address and IP address near the bottom of the file.  I then restarted the dhcp server using "sudo /etc/init.d/dhcpd restart" while on the fb machine.

I also modified the rtsystab, which is used to determine which front end codes start on boot up of a machine.  I added a line: c1ioo   c1x03  c1ioo

I am now in the process of getting c1ioo to come up as a Gentoo machine so I can build a model with an RFM connection in it and test the communication between c1sus and c1ioo.  This involves removing the hard drives and checking to make sure the boot priority is correct (i.e. it checks for a network boot).

  3661   Wed Oct 6 15:56:14 2010 josephbHowToCDSHow to load matrices quickly and easily

A while ago I wrote several scripts for reading in medm screen matrix settings and then writing them out.  It was meant as a kind of mini-burt just for matrices, for switching between a couple of different setups quickly.

Yuta has expressed interest in having this instruction available.

In /cvs/cds/caltech/scripts/matrix/ there are 4 python scripts:

saveMatrix.py, oldSaveMatrix.py, loadMatrix.py, oldLoadMatrix.py

The saveMatrix.py and loadMatrix.py are for use with the current front end codes (which start counting from 1), whereas the old*.py files are for the old system.

To use saveMatrix.py you need to specify the number of inputs, outputs, and the base name of the matrix (i.e. C1:LSP-DOF2PD_MTRX is the base of C1:LSP-DOF2PD_MTRX_0_0_GAIN for example), as well as an output file where the values are stored.

So to save the BS in_matrix setting you could do (from /cvs/cds/caltech/scripts/matrix/)

./saveMatrix.py -i 4 -o 5 -n "C1:SUS-BS_TO_COIL_MTRX" -f -d ./to_coil_mtrx.txt

The -i 4 indicates 4 inputs, the -o 5 indicates 5 outputs, -n "blah blah" indicates the base channel name, -f indicates a matrix bank of filters (if it's just a normal matrix with no filters, drop the -f flag), and -d ./to_coil_mtrx.txt indicates the destination file.

To write the matrix, you do virtually the same thing:

./loadMatrix.py -n "C1:SUS-PRM_TO_COIL_MTRX" -f -d ./to_coil_mtrx.txt

In this case, you're writing the saved values of the BS, to the PRM.  This method might be faster if you're trying to fill in a bunch of new matrices that are identical rather than typing 1's and -1's 20 times for each matrix.
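
To make the naming concrete, here is a small Python sketch of how the element channel names follow from the base name. This is not the code in saveMatrix.py; the function name is made up, and it assumes every element is named base_ROW_COL_GAIN as in the C1:LSP-DOF2PD_MTRX_0_0_GAIN example above. Which of the two indices is the input and which is the output is also an assumption here, so check it against the real channels before relying on it.

```python
def matrix_channels(base, n_rows, n_cols, first_index=1, suffix="_GAIN"):
    """List the element channel names for a matrix with the given base name.

    first_index=1 matches the current front end codes; use first_index=0
    for the old system, which counted from 0.
    """
    return [
        f"{base}_{row}_{col}{suffix}"
        for row in range(first_index, first_index + n_rows)
        for col in range(first_index, first_index + n_cols)
    ]
```

For the 4 input, 5 output BS example this gives the 20 element names that would otherwise have to be typed in by hand.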

I'll have Yuta add his how-to of this to the wiki.

  3665   Thu Oct 7 10:37:42 2010 josephbUpdateCDSc1sus with flaky ssh

Currently trying to understand why the ssh connections to c1sus  are flaky.  This morning, every time I tried to make the c1sus model on the c1sus machine, the ssh session would be terminated at a random spot midway through the build process.  Eventually restarting c1sus fixed the problem for the moment.

However, at some point in the last 48 hours, the c1sus machine had stopped responding to ssh logins while still appearing to be running the front end code.  The next time this occurs, we should attach a monitor and keyboard and see what kind of state the computer is in.  It's interesting to note we didn't have these problems before we switched over to the Gentoo kernel from the real-time linux Centos 5.5 kernel.

  3678   Fri Oct 8 12:21:11 2010 josephbUpdateCDSchecking MC1 suspension damping

Upon investigation, it appears that the c1mcs model was (and still is) timing out after a random amount of time. In other words, at some point it was taking too long to do all the calculations for a single cycle and falling behind. The evidence for this is from the dmesg command when run on c1sus.

There's a bunch of lines like:

[ 8877.438002] c1mcs: cycle 568 time 62; adcWait 0; write1 0; write2 0; longest write2 0

[ 8877.438002] c1mcs: cycle 569 time 62; adcWait 0; write1 0; write2 0; longest write2 0

With a final line like: [ 8877.439003] c1mcs: ADC TIMEOUT 1 2405 37 2277

This last line indicates it fell so far behind that it gave up.
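
For picking these lines out of a long dmesg dump, something like the following Python sketch could be used. It is only an illustration built from the line format shown above; the function name and regular expressions are my own, and it makes no claim about what the individual numbers after "ADC TIMEOUT" mean.

```python
import re

# Matches lines like "[ 8877.438002] c1mcs: cycle 568 time 62; adcWait 0; ..."
CYCLE_RE = re.compile(
    r"\[\s*(?P<stamp>\d+\.\d+)\]\s+(?P<model>\w+): cycle (?P<cycle>\d+) time (?P<time>\d+);"
)
# Matches lines like "[ 8877.439003] c1mcs: ADC TIMEOUT 1 2405 37 2277"
TIMEOUT_RE = re.compile(r"\[\s*(?P<stamp>\d+\.\d+)\]\s+(?P<model>\w+): ADC TIMEOUT")


def summarize(dmesg_text, model):
    """Return ([(cycle, time), ...], timed_out) for one front end model."""
    long_cycles = []
    timed_out = False
    for line in dmesg_text.splitlines():
        match = CYCLE_RE.search(line)
        if match and match.group("model") == model:
            long_cycles.append((int(match.group("cycle")), int(match.group("time"))))
            continue
        match = TIMEOUT_RE.search(line)
        if match and match.group("model") == model:
            timed_out = True
    return long_cycles, timed_out
```

Feeding it the output of dmesg on c1sus with model set to "c1mcs" gives the list of reported long cycles and whether the model eventually hit an ADC timeout.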

However, this doesn't actually seem to be related to the amount of computation going on with the front end. I restarted the c1mcs model this morning by logging into the c1sus machine, and changing to the /opt/rtcds/caltech/c1/target/c1mcs/bin directory and running:

lsmod

sudo rmmod c1mcsfe

sudo insmod c1mcsfe.ko

The first line just lists the running modules. The second removes the c1mcs module, and the third starts it up again.

I proceeded to turn on all the filters and set all the matrix values while keeping an eye on the C1MCS-GDS_TP.adl screen and its timing indicator. It was running fine with all these turned on. Then I ran a dtt session from rosalba (going to /opt/apps/, typing bash, then source gds-env.bash, and finally diaggui) as well as a dataviewer and looked at 6 test point channels. It received data fine.

However, about 2 minutes after I had stopped doing this (although the dataviewer realtime session was still going) the USR timing jumped from about 20 microseconds to 35 microseconds, and the CPU Max timing jumped to the 50-60 microsecond range. At this point dmesg started reporting things like:

[54143.465613] c1mcs: cycle 1076 time 62; adcWait 0; write1 0; write2 0; longest write2 0

[54143.526004] c1mcs: cycle 2064 time 62; adcWait 0; write1 0; write2 0; longest write2 0

About a minute later the code gave up and reported an ADC timeout via dmesg. None of the other front ends seem to be affected.

I had to clear the test points manually after stopping dataviewer and dtt by going to rosalba, using the sourced gds-env.bash, and running diag -l. I then typed "tp clear 36 *" to clear all the test points on the model with FEC number 36 (corresponding to c1mcs).

I removed and restarted c1mcs again, trying to turn on a few things at a time and letting it run for a few minutes to see if I could narrow down whether it's one particular filter, perhaps reaching an underflow and starting to bog down the computations. However, after about 45 minutes of this, the model is still running and I've turned all the filters on and have been running about 8 test points with no problem, so the problem is clearly intermittent.

Quote:

Notes:
 c1mcs crashed many times during the investigation, and I had to kill and restart it again and again.
 It seemed to be easily crashed when filters are on, and so I couldn't check whether the damping servo is working correctly or not today.

Next work:

  - fix c1mcs (and maybe others)
  - check the damping servo by comparing the displacements of each 4 degrees of freedom when servo in off and on.

 

  3680   Fri Oct 8 15:57:32 2010 josephbUpdateCDSc1ioo status

I've been trying to figure out why the c1ioo machine crashes when I try to run the c1ioo front end.

I tried removing some RFM components from the c1ioo model, and then the c1gpt model (Kiwamu's green locking model) as an easier test case.  Both cause the machine to lock up once they start running.  Lastly, I tried running the c1x02 and c1sus models on the c1ioo machine instead of the c1sus machine, after first turning off all models on c1sus.  This led to the same lockup. 

Since those models run fine on the c1sus machine, I could only conclude that a recent change in the fe code generator or the Gentoo kernel and the Sun X4600 computer don't work together at the moment.

After talking to Alex, he got the idea to check whether monitor() and mwait() were supported on the c1ioo machine.  These function calls were added relatively recently, and are used to poll a memory location to see if something has been written there, and then do something when it has.  Apparently the Sun X4600 computers do not support these calls.  Alex has modified the code to not use these function calls, at least for now.  He'd like to add a check to the code so it does use those calls on machines that support them.
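
A quick way to see from user space whether a machine advertises these instructions is to look for the "monitor" flag in /proc/cpuinfo, which Linux sets on x86 when the CPU reports MONITOR/MWAIT support. The Python sketch below does that. It is not the check Alex intends to add to the front end code (that would live in the kernel module), and the function name is made up.

```python
def supports_monitor_mwait(cpuinfo_text):
    """Return True if the first 'flags' line of /proc/cpuinfo lists 'monitor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are space separated; match the whole word only.
            return "monitor" in value.split()
    return False
```

To use it, read /proc/cpuinfo on the machine in question and pass the contents in as a string.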

After this change, the c1ioo and c1gpt front end codes do in fact run.

  3681   Fri Oct 8 17:35:24 2010 josephbUpdateCDSstatus of c1ioo, c1sus and rfm

RFM is still not working.  I can see data on a filter just before it reaches the RFM sending code, but I see only zeros on the receiving side.

c1sus machine and c1x02, c1sus, c1mcs, c1rms are running.  At the moment, the c1mcs model is running at about 42 microseconds for USR time and 56 microseconds for CPU MAX, which is close to the 61 microsecond limit.  This is with MC filters on.  So far it has not been late, but it's not clear to me if it's going to stay that way.  So far I haven't been able to isolate why it sometimes slows down after a few minutes.  Also, it was running faster earlier in the day (around 30-ish microseconds) and I believe it has slowed down slightly in the last hour or two.

c1ioo machine and c1x03, c1ioo are running. However, it's not doing much good, as I can't get any data transferred from it to any of the optic suspensions. I need to spend some more time debugging this and then grab Alex, I think.

  3687   Mon Oct 11 10:49:03 2010 josephbUpdateCDSc1sus stability

Taking a look at the c1sus machine, it looks as if all of the front end codes it's running (c1sus - running BS, ITMX, ITMY, c1mcs - running MC1, MC2, MC3, and c1rms - running PRM and SRM) worked over the weekend.  As I see no

Running dmesg on c1sus reports a single long cycle on c1x02, where it took 17 microseconds (~15 microseconds is the maximum because the c1x02 IOP process is running at 64kHz).

Both the c1sus and c1mcs models are running at around 39-42 microseconds USR time and 44-50 microseconds CPU time.  It would run into problems at 60-62 microseconds.
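
The limits quoted here are just the cycle period, one over the sample rate. A short Python sketch of the arithmetic follows. It assumes that "64kHz" means 65536 Hz and that the suspension models run at 16384 Hz; the second rate is not stated above and is inferred only because it reproduces the ~61 microsecond limit.

```python
IOP_RATE_HZ = 65536    # assumption: "64kHz" is 2**16 samples per second
MODEL_RATE_HZ = 16384  # assumption: 2**14, chosen because it gives the ~61 us limit


def cycle_budget_us(rate_hz):
    """Time available per cycle, in microseconds, for a model running at rate_hz."""
    return 1e6 / rate_hz


def load_fraction(cpu_time_us, rate_hz):
    """Fraction of the cycle budget used by a measured CPU time."""
    return cpu_time_us / cycle_budget_us(rate_hz)


iop_budget = cycle_budget_us(IOP_RATE_HZ)      # about 15.26 microseconds
model_budget = cycle_budget_us(MODEL_RATE_HZ)  # about 61.04 microseconds
```

So a 17 microsecond cycle on the IOP is over its budget, while a 50 microsecond CPU time on c1sus or c1mcs uses roughly 80% of what is available.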

Looking at the filters that are turned on, it looks as if these models were running with only a single optic's worth of filters turned on via the medm screens.  I.e. the MC2 and ITMY filters were properly set, but not the others.

The c1rms model is running at around 10 microseconds USR time and 14-18 microseconds CPU time.  However it apparently had no filters on.

It looks as if no test points were used this weekend.  We'll turn on the rest of the filters and see if we start seeing crashes of the front end again.

Edit:

The filters for all the suspensions have been turned on, and all matrix elements entered.  The USR and CPU times have not appreciably changed.  No long cycles have been reported through dmesg on c1sus at this time.  I'm going to let it run and see if it runs into problems.

  3714   Thu Oct 14 10:29:33 2010 josephbUpdateComputersMafalda ready for NDS2, updated Rosalba, Rossa repos

At John Zweizig's request, I installed a couple of packages from the lscsoft repository, along with libtools, automake, autoconf, and several kerberos packages, including cyrus-sasl, cyrus-sasl-lib, cyrus-sasl-devel, cyrus-sasl-gssapi, and krb5-libs.  All it needs now is John to come down and install the NDS2 server.

I copied the lscsoft.repo file from /etc/yum.conf.d/ on allegra to mafalda, as well as rosalba and rossa, just for completeness.  I also added the epel repository to rossa and installed the yum-priorities package and set the priorities on rossa's repositories. 

  3715   Thu Oct 14 10:59:10 2010 josephbUpdateComputersUpdated cshrc.40m and Computer Restart Procedures

I started modifying the cshrc.40m file in /cvs/cds/caltech/ so that it starts pointing at the new directories.

# misc aliases
alias target 'cd /opt/rtcds/caltech/c1/target'
alias scripts 'cd /opt/rtcds/caltech/c1/scripts'
alias chans 'cd /opt/rtcds/caltech/c1/chans'
alias c 'cd /opt/rtcds/caltech/c1'
alias s 'cd /opt/rtcds/caltech/c1/scripts'
alias u 'cd /cvs/cds/caltech/users'

I also added "alias screens 'cd /opt/rtcds/caltech/c1/medm'" as a quick way to get to the medm directory.

Once we get multiple compiled versions (i.e. i386, x86_64, Solaris) of the new gds tools from Alex, we'll have to do some more serious surgery on this file.

I removed the "New Computer Restart Procedures" and simply moved the changes into the "Computer Restart procedures", found here.  I've removed everything I don't think applies anymore (all the VME FE reboot procedures for example).

  3716   Thu Oct 14 12:45:47 2010 josephbUpdateComputersNDS 2 server installation on mafalda

[Joe, John Zweizig]

John stopped by around noon today to install the NDS 2 server.  He installed it in /cvs/cds/caltech/users/jzweizig/nds2-server/.

Once John is done, I will be moving this to a more sensible install location that is not his user directory, but it's there for the moment.

We had to install a couple more packages including bzip2, bzip2-devel, gcc-c++, openssl, and openssl-devel. 

We mounted the /frames directory from the fb machine to mafalda by modifying the /etc/fstab file with the line:

fb:/frames              /frames                 nfs     bg,ro           0 0

 

If we change the channels recorded by the frame builder, we need to update a channel list file for the NDS 2 server.  There's an executable located at:

/cvs/cds/caltech/users/jzweizig/nds2-server/bin/buildChannelList

This builds the channel list file if given a .gwf file.  These are written by the frame builder, and can be found in /frames/full/####, where #### are the first 4 GPS digits of the frame files contained in that directory.

Upon questioning him about what happens when we get to GPS time 1000000000, he said there are some updates he needs to do so that it throws away the last 5 digits, rather than keeping the first 4.
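
The difference between the two rules only shows up once the GPS time gains a tenth digit. A tiny Python sketch (function names made up for illustration, not taken from the NDS 2 code):

```python
def dir_first_four(gps_seconds):
    """Directory name made from the first 4 digits of the GPS time."""
    return str(gps_seconds)[:4]


def dir_drop_last_five(gps_seconds):
    """Directory name made by discarding the last 5 digits of the GPS time."""
    return str(gps_seconds // 100000)
```

For 971119728 both rules give 9711. For 1000000000 the first rule gives 1000, while dropping the last 5 digits gives 10000.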

An example command run on the fb or mafalda machine is:

/cvs/cds/caltech/users/jzweizig/nds2-server/bin/buildChannelList  /frames/full/9711/C-R-971119728-16.gwf > nds2-mafalda/C-R-ChanList.txt

For a seconds trends file (located in /frames/trends/second/ instead of /frames/full)

/cvs/cds/caltech/users/jzweizig/nds2-server/bin/buildChannelList  /frames/trends/second/9711/C-T-971106780-60.gwf > nds2-mafalda/C-T-ChanList.txt

For a minute trends file (located in /frames/trends/minute)

/cvs/cds/caltech/users/jzweizig/nds2-server/bin/buildChannelList  /frames/trends/minute/9711/C-T-971106780-60.gwf  > nds2-mafalda/C-M-ChanList.txt

In these cases, John was putting the lists in the /cvs/cds/caltech/users/jzweizig/nds2-mafalda/ directory.

 

Both the  C-raw-cache.txt file and the nds2.conf files need to be configured to point at the correct files in the nds2-mafalda directory.

 

  3720   Thu Oct 14 14:09:30 2010 josephbUpdateCDSNDS 2 server status

[Joe, John]

The nds 2 server is about 50% of the way there.  You can connect to it and get channel lists, but it's having issues actually serving the data.  The errors we're getting basically say it can't find the source data for the channel.

John had to go get lunch at 2pm, but he said he'd log in remotely later and try to configure it properly.

  3728   Fri Oct 15 15:32:35 2010 josephbUpdateCDSslow DAQ channels added

I added all the _OUT16, _INMON, _EXCMON channels associated with the BS, ITMX, and ITMY channels in SUS_SLOW.ini in the /opt/rtcds/caltech/c1/chans/daq directory.  Similarly, I added the channels associated with MC1, MC2, and MC3 to MCS_SLOW.ini and those associated with PRM and the SRM to RMS_SLOW.ini. 

Lines pointing to these files were then added to the /opt/rtcds/caltech/c1/target/fb/master file and the frame builder restarted.  It took about 4 tries before the frame builder stayed up using ./daqd -c ./daqdrc.

I generated the .ini files with a python script.  It's at /opt/rtcds/caltech/c1/scripts/daqscripts/create_inmon_out16_daq_ini.py. It checks C0EDCU.ini to find which slow channels already exist, then goes through a medm directory given at the command line looking for all _OUT16, _INMON, and _EXCMON channels, adds them if they aren't already accounted for, and writes out the new file to the location of your choice.
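
The core of that logic looks roughly like the Python sketch below. This is not the actual script. It assumes that the .ini file lists each channel as a bracketed section header such as [C1:SUS-BS_POS_OUT16] and that channel names appear as plain text in the medm .adl files; the function names and the example channel names in the comments are made up.

```python
import re

SUFFIXES = ("_OUT16", "_INMON", "_EXCMON")
# Anything that looks like a C1 channel name inside a screen file.
CHANNEL_RE = re.compile(r"C1:[A-Za-z0-9_\-]+")
# Section headers in the .ini file, e.g. "[C1:SUS-BS_POS_OUT16]".
SECTION_RE = re.compile(r"^\[(C1:[^\]]+)\]", re.MULTILINE)


def existing_channels(ini_text):
    """Return the set of channel names already listed in an .ini file."""
    return set(SECTION_RE.findall(ini_text))


def new_slow_channels(medm_text, already_recorded):
    """Return sorted slow channel names found in a screen file but not yet recorded."""
    found = {name for name in CHANNEL_RE.findall(medm_text) if name.endswith(SUFFIXES)}
    return sorted(found - set(already_recorded))
```

The real script loops over every file in the medm directory and then writes the new entries out; this sketch only shows the matching and de-duplication step.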

  3793   Wed Oct 27 10:53:03 2010 josephbConfigurationComputersWhy doesn't DTT work?!?

Test points for the SUS channels should be there.  They have been working previously this week.  Possible points of failure include awgtpman, mx_streams, and the fb itself.  I'll look into that.

As far as other fast channels go, there are no fast front ends running other than the suspension ones we have.  Until additional channels get connected to the front ends and the models updated, those are the channels we have available.  However, I am working on getting c1ioo up and running, and we can try connecting some PEM channels today to the c1sus front end's 4th ADC.

 

Edit:

I tried starting a fresh instance of the frame builder, but when I brought the old copy down, it left a pair of zombie or dead mx_stream processes running on c1sus . Basically c1mcs and c1rms were still running, while c1x02 and c1sus came down.  I tried to kill the processes but this caused the c1sus machine to crash.  In the past I've killed left over mx_stream processes running after the frame builder has gone down, but I've never seen them crash the computer.  I'm unsure why this happened since we haven't done any updates of the code, just updated models and daq configuration files.

Quote:

DTT has only SUS and "X02" channels under C1 in the drop down channel selection menu.  Basically, we can't measure any fast channels with DTT.  I keep getting the error: "Unable to select testpoints."  Sadface.

Similar things are true for DataViewer.  The same limited number of fast channels, and no data found:

Server error 13: no data found
datasrv: DataWriteRealtime failed: daq_send: Illegal seek

Is this a framebuilder problem?  Is this something that the CDS team has on the to-do list?

 
