  40m Log, Page 45 of 344
ID Date Author Type Category Subject
  7189   Wed Aug 15 10:40:16 2012   Den   Update   CDS   aa filters

The lack of an AA filter for the MCL signal in the RFM model strongly disturbs the signal entering the OAF.

aa.png

  7193   Wed Aug 15 13:24:12 2012   Den   Update   CDS   RFM -> OAF

Transmission of signals between RFM and OAF is bad again. This time we do not see any errors in the IPC_ERR monitors, so the models think that they are getting all the data, but the data is wrong.

oaf.png

  7197   Wed Aug 15 17:23:22 2012   jamie   Update   CDS   front end IOP models changed to reflect actual physical hardware

As Rolf pointed out when he was here yesterday, all of our IOPs are filled with parts for ADCs and DACs that don't actually exist in the system.  This was causing needless module error messages and IOP GDS screens that were full of red indicators.  All the IOP models were identically stuffed with 9 ADC parts, 8 DAC parts, and 4 BO parts, even though none of the actual front end IO chassis had physical configurations even remotely like that.  This was probably not causing any particular malfunctions, but it's not right nonetheless.

I went through each IOP, c1x0{1-5}, and changed them to reflect the actual physical hardware in those systems.  I have committed these changes to the svn, but I haven't rebuilt the models yet.  I'll need to be able to restart all models to test the changes, so I'm going to wait until we have a quiet time, probably next week.

  7230   Sun Aug 19 19:02:47 2012   Den   Update   CDS   PEM -> RFM -> OAF

Data from PEM now goes directly to OAF without using RFM. The RFM -> OAF transmission errors are gone, as RFM now has to read 30 fewer channels.

The kernel "protection error" occurred again, as before with the PEM model, so the OAF model could not start. I changed the optimization flag to -O2, and this fixed the problem.

  7237   Mon Aug 20 23:31:49 2012   Jenne   Update   CDS   Note to self - fast PSL chans

Rana points out that we haven't had fast channels for PMC (trans, refl, pzt), input laser things, more FSS things since the upgrade.  Bad.

  7279   Sun Aug 26 21:47:50 2012   Koji   Update   CDS   C1LSC ooze

I came in to the lab in the evening and found c1lsc had "red" for FB connection.
I restarted the c1lsc models, and it hung the machine every time.

I decided to kill all of the models during the startup sequence right after the reboot,
and then ran only c1x04 and c1lsc. It seems that c1oaf was the cause, but it wasn't clear.

  7287   Mon Aug 27 17:14:00 2012   jamie   Update   CDS   c1oaf problem

Quote:

I came in to the lab in the evening and found c1lsc had "red" for FB connection.
I restarted the c1lsc models, and it hung the machine every time.

I decided to kill all of the models during the startup sequence right after the reboot,
and then ran only c1x04 and c1lsc. It seems that c1oaf was the cause, but it wasn't clear.

The "red for FB connection" issue was probably a dead mx_stream on c1lsc.  That can usually be fixed by just restarting mx_stream.

There is definitely a problem with c1oaf, though.  It crashes immediately after attempting to start.  The kernel log for a crash is included below.

We will leave c1oaf off until we have time to debug.

[83752.505720] c1oaf: Send Computer Number  = 0
[83752.505720] c1oaf: entering the loop
[83752.505720] c1oaf: waiting to sync 19520
[83753.207372] c1oaf: Synched 701492
[83753.207372] general protection fault: 0000 [#2] SMP 
[83753.207372] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:2e:01.0/class
[83753.207372] CPU 4 
[83753.207372] Modules linked in: c1oaf c1ass c1sup c1lsp c1cal c1lsc c1x04 open_mx dis_irm dis_dx dis_kosif mbuf [last unloaded: c1oaf]
[83753.207372] 
[83753.207372] Pid: 0, comm: swapper Tainted: G      D    2.6.34.1 #5 X7DWU/X7DWU
[83753.207372] RIP: 0010:[<ffffffffa1bf7567>]  [<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]
[83753.207372] RSP: 0000:ffff88023ecc1aa8  EFLAGS: 00010092
[83753.207372] RAX: ffff88023ecc1af8 RBX: ffff88023ecc1ae8 RCX: ffffffffa1c35e48
[83753.207372] RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffffffffa1c21360
[83753.207372] RBP: ffff88023ecc1bb8 R08: 0000000000000000 R09: 0000000000175f60
[83753.207372] R10: 0000000000000000 R11: ffffffffa1c2a640 R12: ffff88023ecc1b38
[83753.207372] R13: ffffffffa1c2a640 R14: 0000000000007fff R15: 0000000000000000
[83753.207372] FS:  0000000000000000(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
[83753.207372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83753.207372] CR2: 000000000378a040 CR3: 0000000001a09000 CR4: 00000000000406e0
[83753.207372] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[83753.207372] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[83753.207372] Process swapper (pid: 0, threadinfo ffff88023ecc0000, task ffff88023ec7eae0)
[83753.207372] Stack:
[83753.207372]  ffff88023ecc1ab8 0000000000000096 0000000000000019 ffff88023ecc1b18
[83753.207372] <0> 0000000000014729 0000000000032a0c ffff880001e12d90 000000000000000a
[83753.207372] <0> ffff88023ecc1bb8 ffffffffa1c06cad ffff88023ecc1be8 000000000000000f
[83753.207372] Call Trace:
[83753.207372]  [<ffffffffa1c06cad>] ? filterModuleD+0xd6d/0xe40 [c1oaf]
[83753.207372]  [<ffffffffa1c07ae3>] feCode+0xd63/0x129b0 [c1oaf]
[83753.207372]  [<ffffffffa1c00dc6>] ? T.2888+0x1966/0x1f10 [c1oaf]
[83753.207372]  [<ffffffffa1c1b3bf>] fe_start+0x1c8f/0x3060 [c1oaf]
[83753.207372]  [<ffffffff8102ce57>] ? select_task_rq_fair+0x2c8/0x821
[83753.207372]  [<ffffffff8104cd8b>] ? enqueue_hrtimer+0x65/0x72
[83753.207372]  [<ffffffff8104d8f6>] ? __hrtimer_start_range_ns+0x2d6/0x2e8
[83753.207372]  [<ffffffff8104d91b>] ? hrtimer_start+0x13/0x15
[83753.207372]  [<ffffffff810173df>] play_dead_common+0x6e/0x70
[83753.207372]  [<ffffffff810173ea>] native_play_dead+0x9/0x20
[83753.207372]  [<ffffffff81001c38>] cpu_idle+0x46/0x8d
[83753.207372]  [<ffffffff814ec523>] start_secondary+0x192/0x196
[83753.207372] Code: 1f 44 00 00 55 66 0f 57 c0 48 89 e5 41 57 41 56 41 55 41 54 53 48 8d 9d 30 ff ff ff 48 8d 43 10 4c 8d 63 50 48 81 ec e8 00 00 00 <66> 0f 29 85 30 ff ff ff 48 89 85 18 ff ff ff 31 c0 48 8d 53 78 
[83753.207372] RIP  [<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]
[83753.207372]  RSP <ffff88023ecc1aa8>
[83753.207372] ---[ end trace df3ef089d7e64971 ]---
[83753.207372] Kernel panic - not syncing: Attempted to kill the idle task!
[83753.207372] Pid: 0, comm: swapper Tainted: G      D    2.6.34.1 #5
[83753.207372] Call Trace:
[83753.207372]  [<ffffffff814ef6f4>] panic+0x73/0xe8
[83753.207372]  [<ffffffff81063c19>] ? crash_kexec+0xef/0xf9
[83753.207372]  [<ffffffff8103a386>] do_exit+0x6d/0x712
[83753.207372]  [<ffffffff81037311>] ? spin_unlock_irqrestore+0x9/0xb
[83753.207372]  [<ffffffff81037f1b>] ? kmsg_dump+0x115/0x12f
[83753.207372]  [<ffffffff81006583>] oops_end+0xb1/0xb9
[83753.207372]  [<ffffffff8100674e>] die+0x55/0x5e
[83753.207372]  [<ffffffff81004496>] do_general_protection+0x12a/0x132
[83753.207372]  [<ffffffff814f17af>] general_protection+0x1f/0x30
[83753.207372]  [<ffffffffa1bf7567>] ? T.2870+0x27/0xbf0 [c1oaf]
[83753.207372]  [<ffffffffa1c06cad>] ? filterModuleD+0xd6d/0xe40 [c1oaf]
[83753.207372]  [<ffffffffa1c07ae3>] feCode+0xd63/0x129b0 [c1oaf]
[83753.207372]  [<ffffffffa1c00dc6>] ? T.2888+0x1966/0x1f10 [c1oaf]
[83753.207372]  [<ffffffffa1c1b3bf>] fe_start+0x1c8f/0x3060 [c1oaf]
[83753.207372]  [<ffffffff8102ce57>] ? select_task_rq_fair+0x2c8/0x821
[83753.207372]  [<ffffffff8104cd8b>] ? enqueue_hrtimer+0x65/0x72
[83753.207372]  [<ffffffff8104d8f6>] ? __hrtimer_start_range_ns+0x2d6/0x2e8
[83753.207372]  [<ffffffff8104d91b>] ? hrtimer_start+0x13/0x15
[83753.207372]  [<ffffffff810173df>] play_dead_common+0x6e/0x70
[83753.207372]  [<ffffffff810173ea>] native_play_dead+0x9/0x20
[83753.207372]  [<ffffffff81001c38>] cpu_idle+0x46/0x8d
[83753.207372]  [<ffffffff814ec523>] start_secondary+0x192/0x196
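
A crash log like the one above can be triaged quickly by pulling the faulting symbol and module out of the RIP line. The following is a minimal Python sketch, assuming the oops text is available as a string. The function name `faulting_symbol` and the regular expression are illustrative only, and cover just the 2.6.34-style RIP format shown in this log.

```python
import re

# Matches lines such as:
#   RIP: 0010:[<ffffffffa1bf7567>]  [<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]
# Assumption: the 2.6.34-style oops format shown in the log above.
RIP_RE = re.compile(
    r"RIP: \S+\s+\[<[0-9a-f]+>\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]"
)


def faulting_symbol(oops_text):
    """Return (symbol, module) from the first RIP line, or None if absent."""
    match = RIP_RE.search(oops_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = (
    "[83753.207372] RIP: 0010:[<ffffffffa1bf7567>]  "
    "[<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]"
)
print(faulting_symbol(sample))
```

For the log above this yields the symbol T.2870 in the c1oaf module, which is consistent with the fault being raised inside the c1oaf code itself.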

  7457   Mon Oct 1 16:05:01 2012   jamie   Update   CDS   mx stream restart required on all front ends

For some reason the frame builder and mx stream processes on ALL front ends were down.  I restarted the frame builder and all the mx_stream processes and everything seems to be back to normal.  Unclear what caused this.  The CDS guys are aware of the issue with the mx_stream stability and are working on it.

  7477   Thu Oct 4 14:04:21 2012   jamie   Update   CDS   front ends back up

All the front end machines are back up after the outage.  It looks like none of the front end machines came back up once power was restored, and they all needed to be powered manually.  One of the things I want to do in the next CDS upgrade is put all the front end computers in one rack, so we can control their power remotely.

c1sus was the only one that had a little trouble.  Its timing was for some reason not syncing with the frame builder.  Unclear why, but after restarting the models a couple of times things came back.

There's still a little red, but it mostly has to do with the fact that c1oaf is busted and not running (it actually crashes the machine when I try to start it, so this needs to be fixed!).

  7529   Thu Oct 11 11:57:40 2012   jamie   Update   CDS   all IOP models rebuilt, installed, restarted to reflect fixed ADC/DAC layouts

Quote:

As Rolf pointed out when he was here yesterday, all of our IOPs are filled with parts for ADCs and DACs that don't actually exist in the system.  This was causing needless module error messages and IOP GDS screens that were full of red indicators.  All the IOP models were identically stuffed with 9 ADC parts, 8 DAC parts, and 4 BO parts, even though none of the actual front end IO chassis had physical configurations even remotely like that.  This was probably not causing any particular malfunctions, but it's not right nonetheless.

I went through each IOP, c1x0{1-5}, and changed them to reflect the actual physical hardware in those systems.  I have committed these changes to the svn, but I haven't rebuilt the models yet.  I'll need to be able to restart all models to test the changes, so I'm going to wait until we have a quiet time, probably next week.

I finally got around to rebuilding, installing, and restarting all the IOP models.  Everything went smoothly.  I had to restart all the models on all the screens, but everything seemed to come back up fine.  We now have many fewer dmesg error messages, and the GDS_TP screens are cleaner and don't have a bunch of needless red.

A frame builder restart was also required, due to name changes in unused (but unfortunately still needed) channels in the IOP.

  7580   Fri Oct 19 12:45:12 2012   Den   Update   CDS   c1lsc is up after reboot

  7713   Wed Nov 14 21:59:09 2012   Den   Update   CDS   daq errors

I tried to add a test point to the C1MCS model and spent the next two hours rebooting front-ends, restarting models and realigning the MC.

dmesg told me that the DAQ channels cannot be allocated because they already exist. The last time we hit this problem, Jamie emailed Alex about it. Jamie, what was the outcome? Restarting the IOP model does not help this time.

  8028   Thu Feb 7 19:25:22 2013   yuta   Update   CDS   C1ALS filters reloaded

Filters for C1ALS were all gone, so I copied /opt/rtcds/caltech/c1/chans/C1GCV.txt and renamed it as C1ALS.txt.

I also fixed links in the medm screens C1ALS.adl and C1ALS_COMPACT.adl.
I'm not sure what happened to the C1SC{X,Y} screens.

Quote:

I decided to rename the c1gcv model to be c1als.  This is in an ongoing effort to rename all the ALS stuff as ALS, and get rid of the various GC{V,X,Y} named stuff.

(...snip...)

The above has been done.  Still todo:

  • FIX SCRIPTS!  There are almost certainly scripts that point to GC{V,X,Y} channels.  Those will have to be fixed as we come across them.
  • Fix the c1sc{x,y}/master/C1SC{X,Y}_GC{X,Y}_SLOW.adl screens.  I need to figure out a more consistent place for those screens.
  • Fix the C1ALS_COMPACT screen
  • ???

 

 

  8109   Tue Feb 19 15:10:02 2013   Jamie   Update   CDS   c1iscex alive again

c1iscex is back up.  It is communicating with its IO chassis, and all of its models (c1x01, c1scx, c1spx) are running again.

The problem was that the IO chassis had no connection to the computer.  The One Stop card in the IO chassis, which is the PCIe bridge between the front-end machine and the IO chassis, was showing four red lights instead of the dozen or so green lights that it usually shows.  Upon closer inspection, the card appeared to be complaining that it had no connection to the host card in the front-end machine.  Un-illuminated lights on the host card seemed to be pointing to the same thing.

There are two connector slots on the expansion card, presumably for a daisy chain situation.  Looking at other IO chassis in the lab I determined that the cable from the front-end machine was plugged into the wrong slot in the One Stop card.  wtf.

Did someone unplug the cable connecting c1iscex to its IO chassis, and then replug it into the wrong slot?  A human must have done this.

  8126   Thu Feb 21 12:56:38 2013   Jenne   Update   CDS   c1iscex dead again

c1iscex is dead again.  Red lights, no "breathing" on the FE status screen.

  8128   Thu Feb 21 14:32:02 2013   Jamie   Update   CDS   c1iscex models restarted

Quote:

c1iscex is dead again.  Red lights, no "breathing" on the FE status screen.

The c1iscex machine itself wasn't dead, the models were just not running.  Here are the last messages in dmesg:

[130432.926002] c1spx: ADC TIMEOUT 0 7060 20 7124
[130432.926002] c1scx: ADC TIMEOUT 0 7060 20 7124
[130433.941008] c1x01: timeout 0 1000000 
[130433.941008] c1x01: exiting from fe_code()

I'm guessing maybe the timing signal was lost, so the ADC stopped clocking.  Since the ADC clock is the everything clock, all the "fe" code (i.e. models) aborted.  Not sure what would have caused it.

I restarted all the models ("rtcds restart all") and everything came up fine.  Obviously we should keep our eyes on things, and note whether anything strange is happening if this happens again.
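
If this happens again, a quick scan of dmesg shows which models saw the ADC stop clocking and which ones exited. This is a minimal Python sketch, assuming the message formats quoted above; the function name `summarize_fe_aborts` and the regular expression are hypothetical helpers, not part of the RCG tools.

```python
import re

# Assumption: dmesg lines look like
#   "[130432.926002] c1spx: ADC TIMEOUT 0 7060 20 7124"
LINE_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\w+):\s+(.*)$")


def summarize_fe_aborts(dmesg_text):
    """Return (models reporting ADC TIMEOUT, models that exited fe_code)."""
    timed_out, exited = set(), set()
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        model, message = match.groups()
        if "ADC TIMEOUT" in message:
            timed_out.add(model)
        if "exiting from fe_code()" in message:
            exited.add(model)
    return sorted(timed_out), sorted(exited)


sample = """\
[130432.926002] c1spx: ADC TIMEOUT 0 7060 20 7124
[130432.926002] c1scx: ADC TIMEOUT 0 7060 20 7124
[130433.941008] c1x01: timeout 0 1000000 
[130433.941008] c1x01: exiting from fe_code()
"""
print(summarize_fe_aborts(sample))
```

On the four lines quoted above, the user models c1scx and c1spx report the ADC timeout and the IOP c1x01 is the one that exits.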

  8299   Fri Mar 15 02:14:27 2013   Jenne   Update   CDS   Simulink linking to wrong library part

Jamie and I discovered a problem with Matlab/Simulink earlier today. 

In the end suspension models, there is a subblock (with top_names) for ALS stuff.  Inside there, we use a library part called "ALS_END".  When the model was created, it included the part ...../userapps/release/isc/c1/models/ALS_END.mdl .  However, if you open up the c1scy diagram and look in the ALS block for this part, you see the part that is in ..../userapps/release/isc/common/models/ALS_END.mdl .  Note the difference - the one we want is in the c1 directory, while the one that was created (by Jamie) for the LHO One Arm Test is in the common directory. 

If you compile the c1scy model, the RCG is using the correct library part, so the information regarding which part we want is still in there. 

However, if you delete the ALS_END part from the model, put the correct one in, save, close, then reopen the model, it once again displays the wrong model.  The right click "go to library part" option brings you to the library part that is displayed, which is currently the wrong one.  THIS IS BAD, since we could start modifying the wrong things.  You do get a warning by Matlab about the file being "shadowed", so we should take heed when we see that warning, and make sure we are getting the file we want.

We are currently running Matlab version 7.11.0.584, which is r2010b.  Step 1 will be to update Matlab to the latest version, in hopes that this fixes things.  We also should change the name of our c1 part, so that it does not have the same name as the one for the sites.  This is not a great solution since we can't guarantee that we will never choose the same names as other sites, but it will at least fix this one case.  Again, if you see the warning about "shadowed" filenames, pay attention.

  8431   Tue Apr 9 14:55:13 2013   Jamie   Update   CDS   overbooked test points cause of DAQ problems

Folks were complaining that they were getting zeros whenever they tried to open fast channels in DTT or Dataviewer.  It turned out that the problem was that all available test points were in use in the c1lsc model:

lsc-gds.png

There is a limit to how many test points can be open to a single model (in point of fact I think the limit is on the data rate from the model to the frame builder, not the actual number of open test points).  In any event, they were all used up.  The grid at the bottom right of the C1LSC GDS screen was all full of non-zeros, and the FE TRATE number was red, indicating that the data rate from this model had surpassed threshold.

The result of this overbooking is that any new test points just get zeros.  This is a pretty dumb failure mode (ideally one would not be able to request the TP at all with an appropriate error message), but it is what it is.  This usually means that there are too many dtt/dataviewers left with open connections.

We tried killing all the open processes that we could find that might be holding open test points, but that didn't seem to clear them up.  Stuck open test points are another known problem.  Referencing the solution in #6968 I opened the diag shell and killed all test points everywhere:

controls@pianosa:~ 0$ diag -l -z
Set new test FFT
NDS version = 12
supported capabilities: testing  testpoints  awg  
diag> tp clear * *
test point cleared
diag> quit
EXIT KERNEL
controls@pianosa:~ 0$

  8468   Mon Apr 22 11:26:25 2013   Koji   Configuration   CDS   some RT processes restarted

When I came to the 40m, I found that most of the FB signals were dead.

The suspensions were not damped, but not too excited. I used the watchdog switches to cut off the coil actuators.

Restarted mxstream from the CDS_FE_STATUS screen. The c1lsc processes became fine, but the FB indicators for c1sus, c1ioo, and c1iscex/y were still red.

SSHed into c1sus/c1ioo and ran rtcds restart all. This brought them back under control.

Same treatment for c1iscex and c1iscey. This made c1sus stall again. Also, c1iscey did not come back.

At this point I decided to kill all of the RT processes on c1sus/c1ioo/c1iscex/c1iscey to avoid interference between them,
and started restarting from the end machines.

c1iscey did not come back with rtcds restart all.
Ran lsmod on c1iscey and found that c1x05 persisted in the kernel. rmmod did not remove the c1x05 module.
Ran a software reboot of c1iscey. => c1iscey came back online.

c1iscex did not come back with rtcds restart all.
Ran a software reboot of c1iscex. => c1iscex came back online.

c1ioo just came back with rtcds restart all.

c1sus did not come back with rtcds restart all.
Ran a software reboot of c1sus. => c1sus came back online.

This series of restarts screwed up the FB connections of some of the c1lsc processes.
Ran the following restart commands => all of the processes are running with FB connection.
rtcds restart c1sup
rtcds restart c1ass
rtcds restart c1lsc

Enabled the damping loops by reverting the watchdog switches.

All of the FE status indicators are green except for the c1rfm bit 2 (GE FANUC RFM CARD 0).

  8483   Wed Apr 24 14:20:49 2013   Koji   Update   CDS   FE Web view not updated?

The FE web view does not seem to be up to date (maybe for a year?).

https://nodus.ligo.caltech.edu:30889/FE/c1mcs_slwebview_files/index.html

  8547   Tue May 7 23:03:12 2013   Koji   Configuration   CDS   CDS work

Summary:

c1rfm / c1lsc / c1ass / c1sus were modified. They were recompiled and installed. They are running fine,
and PRMI locking (attempt), arm locking, and Yarm ASS were confirmed with the new code.

Motivation:

1a. The SQRTing switch for POP110 was wrong: 0 enabled sqrting, 1 disabled sqrting. I wanted to fix this.
1b. Sqrting for POP22 was not implemented.

2. Preparation for the shadow sensor control with POPDC.

3. ASS had only one input. I want to run two ASS for the X and Y arms.

SQRTing for POP110/22:

- Flipped the input of the bypass switch. The corresponding MEDM indicators were fixed on the power normalization screen.
- Copied the sqrting structure from POP110 to POP22. The corresponding MEDM button was made on the power normalization screen.

- The function of the sqrting buttons was confirmed.

Additional ASS output:

- The output path "NPRO" was removed. The corresponding RFM channels have also been removed.
- The previous NPRO path was turned into the "ASS1" path. The previous "ASS" path was turned into "ASS2".
- The corresponding shared memory channels were created/renamed.
- c1ass was modified to receive the new ASS shared memory channels. ASS1 is assigned to the X arm. ASS2 is assigned to the Y arm.
- The output matrix screen and the lockin screen were modified accordingly.
- Only script/ASS/Arm_ASS_Setup.py was affected. The corresponding lines (matrix assignment) were fixed.

- The function of Den's version of ASS was confirmed.

LSC->PRM ASC path

- We want to connect POPDC to PRM ASC. POPDC is acquired on c1lsc.
- So, for now we use the LSC input matrix to assign POPDC to one of the servo banks.
- The last row of the LSC output matrix was assigned to the PCIE connection to c1sus.
- This PCIE connection was connected to the PRM ASC YAW input.

- The connection between LSC and SUS was confirmed.

- During this process I found that there are a bunch of channels transferred from LSC to SUS via RFM.
  These channels are transferred via PCIE (dolphin) and then via RFM. But LSC and SUS are connected
  with dolphin, so this just adds sampling delay with no benefit. I think we should remove the RFM part.
  Note that we need to use RFM for the end mirrors, but this also should use only the RFM connection.


Rebuilding the codes

- Prior to the tests of the new functionalities, the codes were rebuilt/installed as usual.
- The suspensions were shut down with the watchdogs before the restart of the realtime codes.
- Once the realtime codes were restarted successfully, the watchdogs were reloaded.
- As we removed/added channels, fb was restarted.
- The c1rfm / c1lsc / c1ass / c1sus codes were checked in to svn.

  8548   Wed May 8 16:10:09 2013   Jamie   Update   CDS   Unknown DAQ channels in c1sus c1x02 IOP?

Someone for some reason added full-rate DAQ specification to some ADC3 channels in the c1sus IOP model (c1x02):

#DAQ Channels

TP_CH15 65536
TP_CH16 65536
TP_CH17 65536
TP_CH18 65536
TP_CH19 65536
TP_CH20 65536
TP_CH21 65536

These appear to be associated with c1pem, so I'm guessing it was Den (particularly since he's the worst about making modifications to models and not telling anyone or logging or svn committing).

I'm removing them.

  8549   Wed May 8 17:03:35 2013   Jamie   Configuration   CDS   make direct IPC connections between c1lsc and c1sus/c1mcs

Previously, for some reason, many IPC connections were routed through the c1rfm model, even if a direct IPC connection was possible.  It's unclear why this was done.  I spoke to Joe B. about it and he couldn't remember either.  Best guess is that it was just for book keeping purposes.  Or maybe some old timing issue that has been fixed by DMA fixes in the RTS.  So the point is that it's no longer needed, and we can reduce delays by making direct connections.

I made direct IPC connections from c1lsc to both c1sus and c1mcs, bypassing the c1rfm, through which they had previously been routed.  All models were rebuilt/installed/restarted and everything seems to be working fine.

  8550   Wed May 8 17:23:04 2013   Jamie   Configuration   CDS   fixed direct IPC connection between c1als and c1mcs

As with the previous post, I eliminated an unnecessary hop through c1rfm for the c1als --> c1mcs connection for the ALS output to MC2 POS.

As a side note, we might consider piping the ALS signals into the LSC input matrix, elevating them to actual LSC error signals, which in some sense they are.  It's just that right now we're routing them directly to the actuators without going through the full LSC control.

  8551   Wed May 8 17:45:49 2013   Jamie   Configuration   CDS   More bypassing c1rfm for c1mcs --> c1ioo IPCs

As with the last two posts, I eliminated more unnecessary passing through c1rfm for IPC connections between c1mcs and c1ioo.

All models were rebuilt/installed/restarted and svn committed.  Everything is working and we have eliminated almost all IPC errors and significantly simplified things.

  8580   Wed May 15 17:17:05 2013   Jamie   Summary   CDS   Accounting of ADC/DAC channel availability

We need ADC and DAC channels for a couple of things:

  • POP QPD: 3x ADC
  • ALS PZTs: 3x 2x 2x DAC (three pairs of PZTs, at ends and vertex, each with two channels for pitch and yaw)
  • Fibox: 1x DAC

What's being used:

  • c1iscex/c1iscey:
    • DAC_0:   7/16 = 9 free
    • ADC_0: 17/32 = 15 free
  • c1sus:
    • DAC: ?
    • ADC: ?
  • c1ioo
    • DAC_0:   0/16 = 16 free ?? This one is weird. DAC in IO chassis, half its channels connected to cross connect (going ???), but no model links to it
    • ADC_0: 23/32 = 9 free
    • ADC_1:  8/32 = 24 free
  • c1lsc
    • DAC_0: 16/16 = 0 free
    • ADC_0: 32/32 = 0 free
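
The free-channel arithmetic above can be kept in a small table so that it is easy to redo when the usage changes. This is only a sketch in Python: the dictionary layout and the name `free_channels` are illustrative, c1sus is left out because its usage is unknown, and the c1lsc DAC is assumed to have 16 channels in total so that it agrees with the "0 free" figure.

```python
# (used, total) channel counts per front end and card, copied from the list above.
# c1sus is omitted because its usage is listed as "?".
# Assumption: the c1lsc DAC has 16 channels in total (matching "0 free").
usage = {
    ("c1iscex/c1iscey", "DAC_0"): (7, 16),
    ("c1iscex/c1iscey", "ADC_0"): (17, 32),
    ("c1ioo", "DAC_0"): (0, 16),
    ("c1ioo", "ADC_0"): (23, 32),
    ("c1ioo", "ADC_1"): (8, 32),
    ("c1lsc", "DAC_0"): (16, 16),
    ("c1lsc", "ADC_0"): (32, 32),
}


def free_channels(usage_table):
    """Return the number of unused channels for each (front end, card)."""
    return {key: total - used for key, (used, total) in usage_table.items()}


free = free_channels(usage)
for (host, card), count in sorted(free.items()):
    print(f"{host} {card}: {count} free")
```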

What this means:

  • We definitely have enough DACs for the ALS PZTs.  The free channels are also in the right places: at the end stations and in the c1ioo FE, which is close to the PSL and hosts the c1als controller.
  • We appear to have enough ADCs for the QPD in c1ioo.
  • We don't have any available DAC outputs in c1lsc for the Fibox.  If we can move the Fibox to the IOO racks (1X1, 1X2) then we could send LSC channels to c1ioo and use c1ioo's extra DAC channels.

Of course we'll have to investigate the AA/AI situation as well.  I'll try to assess that in a follow up post.

PS: this helps to identify used ADC channels in models:

grep adc_ sus/c1/models/c1scx.mdl | grep Name | awk '{print $2}' | sort | uniq
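
For use inside a script, the same filter can be written in Python. This is a rough equivalent of the grep/awk pipeline above, not a real .mdl parser: the function name `used_adc_names` is hypothetical, and the sample text below is made up for illustration rather than copied from c1scx.mdl.

```python
def used_adc_names(mdl_text):
    """Mimic: grep adc_ | grep Name | awk '{print $2}' | sort | uniq."""
    names = set()
    for line in mdl_text.splitlines():
        if "adc_" in line and "Name" in line:
            fields = line.split()  # whitespace-separated, like awk
            if len(fields) >= 2:
                names.add(fields[1])
    return sorted(names)


# Hypothetical excerpt; the real file layout may differ.
sample = '''
      Name    "adc_0_11"
      Tag     "adc_0_11"
      Name    "adc_0_3"
      Name    "adc_0_11"
      Name    "ALS_PZT"
'''
print(used_adc_names(sample))
```

To run it against a model file, read the file contents first, for example with `open(path).read()`, and pass the string in.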

 

  8581   Wed May 15 17:38:49 2013   Jamie   Summary   CDS   AA/AI requirements

Quote:

What this means:

  • We definitely have enough DACs for the ALS PZTs.  The free channels are also in the right places: at the end stations and in the c1ioo FE, which is close to the PSL and hosts the c1als controller.
  • We appear to have enough ADCs for the QPD in c1ioo.
  • We don't have any available DAC outputs in c1lsc for the Fibox.  If we can move the Fibox to the IOO racks (1X1, 1X2) then we could send LSC channels to c1ioo and use c1ioo's extra DAC channels.

Of course we'll have to investigate the AA/AI situation as well.  I'll try to assess that in a follow up post.

It looks like we have spare channels in the AA chassis for the existing c1ioo ADC inputs to accommodate the POP QPD. 

We need AI interfaces for the ALS PZTs.  What we ideally need is 3x D000186, which are the eurocard AI boards that have the flat IDC input connects that can come straight from the DAC break-out interfaces.  I'm not finding any in the spares in the spare electronics shelves, though.   If we can't find any we'll have to make our own AI interfaces.

  8582   Wed May 15 17:48:25 2013   Jamie   Update   CDS   misc problems noticed in models

I noticed a couple potential issues in some of the models while I was investigating the ADC/DAC situation:

c1ioo links to ADC1, but there are broken links to the bus selector that is supposed to be pulling out channels to go into the PSL block.  They're pulling channels from ADC0, which it's not connected to, which means these connections are broken.  I don't know if this means the current situation is broken, or if the model was changed but not recompiled, or what.  But it needs to be fixed.

c1scy connects ADC_0_11, label "ALS_PZT", to an EpicsOutput called "ALS_LASER_TEMP", which means the exposed channel is called "C1:SCY-ALS_LASER_TEMP".  This is almost certainly not what we want.  I don't know why it was done this way, but it probably needs to be fixed.  If we need an EPICS record for this channel it should come from the ALS library part, so it gets the correct name and is available from both ends.

  8583   Wed May 15 19:32:04 2013   rana   Summary   CDS   Accounting of ADC/DAC channel availability

  1. What are we using 16 DAC channels for in the LSC?
  2. What are the functions of those IOO DAC channels which go to cross-connects? If they're not properly sending, then we may have malfunctioning MC or MCWFS.
  3. Can we just use the SLOW DAC (4116) for the ALS PZTs? We used this for a long time for the input steering and it was OK (but not perfect).

  8585   Wed May 15 22:47:11 2013   Jamie   Summary   CDS   Accounting of ADC/DAC channel availability

Quote:
  1. What are we using 16 DAC channels for in the LSC?

For the new input and output tip-tilts.  Two input, two output, each requires four channels.

Quote:
  2. What are the functions of those IOO DAC channels which go to cross-connects? If they're not properly sending, then we may have malfunctioning MC or MCWFS.

I have no idea.  I don't know what the hardware is, or is supposed to be, connected to.  DAC for WFS??  Was there at some point supposed to be fast output channels in the PSL?

Quote:
  3. Can we just use the SLOW DAC (4116) for the ALS PZTs? We used this for a long time for the input steering and it was OK (but not perfect).

Probably. I'm not as familiar with that system.  I don't know what the availability of hardware channels is there.  I'll investigate tomorrow.

  8598   Fri May 17 18:58:58 2013   Jamie, Koji   Summary   CDS   Weird DAC bit flipping at half integer output values

While looking at the DAC anti-imaging filters, Koji noticed an odd feature of the DAC output:

sweep.pdf

What you see here is 16kHz double data from a model right before the DAC part ('C1:SUS-PRM_ULCOIL_OUT', blue), and the full 64kHz int data sent to the physical DAC as reported by the IOP ('C1:X02-MDAC0_TP_CH0', green).  The balls are the actual sample values (as expected there are four green balls for every blue).  The output data is being ramped continuously between 0 and 1.

As the output data crosses the half-count level, the integer DAC output oscillates continuously at every 64kHz sample between the bounding integer values (in this case 0 and 1).

Here's the data as we hold the output continuously at the half-count level; the integer DAC output just oscillates continuously:

const.pdf

After some probing I found that the oscillation happens between [-0.003 +0.004] of the half-count level.

The result of this is a fairly strong 32 kHz line in the DAC analog output.

We looked in the controller.c and couldn't identify anything that would be doing this.  This is the output procedure as I can see it (controller.c lines): 

  1. The double from the model is passed to the IOP
  2. The IOP applies a sample-and-hold or zero-pad if the model is running at a slower speed than the IOP (1799)
  3. The data is then anti-image filtered (1801)
  4. A half-integer is added/subtracted before casting such that the cast is a round instead of a floor (1817)
  5. The data double is cast to an int (1819)
  6. The data is written to the DAC (1873)

There's nothing there that would indicate this sort of bit flipping.
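
To make steps 4 and 5 concrete, here is a minimal Python sketch of the round-by-adding-half cast. It is not the controller.c code: the function name `to_dac_counts` is made up, and Python's `int()` stands in for the C double-to-int cast (both truncate toward zero).

```python
def to_dac_counts(value):
    """Round to the nearest integer by adding/subtracting 0.5 before truncating.

    int() truncates toward zero, like a C cast from double to int, so the
    half-count offset turns the truncation into a round.
    """
    offset = 0.5 if value >= 0 else -0.5
    return int(value + offset)


# Values just below and just above the half-count level land on different counts.
for v in (0.499, 0.501, -0.499, -0.501):
    print(v, to_dac_counts(v))
```

The cast itself is well behaved, but the decision threshold sits exactly at the half-count level. Any small ripple left on the filtered data around that level is therefore enough to move the output between the two bounding integers.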

  8599   Fri May 17 19:56:52 2013   Koji   Summary   CDS   Weird DAC bit flipping at half integer output values

Let me make a complementary comment on this effect.

Because of this oscillation feature, we have a 32kHz peak in the DAC spectrum rather than a 64kHz peak.

For advanced LIGO, the universal AI (D070081) was made to have a 3rd-order 10kHz LPF with a 64kHz notch.
If we have a higher peak at 32kHz (where the rejection is not enough) than at 64kHz, the filter does not provide
enough filtering of the DAC artifacts.

For the 40m, the original filter had a cutoff of 7kHz, as the sampling rate was 16kHz.
If we want to extend the frequency range by 4 times, the corresponding cutoff should be 28kHz.
The rejection is again not enough at 32kHz.

If this peak can be avoided by using a simple sample-and-hold, the peak frequency is pushed up to
64kHz and we can use the AI filters as planned.
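
As a rough feel for the numbers, the rejection of a 3rd-order low-pass at 32kHz can be estimated in Python. This sketch assumes an ideal Butterworth response and ignores the 64kHz notch entirely; the actual D070081 transfer function may differ, and the function name is illustrative.

```python
import math


def butterworth_attenuation_db(freq_hz, cutoff_hz, order):
    """Magnitude in dB of an ideal Butterworth low-pass (an assumed response)."""
    ratio = freq_hz / cutoff_hz
    return -10.0 * math.log10(1.0 + ratio ** (2 * order))


# 3rd-order low-pass evaluated at the 32 kHz line for the two cutoffs discussed.
for cutoff in (10e3, 28e3):
    att = butterworth_attenuation_db(32e3, cutoff, 3)
    print(f"cutoff {cutoff / 1e3:.0f} kHz: {att:.1f} dB at 32 kHz")
```

Under this assumption the 10kHz filter gives roughly 30 dB at 32kHz, while a 28kHz cutoff gives only about 5 dB, which illustrates why a 32kHz line is much harder to reject than one at 64kHz.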

  8613   Wed May 22 11:09:33 2013 JamieSummaryCDSWeird DAC bit flipping at half integer output values

After querying CDS folks about this issue, I got some responses that indicated the problem was likely limit-cycle oscillations due to zero-padding of the data when upsampling.  Tobin ran some Matlab tests to confirm this issue.

Starting in RCG 2.5 there is a new "no_zero_pad=1" cdsParameters option that turns zero padding OFF.  I tried enabling this option in c1scy to see how the behavior changed.  Sure enough, the 32 kHz oscillations mostly went away.  There are no oscillations for outputs held at the half-count value, and the oscillations around the half-count transitions went away as well.

The only thing I could see is a bit of oscillation when converging on a constant half-count value that went away after a couple of milliseconds:

nopad.pdf

So we might consider adding the no_zero_pad=1 option to all of our coil driver outputs, which might eliminate the need to add notches at the Nyquist in the analog anti-image filters
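
To illustrate why zero padding matters here, the sketch below compares the spectrum of a constant output upsampled by zero padding with the same constant upsampled by sample-and-hold. It assumes an upsampling ratio of 4 (a 16 kHz model feeding a 64 kHz IOP), omits any gain compensation, and uses a hand-written DFT; none of this is taken from the RCG source. The zero-padded constant carries image components, including one at the Nyquist bin, that the digital AI filter then has to remove, while the held constant is pure DC.

```python
import cmath

RATIO = 4  # assumed: 16 kHz model upsampled to a 64 kHz IOP


def dft_magnitudes(x):
    """Magnitude of each DFT bin, normalised by the record length."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def zero_padded(dc, n_slow):
    """Each slow sample followed by RATIO - 1 zeros (no gain compensation)."""
    return ([dc] + [0.0] * (RATIO - 1)) * n_slow


def held(dc, n_slow):
    """Each slow sample repeated RATIO times (sample-and-hold)."""
    return [dc] * (RATIO * n_slow)


if __name__ == "__main__":
    n_slow = 4
    nyquist_bin = RATIO * n_slow // 2  # bin at half the fast sample rate
    zp = dft_magnitudes(zero_padded(100.5, n_slow))
    sh = dft_magnitudes(held(100.5, n_slow))
    print("zero-pad, Nyquist bin:", zp[nyquist_bin])
    print("hold,     Nyquist bin:", sh[nyquist_bin])
```

In this sketch the Nyquist-bin amplitude of the zero-padded signal equals its DC-bin amplitude, so any finite rejection in the digital filter leaves a residual there.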

  8614   Wed May 22 11:21:28 2013 KojiSummaryCDSWeird DAC bit flipping at half integer output values

Is this limit cycle caused by the residual of the digital AI filtering at half the sampling frequency, which hits the threshold?
Or is this some nonlinear effect? If this is a linear effect associated with the zero-padding, the absolute
value of the DC may affect the amplitude of the oscillation. (Or equivalently the range of the DC where we get this oscillation.)

Anyway, it seems that we should use no-zero-padding.

You pointed out the ringdown of the digital AI filter in the sample-hold case (i.e. no-zero-padding case).
What does it look like in the conventional zero-padding case?

  8615   Wed May 22 11:35:06 2013 JamieSummaryCDSWeird DAC bit flipping at half integer output values

Quote:

Is this limit cycle caused by the residual of the digital AI filtering at half the sampling frequency, which hits the threshold?
Or is this some nonlinear effect? If this is a linear effect associated with the zero-padding, the absolute
value of the DC may affect the amplitude of the oscillation. (Or equivalently the range of the DC where we get this oscillation.)

This is a good question.  We may be able to test if it's a linear effect if we have enough DAC range to get the oscillation to be more than a single sample.

Quote:

You pointed out the ringdown of the digital AI filter in the sample-hold case (i.e. no-zero-padding case).
What does it look like in the conventional zero-padding case?

 In the zero-pad case the oscillation just continues indefinitely at the half-count value, so it never dies out (at least as far as I can tell).

  8616   Wed May 22 15:08:37 2013 KojiSummaryCDSWeird DAC bit flipping at half integer output values

It seems that the effect is from the (linear) residual fluctuation of the digital AI filter for the zero-padded signal.

Namely, if we give the larger constant number, we get more oscillation.
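
The linear scaling described here can be checked with a toy model. The sketch below zero-pads a constant and passes it through a one-pole low-pass, which is only a stand-in for the real digital AI filter (the ratio, the filter and its coefficient are all assumptions for illustration). Doubling the constant doubles the steady-state ripple, as expected for a linear residual.

```python
def zero_pad(value, ratio, n_slow):
    """Constant model output upsampled by zero padding (no gain compensation)."""
    return ([value] + [0.0] * (ratio - 1)) * n_slow


def one_pole_lowpass(x, alpha):
    """Stand-in low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, out = 0.0, []
    for sample in x:
        y += alpha * (sample - y)
        out.append(y)
    return out


def steady_state_ripple(dc, ratio=4, alpha=0.1, n_slow=500):
    """Peak-to-peak ripple over the last upsampled period, after settling."""
    y = one_pole_lowpass(zero_pad(dc, ratio, n_slow), alpha)
    tail = y[-ratio:]
    return max(tail) - min(tail)


if __name__ == "__main__":
    for dc in (10.0, 20.0, 40.0):
        print(dc, steady_state_ripple(dc))
```

With a larger DC value the ripple covers a wider range around each half-count level, so the output toggles over a wider range of DC values.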

  8626   Thu May 23 10:24:23 2013 JamieSummaryCDSc1scy model continues to run at the hairy edge

c1scy, the controller model at the Y END, is still running very long, typically at 55/60 microseconds, or ~92% of its cycle.  It's currently showing a recorded max cycle time (since last restart or reset) of 60, which means that it has actually hit its limit sometime in the very recent past.  This is obviously not good, since it's going to inject big glitches into ETMY.

c1scy is actually running a lot less code than c1scx, but c1scx caps out its load at about 46 us.  This indicates to me that it must be some hardware configuration setting in the c1iscey computer.

I'll try to look into this more as soon as I can.
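
For reference, the load figures quoted in this entry work out as follows. This is just the arithmetic, taking the 60 us limit from the entry as the cycle budget; the 16384 Hz model rate used for the comparison is an assumption.

```python
def cpu_load(cycle_us, budget_us=60.0):
    """Fraction of the cycle budget used by the model."""
    return cycle_us / budget_us


if __name__ == "__main__":
    for name, cycle_us in (("c1scy", 55.0), ("c1scx", 46.0)):
        print(f"{name}: {100 * cpu_load(cycle_us):.0f}% of the 60 us budget")
    # Assumed model rate of 16384 Hz, for comparison with the quoted limit
    print(f"period at 16384 Hz: {1e6 / 16384:.2f} us")
```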

  8654   Thu May 30 10:40:59 2013 JamieConfigurationCDSAttempt to cleanup c1ioo ADC connections

I have attempted to reconcile all of the ADC connections to c1ioo.  Upon close inspection, it appears that there was a lot of legacy stuff hanging around.  Either that or things have not been properly connected.

The c1ioo front end machine has two ADC cards, ADC0 and ADC1, which are used by two models, c1ioo and c1als.  The CURRENT ADC connections are listed in the table below.  The yellow cells indicate connections that were moved.  The red cells indicate connections that were removed/unplugged:

      channel block  connection                           channel  usage    model
ADC0  8-15           MC WFS1 interface                             MC WFS1  c1ioo
      16-23          MC WFS2 interface                             MC WFS2  c1ioo
      0-7            generic interface card (2 pin lemo)  0
                                                          1
                                                          2
                                                          3        ALS TRX  c1als
                                                          4        ALS TRY  c1als
                                                          5
                                                          6        MCL      c1ioo
                                                          7        MCF      c1ioo

      channel block  connection          channel      usage                       model
ADC1  0-31           1U interface board  0/1 (J1A)    PSL FSS MIXER/NPRO          c1ioo
                                         2/3 (J2)     ALS BEAT X/Y DC             c1als
                                         4/5 (J3)     PSL eurocrate DAQ interface J4
                                         6/7          PSL eurocrate DAQ interface J5
                                         8/9          PSL eurocrate DAQ interface J6
                                         10/11        MC eurocrate DAQ interface J1
                                         12/13        MC servo board DAQ
                                         14/15 (J8)
                                         16/17 (J9A)  UNLABELLED ("DAQ ISS1"???)
                                         18/19 (J10)  "DAQ ISS2"
                                         20/21        "DAQ ISS3"
                                         22/23        ALS BEAT X I/Q              c1als
                                         24/25        ALS BEAT Y I/Q              c1als
                                         26/27
                                         28/29
                                         30/31 (J16)

The following changes were made:

  • "MC L" had been connected to ADC_0_0, moved to ADC_0_6
  • "MC F" had been connected to ADC_0_6, moved to ADC_0_7

The c1ioo model was rebuilt/restarted to reflect this change.

The PSL-FSS_MIXER and PSL-FSS_NPRO connections were broken in the c1ioo so I fixed them when I moved the MC channels.

All the removed connections from ADC1 were not used by any of the front end models, which is why I unplugged them.  Except for the MC DAQ interface J1 and MC servo DAQ connections, I left all other cables plugged in to wherever they were coming from.  The MC cables I did fully remove.

I don't know what these connections were meant for.  Presumably they expose some useful DAQ channels that we're now getting elsewhere, but I'm not sure.  We don't currently have an ISS, which is presumably why the cables labelled "ISS" are not going anywhere.

TODO

I would like to see some more 4-pin lemo --> double BNC cables made.  That would allow us to more easily use the ADC1 generic interface board:

  • Move ALS TRX/Y to ADC1, so that we can keep all the ALS connections together in ADC1.
  • POP QPD X/Y/SUM

We should also figure out if we're sub-optimally using the various "DAQ" cable connections to the eurocrate DAQ interface cards and servo boards.

  8656   Thu May 30 11:28:34 2013 JamieConfigurationCDSc1als model cleanup

The c1als model was pulling out some ADC0 connections that were no longer used for anything:

  • ADC_0_1 --> sfm "FD" --> IPC "C1:ALS-SCX_FD"
  • ADC_0_5 --> sfm "OCX" --> term
  • ADC_0_6 --> sfm "ADC" --> term

The channels would have shown up as C1:ALS-FD, C1:ALS-OCX, C1:ALS-ADC.  The IPC connection that presumably was meant to go to c1scx is not connected on the other end.

I removed all this stuff from the model and rebuilt/restarted.
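
The kind of cross-check behind this cleanup can be sketched as a comparison between the ADC0 assignments in the table from elog 8654 and the channels a model reads. The dictionary layout and function name are hypothetical, and the list of channels c1als was reading (the three stale ones listed above plus ALS TRX/TRY) is assumed from the two entries; this is not a tool that exists in the CDS software.

```python
# ADC0 assignments after the c1ioo cleanup (from the table in elog 8654)
ADC0_TABLE = {
    3: ("ALS TRX", "c1als"),
    4: ("ALS TRY", "c1als"),
    6: ("MCL", "c1ioo"),
    7: ("MCF", "c1ioo"),
}

# Channels c1als was reading before this cleanup (assumed from the entries)
C1ALS_TAPS = [1, 3, 4, 5, 6]


def stale_taps(model, taps, table):
    """Channels read by `model` that the table assigns elsewhere or to nothing."""
    return [ch for ch in taps if table.get(ch, (None, None))[1] != model]


if __name__ == "__main__":
    print(stale_taps("c1als", C1ALS_TAPS, ADC0_TABLE))
```

Channel 6 stands out in this check because it is now assigned to MCL in c1ioo.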

  8722   Wed Jun 19 02:46:19 2013 JenneUpdateCDSConnected ADC channels from IOO model to ASS model

Following Jamie's table in elog 8654, I have connected up the channels 0, 1 and 2 from ADC0 on the IOO computer to rfm send blocks, which send the signals over to the rfm model, and then I use dolphin send blocks to get over to the ass model on the lsc machine. 

I'm using the 1st 3 channels on the Pentek Generic interface board, which is why I'm using channels 0,1,2. 

I compiled all 3 models (ioo, rfm, ass), and restarted them.  I also restarted the daqd on the fb, since I put in a temporary set of filter banks in the ass model, to use as sinks for the signal (since I haven't done anything else to the ASS model yet).

All 3 models were checked in to the svn.

  8727   Wed Jun 19 18:24:14 2013 JenneUpdateCDSProto-ASC implemented in ASS model

I have implemented a proto-ASC in the ASS model.  

In an ASC block within the ASS model, I take in the POP QPD yaw, pit, and sum signals.  I ground the sum, since I don't have normalization yet (also, the QPD that we're using normalizes in the readout box already).  The pit and yaw signals each go through a filter bank, and then leave the sub-block so I can send the signals over to the SUS model, to push on PRM ASCPIT and ASCYAW. 

In doing this, I have removed the temporary PRM ASCYAW connection that Koji had made from the secret 11th row of the LSC output matrix (see Koji's elog 8562 for details from when he implemented this stuff).

LSC, SUS and ASS were recompiled, and restarted.  I also restarted the daqd on the fb.

  8739   Mon Jun 24 16:41:40 2013 JenneUpdateCDSProto-ASC implemented in ASS model

I am working on making the Proto-ASC less "proto".  I have put IPC senders in the LSC model to send the cavity trigger signals over to the ASS model, for ASC use.  I'm partially done working on the ASC end of things to implement triggering.

LSC should be compile-able right now, ASS is definitely not.  But, I expect that no one should need to compile either before I get back in a few hours.  If you do - call me and we'll figure out a plan.

  8740   Tue Jun 25 00:13:00 2013 JenneUpdateCDSProto-ASC implemented in ASS model

I have finished my work on the LSC and ASS models for now. The triggering is all implemented, and should be ready to go.  There are no screens yet.

I have *not* compiled either the LSC or the ASS, since Rana and Manasa still have the IFO.

  8809   Tue Jul 9 11:37:37 2013 gautamUpdateCDSset up for testing DAC Interface-board pin outs

The bank marked channel 9-16 is free, but the connector is a 40 pin IDC and I need to know the exact pin-out configuration before I can set about making the custom ribbon cable that will send the control signals from the DAC card to the PZT driver board. 

The DAC interface board on rack 1Y4 seems to be one of the first versions of this board, and has no DCC number anywhere on it. Identical modules on other racks have the DCC number D080303, but this document does not exist and there does not seem to be any additional documentation anywhere. The best thing I could find was the circuit diagram for the ADL General Standards 16-bit DAC Adapter Board, which has what looks like the pin-out for the 68 pin SCSI connector on the DAC Interface board. Koji gave me an unused board with the same part number (D080303) and I used a multimeter and continuity checking to make a map between DAC channels, and the 40 pin IDC connector on the board, but this needs to be verified (I don't even know if what is sitting inside the box on 1Y4 is the same D080303 board).

Jenne suggested making a break-out cable to verify the pin-outs, which I did with a 40-pin IDC connector and a bit of ribbon wire. The other end of the ribbon wire has been stripped so that we can use some clip-on probes and an oscilloscope to verify the pin-outs by sending a signal to DAC channels 9 through 16 one at a time. On the software side, Jenne did the following:

  • Restarted the mx_stream on c1iscey  (unrelated to this work)
  • 8 Excitation points added in the simulink model on c1scy 
  • Model compiled and installed

We have not restarted c1scy yet as Annalisa is working on some Y-arm stuff right now. We will restart c1scy and use awggui to perform the test once she is done.

 Pink edits by JCD

  8811   Tue Jul 9 12:01:20 2013 gautamUpdateCDSset up for testing DAC Interface-board pin outs

 

 Jenne just rebooted c1scy and daqd on the framebuilder. We will do the actual test after lunch.

  8814   Tue Jul 9 18:44:37 2013 gautamUpdateCDSDAC Interface Board-Pin Outs

  Summary:

The pin-outs for the DAC interface board have been determined.

Details:

  • I used a temporary break-out cable (pic attached) and connected the 40pin IDC connector on this to the DAC interface board at 1Y4.
  • I had a hypothetical pin-out map which was to be verified. So I connected pairs of ribbon wire to an oscilloscope in the configuration which I believed to be correct, and then used awggui to send a 3Hz, 10000 count sine-wave to the corresponding channel via the excitation points set up earlier.
  • I verified that the correct waveform showed up on the scope screen. I then tried sending the same signal to another DAC channel and verified that there were no accidental shorts/bad connections. The signal was fairly noisy, but this was probably because of the makeshift connections.
  • Repeated the above for all 8 channels in the bank marked 9-16 on the DAC interface board.

Turns out that my deductions using the D0902496 wiring diagram, a spare D080303 DAC to IDC adaptor and a multimeter were correct! The pin outs as determined by this test are sketched in the graphic below.

To Do:

  •  Now that the pin-outs have been determined, I need to go about making the custom ribbon that will connect the 40pin IDC on the DAC interface board to the 10-pin IDC on the PZT driver board. Because there is a pair of wires that will have to be 'skipped' while going from the 40-pin to the 10 pin IDC (corresponding to the pair not-connected between two DAC channels on the 40-pin IDC), this may be tricky.

Misc:

The excitation points added to the simulink model are still there. I plan to keep them until I finish installing the boards, as they will be useful for testing.

 

Pin-Outs of the DAC to IDC Adaptor (D080303) inside the "DAC Interface Box at 1Y4":

DAC_Interface_Board_Pin-out.pdf

 

Makeshift break-out ribbon cable:

 

break-out_ribbon.JPG

 

 

  8858   Tue Jul 16 15:22:27 2013 manasaUpdateCDSFront ends back up

c1sus, c1ioo, c1iscex and c1iscey were down. Why? I was trying to lock the arm and I found that around this time, several computers stopped working mysteriously. Who was working near the computer racks at this time???

I did an ssh into each of these machines and rebooted them with: sudo shutdown -r now

But then I forgot / neglected/ didn't know to bring back some of the SLOW Controls computers because I am new here and these computers are OLD. Then Rana told me to bring them back and then I ignored him to my great regret. As it turns out he is very wise indeed as the legends say.

So after awhile I did Burtgooey restore of c1susaux (one of the OLD & SLOW ones) from 03:07 this morning. This brought back the IMC pointing and it locked right away as Rana foretold.

Then, I again ignored his wise and very precious advice much to my future regret and the dismay of us all and the detriment of the scientific enterprise overall.

Later however, I was forced to return to the burtgooey / SLOW controls adventure. But what to restore? Where are these procedures? If only we had some kind of electronics record keeping system or software. Sort of like a book. Made of electronics. Where we could LOG things....

But then, again, Rana came to my rescue by pointing out this wonderful ELOG thing. Huzzah! We all danced around in happiness at this awesome discovery!!

But wait, there was more....not only do we have an ELOG. But we also have a thing called WIKI. It has been copied from the 40m and developed into a thing called Wikipedia for the public domain. Apparently a company called Google is also thinking about copying the ELOG's 'find' button.

When we went to the Wiki, we found a "Computer Restart Procedures" place which was full of all kinds of wondrous advice, but unfortunately none of it helped me in my SLOW Controls quest.

 

Then I went to the /cvs/cds/caltech/target/ area and started to (one by one) inspect all of the targets to see if they were alive. And then I burtgooey'd some of them (c1susaux) ?? And then I thought that I should update our 'Computer Restart Procedures' wiki page and so I am going to do so right now ??

And then I wrote this elog.

 

  8860   Tue Jul 16 18:20:25 2013 JenneUpdateCDSProto-ASC implemented in ASS model

The proto-ASC now includes triggering.  I have updated the hacky temp ASC screen to show the DoF triggering.  I have to go, but when I get back, I'll also expose the filter module triggering.  So, for now we may still need the up/down scripts, but at least the ASC will turn itself off if there is a lockloss.

  8880   Fri Jul 19 12:23:34 2013 manasaUpdateCDSCDS FE not happy

I found CDS rt processes in red. I did 'mxstreamrestart' from the medm. It did not help. Also ssh'd into c1iscex and tried 'mxstreamrestart' from the command line. It did not work either.

I thought restarting frame builder would help. I ssh'd to fb. But when I try to restart fb I get the following error:

controls@fb ~ 0$ telnet fb 8088
Trying 192.168.113.202...
telnet: connect to address 192.168.113.202: Connection refused

 

Screenshot-Untitled_Window.png

  8881   Fri Jul 19 14:04:24 2013 KojiUpdateCDSCDS FE not happy

daqd was restarted.


- tried telnet fb 8088 on rossa => same error as manasa had

- tried telnet fb 8087 on rossa => same result

- sshed into fb: ssh fb

- tried to find daqd by ps -def | grep daqd => not found

- looked at wiki https://wiki-40m.ligo.caltech.edu/New_Computer_Restart_Procedures?highlight=%28daqd%29

- the wiki page suggested the following command to run daqd: /opt/rtcds/caltech/c1/target/fb/daqd -c ./daqdrc &

- ran ps -def | grep nds => already exists. Left untouched.

- Left fb.

- tried telnet fb 8087 on rossa => now it works
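
The telnet checks above amount to asking whether anything is listening on the daqd ports. A minimal sketch of the same check is given below; the helper name and timeout are assumptions, while the host name fb and ports 8087/8088 are the ones used in this entry. It only tests that the TCP connection is accepted and says nothing about whether daqd is otherwise healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and name lookup failures
        return False


# Example use from a control room machine (not run here):
#   for port in (8087, 8088):
#       print(port, port_open("fb", port))
```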
