40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Entry  Tue Sep 22 17:30:55 2015, jamie, Update, DAQ, attempts at getting new fb working 
    Reply  Wed Sep 30 13:59:49 2015, jamie, Update, DAQ, attempts at getting new fb working 
       Reply  Thu Oct 1 20:26:21 2015, jamie, Update, DAQ, Swapping between fb and fb1 
    Reply  Thu Oct 1 19:49:52 2015, jamie, Update, DAQ, more failed attempts at getting new fb working 
       Reply  Thu Oct 1 20:24:02 2015, jamie, Update, DAQ, more failed attempts at getting new fb working 
       Reply  Sun Oct 4 14:28:03 2015, jamie, Update, DAQ, more failed attempts at getting new fb working 
Message ID: 11636     Entry time: Tue Sep 22 17:30:55 2015     Reply to this: 11653   11655
Author: jamie 
Type: Update 
Category: DAQ 
Subject: attempts at getting new fb working 

Today I've been trying to get the new frame builder, tentatively 'fb1', to work.  It's not fully working yet, so I'm about to revert the system back to using 'fb'.  The switch-over process is annoying, since our one myrinet card has to be moved between the hosts.

A brief update on the process so far:

I'm being a little bold with this system by trying to build daqd against more system libraries, instead of the manually installed stuff usually nominally required.  Here's some of the relevant info about th fb1 system:

  • Debian 7 (wheezy)
  • lscsoft ldas-tools-framecpp-dev 2.4.1-1+deb7u0
  • lscsoft gds-dev 2.17.2-2+deb7u0
  • lscsoft libmetaio-dev 8.4.0-1+deb7u0
  • lscsoft libframe-dev 8.20-1+deb7u0
  • /opt/rtapps/epics-1.4.12.2_long
  • /opt/mx-1.2.16
  • advLigoRTS trunk

I finally managed to get daqd to build against the advLigoRTS trunk (post 2.9 branch).  I'll post detailed build log once I work out all the kinks.  It runs ok, including writing out full frames, as well as second and minute trends and raw minute trends, but there are a couple of show-stopper problems:

  • daqd segfaults if the C1EDCU.ini is specified.  If I comment out that one file from the 'master' channel ini file list then it runs without segfaulting.
  • Something is going on with the mx_streams from the front ends:
    • They appear to look ok from the daqd side, but the FEC-<ID>_FB_NET_STATUS indicators remain red.  The "DAQ" bit in the STATE_WORD is also red.  Again, this is even though data seems to be flowing.
    • The mx_stream processes on the front ends are dying (and restarting via monit) about every 2 minutes.  It's unclear what exactly is happening, but they all dia around the same time, so it possibly initiated from a daqd problem.  Around the time of the mx_stream failures, we see this in the daqd log:
[Tue Sep 22 17:24:07 2015] GPS MISS dcu 91 (TST); dcu_gps=1127003062 gps=1127003063

Aborted 1 send requests due to remote peer Aborted 1 send requests due to remote peer 00:25:90:0d:75:bb (c1sus:0) disconnected
mx_wait failed in rcvr eid=004, reqn=11; wait did not complete; status code is Remote endpoint is closed
00:30:48:d6:11:17 (c1iscey:0) disconnected
mx_wait failed in rcvr eid=002, reqn=235; wait did not complete; status code is Remote endpoint is closed
disconnected from the sender on endpoint 002
mx_wait failed in rcvr eid=005, reqn=253; wait did not complete; status code is Bad session (missing mx_connect?)
disconnected from the sender on endpoint 005
disconnected from the sender on endpoint 004
[Tue Sep 22 17:24:13 2015] GPS MISS dcu 39 (PEM); dcu_gps=1127003062 gps=1127003069
  • Occaissionally the daqd process dies when the front end mx_streams processes die.

I'll keep investigating, hopefully with some feedback from Keith and Rolf tomorrow.

ELOG V3.1.3-