40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Entry  Thu Apr 17 17:15:54 2014, jamie, Update, CDS, mx_stream not starting on c1ioo c1ioo_no_mx_stream.png
    Reply  Thu Apr 17 17:22:32 2014, Jenne, Update, CDS, mx_stream not starting on c1ioo, locking okay 
    Reply  Fri Apr 18 14:00:48 2014, rolf, Update, CDS, mx_stream not starting on c1ioo 
       Reply  Fri Apr 18 19:05:17 2014, jamie, Update, CDS, mx_stream not starting on c1ioo 
Message ID: 9825     Entry time: Thu Apr 17 17:15:54 2014     Reply to this: 9826   9830
Author: jamie 
Type: Update 
Category: CDS 
Subject: mx_stream not starting on c1ioo 

While trying to get dolphin working on c1ioo, the c1ioo mx_stream processes mysteriously stopped working.  The mx_stream process itself just won't start now.  I have no idea why, or what could have happened to cause this change.  I was working on PCIe dolphin stuff, but have since backed out everything that I had done, and still the c1ioo mx_stream process will not start.

mx_stream relies on the open-mx kernel module, but that appears to be fine:

controls@c1ioo ~ 0$ /opt/open-mx/bin/omx_info  
Open-MX version 1.3.901
 build: root@fb:/root/open-mx-1.3.901 Wed Feb 23 11:13:17 PST 2011

Found 1 boards (32 max) supporting 32 endpoints each:
 c1ioo:0 (board #0 name eth1 addr 00:14:4f:40:64:25)
   managed by driver 'e1000'
   attached to numa node 0

Peer table is ready, mapper is 00:30:48:d6:11:17
================================================
  0) 00:14:4f:40:64:25 c1ioo:0
  1) 00:30:48:d6:11:17 c1iscey:0
  2) 00:25:90:0d:75:bb c1sus:0
  3) 00:30:48:be:11:5d c1iscex:0
  4) 00:30:48:bf:69:4f c1lsc:0
controls@c1ioo ~ 0$ 

However, if trying to start mx_stream now fails:

controls@c1ioo ~ 0$ /opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x03 c1ioo c1als -d fb:0
c1x03
mmapped address is 0x7f885f576000
mapped at 0x7f885f576000
send len = 263596
OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
mx_connect failed
controls@c1ioo ~ 1$ 

I'm not quite sure how to interpret this error message.  The "00:00:00:00:00:00" has the form of a 48-bit MAC address that would be used for a hardware identifier, ala the second column of the OMC "peer table" above, although of course all zeros is not an actual address.  So there's some disconnect between mx_stream and the actually omx configuration stuff that's running underneath.

Again, I have no idea what happened.  I spoke to Rolf and he's going to try to help sort this out tomorrow.

Attachment 1: c1ioo_no_mx_stream.png  26 kB  Uploaded Thu Apr 17 18:19:49 2014  | Show | Show all
ELOG V3.1.3-