Message ID: 6623     Entry time: Tue May 8 09:58:17 2012
Author: Den 
Type: Update 
Category: CDS 
Subject: SUS -> FB 

 [Alex, Den] 

Restarting mx_stream yesterday was in vain, because C1SUS did not see FB:

controls@c1sus ~ 0$ /opt/open-mx/bin/omx_info 

Open-MX version 1.3.901
 build: root@fb:/root/open-mx-1.3.901 Wed Feb 23 11:13:17 PST 2011
Found 1 boards (32 max) supporting 32 endpoints each:
 c1sus:0 (board #0 name eth1 addr 00:25:90:06:59:f3)
   managed by driver 'igb'
Peer table is ready, mapper is 00:60:dd:46:ea:ec
================================================
  0) 00:25:90:06:59:f3 c1sus:0
  1) 00:60:dd:46:ea:ec fb:0                           // this line was missing
  2) 00:14:4f:40:64:25 c1ioo:0
  3) 00:30:48:be:11:5d c1iscex:0
  4) 00:30:48:bf:69:4f c1lsc:0
  5) 00:30:48:d6:11:17 c1iscey:0
 
At the same time FB saw C1SUS:
 
controls@fb ~ 0$ /opt/mx/bin/mx_info
 
MX Version: 1.2.12
MX Build: root@fb:/root/mx-1.2.12 Mon Nov  1 13:34:38 PDT 2010
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0:  299.8 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
Status: Running, P0: Link Up
Network: Ethernet 10G

MAC Address: 00:60:dd:46:ea:ec
Product code: 10G-PCIE-8AL-S
Part number: 09-03916
Serial number: 352143
Mapper: 00:60:dd:46:ea:ec, version = 0x00000000, configured
Mapped hosts: 6

                                                        ROUTE COUNT
INDEX    MAC ADDRESS     HOST NAME                        P0
-----    -----------     ---------                        ---
   0) 00:60:dd:46:ea:ec fb:0                              1,0
   1) 00:30:48:d6:11:17 c1iscey:0                         1,0
   2) 00:30:48:be:11:5d c1iscex:0                         1,0
   3) 00:30:48:bf:69:4f c1lsc:0                           1,0
   4) 00:25:90:06:59:f3 c1sus:0                           1,0
   5) 00:14:4f:40:64:25 c1ioo:0                           1,0
 
For that reason, when I restarted mx_stream on c1sus, the script tried to connect to the default 00:00:00:00:00:00 address, because the true address was not specified.
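A check like this could be run before restarting mx_stream. The following is only a minimal sketch, not something that was used here: it assumes the peer-table rows keep the "index) MAC name" layout shown in the omx_info and mx_info output above, and the names parse_peer_table and missing_peers are made up for illustration.

```python
import re

# Matches peer-table rows such as "  1) 00:60:dd:46:ea:ec fb:0".
# Assumption: rows always look like "<index>) <MAC> <host>:<board>".
PEER_ROW = re.compile(
    r"^\s*(\d+)\)\s+((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\s+(\S+)"
)


def parse_peer_table(info_text):
    """Return {peer name: MAC address} for every peer row in the text."""
    peers = {}
    for line in info_text.splitlines():
        match = PEER_ROW.match(line)
        if match:
            peers[match.group(3)] = match.group(2).lower()
    return peers


def missing_peers(info_text, expected):
    """Return the expected peer names that do not appear in the peer table."""
    peers = parse_peer_table(info_text)
    return [name for name in expected if name not in peers]


# Excerpt of the omx_info output from c1sus after the fix.
SAMPLE = """\
Peer table is ready, mapper is 00:60:dd:46:ea:ec
================================================
  0) 00:25:90:06:59:f3 c1sus:0
  1) 00:60:dd:46:ea:ec fb:0
  2) 00:14:4f:40:64:25 c1ioo:0
"""

if __name__ == "__main__":
    print(missing_peers(SAMPLE, ["fb:0", "c1sus:0"]))
```

The text passed in would be the captured output of omx_info on the front end. A non-empty result for "fb:0" corresponds to the broken state described above, where the fb:0 row was absent.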
 
Alex restarted mx on FB. Note that this cannot be done while the DAQD process is running, and DAQD cannot simply be killed, because it is restarted automatically. So one should first open /etc/inittab and replace respawn with stop in the line
 
daq:345:respawn:/opt/rtcds/caltech/c1/target/fb/start_daqd.inittab
 
then make init re-read inittab with init q and restart mx on FB:
 
controls@fb ~ 0$ sudo /sbin/init q
controls@fb ~ 0$ sudo /etc/init.d/mx restart
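The inittab edit was done by hand. As a rough illustration of what changes, the sketch below rewrites the action field of one entry in inittab-style text, assuming the usual id:runlevels:action:process layout. The function name set_inittab_action is hypothetical, it works on a string only and does not touch /etc/inittab, and the action name "stop" is taken from this entry as written, not checked against the init documentation.

```python
def set_inittab_action(inittab_text, entry_id, new_action):
    """Return inittab text with the action field of one entry replaced.

    Lines are split as id:runlevels:action:process. Comment lines and
    lines with a different id are returned unchanged.
    """
    out = []
    for line in inittab_text.splitlines():
        # Split into at most 4 fields so colons in the command survive.
        fields = line.split(":", 3)
        is_comment = line.lstrip().startswith("#")
        if not is_comment and len(fields) == 4 and fields[0] == entry_id:
            fields[2] = new_action
            line = ":".join(fields)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    daq_line = "daq:345:respawn:/opt/rtcds/caltech/c1/target/fb/start_daqd.inittab"
    # "stop" is the replacement named in this log entry.
    print(set_inittab_action(daq_line, "daq", "stop"))
```

Presumably the line has to be set back to respawn, followed by another init q, to bring DAQD back once mx is up again; that step is an assumption and is not recorded here.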
 

After that C1SUS started to communicate with FB. Alex does not know why this happened or how to prevent it in the future.

Restarting the DAQD process (or maybe C1SUS) also solved the problem with the Guralp channels; they are fine now. Again, why this happened is unknown.

 
