40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Entry  Tue Nov 15 15:16:04 2011, kiwamu, Update, CDS, dataviewer doesn't run blank.png
    Reply  Tue Nov 15 15:56:23 2011, jamie, Update, CDS, dataviewer doesn't run 
Message ID: 5896     Entry time: Tue Nov 15 15:56:23 2011     In reply to: 5895
Author: jamie 
Type: Update 
Category: CDS 
Subject: dataviewer doesn't run 

Quote:

Dataviewer is not able to access to fb somehow.

I restarted daqd on fb but it didn't help.

Also the status screen is showing a blank while form in all the realtime model. Something bad is happening.

 So something very strange was happening to the framebuilder (fb).  I logged on the fb and found this being spewed to the logs once a second:

[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 15:28:51 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

Apparently /bin/gcore was trying to be called by some daqd subprocess or thread, and was failing since that file doesn't exist.  This apparently started at around 5:52 AM last night:

[Tue Nov 15 05:46:52 2011] main profiler warning: 1 empty blocks in the buffer
[Tue Nov 15 05:46:53 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:54 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:55 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:46:56 2011] main profiler warning: 0 empty blocks in the buffer
...
[Tue Nov 15 05:52:43 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:44 2011] main profiler warning: 0 empty blocks in the buffer
[Tue Nov 15 05:52:45 2011] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1005400026 to 1005400379
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory
[Tue Nov 15 05:52:46 2011] going down on signal 11
sh: /bin/gcore: No such file or directory

The gcore I believe it's looking for is a debugging tool that is able to retrieve images of running processes.  I'm guessing that something caused something int the fb to eat crap, and it was stuck trying to debug itself.  I can't tell what exactly happend, though.  I'll ping the CDS guys about it.  The daqd process was continuing to run, but it was not responding to anything, which is why it could not be restarted via the normal means, and maybe why the various FB0_*_STATUS channels were seemingly dead.

I manually killed the daqd process, and monit seemed to bring up a new process with no problem.  I'll keep an eye on it.

ELOG V3.1.3-