40m Log
Message ID: 10279     Entry time: Sat Jul 26 15:30:15 2014
Author: Joseph Areeda 
Type: Update 
Category: Computer Scripts / Programs 
Subject: NDS2 server problem on megatron 

The NDS2 server on megatron was unresponsive for what I think was the last couple of days.

The NDS2 log file (~nds2mgr/logs/nds2-201407151045.log) started reporting "Stage: parser output queue is full." at 2014.7.24 14:47:54. There were also 16 connections still not closed with LindmeierLaptop.cacr.caltech.edu (131.215.146.102), 15 of them in CLOSE_WAIT.

To identify these zombie sockets we use "netstat -an | grep 31200"
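
The same check can be scripted so the connection states are tallied rather than read by eye. This is only a minimal sketch: it assumes the Linux `netstat -an` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and that the server listens on port 31200, as the grep above implies. The function name `count_states` is made up for illustration.

```python
from collections import Counter


def count_states(netstat_text, port=31200):
    """Tally TCP connection states for sockets whose local port is `port`.

    Assumes Linux `netstat -an` output: Proto, Recv-Q, Send-Q,
    Local Address, Foreign Address, State.
    """
    counts = Counter()
    suffix = f":{port}"
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP/UNIX socket lines, and anything too short.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, state = fields[3], fields[5]
        # Stricter than a plain grep: only match the local (server) port.
        if local_addr.endswith(suffix):
            counts[state] += 1
    return counts
```

On the server, the input text could come from `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`. A large CLOSE_WAIT count in the result is the symptom described in this entry.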

The server was in a state where /etc/init.d/nds2 stop didn't work, so the process had to be killed manually with kill -9. About 3 or 4 minutes later the zombie sockets were gone, and /etc/init.d/nds2 start was used to restart the server.
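
The wait for the leftover sockets to clear before restarting could be automated with a simple polling loop. The sketch below is hypothetical and not part of the NDS2 tooling: `wait_for_sockets_to_clear` and its parameters are illustrative names, and `count_open` stands for any caller-supplied function that returns the number of sockets still open on the server port (for example, one built on the netstat check above).

```python
import time


def wait_for_sockets_to_clear(count_open, timeout_s=300.0, poll_s=10.0,
                              sleep=time.sleep):
    """Poll until count_open() returns 0 or timeout_s has been waited.

    count_open: hypothetical callable returning the number of lingering sockets.
    Returns True if the sockets cleared, False if the wait timed out.
    """
    waited = 0.0
    while True:
        if count_open() == 0:
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
```

Only after this returns True would the restart with /etc/init.d/nds2 start be issued. The default timeout of 300 seconds is an assumption, chosen to cover the 3 or 4 minutes observed here.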

The LindmeierLaptop was using pynds to get a bunch of channels at once to test-drive a streaming visualization code for glitches.  It's unclear whether this bumped into a server limitation.  We have seen similar states in ldvw that seem to result from errors that keep client-server connections from being closed properly, leaving data in an output buffer and causing Linux to wait for the other side to empty the buffer.
