The NDS2 server on megatron was unresponsive for what I think was the last couple of days.
The NDS log file (~nds2mgr/logs/nds2-201407151045.log) started reporting "Stage: parser output queue is full." at 2014.7.24 14:47:54. There are also 16 connections still not closed with LindmeierLaptop.cacr.caltech.edu (131.215.146.102), 15 of them in CLOSE_WAIT.
To identify these zombie sockets we use "netstat -an | grep 31200".
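As a rough sketch of the same check, the netstat output can be tallied by TCP state so the number of CLOSE_WAIT sockets is visible at a glance. This assumes the usual Linux "netstat -an" column layout (proto, recv-q, send-q, local address, foreign address, state). The function name, the local address 10.0.0.5 and the client ports in the sample are made up for illustration; only port 31200 and the client IP come from the entry above.

```python
from collections import Counter

NDS2_PORT = 31200  # the port grepped for above

# Hypothetical sample in Linux `netstat -an` layout; not real output.
SAMPLE = """\
tcp        0      0 0.0.0.0:31200      0.0.0.0:*              LISTEN
tcp        1      0 10.0.0.5:31200     131.215.146.102:50412  CLOSE_WAIT
tcp        1      0 10.0.0.5:31200     131.215.146.102:50413  CLOSE_WAIT
tcp        0      0 10.0.0.5:31200     131.215.146.102:50414  ESTABLISHED
tcp        0      0 10.0.0.5:22        10.0.0.9:40000         ESTABLISHED
"""


def count_socket_states(netstat_output, port=NDS2_PORT):
    """Count TCP states of sockets whose local address uses `port`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP rows and unix-socket rows.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address = fields[3]
        # The port is whatever follows the last colon (works for IPv6 too).
        if local_address.rsplit(":", 1)[-1] == str(port):
            states[fields[5]] += 1
    return states


counts = count_socket_states(SAMPLE)
for state, n in sorted(counts.items()):
    print(state, n)
```

To run this against the live server, the text could come from subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout instead of the sample. A CLOSE_WAIT count that does not go down over time is the symptom described in this entry.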
The server was in a state where /etc/init.d/nds2 stop didn't work, so the process had to be manually kill -9'ed. About 3 or 4 minutes later the zombie sockets were gone, and /etc/init.d/nds2 start was used to restart the server.
The LindmeierLaptop was using pynds to get a bunch of channels at once to test-drive a streaming visualization code for glitches. It's unclear whether this bumped into a server limitation. We have seen similar states in ldvw that seem to stem from errors that leave client-server connections improperly closed, with data left in an output buffer, causing Linux to wait for the other side to empty the buffer.