Path: utzoo!utstat!helios.physics.utoronto.ca!jarvis.csri.toronto.edu!rutgers!usc!cs.utexas.edu!uunet!mcsun!sunic!dkuug!freja!skinfaxe!seindal
From: seindal@skinfaxe.diku.dk (Rene' Seindal)
Newsgroups: news.software.nn
Subject: Re: nnmaster dies
Message-ID: <1990Feb21.043015.9430@diku.dk>
Date: 21 Feb 90 04:30:15 GMT
References: <5317@m2c.M2C.ORG> <2114@labtam.oz> <469@texas.dk>
Sender: news@diku.dk (The Netnews System)
Organization: Department Of Computer Science, University Of Copenhagen
Lines: 30

storm@texas.dk (Kim F. Storm) writes:

> iand@labtam.oz (Ian Donaldson) writes:

> >We found that if nnmaster was being run, connecting to the remote
> >via NNTP, and the remote host dies, so does nnmaster.  It doesn't try
> >and re-establish the NNTP link.

> I believe this has changed with version 6.3.8.  We made some efforts
> to have nnmaster detect immediately that the nntp server dies, and
> just go to sleep immediately and wait for one -r period (you probably
> don't use less than 10 minutes intervals to stay on good terms with
> your NNTP server :-)

nnmaster uses the same NNTP code as nn, and it will only try to reconnect
to the NNTP server if it gets an NNTP response code that indicates a server
timeout.  This hardly ever happens for the master, since the NNTP servers I
have seen only time out if the socket has been idle for a while (e.g., 15
minutes or so).
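
As a rough illustration of that check (a sketch only, not the actual nn
code): the function name and the use of status 503 as the timeout code are
assumptions made for this example, and the real code may test something else.

```c
#include <stdlib.h>

/* Assumption for this sketch: the server reports an idle timeout
 * with status 503.  The code nn really checks may differ. */
#define NNTP_TIMEOUT_CODE 503

/*
 * Hypothetical helper: return 1 if the response line starts with the
 * assumed timeout status, 0 otherwise.  NNTP response lines begin with
 * a three-digit status code, which atoi() parses up to the first
 * non-digit character.
 */
int should_reconnect(const char *response)
{
    return atoi(response) == NNTP_TIMEOUT_CODE;
}
```

Any other response, or no response at all, would not trigger a reconnect
under this rule.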

I have never seen nnmaster die because of a server crash, but I have seen it
hang.  The problem is that the socket it uses doesn't have keepalive set.  If
the NNTP server crashes while nnmaster is connected, the master just sits
there in a read() call that never returns.  It has to be killed forcibly.  If
keepalive had been set, it would get a SIGPIPE, which would interrupt the
read() call.
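
Setting keepalive on the socket could look roughly like the sketch below.
SO_KEEPALIVE is the standard socket option; the helper name enable_keepalive
is made up for this example and is not taken from the nn sources.  How a
dead peer is reported depends on the system: on many current systems a
blocked read() fails with an error such as ETIMEDOUT rather than raising
SIGPIPE, and the first probe may only go out after a long idle period.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the nn sources): ask the kernel to
 * probe an idle connection so that a dead peer is eventually noticed.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}
```

It would be called once, right after connect() succeeds on the NNTP socket.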

There was a note about this in the file NNTP, saying I had seen this
behaviour but didn't know why.  I do now, so it will be fixed in 6.4.

Rene' Seindal (seindal@diku.dk)