Xref: utzoo news.software.nntp:419 news.software.b:3567
Path: utzoo!utstat!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!bloom-beacon!wesommer
From: wesommer@athena.mit.edu (William Sommerfeld)
Newsgroups: news.software.nntp,news.software.b
Subject: Re: Survey: C News batching vs. nntplink
Message-ID: <WESOMMER.89Nov19231648@snorkelwacker.athena.mit.edu>
Date: 20 Nov 89 07:16:48 GMT
References: <1989Nov20.002159.26404@brutus.cs.uiuc.edu>
Sender: news@athena.mit.edu (News system)
Organization: None.
Lines: 62
In-Reply-To: coolidge@brutus.cs.uiuc.edu's message of 20 Nov 89 00:21:59 GMT

In article <1989Nov20.002159.26404@brutus.cs.uiuc.edu> coolidge@brutus.cs.uiuc.edu (John Coolidge) writes:

   There's an obvious problem that many people have remarked upon
   involving the contradiction between the C News batching code in
   nntpd vs. the continuous transmissions of nntplink. Since the
   batching code relies on only writing articles every so often, lots
   of articles are received when nntplink is run but aren't passed on
   to relaynews until later, while nntpxmit-style transfers send the
   article later but get it processed later. The end result is slower
   article propagation and lots more dups.

Funny you should notice.  As it turns out, nntplink doesn't have to
change; only nntpd need change.  Patches aren't available yet (and
might not be; I'm really busy; don't even ask for them), but the
changes are simple enough to describe:

- I made nntpd aware of the NEWSCTL/LOCKinput lock file.  If relaynews
is running, this lock file exists.  I rearranged the code in batch.c
to queue the batch into NEWSARTS/in.coming *first*, and only fork/exec
newsrun if relaynews isn't running.

- I rearranged the loop in serve.c to make the alarm timeout and
handler come from a global variable instead of a compiled-in constant.
At the top of the loop, the timeout and alarm handler variables are
reset to the default values.

- The function which implements the ihave command sets the timeout to
five seconds, and the alarm handler to a function which, if
NEWSCTL/LOCKinput doesn't exist, terminates the batch.
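
For what it's worth, the batch.c change in the first item boils down
to something like the following. This is a sketch in Python, not the
actual nntpd C code: queue_batch, start_newsrun and the directory
arguments are made-up names for illustration, and only the
NEWSCTL/LOCKinput and NEWSARTS/in.coming paths come from the
description above.

```python
import os
import shutil


def queue_batch(batch_path, newsarts, newsctl, start_newsrun):
    """Queue a finished batch, then start newsrun only if nobody holds the lock.

    batch_path    -- file holding the articles collected so far (hypothetical)
    newsarts      -- stand-in for the NEWSARTS directory
    newsctl       -- stand-in for the NEWSCTL directory
    start_newsrun -- callable standing in for the fork/exec of newsrun
    Returns (queued_path, started).
    """
    incoming = os.path.join(newsarts, "in.coming")
    os.makedirs(incoming, exist_ok=True)

    # Step 1: always queue the batch into in.coming *first*.
    queued_path = os.path.join(incoming, os.path.basename(batch_path))
    shutil.move(batch_path, queued_path)

    # Step 2: if the lock file exists, relaynews is running, so just
    # leave the batch queued and do not start another newsrun.
    if os.path.exists(os.path.join(newsctl, "LOCKinput")):
        return queued_path, False

    # Step 3: no lock file, so kick off processing now.
    start_newsrun()
    return queued_path, True
```

The point of the ordering is that the batch is already sitting in
in.coming by the time anyone decides whether to start newsrun.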

The effect is this: if articles are coming in one at a time and the
machine isn't backlogged, they get processed one at a time.

If articles are flowing in a continuous stream (less than a
five-second delay between articles), they get batched using the
existing rules (five minutes or 300KB, whichever comes first).

If the machine is backlogged (relaynews is running), the articles get
processed in batches.
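
The three cases above can be written down as one decision rule. This
is a Python sketch, not the nntpd code: in nntpd the idle check lives
in an alarm handler, while here it is a plain function so the logic is
easy to follow. The function and constant names are made up, and
reading "300KB" as 300 * 1024 bytes is my assumption.

```python
IDLE_TIMEOUT_SECS = 5             # timeout set by the ihave command
MAX_BATCH_AGE_SECS = 5 * 60       # existing rule: five minutes
MAX_BATCH_BYTES = 300 * 1024      # existing rule: 300KB (assumed to mean KiB)


def should_end_batch(idle_secs, batch_age_secs, batch_bytes, relaynews_running):
    """Return True if the current batch should be terminated now."""
    # Existing rules: five minutes or 300KB, whichever comes first.
    if batch_age_secs >= MAX_BATCH_AGE_SECS or batch_bytes >= MAX_BATCH_BYTES:
        return True
    # New rule: the sender has gone quiet for the ihave timeout and
    # NEWSCTL/LOCKinput doesn't exist, so process what we have.
    if idle_secs >= IDLE_TIMEOUT_SECS and not relaynews_running:
        return True
    # Otherwise keep collecting: either articles are still streaming in,
    # or relaynews is busy and a bigger batch is better.
    return False
```

A lone article on an idle machine ends its batch after five quiet
seconds; a steady stream or a backlogged machine only hits the
existing limits.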

We've only been running this way for a couple of days now on
bloom-beacon.mit.edu and snorkelwacker.mit.edu, and it *seems* to be
working well, but it hasn't yet been exposed to a full-volume weekday
feed, so I don't know whether it will break down.

Given this kind of code in nntpd, it would make sense for nntplink to
*not* close the connection after every 20 articles... given average
article sizes, every 100 articles would be more like it; that way, if
the machine is backed up, you get large batches which allow C News to
run at full blast.
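
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes a batch ends when the connection closes, that
"300KB" means 300 * 1024 bytes, and that an average article is about
3000 bytes; that last figure is only an illustrative guess, not a
measurement.

```python
BATCH_LIMIT_BYTES = 300 * 1024    # existing 300KB batch limit (assumed KiB)


def batch_fill(articles_per_connection, avg_article_bytes):
    """Fraction of the batch size limit used by one connection's articles."""
    return articles_per_connection * avg_article_bytes / BATCH_LIMIT_BYTES


# Hypothetical 3000-byte average article:
print(batch_fill(20, 3000))    # 20 articles fill well under a quarter of the limit
print(batch_fill(100, 3000))   # 100 articles come close to the limit
```

Under those assumptions, closing after 20 articles cuts batches off
at a fraction of what the size limit would allow.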

We're running C News/NNTP on slow machines with slow disks, and it
seems to be keeping up; B News was running at the edge (bloom-beacon's
load was continuously over 10 with B News; these days, it seems to be
hovering around 1-2).

The five-second delay seems fairly short, but a 30-second delay still
let lots of dups through.
--
Henry Spencer is so much of a  |    Bill Sommerfeld at MIT/Project Athena
minimalist that I often forget |    sommerfeld@mit.edu
he's there - anonymous         |