Xref: utzoo news.software.nntp:419 news.software.b:3567
Path: utzoo!utstat!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!bloom-beacon!wesommer
From: wesommer@athena.mit.edu (William Sommerfeld)
Newsgroups: news.software.nntp,news.software.b
Subject: Re: Survey: C News batching vs. nntplink
Message-ID:
Date: 20 Nov 89 07:16:48 GMT
References: <1989Nov20.002159.26404@brutus.cs.uiuc.edu>
Sender: news@athena.mit.edu (News system)
Organization: None.
Lines: 62
In-Reply-To: coolidge@brutus.cs.uiuc.edu's message of 20 Nov 89 00:21:59 GMT

In article <1989Nov20.002159.26404@brutus.cs.uiuc.edu> coolidge@brutus.cs.uiuc.edu (John Coolidge) writes:

   There's an obvious problem that many people have remarked upon
   involving the contradiction between the C News batching code in
   nntpd vs. the continuous transmissions of nntplink. Since the
   batching code relies on only writing articles every so often, lots
   of articles are received when nntplink is run but aren't passed on
   to relaynews until later, while nntpxmit-style transfers send the
   article later but get it processed later. The end result is slower
   article propagation and lots more dups.

Funny you should notice.

As it turns out, nntplink doesn't have to change; only nntpd need
change. Patches aren't available yet (and might not be; I'm really
busy; don't even ask for them), but the changes are simple enough to
describe:

- I made nntpd aware of the NEWSCTL/LOCKinput lock file. If relaynews
  is running, this lock file exists. I rearranged the code in batch.c
  to queue the batch into NEWSARTS/in.coming *first*, and only
  fork/exec newsrun if relaynews isn't running.

- I rearranged the loop in serve.c to make the alarm timeout and
  handler come from a global variable instead of a compiled-in
  constant. At the top of the loop, the timeout and alarm handler
  variables are reset to the default values.
- The function which implements the ihave command sets the timeout to
  five seconds, and the alarm handler to a function which, if
  NEWSCTL/LOCKinput doesn't exist, terminates the batch.

The effect is that: if articles are coming in one at a time and the
machine isn't backlogged, they get processed one at a time. If
articles are flowing in a continuous stream (less than a five-second
delay between articles), they get batched using the existing rules
(five minutes or 300KB, whichever comes first). If the machine is
backlogged (relaynews is running), the articles get processed in
batches.

We've only been running this way for a couple of days now on
bloom-beacon.mit.edu and snorkelwacker.mit.edu, and it *seems* to be
working well, but it still hasn't been exposed to a full volume
during-the-week feed, so I don't know if it will break down.

Given this kind of code in nntpd, it would make sense for nntplink to
*not* close the connection after every 20 articles... given average
article sizes, every 100 articles would be more like it; that way, if
the machine is backed up, you get large batches which allow C news to
run at full blast.

We're running C news/NNTP on slow machines with slow disks, and it
seems to be keeping up; B news was running at the edge (bloom-beacon's
load was continuously over 10 with B news; these days, it seems to be
hovering around 1-2).

The five second delay seems fairly short, but 30 seconds wasn't enough
to avoid lots of dups.

--
Henry Spencer is so much of a   | Bill Sommerfeld at MIT/Project Athena
minimalist that I often forget  | sommerfeld@mit.edu
he's there - anonymous          |
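
The batch-closing rules described in the post can be sketched roughly as below. This is only an illustrative model in Python, not the actual nntpd C code (no patches were published). The function names, the parameters, and the reading of 300KB as 300 * 1024 bytes are all assumptions made for the example; the only details taken from the post are the LOCKinput lock file, the five-second ihave timeout, and the five-minute / 300KB limits.

```python
import os

# Values taken from the post.
IHAVE_TIMEOUT_SECONDS = 5        # short timeout set by the ihave handler
BATCH_MAX_SECONDS = 5 * 60       # existing rule: five minutes
BATCH_MAX_BYTES = 300 * 1024     # existing rule: 300KB (assuming 1KB = 1024 bytes)


def relaynews_running(newsctl):
    """Per the post, NEWSCTL/LOCKinput exists while relaynews is running."""
    return os.path.exists(os.path.join(newsctl, "LOCKinput"))


def should_close_batch(newsctl, idle_seconds, batch_age_seconds, batch_bytes):
    """Hypothetical helper: decide whether the open batch should be ended now."""
    # The existing age and size limits still apply to a continuous stream.
    if batch_age_seconds >= BATCH_MAX_SECONDS or batch_bytes >= BATCH_MAX_BYTES:
        return True
    # New rule: after a short idle gap, end the batch early, but only when
    # relaynews is not already busy (lock file absent).
    if idle_seconds >= IHAVE_TIMEOUT_SECONDS and not relaynews_running(newsctl):
        return True
    return False


def should_start_newsrun(newsctl):
    """Hypothetical helper: once a batch is queued in in.coming, start
    newsrun only if relaynews is not already running."""
    return not relaynews_running(newsctl)
```

With no lock file present, a six-second gap ends the batch at once, so a lone article is processed on its own; with the lock file present, the same gap leaves the batch open until the age or size limit is reached.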