Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!gem.mps.ohio-state.edu!brutus.cs.uiuc.edu!coolidge
From: coolidge@brutus.cs.uiuc.edu (John Coolidge)
Newsgroups: news.software.b
Subject: Re: Lots of dups
Summary: Argh, dead memory
Keywords: cnews, nntp, dups
Message-ID: <1989Oct26.163552.14601@brutus.cs.uiuc.edu>
Date: 26 Oct 89 16:35:52 GMT
References: <1989Oct25.164024.14894@ctr.columbia.edu> <1989Oct25.205129.16397@brutus.cs.uiuc.edu> <1989Oct25.234516.4057@polyslo.CalPoly.EDU>
Sender: news@brutus.cs.uiuc.edu
Reply-To: coolidge@cs.uiuc.edu
Distribution: na
Organization: U of Illinois, CS Dept., Systems Research Group
Lines: 72

hoyt@polyslo.CalPoly.EDU (Sir Hoyt) writes:
>In article <1989Oct25.205129.16397@brutus.cs.uiuc.edu> coolidge@cs.uiuc.edu writes:
>>seth@ctr.columbia.edu (Seth Robertson) writes:
>>>[stuff about lots of dups deleted]
>>[About how to speed up article processing by feeding directly to relaynews]
>>I've adopted
>>the second solution by writing a newsrun daemon written in perl that
>>runs about every 10 sec (alas, my code is nowhere near the point where
>>I'd release it).

> NNTP starts newsrun on *every* batch of news that arrives.
> I think that if NNTP is getting a lot of news, it will
> start up newsrun on the batch already there, then go on to
> receiving more news. I have not checked on this, just what
> I have observed.

[ARGH! Beat head semi-violently against wall :-)]

Of course you're right for the default setup. I've had that code turned
off for so long now that I forgot it was there. Why? Because with 8-10
active nntpds all giving me news, that's 8-10 invocations of newsrun
EVERY MINUTE! (Actually, it's not that bad since some of my connections
leave the nntp connection open continuously --- but that opens up an
entirely different can of worms wrt nntpd). Sure, newsrun locks against
itself, but starting 8-10 fairly large shell scripts per minute is
not a good thing.
It was FAR more efficient to just run newsrun once a minute from
cron --- except that this really pushes up the number of dups. Thus
my every-10 (actually 5)-sec newsrun daemon.

There's also a bug that we noticed in SunOS 4.0.3 involving execing
things from nntpd --- it had a bad effect on network memory. With 8-10
active nntpds, portmap would crash after a while, effectively crashing
the machine.

> I for one see little use in running newsrun every 10 sec.
> Especially since NNTP incoming batches are named: nntp.XXXXXX
> and newsrun looks for batches of the name: XXXXXXXXX.
> X = some numbers. And also because NNTP processes the batches
> right away.

NNTP renames its batches from nntp.XXXXXX (where X is the pid of the
nntpd) to XXXXXXXXXXX (timestamp of the batch) just before the exec
of newsrun in the default setup.

My current setup is about as optimized as possible wrt processing
articles fast: nntpd writes each article to a separate batch file (I
changed the naming scheme to avoid conflicts when more than one article
appears in a second). Newsrund comes along every 5 sec and feeds all
of the current batches down a pipe into relaynews (this is a big win
every time more than one batch is ready -- a common thing here -- since
relaynews is fairly expensive to start).

The overall load caused by this system appears empirically to be about
1/2 that of doing standard batching and once-per-minute runs of newsrun,
and about 1/5 that of running newsrun from nntpd. It's probably not a
huge win for sites with <5 feeds, but for us (10 feeds, 5 top-20) it's
the only way to go. The increased file system traffic caused by writing
each article as a batch trades off with the traffic saved in dups, and
the net result is slightly lower load and much faster propagation.

> There is also another problem between C news and NNTP. This
> has to do with the slight change in history format. If
> you ever see the message "malformed history...." from
> NNTP you just accepted a dup article.

This is caused by expired articles: nntpd for some reason wants to
check for a filename in history, even if the article isn't going to
be received anyway. There's a simple patch for this that was posted
a while back (I thought it made it into 1.5.6, but I guess not). I
can ship it out on request.

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge@cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1989 John L. Coolidge. Copying allowed if (and only if) attributed.
You may redistribute this article if and only if your recipients may as well.
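
For anyone who wants a picture of the newsrund loop described above
(wake up every few seconds, gather whatever finished batches are
waiting, and feed them all down one pipe into a single relaynews
process), here is a rough sketch. It is an illustration in Python only,
not the perl daemon mentioned in the article. The function names, the
"all-digit file name means a finished batch" test, and the relay
command passed in by the caller are all assumptions made for the
example, and it leaves out the locking a real installation would need.

```python
import os
import shutil
import subprocess
import time


def pending_batches(spool_dir):
    """Return finished batch files in spool_dir, oldest first.

    Assumption: a finished batch has an all-digit (timestamp-like) name,
    while names such as nntp.<pid> are still being written and are skipped.
    """
    names = [n for n in os.listdir(spool_dir) if n.isdigit()]
    return [os.path.join(spool_dir, n) for n in sorted(names, key=int)]


def feed_batches(batches, relay_cmd):
    """Pipe every batch into ONE relay process; return how many were fed.

    relay_cmd is a hypothetical argument list supplied by the caller.
    Starting the relay once per pass, rather than once per batch, is the
    point of the exercise. Batches are deleted only if the relay exits 0.
    """
    if not batches:
        return 0  # nothing waiting, so do not start the relay at all
    proc = subprocess.Popen(relay_cmd, stdin=subprocess.PIPE)
    try:
        for path in batches:
            with open(path, "rb") as fh:
                shutil.copyfileobj(fh, proc.stdin)
        proc.stdin.close()
    except BrokenPipeError:
        pass  # relay died early; the exit status check below covers it
    if proc.wait() != 0:
        return 0  # leave the batches in place for the next pass
    for path in batches:
        os.remove(path)
    return len(batches)


def run_forever(spool_dir, relay_cmd, interval=5):
    """Poll the spool directory every `interval` seconds (5 in the article)."""
    while True:
        feed_batches(pending_batches(spool_dir), relay_cmd)
        time.sleep(interval)
```

It would be started with run_forever(spool_dir, relay_cmd), where both
arguments are site-specific. Leaving the batches alone when the relay
fails is a design choice for the sketch: it trades possible reprocessing
for not losing articles.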