Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!ittatc!dcdwest!sdcsvax!ucbvax!hplabs!oliveb!glacier!reid
From: reid@glacier.ARPA (Brian Reid)
Newsgroups: net.news.adm,net.news.b
Subject: Re: curious 2.10.3 efficiency situation
Message-ID: <5104@glacier.ARPA>
Date: Sat, 8-Mar-86 03:19:57 EST
Article-I.D.: glacier.5104
Posted: Sat Mar 8 03:19:57 1986
Date-Received: Mon, 10-Mar-86 00:17:25 EST
References: <5044@glacier.ARPA>
Reply-To: reid@glacier.UUCP (Brian Reid)
Organization: Stanford University, Computer Systems Lab
Lines: 41
Xref: watmath net.news.adm:550 net.news.b:1309

Well, I've managed to get my netnews under control, after several days
of wrestling with it. I'm worried, though, that this is just the first
of many hiccups like this, and that perhaps a Vax 750 is not a big
enough computer to be a backbone site running this news software.

The problem is that we have 2 primary feeds; each potentially feeds us
about 500 articles a day. In the steady state, when we make regular
contact with both feeds, we end up with about 150 from oliveb and about
350 from decwrl. The compress/inews pipeline is able to handle about 3
articles a minute, so our normal load is about 3 hours per day (out of
24) devoted to incoming news.

When we were out of action for a couple of days, the steady-state rhythm
was broken, and both oliveb and decwrl queued all of the messages for
us. They arrived and sat in UUXQT queues, and while they sat there,
there was no record that we had the article, and we didn't send it out
to the other feed, and so on: a vicious cycle. The net result was that
we were getting a torrent of 1000 articles per day, with many
duplicates, from our two feeds.
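
As a back-of-the-envelope check on the figures above, here is a minimal
sketch in Python. It is not part of the news software; the function name
hours_per_day is made up for illustration, and the only inputs are the
numbers already quoted (150 + 350 articles a day, 3 articles a minute,
and the 1000-a-day torrent).

```python
def hours_per_day(articles_per_day: float, articles_per_minute: float) -> float:
    """Hours of pipeline time needed to process one day's arrivals."""
    if articles_per_minute <= 0:
        raise ValueError("rate must be positive")
    return articles_per_day / articles_per_minute / 60.0


RATE = 3.0  # articles per minute through compress/inews, as quoted above

steady = hours_per_day(150 + 350, RATE)  # oliveb + decwrl in the steady state
torrent = hours_per_day(1000, RATE)      # both feeds sending everything

print(f"steady state: {steady:.1f} h/day")  # prints "steady state: 2.8 h/day"
print(f"torrent: {torrent:.1f} h/day")      # prints "torrent: 5.6 h/day"
```

This takes the quoted 3 articles a minute at face value. It does not
model competing user load on the machine or the cost of handling
duplicates, so it is only a lower bound on the time actually needed.
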
Since we had a backlog of 2500 articles waiting to be processed, we
never caught up; glacier was not able to process 1000 articles per day,
but only by catching up, by getting all 3500 articles processed, would
the input flow be reduced back down to 500 per day.

What I finally did to fix it was to hotwire inews/rnews to run at
nice -18, and then to replace /bin/csh with a program that said "sorry,
no logins permitted at this time". This kept users off the machine while
letting uucico through. With no other users on the machine, but with
uucp running full bore, I was able to get my backlog of 3500 messages
processed in about 20 hours. Then I reset everything back to normal;
we've been running this way for 3 hours now and the UUXQT queues are
normal.

I don't yet have a solution to this problem, but I predict that we are
not the last site that it will happen to. If I had not been able to kick
all of my users off the machine for a day, then I really don't know how
we could have recovered from this, other than by asking our feeds to
remove net.religion and net.politics and that sort of thing, in order to
decrease the arrival traffic.
--
	Brian Reid	decwrl!glacier!reid
	Stanford	reid@SU-Glacier.ARPA