Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!tut.cis.ohio-state.edu!att!cbnewsh!wcs
From: wcs@cbnewsh.ATT.COM (Bill Stewart 201-949-0705 erebus.att.com!wcs)
Newsgroups: news.newusers.questions
Subject: Re: How can I access USENET articles database on my computer ?
Message-ID: <8006@cbnewsh.ATT.COM>
Date: 8 Feb 90 00:10:27 GMT
References: <8000@cbnewsh.ATT.COM>
Reply-To: wcs@cbnewsh.ATT.COM (Bill Stewart 201-949-0705 erebus.att.com!wcs)
Distribution: usa
Organization: AT&T Bell Labs Random Organization Name Generator
Lines: 53

In article <8000@cbnewsh.ATT.COM> aaron@cbnewsh.ATT.COM (aaron.michael.chesir) writes:
]How can I access the vast storage of USENET articles on my host computer
]without running the USENET articles reader programs (i.e. readnews, vnews,
]etc.) ? I would love to be able to run a program that searches a particular
]newsgroup for all articles whose header contains a key word, etc.

I'm on the same machine as you are, but this kind of stuff was
useful to me when I was a new user, so I'm posting it ...

There are two main ways netnews gets passed around:
- The B News / C News method
- NNTP, the Network News Transfer Protocol.

In B/C News, all news articles that a machine might want get shipped to
that machine, using whatever network is around (uucp, ftp, etc.).
The news system stores each article in a file, and keeps a database
of what articles it has around (file name, message id, age, title).
News reader programs keep track of what articles you've read, and
use the news system databases and the spool directory of articles.

Where the databases and articles are kept depends on where your
administrator feels like putting them, but the typical locations are
/usr/lib/news for the databases and /usr/spool/news for the articles.
Subdirectories under the spool directory correspond to newsgroups:
this article is a file in /usr/spool/news/news/newusers/questions.
This is the method your machine uses, so this is where to grep.
(Article numbers are different on each machine.  The command "hgrep"
is a grep that searches only article headers and skips the body, which
is much faster.)
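
If hgrep isn't installed, the same idea can be sketched in a few lines.
This is only a rough illustration, not hgrep itself: it assumes the
one-file-per-article layout described above, with numeric file names,
and the spool path and the function names (group_dir, header_lines,
search_headers) are made up for the example.

```python
import os

SPOOL = "/usr/spool/news"  # typical location; your administrator may differ


def group_dir(group, spool=SPOOL):
    """Map a newsgroup name to its spool directory (dots become slashes)."""
    return os.path.join(spool, *group.split("."))


def header_lines(path):
    """Yield the header lines of one article, stopping at the first blank line."""
    with open(path, encoding="latin-1") as f:
        for line in f:
            line = line.rstrip("\n")
            if line == "":
                break  # the blank line separates headers from the body
            yield line


def search_headers(group, keyword, spool=SPOOL):
    """Return (article number, header line) pairs whose header contains keyword."""
    directory = group_dir(group, spool)
    # Articles are files with numeric names; subgroups are subdirectories.
    numbers = sorted(int(n) for n in os.listdir(directory) if n.isdigit())
    hits = []
    for number in numbers:
        path = os.path.join(directory, str(number))
        if not os.path.isfile(path):
            continue
        for line in header_lines(path):
            if keyword.lower() in line.lower():
                hits.append((number, line))
    return hits
```

Calling search_headers("news.newusers.questions", "nntp") would list the
local article numbers in that group with "nntp" somewhere in a header line.
The body is never read past the first blank line, which is where the
speedup over a plain grep comes from.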

NNTP is a different approach, designed for use in a TCP/IP network.
The theory is that, if you've got high-speed, mostly-reliable networks,
you don't need to have everybody keep a copy of every article
whether they want it or not.  I don't know much about NNTP because
the news machine I used to run for my department was a leftover 3B2
that didn't have an Ethernet board, and NNTP was just coming out then.
Essentially, you'll have a server machine that feeds a bunch of others
and hands out a certain amount of its database of what articles it has.
When a client machine wants an article (because some user wants to
read it), it retrieves the article from the server.  I don't know if
the client keeps the article around for a while or not.
NNTP is much more efficient, and is the way much of the Internet gets
its news, but requires teaching the newsreaders how to use it.
NNTP also has been ported to networks like AT&T's Datakit.
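
For the curious, the client side of fetching one article can be sketched
from RFC 977, the NNTP specification.  Treat this as an illustration
rather than a working newsreader: it has not been run against a real
server, the host name is whatever your site uses, and the function names
(parse_group_response, fetch_article) are invented for the example.

```python
import socket


def parse_group_response(line):
    """Parse the reply to 'GROUP name': '211 count first last group'."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("group not selected: " + line.strip())
    return {
        "count": int(parts[1]),
        "first": int(parts[2]),
        "last": int(parts[3]),
        "group": parts[4],
    }


def fetch_article(host, group, number, port=119):
    """Ask an NNTP server for one article; return its lines (headers and body)."""
    with socket.create_connection((host, port)) as sock:
        f = sock.makefile("rwb")
        f.readline()  # server greeting (200 or 201)
        f.write(("GROUP %s\r\n" % group).encode("ascii"))
        f.flush()
        parse_group_response(f.readline().decode("latin-1"))
        f.write(("ARTICLE %d\r\n" % number).encode("ascii"))
        f.flush()
        status = f.readline().decode("latin-1")
        if not status.startswith("220"):
            raise ValueError("no such article: " + status.strip())
        lines = []
        while True:
            raw = f.readline()
            if not raw:
                break  # connection closed early
            text = raw.decode("latin-1").rstrip("\r\n")
            if text == ".":
                break  # a lone dot ends the article
            if text.startswith(".."):
                text = text[1:]  # undo the protocol's dot doubling
            lines.append(text)
        f.write(b"QUIT\r\n")
        f.flush()
        return lines
```

The article numbers in the GROUP reply belong to the server, so they
generally won't match the numbers in any other machine's spool directory.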

An intermediate approach is to mount /usr/spool/news over a remote
file system like RFS or NFS, with a few hacks to inews to make
outgoing news do the right thing.  This is less efficient than NNTP
(being an NFS server is more work than an NNTP server),
but is pretty transparent.
-- 
# Bill Stewart AT&T Bell Labs 4M312 Holmdel NJ 201-949-0705 erebus.att.com!wcs

# ho95c has gone the way of all VAX/785s, so I'm now on erebus.att.com