Xref: utzoo news.software.b:1598 news.admin:3320 news.sysadmin:944
Path: utzoo!attcan!uunet!cbmvax!snark!eric
From: eric@snark.UUCP (Eric S. Raymond)
Newsgroups: news.software.b,news.admin,news.sysadmin
Subject: Re: Why are news articles separate files?
Summary: Henry is correct, as usual
Message-ID:
Date: 30 Aug 88 05:09:38 GMT
References: <471@icus.uucp> <1988Aug26.160040.22326@utzoo.uucp>
Organization: Somewhere in Hyperspace
Lines: 67

In article <1988Aug26.160040.22326@utzoo.uucp> Henry Spencer writes:
> In article <471@icus.UUCP> lenny@icus.UUCP (Lenny Tropiano) writes:
> >why are each and every news article keep in separate files within separate
> >directories?
>
> The simple answer is "compatibility". In the case of the C News crew,
> we really didn't have a choice, since we didn't plan to rewrite all the
> news readers. 3.0 has been a bit more ambitious, but even there it's
> a substantial win if old news readers continue to work, since there are
> several of them and it's a lot of work to replace them all.

Actually, I *have* replaced all the readers with upward-compatible rewrites,
and added three special-purpose new ones. But all the changes retain news
database compatibility with B2.11.

This is not to argue Henry's point, just to clarify it. Having old readers
continue to work is good.

> In truth, we thought about the matter at some length beforehand, and
> basically decided that we couldn't think of any new way that would be
> *enough* better to justify it.

Ditto.

> >Wouldn't a database solution be more apropos?
>
> Aside from inode conservation, exactly what is the win in this? We could
> not see any in particular. Our solution to the inode problem is to have
> plenty of inodes -- they are not expensive.

Ditto again.

> Performance was THE big issue
> with us, and the 3.0 crew aren't ignoring it either.

Correct.
I know B3.0 ain't quite the screaming hot-rod C news is reputed to be, but
it's up there; last time I checked our profile figures against C news's
published ones there was maybe 15% or so difference. 3.0's goals are
different -- more focused on maintainability, ease of administration, better
reader interfaces and providing a migration path to full distributed
hypertext.

> The existing scheme, although arguably crude, has a lot going for it. It
> is simple. It is robust. It is amenable to manipulation by the standard
> Unix tools, instead of requiring a whole new set of its own. It is fairly
> efficient for the sorts of things that are done often. These are important
> advantages.

100% agreement that these are the right reasons for keeping the format as it
is. In fact, my one experimental change in article tree format turned out to
be premature optimization, and I've removed it.

One thing I am building in right now is code to parse mailboxes as though
they were pseudo-newsgroups. This has involved rigorously separating the
get-article primitive from the reader 'session' and 'presentation' layers
above it and the database or nntp-access layer below it (the analogy with an
OSI stack is intentional, the service libraries really are layered kind of
that way).

So if you want to experiment with a database representation, snarf the beta
and do it. You'll only have to change one module each in the reader libraries
and rnews.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
      Post: 22 S. Warren Avenue, Malvern, PA 19355   Phone: (215)-296-5718
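
For anyone who wants to picture the layering described above (a get-article
primitive sitting between the reader layers and the storage layer), here is a
rough sketch. It is an illustration in Python, not the B3.0 code. The names
ArticleSource, SpoolSource, MboxSource and subjects are all hypothetical, and
the spool backend assumes the traditional one-file-per-article tree, with the
dots in a group name mapped to directory separators.

```python
import email
import mailbox
import os
from abc import ABC, abstractmethod


class ArticleSource(ABC):
    """Hypothetical storage-layer interface; readers call nothing else."""

    @abstractmethod
    def article_numbers(self, group):
        """Return the sorted article numbers available in `group`."""

    @abstractmethod
    def get_article(self, group, number):
        """Return the raw bytes (headers and body) of one article."""


class SpoolSource(ArticleSource):
    """One file per article: <root>/news/software/b/1598 and so on."""

    def __init__(self, spool_root):
        self.spool_root = spool_root

    def _group_dir(self, group):
        return os.path.join(self.spool_root, *group.split("."))

    def article_numbers(self, group):
        gdir = self._group_dir(group)
        try:
            names = os.listdir(gdir)
        except FileNotFoundError:
            return []
        # Keep only plain files whose names are article numbers.
        return sorted(
            int(n) for n in names
            if n.isdigit() and os.path.isfile(os.path.join(gdir, n))
        )

    def get_article(self, group, number):
        with open(os.path.join(self._group_dir(group), str(number)), "rb") as f:
            return f.read()


class MboxSource(ArticleSource):
    """Presents one mbox file as a pseudo-newsgroup, numbered from 1."""

    def __init__(self, group, mbox_path):
        self.group = group
        self.mbox_path = mbox_path

    def article_numbers(self, group):
        if group != self.group:
            return []
        box = mailbox.mbox(self.mbox_path, create=False)
        try:
            return [key + 1 for key in box.keys()]  # mbox keys start at 0
        finally:
            box.close()

    def get_article(self, group, number):
        if group != self.group:
            raise KeyError(group)
        box = mailbox.mbox(self.mbox_path, create=False)
        try:
            return box.get_bytes(number - 1)
        finally:
            box.close()


def subjects(source, group):
    """Reader-layer helper: identical code path for any ArticleSource."""
    result = []
    for number in source.article_numbers(group):
        message = email.message_from_bytes(source.get_article(group, number))
        result.append(message.get("Subject", ""))
    return result
```

Under this arrangement, trying out a database representation would mean
writing one more ArticleSource subclass; subjects() and anything else in the
reader layers would not need to change.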