Newsgroups: news.software.b
Path: utzoo!sq!lee
From: lee@sq.sq.com (Liam R. E. Quin)
Subject: Re: Modifying news storage for fast searches
Message-ID: <1989Nov27.064440.20233@sq.sq.com>
Reply-To: lee@sq.com (Liam R. E. Quin)
Organization: Unixsys (UK) Ltd
References: <51195@looking.on.ca> <2179@prune.bbn.com> <1989Nov26.032124.680@sq.sq.com>
Date: Mon, 27 Nov 89 06:44:40 GMT

Key-word news might not be as useful as it sounds.  The searching is fun,
but it turns out that we often want to look for unusual permutations of
frequent words.  The savings in space are probably minimal compared to the
extra CPU, program complexity and programmer effort.

>In article <2179@prune.bbn.com> rsalz@bbn.com (Rich Salz) writes:
>>There was a mini text-retrieval system that appeared in comp.sources.misc
>>qndxr I think the name was.  There will be a bigger system in c.s.unix in
>>a couple of weeks.

Canada-dwellers who can't wait can now ftp the sources of my lq-text
package from
	radio.astro.toronto.edu
which is (according to telnet) 128.100.75.4
The file is in utils/lq-text.shar.Z

This is an awfully early release, but I have less than a month left on the
net, so I thought it better to post it while I could still receive and send
patches and fixes, etc...

It was only really effective on news when I ran a filter to remove most of
the headers.  The same is true for mail -- consider all of the words in
the Path: header line...

For transmission, consider that the words-file must never be separated from
the article, or one ends up with what might in effect be an insoluble
encryption problem!

Lee
-- 
Liam R. Quin, Unixsys (UK) Ltd [note: not an employee of "sq" - a visitor!]
lee@sq.com (Whilst visiting Canada from England, until Christmas)
utai!anduk.uucp!lee (after Christmas)
...striving to promote the interproduction of epimorphistic conformability