Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!gatech!hao!oddjob!gargoyle!ihnp4!cbosgd!cwruecmp!hal!ncoast!allbery
From: allbery@ncoast.UUCP (Brandon Allbery)
Newsgroups: news.admin
Subject: Re: keyword-based news
Message-ID: <2855@ncoast.UUCP>
Date: Sat, 11-Jul-87 01:12:35 EDT
Article-I.D.: ncoast.2855
Posted: Sat Jul 11 01:12:35 1987
Date-Received: Sun, 12-Jul-87 15:56:04 EDT
References: <266@brandx.rutgers.edu> <8262@utzoo.UUCP> <2185@hplabsc.UUCP>
Reply-To: allbery@ncoast.UUCP (Brandon Allbery)
Followup-To: news.admin
Organization: Cleveland Public Access UN*X, Cleveland, Oh
Lines: 56

As quoted from <2185@hplabsc.UUCP> by taylor@hplabs.HP.COM (Dave Taylor):
+---------------
| As a side note, I hacked up a newsreader that is based purely on keywords
|
| And as to the stuff that isn't keyworded correctly, well, if you think
| about it, as more and more people were to use a system of this nature
| the articles would become better and better keyworded since if you are
| going to go to the trouble of WRITING an article, you certainly want to
| make sure that the maximal number of people READ it, right? (this can
+---------------

From experience: Someone may, in an article keyworded to A, B, C, and D, make
a reference to E which is so minor as to not deserve keywording... until it
turns out that that reference answers another person's question, but that
person never gets to see it on a keyword search for E. In fact, you can
change "may" to "will"; it happens all the time.

The only way I see to get keywords working is to potentially use every word in
an article (both header and body) that is not a syntactic particle as a
keyword, after standardizing case and attempting to deal with spelling and
prefixes/suffixes. This doesn't strike me as being very fast, space
conservative, or (without either a better AI program than we've got or a
(horrors!) moderator choosing the keywords) likely to be correct.
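
A minimal sketch of the every-word-is-a-keyword scheme described above, in
Python. Everything in it is an assumption made for illustration: the particle
list, the suffix list, and the function names are hypothetical, and the suffix
stripping is a crude stand-in for real spelling and prefix/suffix handling.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny list of "syntactic particles"; a real
# list would be far longer.
PARTICLES = {"a", "an", "and", "as", "in", "is", "it", "of", "or", "the", "to"}

# Crude suffix stripping (assumption: not a real stemmer, and it will
# mangle plenty of English words).
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(word):
    """Lower-case a word and strip at most one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters of stem so short words survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keywords(text):
    """Return the normalized non-particle words of an article (header + body)."""
    words = re.findall(r"[A-Za-z]+", text)
    return {normalize(w) for w in words if w.lower() not in PARTICLES}


def build_index(articles):
    """Map each keyword to the set of article IDs that contain it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for kw in keywords(text):
            index[kw].add(article_id)
    return index


def search(index, word):
    """Look up a query word after applying the same normalization."""
    return index.get(normalize(word), set())
```

With `index = build_index({"<1@a>": "...", "<2@b>": "..."})`, a call such as
`search(index, "termcap")` returns the IDs of every article that mentions the
word anywhere, keyworded or not. Since the index holds nearly every word of
every article, its size grows with the total volume of text, which is the
speed and space concern raised above. A search for a particle such as "the"
finds nothing, because particles never reach the index.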
(And even the moderator can mess up.) Of course, omitting syntactic particles
makes it difficult to find the article in what is now soc.lang.english (if
there is such; I haven't checked) on uses of the word "the"....

+---------------
| and get my knews reader up to a sufficient state to allow me to post it
| to net.sources (errr, to whatever group is appropriate, since It Is Obvious
| that Unmoderated Groups are Evil (even though I have proposed a scheme to
+---------------

comp.sources.misc

The problems with any unmoderated scheme are amply demonstrated by the bogus
posting by richard@bigtuna.UUCP of a month back. It doesn't matter WHAT you
do, people will scream bloody murder if they can't use net.sources as
comp.sources.d. (There were more discussions in net.sources than in
comp.sources.d during its final two weeks, even ignoring the discussions about
net.sources becoming moderated. How do I know? Erik Fair jumped the gun and
my mailbox was suddenly filled with 15 duplicate copies of every message sent
to net.sources. Once I eliminated the duplicates, the amount of non-source
in net.sources was absolutely disgusting.)

I, too, will try to find time to implement the keyword scheme I discussed
above: I'm interested in seeing how bad it really is. I hate to imagine the
keyword database, though....
-- 
[Copyright 1987 Brandon S. Allbery, all rights reserved] \ ncoast 216 781 6201
[Redistributable only if redistribution is subsequently permitted.] \ 2400 bd.
Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
{{ames,harvard,mit-eddie}!necntc,{well,ihnp4}!hoptoad,cbosgd}!ncoast!allbery
<>