Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watnot!ccplumb
From: ccplumb@watnot.UUCP
Newsgroups: comp.unix.wizards,comp.unix.questions
Subject: Re: Spell and /usr/dict/web2
Message-ID: <12630@watnot.UUCP>
Date: Wed, 18-Mar-87 10:29:10 EST
Article-I.D.: watnot.12630
Posted: Wed Mar 18 10:29:10 1987
Date-Received: Fri, 20-Mar-87 06:20:54 EST
References: <135@utecfb.Toronto.Edu> <2563@hcr.UUCP>
Reply-To: ccplumb@watnot.UUCP (Colin Plumb)
Organization: U. of Waterloo, Ontario
Lines: 25
Keywords: spell dictionary web2
Xref: utgpu comp.unix.wizards:1439 comp.unix.questions:1421

In article <2563@hcr.UUCP> mike@hcr.UUCP (Mike Tilson) writes:
>The question was whether one would get a better "spell" hash table
>by building it from the Webster's word list rather than from the
>much shorter dictionary used by the "spell" makefile.
>
>I believe the answer is "no".
>
>First, if you do greatly expand the dictionary, remember to increase
>the size of the hash table. The algorithm works best when the final
>bit table has about 50% ones and 50% zeros. With a larger dictionary
>and no increase in table size, you will turn on more bits. In the limit,
>all bits will be on, and all input to spell will be accepted as correct
>spelling.

Um... not quite. If the table is 50% ones, then 50% of any collection
of random character strings would be accepted as correct. As I recall,
from Jon Bentley's discussion of spell in Programming Pearls, the hash
table size was chosen so that, assuming 20 reported problems per run,
a misspelling would slip through about once every hundred runs, so the
ones should fill up 1/2000 of the hash table.
--
	-Colin Plumb (watmath!watnot!ccplumb)

Zippy says:
My CODE of ETHICS is vacationing at famed SCHROON LAKE in upstate New York!!
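
The arithmetic in the reply above can be checked with a short Python sketch. It is not the code of spell itself. It assumes a plain bit table probed by a single, uniformly distributed hash, so that the chance of a random string being accepted equals the fraction of bits set. The names required_fill and simulate are made up for illustration.

```python
import random


def required_fill(reports_per_run, runs_per_miss):
    """Fraction of bits that may be set if one misspelling is allowed to
    slip through per `runs_per_miss` runs of `reports_per_run` reports.

    Assumption: a single hash function, so a random string is accepted
    with probability equal to the fraction of ones in the table.
    """
    return 1 / (reports_per_run * runs_per_miss)


def simulate(table_bits, n_words, n_probes, seed=0):
    """Fill a table from `n_words` random hashes, then probe it with
    `n_probes` random hashes standing in for misspelled words.

    Returns (fraction of bits set, fraction of probes accepted).
    """
    rng = random.Random(seed)
    table = bytearray(table_bits)  # one byte per bit, for simplicity
    for _ in range(n_words):
        table[rng.randrange(table_bits)] = 1
    accepted = sum(table[rng.randrange(table_bits)] for _ in range(n_probes))
    return sum(table) / table_bits, accepted / n_probes


# 20 reports per run, one miss per 100 runs: 1 in 2000 bits may be set.
print(required_fill(20, 100))  # 0.0005

# A table that is roughly half ones accepts roughly half of all garbage.
fill, accept_rate = simulate(table_bits=10_000, n_words=7_000, n_probes=20_000)
print(fill, accept_rate)
```

Under these assumptions the acceptance rate tracks the fill fraction, which is why a half-full table would pass about half of all random strings, and why the target of one miss per 2000 misspellings puts the fill fraction at 1/2000.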