Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site hcr.UUCP
Path: utzoo!hcr!mike
From: mike@hcr.UUCP (Mike Tilson)
Newsgroups: comp.unix.wizards,comp.unix.questions
Subject: Re: Spell and /usr/dict/web2
Message-ID: <2563@hcr.UUCP>
Date: Mon, 16-Mar-87 09:22:35 EST
Article-I.D.: hcr.2563
Posted: Mon Mar 16 09:22:35 1987
Date-Received: Tue, 17-Mar-87 05:39:19 EST
References: <135@utecfb.Toronto.Edu>
Organization: HCR Corporation, Toronto
Lines: 25
Keywords: spell dictionary web2

The question was whether one would get a better "spell" hash table
by building it from the Webster's word list rather than from the much
shorter dictionary used by the "spell" makefile. I believe the answer
is "no".

First, if you do greatly expand the dictionary, remember to increase
the size of the hash table. The algorithm works best when the final
bit table has about 50% ones and 50% zeros. With a larger dictionary
and no increase in table size, you will turn on more bits. In the
limit, all bits will be on, and all input to spell will be accepted
as correct spelling.

The biggest problem is that a large dictionary contains many "strange"
words. Many of these words differ from common words by only one letter
(or one letter transposition). This means you increase the chance of
a typo turning into another valid word in the dictionary.

It turns out that you are better off using a smaller dictionary. If
you use a very rare word, spell will report it, but you simply ignore
the report. On the other hand, more of your typing and spelling errors
will also be reported.

I believe there was a paper that discussed this subject in CACM a few
months back (I don't have it at hand, so it might have been in some
other journal).

/Michael Tilson, HCR Corporation
{utzoo,utcsri,ihnp4,...}!hcr!mike
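
To make the table-size point above concrete, here is a rough sketch in
Python. It is not spell's actual code. It assumes a Bloom-filter-style
bit table in which each word sets NUM_HASHES bits; the hashing scheme,
the table size, the word counts, and all function names are made up for
illustration. With the table size held fixed, the fraction of bits set
climbs as the word list grows, and so does the chance that a string
never put in the table is accepted anyway.

```python
import hashlib


def bit_positions(word, table_bits, num_hashes):
    """Yield num_hashes bit positions for word (hypothetical hashing scheme)."""
    for i in range(num_hashes):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % table_bits


def build_table(words, table_bits, num_hashes):
    """Return a bit table (one byte per bit, for clarity) with every word entered."""
    table = bytearray(table_bits)
    for word in words:
        for pos in bit_positions(word, table_bits, num_hashes):
            table[pos] = 1
    return table


def accepts(table, word, num_hashes):
    """A word is accepted only if all of its bits are on."""
    return all(table[pos] for pos in bit_positions(word, len(table), num_hashes))


def fill_fraction(table):
    """Fraction of bits in the table that are on."""
    return sum(table) / len(table)


def false_accept_rate(table, num_hashes, trials=2000):
    """Fraction of never-entered probe strings that the table accepts."""
    hits = sum(accepts(table, f"not-a-word-{i}", num_hashes) for i in range(trials))
    return hits / trials


if __name__ == "__main__":
    TABLE_BITS = 8000   # assumed table size, held fixed on purpose
    NUM_HASHES = 4      # assumed number of hash functions per word
    for count in (400, 1400, 6000, 40000):
        words = [f"word{i}" for i in range(count)]
        table = build_table(words, TABLE_BITS, NUM_HASHES)
        print(count, "words:",
              "fill", round(fill_fraction(table), 3),
              "false accepts", round(false_accept_rate(table, NUM_HASHES), 3))
```

Under these assumptions the table is about half full at the second
size. At the largest size essentially every bit is on, so nearly any
input is accepted. That is the "in the limit" case described above.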