Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!hao!ames!sdcsvax!ucbvax!decvax!reeves
From: reeves@decvax.UUCP (Jon Reeves)
Newsgroups: comp.unix.questions
Subject: Re: spell bug?????
Message-ID: <128@decvax.UUCP>
Date: Wed, 12-Aug-87 20:22:32 EDT
Article-I.D.: decvax.128
Posted: Wed Aug 12 20:22:32 1987
Date-Received: Sat, 15-Aug-87 04:42:42 EDT
References: <541@augusta.UUCP> <727@houxa.UUCP> <1673@ncr-sd.SanDiego.NCR.COM> <1262@sol.ARPA>
Reply-To: reeves@decvax.UUCP (Jon Reeves)
Organization: Digital Eq. Corp. - Merrimack NH.
Lines: 24
Keywords: spell
Summary: BSD and USG spells use different algorithms

In article <1262@sol.ARPA> ken@cs.rochester.edu (Ken Yap) writes:
>Nobody has mentioned the possibility that the words in question fell
>through spell's probabilistic detection algorithm.

This is, indeed, what happened. Ken's description was mostly correct
for BSD systems (except that the magic number is 11). Even with proper
tuning, the algorithm still has a 1-in-2048 chance of generating a
false match. As an example, "nbclowd" collides with:

chloroplatinate, telephone/vulnerable, nonogenarian/polarography,
crocus, gummy, irremovable, kingpin, lucre/tangential, alma,
whirligig, Curran.

(Where two words are separated by a slash, both collide with the same
hash value.)

On a BSD-derived system, the best way to check for nonsense strings is
to use match (or grep, or comm) against /usr/dict/words. Spell is
designed for catching typos.

System V uses a completely different algorithm that can't generate
this kind of false match.
--
Jon Reeves	decvax!reeves -or- reeves@decvax.dec.com
"[T]he use of the binary system in the machine is a passing phase ..."
	- Douglas Hartree, University of Cambridge, 1949.
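
To make the contrast between a probabilistic hashed check and an exact
lookup against a word list concrete, here is a minimal Python sketch.
It is NOT the BSD spell code. It is a generic Bloom-filter-style table,
which is one common way to build a check that can report false matches
but never misses a stored word. The names BloomFilter, exact_lookup and
load_words, the use of SHA-256, and the table size are all assumptions
made for illustration. The 11 probes only echo the "magic number"
mentioned above and do not reproduce spell's real hash functions.

```python
import hashlib


class BloomFilter:
    """Toy probabilistic set: k hash probes into an m-bit table (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Derive k bit positions by salting a digest with the probe index.
        # (Assumption: SHA-256 stands in for whatever hashes a real tool uses.)
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, word):
        # True means "probably present" (could be a false match);
        # False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(word))


def exact_lookup(word, words):
    """Exact membership test, comparable to matching a whole line in a word list."""
    return word.lower() in words


def load_words(path="/usr/dict/words"):
    """Read a one-word-per-line list; the path may differ on your system."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


if __name__ == "__main__":
    sample = {"crocus", "gummy", "kingpin", "whirligig"}
    bloom = BloomFilter(num_bits=1024, num_hashes=11)
    for w in sample:
        bloom.add(w)
    # Every stored word is always reported as present.
    print(all(bloom.might_contain(w) for w in sample))
    # The exact check cannot be fooled by a hash collision.
    print(exact_lookup("nbclowd", sample))
```

The hashed table trades a small false-match rate for a much smaller
memory footprint, which is a reasonable bargain for catching typos.
For deciding whether an arbitrary string is a real word, the exact
lookup is the safer tool.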