Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!caip!clyde!burl!ulysses!bellcore!whuxcc!lcuxlm!whuxl!mike
From: mike@whuxl.UUCP (BALDWIN)
Newsgroups: net.sources
Subject: Re: Re: soundex algorithm wanted
Message-ID: <1239@whuxl.UUCP>
Date: Wed, 3-Sep-86 18:06:54 EDT
Article-I.D.: whuxl.1239
Posted: Wed Sep 3 18:06:54 1986
Date-Received: Thu, 4-Sep-86 04:06:43 EDT
References: <27@houligan.UUCP> <672@bnrmtv.UUCP>
Organization: AT&T Bell Laboratories, Whippany
Lines: 62

> > I would like any info pertaining to soundex search algorithms
> > (phonetic grep).  Source to a nifty, efficient algorithm would
> > be great, but I'll take anything.  Thanx in advance.
> >
>
> /*********************************************************\
> * This program exemplifies the soundex algorithm.         *
> *                                                         *
> * You type in a word and it spits out the soundex string  *
> * that was produced for that word.                        *
> \*********************************************************/

Unfortunately, it doesn't generate correct Soundex codes.  The algorithm
is actually pretty tricky, and I've seen lots of versions that don't
handle names like Lloyd and Manning properly.  Here's one that I believe
is correct:
-----
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define SDXLEN	4

char *
soundex(name)
char	*name;
{
	static char	buf[SDXLEN+1];	/* overwritten by each call */
	register char	c, lc, prev = '0';
	register int	i;

	strcpy(buf, "a000");		/* default code, zero padded */

	for (i = 0; *name && i < SDXLEN; name++)
		if (isalpha(*name)) {
			lc = tolower(*name);
			/* digit for each letter a-z; '0' means not coded */
			c = "01230120022455012623010202" [lc-'a'];
			/* keep the first letter; then skip 0s and repeats */
			if (i == 0 || (c != '0' && c != prev)) {
				buf[i] = i ? c : lc;
				i++;
			}
			prev = c;
		}
	return buf;
}
-----
And a little driver for it:
-----
main()
{
	char	line[64];

	/* fgets keeps the newline, which soundex() ignores */
	while (fgets(line, sizeof line, stdin))
		puts(soundex(line));
	return 0;
}
-- 
Michael Baldwin
(not the opinions of) AT&T Bell Laboratories
{at&t}!whuxl!mike
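
For anyone building this with a current compiler, here is a minimal sketch of the same routine with ANSI prototypes.  The name soundex_r, the caller-supplied buffer, the assert check, and the guard against letters outside a-z are my assumptions and are not in the posted code; the code table and the coding rule are taken from it unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SDXLEN 4

/* Digit class for each letter a..z, copied from the posted routine;
 * '0' marks letters that are never coded. */
static const char sdx_codes[] = "01230120022455012623010202";

/* Hypothetical reentrant variant: writes the code into out[], which must
 * hold at least SDXLEN+1 bytes, instead of a static buffer.  Returns out. */
char *soundex_r(const char *name, char *out)
{
    char prev = '0';
    int i = 0;

    assert(name != NULL && out != NULL);
    strcpy(out, "a000");            /* same default padding as the post */

    for (; *name != '\0' && i < SDXLEN; name++) {
        unsigned char ch = (unsigned char)*name;
        char lc, c;

        if (!isalpha(ch))
            continue;
        lc = (char)tolower(ch);
        if (lc < 'a' || lc > 'z')   /* assumption: ignore non-ASCII letters */
            continue;
        c = sdx_codes[lc - 'a'];

        if (i == 0)
            out[i++] = lc;          /* first letter is kept as a letter */
        else if (c != '0' && c != prev)
            out[i++] = c;           /* coded letter differing from the last */
        prev = c;                   /* uncoded letters reset prev to '0' */
    }
    return out;
}
```

With char buf[SDXLEN+1], soundex_r("Lloyd", buf) gives l300 and soundex_r("Manning", buf) gives m552, the same as the posted routine; like it, this keeps the first letter in lower case.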