Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!ucsd!ucbvax!decwrl!sun!pitstop!sundc!seismo!uunet!mcvax!tuvie!tuhold!thom
From: thom@tuhold (Thom Fruehwirth)
Newsgroups: comp.lang.prolog
Subject: Re: Soundex revisited
Message-ID: <1190@tuhold>
Date: 1 Sep 88 11:52:11 GMT
Organization: Institut f. Angewandte Informatik, TU Vienna
Lines: 40

In a recent article ok@quintus.uucp (Richard A. O'Keefe) writes:
> Since the specification of the Soundex function requires that
> duplicate characters be discarded, it is possible to construct
> strings of arbitrary length _all_ of which must be inspected
> even if only 3 codes are produced.  (Stick "sss.......sss" in
> front of any word you like.)

But what sense does this make?

> Fruehwirth is quite right that I didn't take my "implementation"
> version as far as I could have.  Here's my "optimised" version:  the
> soundex_atoms/2 predicate and char_to_code/2 table didn't change.

I am pleased to see that Richard O'Keefe's code follows my initial
suggestions on how to transform soundex(-like) specifications.
Only one small transformation is still missing: transforming
zeros/2 away. It's a minor change, but it avoids the roundabout way of
counting the number of character codes produced so far:

% soundex_chars(+Letters, +PreviousCode, +FillInCode, -Digits)
% in our case FillInCode = "000"

soundex_chars([], _, Zeros, Zeros).
soundex_chars([Char|Chars], Prev, Zeros, Codes) :-
        char_to_code(Char, Code),
        (   Code =:= Prev ->        % Discard duplicate characters
            soundex_chars(Chars, Code, Zeros, Codes)
        ;   Code =:= 0 ->           % Ignore vowels
            soundex_chars(Chars, Code, Zeros, Codes)
        ;   Zeros = [_] ->          % Stop when 3 digits have been done
            Codes = [Code]
        ;   Zeros = [_|Zeros1],     % otherwise, convert 1 more character
            Codes = [Code|Codes1],
            soundex_chars(Chars, Code, Zeros1, Codes1)
        ).

Isn't it more beautiful this way?

thom fruehwirth

PS: Efficiency is about the same; results depend on the test cases used.
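
For readers who want to try the predicate, here is a minimal sketch of how soundex_chars/4 might be called. It is not the code from the earlier articles: the code_class/2 table, the char_to_code/2 wrapper around it, and the soundex_digits/2 driver are hypothetical stand-ins, only lowercase letters are handled, and the fill-in list is written as the integers [0,0,0] so that it agrees with the `Code =:= 0` test. The soundex_chars/4 clauses are repeated only to keep the sketch self-contained.

```prolog
% ASSUMPTION: simplified letter table, not the original char_to_code/2.
% Letters without an entry (vowels, h, w, y) get the "ignore" code 0.
code_class(0'b, 1).  code_class(0'f, 1).  code_class(0'p, 1).
code_class(0'v, 1).
code_class(0'c, 2).  code_class(0'g, 2).  code_class(0'j, 2).
code_class(0'k, 2).  code_class(0'q, 2).  code_class(0's, 2).
code_class(0'x, 2).  code_class(0'z, 2).
code_class(0'd, 3).  code_class(0't, 3).
code_class(0'l, 4).
code_class(0'm, 5).  code_class(0'n, 5).
code_class(0'r, 6).

char_to_code(Char, Code) :-
        (   code_class(Char, Class) -> Code = Class
        ;   Code = 0
        ).

% soundex_chars/4 as in the article: the third argument carries the
% not-yet-used fill-in digits, so no counter is needed.
soundex_chars([], _, Zeros, Zeros).
soundex_chars([Char|Chars], Prev, Zeros, Codes) :-
        char_to_code(Char, Code),
        (   Code =:= Prev ->        % Discard duplicate characters
            soundex_chars(Chars, Code, Zeros, Codes)
        ;   Code =:= 0 ->           % Ignore vowels
            soundex_chars(Chars, Code, Zeros, Codes)
        ;   Zeros = [_] ->          % Last fill-in digit: emit and stop
            Codes = [Code]
        ;   Zeros = [_|Zeros1],     % Use up one fill-in digit
            Codes = [Code|Codes1],
            soundex_chars(Chars, Code, Zeros1, Codes1)
        ).

% Hypothetical driver: the three digits that follow the initial letter.
soundex_digits(Word, Digits) :-
        atom_codes(Word, [First|Rest]),
        char_to_code(First, Prev),
        soundex_chars(Rest, Prev, [0,0,0], Digits).
```

With these definitions, `?- soundex_digits(robert, D).` gives `D = [1,6,3]`, and `?- soundex_digits(lee, D).` gives `D = [0,0,0]`: when the input runs out early, the first clause hands back whatever is left of the fill-in list as the tail of the result.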