Path: utzoo!attcan!uunet!lll-winken!lll-tis!mordor!joyce!ames!amdcad!sun!quintus!ok From: ok@quintus.uucp (Richard A. O'Keefe) Newsgroups: comp.lang.prolog Subject: Re: Soundex revisited Message-ID: <351@quintus.UUCP> Date: 3 Sep 88 11:05:03 GMT References: <1178@tuhold> <816@kuling.UUCP> <334@quintus.UUCP> <818@kuling.UUCP> Sender: news@quintus.UUCP Reply-To: ok@quintus.UUCP (Richard A. O'Keefe) Organization: Quintus Computer Systems, Inc. Lines: 196 In article <818@kuling.UUCP> waldau@kuling.UUCP (Mattias Waldau) writes: >He also says that his program should be easier to understand since it >is shorter, but I say that this only holds under the assumption that >you have to read the whole program to understand what it describes. > >An example: we want to define the relation that removes the vowels >from a sequence of letters. [He provides a Prolog version and the following 1st-order version:] >remove_vowels(With, Without) <-> > subsequence(Without, With) & > (Elements){Element in Without -> ~vowel(Element)} (1) The two definitions do not say the same thing. Consider the simpler case of vowel(0), nonvowel(1). With that definition, the Prolog version would reject remove_vowels([0,*], [*]), but the version quoted above would accept it. (2) By assuming that "the predicate... subseqence [is] already understood by the programmer", Waldau builds in the conclusion he wants. I would derive the Prolog version from the generic predicate exclude/3: remove_vowels(With, Without) :- exclude(vowel, With, Without). I believe that by distinguishing between "specific" predicates and "generic" predicates (predicate schemes) it is possible to produce a Prolog-like language with second-order convenience but provably first-order power, but I haven't proved it yet. I think it is fair to compare the full text of Waldau's specification with the full text of my "implementation" because both were self- contained. Waldau says >I generally believe that quantifiers are easier to understand >than >recursion (3) I guess it's a matter of what you're used to. I find definitions like exclude P [] = []. exclude P [X|Xs] = exclude P Xs if P(X). exclude P [X|Xs] = [X|exclude P Xs] otherwise. remove_vowels = exclude vowel. ever so much clearer than Waldau's version with explicit quantifiers. {Note that if Waldau can assume subsequence/2, I can assume exclude, so this specification has *neither* recursion nor quantifiers.} I would point out that definitions like this are (equational) logic too. I'd also be happy with a specification like remove_vowels(With, Wout) <-> subsequence(With, Wout) & range(Wout) .intersect. vowels = {} Again, this specification has *neither* recursion nor quantifiers. The whole thing can be expressed in APL as (WITH .notelement VOWELS) .compress WITH which is very much in the 'exclude' spirit. I think it is a bad idea to have a lot of explicit recursion in a specification and an even worse one to have a lot of explicit quantifiers. A specification should build up a high level vocabulary which permits the *succint* statement of the interesting bits. For an example of this, consider the Soundex algorithm again. It's about sequences. A language with a LOT of predefined vocabulary for sequences is APL. Modulo the fact that I haven't touched APL in about 5 years, here's an APL definition of Soundex. (I know that characters aren't integers in APL; I also know what to do about that.) $ K <- code # nullary function [1] K <- 256 .rho 0 # 256s 0s [2] K["aehiouwy"] <- 1 ... [7] K["r"] <- 6 $ $ S <- soundex X ;K # K is a local [1] K <- code[X] # K[i] = code[X[i]] [2] K <- ((1 .drop K) = -1 .drop K) .compress 1 .drop K # no dups [3] K <- (K = 0) .compress K # discard vowels [4] S <- X[1], 3 .take K, 0 0 0 # 1st letter, 3 0-padded digits $ That's pretty concise, and it manages without any form of iteration, whether indexed loops, recursion, or quantifiers, because it starts with a large given sequence-manipulation vocabulary. I have enough sequence-handling "predicate schemas" kicking around in my libraries that I can do this kind of thing in Prolog as soundex([X|Xs], [X,D1,D2,D3]) :- maplist(code, [X|Xs], [K|K1]), %\ these three schema exclude(=:=, K1, [K|K1], K2), %| instances combine into exclude(=:=(0), K2, K3), %/ a single induction on K1 append(K3, [0,0,0], [D1,D2,D3|_]). which is a deliberate parallel to the APL version. Just for interest, let's see what happens when we attack this with a few simple program transformations. a. Unfold the call to maplist. It is replaced by code(X, K), maplist(code, Xs, K1) b. Introduce an auxiliary predicate soundex(K1, K, Xs, K3) :- maplist(code, Xs, K1), exclude(=:=, K1, [K|K1], K2), exclude(=:=(0), K2, K3). c. Unfold maplist again, getting two cases: K1 = [] and K1 = [C|Cs]. soundex([], K, [], K3) :- exclude(=:=, [], [K], K2), exclude(=:=(0), K2, K3). soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), maplist(code, Xs, Cs), exclude(=:=, [C|Cs], [K,C|Cs], K2), exclude(=:=(0), K2, K3). d. Unfold exclude/4 and exclude/3 in the first clause, getting soundex([], _, [], []). e. Unfold exclude/4 in the second clause, getting two new clauses soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =:= K, maplist(code, Xs, Cs), exclude(=:=, Cs, [C|Cs], K2), exclude(=:=(0), K2, K3). soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =\= K, maplist(code, Xs, Cs), exclude(=:=, Cs, [C|Cs], K2), exclude(=:=(0), [C|K2], K3). f. Fold the original definition of soundex/4 into the first clause, getting soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =:= K, soundex(Cs, C, Xs, K3). g. Unfold exclude/3 in the last clause, replacing it by two new clauses: soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =\= K, maplist(code, Xs, Cs), exclude(=:=, Cs, [C|Cs], K2), 0 =:= C, exclude(=:=(0), K2, K3). soundex([C|Cs], K, [X|Xs], [C|K3]) :- code(X, C), C =\= K, maplist(code, Xs, Cs), exclude(=:=, Cs, [C|Cs], K2), 0 =\= C, exclude(=:=(0), K2, K3). h. Fold the original definition of soundex/4 into these two clauses. The complete definition of soundex/4 now looks like soundex([], _, [], []). soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =:= K, soundex(Cs, C, Xs, K3). soundex([C|Cs], K, [X|Xs], K3) :- code(X, C), C =\= K, 0 =:= C, soundex(Cs, C, Xs, K3). soundex([C|Cs], K, [X|Xs], [C|K3]) :- code(X, C), C =\= K, 0 =\= C, soundex(Cs, C, Xs, K3). Going from this to the version I offered a while back with if-then-elses (C =:= K -> .. ; C =:= 0 -> .. ; ..) is easy. This is such a *mechanical* derivation from the original specification that one normally just writes it straight down. That's one of the reasons why I stick with languages like ML and Prolog rather than going for full 1st-order predicate calculus with equality: with the aid of a few simple ideas I can do derivations like this and *easily* produce efficient code at the other end.