Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!ames!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok From: ok@quintus.uucp (Richard A. O'Keefe) Newsgroups: comp.lang.prolog Subject: Re: Some comments on term comparison Message-ID: <377@quintus.UUCP> Date: 9 Sep 88 23:56:39 GMT References: <25985@ucbvax.BERKELEY.EDU> Sender: news@quintus.UUCP Reply-To: ok@quintus.UUCP (Richard A. O'Keefe) Organization: Quintus Computer Systems, Inc. Lines: 112 In article <25985@ucbvax.BERKELEY.EDU> vanroy@ji.Berkeley.EDU (Peter Van Roy) writes: >Consider the following predicate: > a(X,X). > a(Y,Z). > a(W,T). > a(a,b). >The query > | ?- setof(a(X,Y), a(X,Y), S). >gives the result (slightly edited for readability): > S = [a(A,B),a(C,D),a(E,E),a(a,b)] > >It would be better if one of a(A,B) and a(C,D) were deleted from this result. >In order to do this, two terms such as a(X,X) and a(Y,Y) which have the >same pattern of bound variables should be considered equal. >This requires a new meaning for the equality of two terms. [Shouldn't that be "same pattern of UNbound variables"?] The basic problem here is not really a problem with term comparison. There are two ways of looking at these variables (A,B,C,D,E). (1) Since the variables are really "copies" of the original X,Y, we should have delayed until X and Y were sufficiently instantiated to ensure that duplicate elimination never had to compare a variable with a variable. In this view, the mistake is returning any answer at all: if we were to say | ?- all_solutions(a(X,Y), a(X,Y), S), X=foo, Y=baz. the answer should be S=[a(foo,baz)]. (2) The other view is that the variables which came back are METAVARIABLES, and the list S is really at the meta-level: is_a_solution(X) <-> (exists E) member(E, S) & instance_of(X, E) In this view, the set S *really* means [a($1,$1),a($1,$2),a(a,b)] But note that in order to reduce redundancy, we should not just drop an element E when it is *equivalent to* another element of S, but we should also drop E if it is *subsumed by* another element of S. That is to say, a($1,$1) should ALSO be dropped, and a non-redundant result is [a($1,$2),a(a,b)]. Note that in neither case do we have any reason to retain a(E,E). Since a(p,p) is an instance of both a($1,$1) and a($1,$2), it is implicitly present *twice* in [a(A,B),a(E,E),a(a,b)]. If we are willing to tolerate that, we have lost our excuse for demanding that a(C,D) be deleted. Van Roy's proposal (which I heard a couple of days before from Herve' Touati) is simply to compare terms by comparing "standardised" alphabetic variants of them. >Structural comparison is easy to implement. One possibility is to instantiate >all variables in each term temporarily to unique atoms in a sequence, do a >term comparison between the resulting ground terms, and then uninstantiate the >variables. Note that you have to standardise the terms *separately*, because they might share variables. {Instead of van Roy's "inst_vars/2", you would do well to use number_vars/3, which binds variables to '$VAR'(Int) terms.} Note that the desired effect can be obtained for the special case of setof/3 by doing touati_setof(Template, Generator, Set) :- quantify_and_number(Generator, Template, Goal), bagof(Template, Goal, Bag), sort(Bag, Temp), varnumbers_list(Temp, Set). quantify_and_number(Q^Generator, Template, Q^Goal) :- !, quantify_and_number(Generator, Template, Goal). quantify_and_number(Goal, Template, N^(Goal,numbervars(Template,0,N))). varnumbers_list([], []). varnumbers_list([T|Ts], [S|Ss]) :- varnumbers(T, S), varnumbers_list(Ts, Ss). where varnumbers/2 is an inverse of numbervars(_, 0, _). The query | ?- touati_setof(a(X,Y), a(X,Y), S). gave S = [a(a,b),a(A,A),a(B,C)] as the answer, with X and Y still unbound. >Some comments about this method: > 2. Two terms that are equal will still be equal if they are unified > together. > 3. Two terms that are equal can become unequal if one of them becomes > more instantiated. This is never true for ==. Note that unifying two terms which are alphabetic variants of each other can change the relative order of two OTHER terms. Part of sort/2 looks like this: sort(2, [X,Y|Rest], Answer, Rest) :- !, compare(R, X, Y), sort2(R, X, Y, Answer). sort2(<, X, Y, [X,Y]). sort2(>, X, Y, [Y,X]). sort2(=, X, X, [X]). <----- The last clause may indifferently be written sort2(=, X, X, [X]), sort(=, X, _, [X]), or sort(=, _, X, [X]), because with the standard ordering, unifying two identical terms has no effect. With the "alphabetic variant" ordering, unifying term equivalent terms may change the relative order of OTHER terms, so it MUST NOT be done by sort/2. So one of the other alternative versions of the last clause must be used. But there are two choices, and they yield distinguishable results. Note also that maintaining two binding environments during term comparison (should there be any variables which have to be compared) is intrinsically more costly than maintaining none.