Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!shadooby!samsung!think!ames!uhccux!munnari.oz.au!mudla!ok
From: ok@mudla.cs.mu.OZ.AU (Richard O'Keefe)
Newsgroups: comp.lang.prolog
Subject: Re: bagof & setof
Message-ID: <2783@munnari.oz.au>
Date: 22 Nov 89 17:34:37 GMT
References: <2177@infovax.lan.informatik.tu-muenchen.dbp.de>
Sender: news@cs.mu.oz.au
Lines: 212

In article <2177@infovax.lan.informatik.tu-muenchen.dbp.de>, grosse@lan.informatik.tu-muenchen.dbp.de (Malte Grosse) writes:
> A few weeks ago there was a request for bagof/3 & setof/3, since 
> it is not built into every Prolog.  With the help of previous
> announcements I was able to implement bagof/3 and setof/3.

Sorry, I seem to have missed your posting.  There is a public-domain
implementation of setof/3 and bagof/3 adapted from the DEC-10 Prolog
system; it includes a predicate free_variables/4.  See below.

> 	- is
> 		    bagof(X, (Y^fact1(X,Y),Z^fact2(X,Z)), L)
> 	  [equivalent] to
> 		    bagof(X, Y^Z^(fact1(X,Y),fact2(X,Z)), L)
> 	  (I assume it)

It is in the public-domain implementation.  However, you should be aware
that there are some Prologs which disagree (which is why we need a standard
that will standardise it instead of dropping it).  In particular, quite a
few take the easy way out and say that in a term like
	X^Y^Z^(....)
(which is parsed as ^(X, ^(Y, ^(Z, (...)) ))	)
the things bound by the top-level existential quantifiers are bound but
once you come to something which is not _^_ you stop looking for quantifiers.

> 	- is
> 		    bagof(X, (Y^fact1(X,Y,Z),Z^fact2(X,Z)), L)
> 	  [equivalent] to
> 		    bagof(X, Y^Z^(fact1(X,Y,Z),fact2(X,Z)), L)
> 	  (I assume it as well)

Neither in the public-domain version, nor in the "quick and dirty" versions
mentioned above, nor in logic would this be true.  If we took logic seriously,
the first would be equivalent to
		bagof(X, (Y1^fact1(X,Y,Z), Z1^fact2(X,Z)), L)
and the second would be equivalent to
		bagof(X, Y1^Z1(fact1(X,Y1,Z1), fact2(X,Z1)), L)

Note that the answer is different for the first question because pushing
the quantifiers Y, Z further out does not result in bogus capture of free
variables.  Here it does.

> 	- is
> 		    bagof(X,Y^(fact(X,Y),!),L)
> 	  the same as
> 		    bagof(X,(fact(X,Y),!),L)

no way.  The second one will return a binding for Y.  The first one
won't.  Why should a cut change anything?

> 	  the same as
> 		    fact(X,_),L=[X],!

Again, no.  The rule is that setof/3 and bagof/3 find bindings for the
FREE variables, and a variable is only free if it appears NEITHER within
the scope of an existential quantifier (the point you seem to be missing
in an earlier question is the "within the scope" bit) NOR within the
template (the first argument).  So provided X does not occur in L already,
bagof(X, ..., L) will not instantiate X.  But the query
		fact(X, _), !, L = [X]
*will* instantiate X.  

> 	- [what] does
> 		    bagof(X, (Y^fact1(X,Y,Z);Z^fact2(X,Z)), L)
> 	  [transform] to?
> 	  to
> 		    bagof(X, Y^Z^(fact1(X,Y,Z);fact2(X,Z)), L)
>  	  or to
> 		    bagof(X, Y^Znew^(fact1(X,Y,Z);fact2(X,Znew)), L)

There are again at least three answers:

(a) the logical answer.
    Think of W^(...W...) as (exists W)(...W...).
    (Y^fact1(X, Y, Z) ; Z^fact2(X, Z))
    thus stands for the logical formula
    (exists Y)fact1(X, Y, Z) or (exists Z)fact2(X, Z)
    which, by renaming bound variables, is equivalent to
    (exists %1)fact1(X, %1, Z) or (exists %2)fact2(X, %2)

    Neither of your suggested translations matches the logical reading.

(b) what the public-domain implementation does.
    It scans over the goal collecting variables which
    -- do not occur in the template (so X would be ignored)
    -- are not within the scope of an existential quantifier binding them
       (so Y would be ignored and the second occurrence of Z would be
       ignored)
    but it does not *rename* the existentially quantified variables.
    So it would see the first occurrence of Z, and would treat this query
    as
	bagof(X, Y^(fact1(X,Y,Z) ; fact2(X,Z)), L)

    This is a MISTAKE in the public-domain implementation; the first person
    to point it out to me was David S. Warren.  Note that this has nothing
    to do with whether it is an "and then" or an "or else" between the two
    goals.

(c) The quick-and-dirty method used by several Prolog systems says that
    since there are no OUTERMOST existential quantifiers, none of the
    variables are existentially quantified.

> 	  or 
> 	     does it give ALWAYS only one instance (the first),
> 	     since the 'fail' to get the solution is a level higher
> 	     than the two facts, because of the 'or'.

I don't understand this.  Level?  The basic outline of bagof/3 is

	bagof(Template, Goal, Bag) :-
		free_variables(Template, Goal, [], Vars),
		(   Vars = [] ->
		    findall(Template, Goal, Bag)
		;   findall(Vars-Template, Goal, RawSolns),
		    <magic>
		    keysort(RawSolns, OrdSolns),
		    group_with_matching_keys(OrdSolns, Vars, Bag)
		).

where
	findall(Template, Goal, List) :-
		(   push special marker on stack,
		    call(Goal),
		    push Template instance on stack,
		    fail
		;   collect Template instances from stack as List
		).

findall/3 is completely blind to the structure of the Goal.  bagof/3 cares
only to the extent that it influences the scan for existential quantifiers.
The POINT of findall/3 and bagof/3 is to find ALL the solutions, not to
cut off at some bizarre place.

I am of the opinion that the "right" reading for the existential quantifiers
is the logical one, and a good way to implement that is to treat them as if
they were local variables and construct a new term with those variables
renamed.  For example,
	(X^p(X,Y), Y^p(X,Y))
should be treated returned as
	Goal' = (p(_1,Y), p(X,_2)), Free Vars = [X,Y]

[By the way, this will be my last posting for a couple of months.]

Here is the public-domain code for free_variables/4.  I warn you, the mistake
is present in this version.  I did think about writing a new version with the
mistake corrected and broadcasting it, but since WG17 Prolog hasn't got
setof/3 or bagof/3 it didn't seem worth the trouble.

%   In order to handle variables properly, we have to find all the
%   universally quantified variables in the Generator.  All variables
%   as yet unbound are universally quantified, unless
%       a)  they occur in the template
%       b)  they are bound by X^P, setof, or bagof
%   free_variables(Generator, Template, OldList, NewList)
%   finds this set, using OldList as an accumulator.

free_variables(Term, Bound, VarList, [Term|VarList]) :-
	var(Term),
	term_is_free_of(Bound, Term),
	list_is_free_of(VarList, Term),
	!.
free_variables(Term, _, VarList, VarList) :-
	var(Term),
	!.
free_variables(Term, Bound, OldList, NewList) :-
	explicit_binding(Term, Bound, NewTerm, NewBound),
	!,
	free_variables(NewTerm, NewBound, OldList, NewList).
free_variables(Term, Bound, OldList, NewList) :-
	functor(Term, _, N),
	free_variables(N, Term, Bound, OldList, NewList).


free_variabels(N, Term, Bound, OldList, NewList) :-
	(   N =:= 0 -> NewList = OldList
	;   arg(N, Term, Argument),
	    free_variables(Argument, Bound, OldList, MidList),
	    M is N-1,
	    free_variables(M, Term, Bound, MidList, NewList)
	).

 
%   explicit_binding checks for goals known to existentially quantify
%   one or more variables.  In particular \+ is quite common.  It is
%   known that the first argument is instantiated.
 
explicit_binding(\+ _,                 Bound, fail,     Bound    ).
explicit_binding(Var^Goal,             Bound, Goal,     Bound+Var).
explicit_binding(setof(Var,Goal,Set),  Bound, Goal-Set, Bound+Var).
explicit_binding(bagof(Var,Goal,Bag),  Bound, Goal-Bag, Bound+Var).
explicit_binding(findall(_,_,Bag),     Bound, Bag,      Bound    ).
 
 
term_is_free_of(Term, Var) :-
	(   var(Term) ->
	    Term \== Var
	;   functor(Term, _, N),
	    term_is_free_of(N, Term, Var)
	).

term_is_free_of(N, Term, Var) :-
	(   N =:= 0 -> true
	;   arg(N, Term, Argument),
	    term_is_free_of(Argument, Var),
	    M is N-1,
	    term_is_free_of(M, Term, Var)
	).


list_is_free_of([], _).
list_is_free_of([Head|Tail], Var) :-
	Head \== Var,
	list_is_free_of(Tail, Var).