Path: utzoo!attcan!uunet!munnari.oz.au!goanna!ok
From: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe)
Newsgroups: comp.lang.prolog
Subject: Re: nlp code: request for comments
Message-ID: <4156@goanna.cs.rmit.oz.au>
Date: 30 Oct 90 11:09:11 GMT
References: <4691@rex.cs.tulane.edu>
Organization: Comp Sci, RMIT, Melbourne, Australia
Lines: 237

In article <4691@rex.cs.tulane.edu>, rpg@rex.cs.tulane.edu (Robert Goldman) writes:
> 4. I have used feature structures, along the lines of the ones in Gazdar
> and Mellish's Natural Language Processing in Prolog, as a
> representation for the quasi-logical form.

It's worth noting that their representation for feature structures
(an improper list of Feature:Value pairs) is more than somewhat ugly.
A tiny touch of pre-processing can make the source code much clearer
(declare feature clusters, e.g.
	:- features(case_frame, [agent,patient,beneficiary,...])
 and then write rules that say e.g.
	p(Subj, Features, ...) -->
		{Features^[agent] = Subj}.
) and the code that actually _runs_ much faster (because unifications
are done in-line *as* unifications, not as calls to a non-logical
unify/3 or whatever it was).
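
To make the in-line point concrete, here is a minimal sketch (not the
code mentioned in this thread).  It assumes a cluster is stored as a
term with one argument per declared feature; features/2 written as a
fact, feature_value/4 and nth1_slot/4 are names invented for this
illustration only.

```prolog
% Hypothetical declaration: cluster name and its features, in slot order.
features(case_frame, [agent, patient, beneficiary]).

% feature_value(+Cluster, +Feature, ?Struct, ?Value)
% Struct is the term Cluster(V1,...,Vn); Value is the slot for Feature.
feature_value(Cluster, Feature, Struct, Value) :-
    features(Cluster, Names),
    length(Names, N),
    functor(Struct, Cluster, N),
    nth1_slot(Names, Feature, 1, I),
    arg(I, Struct, Value).

% nth1_slot(+Names, +Feature, +I0, -I): I is the position of Feature,
% counting the first element of Names as position I0.
nth1_slot([Feature|_], Feature, I, I) :- !.
nth1_slot([_|Names], Feature, I0, I) :-
    I1 is I0 + 1,
    nth1_slot(Names, Feature, I1, I).

% ?- feature_value(case_frame, agent, F, john).
% F = case_frame(john, _, _)
```

A preprocessor could run feature_value/4 once at compile time, so that
{Features^[agent] = Subj} turns into the plain head or body unification
Features = case_frame(Subj,_,_).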

I guess I should tidy up the code I gave my students for this and post it.

> I would like to give this semantic processor to my students to
> examine, and I would appreciate it if any of you could comment on the
> coding, and let me know if I have committed any Prolog solecisms.

I hope you really meant that.

> EDITORIAL COMMENT:
> Quite frankly, I have had a fairly difficult time teaching this course
> using Allen's book and Prolog together.

I was doing exactly the same thing this year.  We ended up using rather
little of Allen.  My students got a _lot_ of handouts to make up for it.

> X  a utility predicate
> X  new_atom(A)
> X  A must be unbound.  Will be bound to a new name.

The program appears to be using Quintus Prolog (use of library(basics)
and the like...)  What on earth was wrong with the existing library
predicate gensym/1, or if the "foo" prefix was so very important,
gensym/2?  The :- dynamic declaration for count/1 should have been in
the file gensym.pl (which is the _only_ file that has any business
knowing about that predicate) not in load.pl.
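
As an illustration of keeping the declaration next to its only user,
here is a rough sketch of such a file.  It is not the library's
gensym; fresh_atom/2 and counter/1 are hypothetical names chosen so as
not to clash with anything real.

```prolog
% The counter is private to this file, so it is declared here.
:- dynamic counter/1.
counter(0).

% fresh_atom(+Prefix, -Atom): Atom is Prefix followed by the next integer.
fresh_atom(Prefix, Atom) :-
    retract(counter(N0)),
    N is N0 + 1,
    assertz(counter(N)),
    number_codes(N, Codes),
    atom_codes(Suffix, Codes),
    atom_concat(Prefix, Suffix, Atom).

% ?- fresh_atom(foo, A), fresh_atom(foo, B).
% A = foo1, B = foo2
```

No other file needs to mention counter/1 at all.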


There are actually two essentially unrelated things going on in the
file 'hierarchy.pl'.  First, symbols are being mapped to classes.
The improper-list-of-pairs encoding of feature structures is a rather
poor representation for types.  A much better representation is due to
Chris Mellish.  Suppose we have the single-inheritance 'ako' tree

	a
		b
			c
			d
		e
			f
			g

with the individuals cee: c, dee: d, eff: f, gee: g.
We would map the individuals to terms representing their types thus:

	indiv_type(cee, b(c(_))).
	indiv_type(dee, b(d(_))).
	indiv_type(eff, e(f(_))).
	indiv_type(gee, e(g(_))).

More generally, for each <class> / <parent> arc in the tree, we have
	class_type(<class>, T0, T) :-
		class_type(<parent>, <class>(T0), T).
The top of the hierarchy has the corresponding rule
	class_type(<top>, T, T).
When we have an individual <indiv> belonging to <class> we say
	indiv_type(<indiv>, T) :-
		class_type(<class>, _, T).

So here we have
	class_type(c, T0, T) :- class_type(b, c(T0), T).
	class_type(d, T0, T) :- class_type(b, d(T0), T).
	class_type(b, T0, T) :- class_type(a, b(T0), T).
	class_type(f, T0, T) :- class_type(e, f(T0), T).
	class_type(g, T0, T) :- class_type(e, g(T0), T).
	class_type(e, T0, T) :- class_type(a, e(T0), T).
	class_type(a, T,  T).

	class_type(C, T) :- class_type(C, _, T).

	indiv_type(cee, T) :- class_type(c, T).
	indiv_type(dee, T) :- class_type(d, T).
	indiv_type(eff, T) :- class_type(f, T).
	indiv_type(gee, T) :- class_type(g, T).

This too is the kind of thing that can be done rather neatly by
a preprocessor.  Now, imagine that we want to say that a particular
verb must have an animate subject.  We might say
	may_fill(subject, see, X) :-
		class_type(animate, T),
		indiv_type(X, T).	% X's type is compatible with T

where the class_type/2 call can be preprocessed away.
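
Here is a small runnable sketch of the same scheme driven by a table of
arcs, which is roughly what a preprocessor would unfold into the
clauses above.  The fact names ako/2 and isa/2 and the helper
compatible/2 are assumptions made for this example.

```prolog
% Hypothetical input: ako(Class, Parent) arcs and isa(Individual, Class).
ako(b, a).  ako(e, a).
ako(c, b).  ako(d, b).
ako(f, e).  ako(g, e).

isa(cee, c).  isa(dee, d).  isa(eff, f).  isa(gee, g).

% class_type(+Class, ?T0, ?T): T is the type term for Class, where T0
% fills the innermost argument.  The top class adds no wrapper.
class_type(Class, T0, T) :-
    (   ako(Class, Parent)
    ->  functor(T1, Class, 1),
        arg(1, T1, T0),
        class_type(Parent, T1, T)
    ;   T = T0
    ).

class_type(Class, T) :- class_type(Class, _, T).

indiv_type(Indiv, T) :-
    isa(Indiv, Class),
    class_type(Class, T).

% compatible(+Indiv, +Class): Indiv's type unifies with Class's type.
compatible(Indiv, Class) :-
    class_type(Class, T),
    indiv_type(Indiv, T).

% ?- indiv_type(cee, T).     T = b(c(_))
% ?- compatible(cee, b).     succeeds
% ?- compatible(cee, e).     fails
```

The subclass test is a single unification of two small terms; nothing
walks the hierarchy once the type terms have been built.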

Chris Mellish pointed out that this scheme generalises to
systems where multiple classifications apply to the same thing.
For example, something of type "agreement" might be classified
according to "person", "number", and "gender", so we might have
	agreement(1 | 2 | 3, s | p, m | f | n)
With that scheme, we can easily represent things like
	agreement(_,p,_)	"plural"
	agreement(3,_,f)	"third person feminine"
and combine them:
	agreement(3,p,f)	"third person plural feminine"

This can be much more economical, and is in my view much clearer,
than lists of unstructured atoms.
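
A tiny sketch of how the combination falls out of unification alone.
The predicate names are made up for the example; only the
agreement/3 term comes from the text above.

```prolog
% agreement(Person, Number, Gender)
plural(agreement(_, p, _)).
singular(agreement(_, s, _)).
third_person_feminine(agreement(3, _, f)).

% ?- plural(A), third_person_feminine(A).
% A = agreement(3, p, f)
%
% ?- plural(A), singular(A).
% fails: the Number slots p and s do not unify
```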

	MAKE UNIFICATION WORK FOR YOU!

As a particular example of doing things clearly with terms instead
of pounding away on lists, consider the complements of a verb phrase.

Goldman's program does

	vp(...) -->
		...
		{subcat(V, Subcat)},
		compl(..., Subcat, Gap).

	compl(compl(nil),Subcat,X/X) --> {member(iv,Subcat)},[].
	compl(compl(NP),Subcat,GapInfo) --> {member(tv,Subcat)},
		np(NP,_,GapInfo).
	compl(compl(NP1,NP2),Subcat,GapIn/GapOut) --> {member(bv,Subcat)},
		np(NP1,_,GapIn/Gap1),
		np(NP2,_,Gap1/GapOut).

where subcat/2 returns a subset of {iv,tv,bv} represented as a list.
But why use a list here?  Suppose instead that we represent the
verb subcategorisation as a triple
	v(i | 0, t | 0, b | 0)
where i, t, b mean that the verb _can_ be used as an intransitive,
transitive, or ditransitive-or-benefactive respectively, and 0 in
a particular slot means it can't.  Let's move this information to
the front as well:  it is always a good idea to have the argument
which we're dispatching on be the first so that a human reader
has the least possible trouble finding it.  Then we get

	compl(v(i,_,_), comp0, Gap, Gap) --> [].
	compl(v(_,t,_), comp1(NP), Gap0, Gap) -->
		np(NP, _, Gap0, Gap).
	compl(v(_,_,b), comp2(N1,N2), Gap0, Gap) -->
		np(N1, _, Gap0, Gap1),
		np(N2, _, Gap1, Gap).
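
To try the compl//4 rules out, something like the following toy
grammar will do.  The lexicon, the noun/1 facts, the trace clause and
the gap/nogap markers are all assumptions made for this sketch; they
are not Goldman's grammar.

```prolog
% Hypothetical lexicon: subcat(Verb, v(I,T,B)).
subcat(sleeps, v(i,0,0)).
subcat(sees,   v(0,t,0)).
subcat(eats,   v(i,t,0)).
subcat(gives,  v(0,t,b)).

noun(mary).  noun(john).  noun(fish).

% np(Tree, Agreement, Gap0, Gap): a bare noun, or a trace that
% consumes the pending gap.
np(np(N), _, Gap, Gap)   --> [N], { noun(N) }.
np(trace, _, gap, nogap) --> [].

vp(vp(V,Comp), Gap0, Gap) -->
    [V],
    { subcat(V, Subcat) },
    compl(Subcat, Comp, Gap0, Gap).

compl(v(i,_,_), comp0, Gap, Gap) --> [].
compl(v(_,t,_), comp1(NP), Gap0, Gap) -->
    np(NP, _, Gap0, Gap).
compl(v(_,_,b), comp2(N1,N2), Gap0, Gap) -->
    np(N1, _, Gap0, Gap1),
    np(N2, _, Gap1, Gap).

% ?- phrase(vp(T, nogap, nogap), [eats, fish]).
% T = vp(eats, comp1(np(fish)))
% ?- phrase(vp(T, nogap, nogap), [sleeps, fish]).
% fails: sleeps is v(i,0,0), so only comp0 is allowed
```

Note that there is no member/2 call anywhere: the subcategorisation
term selects the clause by head unification.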


There's a lot of left-over Lisp in the code.  For example, there's
a rule that starts out

	s(s(NP,VP),yes-no-q) --> ...

Now yes-no-q is a perfectly good Lisp atom (it's a spelling of
|YES-NO-Q|) but it is a compound term in Prolog -(-(yes,no),q).
Why does that matter?  Because a later rule tries to use it as
a function symbol!
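
A quick way to see the difference for yourself.  mood_term/3 is a
hypothetical helper written for this illustration; it does the same
job as the Sem =.. [Mood,SSem] goal quoted further down, but checks
first that Mood really is an atom.

```prolog
% mood_term(+Mood, +Sem, -Term): Term is Mood(Sem), provided that
% Mood is an atom and so can serve as a function symbol.
mood_term(Mood, Sem, Term) :-
    atom(Mood),
    Term =.. [Mood, Sem].

% ?- functor(yes-no-q, F, N).
% F = (-), N = 2              a compound term, not an atom
% ?- mood_term(yn_q, p, T).
% T = yn_q(p)
% ?- mood_term(yes-no-q, p, T).
% fails: yes-no-q is not an atom
```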

A rather worse hangover (and if it isn't a headache now, it soon
will be) from Lisp is the use of 'nil' as a "default" or "absent"
marker.  Here's a particularly important case.

	translate(String,Tree,Sem) :-
		parse(Tree,Mood,String),
		semantics(Tree,Mood,Sem).

	semantics(smaj(Tree),Mood,Sem) :- Mood \= wh_q,
		Sem =.. [Mood,SSem],
		sem_translate(Tree,SSem,nil).

	semantics(smaj(Tree),wh_q,wh_q(Whvar,SSem)) :-
		new_atom(Whvar),
		sem_translate(Tree,SSem,Whvar).

In both of the calls to sem_translate/3 we pass an atom as the
last argument.  An atom spelled "nil" means "there isn't any
Wh-variable".  An atom spelled "foo123" or the like means
"there is a Wh-variable called foo123".  That is NOT good Prolog
coding practice.

	What are the situations, and what are the associated data?
	- there is a Wh-variable X
	- there is no Wh-variable
	Invent names for these situations, and make the associated
	data the arguments of appropriate terms
	- var(X) means there is a Wh-variable X
	- novar  means there is no Wh-variable

Then later on we'll be able to ask "was there a Wh-variable" by
doing	Wh = var(_)  instead of by doing Wh \== nil.
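
In clause form the two situations might be handled like this.  It is
only a sketch; wh_variable/2 and wh_status/2 are hypothetical names.

```prolog
% wh_variable(+Wh, -X): succeeds only in the var(X) situation.
wh_variable(var(X), X).

% wh_status(+Wh, -Status): one clause per situation, selected by
% the principal functor of the first argument.
wh_status(var(_), present).
wh_status(novar,  absent).

% ?- wh_variable(var(foo123), X).    X = foo123
% ?- wh_variable(novar, X).          fails
% ?- wh_status(novar, S).            S = absent
```

No cuts and no inequality tests are needed, and a generated name can
never be mistaken for the "absent" marker.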

That's far from the only problem here.  The program does
"Mood \= wh_q" in order to test whether Mood is decl or yn_q
(assuming that yes-no-q should have been yn_q).  There is
no point in using (\=)/2 here; it would be better to use the
built-in predicate (\==)/2.  But it's better still to say
exactly what you do mean, so that a human reader can see what
the possible cases for Mood are.  (The use of (=..)/2 is a
fairly reliable cue that something rather strange is going on.
This is the bit that breaks if Mood is yes-no-q.)  The variable
names aren't too good either:  there isn't any String here; but
there _is_ a list of Words.

	translate(Words, Tree, Sem) :-
		parse(Tree, Mood, Words),
		semantics(Mood, Tree, Sem).

	semantics(decl, smaj(Tree), decl(Sem)) :-
		sem_translate(Tree, Sem, novar).
	semantics(yn_q, smaj(Tree), yn_q(Sem)) :-
		sem_translate(Tree, Sem, novar).
	semantics(wh_q, smaj(Tree), wh_q(WhVar,Sem)) :-
		gensym(WhVar),
		sem_translate(Tree, Sem, var(WhVar)).
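
On the (\=)/2 versus (\==)/2 remark: presumably parse/3 always binds
Mood, in which case the two tests give the same answer.  They differ
only when the argument is unbound, as this small sketch shows (the
predicate names are invented for the example).

```prolog
% "Mood cannot be made equal to wh_q"
not_unifiable_with_wh_q(Mood) :- Mood \= wh_q.

% "Mood is not already equal to wh_q"
not_identical_to_wh_q(Mood) :- Mood \== wh_q.

% ?- not_unifiable_with_wh_q(decl).    succeeds
% ?- not_identical_to_wh_q(decl).      succeeds
% ?- not_unifiable_with_wh_q(_).       fails: an unbound variable unifies with wh_q
% ?- not_identical_to_wh_q(_).         succeeds: an unbound variable is not wh_q
```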

And so it goes.

It would improve the program a _lot_ to have a comment which
says exactly what a Tree or a Sem can look like.

There's a lot more that could be said.

One thing that _does_ need to be said is that I was very pleased to
see this posting, and I've put a copy of it where my students can get
at it.  Never mind the flaws, at least it's _there_ and it's a place
to _start_.  Much the same can be said about the code in the Gazdar
& Mellish book; the code there isn't very good, but it's _there_ and
is a place to _start_, whereas Allen leaves you pretty much on your own.


-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.