Path: utzoo!attcan!uunet!samsung!rex!rpg
From: rpg@rex.cs.tulane.edu (Robert Goldman)
Newsgroups: comp.lang.prolog
Subject: nlp code: request for comments
Message-ID: <4691@rex.cs.tulane.edu>
Date: 26 Oct 90 18:47:34 GMT
Reply-To: rpg@cs.tulane.edu
Organization: C.S. Dept, Tulane Univ., New Orleans, LA
Lines: 676

I'm teaching a course in NLP here at Tulane, using James Allen's
Natural Language Understanding as a text, and using Prolog as the
language of instruction.  I've written a small semantic processing
system along the lines of the one Allen outlines in his book.  That is:

1. semantic processing is done on a completed parse;
2. semantic information (selection restrictions and object typing)
   is represented in isa hierarchies;
3. the semantic representation is a quasi-logical form, centered around
   a representation of the head verb of the sentence, with a fixed
   number of verb cases filled by semantic representations of other
   sentence constituents;
4. I have used feature structures, along the lines of the ones in
   Gazdar and Mellish's Natural Language Processing in PROLOG, as a
   representation for the quasi-logical form.

I would like to give this semantic processor to my students to examine,
and I would appreciate it if any of you could comment on the coding and
let me know if I have committed any Prolog solecisms.

Some notes:

1. I have not tried to make it terribly efficient, preferring clarity
   over efficiency where possible.
2. I'm not terribly concerned about details of the syntactic
   representation that are not of immediate import.  E.g., I have
   bracketed the sentence in ways that have to do less with linguistic
   validity than with ease of translation into semantic representation,
   and played fast and loose with terms like nbar.
3. I have not felt free to include, in the following source listing,
   code for the two predicates pathval/4 and unify/2.
   They are predicates for unifying feature structures, and for finding
   values in feature structures, and I have used code from Gazdar &
   Mellish, which is probably protected by copyright.  The comment for
   pathval is

       pathval(Dag,Path,Value,Remainder)
       The pathvalue of Dag at Path is Value.  Remainder is Dag with
       the Path and Value removed.

4. The code is admittedly fragmentary.  I have just implemented a small
   fragment of Allen's suggested system.  E.g., there is no provision
   for embedded sentences or adjectives.

EDITORIAL COMMENT: Quite frankly, I have had a fairly difficult time
teaching this course using Allen's book and Prolog together.  I think I
had good reasons for choosing each of these, but I wouldn't recommend
the combination to anyone else.  Why?  Because the discussion in
Allen's book gives a very procedural view, and it's very hard for me,
let alone my students, to read his discussion, `declarativify' it, and
then develop Prolog code based on that.

Following is a shar archive of the system.  Thanks for your patience
with my long posting.  If there's interest, I will take the comments
people send me, incorporate them in the code, and make the results
available for anonymous ftp.

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh 'gensym.pl' <<'END_OF_FILE'
X
X/*
X   a utility predicate
X   new_atom(A)
X   A must be unbound.  Will be bound to a new name.
X */
Xnew_atom(A) :- counter(N),
X    number_chars(N, L),
X    append("foo",L,Name),
X    atom_chars(A,Name).
X
Xcounter(N) :-
X    count(N),!,
X    retract(count(N)),
X    N1 is N + 1,
X    assert(count(N1)).
Xcounter(0) :-
X    assert(count(1)).
END_OF_FILE
if test 303 -ne `wc -c <'gensym.pl'`; then
    echo shar: \"'gensym.pl'\" unpacked with wrong size!
fi
# end of 'gensym.pl'
fi
if test -f 'hierarchy.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'hierarchy.pl'\"
else
echo shar: Extracting \"'hierarchy.pl'\" \(2690 characters\)
sed "s/^X//" >'hierarchy.pl' <<'END_OF_FILE'
X:- op(500, xfx, isa).
X
Xfeatures(Thing,Feats) :-
X    local_feats(Thing,Feats),
X    inherited_feats(Thing,Feats).
X
Xlocal_feats(Thing,Feats) :- feats(Thing,Feats),!.
Xlocal_feats(_,_).
X
Xinherited_feats(Thing,Feats) :-
X    Thing isa OtherThing,!,
X    features(OtherThing, NewFeats),
X    unify(Feats, NewFeats).
Xinherited_feats(_,_).
X
Xfindcase(Verb,Indicator,Case) :-
X    case(Indicator,Verb,Case).
Xfindcase(Verb,Indicator,Case) :-
X    Verb isa OtherVerb,!,    %single inheritance
X    findcase(OtherVerb,Indicator,Case).
X
X/*
X   feats(Type,FeatStr)
X   The features in FeatStr are characteristic of Type
X */
X%features for things corresponding to nouns
Xfeats(physobj,[concrete:t|_]).
Xfeats(organic,[organic:t|_]).
Xfeats(inanimate,[organic:f|_]).
Xfeats(animate,[animate:t|_]).
Xfeats(vegetable,[animate:f|_]).
Xfeats(human,[human:t|_]).
Xfeats(dog,[human:f|_]).
Xfeats(cat,[human:f|_]).
X%features for verbs
Xfeats(actionverb,[agent:[animate:t|_],
X                  experiencer:nil,
X                  co_agent:[animate:t|_],
X                  at_time:[time:t|_],
X                  beneficiary:[human:t|_],
X                  at_loc:[animate:f|_]|_]).
Xfeats(objaction,[theme:[physobj:t|_],
X                 instrument:[animate:f|_]|_]).
Xfeats(put,[to_loc:[animate:f|_]|_]).
Xfeats(transfer,[to_poss:[animate:t|_]|_]).
Xfeats(interhuman,[agent:[human:t|_],
X                  theme:[human:t|_]|_]).
Xfeats(break,[agent:[human:t|_],
X             theme:[animate:f|_]|_]).
Xfeats(unactionverb,[agent:nil|_]).
X
X
Xcase(subject,actionverb,agent).
Xcase(subject,objaction,instrument).
Xcase(subject,objaction,theme).
Xcase(dobj,objaction,theme).
Xcase(dobj,interhuman,theme).
Xcase(iobj,objaction,beneficiary).
Xcase(iobj,transfer,to_poss).
Xcase(by,objaction,agent).
Xcase(at,actionverb,at_loc).
Xcase(on,actionverb,at_loc).
Xcase(in,actionverb,at_loc).
Xcase(along,actionverb,at_loc).
Xcase(from,transfer,from_poss).
Xcase(from,transfer,from_loc).
Xcase(to,transfer,to_poss).
Xcase(to,transfer,to_loc).
Xcase(for,actionverb,beneficiary).
Xcase(with,actionverb,instrument).
Xcase(with,actionverb,co_agent).
Xcase(subject,unactionverb,experiencer).
Xcase(dobj,unactionverb,theme).
X
X
X/*
X   isa hierarchy
X */
Xorganic isa physobj.
Xinanimate isa physobj.
X
Xanimate isa organic.
Xvegetable isa organic.
X
Xhuman isa animate.
Xdog isa animate.
Xcat isa animate.
X
Xtree isa vegetable.
Xplant isa vegetable.
X
Xrock isa inanimate.
Xcar isa inanimate.
Xbuilding isa inanimate.
Xhammer isa inanimate.
Xwindow isa inanimate.
X
Xhouse isa building.
Xapartment_complex isa building.
X
X%verb isa hierarchy
Xobjaction isa actionverb.
Xinstraction isa objaction.
Xinterhuman isa actionverb.
Xkiss isa interhuman.
Xlove isa interhuman.
Xput isa objaction.
Xtransfer isa objaction.
Xbreak isa objaction.
Xcry isa actionverb.
Xsee isa unactionverb.
END_OF_FILE
if test 2690 -ne `wc -c <'hierarchy.pl'`; then
    echo shar: \"'hierarchy.pl'\" unpacked with wrong size!
fi
# end of 'hierarchy.pl'
fi
if test -f 'sem.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'sem.pl'\"
else
echo shar: Extracting \"'sem.pl'\" \(3619 characters\)
sed "s/^X//" >'sem.pl' <<'END_OF_FILE'
X/*
X   SEM.PL
X   Rules for semantic processing
X */
Xtranslate(String,Tree,Sem) :-
X    parse(Tree,Mood,String),
X    semantics(Tree,Mood,Sem).
X
Xsemantics(smaj(Tree),Mood,Sem) :- Mood \= wh_q,
X    Sem =.. [Mood,SSem],
X    sem_translate(Tree,SSem,nil).
X
Xsemantics(smaj(Tree),wh_q,wh_q(Whvar,SSem)) :-
X    new_atom(Whvar),
X    sem_translate(Tree,SSem,Whvar).
X
Xsem_translate(s(NPTree,VPTree),Sem,Whvar) :-
X    vp_sem(VPTree,NPSlot, Sem, Whvar),
X    np_sem(NPTree,NP,Whvar),
X    unify(NP,NPSlot),
X    new_atom(EventName),
X    unify([token:EventName|_],Sem).
X
X
X/*
X   vp_sem(VPTree, NPSlot, Sem, Whvar)
X   The semantics of VPTree is Sem.  A substructure of Sem is
X   NPSlot.
X   This is the substructure with which the semantic
X   representation of the subject np should be unified.
X   Whvar is an input variable indicating the variable corresponding
X   to the wh-word in a wh_q.
X */
Xvp_sem(VPTree, NPSlot, Sem, Whvar) :-
X    head_verb(VPTree, Verb),
X    verb_sem(Verb,VerbSense,Sem),
X    findcase(VerbSense,subject,Case),
X    pathval(Sem,Case,NPSlot,_),
X    complement_semantics(VPTree,VerbSense,Sem,Whvar).
X
X/*
X   if there's a verb complement, it must have some
X   semantics.  If not, no problem.
X */
Xcomplement_semantics(VPTree,VerbSense,Sem,Whvar) :-
X    verb_complement(VPTree,Comp),!,
X    comp_sem(Comp,VerbSense,Sem,Whvar).
X
Xhead_verb(vp(aux(_),v(Verb),compl(_)),Verb) :- !.
X
Xverb_complement(vp(aux(_),v(_),Comp),Comp) :- !.
X
X
X/*
X   comp_sem(Comp,VerbSense,Sem,Whvar)
X */
Xcomp_sem(compl(nil),_,_,_).
Xcomp_sem(compl(Tree1,Tree2),VerbSense,Sem,Whvar) :-
X    np_sem(Tree1,Iobj,Whvar),
X    findcase(VerbSense,iobj,Case1),
X    unify([Case1:Iobj|_],Sem),
X    np_sem(Tree2,Dobj,Whvar),
X    findcase(VerbSense,dobj,Case2),
X    unify([Case2:Dobj|_],Sem).
Xcomp_sem(compl(Tree),VerbSense,Sem,Whvar) :-
X    np_sem(Tree,Dobj,Whvar),
X    findcase(VerbSense,dobj,Case),
X    unify([Case:Dobj|_],Sem).
X
X
X
X
X/*
X   np_sem(NPTree,NPSem,Wh_var)
X */
Xnp_sem(np(what),[token:Wh_var|_],Wh_var).
Xnp_sem(np(who),[token:Wh_var|_],Wh_var).
Xnp_sem(NPTree,Sem,Wh_var) :-
X    np_sem1(NPTree,Sem,Wh_var),
X    new_atom(A),
X    unify([token:A|_],Sem).
X
Xnp_sem1(np(Nbar,Mods), NbarSem, _) :-
X    np_mod_sem(Mods, ModsSem),
X    nbar_sem(Nbar, NbarSem),    % nbar_sem is only defined with two arguments
X    unify([mods:ModsSem|_],NbarSem).
Xnp_sem1(np(Nbar), NbarSem, _) :-
X    nbar_sem(Nbar, NbarSem).
X
Xnbar_sem(nbar(prop_n(Name)),Sem) :- prop_noun_sem(Name,Sem).
Xnbar_sem(nbar(art(Art),n(Noun)), Sem) :-
X    art_sem(Art,ArtSem),
X    noun_sem(Noun,Sem),
X    unify(Sem,ArtSem).
Xnbar_sem(nbar(pn(PN)),Sem) :-
X    pronoun_sem(PN,Sem).
X
Xnp_mod_sem((Mod,Mods),Sem) :-
X    np_mod_sem(Mod,Sem1),
X    np_mod_sem(Mods,Sem),
X    unify(Sem1,Sem).
Xnp_mod_sem(pp(p(P),np(NPTree)),Sem) :-
X    pp_noun_mod(P, NPSlot, Sem),
X    np_sem(NPTree,NP,nil),
X    unify(NP,NPSlot).
X
Xprop_noun_sem(Name,Sem) :-
X    name_sem(Name,Sem),
X    features(human,PSem),
X    unify(Sem,PSem).
X
Xpronoun_sem(PN,Sem) :-
X    PN \= it,
X    noun_sem(person,PSem),
X    pnsem(PN,Sem),
X    unify(Sem,PSem).
Xpronoun_sem(it,Sem) :-
X    pnsem(it, Sem),
X    noun_sem(inanimate,NSem),
X    unify(Sem,NSem).
X
X/*
X   pp_noun_mod(Preposition, NPSlot, Structure)
X   The prep builds the semantic Structure;
X   a substructure of this semantic Structure is
X   NPSlot.  This is the place in the structure which should be
X   unified with the semantics of the np which is the argument
X   of the pp.
X */
Xpp_noun_mod(at, NPSlot, [at_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
Xpp_noun_mod(on, NPSlot, [on_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
Xpp_noun_mod(in, NPSlot, [in_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
X
Xnoun_sem(Noun,Sem) :- noun_sense(Noun,Sense),
X    features(Sense, Sem).
X
Xverb_sem(Verb,Sense,Sem) :- verb_sense(Verb,Sense),
X    features(Sense, Sem).
X
Xart_sem(the,[ref:def|_]).
Xart_sem(a,[ref:indef|_]).
X
X
X
X
X
X
END_OF_FILE
if test 3619 -ne `wc -c <'sem.pl'`; then
    echo shar: \"'sem.pl'\" unpacked with wrong size!
fi
# end of 'sem.pl'
fi
if test -f 'grammar.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'grammar.pl'\"
else
echo shar: Extracting \"'grammar.pl'\" \(2707 characters\)
sed "s/^X//" >'grammar.pl' <<'END_OF_FILE'
X/*
X   GRAMMAR.PL
X   The grammar we will use with semantic processing.
X*/
X
Xparse(Tree,Mood,String) :- smaj(Tree,Mood,String,[]).
X
X/*
X   smaj has been added to the grammar for the sake of
X   allowing it to later be expanded to include
X   clauses (embedded sentences), like the ones in our
X   verb complementation homework.
X */
Xsmaj(smaj(Tree),Mood) --> s(Tree,Mood).
X
X/*
X   s(ParseTree,Mood)
X */
Xs(s(NP,VP),decl) --> np(NP,NPnum,[]/[]),
X    vp(VP,VPnum,[]/[],[]/[]),
X    {num_agree(NPnum,VPnum,_)}.
X
Xs(s(NP,VP),yes-no-q) -->
X    aux(Aux,AuxNum,[]/[]),
X    np(NP,NPnum,[]/[]),
X    vp(VP,AuxNum,[Aux,AuxNum]/[],[]/[]),
X    {num_agree(NPnum,AuxNum,_)}.
X
X%A wh-question querying the subject is normal in
X%structure
Xs(s(NP,VP),wh_q) -->
X    wh(GapInfo),
X    np(NP,NPnum,[GapInfo]/[]),
X    vp(VP,VPNum,[]/[],[]/[]),
X    {num_agree(NPnum,VPNum,_)}.
X%querying anything else requires aux-inversion.
Xs(s(NP,VP),wh_q) -->
X    wh(GapInfo),
X    aux(Aux,AuxNum,[]/[]),
X    np(NP,NPnum,[]/[]),
X    vp(VP,AuxNum,[Aux,AuxNum]/[],[GapInfo]/[]),
X    {num_agree(NPnum,AuxNum,_)}.
X
X/*
X   vp(Tree,Number,AuxGap,GapInfo)
X   The vp rules have GapInfo as an argument because they have to
X   pass the gaps on to the np's that may be nested in them.
X */
Xvp(vp(aux(Aux),v(V),Compl),Num,AuxGap,GapInfo) --> aux(Aux,Num,AuxGap),
X    verb(V, Num, Form),
X    {aux_agree(Aux,Form),
X     subcat(V,Subcat)},
X    compl(Compl,Subcat,GapInfo).
X
X
X/*
X   compl(Tree,Subcat,GapInfo)
X   verb complement with Tree representation, admitted by a verb with
X   subcategorization alternatives SubCat.
X */
Xcompl(compl(nil),Subcat,X/X) --> {member(iv,Subcat)},[].
Xcompl(compl(NP),Subcat,GapInfo) --> {member(tv,Subcat)},
X    np(NP,_,GapInfo).
Xcompl(compl(NP1,NP2),Subcat,GapIn/GapOut) --> {member(bv,Subcat)},
X    np(NP1,_,GapIn/Gap1),
X    np(NP2,_,Gap1/GapOut).
X
X/*
X   np(ParseTree,Number,GapInfo)
X */
Xnp(np(Nbar),Num,GapIn/GapOut) -->
X    nbar(Nbar,Num,GapIn/GapOut).
Xnp(np(Nbar,Mods),Num,GapIn/GapOut) -->
X    nbar(Nbar,Num,GapIn/Gap1),
X    npmods(Mods,Gap1/GapOut).
Xnp(np(Gap),_,[np(Gap)|Gaps]/Gaps) --> [].
X
X/*
X   npmods(TreeFragment,Gap)
X   for the moment, we allow only prepositional
X   phrases as modifiers
X */
Xnpmods(NPmod,GapIn/GapOut) --> pp(NPmod,GapIn/GapOut).
Xnpmods((NPmod,NPmods),GapIn/GapOut) --> pp(NPmod,GapIn/Gap1),
X    npmods(NPmods,Gap1/GapOut).
X
X
X/*
X   nbar(Tree,Num,GapInfo)
X */
Xnbar(nbar(art(Art),n(Noun)),Num,Gap/Gap) --> art(Art,ArtNum),
X    noun(Noun,NounNum),
X    {num_agree(ArtNum,NounNum,Num)}.
Xnbar(nbar(prop_n(Name)),[s3],Gap/Gap) --> proper_noun(Name).
Xnbar(nbar(pn(PN)),Num,Gap/Gap) --> pronoun(PN,Num).
X
X/*
X   pp(Tree, GapInfo)
X */
Xpp(pp(p(Prep),np(NP)),GapIn/GapOut) --> prep(Prep),np(NP,_,GapIn/GapOut).
X
X/*
X   aux(Aux,Num,AuxGapInfo)
X */
Xaux(Aux,Num,[Aux,Num]/[]) --> [].
Xaux(Aux,Num,AuxGap/AuxGap) --> aux(Aux,Num).
END_OF_FILE
if test 2707 -ne `wc -c <'grammar.pl'`; then
    echo shar: \"'grammar.pl'\" unpacked with wrong size!
fi
# end of 'grammar.pl'
fi
if test -f 'load.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'load.pl'\"
else
echo shar: Extracting \"'load.pl'\" \(251 characters\)
sed "s/^X//" >'load.pl' <<'END_OF_FILE'
X/*
X   The load file for our parser with semantic analysis
X */
X:-consult(library(not)).
X:-consult(library(sets)).
X:-consult(library(basics)).
X
X:-compile([dag_unify,gensym]).
X
X:- consult([grammar,lexicon,sem,hierarchy]).    %fs_utils
X
X:- dynamic count/1.
END_OF_FILE
if test 251 -ne `wc -c <'load.pl'`; then
    echo shar: \"'load.pl'\" unpacked with wrong size!
fi
# end of 'load.pl'
fi
if test -f 'lexicon.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'lexicon.pl'\"
else
echo shar: Extracting \"'lexicon.pl'\" \(2521 characters\)
sed "s/^X//" >'lexicon.pl' <<'END_OF_FILE'
X/*
X   LEXICON.PL
X */
X
X
X
X/*
X   noun(Lexeme,Num)
X */
Xnoun(dog,[p3]) --> [dogs].
Xnoun(dog,[s3]) --> [dog].
Xnoun(hammer,[s3]) --> [hammer].
Xnoun(window,[s3]) --> [window].
X
X
X/*
X   art(Article,Num)
X */
Xart(the,[s3,p3]) --> [the].
Xart(a,[s3]) --> [a].
X
X
X/*
X   verb(Root,Number-and-person,Form)
X   Form = 1 tenseless/present
X          2 -s form (3s)
X          3 past
X          4 present participle
X          5 past participle
X */
Xverb(cry,[s1,p1,s2,p2,p3],1) --> [cry].
Xverb(cry,[s3],2) --> [cries].
Xverb(cry,_,4) --> [crying].
Xverb(see,[s1,p1,s2,p2,p3],1) --> [see].
Xverb(see,[s3],2) --> [sees].
Xverb(see,_,3) --> [saw].
Xverb(see,_,4) --> [seeing].
Xverb(see,_,5) --> [seen].
Xverb(love,[s3],2) --> [loves].
Xverb(love,[s1,p1,s2,p2,p3],1) --> [love].
Xverb(love,_,4) --> [loving].
Xverb(break,[s1,p1,s2,p2,p3],1) --> [break].
Xverb(break,[s3],2) --> [breaks].
Xverb(break,_,3) --> [broke].
Xverb(break,_,5) --> [broken].
X
Xproper_noun(john) --> [john].
Xproper_noun(jack) --> [jack].
Xproper_noun(mary) --> [mary].
X
X/*
X   pronoun(Pronoun,NumberandPerson)
X */
Xpronoun(i,[s1]) --> [i].
Xpronoun(we,[p1]) --> [we].
Xpronoun(you,[s2,p2]) --> [you].
Xpronoun(he,[s3]) --> [he].
Xpronoun(she,[s3]) --> [she].
Xpronoun(it,[s3]) --> [it].
Xpronoun(they,[p3]) --> [they].
X
Xprep(at) --> [at].
Xprep(on) --> [on].
Xprep(in) --> [in].
Xprep(along) --> [along].
Xprep(from) --> [from].
Xprep(to) --> [to].
Xprep(for) --> [for].
Xprep(by) --> [by].
Xprep(with) --> [with].
X
X
X/*
X   aux(AuxVerb,Number)
X */
Xaux(do,[s1,p1,s2,p2,p3]) --> [do].
Xaux(do,[s3]) --> [does].
Xaux(be,[s1]) --> [am].
Xaux(be,[s3]) --> [is].
Xaux(be,[p1,s2,p2,p3]) --> [are].
Xaux(nil,_) --> [].
X
X/*
X   wh(GapIntroduced)
X */
Xwh(np(who)) --> [who].
Xwh(np(what)) --> [what].
X
X/*
X   num_agree(N1,N2,N3)
X   N3 is the non-empty result of making N1 and N2 agree wrt
X   number.
X */
Xnum_agree(N1,N2,N3) :- intersection(N1,N2,N3),
X    ( N3 = [] -> fail
X    | true
X    ).
X
X%do can be followed by the tenseless form
Xaux_agree(do,1).
X% be can be followed by the present participle
Xaux_agree(be,4).
X%participles not permitted without auxverb
Xaux_agree(nil,1).
Xaux_agree(nil,2).
Xaux_agree(nil,3).
X
Xsubcat(cry,[iv]).
Xsubcat(love,[tv]).
Xsubcat(see,[tv]).
Xsubcat(break,[iv,tv]).
X
X
X/*
X   Word sense entries
X */
Xnoun_sense(dog,dog).
Xnoun_sense(hammer,hammer).
Xnoun_sense(window,window).
Xnoun_sense(rock,rock).
X
X
Xverb_sense(break,break).
Xverb_sense(cry,cry).
Xverb_sense(love,love).
Xverb_sense(kiss,kiss).
Xverb_sense(see,see).
X
Xname_sem(john, [name:john,gender:male|_]).
Xname_sem(jack, [name:jack,gender:male|_]).
Xname_sem(mary, [name:mary,gender:female|_]).
END_OF_FILE
if test 2521 -ne `wc -c <'lexicon.pl'`; then
    echo shar: \"'lexicon.pl'\" unpacked with wrong size!
fi
# end of 'lexicon.pl'
fi
if test ! -d '' ; then
    echo shar: Creating directory \"''\"
    mkdir ''
fi
echo shar: End of shell archive.
exit 0
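
The archive leaves out pathval/4 and unify/2, so it cannot be loaded as
posted.  What follows is a minimal sketch of stand-ins, written only from
the pathval comment in the notes and from the open-ended
[Feature:Value|_] lists used throughout the archive.  It is not the
Gazdar & Mellish code.  It assumes that Path is always a single feature
name (which is how sem.pl calls it), that feature values are either atoms
or nested feature structures, and that the file would be saved as
dag_unify.pl, the name load.pl compiles.  It has not been run against the
rest of the archive.

```prolog
% Stand-in sketch for the omitted dag_unify file (NOT the Gazdar & Mellish code).
% Assumption: a feature structure is an atom or an open-ended list
% [Feature:Value, ... | _], and a path is a single feature name.

% unify(?FS1, ?FS2)
% FS1 and FS2 are made to denote the same feature structure.
unify(A, B) :- var(A), !, A = B.
unify(A, B) :- var(B), !, A = B.
unify(A, B) :- atomic(A), !, A == B.      % an atom unifies only with itself
unify([F:V|RestA], B) :-
    pathval(B, F, V, RestB),              % find F in B, or add it at B's open tail
    unify(RestA, RestB).                  % then deal with the remaining features

% pathval(?Dag, +Feature, ?Value, -Remainder)
% The value of Dag at Feature is Value.  Remainder is Dag with that
% Feature:Value pair removed.
pathval(Dag, F, V, Rest) :-
    var(Dag), !,                          % reached the open tail: extend Dag
    Dag = [F:V|Rest].
pathval([F0:V0|Rest], F, V, Rest) :-
    F0 == F, !,                           % feature already present
    unify(V0, V).
pathval([Pair|Dag], F, V, [Pair|Rest]) :-
    pathval(Dag, F, V, Rest).             % skip a pair with a different feature
```

With these definitions, the query

    ?- A = [agent:[animate:t|_]|_],
       B = [agent:[human:t|_], theme:[animate:f|_]|_],
       unify(A, B), pathval(A, theme, T, _).

should succeed with T bound to [animate:f|_], while
unify([animate:t|_], [animate:f|_]) should fail because the two atoms
clash.  Like other open-list unifiers of this kind, the sketch makes no
attempt to handle two structures that already share an unbound tail.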