Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!usc!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!ira.uka.de!rusmv1!rusux1!mark From: mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) Newsgroups: comp.lang.prolog Subject: Re: Efficiency of DCGs and chart parsers Message-ID: Date: 26 Jan 91 17:44:57 GMT References: <8014@castle.ed.ac.uk> <1549@gufalet.let.rug.nl> Sender: zrf80385@rusux1.rus.uni-stuttgart.de Organization: IMS, University of Stuttgart Lines: 49 In-reply-to: gosse@gufal02.let.rug.nl's message of 25 Jan 91 16:09:02 GMT Gosse Bouma askes: > How do you pass the chart around? as a list of chart items? I can hardly > believe that going through a list with, say, a few hundred elements is > more efficient in Prolog than just exploiting the data-base. No, it shouldn't just be an unstructured list, of course. In general you want to index inactive edges on their left-most string-positions (vertex) and active edges on their right-most string-positions. If you want to get n^3 behaviour (with CFGs) then you must index on both left and right string-positions (if I remember Earley's algorithm correctly). Prolog textbooks discuss a variety of data structures for implementing such indexing. (Macho types not fazed by circular structures caused by e.g. ``empty'' edges dominating the null string can avoid some of this indexing by representing a vertex as a set of edges, perhaps indexed by type and category, active edges as pairs consisting of a dotted rule and the vertex at the left string-position, and inactive edges consisting of pairs of a category and the vertex at the right string-position. In both cases, an edge's "other" string-position is that of the vertex of which it is member. But be warned - I've had one Prolog which supposedly could handle circular structures go into an infinite loop when checking if vertices V1 and V2 satisfy V1==V2. As I've said elsewhere, Prolog needs either efficient arrays or circular structures and preferably both, but that's another story!) > Also, the copying, which is avoided if you assert items, seems rather > expensive. Of course, copying (renaming of variables) is one of the things that happens during an assertion and a following call. I believe it is usually faster to copy a term than it is to assert it into the database, especially if the language has a copy_term/2 primitive (as does SICStus). > Also, given the fact that in unification grammars, the size of the chart items > tends to be rather large, using a list of items as additional argument > seems practically impossible to me. The chart itself would not be stored as part of a category labelling an edge, so I'm not sure exactly what you have in mind here. > Gosse Bouma, Alfa-informatica, RUG, Groningen gosse@let.rug.nl Mark Johnson