Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!samsung!olivea!mintaka!spdcc!iecc!compilers-sender From: grass@ulysses.att.com (Judith Grass) Newsgroups: comp.compilers Subject: Re: Generating LR parsers from EBNF? Keywords: LR(0), parse, bibliography, EBNF Message-ID: <91-05-069@iecc.cambridge.ma.us> Date: 9 May 91 14:11:41 GMT References: <29305.9105072033@pessoa.ecs.soton.ac.uk> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: grass@ulysses.att.com (Judith Grass) Organization: AT&T Bell Labs Lines: 134 Approved: compilers@iecc.cambridge.ma.us This is a longish response containing a bibtex list of references... >> I have looked in several books and a couple of conference proceedings but >> I have failed to find any references on generating LR parse tables >> directly from an EBNF grammar. What I would like to know is: Research on this goes back to the mid-70's, but you'll find it mainly in Canadian and European sources. It seems right now that there isn't a lot being published about parser generators and parser theory, as supposedly there isn't anything interesting to say anymore. In general, a lot of theoretical work has been done on the direct generation of elalr parsers from regular-right part grammars (EBNF), but there are very few finished, working examples of them. Two I know of: my own "Ryacc" parser generator (a prototype) and Compiler Resource, Inc.'s Yacc++ (a product). >> 1. Are there any references? Tons. Try these: @article {LA77, author = "W. R. {LaLonde}", title = "Regular Right part Grammars and Their Parsers", journal = "CACM", volume = 20, number = 10, pages = "731--741", month = oct, year = 1977} @article {LA79, author = "W. R. {LaLonde}", title = "Constructing {LR} Parsers for Regular Right part Grammars", journal = AI, volume = 11, pages = "177--193", year = 1979} @article {PB, author = "P. W. {Purdom, Jr.} and C. A. Brown", title = "Parsing Extended {LR}(k) Grammars", journal = AI, volume = 15, pages = "115--127", year = 1981} @article {CH, author = "N. P. Chapman", title = "{LALR}(1,1) Parser Generation for Regular Right Part Grammars", journal = AI, volume = 21, pages = "29--45", year = 1984} @article {HEI, author = "S. Heilbrunner", title = "On the Definition of {ELR}(k) and {ELL}(k) Grammars", journal = AI, volume = 11, pages = "169--176", year = 1979} @article {HEC, author = "R. Heckmann", title = "An Efficient {ELL}(1) Parser Generator", journal = AI, volume = 23, pages = "127--148", year = 1986} @article {NS86, author = "I. Nakata and M. Sassa", title = "Generation of Efficient {LALR} Parsers for Regular Right Part Grammars", journal = AI, volume = 23, pages = "149--162", year = 1986} @article {NS87, author = "I. Nakata and M. Sassa", title = "A Simple Realization of {LR}-Parsers for Regular Right Part Grammars", journal = IPL, volume = 24, pages = "113--120", year = 1987} @inproceedings {GrKiSe, author = "J. E. Grass, Chandra Kintala and Ravi Sethi", title = "Object-oriented Redesign using {C++}", booktitle = "Usenix {C++} Conference Proceedings", pages = "75--86", address= "San Francisco, CA", month = "April", year = 1990} @unpublished {GR90.pub, author = "J. E. Grass", title = "A User's Guide to {Ryacc} (version 0.0): A Parser Generator for Regular Right Part Grammars", institution = BLTM, note = "Available from author", year = 1990} >> 2. Is the EBNF approach equivalent to the CFG one? Is >> the difference in the tables always a due to small set >> of `optimizations'? Since you can always rewrite an EBNF grammar as a BNF grammar, the languages described are the same set. Not all EBNF grammars will directly yield an ELALR parser (or ELR or ELL) because of some kinds of non-determinism possible in the description. I am not sure that this is the answer to the question you are asking, but... >> 3. Which is faster? The EBNF item sets are smaller but >> more complex to handle. Using the EBNF approach seems >> to reduce the need for `optimizing' the generated >> table. From my experience, for the same language, the ELALR parser tends to be about 1/3 faster, simply because there are fewer states in the parser and fewer "artificial" productions. The tables also are much smaller. The tables still can be packed and optimized, but the techniques for doing so have to be a bit different. >> 4. I only generated the LR(0) automaton by hand. I have >> not thought about `higher' grammar categories like >> LALR(1), LR(2), regular-LR etc. Are these kinds >> easily generated using the EBNF method? ELALR(1), ELR(1), LL(1) are all known techniques. My guess is that doing longer lookaheads is no problem, although I am not aware of anyone that has actually tried it. The techniques should be about the same as is done for LR(2) and such. -- Judy Grass, AT&T Bell Labs, Murray Hill grass@ulysses.att.com -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.