Path: utzoo!utgpu!water!watmath!clyde!ima!compilers-sender
From: djones@megatest.uucp (Dave Jones)
Newsgroups: comp.compilers
Subject: old yacc bug. Fixed?
Keywords: yacc
Message-ID: <2737@ima.ima.isc.com>
Date: 4 Oct 88 22:00:51 GMT
Sender: compilers-sender@ima.ima.isc.com
Reply-To: djones@megatest.uucp (Dave Jones)
Organization: Megatest Corporation, San Jose, Ca
Lines: 70
Approved: compilers@ima.UUCP

I'm in the process of writing a compiler using yacc. A fellow worker
loaned me "Introduction to Compiler Construction with Unix" by Axel T.
Schreiner and H. George Friedman, Jr., Prentice Hall, 1985, so I've been
trying to read it.

The book alludes to a bug in yacc. The main thing I want to know is
whether the yacc in Sun Unix 4.2 release 3.4 has the bug, and if so,
what do I do about it?

They give some examples of using the "error" token, with the following
footnote, which I quote verbatim:

    This way of extending repetative constructs has a drawback due to
    a bug in yacc (as distributed with Bell version 7, Berkeley 4.2bsd
    [sic], and various derivatives): if in a state the default action
    is to reduce, and if the next terminal symbol cannot be shifted
    but _error_ could be (e.g., on a trailing _error_ rule), yacc's
    tables dictate that the reduction take place, event if the next
    terminal symbol cannot be shifted subsequently. In this case error
    recovery takes place "too late", and the parser can, in fact, go
    into a loop, mistakenly reduce rules several times, etc. The
    4.1bsd [sic] distribution actually contains a correction for this
    bug, based on ["Practical LR error recovery", Sigplan Notices,
    Aug. 79]. Essentially, in these cases all possible inputs must be
    enumerated, so that the error can be detected; this results in
    slightly larger parser tables. The correction in 4.1bsd [sic]
    contains a typographical error, however. A definite correction is
    available from the authors (S. Johnson, personal communication,
    1982)

Notice that it says the bug is fixed in 4.1 but not 4.2, except that the
fix is wrong! At least that's the interpretation I put on it.

Can anyone make heads or tails of this gobbledygook? (I love that "in
fact" part.)

Some questions:

1) How may one determine whether or not a given yacc has the bug?
2) What to do about it if you've got a buggy one?
3) Exactly what are the consequences -- does "go into a loop" mean
   "loop forever"?
4) What is the correction in 4.1? Is it in the source code?
5) What is the typographical error in the correction?
6) What is the "definite correction"? Do you have to have source?
7) How to obtain the correction from the authors?

My best guess is that one might expect to see entries of this form in
the yacc output:

state 3
        X : Y _ error    (6)
        Z : A _          (5)

        error  shift 4
        .  reduce 5

In this case the parser would reduce by rule 5 when shifting _error_ is
in order. But I can't find any such states in any output I can generate.
Besides, this would just cause some extra input to be thrown away. The
parser would not get into a loop or make extra reductions.

Please, HELP!
[From djones@megatest.uucp (Dave Jones)]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request
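
The "state 3" fragment guessed at in the post can be mimicked with a toy table lookup. This is only a sketch of one reading of the footnote, not real yacc tables or yacc source: the state and rule numbers are the ones from the post, while the token names `;` and `BAD`, the `lookup` helper, and the assumption that `;` is the only token allowed to follow `Z` are all made up for illustration.

```python
# Toy model of a single parser state; hypothetical, not yacc's real data layout.

ERROR = ("error",)  # marker: no legal action, so error recovery starts here


def lookup(state, token):
    """Return the explicit action for `token`, or the state's default action."""
    return state["actions"].get(token, state["default"])


# State 3 as in the post: `error` can be shifted, everything else
# falls through to the default action "reduce by rule 5".
state3_default_reduce = {
    "actions": {"error": ("shift", 4)},
    "default": ("reduce", 5),
}

# Same state with the lookaheads enumerated, which is how the footnote
# describes the correction. Assumption: ';' is the only token that may
# legally follow Z.
state3_enumerated = {
    "actions": {"error": ("shift", 4), ";": ("reduce", 5)},
    "default": ERROR,
}

if __name__ == "__main__":
    for name, state in [("default reduce", state3_default_reduce),
                        ("enumerated", state3_enumerated)]:
        print(name, "on BAD:", lookup(state, "BAD"))
```

With the default-reduce table, a token that can never be shifted still triggers the reduction by rule 5, so the error is only noticed later, in a state that may no longer have the `error shift 4` entry on the stack. With the enumerated table, the same token is rejected in state 3 itself, where `error` can still be shifted, at the cost of one extra table entry per legal lookahead.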