Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!mintaka!spdcc!ima!esegue!compilers-sender From: firth@sei.cmu.edu (Robert Firth) Newsgroups: comp.compilers Subject: Re: Can Pascal be parsed by LR(1) parsing algorithm? Keywords: pascal, parse Message-ID: <9112@fy.sei.cmu.edu> Date: 18 Oct 90 19:02:44 GMT References: <9010091533.AA02386@apple.com> <1990Oct10.133752.14930@ncsuvx.ncsu.edu> <9094@fy.sei.cmu.edu> Sender: compilers-sender@esegue.segue.boston.ma.us Reply-To: firth@sei.cmu.edu (Robert Firth) Organization: Software Engineering Institute, Pittsburgh, PA Lines: 71 Approved: compilers@esegue.segue.boston.ma.us In article <9094@fy.sei.cmu.edu> firth@sei.cmu.edu I wrote: [ The Pascal form 'if c then else s' ] >But that is not legal Pascal. The relevant syntax (Wirth, section 9.2.2.1) >reads > ::= IF THEN | > IF THEN ELSE > >and there is no production from that yields . I was wrong. The above is indeed what Wirth's original paper says, and what I used when implementing Pascal. But in the later book by Jensen and Wirth, an extra production is added ::= ... | by means of which one can indeed produce from . Since this book is by the original designer and of later date, we should take it as superseding the older paper. Many thanks to Bob Corbett for setting me straight on this point. So, given that we do have empty statements, how nasty are they? There is a simple mechanical test: list all the tokens that can start a statement, then all the tokens that can follow a statement, and if the sets overlap we are screwed, since any token in the intersection might EITHER start a statement OR start a different nonterminal after an empty statement. The tokens that can start a statement are GOTO BEGIN IF CASE WHILE REPEAT FOR WITH and in addition, note that a label or case label can start with The tokens that can follow a statement are semicolon END ELSE UNTIL The intersection is null, so we can always identify an empty statement by a one-token lookahead. On another topic: while scrabbling through the listings I came across a genuine and forgotten nasty. Consider this fragment CASE fruit OF apple: pear This can continue as, for instance apple: pear: x := 1; or as apple: pear := 1; In other words, you cannot tell whether 'pear' starts another case label or the statement labelled. That may discombobulate a parser generator, especially since one or more comments may appear between the identifier and the next token. Again, hope that helps, and my apologies for regurgitating obsolete information. [Norm Diamond , Charles E Eaker , and Jon Mauney also noted that Pascal does indeed allow null statements. -John] -- Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.