Path: utzoo!attcan!uunet!world!esegue!compilers-sender From: KARSTEN@tfl.dk (Karsten Nyblad, TFL, Denmark) Newsgroups: comp.compilers Subject: Re: Can Pascal be parsed by LR(1) parsing algorithm? Keywords: pascal, parse Message-ID: <49.27136340@tfl.dk> Date: 10 Oct 90 16:06:55 GMT References: <9010091533.AA02386@apple.com> Sender: compilers-sender@esegue.segue.boston.ma.us Reply-To: "Karsten Nyblad, TFL, Denmark" Organization: Compilers Central Lines: 65 Approved: compilers@esegue.segue.boston.ma.us In article <9010091533.AA02386@apple.com>, amb@apple.com (A. Michael Burbidge) writes: > After struggling for some time to write a yacc description for the > Pascal language and after reading the description of the modifier yacc > contained in the UCB Pascal source directory I am beginning to wonder > if an LR(1) parsing algorithm can parse Pascal. ... I once wrote an LALR(1) parser for a superset of Pascal called Pascal+. I used the ISO Pascal standard except from for the extensions. Yes you can write an LR(1) parser, but you parser should be capable of handling disambiguating rules or you will have problems with the ELSE keyword. To make Pascal true LR(1) you need to make a little transformation of the grammar. Let's make an example: This is a little grammar like the four statements assignment, while loop and ifthen and ifthenelse. Stmt -> Assigment | While_header Stmt | IfThen_header Stmt ELSE Stmt | IfThen_header Stmt . This grammar is ambiguous because the parser will not know how to parse IfThen_header Ifthen_header Assignment ELSE Assignment Now we transform the grammar in the following way: We divide Stmt in two: 1) all Stmt ending on IfThen_header Stmt and 2) all Stmt not ending on IfThen_header: Stmt -> NoDangling | Dangling . NoDangling -> Assignment | While_header NoDangling | IfThen_header NoDangling ELSE NoDangling . Dangling -> While_header Dangling | IfThen_header NoDangling ELSE Dangling | IfThen_header Stmt . This grammar is LALR(1) and thus LR(1). Then about making semicolon optional. I also tried deleting all semicolons from my grammar. You do not need semicolons except in one case: the case statement, e.g.: CASE expr OF 2 : a:= a ; - 3 : END ^ Try deleting the semicolon. Then the parser will not know that the - is the start of a label until it has seen the colon. You can let your scanner divide - and + into prefix and infix operators by doing some lookahead in the scanner, but take care of writeln(5-3:3); Karsten Nyblad TFL, A Danish Telecommunication Research Laboratory E-mail: karsten@tfl.dk -- Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.