Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site dg_rtp.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!mcnc!rti-sel!dg_rtp!meissner
From: meissner@dg_rtp.UUCP (Michael Meissner)
Newsgroups: net.lang.c
Subject: Re: Error recovery (long)
Message-ID: <387@dg_rtp.UUCP>
Date: Fri, 6-Jun-86 00:49:16 EDT
Article-I.D.: dg_rtp.387
Posted: Fri Jun 6 00:49:16 1986
Date-Received: Sat, 7-Jun-86 08:24:34 EDT
References: <312@uw-nsr.UUCP>
Reply-To: meissner@dg_rtp.UUCP (Michael Meissner)
Distribution: net
Organization: Data General (Languages @ Westborough, MA. => RTP, NC.)
Lines: 105
Summary: Data General C answers

In article <312@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook) writes:
>
>Regarding error recovery in C compilers, I like the error recovery
>provided by the Data General C compiler.  Here is an example of a
>botched program:
>
>    main() {
>        int a = 0                                  /* missing ";" */
>
>        printf("a: %s\n", (a == 1) ? "1" : "?";    /* missing ")" */
>    }
>
>When compiled the following is written on stderr:
>
>    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
>        printf("a: %s\n", (a == 1) ? "1" : "?";
>        ^
>    Syntax Error.
>    A symbol of type ";" has been inserted before this symbol.
>
>
>    Error 502 severity 2 beginning on line 4 (Line 4 of file main.c)
>        printf("a: %s\n", (a == 1) ? "1" : "?";
>                                              ^
>    Syntax Error.
>    A symbol of type ")" has been inserted before this symbol.
>
>In this example the compiler produced a program that executed correctly.
>
>To be fair, both errors are "errors of omission."  I believe, but do not
>assert, that these errors are easier to repair than other types of errors.
>In the event of serious errors the compiler will cease code generation and
>only check the remaining input.  I don't know the parsing method used in
>this compiler; it does not seem to suffer from poor error recovery as do
>many recursive-descent parsers.
It's a pleasant surprise when somebody says he likes something.  I am the
author of the Data General C compilers.  The parsing method that I use is a
standard LALR parse, based on an internal tool that constructs the tables
from a BNF input grammar.

In comparison to YACC, the tool is not as developer friendly: it only
creates the tables, and I have to write the routine that actually interprets
the parse state machine and dispatches on the semantic actions.  The error
recovery routines must also be provided.  YACC, on the other hand,
encapsulates the parser into the C program it generates.  It also handles
error recovery (badly, in my opinion), so that in general the user doesn't
have to mess with it.  It also means that the user does not really have the
control either.

The algorithm that I use is the first part of Jerry Fisher's (from the
SIGPLAN compiler construction conference).  It first attempts to insert,
delete, or replace the token that is in error, using any of the tokens that
are in the follow set (i.e., tokens that would be possible, legal input),
and then parses ahead 3 tokens.  The tokens are given a priority and tried
in priority order; the first repair that parses successfully for 3 tokens is
selected.  The second part of Jerry Fisher's algorithm is a complicated
secondary recovery.  I initially attempted it, and gave up because adapting
his algorithm to my parser kept turning up errors in my translation, or
areas where I did not really understand what was going on deep within the
LALR tables.

As near as I can understand from looking at it, the YACC approach is to
discard tokens until it can reduce from an 'error' production.  It's been my
experience that this rarely does what the compiler writer wants.

As far as local replacement goes, I am currently thinking of adding another
pass that would attempt to glue two tokens together (to make += out of + and
= separated by whitespace).
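To make the first-phase repair described above concrete, here is a minimal
sketch in C.  It is not the Data General code.  The toy recognizer
first_error() stands in for the LALR table driver, the one-character token
alphabet and grammar are invented for illustration, and the candidate order
in candidates[] (along with trying insertion before replacement before
deletion) is an assumption; the post only says that tokens are tried in
priority order and that the repair must parse ahead 3 tokens.

```c
#include <string.h>

#define LOOKAHEAD 3   /* tokens that must parse after a trial repair */

/* Toy recognizer standing in for the LALR driver (an assumption; the real
   compiler walks parse tables).  One character per token.  Grammar:
       program := { expr ';' }     expr := term { '+' term }
       term    := 'n' | '(' expr ')'
   Returns the index of the first unacceptable token, or len if none. */
static int first_error(const char *t, int len)
{
    int depth = 0, want_operand = 1, i;

    for (i = 0; i < len; i++) {
        char c = t[i];
        if (want_operand) {
            if (c == 'n') want_operand = 0;
            else if (c == '(') depth++;
            else return i;
        } else {
            if (c == '+') want_operand = 1;
            else if (c == ')' && depth > 0) depth--;
            else if (c == ';' && depth == 0) want_operand = 1;
            else return i;
        }
    }
    return len;
}

enum edit { INSERT, REPLACE, DELETE };

/* Apply one trial edit at position p, writing the result to out (which
   must hold strlen(t) + 2 bytes).  Returns 1 if the edited stream parses
   at least LOOKAHEAD tokens from p, or to the end of the input. */
static int try_edit(const char *t, int p, enum edit kind, char c, char *out)
{
    int n = p, need;

    memcpy(out, t, (size_t)p);
    if (kind == INSERT)  { out[n++] = c; out[n++] = t[p]; }
    if (kind == REPLACE) { out[n++] = c; }
    /* DELETE: the offending token t[p] is simply not copied */
    strcpy(out + n, t + p + 1);
    n = (int)strlen(out);
    need = (p + LOOKAHEAD < n) ? p + LOOKAHEAD : n;
    return first_error(out, n) >= need;
}

/* Hypothetical priority order for candidate tokens. */
static const char candidates[] = ";)+n(";

/* Repair the first error in t.  Returns 0 if t has no error, 1 if a
   single-token repair was found (left in out), -1 if none worked. */
static int repair(const char *t, char *out)
{
    int len = (int)strlen(t), p = first_error(t, len);
    const char *c;

    if (p == len) { strcpy(out, t); return 0; }
    for (c = candidates; *c != '\0'; c++) {
        if (try_edit(t, p, INSERT, *c, out))  return 1;
        if (try_edit(t, p, REPLACE, *c, out)) return 1;
    }
    if (try_edit(t, p, DELETE, 0, out)) return 1;
    strcpy(out, t);
    return -1;
}
```

With this toy grammar, repair("nn+n;", out) inserts the missing ';' and
leaves "n;n+n;" in out, and repair("(n+n;", out) inserts the missing ')'
to give "(n+n);", which mirrors the two errors of omission in the quoted
example.  Because only 3 tokens are checked, a repair can be accepted and
still leave a later error behind, which is where the choice of priorities
matters.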
The priorities are the hardest thing to get a feeling for, and I still play
with them every so often.  As far as secondary recovery goes, my feeling
still is that if you ever need to go to more extreme methods, the program is
hopelessly damaged, and I question whether the programmer gets anything
useful after the first few error messages.

>While on the subject of compilers, I would like to share two other features
>of this compiler that I find useful.  I have not found these features in
>other C compilers that I have used, although I have heard that the VAX/VMS
>C compiler is very good.
>
>The first feature is the ability to generate a stack trace ("traceback")
>in the event of a serious error.  There are two compiler switches that
>control the amount of information in a traceback.  The "-Clineid" switch
>causes the offending line number to be included while the "-Cprocid" switch
>causes the procedure name to be included.

There have been a few responses saying dbx/adb give you the information, if
you compile with -g and look at the core file.  The traceback feature (which
is standard on almost all 32-bit DG compilers) produces smallish tables,
which can be kept in the program file, even when it is shipped to users in
production mode.  We also support -g and dbx.

>The second feature is the ability to declare certain data structures as
>"read only."  This is done via a compiler switch "-R" and applies to all
>data structures that are initialized to a constant value within the
>compilation unit.

This came from Berkeley 4.2 (and 4.3) and was added in an attempt to be as
compatible with both 4.2 and System V.2 as we could.  At some point in the
future, when the ANSI X3J11 draft stabilizes to the point of going for
public review, the `const' feature will also allow this without having to
set the option.  The private Data General keyword $shared allows this in the
released revisions.
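As a small illustration of the `const' route mentioned above, here is a
sketch of initialized data marked read-only in the source rather than by a
compiler switch.  The table and function names are made up for the example,
and whether a given compiler actually places such data in read-only storage
is up to the implementation; `const' only permits it.

```c
/* Initialized tables that never change after startup.  With `const' a
   compiler is permitted (not required) to put them in read-only storage. */
static const int days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

static const char *const month_name[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* month is 1..12; returns 0 for out-of-range input (hypothetical helper,
   ignores leap years) */
static int month_length(int month)
{
    if (month < 1 || month > 12)
        return 0;
    return days_in_month[month - 1];
}
```

An assignment such as days_in_month[1] = 29 is then rejected at compile
time, independent of any read-only switch.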
>John Sambrook                          Work: (206) 545-2018
>University of Washington WD-12         Home: (206) 487-0180
>Seattle, Washington 98195              UUCP: uw-beaver!uw-nsr!john

	Michael Meissner
	Data General Corporation
	...{ decvax, ihnp4, ucbvax, ... }!mcnc!rti-sel!dg_rtp!meissner