Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!ucla-cs!zen!ucbvax!decvax!ima!johnl
From: johnl@ima.UUCP
Newsgroups: comp.compilers
Subject: Re: recursive-descent error recovery
Message-ID: <662@ima.ISC.COM>
Date: Mon, 17-Aug-87 03:17:24 EDT
Article-I.D.: ima.662
Posted: Mon Aug 17 03:17:24 1987
Date-Received: Tue, 18-Aug-87 05:24:10 EDT
References: <634@ima.ISC.COM> <642@ima.ISC.COM> <651@ima.ISC.COM>, <655@ima.ISC.COM>
Sender: johnl@ima.ISC.COM
Reply-To: decvax!utzoo!henry
Lines: 43
Approved: compilers@ima.UUCP

> This seems like a pain. Recursive descent with error recovery performed
> by the higher level entity would seem to be simpler, namely because the
> higher level entity knows more about what's going on...

On the contrary, doing it at the higher level is a horrendous pain, and
the smooth simplicity of the low-level recovery (which has detailed
guidance from the higher level, remember) is an enormous win. You have
to experience the difference to fully appreciate it -- I've written
parsers both ways.

The problems with doing it at the higher level boil down to (a) it adds
a lot of complexity to the code, and (b) error repair often has to cross
the boundaries of syntactic structures, which is painful in recursive
descent because those are function boundaries in the parser. By
contrast, the low-level approach requires 100 lines or so of code to
handle *all* syntactic error repair for the entire compiler, and it's
all in one place rather than interspersed throughout the parser.

> ... Okay... Now I've told the user it screwed
> up, let's recover from this sucker. The simplest thing to do, since
> I'm in an expression, is to toss tokens until I get to a synchronizing
> token like a semi...

Just where does the code that does this reside? Remember that the code
for parsing an expression is big and complicated, and may be spread over
several functions.
(In a straightforward recursive-descent parser, it will be a dozen or
more functions.) They all have to cooperate very carefully to make even
such a crude algorithm work. This takes a lot more effort and code than
you would think.

Also, you've missed an (admittedly non-obvious) point in my
contribution. The error repair need not be at anywhere near as gross a
level as throwing away everything until the next semicolon. That is
necessary as a "backstop" algorithm, but the more local resync heuristic
can be something like "if the input token is punctuation and the
requested one is not, throw away the input, otherwise keep it". This
repairs many minor goofs promptly and *correctly*.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
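
[Illustration] The following is a minimal sketch, in Python, of the
low-level repair approach the post describes. Nothing in it comes from
the original article: the toy grammar (statement := id "=" num ";") and
the names Token, TokenSource, expect, skip_past and parse_program are
all hypothetical. The parser never handles errors itself; it asks
expect() for the token it wants and always gets one back. The local
heuristic is the one quoted in the post. Retrying once after discarding
stray punctuation, returning a placeholder token, and the exact trigger
for the semicolon backstop are assumptions of this sketch, not Spencer's
actual code.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "id", "num", "punct", or "eof"
    text: str


def tokenize(src):
    """Split src into identifier, number, and single-character punctuation tokens."""
    out = []
    for text in re.findall(r"[A-Za-z_]\w*|\d+|\S", src):
        if text[0].isalpha() or text[0] == "_":
            kind = "id"
        elif text[0].isdigit():
            kind = "num"
        else:
            kind = "punct"
        out.append(Token(kind, text))
    out.append(Token("eof", ""))
    return out


class TokenSource:
    """Hypothetical scanner/parser interface; all syntactic repair lives here."""

    def __init__(self, src):
        self.tokens = tokenize(src)
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.pos]

    def matches(self, want):
        # A request is either a token kind ("id", "num") or a punctuation literal.
        tok = self.peek()
        if want in ("id", "num"):
            return tok.kind == want
        return tok.kind == "punct" and tok.text == want

    def expect(self, want):
        """Always return a token of the requested sort, repairing the input if needed."""
        tok = self.peek()
        if self.matches(want):
            self.pos += 1
            return tok
        self.errors.append(f"expected {want!r}, found {(tok.text or 'end of input')!r}")
        wanted_is_punct = want not in ("id", "num")
        if tok.kind == "punct" and not wanted_is_punct:
            # Local heuristic from the post: input is punctuation and the
            # requested token is not, so throw the input away.
            self.pos += 1
            if self.matches(want):  # assumption of this sketch: retry once
                tok = self.peek()
                self.pos += 1
                return tok
        # Otherwise keep the input token and pretend the requested one was there.
        if wanted_is_punct:
            return Token("punct", want)
        return Token(want, "<missing>")

    def skip_past(self, text):
        """Backstop: discard tokens up to and including the next `text`."""
        while self.peek().kind != "eof":
            tok = self.peek()
            self.pos += 1
            if tok.kind == "punct" and tok.text == text:
                break


def parse_statement(ts):
    # The parsing code contains no error handling beyond the backstop call.
    name = ts.expect("id")
    ts.expect("=")
    value = ts.expect("num")
    errors_before = len(ts.errors)
    ts.expect(";")
    semicolon_was_fabricated = len(ts.errors) > errors_before
    # Assumed backstop trigger: the ";" had to be invented and what follows
    # cannot start a statement, so resynchronize at the next real ";".
    if semicolon_was_fabricated and ts.peek().kind not in ("id", "eof"):
        ts.skip_past(";")
    return (name.text, value.text)


def parse_program(src):
    ts = TokenSource(src)
    statements = []
    while ts.peek().kind != "eof":
        statements.append(parse_statement(ts))
    return statements, ts.errors
```

With this sketch, parse_program("x = , 3 ;") discards the stray comma
and still yields the statement ("x", "3") with a single error recorded,
and parse_program("x = 1 y = 2 ;") supplies the missing semicolon
without losing the second statement. Only input such as
"x = 1 + 2 ; y = 3 ;", which this toy grammar cannot repair locally,
falls through to the skip-to-semicolon backstop.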