Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!husc6!purdue!haven!mimsy!chris From: chris@mimsy.UUCP (Chris Torek) Newsgroups: comp.lang.misc Subject: Re: Language Design Message-ID: <16674@mimsy.UUCP> Date: 2 Apr 89 17:01:01 GMT References: <5200040@m.cs.uiuc.edu> <12443@watdragon.waterloo.edu> <1163@frog.UUCP> Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742 Lines: 37 In article <1163@frog.UUCP> john@frog.UUCP (John Woods) writes: >I think the crux of the argument of the non-context-free crowd is that >the grammar in K&R does NOT actually specify the C language by itself, not >as a grammar. It relies on the trick of telling the lexer about typedefs, >which is a form of back-door context. For practical parsing (using, e.g., yacc), yes; for CF-ness, no. The *real* crux of the argument is that while you can write a decent, simple, no-feedback-through-lexer CFG that accepts all C programs, you can *not* write a decent, simple, no-feedback LALR(1) parser to do this, because the simple CFG is ambiguous. Those who say `C does not have a CFG' are really saying `C does not have a trivially yacc-able grammar', which is true (but, as I keep saying, really means only that we need a better parser generator than yacc). (And no, it does not take a nondeterministic automaton to parse C. The little back-door trick would work wonderfully well if only you could know, each time a yacc parser examines a token, which rule yacc was attempting. When it tries `typedef-name: identifier', the lexer should answer `identifier' if and only if the identifier happens to be a typedef. Then make sure that yacc tries this rule before trying `primary: identifier' [this part is easy] and voila! you have a parser. Of course, yacc actually only calls the parser once for both rules; you would need a way to change the token on the fly. A better gimmick---less expensive in compute power, at least---would be to associate a function (predicate) with each rule, and allow the predicate to abort the rule. You then attach an `identifier-really-is- typedef' predicate to the `typedef: identifier' rule, making the parser pass over this resolution and choose instead the `primary: identifier' rule. Indeed, the same trick would allow one to reject undeclared identifiers, making `program foo; begin bar := 1 end.' a syntax error instead of a semantic one---not that this is a good thing to do!) -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris