Path: utzoo!utgpu!watmath!watdragon!akwright From: akwright@watdragon.waterloo.edu (Andrew K. Wright) Newsgroups: comp.lang.misc Subject: Re: Language Design, or: is C's Grammar Context Free Keywords: BNF, Grammars, Silliness Message-ID: <12778@watdragon.waterloo.edu> Date: 28 Mar 89 15:17:48 GMT References: <5200040@m.cs.uiuc.edu> <12443@watdragon.waterloo.edu> <16557@mimsy.UUCP> <16559@mimsy.UUCP> Reply-To: akwright@watdragon.waterloo.edu (Andrew K. Wright) Organization: U. of Waterloo, Ontario Lines: 41 In article <16559@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: >Incidentally, now that I have said all that (some might say `too much' >:-) ), the only real difference between a C grammar that accepts >undeclared typedef names and a Pascal grammar that accepts undeclared >variable names is that, for some reason, people like to think of type >declarators as keywords and variables as identifiers. At least, this >is the only explanation I can see that we never argue whether Pascal's ^^^^^^^^^^^^^^^^^^^^^^^^^^ >grammar is context free. (If you require the parser to catch undeclared >variables, it is not, just as C's is not if you require the parser to >catch undeclared typedef names.) The difference I see is that a C grammar that accepts undeclared typedef names destroys more of the structure of the grammar than a Pascal grammar that accepts undeclared variable names. Consider the following C expression: x = (a)(b); If "a" is a typedef name, then the right side is a cast. If "a" is not a typedef name, then the right side is a function call. I dont believe similar situations exist for Pascal (corrections welcome). As G.V. Cormack notes, any language can be parsed by some context free grammar; we are interested in grammars which represent the semantic structure of the language as closely as possible. About "C's grammar [not] being screwed up": with a few simple kludges such as recognizing typedef names in the lex, it is fairly easy to build a grammar which has sensible structure for C. Or is it? Some points were raised about where PCC does/doesnt turn off typedef recognition, and this seems ill-defined. (Anybody know what the ANSI draft says?) If having parts of your language ill-defined does not bother you, consider trying to build a suffix or reverse parser for C. Such a parser might be used in a syntax directed editor, or for syntax error recovery. Such parsers cannot preserve the structure of a C grammar because they do not have available the information required to tell if an identifier is a typedef or not. Andrew K. Wright akwright@watmath.waterloo.edu CS Dept., University of Waterloo, Ont., Canada.