Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!jarthur!uci-ics!rfg From: rfg@ics.uci.edu (Ronald Guilmette) Newsgroups: comp.lang.c++ Subject: Re: zortech problem with lex Message-ID: <25ECE752.29589@paris.ics.uci.edu> Date: 1 Mar 90 09:11:46 GMT References: <6300008@ux1.cso.uiuc.edu> <24800002@sunb6> <25EC5CBF.26673@paris.ics.uci.edu> Reply-To: rfg@ics.uci.edu (Ronald Guilmette) Organization: UC Irvine Department of ICS Lines: 69 In article <25EC5CBF.26673@paris.ics.uci.edu> schmidt@zola.ics.uci.edu (Doug Schmidt) writes: >There are a number of confusions in your post. Let me see if I can >shed some light on this. > >In article <24800002@sunb6>, voss@sunb6 writes: >>I saw that comment about C++ not being LALR(1) in the g++ distribution, >>and at the time believed it. However, my biggest thrill at OOPSLA '89 >>was sitting at the same table as Bjarne Stroustrup for a few hours drinking >>free beer. He said that he IS USING A LALR(1) GRAMMER for C++ 2.0. >>He appeared sober & honest, therefore I assume that C++ is LALR(1), but >>non-trivial. > >Your statement confuses the difference between an LALR(1) language >(which C++ is *not*) and a particular set of tools used to recognize a >language. Doug, I afraid that I have to point out that *your* statement confuses the difference between an LALR(1) language and the patently false assertions that some people have made vis a vis the supposed non-LALR(1)'ness of C++. As was pointed out to me a couple of years ago by a fellow who really should know something about the subject (i.e. Tom Pennello) virtually any language can be considered to be trivially LALR(1) in the sense that the grammar can always be reduced to V* which will accept all legal inputs in the language. Rejection of illegal inputs with such a grammar would necessarily be strictly a matter for semantic analysis. Now it is quite obvious that the use of V* as a grammar to guide an LALR(1) parser would not allow the resulting parser to detect many sorts of errors which we would normally think of as being "syntatic" errors, (like say none) but the division between syntatic and semantic errors is an artificial one and such a grammar, when mashed into pasring tables and combined with a general LALR(1) parser driver, would in fact successfully (if trivially) parse C++, thus clearly indicating that C++ *is* LALR(1). The fact of the matter is C++ is only asserted to be be non-LALR(1) by persons who want to see more constructs in the language be disambiguated via easy-to-implement LALR(1) syntax rules rather than via much harder to implement semantic rules. There is nothing wrong with such a goal. In fact I would say that this is a very important goal. Anybody who has ever had to build a recognizer for C or C++ or FORTRAN or Ada will certainly tell you that it is one hell of a lot easier to disambiguate things via context free syntax rules than it is to do the same job via context sensitive semantic rules. In particular, typedefs have kept many C compiler writers awake at nights thinking about the torturous things they do to feed information backwards from semantic anaysis so that it can be subsequently considered by their (more convenient and more regular) LALR(1) syntax analyzers. So the goal is good. What is *not* good is muddling the issue with meaningless catch phrases like "C++ is not LALR(1)". In order to move the discussion onto a higher plain, let's try to be accurate and just say that it would be nice if an LALR(1) grammar for C++ could allow for much more disambiguation via easy-to-implement LALR(1) syntax rules. >What Bjarne probably meant (judging from the cfront 2.0 >sources) is that he was using the LALR(1) parser generated by yacc to >recognize C++. Free advice: Second guessing Bjarne's intended meaning based on sketchy second-hand accounts is probably inadvisable. // Ron Guilmette (rfg@ics.uci.edu) // C++ Entomologist // Motto: If it sticks, force it. If it breaks, it needed replacing anyway.