Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!jarthur!uci-ics!rfg
From: rfg@ics.uci.edu (Ronald Guilmette)
Newsgroups: comp.lang.c++
Subject: Re: zortech problem with lex
Message-ID: <25ECE752.29589@paris.ics.uci.edu>
Date: 1 Mar 90 09:11:46 GMT
References: <6300008@ux1.cso.uiuc.edu> <24800002@sunb6> <25EC5CBF.26673@paris.ics.uci.edu>
Reply-To: rfg@ics.uci.edu (Ronald Guilmette)
Organization: UC Irvine Department of ICS
Lines: 69

In article <25EC5CBF.26673@paris.ics.uci.edu> schmidt@zola.ics.uci.edu (Doug Schmidt) writes:
>There are a number of confusions in your post.  Let me see if I can
>shed some light on this.
>
>In article <24800002@sunb6>, voss@sunb6 writes:
>>I saw that comment about C++ not being LALR(1) in the g++ distribution,
>>and at the time believed it.  However, my biggest thrill at OOPSLA '89
>>was sitting at the same table as Bjarne Stroustrup for a few hours drinking
>>free beer.  He said that he IS USING A LALR(1) GRAMMER for C++ 2.0.
>>He appeared sober & honest, therefore I assume that C++ is LALR(1), but
>>non-trivial.
>
>Your statement confuses the difference between an LALR(1) language
>(which C++ is *not*) and a particular set of tools used to recognize a
>language.

Doug,

I afraid that I have to point out that *your* statement confuses the
difference between an LALR(1) language and the patently false assertions
that some people have made vis a vis the supposed non-LALR(1)'ness of C++.

As was pointed out to me a couple of years ago by a fellow who really should
know something about the subject (i.e. Tom Pennello) virtually any language
can be considered to be trivially LALR(1) in the sense that the grammar can
always be reduced to V* which will accept all legal inputs in the language.
Rejection of illegal inputs with such a grammar would necessarily be strictly
a matter for semantic analysis.

Now it is quite obvious that the use of V* as a grammar to guide an LALR(1)
parser would not allow the resulting parser to detect many sorts of errors
which we would normally think of as being "syntatic" errors, (like say
none) but the division between syntatic and semantic errors is an artificial
one and such a grammar, when mashed into pasring tables and combined with a
general LALR(1) parser driver, would in fact successfully (if trivially) parse
C++, thus clearly indicating that C++ *is* LALR(1).

The fact of the matter is C++ is only asserted to be be non-LALR(1) by
persons who want to see more constructs in the language be disambiguated
via easy-to-implement LALR(1) syntax rules rather than via much harder to
implement semantic rules.

There is nothing wrong with such a goal.  In fact I would say that this is
a very important goal.  Anybody who has ever had to build a recognizer for
C or C++ or FORTRAN or Ada will certainly tell you that it is one hell
of a lot easier to disambiguate things via context free syntax rules than
it is to do the same job via context sensitive semantic rules.  In particular,
typedefs have kept many C compiler writers awake at nights thinking about
the torturous things they do to feed information backwards from semantic
anaysis so that it can be subsequently considered by their (more convenient
and more regular) LALR(1) syntax analyzers.

So the goal is good.  What is *not* good is muddling the issue with meaningless
catch phrases like "C++ is not LALR(1)".  In order to move the discussion onto
a higher plain,  let's try to be accurate and just say that it would be nice
if an LALR(1) grammar for C++ could allow for much more disambiguation via
easy-to-implement LALR(1) syntax rules.

>What Bjarne probably meant (judging from the cfront 2.0
>sources) is that he was using the LALR(1) parser generated by yacc to
>recognize C++.

Free advice:  Second guessing Bjarne's intended meaning based on sketchy
second-hand accounts is probably inadvisable.


// Ron Guilmette (rfg@ics.uci.edu)
// C++ Entomologist
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.