Path: utzoo!attcan!uunet!lll-winken!ames!elroy!orion.cf.uci.edu!uci-ics!siam.ics.uci.edu!schmidt
From: schmidt@siam.ics.uci.edu (Doug Schmidt)
Newsgroups: comp.lang.c++
Subject: Re: Zortech distribution methods (was: Versions of Zortech Compiler)
Message-ID: <3748@paris.ics.uci.edu>
Date: 12 Jan 89 02:31:11 GMT
References: <6578@pogo.GPID.TEK.COM> <6590084@hplsla.HP.COM>
Sender: news@paris.ics.uci.edu
Reply-To: Doug Schmidt
Organization: University of California, Irvine - Dept of ICS
Lines: 83

In article <6590084@hplsla.HP.COM> jima@hplsla.HP.COM (Jim Adcock) writes:
>> Excuse me,
>> But what is the difference between "c" and "c++"???
>> Why would it be more difficult to write a compiler for C++?
>>
>> Sorry if someone asked this question before and I missed it.
>
>I think a lot of people have been saying that a c++ compiler
>is a lot more difficult beast than a c compiler, but I'm not
>sure that's true.
(Let me know when yours is finished, so I can test it! ;-) ;-) )
>For example, Tiemann is building g++ on gcc, and I believe the
>vast majority of the code is still in common, though Tiemann &|
>Stallman would be the right people to answer that.
>

After spending lots of time generating bug reports for GNU g++, I'd
say that there are several areas where writing a fully-functional C++
compiler is *substantially* more difficult than writing a ``regular''
C compiler.  Here are two examples that stand out in my mind:

1. The C++ grammar is inherently ambiguous (read the g++.texinfo
   description for detail about this).  This means that traditional
   UNIX tools, like YACC (or in g++'s case, BISON, using
   bison.simple), are going to have problems with certain portions of
   C++ syntax.  In order to properly parse the language, you are
   likely to need a recursive-descent parser with lookahead
   capabilities, and some heuristics.  This approach is more tedious,
   error prone, and non-extensible.  Even cfront doesn't always get it
   right.
   For example, try the following valid C program with cfront 1.2.1:

----------------------------------------
main ( ) {
  int A[10][10];
  int (*B)[10] = A;
}
----------------------------------------

   I get:

----------------------------------------
CC  test.c:
"test.c", line 3: error:  B is undefined
1 error
----------------------------------------

   I suspect that cfront is parsing this as an indirect function call,
   with a return value cast as an int, through what it believes is a
   pointer to a function, i.e., B.

   Certain legal constructs in the C++ language are simply not
   included in GNU g++, for this reason (e.g., old-style C function
   definitions).

   If you want more information on this, read S.B. Lippman and
   B.E. Moo's excellent description in ``C++: From Research to
   Practice,'' from the 1988 USENIX C++ Workshop Proceedings.  The
   following is a particularly apt quote from that article:

      Maintaining the old-style C syntax is likely to be the design
      choice for which Stroustrup's name will be most taken in vain
      by compiler writers to come! (page 128)

2. Certain areas of the C++ language definition are vaguely defined.
   As any software engineer will tell you, it is difficult, if not
   impossible, to write a ``correct'' implementation from an
   incomplete or inconsistent specification.  As any regular reader
   of this newsgroup will attest, there are numerous questions posted
   here which do not appear to have simple and direct answers from
   immediately accessible language reference guides (for example,
   I've not seen a reply to the recent posting regarding inheritance
   of base constructors in derived classes).

   This is not really a criticism of C++, since it is still an
   experimental language and the designers are being careful not to
   prematurely overcommit to constructs that may return to haunt
   them.  However, it is certainly a factor that greatly increases
   the complexity of writing a compiler for the language.

Doug
--
schmidt@ics.uci.edu (ARPA) |   Per me si va nella citta' dolente.
office: (714) 856-4043     |   Per me si va nell'eterno dolore.
                           |   Per me si va tra la perduta gente.
                           |   Lasciate ogni speranza o voi ch'entrate.
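
A minimal sketch of the kind of ambiguity point 1 describes, assuming
a present-day C++ compiler (it has not been tried on cfront 1.2.1 or
on the g++ of this posting).  The names read_through_row_pointer, T, f
and declaration_vs_call are hypothetical, chosen only for
illustration.  It first shows the declaration from the test program
behaving as a pointer to an array of 10 ints, and then the same
``name (x)'' token shape meaning a declaration in one statement and a
function call in the next, which is what a parser has to sort out
from context rather than from the tokens alone.

```cpp
#include <cassert>

// The declaration from the test program: B is a pointer to an array of
// 10 ints, initialised from A (which decays to a pointer to its first row).
// Returns the value stored in A[2][3], read back through B.
int read_through_row_pointer() {
    int A[10][10] = {};
    A[2][3] = 42;
    int (*B)[10] = A;      // a declaration, not a call through a function pointer
    assert(B[2] == A[2]);  // B[2] and A[2] name the same row
    return B[2][3];
}

// Hypothetical names: T is a type, f is a function.
typedef int T;
int f(int v) { return v + 1; }

// Two statements that both begin "name ( x )".  Which one is a declaration
// depends on whether the leading name denotes a type, something the token
// stream alone does not reveal.
int declaration_vs_call() {
    T (x) = 5;          // T names a type: declares an int called x
    int y = f (x);      // f names a function: an ordinary call expression
    return x * 10 + y;  // 5 * 10 + 6
}
```

Called from any driver, read_through_row_pointer() yields 42 and
declaration_vs_call() yields 56.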