Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!yale!cmcl2!lanl!lambda!jlg From: jlg@lambda.UUCP (Jim Giles) Newsgroups: comp.lang.misc Subject: Re: Cheap implementations of languages (Re: Pointers and poor implementations (was: Re: JLG's flogging ...)) Message-ID: <14317@lambda.UUCP> Date: 10 Apr 90 00:22:57 GMT References: Lines: 105 From article , by peter@ficc.uu.net (Peter da Silva): > [...] Like, the fact that C can run at all in such a minimal > implementation has proven to be one of it's strengths. How many PD Fortran > compilers are there? I've heard this old saw for years now. But, strangely, not often from a compiler writer (and, on such rare occasions, said compiler can't supply any specifics). The fact is that C is not simpler to implement than the other procedural languages in general use. In some ways it is harder to implement. Since you pick Fortran as the point of comparison, I will look at both. I) The Compiler Frontend a) Lexical scanning and parsing. Fortran requires extra work in the lexical scanner to distinguish keywords in some contexts. This is because blanks are not significant and are optional. The Fortran lexer must be able to tell Do10I=5.10 from Do 10 I=5,10. The method for doing this is simple and widely known, but still, you'd have to say that C was easier to scan and parse. Note that the two languages are equally easy to parse from all other points of view. Both are LR(1) and both have LALR or SLR grammers which can be pressed into service (usually by arbitrary elimination of shift-reduce conflicts by hand). b) Symbol table manipulation. In Fortran, there are only two types of symbol: local and global. The local symbols are divided into the usual intrinsic data types, and the global symbols are either external modules or are common blocks. The symbol table in Fortran compiling is built from scratch for each module compiled and is a simple object. C (like Pascal) has a nested scoping mechanism. Each object declares _some_ local symbols and inherits any defined in an outer scope that do not conflict with the new local names. This requires C compilers to maintain a somewhat more complicated symbol table that Fortran requires. Once again, the method for doing this is simple and widely known, but still, you'd have to say that Fortran was easier to implement in this regard. This is pretty much all the difference (in difficulty) to implement the compiler frontend for the two languages. The two languages are identical feature-for-feature to the rest of the compiler. That is, features which both languages have are equally hard to implement (and might be done identically in the case of shared backend compiler developements). So, the remaining differences have to do only with the differences in features. II) Feature differences. a) Pointer call by reference vs. Fortran parameter passing. C allows parameter passing by reference through pointers. The usual Fortran implementation is to pass all args by reference (although all by value/result is also allowed by the standard). In C, references are explicitly pointers and are allowed to be aliased. In Fortran, the references are implicit and aliasing is NOT permitted. This means that C procedures contain a lot of spurious dependencies in the intermediate code that Fortran intermediate code doesn't have. This makes C much harder to optimize as well. Now, since this discussion is about simplicity of implementation, there may be NO optimization performed and so, no difference in implementation difficulty. But, even a simple compiler may want to do SOME optimization - so this gives a slight not toward Fortran as simpler. b) Structs, Unions, dynamic memory, pointers, recursive procedures .... C has em, Fortran doesn't. Fortran is easier to implement on all of these counts. Slight reflection will suffice to find even more things is category IIb. In fact, from an unbiased examination, Fortran - not C- looks to be the simpler to implement. Actually, Fortran is among the simplest to implement of all procedural languages. This is not to say that Fortran is better off without IIb features or nested scoping. In fact Fortran _should_ have structs, unions, dynamic memory, and recursive procedures (to name a few). But the lack of these features _does_ make Fortran easier to implement. Note: you actually picked the worst example for your argument. If you had picked Pascal or Modula as your comparison, I would have had to conclude that the implementations were about the same in term of difficulty. But the fact is, I've just been through this same argument over the net with someone who claimed Fortran 90 was too big a language to ever be considered. I pointed out that the new standard will have a _subset_ of the features in C. They are somewhat better designed and more consistently expressed - but there are no new Fortran features which C doesn't already have a version of. (In fact, Fortran 90 makes better use of many features: for example, both C and Fortran 90 have function prototypes and the attendant implementation difficulties. But the Fortran 90 standard makes use of these to provide generic functions which C doesn't have. Both language have the same implementation overhead but Fortran gets more out of it.) J. Giles