Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!world!esegue!compilers-sender
From: md89mch@cc.brunel.ac.uk (Martin Howe)
Newsgroups: comp.compilers
Subject: Multi-compilers -- The ``Ideal'' Programming language ??
Summary: Never mind implementation, specification's difficult enough !
Keywords: design
Message-ID: <1990Sep25.025517.25446@esegue.segue.boston.ma.us>
Date: 25 Sep 90 02:55:17 GMT
References: <9009110403.AA03158@csd4.csd.uwm.edu>
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: md89mch@cc.brunel.ac.uk (Martin Howe)
Organization: Dept. of Elec. Eng. & Electronics, Brunel Univ., Uxbridge, UK
Lines: 216
Approved: compilers@esegue.segue.boston.ma.us

In the middle of my periodical musings about what should or should not go into an ``ideal'' programming language, and how people tend to resist such ideas on grounds of implementation difficulty (among others), I came across article <9009110403.AA03158@csd4.csd.uwm.edu> in which Mark William Hopkins writes:

> Recently, an interesting idea has come to mind for a new kind of compiler:
>a Multi-Compiler. What makes it different from your typical compiler is that
>it accepts code from more than one source language: many source languages in
>fact.

In fact reading the Byte 15 Year Anniversary issue, it seems Jensen & Partners have come up with just that - the TopSpeed system. I tell you,

    FROM stdio IMPORT printf;

looked pretty odd at first sight.

> What would it look like ? The whole issue seems to revolve around
> this concept (which I borrow from linguistics) of 'code-switching'.

In fact, TopSpeed isn't, as far as I can make out, a true ``multicompiler''; JPI seem to do it around libraries. One uses one language as a top-level shell and calls library routines from whichever languages have been installed with your system.

However, I have felt for some time that multicompilers, when they arrive, will not solve the problem very much more than mixed-language compilation and linkable object modules do now. The __real__ problem, in my estimation, is that of deciding exactly *what* should go in an all-embracing language. As Mark Hopkins says:

> Different languages are designed to do different things better.

I would go further: different programming paradigms do different things better. This is obvious; but the solution, while equally obvious, doesn't seem to have been tried [except Trilogy ?] and multicompilers only sidestep it.

There are at the moment four well-known programming paradigms: imperative, functional, logic and object-oriented. There may be others, but these are the main four at the moment. People often ignore the fact that real-world problems often require more than one language type to solve them, and for this reason I have suggested in the past, and will continue to suggest, that a ``multi-language'' which covers all four is, rather than an ``ideal impossibility'' or ``too difficult to implement'' or a ``bloated compiler'' [substitute whinge of your choice], an ABSOLUTE NECESSITY if anything even remotely like an ``ideal'' programming language is ever to be designed.

I suggest that while we can never create the ideal, we can come pretty close, and I offer the following possible solution for discussion:

For each type of language (four at the moment), extract a minimal language that fulfills the requirements. For example, bare-bones Modula-2 for the imperative requirements.

Design a lexicon and grammar that covers all four and is as natural-language-like as possible without being imprecise. If you have to go to LL(2) or have a two-level parser, so be it; MIPS are cheap these days (hey, I'm a VLSI designer, I should know :-), and human time isn't.

Let the library (ie, object class) writers extend as necessary. This is another focal point.

It is stupid to say ``Oh, but the user can write routines to extend the language.'' Oh yeah ? Then tell me which of the following is more readable, given a library of complex arithmetic functions:

    sin := (e**(i*z) - e**(-i*z)) / 2i      (* note the lack of garbage like FLOAT *)

    sin := CompDiv(CompSub(CompExp(CompMul(i,z)),CompExp(CompNeg(CompMul(i,z)))),CompAssign(0,2));

It gets worse if you can't return user-defined non-cardinal types (only pointers to them) on the stack. This is another flaw in some languages today. If I code

    VAR meow : ARRAY [0..262143] OF BYTE;

and later on in a procedure

    RETURN meow;

I **know** the compiler isn't going to return a 256kByte array on the stack; it'll use a pointer. But I, the programmer, DON'T NEED TO KNOW THIS ! There can be no excuse these days for not allowing ANYTHING to be returned from a procedure, but even Modula-2 Rev. 4 doesn't do this. Pfft!

Furthermore, make it easy to define not only your own operators, but also __your own textual forms for literals__. I would rather write

    CONST zin = z2 / (5+3i)

than

    CONST zin = CompDiv(z2,CompAssign(5,3))

for example.

Again, at this point, people usually start to whine, but I would say that there is almost certainly a crossover point past which, as languages get more natural-looking, the designer can think in higher level terms, and express higher level ideas more succinctly, and therefore __LESS BUGGILY__. (Who cares about EOL & EOT ? WHILE (<>) looks fine to me). (Of course they can express higher-level algorithm flaws more succinctly :-)

Of course, it must be remembered that someone who must have been very clever once remarked: "Enable programmers to program in English, and you will find that they can't". This is true up to a point. Our language must be limited, or it will lose all precision. But I am also saying that we need a __lot__ of extra syntactic freedom in saying what you _can_ say in the language, and current languages just don't provide it.

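The readability comparison above can be tried in a present-day language. The following is only a rough sketch in Python, not anything from the proposed language: it uses the standard identity sin z = (e^(iz) - e^(-iz)) / 2i, the comp_* helpers are hypothetical stand-ins for a CompDiv-style library, and parse_literal is an assumed helper that imitates a user-defined literal form such as 5+3i by rewriting it into Python's own j notation.

```python
import cmath
import operator


# Function-call style: hypothetical stand-ins for a CompDiv/CompSub-style
# library in a language without operator overloading.
def comp_assign(re, im):
    """Build a complex value from real and imaginary parts."""
    return complex(re, im)


comp_mul = operator.mul
comp_sub = operator.sub
comp_div = operator.truediv
comp_neg = operator.neg
comp_exp = cmath.exp


def sin_calls(z):
    """sin z written with nested library calls only."""
    i = comp_assign(0, 1)
    return comp_div(
        comp_sub(comp_exp(comp_mul(i, z)), comp_exp(comp_neg(comp_mul(i, z)))),
        comp_assign(0, 2),
    )


def sin_ops(z):
    """The same identity written with operators and an imaginary literal."""
    return (cmath.e ** (1j * z) - cmath.e ** (-1j * z)) / 2j


def parse_literal(text):
    """Assumed helper: accept a literal like '5+3i' and return a complex."""
    return complex(text.replace(" ", "").replace("i", "j"))


if __name__ == "__main__":
    z = 0.5 + 0.25j
    print(sin_calls(z), sin_ops(z), cmath.sin(z))
    z2 = 10 + 2j
    print(z2 / parse_literal("5+3i"))
```

Both functions compute the same value; the only difference is how much of the formula survives on the page. Note that parse_literal works on a string at run time, which is weaker than the compile-time literal forms argued for here.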
Is it really so difficult, for instance, to parse out the noise words in

    z2 := the 53rd 130th root of z1;

given a procedure CompRoot(complex,integer,integer,complex) ? Perhaps with objects available, we can provide self.parser as a routine with each declared type [recursive compilation anyone ?].

Oh, and one more thing - MACROS ! If I am putting together a library of IO routines based on a library that comes with the compiler, I don't want a function call overhead whenever I use any of those routines verbatim. For example, if I rewrite sin() and cos(), but leave exp() alone, I take a performance hit when I say

    PROCEDURE MyExp(number: REAL): REAL;
    BEGIN
      RETURN maths.exp(number)
    END MyExp;

since MyExp is a real function, not a macro. Furthermore, I frequently want to be able to dump a copy of a routine inline without doing it as a function call, eg., for reasons of speed, while keeping only one main definition of that function. How about

    BEGIN
      ...
      EXEC (some_horribly_complicated_test())   (* rather than *)
      some_horribly_complicated_test();
      ...
    END;

For that matter, INC(x) looks like a procedure, but it'd damn well better be a SINGLE assembler instruction in practice, or else.

------------------------------------------------------------------------

Well I've got that lot off my chest after so many years, so let's clean up the loose ends. Mark continues:

>people I talked to about this seem to arrive at as a first idea, then you
>have nothing more than a series of disjoint compilers integrated by a common
>object code format and single linker.

BTW, JPI use a common p-code and object code generator.

> Syntax is not an issue.

Here I must disagree. See above.

> We're not talking about actualy merging the syntaxes of the source languages

I am (sort of).

>would be an interesting problem to solve.

You bet !

> When you want your compiler to do C, you issue a #in c directive. When you
> want it to switch to Pascal, you likewise issue a #in pascal directive, and
> so on...

I have thought of this before, but I'm not sure I'd like it.

> With this latter strategy (more than one language per file), the issue of
> what language you issue external declarations becomes moot: since it's all
> "going down the same stomach" anyhow, it doesn't matter.

I couldn't agree more, but I still feel the #in c / #in pascal idea would look too odd. Still, it's a matter of taste.

> The best strategy to pursue to minimize these problems see to be to
> simultaneously develop extensions of each language that are upwardly
> compatible with the latest standard and which make these languages as much
> alike as possible. This means adding C/Pascal-like data structures and
> control structures to the likes of FORTRAN or BASIC, for instance.

I'll go along with that in the meantime, despite the people who laugh when I say it. Believe me, many people I have talked to find such ideas anathema.

> It seems to me, though, that the huge investment in this effort would be
>very much worth it, since no matter where I talk and who I talk to about
>this, the idea goes over extremely well: it seems that we're talking about
>the ultimate programmer's workbench with this kind of utility.

Agreed.

> But there's this one nagging issue: what would this give us that using a
>series of compilers, like MicroSoft's Quick series, with a good linker won't
>already give you?

A completely integrated and normalised language, tailored to fit the majority of real-world problems (at least those we know how to do at the moment) with as few _extraneous_ ways of doing the same thing as possible.

Oh well. I can dream...

Regards,
Martin.

(I leave Brunel University at the end of next week, but I'll happily discuss this (if anyone's interested) until then).
--
Martin Howe, Microelectronics System Design MSc, Brunel U.

[A J Perlis often commented that attempts to combine dissimilar language types produced "dumbbell shaped languages," i.e. the pieces didn't fit together very well.
I'd also like a language that lets me say anything I want to say very concisely, but I'm not convinced that I can define something that combines all sorts of different stuff and doesn't end up looking totally ugly. More specific proposals could be persuasive. Also, there has been a long thread on this topic in comp.lang.misc. -John]
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.