Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!zaphod.mps.ohio-state.edu!tut.cis.ohio-state.edu!snorkelwacker!mintaka!spdcc!esegue!compilers-sender
From: markh@csd4.csd.uwm.edu (Mark William Hopkins)
Newsgroups: comp.compilers
Subject: Multi-compilers
Keywords: design, source
Message-ID: <9009110403.AA03158@csd4.csd.uwm.edu>
Date: 11 Sep 90 04:03:02 GMT
Sender: compilers-sender@esegue.segue.boston.ma.us
Reply-To: Mark William Hopkins
Organization: Compilers Central
Lines: 78
Approved: compilers@esegue.segue.boston.ma.us

Recently, an interesting idea has come to mind for a new kind of compiler: a Multi-Compiler. What makes it different from your typical compiler is that it accepts code from more than one source language: many source languages, in fact. However, it's an idea that is easier said than conceived. What would it look like?

The whole issue seems to revolve around the concept (which I borrow from linguistics) of 'code-switching'. Code-switching is where a multilingual speaker switches from one language to another, often in mid-sentence. For instance, while waiting for a departure from an airport in Budapest, I got into a conversation with an East German traveller. However, my German was weak, his English was nonexistent, and our Hungarian was not very strong. So we found it necessary to literally sprinkle our conversation with almost random switching between German and Hungarian. Each language offered something which compensated for something lacking in (our knowledge of) the other.

A good programmer will face the same kind of dilemma. Different languages are designed to do different things better. An extreme example is the case of writing a truly practical AI control program, which would ideally handle all the intelligent rule-based tasks in Prolog, all the event-driven tasks in assembly and C, and maybe even the recognition and learning tasks in the assembly language of a special-purpose neural net chip.
The question, naturally, is: when are you allowed to code-switch? Depending on how you answer this, you get either a closely integrated set of *distinct* compilers (like the Quick series marketed by Microsoft), or a truly integrated programmer's utility.

If you enforce the "one-language-per-module" constraint, which a lot of people I talked to about this seem to arrive at as a first idea, then you have nothing more than a series of disjoint compilers integrated by a common object code format and a single linker. In this case, it's "all in the linker".

But in that situation, there would remain the question: when you define a module in language A, and use it in language B, which language do you declare it in? Declaring it in B potentially means a lot of redundant header files, and declaring it in A means having to resolve the issue of how to interface the data types of different languages. This could become very complicated if your languages range from the highly imperative C to the highly declarative Prolog.

If you allow for interlanguage mixing within modules, you will face a more extreme version of the data-type interfacing problem, and possibly even a control statement interfacing problem. Here, the ideal solution seems to be the "one-language-per-function" rule. But in this case, it's "all in the compiler", not the linker.

Syntax is not an issue. We're not talking about actually merging the syntaxes of the source languages into one horrific construct (though that would be an interesting problem to solve). When you want your compiler to do C, you issue a #in c directive. When you want it to switch to Pascal, you likewise issue a #in pascal directive, and so on...

With this latter strategy (more than one language per file), the issue of which language you write external declarations in becomes moot: since it's all "going down the same stomach" anyhow, it doesn't matter.
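To make the #in idea a little more concrete, here is a minimal sketch in Python of the very first stage such a compiler might need: cutting one source file into per-language segments at each directive. Only the directive spelling (#in c, #in pascal) comes from the text above. The function name split_by_language, the default language, and the rule that a directive must sit alone on its line are assumptions made for illustration, not part of any existing tool.

```python
import re
from typing import List, Tuple

# Directive spelling follows the post ("#in c", "#in pascal").
# Assumption: the directive stands alone on its line. Note that
# "#include" does not match, because "#in" must be followed by whitespace.
DIRECTIVE = re.compile(r"^\s*#in\s+([A-Za-z_][A-Za-z0-9_+]*)\s*$")


def split_by_language(source: str, default: str = "c") -> List[Tuple[str, str]]:
    """Split source text into (language, code) segments at each #in line.

    Text before the first directive is attributed to `default`
    (a hypothetical choice; the post does not specify one).
    """
    segments: List[Tuple[str, str]] = []
    current_lang = default
    current_lines: List[str] = []

    def flush() -> None:
        # Emit the pending segment unless it is empty or whitespace only.
        if any(line.strip() for line in current_lines):
            segments.append((current_lang, "\n".join(current_lines)))

    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            flush()
            current_lines = []
            current_lang = match.group(1).lower()
        else:
            current_lines.append(line)
    flush()
    return segments


example = """#in c
int twice(int x) { return 2 * x; }
#in pascal
function Thrice(x: integer): integer;
begin Thrice := 3 * x end;
"""

segments = split_by_language(example)
for lang, code in segments:
    print(f"[{lang}]")
    print(code)
```

In a real multi-compiler each segment would presumably be handed to the front end for its language, with all front ends feeding one shared symbol table, which is where the data-type interfacing problem described above would actually have to be solved. The sketch does not attempt that part.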
The best strategy to pursue to minimize these problems seems to be to simultaneously develop extensions of each language that are upwardly compatible with the latest standard and which make these languages as much alike as possible. This means adding C/Pascal-like data structures and control structures to the likes of FORTRAN or BASIC, for instance.

It seems to me, though, that the huge investment in this effort would be very much worth it, since no matter where I talk and who I talk to about this, the idea goes over extremely well: it seems that we're talking about the ultimate programmer's workbench with this kind of utility.

But there's this one nagging issue: what would this give us that using a series of compilers, like Microsoft's Quick series, with a good linker won't already give us?
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.