Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!wuarchive!uunet!snorkelwacker!spdcc!esegue!compilers-sender From: Olivier.Ridoux@irisa.fr (Olivier Ridoux) Newsgroups: comp.compilers Subject: Re: (C as) Intermediate Representation Keywords: C, code, translator Message-ID: <9008201059.AA08043@irisa.irisa.fr> Date: 20 Aug 90 22:11:07 GMT Sender: compilers-sender@esegue.segue.boston.ma.us Reply-To: Olivier.Ridoux@irisa.fr (Olivier Ridoux) Organization: Compilers Central Lines: 64 Approved: compilers@esegue.segue.boston.ma.us In article <1990Aug14.163258.2094@esegue.segue.boston.ma.us>, David Chase describes his experiment in using C as an IRL. I agree with all that he said, but I'd like to add a few points comming from an experiment in using C as an IRL for a Prolog compiler. 1- > It is > certainly a mistake (and I learned this the hard way) to expect any kind > of a clean mapping from Modula-3 types to C types; ... It is certainly a mistake to expect any kind of a clean mapping from anything in the compiled language to anything in C (or any programming language) even if both taste the same. For instance, in Prolog one predicate is mapped on one function to benefit from the induced scope. But recursion in a predicate is not mapped on recursion in a function. In general, C devices are used for only a part of their capabilities. This leads to "perverse" programs in which some capabilities are over-used and others never used. There is no shame in it as long as it remains portable. A first example of this is the use of a C function for its scoping capability only. Another example comes from the compilation of unification. Unification is mapped on a sequence of tests that do side-effect for controlling a term (tree-like structure) traversal. As soon as a test fails the sequence is aborted and a failure treatment executed; subsequent tests must NOT be executed. The left-to-right evaluation of "&&" or "||" has exactly this behaviour. So mapping unification on a "&&" expression is a concise way of coding the failure control, but produces degenerate expressions. Unfortunately it may produce pathologically large expressions. Previous authors mentioned the need for an efficient back-end; a robust front-end is required too. 2- > For Modula-3, the big difficulties were caused by (1) exception-handling, > (2) garbage collection, and (3) nested procedures. A fourth difficulty is the lack of label-as-first-class-value in C. Since one cannot map directly Prolog recursion on C recursion, Prolog recursion is implemented through the use of a kind of scheduler. I suspect that it is also the case in many languages. The scheduler wants to manipulate control points as values, but the only control points that can be so manipulated are function pointers. If one can afford supplementary constraints switch entries also do the job. But in many cases a label would do better. 3- "C as IRL" should perhaps been rephrased as "C as low-level IRL". I believe it is good taste to hide C behind abstract machines. In the case of Prolog, I use two abstract machines. One is called MALI and is in fact an abstract memory. It deals with the representation of terms, goal statements and choice points (depth-first management of indeterminism). It contains an efficient garbage-collector. MALI is implemented in C but is not much influenced by it. The other machine occupies the same ecological niche as the WAM (Warren's Abstract Machine). It is implemented in C and is deeply influenced by it. It is in fact the record of the assumed perversions (see first point). One can say as a gross approximation that Prolog data structures are mapped on MALI and Prolog control structures are mapped on the second machine. Olivier Ridoux -- Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.