Path: utzoo!attcan!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg
From: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.lang.misc
Subject: Re: Multi-compilers
Message-ID: <PCG.90Sep23181145@teachc.cs.aber.ac.uk>
Date: 23 Sep 90 17:11:45 GMT
References: <2581@l.cc.purdue.edu> <1990Sep22.061027.9223@d.cs.okstate.edu>
Sender: pcg@aber-cs.UUCP
Organization: Coleg Prifysgol Cymru
Lines: 218
Nntp-Posting-Host: teachc
In-reply-to: norman@d.cs.okstate.edu's message of 22 Sep 90 06:10:27 GMT

On 22 Sep 90 06:10:27 GMT, norman@d.cs.okstate.edu (Norman Graham) said:

norman> From article <2581@l.cc.purdue.edu>, by cik@l.cc.purdue.edu
norman> (Herman Rubin):

cik> A language is a set of macros to be interpreted by the compiler
cik> into machine procedures.

This is a bit crude -- less crude, but still not entirely satisfactory,
may be that a language is a set of macros for a virtual machine, and the
compiler must both expand the macros and map the virtual machine to the
real machine.

cik> There is no good reason why the user should not be allowed to add
cik> macros in convenient notation to this list, nor why many of the
cik> restrictions on the current macros need be there.

The concept of a supercompiler -- you would be pleased with Lisp/Scheme,
that do not have any such restrictions, and many Lisp/Scheme compilers,
which use a distributed compiler approach -- each relevant function is
tagged with two (or more, if you want) properties, that define its
behaviour with respect to 'COMPILE and 'INTERPRET.

	e.g. the symbol PLUS is associated, under the COMPILE property,
	with code that _generates_ "load op1, load op2, sum op1,op2,op3"
	while under the INTERPRET property it will execute the same
	sequence. A Smalltalk/Self style compiling interpreter is one
	that generates code incrementally and then passes control to the
	chunks of code so generated on the fly. Not difficult to do
	in Lisp/Scheme either.

The compiler and the interpreter are then just symbolic reduction
engines that fire the appropriate rules for each symbol they see in the
source. In one case the reducation produces code to calculate results,
in the other case directly results.  Actually if one is clever it is
possible to realize that first one wants to reduce things at compile
time as much as possible, then one wants to compile the rest.

	What you want to do is to interpret a program as much as you can,
	and then compile what remains, that will depend on free variables.

I think that Forth is similar here -- two (conceptual) dictionaries and
two distinct but similar reduction engines.

norman> This view explains why you're constantly at odds with the rest of 
norman> the netnews community.

Precisely, indeed so many in the netnews community cannot conceive of
anything different from the C or Fortran they are accustomed to use,
or indeed some others cannot see Rubin's motivation entirely, as follows.

norman> First, as anyone with a basic knowledge of languages can tell you,

I am somehow under the impression that Rubin understands languages
better than somebody that gives much importance to syntax. A recent book
on language implementation maps many disparate languages into a simple
lisp/like syntax, with no great loss.

norman> a language is a set of strings (of finite but arbitrary length)
norman> over some alphabet.

Well, that is the syntax of the language -- essentially irrelevant,
except for pragmatics. If this were literally true, yacc would be all we
need for writing compilers. But:

norman> In addition, computer languages are notations for expressing
norman> computations; they are not notations for expressing the machine
norman> instructions you want a compiler to generate.

Uhhhmmmm. Here we seem to be describing some branch of mathematics
instead -- a fairly common idea.  Computer languages exist for pragmatic
reasons. Otherwise we'd be using lambda calculus or Turing machines. I
used to think, erroneously I now understand, that computer languages are
not there to describe computations; they are there for supporting an
engineering like activity, in which expressing computations is a fairly
small part of the whole picture. Analogies are always inaccurate, but it
would be like saying that the reason for a jet engine is the
demonstration of one of Newton's laws, or of the effects of heating a
gas. Maybe civil engineering is only really about differential equations
as well.

norman> As a general rule, a computer language exists apart from any
norman> particular compiler or computer architecture, although issues of
norman> compilation and computation model may influence the design of a
norman> language.

I would dearly hope they do. Actually I would hope they are *prominent*.
I would venture as far as saying that the machine is the reason for
which the language exists. The idea that a language *definition* should
imply a virtual rather than a real machine is to aid portability,
hopefully without hurting its applicability to real machines too much,
and for languages for which portability is not important, a virtual
machine based definition is often a burden.

norman> Second, it is simply naive to describe what a compiler does as
norman> macro expansion.

Maybe you are thinking of macros as they are found in languages like C,
those Rubin is quite unhappy about.

I am quite sure that a compiler can be implemented in Lisp/Scheme
macros. I am using "macro" here as 'procedure that is evaluated at
compile time, and whose function is to transform a piece of program into
something else'.  It need not be a mere syntax macro, say like in cpp.
If you are familiar with Lisp/Scheme, just consider the source and
effects of some common 'loop' macros -- to my poor tired eyes they look
like pieces of a compiler.

cjk> If the machine can do something, the user should be able to invent
cjk> a notation to do it, if necessary.  For example, I find the general
cjk> vector operation on the CYBER 205, with its dozen optional fields,
cjk> easy to explain in a notation easy to read and write.

Maybe Rubin is a bit rough in expression, but I think that he is trying
to say that he would dearly love to have extensible compilers, i.e.
compilers where the set of compile time procedures can be extended by the
user, i.e. the symbolic reduction engine is programmable. The "if
necessary" above does not refer to power, I guess, but to cost.

Well, maybe he should seriously consider Lisp/Scheme. It is well known
that they are excellently suited to numerical computations, and they
provide unparalleled access to the underlying implementation. Maybe a
look a T? Forth is another nice choice. Note that I am not joking here
-- I am very serious. I am considering writing myself an OS kernel in
Scheme or something similar rather than in C/C++. Scheme can be viewed
as a fairly good SIL, even if most Scheme systems do not implement it as
one (especially inasmuch memory management is concerned -- you don't
have the option of doing manual memory reclamation really).

norman> This is written as if you believe the 'power of the computer' resides
norman> in its instruction set. I disagree with this idea.

Interesting disagreement. So Peano arithmetic (zero, successor, test
against zero) is enough and everything else is redundant? Why are we not
using Turing machines? They are as powerful as anything else, and have
a pretty small instruction set.

norman> In fact, the whole RISC movement is founded on the idea that to
norman> build a faster computer you must _remove_ instructions from the
norman> instruction set.

Uhmmm. _must_? I don't think they see it as a necessity -- my own,
probably seriously wrong idea of the subject, is that the RISC movement
is about improving price/performance, not *power* (once you have got
Peano, you have got it all), by observing that it is _often_ a bad
investment to allocate your silicon budget to seldom used, complicated
instructions, but instead it gives better price/performance to allocate
it to speeding up the simpler, most frequently executed ones.

I don't think that this excludes that a RISC machine can have
complicated instructions (vide floating point), if the payoff is
sufficient.  So far the best known payoff for doing matrix computing is
in vector instructions, ergo they are RISCy :-). Or maybe I am no longer
current on this subject and I should agree that:

norman> Likewise, adding an instruction to a modern computer will not
norman> increase its computational power.

Or maybe you should reconsider -- try to remove vector instructions from
a Cray, if they let you. Indeed its power will not be diminished, as
long as Peano instructions are left in, but its price/performance ratio
may be another story, I guess. Or maybe you should reread Hennessy's and
Patterson's book to get a better idea of what is the RISC idea about
(the title gives away almost the whole story).

norman> I believe the 'power of the computer' resides in its ability to be
norman> used as an abstraction device on which we can build higher and higher
norman> levels of reality.

Here we fully agree. The reasons for which computers are so important
are that they are general purpose automata (abstraction engines), and
that they can be cost effective too. Leaving out the second aspect
brings us back to Babbage, I am afraid.

norman> Personally, I chose to live in a reality where
norman> lambda calculus is the underlying computational model.

We couldn't have guessed! :-). Most of us out here are instead in a
reality where they must pay for their machines, and are keenly
interested in price/performance ratios.

Rubin too is unfortunately locked in this sad and regrettable reality,
and also in a segment of it where absolute prices are so huge that
price/performance ratio of a machine/language combination seems not
easily dismissable.

He is quite annoyed that he cannot get the benefits of high level
languages without forgoing those of special purpose hw facilities that
substantially enhance the cost effectiveness of his work, and that also
carried a fairly large price tag (only a few million dollars, I guess,
that's the difference between the price of a 205 and that of a mainframe
of equivalent scalar power). He is even more annoyed as he suspects that
the problem could be rectified with just a little more effort on the
part of language designers and implementors.

norman> But by living in this level of reality I gain the ability to
norman> write extremely clear and straight-forward programs that are 10,
norman> 20, and sometimes even 30 times _shorter_ than the equivalent
norman> programs written in a typical high-level language.

Are you sure that your "programs" are equivalent? A lot of people
(including regrettably Prof. Dijkstra and Prof. Hoare) seem to have lost
sight of the distinction between what is a program and what is an
algorithm written in a program-like notation. A sort program, e.g.  UNIX
sort(1), is quite a different thing from a sort algorithm, e.g.
qsort(3).

My probably warped view of the programming scene is that the best known
general purpose language technology, applied under favourable
circumstances, seems (hand waving very prominently here) no more than
2-3 times more productive than mainstream language technology (Fortran,
Cobol, C, PL/1); claims to the contrary make me a bit skeptical. Now I
admit that even 200-300% is a lot, in economic terms, but it pales by
comparison with the different productivity levels of *people*.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk