Path: utzoo!utgpu!water!watmath!clyde!burl!codas!ufcsv!gatech!bbn!denbeste
From: denbeste@bbn.COM (Steven Den Beste)
Newsgroups: comp.sys.amiga
Subject: Re: Code Optimization
Message-ID: <6251@cc5.bbn.COM>
Date: 24 Jan 88 16:12:46 GMT
Organization: Bolt Beranek and Newman, Inc., Cambridge, MA
Lines: 126

Johan Larson writes:

> Currently, single pass compilers are very popular, due to their speed.  This
> speed allows the development cycle to be shortened, sometimes by very
> significant amounts.  However, unless I'm very wrong, the speed of the
> single pass compiler is gained at the expense of code quality.  One 
> immediately obvious solution is a code optimizer, which could be used on a
> compiled program once the critical debuggion is done.  However, such a 
> code optimizer could also be put to another, and potentially more powerful
> use.
It depends on what you mean by a "single pass" compiler. If you mean a
single-pass compiler which yields assembly language, then you are sort of
cheating, because the two passes of the assembler represent the second and
third passes of your "one-pass" compiler.

If you mean a single-pass compiler which goes direct to object (or executable -
did I mention the two linking passes above, too?) then the big big problem with
single pass compilers is that they have to keep everything in memory. If your
program won't fit (and it has to be the WHOLE program, too - you can't break it
up unless you have a separate linker (which means two more passes)) then you
are SOL.

The biggest advantage of a multi-pass compiler is that the available memory at
compilation is almost unrelated to the size of the program being compiled (the
only thing which is related is the symbol table size).

If a single-pass compiler operates in memory, there is no fundamental reason
why it can't generate code just as efficient as a multi-pass compiler.

(By the way, I think you must be a Mac person. I don't know of any single pass
C-compilers for the Amiga, and very little development is done in any other
languages for it.)

> The Mac and Amiga worlds are currently in the process of changing from the
> 68000 processor to the similar, but more powerful 68020.  The 68030 has also
> been rumoured to be lurking somewhere on the horizon.  However, regardless of
> whether a '000 or a '020 is installed, the same code is used.  Thus, if a 
> system is using the more advanced '020, it is using code which was written for
> a more primitive, though similar, processor and thus the code does not take
> proper advantage of the more advanced features of the more modern processors.
> The same agrument applies for the math co-processors, the 68881 and its new
> sibling the 68882, which are rapidly becoming more popular on home computers.
Now, I may be wrong about this, but the big advantage of the 68020 isn't so
much new instructions (there are a few, but they aren't commonly used) but the
fact that it executes the old instructions much faster - because it uses a
32-bit internal data path and the 68000 uses a 16-bit path, so it takes twice
as many internal cycles on the 68000 for a given amount of wide data.

The execution speed of a program using the 68000 set, but run on the 68020,
will not be much worse typically than the same program compiled to use the full
68020 set - but both will beat the pants off of a 68000 running the same clock
rate.

The one big exception to this is that the 68000 cannot work with a math
co-processor and the 68020 (and the 68010 before it) can - so for math stuff if
you use the coprocessor instructions you can get orders of magnitude increases
in speed. There are ways of working around this so that the same code works on
both, but takes advantage of the math chip if it is there:

Way #1: All math operations are handled as calls to a run-time-loaded library
of math routines. On a 68000 or a 68020 without math chip, you load a software
floating point library. On the 68020 with math chip, you load one which calls
the chip. (This is what the Amiga does.)

Way #2: Everyone's code uses the math chip instructions, but on a 68000 or an
uninstalled 68020 they trap as illegal instructions - to a special routine
which emulates them in software and returns!

In the former case you pay a small performance hit on the 68020 with math
chip as compared to the latter.


> The code optimizer, if properly written, could remove the inefficiency of
> using primitive code on an advanced processor, simply by modifying the
> code to take advantage of the features of the more advanced processor.
> Since not everyone has a 68020, it would be left up to the user to decide
> whether or not (s)he wants the code of a program modified in such a way.
> The optimizer could also be written to rewrite the program code so as
> to take advantage of a coprocessor, it one is present.
An optimizer cannot run on compiled binary - the problem is indeterminate. The
reason is simple: The optimizer cannot differentiate between data and code.
(And if it optimizes the data, the program dies.) Worse than that, even if it
knew that it was in code, it cannot unambiguously identify the beginnings of
instructions.

In any case, in anything greater than a peephole optimizer, even the raw
assembly language isn't enough - the optimizer must be given information about
the original program.

An example is "common subtree" optimization in a complex
expression: To identify this is difficult and fraught with errors in the
assembly language. The right place to do it is in the parser when building the
expression tree.

Another example (probably closer to your heart) would be "Oh, I can use a 68881
instruction here". Unless your math library emulates the 68881 on a
one-call-per-one-instruction basis, you may not be able to figure out that a
68881 instruction will help - without reference to the original source.
(And if you are operating on binary, you may not know that this particular
subroutine call was a call to the math library, anyway!)

Peep-hole optimizers have their place, but the kind of optimization you are
asking for can't be done that way.

...and even if it could, I gather that you are thinking of this as a COMMERCIAL
thing - and what manufacturer will distribute the assembly language of their
program, complete with labels? (I gather that your idea was that someone buys a
commercial program intended for a 68000, passes it through the magic optimizer,
and all-of-a-sudden it is optimized for the 68020 with math chip.)


> Is there some fundamental problem with what is described above?  Is anyone
> working on such an optimizer?
> I welcome your comments and criticisms.
> 
> 	Johan Larson

It seems to me you've missed an easy answer to your problem: Use your
inefficient fast single-pass compiler while you develop, and when you are done
use the efficient slow multi-pass compiler for the production version.

Why do you need an optimizer at all? Use the one that's built into the
multi-pass compiler.
-- 
Steven C. Den Beste,   Bolt Beranek & Newman, Cambridge MA
denbeste@bbn.com(ARPA/CSNET/UUCP)    harvard!bbn.com!denbeste(UUCP)