Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!mit-eddie!ll-xn!ames!ptsfa!ihnp4!homxb!mtuxo!mtune!codas!usfvax2!pdn!alan
From: alan@pdn.UUCP (Alan Lovejoy)
Newsgroups: comp.sys.mac
Subject: Re: Compiler efficiency
Message-ID: <1730@pdn.UUCP>
Date: Thu, 5-Nov-87 01:05:12 EST
Article-I.D.: pdn.1730
Posted: Thu Nov  5 01:05:12 1987
Date-Received: Sat, 14-Nov-87 05:36:39 EST
References: <3987@watdragon.waterloo.edu> <304@fairlight.oz>
Reply-To: alan@pdn.UUCP (0000-Alan Lovejoy)
Organization: Paradyne Corporation, Largo, Florida
Lines: 67

In article <304@fairlight.oz> gary@fairlight.oz (Gary Evesson) writes:
>In article palarson@watdragon.waterloo.edu (Paul Larson) writes:
>
>>I have heard some murmurs (they weren't loud enough to be termed complaints)
>>that some (many?  all?) compilers for the mac produce code of questionable 
>>efficiency.  I don't know enough assembly to prove or disprove this statement.
>>Is there any merit to these murmurs?

You must remember that it's only been in the last year or two that
really *good* (by mini-/mainframe standards) compilers for the IBM PC
family have become available.  The 68000 in general, and the Mac in
particular, just haven't been successful enough long enough to attract the
critical mass necessary for state-of-the-art compilers to be developed.

The failings of most Mac compilers are as follows:

1) Poor register optimization.  This is a killer, since the architecture
of the 68000 *depends* upon efficient register usage for its
performance.  Also, most C compilers won't keep values in registers
unless the programmer explicitly declares the variable with the
"register" storage class.  This is one of the reasons for the Mac II's
poorer-than-expected showing against the '386 machines in the BYTE
Benchmarks.
 
2) Poor use of the available addressing modes.  This, too, is a very  
serious problem.  The 68000 has some 14 addressing modes (even the 386
only has 9).  There is a tendency to use d8(An, Dn) for iterating over
arrays, when (An)+ would be much more efficient.  Most compilers won't
use 'LEA <complicated addressing mode operand>, An' followed by repeated
references to (An) inside a loop, but instead use <complicated
addressing mode> inside the loop.  The expression 'a = b + c' should
be generated as 'move b(A6),D0; add c(A6),D0; move D0,a(A6)' (assuming
that a, b and c start out in memory and are not referenced again frequently
and/or soon enough to be worth keeping in the registers).  But many
compilers can't do things quite that efficiently.  You'll often see
'move c(A6),D0; move b(A6),D1; add D0,D1; move D1,a(A6)'.  I've seen
worse.  I could go on...

3) Poor run-time data structures and/or techniques.  By this I mean things
like procedure activation records, stack frames, procedure calling
conventions, argument passing techniques, local and/or external procedure
calling techniques, and local, global and/or external variable reference
techniques, as well as inefficient jump tables and failure to use in-line
procedure expansion for short procedures (some of these problems are the
fault of the C language and/or the Mac's OS, but not most of them).

4) Few compilers make heavy use of the classical and mostly
machine-independent optimizations such as constant folding, common
sub-expression elimination, copy propagation, dead-code elimination,
code-motion, induction variable elimination and strength reduction.

5) I don't know of *any* Mac compilers that attempt multiple generation
strategies for the same code block, and pick the best one based on the
estimated number of machine cycles each strategy will require for
execution. (This has to be done dynamically for an optimizing compiler,
because the "code template" for each source language construct is not
a static entity when the optimizer is making changes all over the
place).  
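
Points 1 and 2 above can be illustrated at the source level.  What follows
is only a rough sketch, not output from or input to any particular Mac
compiler: the function names sum_indexed and sum_pointer are made up for
the illustration.  The idea is that, with a weak optimizer, the programmer
may have to write the second form by hand to get a loop that could map
onto (An)+ with the pointer and accumulator held in registers.

```c
#include <stddef.h>

/* Indexed form.  A compiler with weak register allocation and poor
 * addressing-mode selection may recompute the element address from
 * the base and the index on every pass through the loop. */
long sum_indexed(const short *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer form.  The walking pointer, the end pointer and the
 * accumulator are all declared "register", and *p++ is the kind of
 * access that can be expressed with post-increment addressing.
 * Whether a given compiler actually does so is an assumption here. */
long sum_pointer(const short *a, size_t n)
{
    register const short *p = a;
    register const short *end = a + n;
    register long total = 0;

    while (p < end)
        total += *p++;
    return total;
}
```

Both functions compute the same sum; only the shape handed to the code
generator differs.  A good optimizer would be expected to turn the first
form into the second on its own.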

I have a Modula-2/68000 compiler that does all these optimizations, but it
doesn't run on the Mac. Its Sieve code does 10 iterations 
in 1.14 seconds on a 12 MHz 68000 (1 wait state).  You might expect
a time of 1.8 seconds if this code were run on an SE, and 0.45 seconds
on the Mac II.  The best Mac compilers I've seen run around 3 seconds
on the SE and around 0.7 seconds on the II for this benchmark.  Go back
to the BYTE benchmark articles and compare these numbers!
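
For readers who want to try the comparison, here is a sketch of the Sieve
in C.  It assumes the benchmark meant above is the classic BYTE Sieve of
Eratosthenes (8190-entry flag array, 10 iterations); the post does not
show the Modula-2 source, so this is not the code that was timed.  The
names sieve_pass and sieve_benchmark are invented for the example.

```c
#include <assert.h>
#include <string.h>

#define SIZE 8190               /* flag array size used by the BYTE Sieve */

static char flags[SIZE + 1];

/* One pass of the sieve.  flags[i] stands for the odd number 2*i + 3.
 * Returns the number of primes found in that range. */
int sieve_pass(void)
{
    register int i, k, prime;
    register int count = 0;

    memset(flags, 1, sizeof flags);     /* assume everything is prime */
    for (i = 0; i <= SIZE; i++) {
        if (flags[i]) {
            prime = i + i + 3;
            /* strike out the odd multiples of this prime */
            for (k = i + prime; k <= SIZE; k += prime)
                flags[k] = 0;
            count++;
        }
    }
    return count;
}

/* Repeat the pass the requested number of times (the benchmark uses 10)
 * and return the count from the last pass. */
int sieve_benchmark(int iterations)
{
    int iter;
    int count = 0;

    assert(iterations > 0);
    for (iter = 0; iter < iterations; iter++)
        count = sieve_pass();
    return count;
}
```

Timing a call to sieve_benchmark(10) with whatever clock the system offers
gives a figure comparable in kind to the ones quoted above; each pass
should report 1899 primes.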

--alan@pdn