Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!ames!purdue!gatech!bloom-beacon!hstbme.mit.edu!scs
From: scs@hstbme.mit.edu (Steve Summit)
Newsgroups: comp.lang.c
Subject: optimization (was: Re: swap(x,y))
Message-ID: <14357@bloom-beacon.MIT.EDU>
Date: 16 Sep 89 23:54:24 GMT
References: <4151@buengc.BU.EDU>
Sender: daemon@bloom-beacon.MIT.EDU
Reply-To: scs@adam.pika.mit.edu (Steve Summit)
Lines: 56

In article <4151@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
>>And by the way: why do you need an operator for swapping?
>Because if a machine can do it with a coupla gates and half a cycle,
>I'd like to do it with an operator.
>
>Some machines (we've already seen an example involving a DG) have
>opcodes to swap two values, and it seems a bit ludicrous to code
>some elaborate swapping routine when the optimizer is just going
>to throw it all out and insert the single instruction.
>
>Still, any good optimizer will catch all the obvious cases, but it's
>gotta be harder to write a compiler to do it that way than to implement
>just another canned routine.

I'd much, much rather that the optimizer bend 'way over backwards
to detect cases which reduce to simple opcodes, than have the
language cluttered up with features corresponding to every
"useful" operation any architecture has ever offered, with the
resulting requirement that every compiler writer implement
emulations for all those operations not directly supported by his
particular machine.  (True, as Blair seems to suggest, the
emulations could be portable, "canned" routines.)

Someone has already posted an example of a compiler/optimizer
which translated the dumb, obvious, temporary-variable-using
exchange into a single EXCH instruction.  I cheered when I saw
that -- it's the right way to do optimizations.
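
For concreteness, the "dumb, obvious" exchange is nothing more
than the sketch below.  The name swap_int is made up for
illustration, and whether a given compiler reduces the body to a
single exchange instruction depends entirely on the target
machine and the optimizer; the point is only that the source
stays this plain.

```c
/* Exchange two ints the plain way, through a temporary.
 * (swap_int is a hypothetical name, not from any library.) */
void swap_int(int *a, int *b)
{
    int tmp = *a;   /* save the first value */
    *a = *b;        /* overwrite it with the second */
    *b = tmp;       /* store the saved value in the second */
}
```

A caller writes swap_int(&x, &y) and leaves instruction
selection to the compiler.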

From time to time the relative merits of power-of-two
multiplicative arithmetic vs. shift instructions are discussed.
x <<= 2 is ugly if x *= 4 is what is really meant, and it's
completely unnecessary -- any compiler worth its salt will
generate the left shift anyway.  (Ritchie's original PDP-11
compiler did so even without the optimizer.)
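
As a small sketch of the two spellings (the function names are
hypothetical, and unsigned is used here only so that overflow
and negative values don't cloud the comparison):

```c
/* Says what is meant: multiply by four. */
unsigned scale_mul(unsigned x)
{
    x *= 4;
    return x;
}

/* The hand-"optimized" spelling of the same thing. */
unsigned scale_shift(unsigned x)
{
    x <<= 2;
    return x;
}
```

The two functions return the same value for every unsigned
argument, so the only thing the second buys is a less readable
source line.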

People always say "but I have to do source-level optimizations,
because I don't have control over the optimizer and it might not
make them."  If the optimizer isn't making the optimizations you
want, remember that it may also be missing others that you don't
know about, or that you have no control over.  If highly
optimized code is that important to you, you'd be much better off
buying a better compiler than spending time, introducing bugs,
and compromising maintainability by cluttering your code with
mechanical, source-level optimizations.

The beauty of low-level optimization (even peephole optimization
following code generation) is that it is automatic.  It comes
into play even if you forget to apply your source-level
optimization, and it reaches aspects of the code (such as scaled
pointer arithmetic and subscript calculation) that you cannot
get at from the source at all.  Every
time you discover a new machine instruction or sequence which
could better implement some C fragment, figure out a way to have
the optimizer recognize the equivalent, obvious C code and
generate the optimized sequence, rather than figuring out some
obfuscated C code which will happen to generate the sequence
using the existing code generator.  The payoff is much greater.
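
To illustrate the scaled pointer arithmetic and subscript
calculation mentioned above, here is a minimal sketch (struct
point and sum_x are invented for the example).  The multiply by
the element size is implied by the subscript and never appears
in the source, so there is nothing for a source-level shift
trick to replace; only the compiler can choose how to do it.

```c
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct point {
    long x;
    long y;
};

/* Sum the x members of an array of n points. */
long sum_x(const struct point *p, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* p[i] is at byte offset i * sizeof(struct point) from p.
         * That scaling is generated by the compiler; the source
         * only says "the i-th element". */
        total += p[i].x;
    }
    return total;
}
```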