Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!ames!purdue!gatech!bloom-beacon!hstbme.mit.edu!scs
From: scs@hstbme.mit.edu (Steve Summit)
Newsgroups: comp.lang.c
Subject: optimization (was: Re: swap(x,y))
Message-ID: <14357@bloom-beacon.MIT.EDU>
Date: 16 Sep 89 23:54:24 GMT
References: <4151@buengc.BU.EDU>
Sender: daemon@bloom-beacon.MIT.EDU
Reply-To: scs@adam.pika.mit.edu (Steve Summit)
Lines: 56

In article <4151@buengc.BU.EDU> bph@buengc.bu.edu (Blair P. Houghton) writes:
>>And by the way: why do you need an operator for swapping?
>Because if a machine can do it with a coupla gates and half a cycle,
>I'd like to do it with an operator.
>
>Some machines (we've already seen an example involving a DG) have
>opcodes to swap two values, and it seems a bit ludicrous to code
>some elaborate swapping routine when the optimizer is just going
>to throw it all out and insert the single instruction.
>
>Still, any good optimizer will catch all the obvious cases, but it's
>gotta be harder to write a compiler to do it that way than to implement
>just another canned routine.

I'd much, much rather that the optimizer bend 'way over backwards
to detect cases which reduce to simple opcodes, than have the
language cluttered up with features corresponding to every
"useful" operation any architecture has ever offered, with the
resulting requirement that every compiler writer implement
emulations for all those operations not directly supported by his
particular machine.  (True, as Blair seems to suggest, the
emulations could be portable, "canned" routines.)

Someone has already posted an example of a compiler/optimizer
which translated the dumb, obvious, temporary-variable-using
exchange into a single EXCH instruction.  I cheered when I saw
that -- it's the right way to do optimizations.

From time to time the relative merits of power-of-two
multiplicative arithmetic vs. shift instructions are discussed.
x <<= 2 is ugly if x *= 4 is what is really meant, and it's
completely unnecessary -- any compiler worth its salt will
generate the left shift anyway.  (Ritchie's original PDP11
compiler did so even without the optimizer.)

People always say "but I have to do source-level optimizations,
because I don't have control over the optimizer and it might not
make them."  If the optimizer isn't making the optimizations you
want it to, remember that it may also not be making optimizations
that you don't know about, or have no control over.  If highly
optimized code is that important to you, you'd be much better off
buying a better compiler than spending time, introducing bugs,
and compromising maintainability by cluttering your code with
mechanical, source-level optimizations.

The beauty of low-level optimization (even peephole optimization
following code generation) is that it is automatic and comes into
play even if you forget to apply your source-level optimization,
and on aspects of the code (such as scaled pointer arithmetic and
subscript calculation) for which you can't.

Every time you discover a new machine instruction or sequence
which could better implement some C fragment, figure out a way to
have the optimizer recognize the equivalent, obvious C code and
generate the optimized sequence, rather than figuring out some
obfuscated C code which will happen to generate the sequence
using the existing code generator.  The payoff is much greater.
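
For concreteness, here is a minimal sketch in C of the "dumb,
obvious" forms argued for above, next to the hand-shifted one.
The function names (swap_int, scale_by_four, shift_by_two) are
made up for illustration and do not come from this thread.
Whether a particular compiler actually reduces the first two to
an exchange or shift instruction is an assumption that depends on
the compiler and the target.

```c
/* Obvious exchange through a temporary.  The intent is plain to the
   reader; reducing it to a single exchange opcode, where the machine
   has one, is left to the optimizer. */
void swap_int(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Written as the multiplication that is actually meant. */
unsigned scale_by_four(unsigned x)
{
    return x * 4;
}

/* The hand-"optimized" spelling.  For unsigned operands it yields
   the same value as x * 4, so it buys nothing in the source. */
unsigned shift_by_two(unsigned x)
{
    return x << 2;
}
```

Calling swap_int(&a, &b) with a = 3 and b = 7 leaves a = 7 and
b = 3, and scale_by_four(10) and shift_by_two(10) both return 40.
The only difference between the last two is how clearly the
source says what is meant.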