Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!mcsun!ukc!dcl-cs!aber-cs!athene!pcg From: pcg@cs.aber.ac.uk (Piercarlo Grandi) Newsgroups: comp.lang.misc Subject: Re: Definition of "Aggressive optimization" Message-ID: Date: 11 Nov 90 17:58:11 GMT References: <1990Nov6.183955.394@lpi.liant.com> Sender: pcg@aber-cs.UUCP Organization: Coleg Prifysgol Cymru Lines: 140 Nntp-Posting-Host: odin In-reply-to: rcg@lpi.liant.com's message of 6 Nov 90 18:39:55 GMT On 6 Nov 90 18:39:55 GMT, rcg@lpi.liant.com (Rick Gorton) said: rcg> Seriously, I want to know language (and furthermore, rcg> what compiler) is being used here that is being rcg> used as the basis for some of the discussion going on: barmar> So aggressive optimization refers to optimization that does barmar> things like interprocedural flow analysis? Are loop unrolling, barmar> strength reduction, and common subexpression elimination barmar> aggressive or not? pcg> They are all aggressive. Even substituting a shift for a pcg> multiplication or division by a power of two may be considered pcg> aggressive in certain languages where the properties of the two are pcg> not precisely identical. rcg> Huh? If the language spec says that shifting is not the same as rcg> multiplying for powers of two, then it is not an AGGRESSIVE rcg> optimization to convert the multiply into a shift, rather an rcg> ERRONEOUS one. And not just at the intermediate language level. Well, I once called aggressive an optimization that involves considering the meaning of multiple language primitives. Just to continue the contrived example of shift vs. multiplication: if the language says that *in every relevant case* shift and multiplication by 2^N, then they are in effect the same thing (for those relevant cases) and can be translated in the same way. But: if the language for example says that shift and multiplication by 2^N ar eonly equivalent if both operands are positive, and at least one of the operands is a variable, then the correctness of the optimization depends on analysis of the flow of values, and is thus aggressive. barmar> The only optimizers that are independent of the primitive barmar> operators are peephole optimizers. pcg> Yep, yep. Long live peephole optimizers! rcg> Yes, peephole optimizers are usually independent of the rcg> intermediate language, but in the case specified (about no shifting rcg> instead of multiplying) It is REQUIRED that the peepholer know some rcg> things about the language involved, if it doesn't have hooks into rcg> the intermediate language or symbol table to get storage rcg> characteristics/requirements, etc. Why? The peepholer need only know the properties of the machine, not of the language being compiled. In other words it can rewrite the generated code in any way it likes, independently of the source language, as long as it can prove that in the given context the two machine instruction sequences are equivalent. Note that a peephole optimizer can be aggressive, and thus more or less error prone -- if the machine architecture gives simple rules for instruction sequence equivalencing then probably the peephole rewriter will be less buggy and more aggressive than one that has to deal with a more difficult architecture. Note that IMNHO aggressive peephole optimizers are not a good bet either. [ ... a clever code generation example ... ] rcg> A good peepholer is capable of doing all this, and much more. In rcg> fact, there are times when peepholers can (and will) monkey with rcg> loops, and pull things out of them that don't need to be in them. rcg> But none of this can be done if the language specifies that this rcg> behavior is incorrect. Assuming that the generated machine code is correct, the source language does not matter. Of course the peepholer can work better if it makes use of contextual information derived from the source, but this is not necessary or desirable. pcg> I intended to say that the code that effects the transformation adds to pcg> the size and complexity of the compiler and therefore affects the pcg> compiler's reliability....You have a large amount of hairy code in the pcg> compiler's bowels, and this is baaaad. rcg> Having large amounts of difficult code is only bad if you find rcg> bugs or limitations in it that require major rewrites. Ahem. Software physics... It can also be bad if the bugs affect the correctness of every other program in a system. A code generator is not just difficult, it is critical. Maybe the most critical code in a system. You know, even little trivial bugs, than can be corrected with a one line patch, in a code generator can cause major problems: not in the compiler, which still works 99%, but in all the compiled programs, which may stop working at all in unanticipated ways. Do we want to increase the chances for any type of bug in the code generator? Maybe, if the programmers ("dusty programmer" problem) can be expected to be even buggier. rcg> In this case, we are no longer talking about aggressive rcg> compilation, rather software [re]design, engineering, and rcg> development issues. Yep, precisely. But the motivation for doing aggressive optimization in the first place is precisely engineering, i.e. cost effectiveness. Aggressive optimization is not *necessary* for correctness or functionality, only for reducing one of the costs of a program. The probability of increasing the problems in a compiler is a software engineering issue, just as the probability of increasing the speed at which programs execute, just as are the consequences of using a code generator level aggressive optimizer versus a source level rewriter. These are all difficult sw engineering issues. rcg> So how about some POSITIVE discussions about what kind of features rcg> good software productivity tools have/need, instead of this rcg> continuous optimization-bashing festival? Note that I am not against optimization (intended as translation to optimal code) at all, and not even against aggressive optimization; the former is IMNHO very important, and the latter is too, as long as it is performed on the source, or however outside the compiler. All the difficult problems of aggressive optimization, stemming from program logic analysis and rewriting, are very interesting and important, only I don't think that the best place where to solve them is the code generator. Source level analysis and improvement is, in the EDP world, a very important and commercially profitable activity. EDP organizations spend immense amounts of money each year in cleaning, maintaining, converting dusty decks. What I would like to see is more of this going on in the scientific sector as well. But maybe many numerical analysts (because it is about these that we have been speaking mostly) are more stuffy than EDP people, and they resist any idea of learning anything beyond what they managed to get in their undergraduate Fortran or C classes (I'm not a programmer! Fortran II is perfectly adequate for my research, and besides I am too busy!). With Herman Rubin as a counterexample, just to to confirm the rule, maybe. -- Piercarlo Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk