Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!dali.cs.montana.edu!uakari.primate.wisc.edu!aplcen!boingo.med.jhu.edu!haven.umd.edu!uflorida!travis!tom
From: tom@ssd.csd.harris.com (Tom Horsley)
Newsgroups: comp.sys.m88k
Subject: Re: Harris NightHawk
Message-ID:
Date: 17 May 91 10:59:54 GMT
References: <14836@encore.Encore.COM>
Sender: news@travis.csd.harris.com
Distribution: comp
Organization: Harris Computer Systems Division
Lines: 64
In-reply-to: soper@encore.UUCP's message of 16 May 91 15:18:24 GMT

soper> What does the Harris compiler do that gcc does not do?

Lots of things. Just off the top of my head:

   * We generate pretty good local code. The last time I looked, gcc
     didn't generate very good code for even simple expressions
     (regardless of any optimization being done).

   * The optimizer has more features than the gcc optimizer (I believe
     loop unrolling is one example, I don't think gcc does that, and
     loop unrolling is fantastically important if you want to get
     basic blocks with enough instructions in them so the instruction
     scheduler can keep the pipes full).

   * The optimizer goes to a lot of trouble to apply profitability
     analysis to optimizations so we only do them if we are pretty
     sure they will actually improve the generated code. Most
     compilers (even ones touted as being highly advanced optimizing
     compilers) tend to apply optimizations blindly, simply because it
     is possible to do them. The number of machine dependent
     interactions that can make an optimization unprofitable is
     unbelievable, and if you don't take them into account, any given
     optimization is just as likely to make the code worse as it is to
     improve it.

   * The register allocator tries to bind registers so as to avoid
     making the instruction scheduler block. By avoiding reuse of the
     same register in instructions that are "close", the register
     allocator allows the instruction scheduler more flexibility in
     moving instructions around.

   * If the job the register allocator did of binding registers turns
     out not to be good enough, the instruction scheduler can permute
     the register bindings to eliminate data dependencies that would
     otherwise prevent it from moving an instruction.

   * The compiler passes down a lot of alias information to the
     instruction scheduler level, giving the instruction scheduler
     more choices in its ability to shuffle around loads and stores.

   * We have a post-linker optimizer that can eliminate most of the
     or.u instructions that are needed as the first instruction in the
     two-word instruction sequence that is normally used to reference
     static memory locations (it uses the "linker registers" as
     program global CSEs to hold the 4 most common or.u values).

I suspect the most recent 88k gcc compilers generate better code than
the earlier one we examined in detail (it was from Data General and
claims to be version 1.37.26). We compared the code from gcc, Green
Hills, Diab, and our own compilers, and gcc lost every time.

I freely admit that we pay a price for generating terrific code: our C
compiler (if you turn on all optimizations) definitely uses more
memory and runs slower than gcc. We have been working on that and are
making some pretty good progress in reducing the memory it uses, but I
doubt we will ever be able to generate the code we do while being as
small and fast as gcc. Don't get me wrong though, our compilers are
reasonably fast, just not as fast as gcc.
--
======================================================================
domain: tahorsley@csd.harris.com       USMail: Tom Horsley
  uucp: ...!uunet!hcx1!tahorsley               511 Kingbird Circle
                                               Delray Beach, FL  33444
+==== Censorship is the only form of Obscenity ======================+
|     (Wait, I forgot government tobacco subsidies...)               |
+====================================================================+
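
For readers who have not seen it written out, the loop unrolling and
register reuse points above can be sketched by hand in C. This is only
an illustration, not output from the Harris compiler or from gcc; the
function names sum_rolled and sum_unrolled4 and the unroll factor of 4
are assumptions made up for the example. The unrolled version gives
the loop body four independent adds instead of one, and keeping the
partial sums in separate variables avoids reusing one register in
instructions that are "close", so a scheduler would be free to reorder
them.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one load and one add per iteration, so the
 * loop body is a very small basic block. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hand-unrolled by 4 (hypothetical factor chosen for illustration).
 * Each partial sum lives in its own variable, so the four adds do not
 * depend on one another and can be scheduled in any order. */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    assert(a != NULL || n == 0);
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Cleanup loop for the 0 to 3 elements left over. */
    for (; i < n; i++)
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same result for any input; whether the
unrolled form is actually faster depends on the machine, which is the
point of the profitability analysis mentioned above.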