Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!uflorida!travis!tom
From: tom@ssd.csd.harris.com (Tom Horsley)
Newsgroups: comp.sys.m88k
Subject: Re: Fastest 88k
Message-ID:
Date: 1 Nov 90 12:22:49 GMT
References: <1172@iceman.jcu.oz> <42586@mips.mips.COM> <42593@mips.mips.COM>
Sender: news@travis.csd.harris.com
Followup-To: comp.sys.m88k
Organization: Harris Computer Systems Division
Lines: 53
In-reply-to: mash@mips.COM's message of 1 Nov 90 01:43:18 GMT

>>>>> Regarding Re: Fastest 88k; mash@mips.COM (John Mashey) adds:

mash> Now, just out of curiosity:
mash> a) What are the numbers you get using the current production compilers?
mash> b) About how far apart (in time) are those two versions?
mash> c) Do you feel that tuneups done to improve SPEC numbers carry over
mash> into improvements on other programs ... or not?

mash> Any comments on those that you'd be willing to make would be good...
mash> especially item c) would be interesting to a lot of people.

a) I didn't pay a lot of attention to the numbers for the released
compiler because we already had many of the optimizations under
development at that time and knew we would get a lot better. All I
remember was that the number was better than 15.2, but the individual
benchmark results were much more mixed (and the number was a lot closer
to 15.2 than the 17.1 we are getting today).

b) It is difficult to say how far apart in time the compilers are, since
the advanced development was going on at the same time as a different
baseline was being stabilized and packaged up for the release. A
ball-park figure would be "a few months".

c) I would say that all the improvements we made are generally useful.
We look at a lot more benchmarks than just SPEC (some of them are rather
large real customer applications, or benchmarks derived from those
applications).
We like to pick which optimizations to work on based on cost/benefit
analysis - if we don't see the need for something in a lot of places, we
generally don't work on it.

Some of the SPEC benchmarks reacted fairly dramatically to some of our
optimizations, but the optimizations were not designed specifically to
get that reaction from SPEC.

For example: the biggest single improvement came from a combination of
loop-unrolling, teaching the instruction scheduler how to safely shuffle
some loads past some stores (to keep the data unit pipeline going), and
teaching the register allocator to pick registers in such a way as to
allow the instruction scheduler maximum flexibility (to keep the
floating point pipeline going). All of this is great stuff and is useful
in almost any program.

The SPEC matrix300 benchmark, however, spends 99.9% of its time in a
single matrix multiply-and-add loop. When the above set of optimizations
hit the matrix300 benchmark, the performance skyrocketed.

This does not mean our optimizations are not generally useful, but it
does mean that real programs which do actual work may not see a similar
performance boost (but they certainly should get better).
--
======================================================================
domain: tahorsley@csd.harris.com       USMail: Tom Horsley
  uucp: ...!uunet!hcx1!tahorsley               511 Kingbird Circle
                                               Delray Beach, FL  33444
+==== Censorship is the only form of Obscenity ======================+
|     (Wait, I forgot government tobacco subsidies...)               |
+====================================================================+
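
For readers who want to see what the loop-unrolling and load/store
reordering described above amount to, here is a minimal hand-written
sketch in C. It is only an illustration, not output from the Harris
compiler and not the matrix300 source (which is Fortran). The function
names madd_simple and madd_unrolled are hypothetical, and the use of
restrict to promise that x and y do not overlap is an assumption standing
in for whatever alias analysis a scheduler would need before moving loads
past stores.

```c
#include <stddef.h>

/* Hypothetical example: a multiply-and-add inner loop, y[i] += a * x[i].
   Each iteration loads, computes, and stores before the next one starts. */
void madd_simple(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* The same loop unrolled by four, written by hand for illustration.
   All eight loads are issued before any of the four stores. That
   reordering is only safe if x and y do not overlap, which restrict
   asserts here (an assumption of this sketch). The eight separate
   temporaries mimic a register assignment that leaves the four
   multiply-adds independent of one another. */
void madd_unrolled(size_t n, double a,
                   const double *restrict x, double *restrict y)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        double x0 = x[i],     x1 = x[i + 1];
        double x2 = x[i + 2], x3 = x[i + 3];
        double y0 = y[i],     y1 = y[i + 1];
        double y2 = y[i + 2], y3 = y[i + 3];

        y[i]     = y0 + a * x0;
        y[i + 1] = y1 + a * x1;
        y[i + 2] = y2 + a * x2;
        y[i + 3] = y3 + a * x3;
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        y[i] += a * x[i];
}
```

Both functions compute the same result when x and y are distinct arrays.
The unrolled form simply gives a pipelined machine more independent
loads and floating point operations to overlap, which is why a program
dominated by one such loop can react so strongly.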