Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!tut.cis.ohio-state.edu!snorkelwacker!spdcc!ima!cfisun!stardent!wright From: wright@stardent.Stardent.COM (David Wright @stardent) Newsgroups: comp.arch Subject: Re: Bloat costs Message-ID: <1990Jun27.011149.2406@Stardent.COM> Date: 27 Jun 90 01:11:49 GMT Distribution: na Organization: Stardent Computer, Newton MA Lines: 66 In article <62777@sgi.sgi.com>, karsh@trifolium.esd.sgi.com (Bruce Karsh) writes: >> -- me > - Karsh >>But I must take issue with statements like: >>In order to make the FFT be >>really efficient, it's necessary to pretabulate trig tables and powers of 2, >>use tricky methods to perform the multiplications, arrange the code so that >>the integer and floating point calculations are overlapped... etc. When you >>do all that, you wind up with a pretty fast, but very unclear program. >I've never seen an optimizing compiler that will do any of these optimizations. >It's a widely held belief that compiler optimizations can get the most out >of an algorithm, but it's hardly ever true. There are optimizations that >compilers can do and optimizations that they can't do. These are in the >later set. Whoops. My fault for being unclear. I didn't mean to imply that compilers were smart enough to figure out how to pretabulate the trig tables. Powers of two, maybe. However, you're going to have to be more specific about what you mean by overlapped adds and multiplies that the compiler isn't smart enough to reorganize. Seems to me that with the advent of superscalar chips, for example, the compilers damn well better be smart enough to do this. Or are we not talking about the same thing? >Is there a C compiler which will recognize a complex multiplication and >generate code which needs 3 multiplications instead of four? Perhaps they >ought to exist, but the don't. I don't believe that compiler techniqes yet >exist to make such a compiler. But when you say "complex multiplication", exactly what do you mean? An example would be illuminating here. >Assembly language is just one of the tools for >solving these problems. There seems to be an attitude that these >problems aren't important, or are too "messy" to be dealt with. Sometimes >the result of this attitude is uncompetively slow systems. Could be. But I'm not trying to argue against the use of assembler where it's needed; the question is "where is it needed?" As an OS programmer, I'm apt to be in situations where assembler is the only thing I can use, and sometimes, as in your example of bcopy, it's just plain faster to do it in assembler. But obviously we're only going to do this where it makes a difference that matters. >Quicksort is nowhere near optimal. It's worst-case behavior is O(n*n). Golly gee, no kidding. But its expected behavior (unless it's really stupidly coded) is O(n*log(n)). I'm not arguing that the programmer doesn't need to understand the algorithm, nor am I arguing that for the utterly fastest behavior, the programmer can avoid looking at what the compiler produces (I do this myself). However, sometimes the result of this is a note to the compiler developers that results in faster code. And sometimes it's the pleasant discovery that the compiler did just what I wanted it to do. And my compiler can beat up your compiler :-) -- David Wright, not officially representing Stardent Computer Inc wright@stardent.com or uunet!stardent!wright After all, this is still the land of opportunity. If you know where to look. -- Jack Douglas