Path: utzoo!attcan!cmtl01!matrox!uvm-gen!uunet!lll-winken!ames!pasteur!ucbvax!decwrl!decvax!ima!haddock!suitti
From: suitti@haddock.ima.isc.com (Stephen Uitti)
Newsgroups: comp.sys.mac.programmer
Subject: Re: Code generation in LSC
Message-ID: <11536@haddock.ima.isc.com>
Date: 25 Jan 89 20:59:37 GMT
References: <1112@dogie.edu> <6337@hoptoad.uucp> <5623@phoenix.Princeton.EDU> <11494@haddock.ima.isc.com> <5801@phoenix.Princeton.EDU>
Reply-To: suitti@haddock.ima.isc.com (Stephen Uitti)
Organization: Interactive Systems, Boston
Lines: 149

In article <5801@phoenix.Princeton.EDU> mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel) writes:
>In article <11494@haddock.ima.isc.com> suitti@haddock.ima.isc.com (Stephen Uitti) writes:
>>
>>Some time ago, I ran some C benchmarks on a Mac II & Sun III.
>>...The Mac tended to have run times that
>>were very close to those of the Sun, with some run times actually
>>faster than the Sun.
>
>Hmm.  This is surprising.  What kind of benchmarks did you use?
>How did you time them?

The benchmarks were non-floating point.  The sieve, for example,
was faster on the Mac.  Embedded time calls were used.  Wall-clock
timing was used as a cross-check that the reported times were at
least approximately correct.  All runs took longer than 30 seconds.

>> "Even on microcomputers" is incorrect.
>>The compilers for PCs and Macs are MUCH better than for larger
>>machines.  It even makes sense.  There is more money in it.
>
>In terms of compile time and overall convenience, undoubtedly.
>Code generation?  No.

PCC-based compilers are still far more common than GNU C for
UNIX.  MSC (for the PC) claims to do all sorts of interesting
things.  Global registers for the 8086 are probably a loss, since
there aren't lots of registers...  I find that Turbo C produces
code that is smaller than, and about the same speed (see below) as,
MSC's on the same machine.  It also compiles at least three times
faster.

Of course, "microcomputer" compilers are also closer to
supporting ANSI C (prototypes, etc.).  I say "microcomputer", but
my Mac II is at least 2.5 times faster than a 780 (though an SE
will beat a Mac II in a foot race down the hall).  I really mean
"personal" or "home" computer.

>Just looking at some of my programs with MacsBug, I can see
>many _obvious_ inefficiencies that even a peephole optimizer could remove:
>
>        MOV.L -(SP), D0
>        MOV D0, D6
>or reloading the same expression which was already in a register.

Can you see labels with MacsBug?  Could it have been:

        MOV.L -(SP), D0
foo:
        MOV D0, D6

Anyway, I've seen this type of thing slip through (even without
intervening labels) with the optimizers of PCC-based compilers.
There are sequences where, in the above example, the compiler
really did want the value in both registers...  Anyway, LSC is
not worse than large systems compilers here.

>LSC doesn't do other kinds of optimizations such as turning
>
>for(i=0; i<=num; i++)
>       d += a[i];
>
>into
>
>register int *p;
>for(p=a; p< a+num; p++)   /* a+num should _not_ be computed in each iteration */
>       d += *p;           /* of the loop! */

Some compilers will figure out what to stuff into registers and
do loop invariant type stuff (MSC for the PC).  My code specifies
the use of registers, in order of preference, and tends not to
have loop invariants, so those optimizations are wasted on it.
Other optimizations are wasted as well.  Compilers which attempt
to do this for me tend to be broken (MSC for the PC, compilers
for Cyber 205, IBM RT, etc.): turning off the optimizer tends to
make my code work.

Another odd thing that has come up is that when I have written
code that explicitly removes loop invariants, the optimizers of
some of the "smarter" compilers tend to slow my code down.  It
seems that they allocate additional variables to point into the
arrays, and even update the redundant copies.
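
For concreteness, here is a minimal C sketch of the hand
transformation being argued about.  The function names sum_indexed
and sum_pointer are made up for illustration, and nothing here is
output from LSC or any other compiler.  Note that the two quoted
loops differ by one in their bounds (i<=num versus p<a+num); the
sketch treats num as an exclusive bound in both forms so that they
sum the same elements.

```c
#include <stddef.h>

/* Indexed form: the address of a[i] is derived from the base
 * pointer and the index on every pass through the loop. */
long sum_indexed(const int *a, size_t num)
{
    long d = 0;
    size_t i;

    for (i = 0; i < num; i++)
        d += a[i];
    return d;
}

/* Hand-optimized form: walk a pointer through the array.  The
 * loop-invariant end address (a + num) is computed once, before
 * the loop, and both pointers are requested as registers. */
long sum_pointer(const int *a, size_t num)
{
    register const int *p;
    register const int *end = a + num;   /* hoisted invariant */
    long d = 0;

    for (p = a; p < end; p++)
        d += *p;
    return d;
}
```

Both functions return the same sum for the same array and count.
Whether the second form is actually faster depends on the compiler
and the machine, which is the point under dispute.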

>For these trivial examples, it's no big deal, but in complicated
>computationally-intensive programs, all of these types of optimizations
>combine and can be very significant.

For large programs, a profiler lets you concentrate on the portion
of the program that actually matters.  Generally speaking, this
lets you use a quick compiler and still outperform an optimizing
compiler.

>Note that I generally write scientific programs with lots of loops, &
>arrays and such in which good optimization can make a big difference.

Floating point?

>I suspect that many Mac applications are of the type
>...
>and so these kinds of "global" optimizations aren't so important.
>Intelligent register usage should always be a win, though!

Mac applications mostly spend their time waiting for the next
event (polling).  However, when they do something, they often
want to do it quickly, as they do not want to appear sluggish.

>>Would gcc be better than LSC?  Well, gcc would never produce code as
>>quickly, gdb would never be as nice as LSC 3.0, you'd need "make",
>>etc.  The code *might* be as fast as LSC's code.
>
>If it were _only_ as fast as LSC, I'd think it were broken! :)

The main reason it wouldn't be faster is that a typical LSC
edit/compile/run/debug loop will be an order of magnitude better.
LSC also has a (simple) profiler.

>Seriously, I've looked at the output from good optimizing compilers,
>and in general, they do a better job than I (admittedly a non-expert
>assembly programmer) could do without a very large amount of effort.

I've yet to see an optimizing compiler that did as well as I
would do.  When doing assembler, I count bytes and cycles.  I
optimize register usage.  One gets used to it.  The painful part
of assembly is when you've got code that works & have discovered
a neat new way of doing the problem...  It is often hard enough
to get yourself to redo it when using a higher level language.

>Looking at the output of the MIPSco C compiler, I am completely
>astounded as to the transformational magic that it manages to perform
>on my programs.  (Then again, BASIC would scream on a MIPS)

When I look at what PCC does to my code, and what the optimizer
then undoes, I'm amazed.  Still, optimized code (for me) is easier
to debug... if I'm forced to debug in assembly.

>> It probably wouldn't
>>be as fast, since one would probably make ints 32 bits.
>Quite true.  But isn't this true only for a 68000, i.e. shouldn't
>a Mac II be as fast w/ 32 bit ints as 16?  (I think that some minis
>are _slower_ accessing 16 bits rather than 32!)

No.  For whatever reason, 16 bits are considerably quicker.

>NOTE:  My direct experience has _ONLY_ been with LSC 2.15.  If things
>have changed significantly, please correct me!  Thanks.

LSC 3.0 (etc.) supports floating point on the 68881.  When used,
this is a win.  3.0 has a serious debugger.  It is really awesome
when your system has more than one screen (like mine does).

Summary:  I'd prefer a compiler that spits out reasonable code
infinitely fast (such as LSC) to one that spends all day working
on it.  LSC is not as bad as you think.  "Mainframe" optimizing
compilers are not as good as you think.  I believe that the LSC
compiler can be improved, but that they are going about it in the
correct manner.

>Matt Kennel
>mbkennel@phoenix.princeton.edu
>"Assume a spherical cow."

Stephen.
"Everything in moderation.  Including moderation."