Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!auspex!guy
From: guy@auspex.auspex.com (Guy Harris)
Newsgroups: comp.arch
Subject: Re: Compilers and efficiency
Message-ID: <7184@auspex.auspex.com>
Date: 15 Apr 91 23:44:25 GMT
References: <9782@mentor.cc.purdue.edu> <7117@auspex.auspex.com> <1406@ncis.tis.llnl.gov>
Organization: Auspex Systems, Santa Clara
Lines: 39

>Guy Harris's example (if I remember correctly) was that no language he
>knew of had semantics for retrieving both the integer quotient and the
>remainder from a floating divide, the point being that both values are
>usually available from any implementation of floating point divide, but
>that the language stands in the way of getting them at the same time.

I don't think you remember correctly; if *I* remember correctly, I
didn't give any particular example, and I certainly didn't give *that*
example. *Herman Rubin* made the claim that no language he knew of had
those semantics, and at least two other people jumped in to claim that
Common LISP *did* have those semantics.

I was mainly thinking of, indeed, such things as the added addressing
modes; they may increase code density, but do they complicate
instruction decoding and slow the machine down there? And what about
some of the more elaborate procedure-calling, procedure-entry, or
procedure-exit instructions?

Admittedly, to some extent, the problems with the more elaborate
features may come from either 1) sloppy implementations of them or
2) using the elaborate features even when inappropriate, e.g. not
making use of simplifications that can be done at code-generation time,
such as better management of registers. Both seem to come, at least in
part, from the notion that "well, it's *one instruction*, that means it
*has* to be fast!" - i.e., since it's a single instruction, they didn't
worry about making it fast, or making the common case fast, and/or
assumed it was *always* the right thing to do to use that instruction.

As an example of 1), the CCI Power 6/32, as I remember, did a much
better job of implementing a VAX-like CALLS instruction than did the
VAX-11/780; one trick I remember them doing was to generate the fields
of the stack frame in order, so that the stores that built the stack
frame would work well with interleaved memory. They also stored
*decoded* instructions, rather than code bytes, in the instruction
cache, as a way of avoiding the overhead of decoding VAX-style
instructions.

As an example of 2), consider, say, treating leaf procedures
differently - can you get away with doing less than a full procedure
entry?
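
To make the leaf-procedure question concrete, here is a minimal sketch in C of the decision a code generator could make at procedure entry. Everything in it is an assumption for illustration: the toy machine, the size of its callee-saved register set, the per-step operation counts, and the names `proc_info`, `full_entry_ops`, and `tailored_entry_ops`. It does not model the VAX CALLS instruction or the CCI implementation; it only shows that a compiler which knows a procedure makes no calls and needs no frame can emit far less than a one-size-fits-all entry sequence.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed size of the toy machine's callee-saved register set. */
enum { CALLEE_SAVED_REGS = 6 };

/* Hypothetical facts a compiler knows about a procedure at
   code-generation time. */
struct proc_info {
    bool makes_calls;  /* false for a leaf procedure */
    int  saved_regs;   /* callee-saved registers the body clobbers */
    int  frame_bytes;  /* stack space needed for locals and spills */
};

/* "Always do the full entry": the same sequence for every procedure,
   regardless of what the body actually needs. */
int full_entry_ops(void)
{
    return 1                  /* save return address */
         + 1                  /* save and set frame pointer */
         + CALLEE_SAVED_REGS  /* save every callee-saved register */
         + 1;                 /* adjust stack pointer */
}

/* Tailored entry: emit only the steps this procedure requires. */
int tailored_entry_ops(const struct proc_info *p)
{
    int ops = 0;

    assert(p->saved_regs >= 0 && p->saved_regs <= CALLEE_SAVED_REGS);

    if (p->makes_calls)
        ops += 1;             /* return address must survive nested calls */
    if (p->frame_bytes > 0)
        ops += 2;             /* frame pointer setup plus stack adjust */
    ops += p->saved_regs;     /* save only the registers actually clobbered */

    return ops;
}
```

Under these assumed counts, a leaf procedure that needs no frame and clobbers no callee-saved registers costs nothing at entry, while the uniform sequence always costs the same fixed amount. Whether that saving matters on a real machine depends on how well the hardware implements the elaborate instruction, which is point 1) again.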