Path: utzoo!attcan!uunet!cs.utexas.edu!sun-barr!apple!amdcad!sun!exodus!rbbb.Eng.Sun.COM!chased
From: chased@rbbb.Eng.Sun.COM (David Chase)
Newsgroups: comp.arch
Subject: Re: speculative execution
Message-ID: <1252@exodus.Eng.Sun.COM>
Date: 10 Oct 90 17:22:00 GMT
References: <1990Oct9.162639.23516@rice.edu> <3431@bnr-rsc.UUCP> <1990Oct9.224312.2031@rice.edu> <3432@bnr-rsc.UUCP>
Sender: news@exodus.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, Ca.
Lines: 45

In article <3432@bnr-rsc.UUCP> bcarh185!schow@bnr-rsc.UUCP (Stanley T.H. Chow) writes:
>In article <1990Oct9.224312.2031@rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>>Given the same number of functional units, with same latencies,
>>same number of registers, ...
>>I believe I can write a compiler that will make code run
>>faster than hardware that does speculative execution.
>
>I assume you mean given two machines, otherwise identical, differing
>only in that M1 does hardware speculative execution and M2 does not,
>you can write a compiler that will make M2 run faster than M1.
>
>This sounds wrong. I do not doubt that you can speed up M2 by speculative
>execution (with hoisting, etc). But surely the same technology can be
>applied to M1 with the same result. A priori, I would expect the benefit
>to be the same for both machines.

One point was not sufficiently emphasized in Preston's posting:
"SAME NUMBER OF REGISTERS". A reasonable trick in hardware
speculative execution is to make up boatloads of secret registers to
hold the results of speculative computations. Preston says, "expose
those registers to the compiler", which of course means "change your
architecture", which is typically not acceptable to a company with
customers committed to an existing architecture (be it VAX, MIPS,
SPARC, RS/6000, 370, 80386, whatever).

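To make the register point concrete, here is a minimal sketch of what
"let the compiler do the speculation" looks like at the source level.
It is only an illustration, not anything from Preston's posting: the
function names `branchy` and `hoisted` and the temporary `spec` are
made up, and Python stands in for whatever the compiler would really
emit. The thing to notice is that the hoisted multiply needs somewhere
to live until the branch resolves, and that somewhere is one more
register the compiler has to be able to name.

```python
def branchy(a, b, c):
    """Baseline: the multiply is issued only after the test resolves."""
    if a > 0:
        t = b * c        # starts late: waits for the branch outcome
        return t + 1
    return a - 1


def hoisted(a, b, c):
    """Compiler-style speculation: the multiply is hoisted above the test.

    `spec` (a hypothetical name) plays the role of the extra,
    architecturally visible register that must stay live across the
    branch. Hoisting is only legal here because the multiply has no
    side effects (no traps, no stores).
    """
    spec = b * c         # speculative: wasted work if a <= 0
    if a > 0:
        return spec + 1
    return a - 1         # spec is simply discarded on this path


# Both versions compute the same results; only the schedule and the
# number of values live across the branch differ.
print(hoisted(1, 2, 3), hoisted(-1, 2, 3))
```

Hardware speculation would get the same overlap by parking the product
in a hidden register; the compiler version only works if a spare
visible register is available for `spec`.
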
Preston is also assuming, probably correctly, that it requires less
hardware to expose the registers and let the compiler juggle them than
it does to hide the registers and manage the juggling on the fly in
hardware. Less hardware means a number of things (possibly):

 a) higher yield
 b) more room for other stuff (cache, tlb, whatever)
 c) choice of a faster, less compact, or more power-hungry technology
    (e.g., GaAs or ECL)
 d) shorter critical path for clock cycle
 e) shorter pipeline.

Thus, if you sell enough chips to recover your investment in
high-powered compiler technology, then you win. Big IF there, of
course, and you'll be selling a new architecture, and we haven't said
much about context switching or any other OS-related issues yet,
either.

David Chase
Sun Microsystems, Inc.