Path: utzoo!attcan!uunet!cs.utexas.edu!sun-barr!apple!amdcad!sun!exodus!rbbb.Eng.Sun.COM!chased
From: chased@rbbb.Eng.Sun.COM (David Chase)
Newsgroups: comp.arch
Subject: Re: speculative execution
Message-ID: <1252@exodus.Eng.Sun.COM>
Date: 10 Oct 90 17:22:00 GMT
References: <1990Oct9.162639.23516@rice.edu> <3431@bnr-rsc.UUCP> <1990Oct9.224312.2031@rice.edu> <3432@bnr-rsc.UUCP>
Sender: news@exodus.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, Ca.
Lines: 45

In article <3432@bnr-rsc.UUCP> bcarh185!schow@bnr-rsc.UUCP (Stanley T.H. Chow) writes:
>In article <1990Oct9.224312.2031@rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>>Given the same number of functional units, with same latencies,
>>same number of registers, ...
>>I believe I can write a compiler that will make code run
>>faster than hardware that does speculative execution.

>I assume you mean given two machines, otherwise identical, differing
>only in that M1 does hardware speculative execution and M2 does not,
>you can write a compiler that will make M2 run faster than M1.

>This sounds wrong. I do not doubt that you can speed up M2 by speculative
>execution (with hoisting, etc). But surely the same technology can be
>applied to M1 with the same result. A priori, I would expect the benefit
>to be the same for both machines.

One point was not sufficiently emphasized in Preston's posting: "SAME
NUMBER OF REGISTERS".  A reasonable trick in hardware speculative
execution is to make up boatloads of secret registers to hold the
results of speculative computations.  Preston says, "expose those
registers to the compiler", which of course means "change your
architecture", which is typically not acceptable to a company with
customers committed to an existing architecture (be it VAX, MIPS,
SPARC, RS/6000, 370, 80386, whatever).

Preston is also assuming, probably correctly, that it requires less
hardware to expose the registers and let the compiler juggle them than
it does to hide the registers and manage the juggling on the fly in
hardware.  Less hardware means a number of things (possibly):

  a) higher yield
  b) more room for other stuff (cache, tlb, whatever)
  c) choice of a faster, less compact, or more power-hungry
     technology (e.g., GaAs or ECL)
  d) shorter critical path for clock cycle
  e) shorter pipeline.

Thus, if you sell enough chips to recover your investment in
high-powered compiler technology, then you win.  Big IF there, of
course, and you'll be selling a new architecture, and we haven't said
much about context switching or any other OS-related issues yet,
either.

David Chase
Sun Microsystems, Inc.