Path: utzoo!attcan!uunet!zephyr.ens.tek.com!uw-beaver!rice!titan.rice.edu!preston
From: preston@titan.rice.edu (Preston Briggs)
Newsgroups: comp.arch
Subject: Re: speculative execution
Message-ID: <1990Oct10.164353.21070@rice.edu>
Date: 10 Oct 90 16:43:53 GMT
References: <1990Oct9.212103.363@rice.edu> <12905@encore.Encore.COM>
Sender: news@rice.edu (News)
Organization: Rice University, Houston
Lines: 74

I wrote
>> In general, we need to be careful about fatally increasing
>> register pressure.  The i860's exposed pipeline provides an
>> elegant way out, allowing simple aborts of optimistic
>> computations by ignoring what's partially computed in
>> the pipe.

and

In article <12905@encore.Encore.COM> jkenton@pinocchio.encore.com (Jeff Kenton) writes:
>It would take a lot to convince me that the i860 is an elegant solution
>to anything.  No one has produced a compiler which can take advantage of
>the theoretically possible parallelism of the i860.  It's a very fast
>chip for certain kinds of applications, but I wouldn't call it elegant,
>or general purpose.

Lots of complaints here...

First, the exposed pipeline stuff.
If we've got an if-then that looks like this

	int-1
	int-2
	int-3
	if (something) {
	    pfmul.ss f3,f4,f0
	    pfmul.ss f0,f0,f0
	    pfmul.ss f0,f0,f0
	    pfmul.ss f0,f0,f5
	}
	fst.l f5,somewhere

The idea is that if something is true we multiply f3 and f4 together,
putting the result in f5.  Then we store f5.

So we can't optimistically perform the entire multiply
before knowing the value of "something", since f5 is live
on the false branch.  We can, however, hoist the initial
pipeline stages (perhaps overlapping them with earlier
pipeline computations).

	int-1, pfmul.ss f3,f4,f0
	int-2, pfmul.ss f0,f0,f0
	int-3, pfmul.ss f0,f0,f0
	if (something) {
	    pfmul.ss f0,f0,f5
	}
	fst.l f5,somewhere

The true path gets much shorter.  No increase in the path length of
the false path.  And no extra register required.

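To make the claim concrete, here is a rough Python sketch of the two
code sequences above.  It is a toy model, not a description of the real
chip: the names MulPipe, WARMUP and run are hypothetical, and it assumes
a three-stage multiplier pipe, that f0 reads as zero and discards
writes, and that each hoisted pfmul.ss issues in the same slot as the
int-k it sits beside.  The operand values and the old contents of f5
are illustrative only.

```python
class MulPipe:
    """Toy model of an exposed multiplier pipeline (depth is an assumption)."""

    def __init__(self, depth=3):
        self.stages = [0.0] * depth                    # stages[-1] is the last stage
        self.regs = {f"f{i}": 0.0 for i in range(32)}  # hypothetical register file

    def pfmul(self, src1, src2, dest):
        """Advance the pipe one stage: pop the oldest product, push a new one."""
        popped = self.stages[-1]
        product = self.regs[src1] * self.regs[src2]
        self.stages = [product] + self.stages[:-1]
        if dest != "f0":          # assumption: f0 reads as zero, writes are dropped
            self.regs[dest] = popped


# The three pfmul.ss instructions that start the multiply and push it along.
WARMUP = [("f3", "f4", "f0"), ("f0", "f0", "f0"), ("f0", "f0", "f0")]


def run(hoisted, something, f3=3.0, f4=4.0, old_f5=-1.0):
    """Return (value of f5 at the store, issue slots used) for one execution."""
    pipe = MulPipe()
    pipe.regs.update(f3=f3, f4=f4, f5=old_f5)
    slots = 3                     # int-1, int-2, int-3 always issue
    if hoisted:
        for ops in WARMUP:        # assumed to pair with int-k, so no extra slots
            pipe.pfmul(*ops)
    if something:
        if not hoisted:
            for ops in WARMUP:    # original code: warm-up sits inside the branch
                pipe.pfmul(*ops)
                slots += 1
        pipe.pfmul("f0", "f0", "f5")   # last step pops the product into f5
        slots += 1
    return pipe.regs["f5"], slots      # fst.l f5,somewhere (store not counted)


for hoisted in (False, True):
    for something in (True, False):
        print(hoisted, something, run(hoisted, something))
```

Under these assumptions the true path drops from 7 issue slots to 4,
the false path stays at 3, and on the false path f5 still holds its old
value in both versions, since the partial product is simply left in
the pipe.
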
It's perhaps a dirty trick rather than elegant, but I try to describe
my ideas glowingly and reserve disparaging terms for other people's work.

The point, though, is that the exposed pipeline scheme requires fewer
registers because result registers are not frozen at the beginning of a
pipelined sequence, but at the end.  Similarly, the source registers
become available immediately after they are used.  In the example
above, f3 and f4 are immediately available after the 1st instruction
and f5 isn't required until the result pops out of the pipe.
Renaming helps, but requires more hidden registers that might
be used profitably by the compiler for other work.

Regarding compilers, I believe The Portland Group and Ardent
both have compilers that will take advantage of the pipelined
instructions.  Besides that, the i860 is a wonderful source of
thesis topics.

The i860 may not be your ideal chip, but it's chock full of ideas.
The good and useful ones shouldn't be ignored.

-- 
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu