Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!oliveb!apple!baum
From: baum@Apple.COM (Allen J. Baum)
Newsgroups: comp.arch
Subject: Re: How to use silicon (was Re: Intel/MIPS Dhrystone ratio)
Message-ID: <27711@apple.Apple.COM>
Date: 22 Mar 89 17:49:21 GMT
References: <37196@bbn.COM> <1989Mar16.190043.23227@utzoo.uucp> <24889@amdcad.AMD.COM> <355@bnr-fos.UUCP> <27600@apple.Apple.COM> <16080@cup.portal.com>
Reply-To: baum@apple.UUCP (Allen Baum)
Organization: Apple Computer, Inc.
Lines: 63

[]
>In article <16080@cup.portal.com> bcase@cup.portal.com (Brian bcase Case) writes:
(after quoting me about auto-increment, its cost, etc. ....)

>Exactly my point about superscalar.  But note that for the expense of the
>added data path (I assume it is essentially a duplicate of the primary
>integer (and/or) floating-point pipe), you can now execute *any two*
>instructions that don't have deleterious dependencies.  Sure, adding only
>the hardware needed for auto-increment is cheaper, but do you really want
>that garbage in your architecture forever?  When you do go to a
>super-scalar implementation (and you will, whomever you are, just to keep
>up with the joneses), you now have two data paths that have the added
>complexity of auto-increment.  Super-scalar is a good argument for simple
>architectures.

If auto-increment is frequent enough, then it can be done in addition to
executing *any two* operations at once. The leverage really hits you-
a 10 inst. loop, including a couple of auto-incs. shrinks to 5 if you can
average two instructions/cycle. At very little cost in hardware (I assert this
as a hardware design type), maybe this shrinks to 4 insts., a 20% saving.
Try to get 20% some other way- it's real tough! Your mileage may vary, of
course.

>>It's probably time to dust off those
>>benchmarks and see how often it occurs, and how many cycles it will save.
>
>Well, I'm all for simulation and experimentation.  If it is better and the
>cost now and in the future is not prohibitive, then great.  But it certainly
>isn't clear that auto-increment is the right thing!  My position is that it
>is reasonably clear that one should be skeptical.

Um, that was my point also, although perhaps I lean towards less skeptical
than you.

>> Since this kind of operation is used almost exclusively inside a loop,
>>it has quite a bit of leverage.
>
>Yes, this is true.  This is why one would like to look at the idea seriously.
>
>> Besides, who says you can't find something else to
>>do with the extra write port when you're not doing address updates?
>
>I'm surprised to hear you say that!  I think a more realistic outlook is
>to say that "Besides, who says you *can* find something else to do with
>the extra write port."  Conjecturing, instead of proving, that an added
>feature (with a significant cost) can be used for something else does not
>constitute the rigorous pursuit of good computer architecture.  Shame!  :-)
>:-) :-) :-) :-)

I did have something in mind for that hardware. I dispute the significant
cost issue- it is roughly equivalent to register scoreboarding logic, and if
you have that, the additional cost is small (again, I assert this in my
capacity as a hardware design type that has gone through the exercise).
I didn't conjecture that it might be used for something else, I know it can,
and I know the kind of speedup it will give me, as well as the extra cost to
use it for that something else.

This is an exercise for the reader-
Part A: what can an extra write port to a register file be used for (and
        what other hardware is required to make it useful)?
Part B: Now, suppose this extra write port can be a read/write port?

--
		  baum@apple.com		(408)974-3385
{decwrl,hplabs}!amdahl!apple!baum
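
For anyone who wants to check the 10 -> 5 -> 4 loop arithmetic in the post, here is a rough sketch. It assumes "a couple of auto-incs" means exactly 2, that the 5 and 4 are cycles per iteration, and that the two-issue machine packs instructions perfectly (no stalls, no dependencies). The function and variable names are made up for illustration; none of this models real hardware.

```python
import math


def cycles_per_iteration(n_insts, issue_width):
    """Ideal cycles per loop iteration, assuming perfect packing
    (no stalls, no dependency hazards). Hypothetical helper."""
    return math.ceil(n_insts / issue_width)


LOOP_INSTS = 10   # loop body size, from the post
AUTO_INCS = 2     # "a couple" read as 2 (an assumption)
ISSUE_WIDTH = 2   # average two instructions/cycle, from the post

# Plain two-issue machine: address updates occupy ordinary issue slots.
baseline = cycles_per_iteration(LOOP_INSTS, ISSUE_WIDTH)

# Same machine, but address updates ride along with their loads/stores
# and so no longer take an issue slot of their own.
folded = cycles_per_iteration(LOOP_INSTS - AUTO_INCS, ISSUE_WIDTH)

saving = (baseline - folded) / baseline

print(f"baseline: {baseline} cycles, with auto-inc: {folded} cycles, "
      f"saving: {saving:.0%}")
```

With a different loop size or a different count of address updates the saving changes, which is the "your mileage may vary" part.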