Path: utzoo!mnetor!uunet!husc6!purdue!gatech!amdcad!tim
From: tim@amdcad.AMD.COM (Tim Olson)
Newsgroups: comp.arch
Subject: Re: RPM-40 [really forwarding]
Message-ID: <20645@amdcad.AMD.COM>
Date: 4 Mar 88 01:55:59 GMT
References: <9727@steinmetz.steinmetz.UUCP> <9758@steinmetz.steinmetz.UUCP>
Reply-To: tim@amdcad.UUCP (Tim Olson)
Organization: Advanced Micro Devices
Lines: 44

In article <9758@steinmetz.steinmetz.UUCP> sungoddess!oconnor@steinmetz.UUCP writes:
| IMHO, a pipelined processor should run as fast as the its ALU
| lets it. ...
|
| Even a simple bypass path adds to this delay. It means
| that whatever the setup and delay times of this path,
| it must be added to the basic machine cycle time, IF
| that cycle time is determined by the ALU, as it SHOULD BE (IMHO).
| This is LESS of a penalty than adding a register access,
| but still a penalty. So is it a win ?

It depends upon how often ALU forwarding occurs (see below). If it is
frequent, it is much better to slow the pipeline by the small amount of
time it takes to forward the result, rather than stalling a whole cycle.
For example, if the cycle time through the ALU is 20ns, forwarding takes
2ns, and forwarding occurs for 30% of all instructions, then

                Processor A         Processor B
                (no forwarding)     (forwarding)
cpi             1.3                 1.0
cycle time      20ns                22ns
Raw MIPS        38.5                45.5

| To be honest, I don't know. Although I have read plenty of
| research on BRANCH latency, I haven't seen much research on
| how often ALU result latency would result in interlocks, or
| even on how often LOAD latency would result in interlocks.
| Perhaps John Mashey has. If so, I'd like to see the
| references. Until then, I don't know what John means when he
| says "any high-performance system" will :likely" have zero latency.
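
The Processor A / Processor B figures above follow from
raw MIPS = 1000 / (cpi * cycle time in ns). Here is a minimal sketch of that
arithmetic, assuming (as the example does) that without forwarding every
dependent instruction stalls exactly one cycle. The function names and the
break-even helper are illustrative only and do not come from any Am29000 tool.

```python
def raw_mips(cpi, cycle_ns):
    """Raw MIPS for a given cycles-per-instruction and cycle time in ns."""
    return 1000.0 / (cpi * cycle_ns)


def compare(alu_ns=20.0, forward_ns=2.0, forward_frac=0.30):
    """Return (MIPS without forwarding, MIPS with forwarding).

    Assumption: without a bypass path, each instruction that needs a
    forwarded result costs one extra (stall) cycle.
    """
    # Processor A: short cycle, but stalls on forward_frac of instructions.
    mips_a = raw_mips(1.0 + forward_frac, alu_ns)
    # Processor B: no stalls, but every cycle is stretched by the bypass delay.
    mips_b = raw_mips(1.0, alu_ns + forward_ns)
    return mips_a, mips_b


def break_even_fraction(alu_ns=20.0, forward_ns=2.0):
    """Forwarding frequency at which both designs run equally fast.

    Solves (1 + f) * alu_ns == alu_ns + forward_ns for f.
    """
    return forward_ns / alu_ns


if __name__ == "__main__":
    a, b = compare()
    print(f"A: {a:.1f} MIPS   B: {b:.1f} MIPS")
    print(f"break-even forwarding frequency: {break_even_fraction():.0%}")
```

Under these assumptions the bypass path pays for itself once more than
forward_ns / alu_ns of the instructions (10% for the 20ns / 2ns example)
need a forwarded result.
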
Here are some numbers from the Am29000 simulator running a small "nroff":

	instructions executed:                     89435
	instructions requiring alu forwarding:     41420 (46%)
	instructions forwarding from load buffer:  13669 (15%)

I haven't seen published studies on dynamic forwarding frequencies --
does anyone know of such papers?

	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)