Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!esosun!ucsdhub!sdcsvax!ucbvax!ji.Berkeley.EDU!shebanow
From: shebanow@ji.Berkeley.EDU (Mike Shebanow)
Newsgroups: comp.arch
Subject: Re: Branch prediction in the 532
Message-ID: <18576@ucbvax.BERKELEY.EDU>
Date: Sat, 25-Apr-87 02:12:44 EDT
Article-I.D.: ucbvax.18576
Posted: Sat Apr 25 02:12:44 1987
Date-Received: Sun, 26-Apr-87 05:40:01 EDT
References: <324@dumbo.UUCP> <165100008@uiucdcsb> <18537@ucbvax.BERKELEY.EDU> <1123@aw.sei.cmu.edu>
Sender: usenet@ucbvax.BERKELEY.EDU
Reply-To: shebanow@ji.Berkeley.EDU.UUCP (Mike Shebanow)
Distribution: world
Organization: University of California, Berkeley
Lines: 48
Keywords: branch prediction, branch target buffers, BTB

In article <1123@aw.sei.cmu.edu> firth@bd.sei.cmu.edu.UUCP writes:
>Mike, could you please elaborate on this, because after 2 days I find
>myself no nearer to understanding it.  The purpose of the "prediction"
>information seems to be to allow the processor to prefetch whichever
>of the branch continuations (successor or destination) is more likely
>to be needed.
>
>But if you have a branch cache, surely BOTH continuations are available
>already - one in the normal pipeline and the other in the cache.  So
>who cares which is needed? - and why bother to predict it?

Sure. For example, consider an instruction sequence like this:

(#1)	op1	a, b	{some don't care operation}
(#2)	op2	b, c	{some don't care operation}
(#3)	op3	a, c	{this one sets the condition codes}
(#4)	bequ	loc	{use the 'zero' bit in the condition codes}
(#5)	.....

Now assume that this is to run on a machine implemented using a 4-stage
pipe, with instruction #1 entering the pipe first:

    +---------------+---------------+---------------+---------------+
--->|   prefetch    |    decode     |    execute    |    retire     |--->
    +---------------+---------------+---------------+---------------+

cycle #
1	   (#1)
2	   (#2)		   (#1)
3	   (#3)		   (#2)		   (#1)
4	   (#4)		   (#3)		   (#2)		   (#1)

The question is what to fetch next: the target of the branch (the
instruction at loc) or the instruction following the branch (#5)?

5	   (loc)	   (#4)		   (#3)		   (#2)
OR!!!
5	   (#5)		   (#4)		   (#3)		   (#2)

You don't know which, because the branch depends on the condition codes set
by instruction #3, which is still in decode when the fetch decision has to
be made.  Without prediction, the pipeline would have to stall until #3
executes (at least). With prediction, the pipeline is kept full. The downside
of branch prediction shows up when you predict wrong: in that case, all
instructions fetched after the mispredicted branch must be flushed.
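
To put rough numbers on that trade-off, here is a minimal sketch in Python.
It is a back-of-the-envelope cycle count, not a model of the 532: the
function names, the one-cycle penalty, and the instruction counts are all
assumptions made up for illustration. It assumes the branch outcome is known
one cycle after the branch is fetched, so a stall and a wrong guess each
cost the same single bubble.

```python
PIPE_DEPTH = 4  # prefetch, decode, execute, retire (the pipe drawn above)


def base_cycles(n_instr):
    """Cycles to retire n_instr instructions when the pipe never bubbles.

    The first instruction needs PIPE_DEPTH cycles to reach retire; each
    later instruction retires one cycle after the one ahead of it.
    """
    return PIPE_DEPTH + (n_instr - 1)


def cycles_stalling(n_instr, n_branches, penalty=1):
    """No prediction: every conditional branch holds up fetch.

    penalty is the assumed number of bubble cycles per branch while the
    condition codes are being computed.
    """
    return base_cycles(n_instr) + n_branches * penalty


def cycles_predicting(n_instr, n_mispredicted, penalty=1):
    """With prediction: only the wrong guesses cost cycles.

    penalty is the assumed number of wrong-path fetches flushed per
    mispredicted branch.
    """
    return base_cycles(n_instr) + n_mispredicted * penalty


if __name__ == "__main__":
    print(cycles_stalling(100, 20))    # stall on all 20 branches
    print(cycles_predicting(100, 2))   # guess wrong on 2 of the 20
```

Under these assumptions, 100 instructions containing 20 conditional branches
take 123 cycles if the pipe stalls on every branch, and 105 cycles if it
predicts and gets only 2 of the 20 wrong. If every guess is wrong, this model
gives the same count as stalling.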

Sorry for the long-winded explanation,

Mike Shebanow