Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!dptg!ulysses!andante!alice!jmk
From: jmk@alice.UUCP (Jim McKie)
Newsgroups: comp.arch
Subject: Re: Filling branch delay slot with test
Message-ID: <9881@alice.UUCP>
Date: 11 Sep 89 04:44:07 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 48


To paraphrase from 'Architectural Innovations in the CRISP
Microprocessor', Spring COMPCON 87 Proceedings, by
Berenbaum, Ditzel and McLellan:

The CRISP CPU has 'branch folding': the Decoded Instruction Cache
contains a 'next address' field and also, in the case of conditional
branches, an 'alternate address' field.
Logic in the PDU (the Prefetch and Decode Unit) recognises when a
non-branching instruction is followed by a branching instruction and
'folds' the two instructions together.
This single instruction is then placed in the Decoded Instruction Cache.
The separate branch instruction disappears entirely from the execution
pipeline and the program behaves as if the branch were executed in
zero time.
The 'alternate address' field is used to hold the second possible
next instruction address for folded conditional branch instructions.
When an instruction folded with a conditional branch is read from
the instruction cache, one of the two paths for the branch is
selected for the next instruction address, and the address that was
not used is retained with the instruction as it proceeds down
the execution pipeline. The alternate address field is retained with
each pipeline stage only until the logic can determine whether the 
selected branch path was correct or not. When the outcome of the 
branch condition is known, if the wrong next address was selected,
any instructions in progress in the pipeline following the conditional
branch are flushed and the alternate address from the folded
conditional branch is re-introduced as the next instruction address
at the beginning of the execution pipeline.
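
As a rough illustration of the folding and recovery described above,
here is a small Python sketch. It is not taken from the paper: the
mnemonics ('jmp', 'if_jmp'), the fixed instruction size, and the
choice to keep the fall-through address in 'next address' and the
branch target in 'alternate address' are all assumptions made for
the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BRANCH_OPS = {"jmp", "if_jmp"}      # hypothetical mnemonics


@dataclass
class Instr:
    addr: int
    op: str
    target: Optional[int] = None    # branch target, for branches only
    predict_taken: bool = False     # the static prediction bit
    size: int = 2                   # assumed fixed size, for simplicity


@dataclass
class Decoded:
    """One Decoded Instruction Cache entry (layout is an assumption)."""
    addr: int
    op: str
    next_addr: int
    alt_addr: Optional[int] = None  # set only for conditional branches
    predict_taken: bool = False


def _paths(br: Instr) -> Tuple[int, Optional[int], bool]:
    """Return (next address, alternate address, prediction bit) for a branch."""
    if br.op == "jmp":
        return br.target, None, False
    return br.addr + br.size, br.target, br.predict_taken


def fold(program: List[Instr]) -> List[Decoded]:
    """Fold each non-branch followed by a branch into a single entry."""
    out, i = [], 0
    while i < len(program):
        ins = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if ins.op in BRANCH_OPS:                    # nothing to fold into
            out.append(Decoded(ins.addr, ins.op, *_paths(ins)))
            i += 1
        elif nxt is not None and nxt.op in BRANCH_OPS:
            out.append(Decoded(ins.addr, ins.op, *_paths(nxt)))
            i += 2                                  # the branch disappears
        else:
            out.append(Decoded(ins.addr, ins.op, ins.addr + ins.size))
            i += 1
    return out


def select(entry: Decoded) -> Tuple[int, Optional[int]]:
    """Return (address fetched next, address retained for recovery)."""
    if entry.alt_addr is not None and entry.predict_taken:
        return entry.alt_addr, entry.next_addr
    return entry.next_addr, entry.alt_addr


def resolve(entry: Decoded, taken: bool) -> Optional[int]:
    """Return the restart address after a misprediction, else None."""
    if entry.alt_addr is None:
        return None                                 # not a conditional branch
    chosen, retained = select(entry)
    actual = entry.alt_addr if taken else entry.next_addr
    return None if actual == chosen else retained


program = [
    Instr(0, "add"),
    Instr(2, "if_jmp", target=100, predict_taken=True),
    Instr(4, "sub"),
]
decoded = fold(program)
```

Here the 'add' and the 'if_jmp' end up as one entry, so only two
entries reach the cache. A non-None result from resolve() stands for
the flush-and-restart case: that address is what gets re-introduced
at the head of the pipeline.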

Determining which of the two possible next addresses of a conditional
branch is likely to be taken is aided in CRISP with static branch
prediction. The encoding for the conditional branch instruction
contains a single branch prediction bit which can be set by the
compiler.
Branch prediction is useful when a conditional branch instruction
enters the pipeline and can alter the flow of instructions before
the result of the preceding comparison has been computed. If,
however, there are no compare instructions in the pipeline, then
there is no need for branch prediction: since only the compare
instruction may set the condition code flag, the outcome of the
conditional branch is already known with certainty.
CRISP intentionally has separate compare and conditional branch 
instructions so that a compiler or optimizer may insert instructions
between the compare and conditional branch (this is 'branch spreading').
By combining branch folding and code motion in CRISP there can be
no cost at all for either conditional or unconditional branches.
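
To make the branch spreading argument concrete, here is a small
Python sketch that checks whether a compare could still be in flight
when a conditional branch is reached. The pipeline depth of 3 and the
mnemonics are assumptions for the example, not figures from the paper.

```python
from typing import List


def needs_prediction(ops: List[str], branch_index: int, depth: int = 3) -> bool:
    """True if a compare is among the `depth` instructions just ahead of
    the branch, i.e. the condition flag might not be settled yet.

    `depth` is a hypothetical pipeline depth chosen for illustration.
    """
    window = ops[max(0, branch_index - depth):branch_index]
    return "cmp" in window


# Compare immediately before the branch: the outcome is not yet known.
tight = ["add", "mov", "sub", "cmp", "if_jmp"]

# Compare hoisted ahead of three unrelated instructions (branch spreading):
# by the time the branch is reached, the flag is already settled.
spread = ["cmp", "add", "mov", "sub", "if_jmp"]

tight_needs = needs_prediction(tight, tight.index("if_jmp"))
spread_needs = needs_prediction(spread, spread.index("if_jmp"))
```

The reordering is safe here only because, as noted above, the compare
is the only instruction that sets the condition code flag, so the
instructions moved between the compare and the branch cannot disturb it.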

Jim McKie	research!jmk -or- jmk@research.att.com