Path: utzoo!mnetor!uunet!seismo!sundc!pitstop!sun!decwrl!pyramid!voder!apple!bcase
From: bcase@apple.UUCP (Brian Case)
Newsgroups: comp.arch
Subject: Re: conditional branches
Message-ID: <7405@apple.UUCP>
Date: 15 Feb 88 19:54:23 GMT
References: <191@telesoft.UUCP> <1556@gumby.mips.COM> <208@telesoft.UUCP>
Reply-To: bcase@apple.UUCP (Brian Case)
Organization: Apple Computer Inc., Cupertino, USA
Lines: 60

In article <208@telesoft.UUCP> roger@telesoft.UUCP (Roger Arnold @prodigal) writes:
>In article <1556@gumby.mips.COM>, earl@mips.COM (Earl Killian) writes:
>> Roger Arnold proposes [..] (boolean registers)
>> 
>> The abstract concept of "compare two values and
>> branch accordingly" represents around 15% of the instructions executed
>> on a computer.  To make this take two instructions increases your
>> instruction count by 15% (and thus, in many cases, your cycle count by
>> 15%).
>I'm no authority, but it doesn't surprise me at all that a RISC would
>use condition codes.  Compare and branch, in one instruction, strikes 
>me as more at home in a CISC than a RISC architecture.  For one thing, 
>if the instruction cycle is closely tuned to the basic ALU cycle, there 
>won't be time to feed back the result of a compare to the branch in one
>cycle.  To allow it, the cycle would have to be extended, at a major
>performance cost to the rest of the instruction set.  

Er, not really true since condition codes come from the ALU and have to meet
a set up time for the condition code register anyway.  Adding compare and
branch has its worst effects by either putting the register file PLUS the
ALU in a critical path or delaying the effect of conditional branches by
one more cycle (at least this is how I see it; did I goof?).  Yes, there
is some cost, but adding a gate or two to the ALU stage is probably not
going to cause it to be the most critical path (cache access (TLB is a
cache) is probably the criticalest path).

>Of course, marvelous things are possible with pipelining and parallel
>lookahead, but there's still a second consideration: there aren't
>enough bits to specify two operands and a target address in one fixed 
>size instruction.  Not, at least, without imposing restrictions that do 
>ugly things to the instuction set's orthogonality.  (E.g., both operands 
>in a register, target address a limited displacement PC relative mode).  
>Then, too, condition codes enable instruction scheduling; very good for 
>pipelining and low level parallelism in top end hardware.

Wait a minute, condition codes HINDER instruction scheduling, not "enable"
it.  The problem is that, even if each instruction can decide to set the
cond. codes based on a bit in the instruction, there is only one condition
code register; what if the result of a compare needs to be saved for later?
What if you (or the compiler) want to move a compare to some distant spot
to fill up a branch or load delay slot but there is an intervening instruction
that sets the condition codes?  Outa luck.  Your "boolean registers" idea
definitely mittigates this problem, perhaps eliminating it (but it creates
others).

>As to the 15% "compare two values" and branch, I've no trouble at all
>believing that number.  In fact, it feels conservative, if it includes
>comparisons to zero.  But it was consideration of how many of those
>compaisions only repeat tests previously made that prompted me to 
>suggest boolean registers.  Sorry, no numbers.  Just a programmer's 
>notoriously unreliable "gut feeling", based on looking a LOT of code 
>over the years.

Your boolean register idea is simply a less general, special case of saving
the result of a comparison in a general data register.  I suppose we could
argue about whether or not it is really necessary to have the full generality
of booleans in data registers, but I can say that I, personally now, this is
just me, would rather write a compiler for a machine that has either one (1)
condition code register or uses booleans in data registers.  It's the old
"there should be zero, one, or (large) n of something."