Path: utzoo!attcan!uunet!husc6!uwvax!rutgers!apple!vsi1!wyse!mips!earl@wright.mips.com
From: earl@wright.mips.com (Earl Killian)
Newsgroups: comp.arch
Subject: Re: CC machines do execute 15% more instructions
Message-ID: <8768@wright.mips.COM>
Date: 24 Nov 88 18:56:37 GMT
References: <3386@pt.cs.cmu.edu> <74435@sun.uucp> <70@armada.UUCP> <550@m3.mfci.UUCP> <2152@ficc.uu.net> <552@m3.mfci.UUCP> <76500@sun.uucp> <10643@tekecs.TEK.COM> <8523@wright.mips.COM> <78977@sun.uucp>
Sender: earl@mips.COM
Organization: MIPS Computer Systems, Sunnyvale CA
Lines: 44
In-reply-to: ejensen@gorby.Sun.COM (Eric Jensen)

In article <78977@sun.uucp>, ejensen@gorby (Eric Jensen) writes:
>In article <8523@wright.mips.COM> earl@wright.mips.com (Earl Killian) writes:
>>Condition codes are not only harmful for translation, but for
>>performance.  15% of instructions are conditional branches.  If you
>>take two instructions instead of one to do this simple operation (set
>>the condition codes and then branch on them), you've just forced your
>>computer to execute 15% more instructions.
>
>This is nonsense.  From  "MIPS R2000 RISC Architecture" by Gerry
>Kane, pp. C-4 and C-5, I quote:
> "The R2000 provides a complete set of arithmetic comparisons against
> zero. (...). However, the only instructions for comparing a pair of
> registers are beq and bne.  To perform any other arithmetic comparison
> on a pair of registers or between a register and an immediate value,
> you must use a sequence of two instructions as listed in Table C.1 or
> C.2." 

Nonsense?  More like a carefully chosen design.  The credit goes to
the original Stanford MIPS crowd.  When they designed their original
MIPS chip they noticed that some branch conditions are much more
common than others, and that this could be exploited.  A < B, A <= B,
A > B, A >= B (for B != 0) are not as common as you might first guess
(look up Stanford's published results).  For example, these are used at
the tops of loops, but they're not useful at the bottom: if you
implement loopback with <=, then your loops will fail when the
termination value is 2**31-1, because the incremented index wraps
around and the <= test never becomes false.  At loop bottom, != is the
appropriate test.  (Also, a global optimizer can convert A < B tests to
A != B tests in some cases.)  Hence inclusion of direct A == B and A != B
branches in the Stanford design, but not A < B etc. (which are hard to
make as fast).  When B is a constant, it needs to be loaded into a
register, but inside a loop such a load is loop-invariant and gets
hoisted out.

As for condition codes being set as a by-product of other operations,
can you cite any SPARC statistics?  My guess is that this is
insignificant for SPARC (but I don't have one to measure it -- sorry).

(It may be somewhat more significant for other cc machines, like the
VAX, that set the condition codes on memory loads so that
	for (p = list; p != NULL; p = p->next)
avoids a compare.)
-- 
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086