Path: utzoo!attcan!uunet!husc6!uwvax!rutgers!apple!vsi1!wyse!mips!earl@wright.mips.com
From: earl@wright.mips.com (Earl Killian)
Newsgroups: comp.arch
Subject: Re: CC machines do execute 15% more instructions
Message-ID: <8768@wright.mips.COM>
Date: 24 Nov 88 18:56:37 GMT
References: <3386@pt.cs.cmu.edu> <74435@sun.uucp> <70@armada.UUCP> <550@m3.mfci.UUCP> <2152@ficc.uu.net> <552@m3.mfci.UUCP> <76500@sun.uucp> <10643@tekecs.TEK.COM> <8523@wright.mips.COM> <78977@sun.uucp>
Sender: earl@mips.COM
Organization: MIPS Computer Systems, Sunnyvale CA
Lines: 44
In-reply-to: ejensen@gorby.Sun.COM (Eric Jensen)

In article <78977@sun.uucp>, ejensen@gorby (Eric Jensen) writes:
>In article <8523@wright.mips.COM> earl@wright.mips.com (Earl Killian) writes:
>>Condition codes are not only harmful for translation, but for
>>performance. 15% of instructions are conditional branches. If you
>>take two instructions instead of one to do this simple operation (set
>>the condition codes and then branch on them), you've just forced your
>>computer to execute 15% more instructions.
>
>This is nonsense. From "MIPS R2000 RISC Architecture" by Gerry
>Kane, pp. C-4 and C-5, I quote:
> "The R2000 provides a complete set of arithmetic comparisons against
> zero. (...). However, the only instructions for comparing a pair of
> registers are beq and bne. To perform any other arithmetic comparison
> on a pair of registers or between a register and an immediate value,
> you must use a sequence of two instructions as listed in Table C.1 or
> C.2."

Nonsense? More like a carefully chosen design. The credit goes to the
original Stanford MIPS crowd. When they designed their original MIPS
chip they noticed that some branch conditions are much more common
than others, and that this could be exploited.

A < B, A <= B, A > B, A >= B (for B != 0) are not as common as you
might first guess (look up Stanford's published results).
For example, these are used at the tops of loops, but they're not
useful at the bottom (if you implement loopback with <=, then your
loops will fail when the termination value is 2**31-1). At loop
bottom != is the appropriate test. (Also a global optimizer can
convert A < B tests to A != B tests in some cases.) Hence inclusion
of direct A == B and A != B branches in the Stanford design, but not
A < B etc (which are hard to do as fast). When B is a constant, then
it needs to be loaded into a register, but such a load is always
loop-invariant and gets hoisted out.

As for condition codes being set as a by-product of other operations,
can you cite any SPARC statistics? My guess is that this is
insignificant for SPARC (but I don't have one to measure it -- sorry).
(It may be somewhat more significant for other cc machines, like the
VAX, that set the condition codes on memory loads so that
	for (p = list; p != NULL; p = p->next)
avoids a compare.)
-- 
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
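
To make the loop-bottom point above concrete, here is a rough C sketch of why a <= loopback test fails when the termination value is 2**31-1, while a != test does not. The names inc_wrap32, count_le, and count_ne are made up for illustration and the iteration cap is only there so the broken loop stops; the wraparound is written out by hand as an assumption about how a 32-bit add behaves in hardware, since signed overflow is undefined in C itself.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit increment with explicit two's-complement
   wraparound, spelled out so the C code has no undefined behavior. */
static int32_t inc_wrap32(int32_t x) {
    return (x == INT32_MAX) ? INT32_MIN : x + 1;
}

/* Loop from first to last inclusive, testing "i <= last" at the bottom.
   Returns the number of iterations executed, capped at cap. */
static long count_le(int32_t first, int32_t last, long cap) {
    long n = 0;
    int32_t i = first;
    do {
        n++;                 /* loop body */
        i = inc_wrap32(i);
    } while (i <= last && n < cap);
    return n;
}

/* Same loop, but testing "i != last + 1" at the bottom. The stop value
   is loop-invariant, so it is computed once before the loop. */
static long count_ne(int32_t first, int32_t last, long cap) {
    long n = 0;
    int32_t i = first;
    int32_t stop = inc_wrap32(last);
    do {
        n++;                 /* loop body */
        i = inc_wrap32(i);
    } while (i != stop && n < cap);
    return n;
}
```

Calling count_ne(INT32_MAX - 2, INT32_MAX, 1000) gives the intended 3 iterations, whereas count_le with the same arguments only stops at the cap, because i wraps to the most negative value and is then always <= last. For ordinary bounds such as 0 through 9 the two agree.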