Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!bionet!ig!ames!vsi1!wyse!mips!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: Load/Branch ratio [was Re: 486 and 68040]
Message-ID: <18253@winchester.mips.COM>
Date: 27 Apr 89 21:40:33 GMT
References: <17131@cup.portal.com> <12435@reed.UUCP> <3913@mipos3.intel.com> <17999@winchester.mips.COM> <3975@mipos3.intel.com> <18201@winchester.mips.COM> <25428@amdcad.AMD.COM>
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 48

In article <25428@amdcad.AMD.COM> tim@amd.com (Tim Olson) writes:
>In article <18201@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>| Yes, certainly a good tradeoff; loads are more frequent than branches.

>Interesting -- what kind of numbers do you see?  On the Am29000, we tend
>to see just the opposite, although they are somewhat close:

On R3000s, we see grossly similar effects, but the comment was directed
to the 386/486 chips.  I.e.:

a) Across typical micro architectures, the NUMBER of branches would be
   grossly equal, even with different compiler technology, with the
   major exception of loop-unrolling effects in real loopy code.

b) The NUMBER of loads/stores, however, can vary quite a bit, affected by:
	1) The number of registers available at once
	2) Register windows/stack caches/etc for subroutine calls
	3) Global optimization technology
	4) The nature of the program, i.e., some loads and stores can be
	   eliminated by optimizers or windows, some won't go away no
	   matter what you do.

c) The PERCENTAGES of such things depend a lot on the remainder of the
   instruction set architecture and compiler quality, i.e., a good
   optimizer often drives the percentage of branches UP, because it
   generates better code for expression evaluation, for example, while
   the branches generally refuse to go away.  The percentage of
   loads/stores can go up or down.
   Note that, although no one does this of course, if you want to drive
   the percentages of loads and branches down, just cripple your code
   generator! :-)

d) Although I have no data on X86 instruction streams, I'd guess (Intel
   guys?) that the 486 made the correct choice for running X86 code,
   since it has fewer effective registers to play with, and so would tend
   to have a higher (# loads) / (# branches) ratio than the current crop
   of RISC machines.  For example, take this to an extreme: suppose you
   had only 2 registers: you'd do almost anything to avoid an extra cycle
   of load latency, even if it cost you in branches, because your
   load/store numbers would be rather high.

KDS's comment on preferring to optimize loads seems appropriate; it is
also a GOOD example of why architecture isn't a cookbook set of rules,
especially when continually improving architectures that started from
different places.  I.e., what might be the wrong tradeoff in a MIPS R????
might be the correct one for an 80?86.
-- 
-john mashey	DISCLAIMER:
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
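
To put rough numbers on point c), here is a minimal sketch in Python.
The instruction counts are invented for illustration only (they are not
measurements from any machine or compiler), and the function name
mix_percentages is likewise just a placeholder.  The point is only the
arithmetic: hold the NUMBER of branches fixed, shrink or inflate the rest
of the instruction stream, and watch the PERCENTAGES move.

```python
def mix_percentages(branches, loads_stores, other):
    """Return (branch %, load/store %) of the total dynamic instruction count."""
    total = branches + loads_stores + other
    return 100.0 * branches / total, 100.0 * loads_stores / total


# Hypothetical dynamic instruction counts for one program (assumed, not measured).
# The branch count is the same in all three cases, as in point a).
mixes = {
    "naive":     dict(branches=150, loads_stores=350, other=500),
    "optimized": dict(branches=150, loads_stores=250, other=350),
    "crippled":  dict(branches=150, loads_stores=350, other=1000),
}

for name, counts in mixes.items():
    branch_pct, ls_pct = mix_percentages(**counts)
    ratio = counts["loads_stores"] / counts["branches"]
    print(f"{name:10s} branches {branch_pct:5.1f}%  "
          f"loads/stores {ls_pct:5.1f}%  loads-stores per branch {ratio:.2f}")
```

With these made-up counts, the optimized mix shows a higher branch
percentage than the naive one even though no branch was added, and the
crippled mix shows lower percentages of both branches and loads/stores
simply because it executes more of everything else.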