Path: utzoo!attcan!uunet!husc6!bloom-beacon!bu-cs!purdue!i.cc.purdue.edu!j.cc.purdue.edu!pur-ee!hankd
From: hankd@pur-ee.UUCP (Hank Dietz)
Newsgroups: comp.arch
Subject: Re: Maximum MIPS for a given memory bandwidth?
Summary: Bandwidth where?
Message-ID: <8304@pur-ee.UUCP>
Date: 13 Jun 88 16:27:13 GMT
References: <6921@cit-vax.Caltech.Edu>
Distribution: na
Organization: Purdue University Engineering Computer Network
Lines: 46

In article <6921@cit-vax.Caltech.Edu>, mangler@cit-vax.Caltech.Edu (Don Speck) writes:
> A while ago, Rick Richardson was looking for a microprocessor
> that could squeeze 4000 Dhrystones out of a 4 MHz 16-bit bus.
...
> Processor       avg read   bus     bandwidth            MB/s:MIPS
>                 latency    width   at the CPU   MIPS    ratio
> SUN2 (68010)    400ns      16       5 MB/s      0.7      7
> Microvax II     400ns      32      10 MB/s      0.9     11
> VAX-11/750      ~440ns     32       9 MB/s      0.6     15
> VAX-11/780      ~440ns     32      12 MB/s      1.0     12
...
> I'm wondering if there is some formula for the maximum number of
> MIPS that can be extracted from a memory system, based on its
> bandwidth, bus size, and latency, i.e. "with that memory/cache
> system you can't get more than N mips"?  With a large enough table
> of the above type, perhaps one could derive some rules of thumb in
> this direction?

Well, obviously there is such a formula using your definition of
bandwidth... in fact, you effectively used the formula above.  The
major source of inconsistency is in what constitutes a MIP.  Consider:

1. The average number of bits of memory referenced per instruction
   executed (hence also per MIP) depends on the instruction set and
   its encoding.  The lower bound is 0 (i.e., processor crunching a
   microcoded instruction within its own registers) and the maximum
   is large-but-finite.

2. Your "bandwidth at the CPU" measure simply makes the use of
   CPU-internal registers/cache/instruction-decode-logic and the
   operand precision of the machine all-important.

For example, if we assume that, on average, a 32-bit operand will be
loaded/stored from CPU-external memory every 4 instructions and each
instruction is encoded in 8 bits, then each instruction moves 8 bits
of operand plus 8 bits of instruction, or 16 bits in all.  We would
find that we need 2 MB/s (16 Mbits/s) for one MIP, giving a ratio of
2:1 in your terminology.

Once you've picked your benchmark (presumably, Dhrystones) and set the
precision of the operands, you're measuring how space-efficiently
instructions are encoded and how well the CPU-internal memory system
works -- not really all that interesting, because the choice of what
to call CPU-internal and what to call CPU-external is completely
arbitrary.

If you break down the bandwidth measure into bandwidths of the
component parts (i.e., on-chip registers, cache, etc.), then you might
get some interesting results...?

						-hankd
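
P.S.  The arithmetic in the example above can be written out as a
minimal sketch.  The function and parameter names are hypothetical,
introduced only for illustration, and the sketch assumes that the only
CPU-external traffic is instruction fetch plus the occasional operand
load/store; it is a rough model, not a measurement of any real machine.

```python
def bytes_per_instruction(instr_bits, operand_bits, instrs_per_operand):
    """Average CPU-external memory traffic per instruction, in bytes.

    Assumption (illustrative only): every instruction fetches
    instr_bits from external memory, and one operand of operand_bits
    is loaded or stored every instrs_per_operand instructions.
    """
    bits = instr_bits + operand_bits / instrs_per_operand
    return bits / 8.0


def mb_per_s_per_mips(instr_bits, operand_bits, instrs_per_operand):
    # 1 MIPS is 1e6 instructions/s and 1 MB/s is 1e6 bytes/s, so the
    # MB/s:MIPS ratio is just the average bytes moved per instruction.
    return bytes_per_instruction(instr_bits, operand_bits, instrs_per_operand)


def max_mips(bandwidth_mb_s, instr_bits, operand_bits, instrs_per_operand):
    # Upper bound on MIPS if the external bandwidth is the only limit.
    ratio = mb_per_s_per_mips(instr_bits, operand_bits, instrs_per_operand)
    return bandwidth_mb_s / ratio


if __name__ == "__main__":
    # The example from the text: 8-bit instructions, one 32-bit
    # operand every 4 instructions.
    print(mb_per_s_per_mips(8, 32, 4))   # 2.0, i.e. a 2:1 ratio
    print(max_mips(10, 8, 32, 4))        # 5.0
```

Under these assumed parameters, a memory system delivering 10 MB/s at
the CPU could not sustain more than 5 MIPS; change the encoding or the
operand frequency and the bound moves with it, which is the point.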