Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!tut.cis.ohio-state.edu!husc6!bbn!oberon!cit-vax!mangler
From: mangler@cit-vax.Caltech.Edu (Don Speck)
Newsgroups: comp.arch
Subject: Re: Maximum MIPS for a given memory bandwidth?
Message-ID: <6955@cit-vax.Caltech.Edu>
Date: 15 Jun 88 09:15:28 GMT
References: <6921@cit-vax.Caltech.Edu> <22050@amdcad.AMD.COM> <291@wombat.UUCP> <22063@amdcad.AMD.COM>
Distribution: na
Organization: California Institute of Technology
Lines: 43

In article <22063@amdcad.AMD.COM>, tim@amdcad.AMD.COM (Tim Olson) writes:
> I think that average bandwidth
> requirements are much more interesting -- it tells more about the cost
> and complexity of a memory design than the peak rating, and seemed to be
> more in line with what the original poster was asking.

Average bandwidth requirements are the interesting thing for
shared-memory multiprocessors, but I was asking about uniprocessors,
where all of the bandwidth is dedicated to one processor and costs
the same to provide whether the processor uses all of it or not.
I consider caches to be part of the memory system, i.e. part of
the von Neumann bottleneck.

Instead of using the ambiguous term "MIPS", I should have said
"number of times the speed of a VAX/780". Unfortunately it wouldn't
fit in the column headings. Dhrystones would have been less ambiguous.
I didn't expect enough accuracy that it would make much difference.
So the table is amended as follows:

Processor          avg read   bus     bandwidth   VAX      MB/s:MIPS
                   latency    width   available   "MIPS"   ratio
25 MHz 88000       45ns?      32+32   185 MB/s?   17       11?
16 MHz MIPSco      ?          32+32   120 MB/s?   10?      13?
40 MHz RPM40       100ns      32+16   240 MB/s    15       16
25 MHz AMD 29000   80ns       32+32   170 MB/s    22       8

The AMD 29000 is remarkably bandwidth-efficient, despite using (on
average) less than half of the memory cycles available. (Maybe this
points out the efficacy of their optimizer). How much would the 29000
slow down if it had only one 32-bit path to a combined instruction+data
cache, i.e.
half as much peak memory bandwidth available?

I had assumed that efficient use of bandwidth would require a narrow
path to memory (with bit-addressable bit-serial being the most efficient).
Perhaps this is not necessary. I still suspect that there's some lower
bound on the number of bytes exchanged with cache/memory to perform the
work of a "mythical" instruction.

Don Speck   speck@vlsi.caltech.edu  {amdahl,ames!elroy}!cit-vax!speck
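
A minimal sketch for re-deriving the MB/s:MIPS column of the amended table, assuming the ratio is simply "bandwidth available" divided by the VAX "MIPS" figure (the post does not spell out the formula). The names `rows` and `ratio` are invented for illustration, and the inputs are the post's own numbers, several of which it marks with "?" as uncertain.

```python
# Recompute the MB/s:MIPS ratio column from the table in the post.
# Assumption: ratio = bandwidth available (MB/s) / VAX "MIPS".
# All figures are copied from the post; entries it marked "?" are uncertain.

rows = [
    # (processor, bandwidth in MB/s, VAX "MIPS", ratio as posted)
    ("25 MHz 88000", 185, 17, 11),
    ("16 MHz MIPSco", 120, 10, 13),
    ("40 MHz RPM40", 240, 15, 16),
    ("25 MHz AMD 29000", 170, 22, 8),
]


def ratio(bandwidth_mb_s, vax_mips):
    """MB/s of memory bandwidth available per VAX 'MIPS' delivered."""
    return bandwidth_mb_s / vax_mips


for name, bandwidth, mips, posted in rows:
    print(f"{name:<18} computed {ratio(bandwidth, mips):5.1f}   posted {posted}")
```

Under that assumption three of the four rows round to the posted value. The MIPSco row computes to 12 rather than the posted 13, which is consistent with the "?" the post attaches to both of its inputs.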