Path: utzoo!attcan!uunet!lll-winken!maddog!brooks
From: brooks@maddog.llnl.gov (Eugene Brooks)
Newsgroups: comp.arch
Subject: Re: Scalability?
Message-ID: <57556@lll-winken.LLNL.GOV>
Date: 2 May 90 04:56:27 GMT
References: <2075@naucse.UUCP> <6897@odin.corp.sgi.com> <49622@lanl.gov> <1990May1.154558.24009@cs.rochester.edu>
Sender: usenet@lll-winken.LLNL.GOV
Reply-To: brooks@maddog.llnl.gov (Eugene Brooks)
Organization: Lawrence Livermore National Laboratory
Lines: 18

In article <49622@lanl.gov> ryg@lanl.gov (Richard S Grandy) writes:
>I've got a glossy from SGI that shows the POWER center performance on 
>LINPACK (100x100, coded) as:
>		1 CPU	3.8 DP MFLOPS
>		4 CPU	16  DP MFLOPS
>		8 CPU	28  DP MFLOPS
>Does this mean that with 4 cpus you really get GREATER than a linear speedup??
I have documented superlinear speedups on linear system solvers for
machines with coherent caches hooked to a bus.  The effect can occur
when the problem is larger than the cache on one processor, but small
enough that the data set can be distributed across several caches without
cache spilling.  The data set in this case is 80K bytes (a 100x100 matrix
of 8 byte double precision words).  As I recall, the
POWER series uses a 64K first level cache which is write-through, backed up
by a 256K copy-back cache hooked to the bus.  One would expect that a
superlinear effect would be possible given the size of the first level caches.
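
The arithmetic behind this can be checked with a rough sketch.  It uses
only the MFLOPS figures quoted above and the cache size as I recalled it.
The even split of the matrix across processors is an assumption made for
illustration (a real solver also shares pivot rows between caches), and
the function names are hypothetical.

```python
# Rough check of the quoted LINPACK 100x100 figures against cache size.
# Assumptions (not from the SGI glossy): the matrix is split evenly
# across CPUs, and the first level cache is 64 * 1024 bytes.

L1_CACHE_BYTES = 64 * 1024      # first level cache size, as recalled
MATRIX_ORDER = 100              # LINPACK 100x100
BYTES_PER_WORD = 8              # double precision

# DP MFLOPS figures quoted from the glossy, keyed by CPU count.
QUOTED_MFLOPS = {1: 3.8, 4: 16.0, 8: 28.0}


def speedup(mflops_one_cpu, mflops_n_cpus):
    """Speedup relative to the single processor rate."""
    return mflops_n_cpus / mflops_one_cpu


def bytes_per_cpu(order, cpus, bytes_per_word=BYTES_PER_WORD):
    """Share of the matrix held by each CPU under an even split."""
    return order * order * bytes_per_word / cpus


for cpus, rate in QUOTED_MFLOPS.items():
    s = speedup(QUOTED_MFLOPS[1], rate)
    share = bytes_per_cpu(MATRIX_ORDER, cpus)
    fits = share <= L1_CACHE_BYTES
    kind = "superlinear" if s > cpus else "linear or below"
    print(f"{cpus} CPU: speedup {s:.2f} ({kind}), "
          f"{share:.0f} bytes per CPU, fits in first level cache: {fits}")
```

On one processor the 80000 byte matrix does not fit in a 64K cache; split
four ways, each share does.  That is consistent with the 4 CPU figure
coming out slightly above linear, though the sketch says nothing about
why the 8 CPU figure falls back below linear.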


brooks@maddog.llnl.gov, brooks@maddog.uucp