Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!ukma!gatech!hubcap!david
From: david@june.cs.washington.edu (David Callahan)
Newsgroups: comp.parallel
Subject: Re: Superlinear(?) speedups
Message-ID: <3940@hubcap.UUCP>
Date: 19 Dec 88 18:05:39 GMT
Sender: fpst@hubcap.UUCP
Lines: 26
Approved: parallel@hubcap.clemson.edu

In article <3935@hubcap.UUCP> aglew@mcdurb.Urbana.Gould.COM writes:
>>It does not seem possible that n processors could run faster than n
>>times one processor for a given global algorithm.  One need only time
>>slice the runs of the n processors on the 1 processor, ignoring the
>>very real effects of cache effectiveness, etc ...
>
>And as I am fond of pointing out, timeslicing => context switch costs,
>and reduction of context switch costs is one very real source of
>linear super-unitary speedup.

Yeah, but setting a lower bound on time-slice size will put a 
upper bound on context switch overhead as a percentage of execution
time. Many of the examples discussed recently have not given
any arguments to beleive that the parallel algorithm is asymptotically
super-linear but seem to exploit increases in fast-memory, such
as registers. 

How many examples of super-linear speedup can be explained
by increases in memory-bandwidth due to more registers
and more cache rather than increases in CPU horsepower?

David Callahan

D
D
maintaining multiple sets of registers which