Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!ukma!gatech!hubcap!david From: david@june.cs.washington.edu (David Callahan) Newsgroups: comp.parallel Subject: Re: Superlinear(?) speedups Message-ID: <3940@hubcap.UUCP> Date: 19 Dec 88 18:05:39 GMT Sender: fpst@hubcap.UUCP Lines: 26 Approved: parallel@hubcap.clemson.edu In article <3935@hubcap.UUCP> aglew@mcdurb.Urbana.Gould.COM writes: >>It does not seem possible that n processors could run faster than n >>times one processor for a given global algorithm. One need only time >>slice the runs of the n processors on the 1 processor, ignoring the >>very real effects of cache effectiveness, etc ... > >And as I am fond of pointing out, timeslicing => context switch costs, >and reduction of context switch costs is one very real source of >linear super-unitary speedup. Yeah, but setting a lower bound on time-slice size will put a upper bound on context switch overhead as a percentage of execution time. Many of the examples discussed recently have not given any arguments to beleive that the parallel algorithm is asymptotically super-linear but seem to exploit increases in fast-memory, such as registers. How many examples of super-linear speedup can be explained by increases in memory-bandwidth due to more registers and more cache rather than increases in CPU horsepower? David Callahan D D maintaining multiple sets of registers which