Path: utzoo!attcan!uunet!ncrlnk!ncrcae!hubcap!argosy!fyodor
From: argosy!fyodor@decwrl.dec.com (Chris Kuszmaul)
Newsgroups: comp.parallel
Subject: Re: Information on Massively parallel machines
Summary: 99.9% parallelization is not as hard as it may seem
Message-ID: <7128@hubcap.clemson.edu>
Date: 20 Nov 89 14:19:27 GMT
Sender: fpst@hubcap.clemson.edu
Lines: 114
Approved: parallel@hubcap.clemson.edu

In article <7070@hubcap.clemson.edu>, ut-emx!gary@cs.utexas.edu (Gary Smith) writes:
> In article <7041@hubcap.clemson.edu>, SAROFF@BNLDAG.AGS.BNL.GOV writes:
> > I am interested in getting information, technical specs, software
> > available, personal experiences, benchmarks, that sort of thing about
> > the massively parallel machines currently available.  
> 
> I must admit I remain very skeptical that massive MIMD parallelism will
> pay off in the near future, if ever.  Now modest MIMD parallelism, up to
> perhaps 256 processors might, assuming we get a significant number of
> applications that can achieve parallelism in excess of 99.9%, and if syn-
> chronization overhead can be made arbitrarily small.  Of course, these
> are pretty severe constraints!


  There is no question that some applications will never be parallelizable
at all, and that some applications will only be trivially parallelizable,
but there are a large class of problems for which parallel computers
(including, ding ding ding, SIMD computers) will provide speedup
proportional to the number of processors assigned to the task.

  These applications include the fast fourier transform, and linear 
systems solvers, as two very wide ranging examples. If one examines
the profile of a typical application which uses an FFT, one will
find that for an FFT of 32 data elements, the profile is:

  83% parallelizable core.
  17% serial controll code.

  Well, gee, it seems that even if we had an infinite number of
processors, we would not get more than a factor of 6 improvement in
performance. 

   The problem with this analysis, and with Mr. Smith's analysis,
is the assumption that the percentage of time taken to perform
the serial portion of the application grows at the same rate as the
portion taken to perform the parallel core.

  In the case of the FFT, it is more appropriate (if not
PERFECTLY accurate) to say that the serial portion of the 
code grows linearly with the problem size (O(N)), and
the parallel portion grows as O(NlogN). Thus, the following
table:

problem size:     % parallelizability:
64                85
128               87
256		  88
512               90
1024              91
2048              91.6
4096              92.3
8192              92.8
16k               93.3
32k               93.7
64k               94.1
128k              94.4
256k		  94.7
512k              95.0
1M                95.2

   Now, actually, the above numbers are quite conservative for the
FFT, since the parallelizable portion of the problem involves
much more floating point arithmetic than the serial portion
(proportionatly). Moreover, the FFT itself is a conservative
example, because it is such a slowly increasing in complexity
algorithm.

  Consider now, a linear system solver, for which the parallel
portion of the code grows at a rate of O(n^3), and the serial 
portion grows at a rate of O(n^2). Let us assume that the 
overhead (non increasing-with-size-of-problem-serial-portion)
is rather large, so that for n = 10, the percentages work out as:

parallel:  50%
serial:    50%

the ensuing table:

problem size       parallel%
20                   66       
40                   80
60                   85
100                  90.9
200                  95.2
400                  97.5
1000           	     99.0
2000 		     99.5                
5000                 99.8
10000                99.9
20000                99.95
40000                99.98


   Now, one might object that there are not
very many applications which involve matrices of
size 40000 x 40000, or which use million
point FFTs. I would counter by agreeing! Of
course there aren't many applications of that size,
they REQUIRE too much parallelism for most of
todays computers.

   As I said before, the above is not intended as a proof
of the general applicability of massivley parallel
computers, it is merely a demonstration that massively
parallel computers will, (and are) revolutionizing
computing, and how we think about computing. It also
does happen to be the case that a tremendous number of
real applications parallelize and scale in the same way as the FFT
and linear systems solvers. Thus, I submit that massively
parallel computing is a good business to be in, but who
am I to talk?


    CLK