Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!ucsd!ucbvax!sprite.Berkeley.EDU!elm
From: elm@sprite.Berkeley.EDU (ethan miller)
Newsgroups: comp.arch
Subject: Re: LINPACK 1000x1000 MFLOPS per $$$
Message-ID: <37683@ucbvax.BERKELEY.EDU>
Date: 20 Jul 90 22:53:15 GMT
References: <MCCALPIN.90Jul18175935@pereland.cms.udel.edu><2349@crdos1.crd.ge.COM> <MCCALPIN.90Jul20102428@pereland.cms.udel.edu>
Sender: usenet@ucbvax.BERKELEY.EDU
Reply-To: elm@sprite.Berkeley.EDU (ethan miller)
Organization: U.C. Berkeley Sprite Project
Lines: 98

In article <MCCALPIN.90Jul20102428@pereland.cms.udel.edu>,
mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
%In article <> mccalpin@pereland.cms.udel.edu I wrote about MFLOPS/$:
%The configuration that I quoted has a rather small memory by current
%supercomputer standards, but 2 MW (64-bit) is hardly "tiny".

Problem #1 shows up right away.  You compare price/performance using the
$13000 machine, and then turn around and say that you'd actually need to
pay two or three times that for a machine that will do what you want, or
even what you claim it can do.

%Estimated cost for a full 128 MB = 16 MW is about
%$20,000 in addition to the base price of $8700 for the machine.

So let's assume you only get 8 MW of memory, at a total price of about $20000
for the machine.  You've cut your advantage in half, and all you've done
is buy more memory.  Now start buying enough disk space for that simulation
data....

%Since Cray is still selling lots of Y/MP's with 32 MW memories, it is
%hardly fair to criticize a single-user workstation on that account.

Cray's I/O system is able to handle much more "paging," where programmers
shuttle data in and out to fit in a tiny memory space.  Can the PowerStation
accommodate this?  It's not just a question of I/O bus bandwidth; the disks
must be able to keep up as well.

%As far as disk storage goes, I have 1.5 GB of disk space on my
%graphics workstation, and will soon have a 2.3 GB tape drive.  So
%manipulating 500 MB datasets (see below) is entirely practical.
%
%The whole setup:
%[... setup deleted; see referenced article]
%is under $50,000 at University prices.

So now we're up to $50000 for the configuration that you're racing against
a Cray.  Suddenly, the killer micro isn't as killer.  Of course, if all
you want is lots of MIPS and MFLOPS, and you don't need much memory,
you're still OK.  However, the original cost/performance ratio has just
dropped by a factor of 4 or 5 because you've added enough components to
make a real system.
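
As a rough check on that factor, here is a minimal Python sketch.  It
assumes the MFLOPS figure stays the same while only the price changes,
and it takes the $13000 quoted machine and the $50000 full setup from
the text above.  The function name is made up for illustration.

```python
def price_performance_drop(quoted_price, full_price):
    """Factor by which MFLOPS per dollar falls when only the price changes.

    Assumes the added memory, disk, and tape do not change the MFLOPS rate.
    """
    return full_price / quoted_price


# Prices taken from the discussion: $13000 quoted, $50000 fully configured.
drop = price_performance_drop(13_000, 50_000)
print(f"MFLOPS/$ falls by a factor of about {drop:.1f}")
```

Measured against the $13000 figure, the drop comes out a little under 4,
which is the low end of the range given above.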

%	(1) The cost must be within the available budget.
%	    This includes the cost of porting the code as well.

Is it any harder to port code to the Cray than to other machines?  How
about other supercomputers, such as Convex?  There will certainly be
porting costs, but I don't think they'll be much worse for a supercomputer
than for any other computer.  Please correct me if I'm wrong on this,
though.

%	(2) The wall-clock turnaround must be within the limits
%	    of the research project.

If you suffer a 25-to-1 slowdown in CPU time, that will change turnaround
times from overnight to about one month.  That's a big difference.

%	(3) Point (2) usually requires sufficient memory to make
%	    the problem core-containable.

Not necessarily, especially if you're running on a computer with lots of
I/O bandwidth (assuming you have the devices to feed it).  There are also
quite a few simulations that aren't core-containable on any Y-MP.  What
then?

%	(4) Sufficient mass storage space and access speed must be
%	    available to save intermediate and permanent results
%	    without slowing down the calculation past the constraints
%	    of point (2).

This is the element that can add a lot of cost to a computer system.

%  I recently submitted a proposal to the NSF to do some cpu-intensive
%studies of the equations governing a theoretical two-dimensional
%ocean.  The calculations are estimated to require 200 hours of Cray
%Y/MP time.  I don't consider this a trivial expenditure....
%With an IBM 320, I would probably be able to finish all of the 
%calculations before the proposal even completes the review process!

Really?  That's 5000 hours on a PowerStation (using the 25/1 ratio from
the table).  That's about 200 days, assuming you use every single CPU
cycle on the machine.  Since you'll be doing some I/O, though, I'd be
surprised to see better than 50-75% utilization, which brings total
running time to close to a year.  Granted, it's cheaper than Cray time,
but is it practical to wait a year for a single simulation to finish?
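
The arithmetic behind that estimate can be sketched in a few lines of
Python.  This assumes the 200 Cray hours and the 25/1 ratio quoted above,
and treats utilization as a simple fraction of wall-clock time spent
computing.  The function name is hypothetical.

```python
def wall_clock_days(cray_hours, slowdown, utilization):
    """Wall-clock days needed on the slower machine.

    cray_hours  -- CPU hours the job needs on the fast machine
    slowdown    -- how many times slower the workstation is
    utilization -- fraction of wall-clock time actually spent computing
    """
    cpu_hours = cray_hours * slowdown
    return cpu_hours / utilization / 24.0


# 200 Cray hours at 25/1, for full, 75%, and 50% utilization.
for u in (1.0, 0.75, 0.5):
    print(f"utilization {u:.0%}: {wall_clock_days(200, 25, u):.0f} days")
```

At full utilization this gives about 208 days; at 75% and 50% it gives
about 278 and 417 days.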

There are simulations that can run on a workstation instead of a
supercomputer.  These tend to be smaller simulations, though, for
memory, disk/tape storage, and CPU speed reasons.  You can probably
increase one, perhaps two, of these dimensions and stay with a
workstation.  Once you increase all three, you have a supercomputer,
or pretty close to it.

ethan
=================================
ethan miller--cs grad student   elm@sprite.berkeley.edu
#include <std/disclaimer.h>     {...}!ucbvax!sprite!elm
Witty signature line condemned due to major quake damage.