Path: utzoo!attcan!uunet!snorkelwacker!usc!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.arch
Subject: Re: LINPACK 1000x1000 MFLOPS per $$$
Message-ID: <MCCALPIN.90Jul20102428@pereland.cms.udel.edu>
Date: 20 Jul 90 14:24:28 GMT
References: <MCCALPIN.90Jul18175935@pereland.cms.udel.edu><2349@crdos1.crd.ge.COM>
Sender: usenet@ee.udel.EDU
Organization: College of Marine Studies, U. Del.
Lines: 105
In-reply-to: davidsen@crdos1.crd.ge.COM's message of 20 Jul 90 12:02:03 GMT

In article <> mccalpin@pereland.cms.udel.edu I wrote about MFLOPS/$:
 
>| (2) The $13,000 configuration includes no monitor or graphics adapter,
>| etc.  It is strictly a server, configured with 16 MB RAM and 120 MB
>| disk.  NFS is used to store results directly onto my graphics
>| workstation.  
 
In article <> davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) replies:
>   You have defined the solution by picking the dataset... You are
> talking about a tiny problem here, not at all typical of what is run on
> a Cray. Certainly there are problems requiring lots of CPU and tiny
> memory, and it's nice that you have one. Workstations are good at that.
> We run dedicated troff servers here, and they're workstations, too.

The configuration that I quoted has a rather small memory by current
supercomputer standards, but 2 MW (64-bit) is hardly "tiny".  As soon
as 3rd-party vendors start delivering memory boards at competitive
prices, the machine will be upgradable to 4 MW (64-bit) for about
$2000.  Since the machine was designed to accept 4 Mbit technology, it
is possible to configure it with up to 16 MW of memory.   I expect
that it will be a few months before IBM releases any boards based on 4
Mbit chips, and then a few more months before clones are available
from 3rd parties.  The estimated cost of a full 128 MB (16 MW) of
memory is about $20,000, in addition to the base price of $8700 for
the machine.
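
For anyone who wants to check the MB-to-MW arithmetic, here is a
minimal Python sketch.  It assumes 1 word = 64 bits = 8 bytes, as used
throughout this posting; the function name mb_to_mw is made up for
illustration.  The dollar figures are the estimates quoted above, not
vendor prices.

```python
BYTES_PER_WORD = 8  # one 64-bit word, the convention used in this posting


def mb_to_mw(megabytes):
    """Convert a memory size in megabytes to 64-bit megawords."""
    return megabytes / BYTES_PER_WORD


# Memory sizes discussed above: current, first upgrade, fully populated.
for mb in (16, 32, 128):
    print(f"{mb:4d} MB = {mb_to_mw(mb):g} MW (64-bit)")

# Estimates from the text: base machine plus a full 128 MB of memory.
base_price = 8700
full_memory_estimate = 20000
total_estimate = base_price + full_memory_estimate
print(f"Estimated total with 16 MW: ${total_estimate}")
```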

Since Cray is still selling lots of Y/MP's with 32 MW memories, it is
hardly fair to criticize a single-user workstation on that account.

As far as disk storage goes, I have 1.5 GB of disk space on my
graphics workstation, and will soon have a 2.3 GB tape drive.  So
manipulating 500 MB datasets (see below) is entirely practical.

The whole setup:
	Silicon Graphics 4D/25TG
		1.5 GB disk (2x760MB)
		2.3 GB tape
		32 MB RAM
		150 MB tape
	IBM 320 server
		16 MB RAM
		120 MB disk
is under $50,000 at University prices.

>   If you define the dataset to be typical Cray size, say 500MB, the
> workstation becomes impractical. And if you assume non-vectorable very
> large problems the Cray2 has the edge in scalar speed.

How did you decide that 500 MB was a "typical" Cray dataset?  There is
such a large variety of jobs that are run on Crays that defining a
"typical" job seems counter-productive.  There are *many* important
cpu-intensive problems that fit comfortably into machines with 2, 4,
8, or 16 MW of memory.  After all, Cray has only been shipping X/MP
and Y/MP machines with more than 8 MW of memory for about 2 years now.

Concerning the Cray-2 --- if the job *absolutely requires* at least
256 MW of real memory, then there are not many options (though I
believe that the Convex C-240 can be configured with 256 MW at
considerably less cost).  On the other hand, it might be more
cost-effective in the longer term to spend the programmer salary
required to port the application to run out-of-core on a much cheaper
machine. 

>   This is a lot like saying that you want to haul a bag of groceries at
> 100mph, and therefore sports cars are killing trucks. You have a sports
> car problem here, and your solution is cost effective. So? We still need
> trucks.

I said precisely the same thing at the end of my original posting.
However, I disagree that memory size is the primary dividing line
between jobs which require supercomputers and those which do not.
The IBM 320 that I described has only 2 memory board slots.  The
server configurations have more slots and can be configured with up to
512 MB = 64 MW of RAM (depending on the model) using 4 Mbit
technology.  Since most 8-cpu Y/MP's are shipping with memories of
this same size, it hardly seems like a clear distinction.

As other people have pointed out, the choice of a computational
platform is a multivariate constrained optimization problem.  Some of
the constraints are:
	(1) The cost must be within the available budget.
	    This includes the cost of porting the code as well.
	(2) The wall-clock turnaround must be within the limits
	    of the research project.
	(3) Point (2) usually requires sufficient memory to make
	    the problem core-containable.
	(4) Sufficient mass storage space and access speed must be
	    available to save intermediate and permanent results
	    without slowing down the calculation past the constraints
	    of point (2).
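
The four constraints above can be sketched as a simple feasibility
check.  This is only an illustration in Python: the class names, field
names, and every number in the example are hypothetical placeholders,
not measurements or prices for any real machine.  I/O time is assumed
to be folded into the run-time estimate, so that point (4) reduces
here to a storage-capacity test.

```python
from dataclasses import dataclass


@dataclass
class Job:
    budget: float           # dollars available, hardware plus porting
    deadline_hours: float   # acceptable wall-clock turnaround
    working_set_mw: float   # memory needed to stay in core (64-bit MW)
    output_mb: float        # intermediate and permanent results to save


@dataclass
class Platform:
    name: str
    hardware_cost: float
    porting_cost: float
    run_hours: float        # estimated wall-clock time, I/O included
    memory_mw: float
    storage_mb: float


def violations(job, platform):
    """Return the list of constraints (1)-(4) that the platform fails."""
    failed = []
    if platform.hardware_cost + platform.porting_cost > job.budget:
        failed.append("(1) cost exceeds budget")
    if platform.run_hours > job.deadline_hours:
        failed.append("(2) turnaround too slow")
    if platform.memory_mw < job.working_set_mw:
        failed.append("(3) problem is not core-containable")
    if platform.storage_mb < job.output_mb:
        failed.append("(4) not enough mass storage")
    return failed


# Hypothetical example values, chosen only to exercise the checks.
job = Job(budget=50000, deadline_hours=500,
          working_set_mw=1.5, output_mb=500)
small = Platform("small", 13000, 0, 400, 2, 1500)
big = Platform("big", 1000000, 0, 20, 32, 10000)

for p in (small, big):
    print(p.name, violations(job, p) or "feasible")
```

A platform that returns an empty list is only a candidate; choosing
among the candidates is still the optimization part of the problem.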

An anecdote:
  I recently submitted a proposal to the NSF to do some cpu-intensive
studies of the equations governing a theoretical two-dimensional
ocean.  The calculations are estimated to require 200 hours of Cray
Y/MP time.  I don't consider this a trivial expenditure....
With an IBM 320, I would probably be able to finish all of the 
calculations before the proposal even completes the review process!

> bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
> 	    "Stupidity, like virtue, is its own reward" -me
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@vax1.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET