Xref: utzoo comp.sys.super:359 comp.arch:23052 comp.parallel:2604
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!emory!hubcap!fpst
From: baugh%ssd.intel.com@RELAY.CS.NET (Jerry Baugh)
Newsgroups: comp.sys.super,comp.arch,comp.parallel
Subject: Massively Parallel LINPACK on the Intel Touchstone Delta machine
Message-ID: <1991Jun3.130104.15667@hubcap.clemson.edu>
Date: 31 May 91 18:33:54 GMT
Sender: baugh%ssd.intel.com@RELAY.CS.NET
Followup-To: comp.sys.super
Organization: Supercomputer Systems Division, Intel Corp.
Lines: 45
Approved: parallel@hubcap.clemson.edu


The Intel Touchstone Delta machine was unveiled to the public today at the
California Institute of Technology.  The Touchstone Delta machine consists of
528 i860-based computational nodes, connected by a high-speed, two-dimensional
mesh.  This computer is yet another milestone in the cooperatively funded
DARPA / Intel Touchstone project.

The LINPACK benchmark has often been used as one measure of comparison between
machines.  Most recently, a new section of the LINPACK report, entitled
'Massively Parallel Computing', defines the same test (solving a dense system
of linear equations) but allows the problem size to scale with the size of the
machine.  With the unveiling of the Touchstone Delta machine, Intel can now
publish the following double-precision performance numbers for massively
parallel LINPACK:

         np      n   time  GFLOPS  MFLOPS/node
        ---  -----  -----  ------  -----------
        192   2000    5.5   0.971            5
        192   4000   20.8   2.053           11
        192   6000   51.3   2.808           15
        192   8000  102.5   3.331           17
        192  10000  178.2   3.742           19
        192  12000  284.8   4.046           21

        512   2000    4.5   1.187            2
        512   4000   14.2   3.007            6
        512   6000   31.5   4.574            9
        512   8000   58.3   5.867           11
        512  10000   96.8   6.889           13
        512  12000  146.9   7.844           15
        512  14000  213.7   8.561           17

Where np   -> number of processors
      n    -> order of the matrix (full 64-bit precision)
      time -> time in secs
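
For anyone who wants to check the GFLOPS column against n and time, here is a
minimal Python sketch.  It assumes the conventional LINPACK operation count of
2/3 n^3 + 2 n^2 floating-point operations; the post does not say which count
was used, but this one matches the table to within the rounding of the
timings.  The function names are illustrative only.

```python
def linpack_flops(n):
    """Nominal operation count for solving a dense n x n system.

    Assumption: the conventional LINPACK count, 2/3 n^3 + 2 n^2.
    """
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def gflops(n, seconds):
    """Aggregate rate in GFLOPS for a run of order n taking `seconds`."""
    return linpack_flops(n) / seconds / 1e9


def mflops_per_node(n, seconds, nodes):
    """Aggregate rate divided evenly over `nodes` processors, in MFLOPS."""
    return linpack_flops(n) / seconds / 1e6 / nodes


# (np, n, time) rows copied from the table above
rows = [(192, 2000, 5.5), (192, 12000, 284.8), (512, 14000, 213.7)]

for nodes, n, t in rows:
    print(f"{nodes:4d} {n:6d} {t:7.1f} "
          f"{gflops(n, t):7.3f} {mflops_per_node(n, t, nodes):5.0f}")
```

Because the timings are quoted to one decimal place, the last digit of a
recomputed GFLOPS figure can differ slightly from the published one.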

The previous leader in massively parallel LINPACK was Thinking Machines
at 5.2 GFLOPS on a matrix of order 26624 (64K processors).

The current production iPSC/860 has a published performance of 1.92 GFLOPS
on a matrix of order 8600, using 128 i860 processors, each with 8 Mbytes of
local memory.  Conventional 1K LINPACK on one i860 node is published at
25 MFLOPS.
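
As a rough way to put these figures side by side, the following Python sketch
works out the per-node rate and the per-node share of the matrix for the
quoted runs.  It assumes 8-byte double-precision entries, an even split of the
matrix across nodes, and counts only the matrix itself (no workspace or code).
The helper names are hypothetical.  Note that the 25 MFLOPS single-node figure
is for a much smaller problem, so it is only a loose reference point.

```python
BYTES_PER_DOUBLE = 8  # assumption: 64-bit entries, as stated for the runs


def matrix_bytes_per_node(n, nodes):
    """Bytes of an n x n double-precision matrix held by each node,
    assuming the matrix is divided evenly."""
    return n * n * BYTES_PER_DOUBLE / nodes


def per_node_mflops_from_gflops(gflops, nodes):
    """Convert an aggregate GFLOPS figure to MFLOPS per node."""
    return gflops * 1000.0 / nodes


# (label, aggregate GFLOPS, matrix order, nodes) as quoted in the post
runs = [
    ("iPSC/860", 1.92, 8600, 128),
    ("Delta", 8.561, 14000, 512),
]

for label, rate, n, nodes in runs:
    mflops = per_node_mflops_from_gflops(rate, nodes)
    mbytes = matrix_bytes_per_node(n, nodes) / 1e6
    print(f"{label:9s} {mflops:5.1f} MFLOPS/node, "
          f"{mbytes:5.2f} MB of matrix per node")
```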


Jerry Baugh
Intel Supercomputer Systems Division