From: baugh%ssd.intel.com@RELAY.CS.NET (Jerry Baugh)
Newsgroups: comp.sys.super,comp.arch,comp.parallel
Subject: Massively Parallel LINPACK on the Intel Touchstone Delta machine
Message-ID: <1991Jun3.130104.15667@hubcap.clemson.edu>
Date: 31 May 91 18:33:54 GMT
Followup-To: comp.sys.super
Organization: Supercomputer Systems Division, Intel Corp.
Approved: parallel@hubcap.clemson.edu

The Intel Touchstone Delta machine was unveiled to the public today at
the California Institute of Technology. The Touchstone Delta machine
consists of 528 i860-based computational nodes connected by a high-speed,
two-dimensional mesh. This computer is another milestone in the
cooperatively funded DARPA / Intel Touchstone project.

The LINPACK benchmark has often been used as one measure of comparison
between machines. Most recently, a new section of the report, entitled
'Massively Parallel Computing', defines the same test (solving a dense
set of linear equations) but allows the problem size to scale with the
size of the machine.

With the unveiling of the Touchstone Delta machine, Intel can now publish
the following double-precision performance numbers for massively parallel
LINPACK:

 np       n    time   GFLOPS   MFLOPS/node
---   -----   -----   ------   -----------
192    2000     5.5    0.971        5
192    4000    20.8    2.053       11
192    6000    51.3    2.808       15
192    8000   102.5    3.331       17
192   10000   178.2    3.742       19
192   12000   284.8    4.046       21

512    2000     4.5    1.187        2
512    4000    14.2    3.007        6
512    6000    31.5    4.574        9
512    8000    58.3    5.867       11
512   10000    96.8    6.889       13
512   12000   146.9    7.844       15
512   14000   213.7    8.561       17

Where
    np   -> number of processors
    n    -> order of the matrix (full 64-bit precision)
    time -> time in seconds

The previous leader in massively parallel LINPACK was Thinking Machines,
at 5.2 GFLOPS on a matrix of order 26624 (64K processors). The current
production iPSC/860 has a published performance of 1.92 GFLOPS on a matrix
of order 8600, using 128 i860 processors, each with 8 Mbytes of local
memory. Conventional 1K LINPACK on one i860 node is published at
25 MFLOPS.

Jerry Baugh
Intel Supercomputer Systems Division
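
For anyone who wants to check the GFLOPS and MFLOPS/node columns in the
table above, here is a minimal Python sketch. It assumes the customary
LINPACK operation count of 2/3 n^3 + 2 n^2 floating-point operations for
a system of order n; the post itself does not state which count was used,
so treat that as an assumption. The function names are made up for
illustration.

```python
def linpack_flops(n):
    """Assumed operation count for solving a dense system of order n."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def gflops(n, seconds):
    """Aggregate rate in GFLOPS for order n solved in the given time."""
    return linpack_flops(n) / seconds / 1e9


def mflops_per_node(n, seconds, nodes):
    """Per-node rate in MFLOPS: aggregate rate divided by node count."""
    return gflops(n, seconds) * 1000.0 / nodes


if __name__ == "__main__":
    # (np, n, time in seconds) taken from two rows of the table.
    for nodes, n, seconds in [(192, 2000, 5.5), (512, 14000, 213.7)]:
        print(nodes, n, seconds,
              round(gflops(n, seconds), 3),
              round(mflops_per_node(n, seconds, nodes)))
```

Because the times in the table are rounded to a tenth of a second, the
recomputed rates can differ from the published ones in the last digit.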