Xref: utzoo comp.sys.super:382 comp.arch:23157 comp.parallel:2638
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!malgudi!caen!uflorida!gatech!hubcap!fpst
From: prins@cs.unc.edu (Jan Prins)
Newsgroups: comp.sys.super,comp.arch,comp.parallel
Subject: Re: Massively Parallel LINPACK on the Intel Touchstone Delta machine
Message-ID: <1991Jun8.200043.15944@hubcap.clemson.edu>
Date: 7 Jun 91 17:06:16 GMT
References: <1991Jun3.233741.8570@elroy.jpl.nasa.gov> <13301@pt.cs.cmu.edu> <1991Jun6.174129.25202@hubcap.clemson.edu>
Sender: news@cs.unc.edu
Followup-To: comp.sys.super
Organization: UNC-Chapel Hill Computer Science
Lines: 20
Approved: parallel@hubcap.clemson.edu

In article <1991Jun6.174129.25202@hubcap.clemson.edu>, dodson@convex.COM (Dave Dodson) writes:
> > "FLOPS is defined as (2/3 N^3 + 2 N^2) / elapsed-time",
>
> What is interesting about this is that there are algorithms based on
> "fast" matrix multiplication, where the product of two K by K matrices
> can be formed with fewer than O(K^3) floating point operations.  If you
> use one of these fast algorithms, you may do significantly fewer than
> (2/3 N^3 + 2 N^2) floating point operations, but you get credit for
> (2/3 N^3 + 2 N^2) operations anyway. [...]

In particular, using this month's asymptotically fastest solver, your
lowly workstation can beat the LINPACK performance of *any* machine
whose performance was obtained through use of the standard algorithm,
given enough time and space.

Of course, it's not clear whether you are allowed to report a record
performance 10,000 years before it completes.

--\--  Jan Prins  (prins@cs.unc.edu)
  /    Computer Science Dept.
--\--  UNC Chapel Hill
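
The arithmetic behind this point can be sketched roughly as follows. This
is only an illustration under stated assumptions, not a model of any real
solver: the cost model constant * N**exponent, the default exponent
log2(7) (the one from Strassen's matrix multiplication), the constant of
1.0, and the function names are all hypothetical choices made here. The
only thing taken from the thread is the credited operation count
(2/3 N^3 + 2 N^2).

```python
import math


def nominal_flops(n):
    """Operation count credited by the LINPACK rule quoted in the thread."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def fast_ops(n, exponent=math.log2(7), constant=1.0):
    """Assumed cost model for a 'fast' solver: constant * n**exponent.

    Both the exponent and the constant are illustrative assumptions.
    """
    return constant * n**exponent


def reported_rate(n, actual_ops, machine_rate):
    """Rate the benchmark reports: credited operations / elapsed time.

    actual_ops   -- operations the solver really performs
    machine_rate -- operations per second the machine really sustains
    """
    elapsed = actual_ops / machine_rate
    return nominal_flops(n) / elapsed


def crossover_n(slow_rate, target_rate, exponent=math.log2(7), constant=1.0):
    """Smallest power-of-two N at which the slow machine, running the
    assumed fast solver, reports a rate above target_rate."""
    n = 2
    while reported_rate(n, fast_ops(n, exponent, constant), slow_rate) <= target_rate:
        n *= 2
    return n


if __name__ == "__main__":
    slow, fast = 1e6, 1e9  # hypothetical machine rates, operations per second
    n = crossover_n(slow, fast)
    seconds = fast_ops(n) / slow
    print("crossover N:", n)
    print("elapsed years on the slow machine:", seconds / (365.25 * 24 * 3600))
```

With the standard algorithm the reported rate is just the machine's real
rate. With the assumed fast solver the reported rate is inflated by a
factor that grows like N**(3 - exponent), so it eventually passes any
fixed target; the elapsed time at the crossover is what makes the record
impractical to claim.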