Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!linus!alliant!jgreen
From: jgreen@Alliant.COM (John C Green Jr)
Newsgroups: comp.arch
Subject: Re: Anything wrong with the i860
Summary: i860 "sweet spot" enlarged by multi-level cache
Message-ID: <4690@alliant.Alliant.COM>
Date: 20 May 91 18:32:01 GMT
References: <848@llnl.LLNL.GOV> <1991May16.221437.10751@rice.edu>
Organization: Alliant Computer Systems, Littleton, MA
Lines: 51

Eugene D Brooks III and Preston Briggs have mentioned the size of the
i860 "sweet spot". Alliant builds the FX/2800, an air cooled
supercomputer made from multiple i860s with global shared memory, and
has extensive experience in looking for, finding, and expanding the
"sweet spot."

RISC microprocessors, including the i860, need lots of bandwidth.
Alliant believes the way to deliver this bandwidth to the i860 is a
multiple level cache:

Location     Size                           Bandwidth Per Processor
===========  =============================  =======================
On chip      4 KB Instruction+8 KB Data     960 MB/sec
On board     256 KB                         128 MB/sec
Global       16 MB                          80 MB/sec
Main Memory  4096 MB                        40 MB/sec

If your program does enough data reuse to make most of its memory
references out of cache then it will run well.

The i860 "sweet spot" may be small, but the Alliant caches enlarge it
to include:

Advertised speed of light                2240 MFLOPS SP
Advertised speed of light                1120 MFLOPS DP
SPECthru                                  313
Linpack 100x100                            31 MFLOPS DP
Linpack 1000x1000                         325 MFLOPS DP
Convolution (500 filter x 50,000 data)   2150 MFLOPS SP
2-D FFT (Complex 1K)                      420 MFLOPS SP
Matrix Multiply 1000x1000 (DGEMM)         985 MFLOPS DP (DP REAL matrix)
Matrix Multiply 1000x1000 (ZGEMM)        1018 MFLOPS DP (DP COMPLEX matrix)

To put this in perspective:

* As of Dongarra's 3/19/91 Linpack report the 100x100 result of
  31 MFLOPS was the fastest for any air cooled system available. The
  next fastest was an NEC SX-1E eking out 32 MFLOPS for $Millions more.

* Also as of 3/19/91 the Linpack 1000x1000 result of 325 MFLOPS was the
  fastest for any air cooled system available.
  The next fastest are the late ETA10-E at 334 MFLOPS and the
  Cray-2/4-256 at 360 MFLOPS.

* Getting more realistic in price: on May 7 Convex announced the
  $8 Million air cooled GaAs C3800. This machine's advertised speed of
  light is 960 MFLOPS DP and 1920 MFLOPS SP. In important scientific
  library routines the Alliant i860 killer micro delivers 1018 and
  2150 MFLOPS, i.e. greater than the C3800 speed of light for about 1/5
  the price.

* The Alliant FX/2800 was announced in Jan 1990 and shipped in Mar 1990.
  The FX/800 version starts at $189K.

* As Eugene Brooks has said: "Nothing will survive the attack of the
  killer micros."
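
For anyone who wants to play with the cache numbers above, here is a
rough sketch of how data reuse translates into delivered bandwidth. It
is not an Alliant model: it assumes each byte is served by exactly one
level at that level's per-processor bandwidth, with no overlap between
levels, so the effective rate is a weighted harmonic mean. The names
LEVELS and effective_bandwidth are made up for illustration; only the
MB/sec figures come from the table.

```python
# Per-processor bandwidth in MB/sec, copied from the table in the post.
LEVELS = {
    "on_chip": 960.0,
    "on_board": 128.0,
    "global": 80.0,
    "main_memory": 40.0,
}


def effective_bandwidth(fractions):
    """Estimate delivered MB/sec for a given mix of memory references.

    `fractions` maps a level name to the fraction of bytes served from
    that level. ASSUMPTION: transfers from different levels do not
    overlap, so the time per MB is the sum of fraction / bandwidth.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    time_per_mb = sum(frac / LEVELS[name] for name, frac in fractions.items())
    return 1.0 / time_per_mb


if __name__ == "__main__":
    # Hypothetical reference mixes, from heavy reuse to none at all.
    mixes = [
        {"on_chip": 1.0},
        {"on_chip": 0.9, "on_board": 0.1},
        {"on_chip": 0.5, "main_memory": 0.5},
        {"main_memory": 1.0},
    ]
    for mix in mixes:
        print(mix, round(effective_bandwidth(mix), 1), "MB/sec")
```

Under this simple model a program that takes half its bytes from main
memory sees 76.8 MB/sec even if the other half hits on chip, which is
the sense in which the "sweet spot" depends on data reuse.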