Path: utzoo!mnetor!uunet!mfci!root
From: root@mfci.UUCP (SuperUser)
Newsgroups: comp.arch
Subject: Re: Performance increase - a suggestion
Message-ID: <230@m2.mfci.UUCP>
Date: 2 Feb 88 13:58:39 GMT
References: <235@unicom.UUCP> <28200089@ccvaxa> <3127@phri.UUCP>
Reply-To: colwell@m6.UUCP (Robert Colwell)
Organization: Multiflow Computer Inc., Branford, CT. 06405
Lines: 53

In article <3127@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>In article <28200089@ccvaxa> aglew@ccvaxa.UUCP writes:
>> I wonder how much interest might be out there for a true double-precision
>> floating point engine - one that did 64 bit floating point, or IEEE 80
>> bit extended floating point, or even 128 bit floating point, as its
>> native floating point mode, as fast as single precision on nearly any
>> other machine in its price range?
>
>	I don't know about price range, but doesn't the FPS-164 do
>exclusively 64-bit floating point?
>
>	Actually, I wonder if a RISC-FPP would make sense.  Consider a
>machine like a Vax.  Let's say we got rid of all the multiple precision
>floating point formats and instructions (at last count, F, D, G, and H; did
>I miss any?) and made all floating point math a single high-precision
>format.  Clearly that would save silicon.  If we took that silicon real
>estate and devoted it to making that single floating point format work
>faster, could we build a machine which only has double precision, but does
>it as fast as the old vax did single precision?
>
>	Given that we could do double floating math as fast as single
>floating math, the only advantage single would have left would be saving
>memory on large arrays.  Maybe we'd have to keep {load,store}
>{single,double} instructions around for that.
>--
>Roy Smith, {allegra,cmcl2,philabs}!phri!roy
>System Administrator, Public Health Research Institute
>455 First Avenue, New York, NY 10016

I don't believe you can do double precision math as fast as single
precision math if both are implemented in the same technology.
If we're including division and square roots, derived via a low-radix
iterative procedure, it's certainly not true, since you get only one or
two bits of mantissa per trip through the ALU.  In that case the time to
solution is proportional to the number of bits of result you want.

If we're only discussing addition, subtraction, and multiplication, then I
still don't believe it.  There's an adder at the heart of each of those,
and its width decides its speed -- the wider, the slower (more levels of
carry-lookahead).  If your choice is between making one engine to do both
single and double precision (or dbl and quad), or making only dbl (quad),
then I think the engine that has less to do can be made slightly faster.

My personal perception of the market for scientific computation is that
given a choice between more precision and more speed, speed wins hands
down.

Bob Colwell
Multiflow Computer
175 N. Main St.
Branford CT 06405
mfci!colwell@uunet.uucp
203-488-6090
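
A rough back-of-the-envelope sketch of the two scaling arguments above:
trips through the ALU for an iterative divide, and levels of
carry-lookahead for an adder of a given width.  Everything named here is an
assumption made for illustration and is not from the post: the significand
widths (24, 53, and 113 bits, the IEEE 754 single, double, and quad values
including the hidden bit), the 4-bit lookahead group size, and the helper
names division_iterations and lookahead_levels.  It is a counting model
only, not a timing model of any real machine.

```python
import math

# Assumed significand widths in bits, hidden bit included (IEEE 754 values).
SINGLE_BITS = 24
DOUBLE_BITS = 53
QUAD_BITS = 113


def division_iterations(result_bits, bits_per_trip):
    """Trips through the ALU needed to retire result_bits bits of quotient
    when each trip produces bits_per_trip bits (hypothetical helper)."""
    return math.ceil(result_bits / bits_per_trip)


def lookahead_levels(adder_width, group_size=4):
    """Levels of carry-lookahead needed to span adder_width bits when each
    level combines group_size blocks from the level below (hypothetical
    helper; the 4-bit group size is an assumption)."""
    levels = 0
    span = 1
    while span < adder_width:
        span *= group_size
        levels += 1
    return levels


if __name__ == "__main__":
    for name, bits in (("single", SINGLE_BITS),
                       ("double", DOUBLE_BITS),
                       ("quad", QUAD_BITS)):
        print(name,
              "| trips at 1 bit/trip:", division_iterations(bits, 1),
              "| trips at 2 bits/trip:", division_iterations(bits, 2),
              "| lookahead levels:", lookahead_levels(bits))
```

Under these assumptions a double-precision divide takes 53 trips against
24 for single at one bit per trip (27 against 12 at two bits per trip), so
the iteration count tracks the result width directly.  The adder depth in
this simple model grows only logarithmically and therefore in steps: with
4-bit groups the 24-bit and 53-bit adders both come out at three levels,
and the 113-bit adder at four.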