Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!ames!vsi1!wyse!mips!mark
From: mark@mips.COM (Mark G. Johnson)
Newsgroups: comp.arch
Subject: re: Fast FP addition
Keywords: Standard Un*x H/W architecture
Message-ID: <2626@quacky.mips.COM>
Date: 19 Jul 88 20:37:14 GMT
Reply-To: mark@mips.COM (Mark G. Johnson)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 25

In article <12005@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes
$ Finally, I understand that handling the IEEE gradual underflow
$ behavior can add an extra cycle of latency. I also have
$ observed that the MIPS R2010 FPA (and maybe the new R3010 also)
$ can do a floating add in 2 (!) clock cycles. How did they do
$ that?

Yes, the R3010 also does IEEE 64-bit FP adds in 2 cycles.
Architecture, logic design, and circuit design of the R3010 are
covered in a recent article in IEEE Micro:

    C. Rowen et al., "The MIPS R3010 Floating-Point Coprocessor",
    _IEEE Micro_, Vol. 8 No. 3, June 1988, pp. 53-62.

The R3010's FP add unit's speed comes from 3 areas:
  (i)   a hardware "add algorithm" that's optimized toward
        implementation in full-custom CMOS;
  (ii)  an innovative logic partitioning / design that implements
        the algorithm;
  (iii) highly polished CMOS layout and circuit design.

See the acknowledgments at the end of the paper if you're interested
in knowing who was responsible for each of these areas.
--
-- Mark Johnson
MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
...!decwrl!mips!mark (408) 991-0208