Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!snorkelwacker!paperboy!USER
From: USER@osf.org (Michael Meissner)
Newsgroups: comp.arch
Subject: Re: Integer multiply and killer micros
Message-ID: <USER.90Jan10090700@pmax27.osf.org>
Date: 10 Jan 90 14:07:00 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV>
	<5842@ncar.ucar.edu> <490@qusunl.queensu.CA>
	<sZcrr1G00hMNI3EF5i@cs.cmu.edu> <34259@mips.mips.COM>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 29
In-reply-to: mark@mips.COM's message of 9 Jan 90 00:50:16 GMT

In article <34259@mips.mips.COM> mark@mips.COM (Mark G. Johnson) writes:

	...

|   The idea above proposes to use 80 million bits of RAM and 20 clock cycles
|   to compute a 32b integer multiply.  This is noncompetitive when compared
|   to killer micros, which multiply more quickly and consume far less real
|   estate.  Instead of lookup tables they implement dedicated hardware:
| 
|        R6000:     16 cycles      32x32 -> 64
|        R3000:     12 cycles      32x32 -> 64
|        M88000:     4 cycles      32x32 -> 32  **
| 
|   **88k computes the 32 lsb's of the 64b product (upper bits are discarded).

Actually, if I remember chapter 7 of the 88100 user's manual correctly,
a multiply takes 6 cycles (1 in FP1, 3 in the multiplier stage, 1 in
FPLAST, and 1 for writeback).  Logically, the result should be
available to be fed forward during the writeback phase, which would
shave off 1 cycle.  However, since none of the floating point
operations do feed forwarding, I wouldn't be surprised if integer
multiply/divide don't feed forward either.  As alluded to in an
earlier article, multiple multiplications can be in flight at once,
since the multiplier advances its pipeline every cycle.  Floating
point adds can be pipelined similarly.
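
To put rough numbers on the latency versus pipelining point, here is a
small sketch.  The stage counts are the ones recalled above, not checked
against the manual, and the model assumes one independent multiply can
issue every cycle with no stalls.  The function names and the
feed_forward switch are made up for illustration; as noted, the part may
not feed forward at all.

```python
# Stage latencies as recalled in the post (88100 user's manual, chapter 7).
# These are assumptions for illustration, not verified figures.
STAGES = {"FP1": 1, "multiplier": 3, "FPLAST": 1, "writeback": 1}


def multiply_latency(feed_forward=False):
    """Cycles until the result of a single multiply is usable."""
    total = sum(STAGES.values())
    # Hypothetical case: if the result were fed forward, a dependent
    # instruction would not have to wait for the writeback cycle.
    if feed_forward:
        return total - STAGES["writeback"]
    return total


def pipelined_cycles(n, feed_forward=False):
    """Cycles for n independent multiplies, one issued per cycle."""
    if n <= 0:
        return 0
    # The first result takes the full latency; each later multiply
    # finishes one cycle after the one ahead of it.
    return multiply_latency(feed_forward) + (n - 1)


def serial_cycles(n, feed_forward=False):
    """Cycles for n multiplies where each waits for the previous result."""
    return max(n, 0) * multiply_latency(feed_forward)
```

Under these assumptions, four independent multiplies would finish in
pipelined_cycles(4) cycles, while a chain of four dependent multiplies
would need serial_cycles(4), which is why the single-multiply cycle
count alone understates what the pipeline can do.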
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA

Catproof is an oxymoron, Childproof is nearly so