Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!snorkelwacker!paperboy!USER
From: USER@osf.org (Michael Meissner)
Newsgroups: comp.arch
Subject: Re: Integer multiply and killer micros
Message-ID:
Date: 10 Jan 90 14:07:00 GMT
References: <158@csinc.UUCP> <787@stat.fsu.edu> <42701@lll-winken.LLNL.GOV> <5842@ncar.ucar.edu> <490@qusunl.queensu.CA> <34259@mips.mips.COM>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 29
In-reply-to: mark@mips.COM's message of 9 Jan 90 00:50:16 GMT

In article <34259@mips.mips.COM> mark@mips.COM (Mark G. Johnson) writes:

...
| The idea above proposes to use 80 million bits of RAM and 20 clock cycles
| to compute a 32b integer multiply.  This is noncompetitive when compared
| to killer micros, which multiply more quickly and consume far less real
| estate.  Instead of lookup tables they implement dedicated hardware:
|
|	R6000:   16 cycles   32x32 -> 64
|	R3000:   12 cycles   32x32 -> 64
|	M88000:   4 cycles   32x32 -> 32  **
|
| **88k computes the 32 lsb's of the 64b product (upper bits are discarded).

Actually, if I remember chapter 7 of the 88100 user's manual correctly,
a multiply takes 6 cycles (1 in FP1, 3 in the multiplier stage, 1 in
FPLAST, and 1 in writeback).  Logically, the result should be available
to be fed forward during the writeback phase, which would shave off
1 cycle.  However, since none of the floating point operations do feed
forwarding, I wouldn't be surprised if integer multiply/divide don't
feed forward either.

As alluded to in an earlier article, multiple multiplications can be
done in parallel, since the multiplier advances the pipeline every
cycle.  Floating point adds can similarly be pipelined.
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA

Catproof is an oxymoron, Childproof is nearly so
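
To put rough numbers on the latency and pipelining points in the post, here is a small sketch of the cycle arithmetic. The stage counts are the ones recalled above from memory, not checked against the manual. The function names, the one-issue-per-cycle assumption, and the idea that feed-forwarding would hide exactly the writeback cycle are illustrative assumptions, not a description of the real 88100.

```python
# Stage latencies in cycles, as recalled in the post (unverified).
STAGES = {"FP1": 1, "multiplier": 3, "FPLAST": 1, "writeback": 1}


def latency(stages=STAGES, feed_forward=False):
    """Cycles before one multiply's result can be used by a dependent op."""
    total = sum(stages.values())
    # Assumption: feed-forwarding would hand the result over during
    # writeback, hiding that stage from a dependent instruction.
    if feed_forward:
        return total - stages["writeback"]
    return total


def pipelined_cycles(n, stages=STAGES):
    """Cycles for n independent multiplies, assuming one issue per cycle
    and no stalls (hypothetical best case)."""
    if n <= 0:
        return 0
    return latency(stages) + (n - 1)


def serial_cycles(n, stages=STAGES):
    """Cycles for n multiplies if each waits for the previous to finish."""
    return n * latency(stages)


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, serial_cycles(n), pipelined_cycles(n))
```

Under these assumptions a single multiply still costs the full latency, but a run of independent multiplies approaches one result per cycle, which is the sense in which the pipelined multiplier competes with the cycle counts quoted for the other chips.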