Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!coherent!NeXT!chansen
From: chansen@NeXT.UUCP (Craig Hansen)
Newsgroups: comp.arch
Subject: Re: Double Width Integer Multiplication and Division
Summary: The Real Reason Mips has no double length divide
Message-ID: <4016@bauhaus.NeXT.UUCP>
Date: 7 Jul 89 17:41:10 GMT
References: <1046@aber-cs.UUCP> <1380@l.cc.purdue.edu> <13943@haddock.ima.isc.com>
Organization: NeXT Inc., Palo Alto
Lines: 60

There's been some discussion of why double-length multiply and divide were or were not included on various machines. However, the real reasons why RISC machines don't like double-length multiply and divide haven't been hit. I can speak most authoritatively on the Mips RISC processor.

Most operations on a RISC processor can be expressed as a function of the contents of two general registers, yielding a single result. Double-length multiply (two sources; two results) and divide (three sources; two results) both violate this generalization, which makes them more expensive to implement. A Nice Clean Architecture would just add register specifiers and read and write ports until there were enough, but that's not the way of RISC.

The Mips R2000 uses two special-purpose registers, each of which holds half of the double-length result of a multiply, or the quotient and remainder of a divide. These registers are very carefully handled in the implementation to permit them to be written into immediately on starting up the operation, in order to avoid additional bypass and staging latches - they're written into two cycles earlier than the general registers. This is the reason why operations that modify these registers must not occur within two cycles after an instruction that reads them: if they are any closer, an interrupt or exception may require restarting the instruction stream at the special-register read, even though the register has already been modified by a following instruction that was then aborted.

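
To make the operand counts above concrete, here is a minimal sketch in C. It is not R2000 code, only a model of the data flow: the names hilo_t, mult_unsigned, div_unsigned and div_double_unsigned are made up for illustration, and the sketch assumes unsigned 32-bit words. The quotient-in-lo, remainder-in-hi layout follows the MIPS convention for its two special registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two special-purpose result registers. */
typedef struct {
    uint32_t hi;  /* upper half of a product, or the remainder */
    uint32_t lo;  /* lower half of a product, or the quotient  */
} hilo_t;

/* Double-length multiply: two source words, two result words. */
hilo_t mult_unsigned(uint32_t rs, uint32_t rt)
{
    uint64_t product = (uint64_t)rs * (uint64_t)rt;
    hilo_t r;
    r.hi = (uint32_t)(product >> 32);
    r.lo = (uint32_t)(product & 0xFFFFFFFFu);
    return r;
}

/* Single-length divide: two source words, two result words. */
hilo_t div_unsigned(uint32_t rs, uint32_t rt)
{
    hilo_t r;
    assert(rt != 0);            /* caller must rule out divide by zero */
    r.lo = rs / rt;             /* quotient  */
    r.hi = rs % rt;             /* remainder */
    return r;
}

/* Double-length divide: THREE source words (n_hi, n_lo, d), two results.
 * This is the operation that would need a third register read port. */
hilo_t div_double_unsigned(uint32_t n_hi, uint32_t n_lo, uint32_t d)
{
    uint64_t n = ((uint64_t)n_hi << 32) | n_lo;
    hilo_t r;
    assert(d != 0 && n_hi < d); /* otherwise the quotient overflows 32 bits */
    r.lo = (uint32_t)(n / d);
    r.hi = (uint32_t)(n % d);
    return r;
}
```

The first two functions take two words in, matching the two register read ports; only div_double_unsigned needs a third input, which is the whole problem.
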
For double-length divides, you'd need three words of source operand, and the R2000 has only two general register read ports. Yes, there's room in the instruction encoding for another register specifier, but there's no hardware to get a third value from the register file. Because the special-purpose registers are not bypassed, it would not be possible to use one of them to hold the third value; if an interrupt or exception required restarting a double-length divide, that third value would be corrupted.

So that's why there's no double-length divide: although the divider itself is intrinsically able to perform the operation, it would have cost more latches and multiplexors to get the data into the unit. Latches and multiplexors make up a surprising amount of the cost of a RISC processor; one works very hard to minimize the number of them.

There are other reasons not to bother: a double-length divide can overflow in difficult-to-detect ways; a single-length divide can also overflow, but it's easy to detect divide by zero and divide of MININT by -1 in software while the divide is running in parallel. Of course, as has been mentioned before, there's no expression of this operation in C, so a C compiler won't generate it.

Finally, it should be noted that an integer divide is actually more complex (time x space-wise) than a floating-point divide: there are fast redundant-representation techniques that only work for normalized numbers. You'd probably find that a multiple-precision divide can be implemented faster using floating-point arithmetic than fixed-point on most RISC machines.

Regards,
Craig Hansen
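
As a footnote to the overflow point above, here is a minimal sketch in C of the two software checks for a single-length signed divide. The names div_status_t and checked_div32 are hypothetical. On the hardware described, the checks can overlap with the divide itself; in portable C they have to come first, since dividing by zero or dividing INT32_MIN by -1 is undefined behavior.

```c
#include <stdint.h>

typedef enum { DIV_OK, DIV_BY_ZERO, DIV_OVERFLOW } div_status_t;

/* Single-length signed divide with the two overflow cases checked
 * in software. Results are written only when the status is DIV_OK. */
div_status_t checked_div32(int32_t n, int32_t d, int32_t *quot, int32_t *rem)
{
    if (d == 0)
        return DIV_BY_ZERO;            /* no quotient exists */
    if (n == INT32_MIN && d == -1)
        return DIV_OVERFLOW;           /* +2^31 does not fit in 32 bits */
    *quot = n / d;                     /* truncates toward zero (C99) */
    *rem  = n % d;                     /* takes the sign of the dividend */
    return DIV_OK;
}
```

Both tests are a compare or two against constants, which is why they are cheap to do in software.
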