Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!snorkelwacker!mit-eddie!rutgers!att!cbnewsi!reha
From: reha@cbnewsi.ATT.COM (reha.gur)
Newsgroups: comp.arch
Subject: Re: Integer Multiply/Divide on Sparc
Summary: real world numbers
Message-ID: <1535@cbnewsi.ATT.COM>
Date: 27 Dec 89 18:42:24 GMT
References: <84768@linus.UUCP> <8840004@hpfcso.HP.COM> <1804@l.cc.purdue.edu>
Organization: AT&T Bell Laboratories
Lines: 37

In article <1804@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> It is clear that you are not to be trusted (see above). To multiply
> two 32 bit numbers to get a 64 bit product on a 32x32 -> 32 machine,
> the 32 bit numbers must be divided into 16 bit parts. The whole operation
> takes about 20 operations (count them). Shift and add are far slower.
> Divide is even worse. Also, there is considerable overhead in a
> subroutine call; there are registers to save and restore. Open
> subroutines (in-line functions) are a way around it, but they still
> have the problem.
>
> Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
> Phone: (317)494-6054
> hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

The numbers I get (from looking at the data sheets and other info)
for two machines, a 25 MHz i486 and a 25 MHz SPARC, are as below.

Assuming no cache hits and various other items:

i486:   18-31 cycles for signed 32 x 32 bit multiplication (reg to reg)
SPARC:  48-52 cycles for same (including subroutine call and return time)

i486:   32 cycles for signed 32 bit division (acc by reg)
SPARC:  41 (approximate best case) to 211 (approximate worst case)
        (depends on bits in dividend and divisor)

The numbers above are approximate and results may vary.

The SPARC subroutine call does not need any registers saved across
the call. The code for multiplication is as given in the SPARC
architecture manual.

Also note that some SPARC machines do have (or might have) integer
mul and divide in hardware.

reha gur
attunix!reha
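
For reference, the 16-bit splitting described in the quoted article can be
sketched in C roughly as follows. This is only an illustration of the
technique, not the routine from the SPARC architecture manual and not code
from either poster. The function name mul32x32_64 is made up for this
example, and the sketch handles unsigned operands only, whereas the cycle
counts above are for signed multiplication.

```c
#include <stdint.h>

/*
 * Hypothetical helper (name invented for this sketch): form the full
 * 64-bit product of two unsigned 32-bit values using only 32-bit
 * arithmetic, as on a machine whose multiply is 32x32 -> 32.
 * The result is returned as two 32-bit halves, *hi and *lo.
 */
static void mul32x32_64(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    /* Split each operand into 16-bit halves. */
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four 16x16 partial products; each fits in 32 bits. */
    uint32_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi;   /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo;   /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi;   /* weight 2^32 */

    /*
     * Sum everything that lands in bits 16..31 of the result.
     * Three 16-bit quantities cannot overflow 32 bits; the part
     * above bit 15 of this sum is the carry into the high word.
     */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}
```

Counting the masks, shifts, multiplies, adds and the final OR, this sketch
comes to roughly 22 C-level operations, which is in line with the "about 20
operations" figure in the quoted text. How that maps to cycles on a given
machine depends on the compiler and the hardware.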