Path: utzoo!attcan!uunet!lll-winken!ames!purdue!decwrl!nestvx.dec.com!neideck
From: neideck@nestvx.dec.com (Burkhard Neidecker-Lutz)
Newsgroups: comp.arch
Subject: Re: Questions on SparcStation 1 performance
Message-ID: <8905120713.AA10368@decwrl.dec.com>
Date: 12 May 89 07:13:11 GMT
Organization: Digital Equipment Corporation
Lines: 33

No Sparc implementation has any hardware support for integer divide, and
integer multiply gets only limited support in the form of the MULS multiply
step instruction. There should never be any support beyond this, as this
partitioning is part of the Sparc ARCHITECTURE.

For multiply, the support is really sufficient, as most multiplications can
be caught at compile time (read a paper about HP Precision Architecture and
their measurements), and the speed of a software routine for the other cases,
with support from the MULS instruction, is quite good.

Quoted from the Cypress CY7C601 Sparc Architecture manual:

    ... Code for general mul subroutine ...

    This code has an optimization built in for short (less than 13-bit)
    multiplies. Short multiplies require 26 or 27 instruction cycles,
    long ones require 47 to 51 instruction cycles. For two positive
    numbers (the most common case), the cycle count is 47.

They go on to claim 25 cycles (short) and 46 to 48 cycles (long) for
unsigned multiplication.

The integer divide code is much harder to understand, and they only give an
estimate for the most common case, the worst case being up to 4 times slower
(under strange circumstances). The formula is

    CEIL(log2(dividend/divisor)/3) x (21.5) + some setup cycles

which translates into something from 30 to 260 cycles if I understand the
formula correctly (which I probably don't).

		Burkhard Neidecker-Lutz, Digital CEC Karlsruhe, Project NESTOR
		neideck@nestvx.dec.com

PS: How come John Mashey always answers my questions in my "notes read
    delay slot", so that I look plain stupid when my question appears AFTER
    his answer... :-)
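
As a rough illustration of the divide estimate quoted above, here is a small Python sketch that just evaluates the formula. It is not the Cypress routine and not a cycle-accurate model. The function name `estimated_divide_cycles` and the `setup_cycles` parameter are made up for this example; the manual excerpt only says "some setup cycles", so the value is left to the caller and defaults to 0. Clamping the step count at zero when the dividend is smaller than the divisor is also an assumption of this sketch.

```python
import math

# Cycles per step, taken from the formula quoted in the post.
CYCLES_PER_STEP = 21.5


def estimated_divide_cycles(dividend, divisor, setup_cycles=0.0):
    """Evaluate CEIL(log2(dividend/divisor)/3) x 21.5 + setup_cycles.

    setup_cycles is a hypothetical parameter: the post does not say
    how many setup cycles the real routine needs.
    """
    if dividend <= 0 or divisor <= 0:
        raise ValueError("this sketch only covers positive operands")
    steps = math.ceil(math.log2(dividend / divisor) / 3)
    # Assumption: a quotient below 1 needs no steps at all.
    steps = max(0, steps)
    return steps * CYCLES_PER_STEP + setup_cycles


if __name__ == "__main__":
    for dividend, divisor in [(8, 1), (1000, 10), (2**32 - 1, 1)]:
        cycles = estimated_divide_cycles(dividend, divisor)
        print(dividend, "/", divisor, "->", cycles, "cycles + setup")
```

For the largest 32-bit dividend and a divisor of 1 the formula gives 11 steps, i.e. 236.5 cycles before setup, so if this reading is right, the rest of the 260-cycle upper figure would have to come from the setup cycles.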