Xref: utzoo sci.math:9098 comp.arch:12898 comp.lang.c:24765 comp.sources.wanted:9957
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!samsung!usc!zaphod.mps.ohio-state.edu!mips!prls!pyramid!weitek!practic!vlsisj!davidc
From: davidc@vlsisj.VLSI.COM (David Chapman)
Newsgroups: sci.math,comp.arch,comp.lang.c,comp.sources.wanted
Subject: Re: Integer Multiply/Divide on Sparc
Summary: it's not hard...
Message-ID: <15418@vlsisj.VLSI.COM>
Date: 27 Dec 89 04:42:02 GMT
References: <84768@linus.UUCP>
Reply-To: davidc@vlsisj.UUCP (David Chapman)
Organization: VLSI Technology Inc., San Jose, CA
Lines: 42

In article <84768@linus.UUCP> bs@linus.mitre.org (Robert D. Silverman) writes:
>Does any have, of know of software for the SPARC [SUN-4] that will
>perform the following:
>
> [standard multiply and divide]
>
>The SPARC is brain dead [as were its designers] when it comes to doing
>integer arithmetic. It can't multiply and it can't divide.

There should be instructions on the order of "multiply step" and
"divide step", each of which will do one of the 32 adds/subtracts and
then shift. I'm not particularly fond of the SPARC architecture (don't
like register windows), but this is a theoretical viewpoint and is not
based on any direct exposure to assembly-language programming for it
(translation: sorry, I can't give you any more help).

Neither the SPARC nor its designers were brain-dead. It's just that it
is difficult to get multiplication and division (especially the latter)
to run in 1 or 2 clock cycles. All instructions are supposed to execute
in the ALU in 1 cycle; if the multiply and divide instructions take
more time, then the front of the processor pipeline has to be able to
stall, and this added complexity will slow down the entire processor.

Thus they provide you with the tools to do your own multiply and
divide. One of the benefits is that a compiler can optimize small
multiplies and divides to make them execute faster (e.g. multiply by
10 takes 4 steps instead of 32).

It is important that you understand this if you are to write
assembly-language programs for a SPARC. If your instructions are not
carefully optimized, the result could be slower than if you had written
in a high-level language and compiled with its optimizer! (Unless the
SPARC assembler performs instruction reordering.)

P.S. Don't write a loop on the order of "MULSTEP, DEC, BNZ" or it will
be incredibly slow. Unroll the loop 4 or 8 times (MULSTEP, MULSTEP,
MULSTEP, MULSTEP, SUB 4, BNZ). Branches are expensive.
-- 
David Chapman

{known world}!decwrl!vlsisj!fndry!davidc
vlsisj!fndry!davidc@decwrl.dec.com
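
The two ideas in the post (a short shift-and-add sequence for a small constant multiply, and a multiply-step loop unrolled 4 times) can be sketched in C roughly as follows. This is only an illustration of the technique, not SPARC assembly and not the semantics of any actual SPARC instruction; the names mul10, MUL_STEP, soft_mul and check_examples are made up for this sketch, and a real compiler may well pick a different instruction sequence for the constant multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by the constant 10 using shifts and one add:
   10*x = 8*x + 2*x.  (Hypothetical helper, for illustration.) */
static uint32_t mul10(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* One generic shift-and-add step: if the low bit of the multiplier b
   is set, add the multiplicand a to the product; then shift a left
   and b right.  This models the idea of a "multiply step" only. */
#define MUL_STEP(prod, a, b)              \
    do {                                  \
        if ((b) & 1u) (prod) += (a);      \
        (a) <<= 1;                        \
        (b) >>= 1;                        \
    } while (0)

/* Software multiply returning the low 32 bits of a*b.  The loop body
   is unrolled 4 times, so the loop branch is taken 8 times instead
   of 32 (the "MULSTEP x4, SUB 4, BNZ" shape from the P.S.). */
static uint32_t soft_mul(uint32_t a, uint32_t b)
{
    uint32_t prod = 0;
    int i;

    for (i = 32; i != 0; i -= 4) {
        MUL_STEP(prod, a, b);
        MUL_STEP(prod, a, b);
        MUL_STEP(prod, a, b);
        MUL_STEP(prod, a, b);
    }
    return prod;
}

/* Small self-check; call it from your own main() if desired. */
void check_examples(void)
{
    assert(mul10(7) == 70u);
    assert(soft_mul(1234u, 5678u) == 7006652u);
}
```

Whether unrolling 4 or 8 times is the better choice depends on the machine; the sketch only shows the shape of the loop, with one branch per four steps.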