Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!uakari.primate.wisc.edu!samsung!rex!ames!amdcad!nucleus!tim
From: tim@nucleus.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Integer Multiply/Divide on Sparc
Message-ID: <28635@amdcad.AMD.COM>
Date: 5 Jan 90 14:57:08 GMT
References: <84768@linus.UUCP> <8840005@hpfcso.HP.COM>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 18

In article <8840005@hpfcso.HP.COM> dgr@hpfcso.HP.COM (Dave Roberts) writes:
| (4) If you need the speed, you write the code inline.  Loops kill
|     you in whatever architecture you use.  If you do huge numbers
|     of arbitrary 32x32 mults, you're code will explode, but hey,
|     this is a RISC machine and your code size is already through
|     the roof, right?  If you call a subroutine everytime you want
|     to do a multiply the overhead of the call will kill you.  But
|     notice that this wasn't what I suggested, either.

Inlining code for performing multiplies is an option, but the call
overhead isn't going to "kill you" -- the overhead would probably be
less than 10% -- especially if these kinds of routines used a special
calling sequence that the compiler knows about, one that doesn't have
the overhead of a standard call.

-- 
Tim Olson
Advanced Micro Devices
(tim@amd.com)
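
To put a rough number on the "less than 10%" estimate, here is a minimal sketch in C. Everything in it is an assumption made for illustration and not taken from the post: the routine name soft_mul32, the helper call_overhead_percent, and any cycle counts passed to it. The multiply is a plain shift-and-add loop, which is only one of several ways a software multiply routine might be written.

```c
#include <stdint.h>

/* Hypothetical software multiply: classic shift-and-add, one step per
 * multiplier bit, stopping early once no multiplier bits remain.
 * Returns the low 32 bits of a * b. */
uint32_t soft_mul32(uint32_t a, uint32_t b)
{
    uint32_t product = 0;

    while (b != 0) {
        if (b & 1u)
            product += a;   /* add the shifted multiplicand for each set bit */
        a <<= 1;
        b >>= 1;
    }
    return product;
}

/* Call overhead as a percentage of total time for one multiply.
 * Both arguments are assumed cycle counts supplied by the caller;
 * they are not measurements of any real machine. */
double call_overhead_percent(double call_cycles, double body_cycles)
{
    return 100.0 * call_cycles / (call_cycles + body_cycles);
}
```

With assumed figures of 3 cycles for call plus return and 40 cycles for the multiply steps, call_overhead_percent(3.0, 40.0) comes out a little under 7%. A cheaper compiler-known calling sequence would lower the first argument further; a much shorter multiply body would push the percentage up.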