Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!crdgw1!crdos1!davidsen
From: davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr)
Newsgroups: comp.arch
Subject: Re: Integer Multiply/Divide on Sparc
Message-ID: <2017@crdos1.crd.ge.COM>
Date: 16 Jan 90 13:31:57 GMT
References: <8840005@hpfcso.HP.COM> <1249@otc.otca.oz> <KHB.90Jan11212327@chiba.kbierman@sun.com> <1255@otc.otca.oz> <2819@auspex.auspex.com>
Reply-To: davidsen@crdos1.crd.ge.com (bill davidsen)
Organization: GE Corp R&D Center, Schenectady NY
Lines: 26

In article <2819@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:

| Newer compilers will presumably include a command-line flag instructing
| them to either produce the multiply/divide instructions themselves or
| calls to ".mul"/".div" and company.  And such calls are surely *not*
| expanded when the "a.out" file is generated, unless you linked with
| "-Bstatic" - shared libraries, remember?

  Has anyone measured the time taken to just generate the mpy and trap
it vs. the time for a procedure call? We used to trap some instructions
on the old GE series 20 years ago, and the time to trap and decode
(table lookup for decode) was only a few % slower than a call once the
total time to execute the "instruction" was taken into account.
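
  To make the "total time" argument concrete, here is a minimal sketch
of the arithmetic. The function name and every cycle count in it are
hypothetical, picked only to show the shape of the trade-off; none of
them are measurements of a SPARC or a GE machine.

```python
def relative_slowdown(trap_overhead, call_overhead, body):
    """Fractional slowdown of trap-and-emulate versus a library call.

    All arguments use the same unit (for example, cycles). `body` is the
    cost of the multiply routine itself, which both approaches pay.
    """
    trapped = trap_overhead + body
    called = call_overhead + body
    return (trapped - called) / called


# Hypothetical cycle counts (assumptions, not measurements).
CALL_OVERHEAD = 4
BODY = 40

for trap_overhead in (10, 50, 200):
    slowdown = relative_slowdown(trap_overhead, CALL_OVERHEAD, BODY)
    print(f"trap overhead {trap_overhead:3d} cycles: "
          f"{slowdown:.0%} slower than a call")
```

  The point of the sketch is only that the penalty depends on how the
trap overhead compares with the body of the routine: a long software
multiply hides a modest trap cost, while a large trap cost dominates.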

  Would it be better to just generate the instruction all the time and
trap it, rather than use the various libraries? It would certainly give
better performance on the machines with the mpy hardware and, based on
the very slow times reported here, might not be a notable loss on
standard ABI SPARC.
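
  One rough way to frame that question is as an average over the
installed base. The sketch below assumes a simplified model in which
the library alternative is always a software routine (the statically
linked case); the function names and all cycle counts are hypothetical
and stand in for the numbers being asked for.

```python
def mean_cost_inline(hw_fraction, hw_cycles, trap_overhead, emulation_cycles):
    """Mean cycles per multiply if the compiler always emits the instruction.

    `hw_fraction` is the share of executions on machines with multiply
    hardware; the rest trap and run the emulation routine.
    """
    trapped = trap_overhead + emulation_cycles
    return hw_fraction * hw_cycles + (1 - hw_fraction) * trapped


def mean_cost_library(call_overhead, emulation_cycles):
    """Mean cycles per multiply if the compiler always calls a software routine."""
    return call_overhead + emulation_cycles


# Hypothetical cycle counts (assumptions, not measurements).
HW, TRAP, CALL, EMULATE = 5, 200, 4, 40

library = mean_cost_library(CALL, EMULATE)
for hw_fraction in (0.0, 0.5, 1.0):
    inline = mean_cost_inline(hw_fraction, HW, TRAP, EMULATE)
    print(f"hardware share {hw_fraction:.0%}: "
          f"inline {inline:.1f} cycles vs library {library} cycles")
```

  Under this model the answer hinges on the trap overhead and on how
many machines have the hardware, which is why measured figures for the
trap path matter.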

  Has anyone measured these numbers to get a ballpark figure? I don't
have a good feel for how long the partial context change would take on
the trap.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
            "Stupidity, like virtue, is its own reward" -me