Path: utzoo!utgpu!watmath!att!tut.cis.ohio-state.edu!ucbvax!islington-terrace.csc.ti.com!pf
From: pf@islington-terrace.csc.ti.com (Paul Fuqua)
Newsgroups: comp.sys.ti.explorer
Subject: Re: fixnum vs single-float multiplies
Message-ID: <2833831045-4270667@Islington-Terrace>
Date: 19 Oct 89 23:17:25 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: inet
Organization: The Internet
Lines: 59


    Date: Thursday, October 19, 1989  1:54pm (CDT)
    From: seibert at xn.ll.mit.edu  (seibert)

    (defmacro i* (i j) `(the fixnum
    			 (ash (* (the fixnum ,i)
    				 (the fixnum ,j))
    			      (the fixnum right))))
      
    (defun foobar ()
        ...

        (timeit ()
          (setf result (* nbrw nbrw)))

Note that I* does ASH as well as *, while the floating-point timing only
does *.  On my Explorer 2, that makes the difference between 1.41 usec (*
only, fixnum) and 3.14 usec (* with ASH, fixnum).

If I change the I* to * (so I just do multiplication), I get times of
about 1.6 usec for fixnum-fixnum, 5.5 to 6.5 usec for fixnum-bignum, and
7.5 usec and up for bignum-bignum.  Single-float multiplication is about
8.6 usec.  (I should probably point out that these aren't official
timings, just me hacking on my Explorer, which is old enough to have a
white exterior and contains a processor board of uncertain revision.)
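
A minimal sketch of how the two cases can be timed separately, in portable
Common Lisp.  TIME-PER-CALL, *RIGHT*, *A*, *B* and the two test functions
are hypothetical names for illustration, not the TIMEIT macro quoted above,
and the shift count is an assumption, since the value of RIGHT isn't shown.

```lisp
;; Hypothetical helper, not an Explorer function: average wall-clock
;; seconds per call of FN over ITERATIONS calls.
(defun time-per-call (fn iterations)
  (let ((start (get-internal-real-time)))
    (dotimes (k iterations)
      (funcall fn))
    (/ (float (- (get-internal-real-time) start))
       internal-time-units-per-second
       iterations)))

;; Operands live in special variables so the compiler cannot fold the
;; arithmetic away at compile time.
(defvar *a* 1234)
(defvar *b* 5678)
(defvar *right* -4)   ; assumed shift count; the original value is unknown

(defun multiply-only ()
  (* *a* *b*))

(defun multiply-and-shift ()
  (ash (* *a* *b*) *right*))
```

Calling (time-per-call #'multiply-only 1000000) and then the same with
#'multiply-and-shift separates the cost of * from the cost of ASH.  The
figures include FUNCALL overhead, so only the difference between the two
is meaningful.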

    Date: Thursday, October 19, 1989  2:35pm (CDT)
    From: James Rice <Rice at sumex-aim.stanford.edu>
    
    @i[I] would expect that short float and fixnum
    multiplication times should be very similar.

I wouldn't, but that's because I know the microcode.  I think
small-floats are still decomposed into the same internal representation
as single-floats -- microcode space was (and is) a bit tight on Explorer
1s.

					      The addition of
    the exponents can be done in parallel to the multiple of
    the mantissae and should be faster.

If you had a parallel functional unit to do it with.  Multiplication on
the Explorer is done with multiply-step microinstructions (Booth's
algorithm) -- 1 bit at a time for the Explorer 1, 2 for the Explorer 2.
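
To show what a multiply-step loop amounts to, here is a sketch of Booth
recoding in Common Lisp, one multiplier bit per step and then two bits per
step.  It illustrates the textbook algorithm only, not the actual Explorer
microcode.  The function names and the default width of 32 bits are
assumptions, and whether the Explorer 2 step matches the two-bit recoding
below is also an assumption.

```lisp
;; Y must fit in BITS bits as a two's-complement number.
(defun booth-multiply (x y &optional (bits 32))
  "Multiply X by Y, examining one bit of Y per step."
  (let ((product 0)
        (previous-bit 0))
    (dotimes (i bits product)
      (let ((current-bit (if (logbitp i y) 1 0)))
        ;; Booth digit = previous - current, one of -1, 0, +1.
        (case (- previous-bit current-bit)
          (1  (incf product (ash x i)))
          (-1 (decf product (ash x i))))
        (setf previous-bit current-bit)))))

(defun booth-multiply-2 (x y &optional (bits 32))
  "Multiply X by Y, examining two bits of Y per step."
  (let ((product 0))
    (dotimes (i (ceiling bits 2) product)
      (let* ((low  (if (and (plusp i) (logbitp (1- (* 2 i)) y)) 1 0))
             (mid  (if (logbitp (* 2 i) y) 1 0))
             (high (if (logbitp (1+ (* 2 i)) y) 1 0))
             ;; Recoded digit is one of -2, -1, 0, +1, +2.
             (digit (+ low mid (* -2 high))))
        (incf product (ash (* digit x) (* 2 i)))))))
```

The second version takes half as many steps for the same width, which is
the point of handling two bits per microinstruction.  Either way the step
count grows with the operand width, and the loop is strictly sequential.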

		Floating point addition is the thing that
    really screws you, since you have to slide the two
    mantissae until the exponents match before you can perform
    the addition.  All of those arithmetic shifts can be slow.

Especially on a machine whose hardware supports LDB/DPB more than
shifts.  I think the Explorer 2 has some normalisation hardware support,
but I was getting out of the microcode business by the time Release 3
and the Explorer 2 came out.
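
The alignment step being discussed can be sketched in Common Lisp as
follows, with each operand given as an integer mantissa and a power-of-two
exponent.  ALIGN-AND-ADD and SHIFT-RIGHT-VIA-LDB are hypothetical names,
and this is not the Explorer microcode: it skips rounding and
normalisation, and simply truncates the bits shifted out.

```lisp
(defun align-and-add (m1 e1 m2 e2)
  "Add M1*2^E1 and M2*2^E2.  Returns the sum's mantissa and exponent."
  ;; Put the operand with the larger exponent first.
  (when (< e1 e2)
    (rotatef m1 m2)
    (rotatef e1 e2))
  ;; Slide the smaller operand's mantissa right until the exponents
  ;; match, then add.  Bits shifted out are lost.
  (values (+ m1 (ash m2 (- e2 e1))) e1))

(defun shift-right-via-ldb (m n width)
  "Right shift of a non-negative WIDTH-bit M by N, written as a byte
extract instead of ASH.  Assumes 0 <= N <= WIDTH."
  (ldb (byte (- width n) n) m))
```

For example, (align-and-add 3 4 8 1) adds 3*2^4 and 8*2^1 and returns 4
and 4, that is 4*2^4 = 64.  The second function shows how a right shift
by a known count can be expressed as a field extract, which is the kind of
operation LDB-oriented hardware handles well; the shift count in a
floating-point add is only known at run time, though.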

Paul Fuqua                     pf@csc.ti.com
                               {smu,texsun,cs.utexas.edu,rice}!ti-csl!pf
Texas Instruments Computer Science Center
PO Box 655474 MS 238, Dallas, Texas 75265