Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!ncar!hao.ucar.edu!moses
From: moses@hao.ucar.edu (Julie Moses)
Newsgroups: comp.sys.atari.st.tech
Subject: Re: Re: Floating Pt. Math with a 68881 not always faster
Message-ID: <8387@ncar.ucar.edu>
Date: 3 Sep 90 17:02:43 GMT
Sender: news@ncar.ucar.edu
Reply-To: moses@hao.ucar.edu (Julie Moses)
Organization: High Altitude Observatory/NCAR, Boulder CO
Lines: 62

| From (Ken Badertscher)
|
|Sounds to me like the libraries were not properly implemented to take
|best advantage of the peripheral 68881.  In a series of tests that I did
|when working on a homemade set of bindings for Megamax Laser C some time
|ago, I sometimes noticed that the speedups weren't all that
|sensational... The timings I did were /never/ slower when using the
|68881, though.  And the Megamax floating point routines are /fast/.
|-- 
|   |||   Ken Badertscher  (ames!atari!kbad)
|   |||   Atari R&D System Software Engine
|  / | \  #include <disclaimer>

| From (Peter Mutsaers)
|
|Maybe the routines of Prospero are not very fast, or they are only
|single precision.
|
|The 68881 uses 80 bits, generally double precision takes 4 times longer
|than single precision.
|In Turbo C, which has the fastest floating point library available to my
|knowledge, 80 bits take 3 times as long as the 68881 does. So
|if there were single precision routines they would be a bit faster
|than the 68881.

        The above messages and some others point to an answer to
the question: why are simpler F.P. math functions slower with the
68881 than with the 68000?

Ken and Pete,

        Yes, the Prospero libraries are not as highly optimized as those
of Turbo C or Megamax C, but they are a solid group of functions supported
by a good working environment. However, the Prospero 68881 libraries, I
would wager, are faster than Megamax C's libraries for two reasons:
1) Prospero's code looks for the math chip once, at program startup,
while Laser C looks for it every time it wants to do F.P. math;
2) Prospero comes with two 68881 libraries, and the second has no error
checking, which eliminates some overhead (though you had better know
what the ranges of the solutions will be).
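
        To make point 1 concrete, here is a rough C sketch of the two
detection strategies. It is not taken from either vendor's library:
probe_fpu(), hw_add() and sw_add() are hypothetical stand-ins (the probe
only counts how often it is called and pretends a 68881 is present), so
it shows the shape of the overhead, not real timings.

```c
#include <stdbool.h>

int probe_calls = 0;   /* counts how often the (pretend) hardware probe runs */

/* Hypothetical stand-in for a real 68881 probe. It only counts calls and
   reports that a coprocessor is present (an assumption for this sketch). */
bool probe_fpu(void)
{
    probe_calls++;
    return true;
}

/* Placeholders for the two code paths; both simply add in plain C here. */
float hw_add(float a, float b) { return a + b; }
float sw_add(float a, float b) { return a + b; }

/* Strategy A: probe for the chip before every operation. */
float add_probe_each_time(float a, float b)
{
    return probe_fpu() ? hw_add(a, b) : sw_add(a, b);
}

/* Strategy B: probe once at startup and remember the answer. */
bool fpu_present = false;

void fp_init(void)
{
    fpu_present = probe_fpu();
}

float add_probe_once(float a, float b)
{
    return fpu_present ? hw_add(a, b) : sw_add(a, b);
}
```

With strategy A the probe cost is paid on every add; with strategy B it
is paid once in fp_init() and each add only tests a flag.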
        The solution is that I was comparing single precision (32 bit)
F.P. math done by the 68000 to 80 bit math done by the math coprocessor.
Complex F.P. functions, such as Tangent(x), are always faster when
done by the 68881 math coprocessor, but simple functions such as add,
subtract and multiply are <slower> because of the time taken:
1) waiting for the 68881 chip to be ready to receive, 2) moving 32 bits
to the math chip, 3) the math chip converting the 32 bit floats to
80 bit floats, 4) returning the solution back to the 68000. I am doing
my single precision F.P. math in Fortran subroutines and linking them
into my Pro-C. Double precision F.P. math done in software, such as C
uses, I would agree, is probably always slower than the same math done
by the 68881.
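
        The four steps above amount to a fixed cost per operation, which
can be written down as a small C sketch. The struct and function names
are my own invention for illustration, the units are arbitrary, and no
measured cycle counts are implied; plug in your own timings.

```c
#include <stdbool.h>

/* Per-operation overhead in arbitrary time units. The fields are
   hypothetical names for the four steps listed above. */
struct fpu_overhead {
    unsigned wait_ready;    /* 1) waiting for the chip to be ready   */
    unsigned send_operand;  /* 2) moving the 32 bit operand across   */
    unsigned widen;         /* 3) converting 32 bit to 80 bit        */
    unsigned fetch_result;  /* 4) returning the result to the 68000  */
};

/* Sum of the fixed costs paid on every coprocessor operation. */
unsigned overhead_total(const struct fpu_overhead *o)
{
    return o->wait_ready + o->send_operand + o->widen + o->fetch_result;
}

/* The coprocessor wins only when its compute time plus the fixed
   overhead is less than the time the software routine takes. */
bool coprocessor_wins(const struct fpu_overhead *o,
                      unsigned fpu_compute, unsigned software_compute)
{
    return overhead_total(o) + fpu_compute < software_compute;
}
```

For a cheap operation like a single precision add, software_compute is
small, so the fixed overhead can dominate; for something like a tangent,
software_compute is large and the overhead hardly matters.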
        Having looked at the Alcyon 68881 assembly source code, there
does not seem to be much one can do to further optimize the F.P.
routines. Prospero's are probably based on Alcyon's.  The TT's coprocessor
will probably run circles around any single precision math done by a
68xxx CPU (I hope).

Julie Moses