Path: utzoo!attcan!uunet!lll-winken!uwm.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!psuvax1!rutgers!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga.tech
Subject: Re: MC68881/2 Support (hello, Dave Haynie)
Message-ID: <12013@cbmvax.commodore.com>
Date: 31 May 90 16:02:02 GMT
References: <1181@metaphor.Metaphor.COM> <11996@cbmvax.commodore.com>
Reply-To: daveh@cbmvax (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 52

In article <11996@cbmvax.commodore.com> valentin@cbmvax (Valentin Pepelea) writes:
>In article <1181@metaphor.Metaphor.COM> djh@dragon.metaphor.com (Dallas J.
>Hodgson) writes:

>> Since the FFP instructions trap out thru the FLINE vector anyway (if there's
>> no coprocessor present) why don't we EMULATE a 6888x when the traps occur?

>Too much time would be spent decoding and emulating the coprocessor
>instructions. And what if someone plugs in a Weitek math coprocessor? 

Kind of a moot point anyway, since Weitek cancelled the 68030 bus version of
their FPU.  Of course, you'd get better floating point performance out of
various AT&T DSPs, the 96002, or these new really fast FPUs from BIT, if
there were a reasonable way to harness that speed in a general way.  Right
now there isn't, but read on...

>The correct solution is to provide a shared library of math functions, and
>that's what we do. Those functions automatically take advantage of whatever
>hardware the user has, thus the programmer does not have to worry about
>the configuration of the platform on which his software is about to run.

That's the best general solution, but still not perfect.  You get programs
written for the libraries when speed is not a major issue.  When it is, most
programs currently come in a separate version with direct FPU code.  Even 
the library interface is too slow to do your floating point multiplies in 
the inner loop of a ray trace or something like that.  It's the same on
Intel-based systems -- you can get some MS-DOS programs in generic, '387, or
Weitek flavors.

What would really help all of this is a standardized library of useful
higher-level math functions.  You're not going to suffer a library call for an
instruction that takes only 20 or 30 clocks if you're worried about speed,
but you might be willing to take a library call to get a routine that takes
a couple thousand clocks to run, removes lots of the work you're trying to
do, works close to theoretical limits on 68000 or 68030/68882, and would get
you going even faster with a faster math engine sitting around.  This is
what I call retargetable mathematics, directly analogous to the type of 
problems folks want to solve with graphics libraries.  Folks complain about
the speed of something like WritePixel() just as they do about the basic
IEEE library's multiply routine.  But you rarely hear complaints about the
higher-level graphics functions -- they do enough work for you that you use
them, and they go fast enough.  We really need something to do mathematics
at a high enough level to make faster math coprocessors viable without a
proliferation of math-coprocessor-specific versions of programs running around.
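
To make the idea concrete, here is a minimal sketch in C of how such a
retargetable layer might be organized.  Everything in it is an assumption for
illustration: the names MathEngine, math_init(), math_dot(), math_axpy() and
probe_fast_engine() are hypothetical and do not belong to any existing
library, and the probe is a stub that always reports "no fast engine".  The
point is only that the caller asks for a whole vector operation, and the
library picks the engine once, at init time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of high-level routines that one math engine provides. */
typedef struct MathEngine {
    const char *name;
    /* Returns the sum of x[i] * y[i] over n elements. */
    float (*dot)(size_t n, const float *x, const float *y);
    /* Computes y[i] = a * x[i] + y[i] over n elements. */
    void (*axpy)(size_t n, float a, const float *x, float *y);
} MathEngine;

/* Portable fallback: plain C, runs on whatever CPU/FPU the compiler targets. */
static float soft_dot(size_t n, const float *x, const float *y)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

static void soft_axpy(size_t n, float a, const float *x, float *y)
{
    size_t i;
    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}

static const MathEngine soft_engine = { "software", soft_dot, soft_axpy };

/* Engine currently in use; defaults to the portable fallback. */
static const MathEngine *engine = &soft_engine;

/* Hypothetical probe for a faster math engine (DSP board, fast FPU, ...).
 * This stub finds nothing; a real one would return that engine's table. */
static const MathEngine *probe_fast_engine(void)
{
    return NULL;
}

/* Select the best available engine once, before any math calls. */
void math_init(void)
{
    const MathEngine *fast = probe_fast_engine();
    engine = (fast != NULL) ? fast : &soft_engine;
    assert(engine->dot != NULL && engine->axpy != NULL);
}

const char *math_engine_name(void)
{
    return engine->name;
}

/* Public entry points: one indirect call per vector, not per multiply. */
float math_dot(size_t n, const float *x, const float *y)
{
    return engine->dot(n, x, y);
}

void math_axpy(size_t n, float a, const float *x, float *y)
{
    engine->axpy(n, a, x, y);
}
```

A program would call math_init() once and then hand whole vectors to
math_dot() or math_axpy().  The dispatch cost is paid once per vector rather
than once per multiply, which is the same trade that makes the higher-level
graphics calls tolerable where WritePixel() is not.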

>Valentin


-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"I have been given the freedom to do as I see fit" -REM