Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!usc!elroy.jpl.nasa.gov!decwrl!shelby!neon!Kermit.Stanford.EDU!philip
From: philip@Kermit.Stanford.EDU (Philip Machanick)
Newsgroups: comp.sys.mac.programmer
Subject: Re: Reforming Mac Programming (longish, ~60 lines)
Message-ID: <1990Mar27.194624.18746@Neon.Stanford.EDU>
Date: 27 Mar 90 19:46:24 GMT
References: <5492@okstate.UUCP> <13828@eagle.wesleyan.edu>
Sender: news@Neon.Stanford.EDU (USENET News System)
Reply-To: philip@pescadero.stanford.edu
Organization: Computer Science Department, Stanford University
Lines: 61

In article <5492@okstate.UUCP>, minich@a.cs.okstate.edu (MINICH ROBERT
JOHN) writes:
>   Ah, a nice sentiment, but there's one prob! To a certain extent, the
> SANE on a 6888x machine does use the math chip. (Could be a bit more,
> but Apple's software is just barely more accurate on some transcendental
> routines.) The reason why this will never be as good as the FP options
> on the compilers is that you have the overhead of a procedure call for
> each operation you perform. Imagine writing a bunch of one line functions
> like double my_add( double a, double b) {...}   for all the basic FP
> math you do. I think if you run this through a profiler, you'll find way
> way too much time spent on calling routines rather than crunching
> numbers. This is what SANE does now, so you can probably guess that any
> app using FP math on the heavy side is probably going to be better off
> calling the chip directly. Otherwise, your app _already_ will use the
> math chip for SANE routines! 
>   The other prob with transparent FP chip use is that the formats for
> SANE and the 6888x are of different sizes, and you have to do byte
> crunching to transfer them back and forth. That's more overhead on the
> routines, since everyone will expect the SANE format results! 
>   This is certainly best left to the compiler to use the direct FP calls
> and figure out who needs what beforehand, avoiding all the ugly delays
> with format swapping and routine calling. Orders of magnitude
> difference! If you want the easiest way out, compile your app in two
> versions, one with FP, one without. If you have only a few routines that
> will REALLY benefit from the boost, compile a version of them with FP
> calls to the chip and one without. Convert the values where necessary
> and at runtime, use whatever is appropriate. (Simply said, a bit more
> work in the code, though.)
Question: Does execution of a 68881 instruction on a 680x0 without a 68881
cause an invalid instruction trap (something not too strange to the Mac)?

If the answer is yes, here's an alternative solution: software emulation of
the 68881. The cost of such software emulation on Macs without the chip would
surely be no higher than SANE calls, whereas the cost of NOT using the 68881
directly can be unacceptably high. The only problem remaining to be solved is
whether to continue to support more accurate transcendentals as supported by
SANE (longer term, Apple could phase out SANE, with minimal support for
obsolete products, something like System 4.0).
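
To make the emulation idea concrete, here is a minimal sketch in C. It is
a simulation only, not real trap-handler code: the opcodes, fp_execute(),
the fpu_present flag and the emulation table are all hypothetical names
invented for illustration. The point it tries to show is that the caller
issues the same "instruction" either way, and only machines without the
chip pay for the software path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes standing in for a couple of 68881 operations. */
typedef enum { FP_ADD, FP_MUL, FP_OP_COUNT } fp_op;

typedef double (*fp_handler)(double, double);

/* Software fallbacks: what an emulator would run after the trap. */
static double emu_add(double a, double b) { return a + b; }
static double emu_mul(double a, double b) { return a * b; }

static const fp_handler emu_table[FP_OP_COUNT] = { emu_add, emu_mul };

/* Assumption: set once at startup from whatever the system reports. */
static int fpu_present = 0;

/* Counts trips through the emulation path, to make its cost visible. */
static unsigned long emu_trap_count = 0;

/* Simulated "execute one FP instruction". */
double fp_execute(fp_op op, double a, double b)
{
    assert((int)op >= 0 && (int)op < FP_OP_COUNT);

    if (fpu_present) {
        /* Stand-in for the coprocessor doing the work inline. */
        return (op == FP_ADD) ? a + b : a * b;
    }

    /* No chip: the instruction "traps" and the emulator takes over. */
    emu_trap_count++;
    return emu_table[op](a, b);
}
```

With fpu_present left at 0, every fp_execute() call goes through
emu_table and bumps emu_trap_count; with it set to 1, the count stays
put. Whether a real emulator could match the cost of a SANE call is
exactly the open question above, and this sketch does not settle it.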

> ... No, what you've hit on that could be a much more useful and elegant
> idea is that of _sharing_ common code, like MacApp. If you look over
> some PC type shoulders, you'll see that this is one of the features of
> OS/2...
>   So, Apple. Please send dynamic linking our way. I really don't think
> it can be all _that_ difficult. All the compilers can link. Why not the
> OS? 
> 
There is a trade-off here. Dynamic linking costs you every time you
launch; traps cost you every time you do a call. In many cases the latter is
perceived to be faster (though it isn't) because you don't have to wait as
long at one specific time. How long does an OS/2 app take to launch? As
long as the MPW linker takes to link? I think there's a case for supporting
both: Apple allows you to kludge a sort of dynamic link by getting a trap
address and doing a subroutine branch to it, but a cleaner mechanism would be
attractive. You could then specify that particular performance-critical
routines should be "dynamically linked", whereas others could go through
the trap mechanism.
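
The trade-off can be sketched in C as follows. This is a toy model under
stated assumptions, not Toolbox code: trap_table, get_trap_address() and
call_via_trap() are hypothetical stand-ins for the system's dispatcher
and its address lookup. It contrasts paying for a lookup on every call
with resolving the address once and branching to it directly.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*routine)(long);

/* Two toy "system routines". */
static long double_it(long x) { return 2 * x; }
static long negate(long x)    { return -x; }

enum { TRAP_DOUBLE, TRAP_NEGATE, TRAP_COUNT };

/* Hypothetical dispatch table, indexed by trap number. */
static routine trap_table[TRAP_COUNT] = { double_it, negate };

/* Counts lookups, to make the per-call overhead visible. */
static unsigned long dispatch_count = 0;

/* Stand-in for asking the system where a trap's code lives. */
routine get_trap_address(int trap)
{
    assert(trap >= 0 && trap < TRAP_COUNT);
    dispatch_count++;
    return trap_table[trap];
}

/* Trap-style call: one lookup per call. */
long call_via_trap(int trap, long arg)
{
    return get_trap_address(trap)(arg);
}
```

Calling call_via_trap(TRAP_DOUBLE, i) in a loop does one lookup per
iteration. Fetching the address once with get_trap_address(TRAP_DOUBLE)
and calling through the returned pointer does a single lookup however
many calls follow, which is the kludge described above. The catch, in
this model as on a real system, is that a cached address goes stale if
the table entry is later patched.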

Philip Machanick
philip@pescadero.stanford.edu