Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!ucdavis!pollux!vmrad
From: vmrad@pollux (Bernard Littau)
Newsgroups: comp.sys.ibm.pc.programmer
Subject: Re: How is a 68000 as fast as an 80386??
Message-ID: <6933@ucdavis.ucdavis.edu>
Date: 6 Mar 90 00:26:28 GMT
References: <505@bilver.UUCP> <MARK.90Mar2090318@acsdev.uucp>
Sender: uucp@ucdavis.ucdavis.edu
Reply-To: vmrad@pollux (Bernard Littau)
Distribution: na
Organization: University of California, Davis
Lines: 46

In article <MARK.90Mar2090318@acsdev.uucp> mark@acsdev.uucp (Mark Grand) writes:
+In article <505@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:
+
+   In my current C programming project, I have written some functions that
+   perform statistical things on 400 separate data sets (linear regressions,
+   standard errors, etc).  This number-crunching part takes about a minute to
+   complete when I run it on my Amiga.  My Amiga uses a 68000 running at 14 MHz
+   (twice the normal cpu speed) and no math chip.  The compiler is Lattice C
+   4.0 in 32-bit addressing mode (similar to the IBM "large" memory model).
+
+   Naturally, I wanted more speed, so I ported the program to an AT&T 386WGS
+   at work, which is a 25 MHz 80386 IBM compatible.  I compiled it using
+   Turbo C 2.0, large memory model.  Then I watched in chagrined disbelief as
+   that number-crunching section still took about a minute to execute --
+   actually a few seconds longer than my Amiga.  All source code was the same!
+
+Sounds like you've discovered why Lattice charges more for their
+compiler.  The Lattice compiler does some real optimizations.  Turbo C
+does not do so much optimization.  Another factor is the fact that you
+were using large model pointers.  32 bit pointers (unless you're in
+native 386 mode) have a higher speed penalty associated with them on
+a 386 than on a 68000.  If there's any way for your data to be
+referenced using near pointers, you will be able to get more speed.

Another possibility is that the two programs are using different
floating point conventions.  I would be surprised if the 386 did not
make up for the pointer dereferencing penalty relative to the 68000
through faster and more efficient execution of the rest of the code,
especially if the program is bound by number crunching.

C is a poor math language in general.  Lattice may well have
implemented a single precision floating point library, while Turbo C
on the 386 is using double.  You might try forcing both machines to
use double and see whether the timings change.
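
A rough sketch of one way to force the precision from a single place,
so the same source can be timed on both machines.  The names real_t,
USE_SINGLE and regression_slope are made up for illustration; they are
not from the original program, and I am assuming the statistics code
can be written against one floating point typedef.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch: define USE_SINGLE when compiling to try float. */
#ifdef USE_SINGLE
typedef float real_t;
#else
typedef double real_t;
#endif

/* Least-squares slope of y against x for n points (n >= 2). */
real_t regression_slope(const real_t *x, const real_t *y, size_t n)
{
    real_t sx = 0, sy = 0, sxx = 0, sxy = 0;
    size_t i;

    assert(n >= 2);
    for (i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    /* slope = (n*Sxy - Sx*Sy) / (n*Sxx - Sx*Sx) */
    return ((real_t)n * sxy - sx * sy) / ((real_t)n * sxx - sx * sx);
}
```

Building it once with and once without USE_SINGLE on each machine
gives four timings to compare, which should show whether precision is
the culprit.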

I have seen cases where the same program takes longer to execute in
single precision than in double.  This is due to the overhead of
converting single to double, doing the calculation in double, and
converting back to single.
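
To make that overhead concrete, here is a sketch with the conversions
written out by hand.  A compiler that follows the traditional K&R rule
of doing all floating arithmetic in double would generate the
equivalent of the first loop for float data; whether Lattice or Turbo
C actually does so is an assumption to check, and the function names
are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Float storage: every step widens the operands to double, does the
   arithmetic in double, then narrows the result back to float. */
float sumsq_single(const float *v, size_t n)
{
    float acc = 0.0f;
    size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++) {
        double wide = (double)acc + (double)v[i] * (double)v[i];
        acc = (float)wide;          /* convert back to single */
    }
    return acc;
}

/* Double storage: same arithmetic, no conversions in the loop. */
double sumsq_double(const double *v, size_t n)
{
    double acc = 0.0;
    size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        acc += v[i] * v[i];
    return acc;
}
```

Without a math chip each of those conversions is a software routine,
so the float version can end up doing more work per element than the
double one.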

I am curious whether this is the case; let me know.

Bernard Littau    VM Radiological Sciences        Telephone: (916) 752-4014
                  School of Veterinary Medicine   Internet:  vmrad@ucdavis.edu
                  University of California        BITNET:    vmrad@ucdavis
                  Davis, CA 95616                 UUCP: ucbvax!ucdavis!vmrad