Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!aplcen!uakari.primate.wisc.edu!uflorida!mephisto!mcnc!rti!xyzzy!wood
From: wood@dg-rtp.dg.com (Tom Wood)
Newsgroups: comp.sys.m88k
Subject: Re: Information wanted on m88000 Risc workstations
Keywords: 80386 m88000 Everex Opus UNIX DOS
Message-ID: <1879@xyzzy.UUCP>
Date: 8 Jan 90 20:22:06 GMT
References: <641@s5.Morgan.COM> <25A64468.11498@paris.ics.uci.edu> <648@s5.Morgan.COM>
Sender: usenet@xyzzy.UUCP
Reply-To: wood@gen-rtx.dg.com ()
Organization: Data General Corporation, Research Triangle Park, NC
Lines: 50

In article <648@s5.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:

>2. That ratio of Megaflops to MIPS sucks. Let me rephrase this. Given
>that the 88000 is the only RISC chip with onboard floating support,
>you've got to wonder why since it ends up being (relatively) so
>slow. Can you get an FPA for it? On the systems with the combined
>88000/80386 CPUs can you hang a quick Cyrix off the 80386, or a Weitek
>3167? or can you put a 4167 on the 88000? Does Motorola have some
>kind of remedy for those of us who like the looks of those soon to
>be announced 486/860 systems which will scream for floating point?

and later:

>Yeah, well that DECstation 3100 kind of stomps these 88000 boxes for
>double precision. And the application benchmarks in that issue show
>just how nasty the threat is from the 486 (e.g. the Cheetah Gold is
>in the same class as these other machines, and Weitek IS working on
>a floating point coprocessor for the 486. Also the Cheetah costs
>about 10,000 for the tested configuration.) It's not really clear
>how the price performance benchmark is arrived at, and the Dhrystone
>just doesn't represent what I need a box for. Right now I'm of a
>mind to get the 88000 if I can get good UNIX and some kind of 
>floating point help. Otherwise, it's back to square one. Oh well.

I'd like to open a discussion on the FP performance of the 88k.
I have yet to see a compiler that takes real advantage of the pipeline
on this machine.  In theory you can have up to 5 FP adds and up to 6 FP
multiplies in flight at once, though (if I understand correctly) the
combined total is capped at 9 rather than 11.  So how would you feel if
someone were able to boost Mflops by a factor of, say, 3 (or better) by
improving the compiler technology?

Here's a sample of what I'm talking about.  These are computed values
for the Matrix multiply inner loop:

	DO 10 J = 1,N
    10	    A(I,J) = A(I,J) + B(I,K)*C(K,J)
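
To make "unrolls" concrete, here is a rough sketch of the same loop
unrolled by 4, written in Python purely as an illustration.  It is not
the code any 88k compiler generates, and the names (inner_loop_naive,
inner_loop_unrolled4, a_row, b_ik, c_row) are made up for this example.
The only point is that the four multiplies do not depend on each other,
so a scheduler is free to overlap them in the pipeline.

```python
# Illustration only: Python itself gains nothing from this rewrite.
# It shows the shape of code a compiler could schedule onto FP pipelines.

def inner_loop_naive(a_row, b_ik, c_row):
    """A(I,J) = A(I,J) + B(I,K)*C(K,J) over J, for fixed I and K."""
    for j in range(len(a_row)):
        a_row[j] = a_row[j] + b_ik * c_row[j]


def inner_loop_unrolled4(a_row, b_ik, c_row):
    """Same computation, unrolled by 4, multiplies grouped ahead of adds."""
    n = len(a_row)
    main = n - n % 4  # largest multiple of 4 not exceeding n
    for j in range(0, main, 4):
        # Four independent multiplies: none uses another's result.
        p0 = b_ik * c_row[j]
        p1 = b_ik * c_row[j + 1]
        p2 = b_ik * c_row[j + 2]
        p3 = b_ik * c_row[j + 3]
        # Four independent adds, likewise free to overlap.
        a_row[j] = a_row[j] + p0
        a_row[j + 1] = a_row[j + 1] + p1
        a_row[j + 2] = a_row[j + 2] + p2
        a_row[j + 3] = a_row[j + 3] + p3
    # Cleanup loop for the 0 to 3 leftover iterations.
    for j in range(main, n):
        a_row[j] = a_row[j] + b_ik * c_row[j]
```

Both functions update a_row in place and perform the same operations on
each element, so they give identical results for any N.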

Code Generation Technique      Cycles/iteration      Mflops

    Naive code                      19                2.10
    Naive code, 2 unrolls           35/2              2.28
    Sophisticated, 4 unrolls        28/4              5.71
    Sophisticated, 8 unrolls        48/8              6.67
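
The Mflops column can be reproduced from the cycle counts under two
assumptions that are not stated above: a 20 MHz clock, and 2 floating
point operations (one multiply, one add) per iteration.  A small Python
sketch of that arithmetic, with hypothetical names:

```python
CLOCK_MHZ = 20.0          # assumption: clock rate is not given in the post
FLOPS_PER_ITERATION = 2   # assumption: one FP multiply plus one FP add


def mflops(cycles, iterations=1):
    """Mflops for a loop body taking `cycles` cycles per `iterations` iterations."""
    cycles_per_iteration = cycles / iterations
    return FLOPS_PER_ITERATION * CLOCK_MHZ / cycles_per_iteration


# (technique, cycles, iterations covered by those cycles)
rows = [
    ("Naive code", 19, 1),
    ("Naive code, 2 unrolls", 35, 2),
    ("Sophisticated, 4 unrolls", 28, 4),
    ("Sophisticated, 8 unrolls", 48, 8),
]

for name, cycles, iterations in rows:
    print(name, mflops(cycles, iterations))
```

Under those assumptions each computed value agrees with the table to
within 0.01, and the 8-way unrolled case comes out at about 3.2 times
the naive one (19 cycles per iteration vs. 6).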

Well, how 'bout it!?
---
			Tom Wood	(919) 248-6067
			Data General, Research Triangle Park, NC
			{the known world}!rti!xyzzy!wood