Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!wuarchive!rex!ames!sun-barr!newstop!sun!chiba!khb
From: khb@chiba.Sun.COM (Keith Bierman - SPD Advanced Languages)
Newsgroups: comp.arch
Subject: Re: 55 MIPS & 66 MIPS (really, embedded & military benchmarking)
Summary: When are statistics not statistics. Long screed. Sorry, must
	 be the garlic pizza. Also, a hint that GaS lives on.
Message-ID: <128680@sun.Eng.Sun.COM>
Date: 1 Dec 89 09:45:41 GMT
References: <31329@winchester.mips.COM> <1358@bnr-rsc.UUCP> <5275@omepd.UUCP> <32528@winchester.mips.COM>
Sender: news@sun.Eng.Sun.COM
Lines: 280


I must confess to not having followed the SAE stuff closely (I must
have spent a good minute or so glancing at the suntech blurb until now :>).
Thanks for bringing this article to my attention. When I get back to
work (taking next week off :> :>) perhaps I can coax someone to loan
me their copy of the report itself.

The Right Honorable J.M. sez:

>This note:
>1) Analyzes the Society of Automotive Engineers (SAE)'s final report
>"FINAL REPORT, 32 BIT COMMERCIAL ISA TASK GROUP, AS-5, SAE" .....

>The objective was: "the 32 Bit Commercial ISA Task Group was established
>to evaluate suitability of existing commercial architectures for use as
>general purpose processors in avionic and other embedded
>applications"

Does anyone happen to know how/when/why SAE became the arbiter of avionic
applications ?

>The approach was to request applications from any vendor who wanted to
>propose things, and they got AMD29K, Intergraph Clipper, MIPS R3000,
>NS32000, Sun SPARC, and Zilog ZS80000.  "A set of criteria were
> established and relative weights set."  This was split into:
>	60%: functionality of the instruction sets (general)
>	20%: capabilities of the current implementation
>	20%: performance
        ^^^

So while we performance weenies can carp about their benchmarking
techniques and/or data reduction techniques (What no SVD, covariance and
sensitivity analyses ? :>) the results of the benchmark section would
appear to be of much less entertainment value than their evaluation of
the functionality of the instruction sets (I thought all the machines
were Turing complete .... so this must have been the area the group
spent the most time locked in discussion).
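
To make the weighting concrete, here is a minimal sketch, assuming the
categories are combined as a simple weighted sum (the quoted text does
not say exactly how). Only the 60/20/20 weights come from the report as
quoted; the two chips and their per-category scores (on an assumed
0-100 scale) are made up purely for illustration.

```python
# Weights as quoted from the SAE report; the category names are my labels.
WEIGHTS = {
    "isa_functionality": 0.60,
    "current_implementation": 0.20,
    "performance": 0.20,
}

def weighted_total(scores):
    """Combine per-category scores into one figure using WEIGHTS."""
    return sum(WEIGHTS[cat] * scores[cat] for cat in WEIGHTS)

# Hypothetical candidates: chip_a is 20 points faster on the stopwatch,
# chip_b is 8 points better in the paper study of the instruction set.
chip_a = {"isa_functionality": 70, "current_implementation": 70, "performance": 90}
chip_b = {"isa_functionality": 78, "current_implementation": 70, "performance": 70}

for name, scores in (("chip_a", chip_a), ("chip_b", chip_b)):
    print(f"{name}: {weighted_total(scores):.1f}")
```

With these invented inputs the slower chip comes out slightly ahead,
which is the whole effect of weighting the paper study three times as
heavily as the stopwatch.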


>"Results:
>         		29000	R3000	32532	SPARC
>General		42.88	40.12	42.56	43.40
        ----------------------------------------------

>The most significant point of the results is the very small spread of the
>point values."

Agreed, the difference between the high (SPARC) and the low (MIPS) is
only 7.6%. But as this section is pure gedanken study, the rationale
employed is probably of great interest to this group. If someone has
the report handy and is a good typist, posting it would be a service
(more entertaining than yet another posting of xstones, or 8queens
:>). 
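
For the arithmetic-minded, a quick sketch of where the 7.6% comes from,
using only the four "General" scores quoted above. The denominator is
assumed to be the high score, which reproduces the 7.6%; measured
against the low score the gap is nearer 8.2%.

```python
# "General" scores as quoted from the SAE report above.
general = {"29000": 42.88, "R3000": 40.12, "32532": 42.56, "SPARC": 43.40}

high = max(general, key=general.get)
low = min(general, key=general.get)
gap = general[high] - general[low]

# Spread relative to the high score (assumed denominator) and to the low score.
spread_vs_high = 100.0 * gap / general[high]
spread_vs_low = 100.0 * gap / general[low]

print(f"{high} vs {low}: {spread_vs_high:.1f}% of high, {spread_vs_low:.1f}% of low")
```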

Since this was a gedanken study (GS) there is NO measurement error, nor
any other source of random error, so there is little justification for
using the same statistical tools we use for analyzing benchmark results.

It is not surprising that this section was counted more heavily than
the other two, as the folks who build missiles, planes, and spacecraft
are more concerned about long term issues than the hot chip of the week
(Galileo, for instance, is 1802 based ... and it is possibly the most
complex deep space probe yet flown).


>(There are pages of such things; some of the numbers make sense, some
>are inexplicable to me, but that's OK. This particular one is somewhat
>inexplicable... Some of the ratings directly contradict the findings
>of people like JMI, whose C Executive runs on many micros, and which
>MEASURED things like interrupt-handling and context switching,
^^^^^^^^

Ah, "data". If you can measure it with a stop watch, it is part of
"performance" or "current implementation". Without pondering their
report long and deep I can't begin to second guess them; but mixing
stopwatch data into the gedanken study, whose intent it is to crystal
ball gaze, seems beside the point. It takes a lot of years to develop
an embedded system, so one tries very, very hard to pick the technology
that will be ripe when you are ... typically 5 to 10 years down the
road (at least where I came from).

>Under "Current implementations", there were good things like:
>"How many compatible performance variations are available?
>	AMD29000	1
>	MIPS R3000	3
>	National 32000	5
>	Sun SPARC	5"
>(Interesting: it doesn't matter whether an implementation covers
>a wide range of performance, what counts is the number of different
>ones.

Yep. If I want to guess what will really be on the shelf in 10 years,
I want the one with the most suppliers ... one is likely to have stuck
around. 1 is a really bad number (if for no other reason, it usually
adds several inches of paper to the documents which must be approved
for your project to fly).... there are of course other reasons. 

>I deleted the NS32532 column for space reasons, and added the data column
>at the right (which was the Ada compiler, -O, and whose results were
>available May 1989 and posted shortly thereafter (I think) on the JIAWG
>bboard by the TI folks.

One presumes they left out optimized results for some bizarre reason
of their own. I have played with both the Verdix and Telesoft SPARC
Ada compilers and both come with optimizers.

....perf figures and analysis
....

>  Of course, the committee report came out AFTER
>the JIAWG decision was made [i.e., it was irrelevant to that],
>and this report explicitly did NOT recommend anything as the architecture
>for military projects.

There are non-military government embedded projects. There are
non-JIAWG projects too (at least there were when I lived in that universe). As I
recall it is rare for such committees to ever come out and say BUY IBM
or anything like that :> They come up with fancy numeric ranking
schemes to shield themselves from anything that tacky. Also a lot of
such projects end up with close figures .... most readers (and writers
in that world) rely on the ranking (at least they used to). 

Lessons:
>1) It's hard to evaluate things on paper.  I think the committee tried
>hard, in a really difficult job, but it's real hard...

But real necessary. Long term projects require long term thinking.

>2) It's always a good idea to look behind the summaries a bit.

And at the background of the organization(s) involved, past
recommendations, projects which relied on or ignored the
recommendations (implicit as well as explicit) and how they turned out
(including funding battles, etc.)

>3) It's important to understand the difference between numbers that
>mean something, and numbers that don't.  The committee did understand
>that there was insufficient difference to prove anything.

Perhaps. I don't know the SAE (other than as the folks who along with
God and Honda tell me what oil to put into my motorcycles). One must
know how to read behind the words.

>Now, everyone interprets data a bit differently,  Just for fun, let's look
>at how Frank Yien and Scott Thorpe of Sun interpreted this, in
>SunTech Journal, Autumn 1989, page ST8, in the article called:
>	"SPARC Scores In DARPA/SAE Architecture Test"

>(THERE'S BEEN PLENTY OF DATA; NOW WE GET SOME "MARKETING" ANALYSIS;

Data ? 

The SAE chose a 60-40 split of "that which cannot be measured with a
stopwatch or ruler" vs. "let's take our places and do the 100-nanosec
dash". Taking the resulting numbers and renormalizing, computing means,
etc. isn't data. It's analysis. Since it's not being done to
engineer a product, or to elucidate the logic of the report, it's all
been "marketing" analysis. Very interesting and entertaining, mind you,
but this warning is a bit late. It is a very nice rhetorical move though.


>The article leads off with:

>"In a recent comparison of leading 32-bit architectures by DARPA (the Defense
>Advanced Research Projects Agency), the SPARC architecture was ranked
>as the top processor architecture for use in military projects."
>	Well, it had the highest numbers, but they weren't significant,
>	and the committee said so.

How many reports of this nature say otherwise ? As far as I know, it's
a standard disclaimer like "your mileage may vary".

>	Of course, it didn't matter much anyway, because the key
>	decisions were being made somewhere else, and the choices elsewhere
>	[MIPS & Intel] reflected what the large contractors decided in
>	doing serious evaluations.

Perhaps, perhaps not....

Excerpted from a press release dated Nov 27

 SPEC To Develop Chip Set

SPEC has been contracted by NASA to develop a high-performance GaAs
RISC processor to demonstrate the inherent speed and radiation-hardness
advantages of GaAs.  Multiple GaAs SPARC processors will be included in
a demonstration board that SPEC is building to look at GaAs
capabilities.  The board will include four to eight GaAs SPARC
processors,  GaAs array communications coprocessors and GaAs floating
point coprocessors.

Under the agreement with SPEC, Sun can license SPEC's GaAs-based SPARC
design with the option to have it manufactured by one of the six
semiconductor vendors now manufacturing SPARC microprocessors.  Initial
samples of the GaAs SPARC processor will be available late in 1990.

Note that this isn't the benchmarking group that EE Times, Sun, MIPS,
HP, et al. set up. Instead it is Systems & Processes Engineering
Corporation (SPEC), which provides systems engineering services and
manufactured products to the aerospace industry, international and
U.S. commercial business, and to government agencies.  Located in
Austin, Tex., SPEC is a privately owned company. This brings up a
question ... did the SPEC benchmarking group do a name search ?


The array communications coprocessor is a GaAs implementation of a
proprietary SPEC inter-processor communications architecture.  The
coprocessor provides a tightly coupled message/data passing interface
between processors in a multi-processor computer system.  The floating
point coprocessor supports 32-bit and 64-bit operations in a highly
pipelined mode with a peak throughput of one floating point operation
per cycle.  These three components make a complete chip set that SPEC
will incorporate into board-level products for the commercial
marketplace.

"These are the building blocks necessary to build the high-performance
systems of the future.  Single- and multiple-processor GaAs
workstations will form the high end of performance in the 1990s.  Other
technologies will not be able to approach the performance of GaAs,"
said SPEC President Randolph E. Noster.

According to SPEC's chief scientist, Dr. Gary B. McMillian, the GaAs
SPARC processor and coprocessors are being designed to operate at 200
MHz, with performance at 800 to 1600 MIPS in a four- to eight-processor
implementation.  SPEC plans to use a VME/FutureBus implementation,
which will provide enough bus bandwidth to support the multiple,
high-speed processors.
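
A back-of-envelope check of those figures. This is a sketch only, and
it assumes (the release does not say so) that the quoted aggregate MIPS
is simply the per-processor rate times the processor count. The
function name is mine, not anything from SPEC.

```python
def implied_per_processor(total_mips, n_procs, clock_mhz):
    """Return (MIPS per processor, instructions per cycle) implied by the totals."""
    per_proc_mips = total_mips / n_procs
    instr_per_cycle = per_proc_mips / clock_mhz
    return per_proc_mips, instr_per_cycle

CLOCK_MHZ = 200                 # clock rate quoted in the release
CONFIGS = {4: 800, 8: 1600}     # processors -> aggregate MIPS, as quoted

for n_procs, total_mips in CONFIGS.items():
    mips, ipc = implied_per_processor(total_mips, n_procs, CLOCK_MHZ)
    print(f"{n_procs} processors: {mips:.0f} MIPS each, {ipc:.1f} instr/cycle")
```

Both ends of the range work out to 200 MIPS per processor, i.e. one
instruction per cycle at 200 MHz with perfectly linear scaling, so the
quoted numbers read as peak rather than measured figures.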


>"Finally, SPARC won the benchmark category, without using the most
>powerful SPARC implementations available from SPARC manufacturers
>today.  The 80-MHz ECL SPARC implementation was not used in these
>comparisons;" 
>	Of course it wasn't; the embedded avionics market is not
>	excited by ECL, and Sun didn't have an ECL system for them to
>	benchmark anyway. 

Not necessarily true. Gould/Encore, Elxsi and others have participated
in the big iron/high powered embedded system marketplace. And, as the
clipping I included above notes, GaAs is of more than passing
interest (rad hard, widely used in some battlefield stuff) in some
circles. Late breaking events in what used to be the USSR may make
some of the high performance/rad hard research less important ... or
perhaps it will free up resources to do more serious space
research.... but as late as Monday some folks still thought they had
funding for such projects.

While I am certainly NOT prepared to say Sun has an ECL machine, I
find it interesting that John is so positive that we don't. Some folks use
the old algorithm "announce, take orders, design, ship, test" but it
is increasingly dangerous to rely on it. I'd be willing to bet that,
for instance, IBM will be able to ship its RIOS box close to whatever
date IBM sez it can after announcement. 

..... misc hype from the suntech article <omitted>

>	Well, each to their own.... Note that the real war for the 32-bit
>	RISC embedded defense standard seems to have 2 winners...

Sometimes battles last longer than one round. 


Keith H. Bierman    |*My thoughts are my own. !! kbierman@sun.com
It's Not My Fault   |	MTS --Only my work belongs to Sun* 
I Voted for Bill &  | Advanced Languages/Floating Point Group            
Opus                | "When the going gets Weird .. the Weird turn PRO"

And in this case, my boss probably thinks I'm home asleep, not wasting
valuable computer cycles on the net. I may not even be speaking for
me.... I meant to go home hours ago.

"There is NO defense against the attack of the KILLER MICROS!"
			Eugene Brooks

        Nor should there be.
	 --khb
