Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!wuarchive!decwrl!mcnc!uvaarpa!murdoch!madras!clc5q
From: clc5q@madras.cs.Virginia.EDU (Clark L. Coleman)
Newsgroups: comp.arch
Subject: Re: More Snake bytes.
Message-ID: <1991Apr5.191331.27524@murdoch.acc.Virginia.EDU>
Date: 5 Apr 91 19:13:31 GMT
References: <2004@kuling.UUCP> <8840021@hpfcso.FC.HP.COM> <569@diab.se>
Sender: usenet@murdoch.acc.Virginia.EDU
Organization: University of Virginia Computer Science Department
Lines: 102

In article <569@diab.se> pf@diab.UUCP (Per Fogelström) writes:
>
>I think Mashey's statement is correct. It's not meaningless to compare
>normalized SPECint because it gives you a good indicator on how well
>the architecture is implemented.

I believe this misses the main point. If I complicate
the architecture with all kinds of hardware that makes the critical path
longer, and add instructions to the ISA that slow the cycle down, then
the slower clock is an architectural issue, not just a sign of a poor
implementation.

I think the real question for comp.arch is: Given the same semiconductor
process (e.g. any particular current CMOS 1.0 micron process), implement
various architectures in that process as best as you can --- then what is
the resulting performance? I realize that I am leaving system-level issues
out of the equation, but at least the question is infinitely
more relevant than "normalized SPECint" comparisons.

Let's try a hypothetical. The JCN computer company is designing a new
workstation that is specifically geared towards performing well on the
SPECint benchmarks. Two competing design teams develop prototypes.
(JCN has too much cash to burn, apparently.) One team comes up with a
prototype that is implemented in the company's own 1.0 micron CMOS process,
and it runs at 50MHz and achieves a SPECint of 40. The other team,
made up of blithering idiots, comes up with a chip that interprets
high-level code in a terribly complex circuit that has such long
critical paths that it can only run at 1MHz in 1.0 micron CMOS.
It achieves a SPECint of 0.9, however, giving it a better SPECint/MHz
ratio (0.9 versus 0.8) than the other processor. Naturally, the company
chooses to market the slower processor --- it has a provably "superior"
architecture, based on the all-important SPECint/MHz ratio, and that
ratio will be great advertising fodder. The more than 40-fold performance
disadvantage must just be "implementation", not bad architecture,
according to the marketing MBA genius who chooses the slower chip,
"because if the first chip had only a 1MHz clock, it would have poorer
benchmarks than the second, and clock rate is just an implementation
matter."
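
For anyone who wants to check the arithmetic in the hypothetical, here is a
minimal sketch in Python. The figures are the made-up ones from the JCN story
above; the names (designs, specint_per_mhz) are mine and purely illustrative.

```python
# Made-up figures from the JCN hypothetical; not measurements of any real chip.
designs = {
    "team_1": {"specint": 40.0, "mhz": 50.0},
    "team_2": {"specint": 0.9, "mhz": 1.0},
}

def specint_per_mhz(design):
    """Clock-normalized score: SPECint divided by clock rate in MHz."""
    return design["specint"] / design["mhz"]

ratio_1 = specint_per_mhz(designs["team_1"])
ratio_2 = specint_per_mhz(designs["team_2"])

# What the customer actually sees: delivered SPECint, not SPECint/MHz.
speedup = designs["team_1"]["specint"] / designs["team_2"]["specint"]

print(f"team_1: {ratio_1:.2f} SPECint/MHz")
print(f"team_2: {ratio_2:.2f} SPECint/MHz")
print(f"team_1 is {speedup:.1f}x faster in delivered SPECint")
```

The second design wins on the normalized figure and loses by a factor of
more than 40 on the number that matters.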

Unfortunately, the team that designed the first chip leaves and starts
their own company, kicking the heck out of JCN in the marketplace.
The MBA then lays off a few technical staff and decides they need a
bigger advertising budget. THE END.

Seriously, I cannot believe I am reading so many people claim that MHz
is 100% implementation, 0% architecture.

>If we could push the clock frequency for the R3000 up to 66MHz it would,
>if we scale the results, perform equally well with the HP9000/730.

And will the HP9000/730 sit still while you do that? Can the R3000 be
implemented TODAY in HP's technology at 66MHz? Does anybody really
believe that?

>Well, this was the technical point of view, and it would not help the
>customers that want the boxes today, but I'm a design engineer.

"Technical point of view"?? It is just a total misunderstanding of the
computer architecture issues that constrain implementation and affect
the clock speed.

>
>So if I wanted the best price/performance solution for my fp intensive
>application today, I would probably choose an HP9000/730.

Yes, and you would say, "I sure wish JCN could improve their implementation
so I could get their superior architecture instead. Darn! Why don't they
do a better job of implementing over there?" And as time went by, their
implementation would get better and faster, but so would the first machine.
And the ratio would remain approximately 40 to 1 in favor of the first
machine until it started to run into physical limitations that have to do
with clock speed (bus noise, etc.). But it seems highly improbable that the
JCN turkey will ever catch up.
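
A rough sketch of why the gap persists, again in Python. The assumption here
is mine and is not a measured figure: each process generation speeds up both
chips by the same factor (1.5 is picked arbitrarily). The function name
scale_both is likewise only illustrative.

```python
def scale_both(first, second, factor, generations):
    """Return (generation, first, second, ratio) rows, assuming both SPECint
    figures improve by the same hypothetical factor per process generation."""
    rows = []
    for gen in range(generations + 1):
        gain = factor ** gen
        a, b = first * gain, second * gain
        rows.append((gen, a, b, a / b))
    return rows

# Starting points are the made-up JCN figures; factor=1.5 is an assumption.
rows = scale_both(first=40.0, second=0.9, factor=1.5, generations=3)
for gen, a, b, ratio in rows:
    print(f"gen {gen}: {a:6.1f} vs {b:4.1f}  ratio {ratio:.1f}")
```

As long as both designs ride the same process curve, the ratio does not move;
only a limit that hits the faster chip first can narrow it.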

I have been watching HP's products for a decade. It always seemed to me
that they were lagging the market in semiconductor implementation. Only
their less efficient HP 9000/500 architecture got the benefit of their
best NMOS process when it became available --- other machines started
getting the same process several YEARS later (there must be some horrible
stories of corporate inertia lurking around there). I said to myself years
ago that if HP were to implement the HP-PA stuff in state of the art
semiconductors, they would blow away the competition. Prophecy fulfilled
in 1991; the competitors ARE using a process that is just as good as HP's,
other postings notwithstanding; and the detractions I see on this thread
bespeak a lack of architecture understanding, or commercial envy and
axe-grinding in a few cases.

Put up or shut up, workstation vendors: Tell us what design rules would
be required to achieve HP's SPECint numbers. Most of you have the numbers
on your "future evolution" sheets already. I suppose it is proprietary
info, of course; just don't keep posting bull about how HP's success is
just semiconductor process. When will we see a 66MHz SPARC? When we
have 0.6 micron processes for it? Let's be honest and cut the *****.

As for Per, my comments are directed not at you, but at the corporate
axe-grinders and assorted sideline sour grapes throwers.


-----------------------------------------------------------------------------
"The use of COBOL cripples the mind; its teaching should, therefore, be 
regarded as a criminal offence." E.W.Dijkstra, 18th June 1975.
|||  clc5q@virginia.edu (Clark L. Coleman)