Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!ll-xn!cit-vax!speck
From: speck@cit-vax.Caltech.Edu (Don Speck)
Newsgroups: net.arch
Subject: Re: Mips / MHz
Message-ID: <467@cit-vax.Caltech.Edu>
Date: Mon, 12-May-86 05:18:17 EDT
Article-I.D.: cit-vax.467
Posted: Mon May 12 05:18:17 1986
Date-Received: Wed, 14-May-86 07:07:09 EDT
References: <1363@unc.unc.UUCP>
Distribution: net
Organization: California Institute of Technology
Lines: 56
Summary: Don't judge a chip by its clock generator

In article <1363@unc.unc.UUCP> hedlund@unc.UUCP (Kye Hedlund) writes:
>When comparing the performance of microprocessor architectures, it would
>be desirable to separate architectural factors from technological
>factors. For example, if machine A has performance 1.0 and machine B
>has performance 2.0, does B have a "faster architecture?"  The answer of
>course is "it depends."  It depends on many factors including the
>underlying technology used to fabricate the chip.  If B was implemented
>in 1.0um CMOS and runs off a 20MHz clock whereas A was fabricated with
>4um nMOS and runs at 5MHz then perhaps the architects (and
>implementors) of A did an excellent job with a slower technology.
>Perhaps A will run circles around B if implemented with similar
>technology.

>[...] a 68010 is about 0.4 mips @ 10MHz, and a 68020 is about 1.5mips @
>16.67MHz (measured on SUN workstations running C under 4.2BSD).
>This gives 0.040 mips/MHz for the 68010 and 0.091 mips/MHz for the 68020
>and suggests that there is better than 2:1 architectural and implementation
>advantage for the 68020 independent of the circuit technology.

The MHz rate entering the clock pin is not necessarily the same
as the internal operation rate.  All MOS chips derive multiple
internal clock phases from the clock input; some need more
phases than others, and so have to divide the input by a larger
number to get enough reference points.

For example, the 68010 is designed in 3.2um nMOS, a technology
which needs dynamic logic to save power, and dynamic logic needs
lots of clock edges to get anything done.  So, the 68010 uses 4
phases internally, which it derives by dividing the input clock
by 2.  The 68020 is 2um CMOS, a technology which does not need
such extensive precharging, so two phases are adequate and the
clock is not divided down.

Thus, counting the 68010's internal divide-by-2,
    10MHz 68010 cycle time = 200ns  (5MHz internal),
 16.67MHz 68020 cycle time = 60ns   (16.67MHz internal).

This reduction in cycle time is not all due to technology;
2um transistors only switch about twice as fast as 3.2um
transistors.
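
For anyone who wants to check the arithmetic, here is a small
Python sketch of the cycle-time calculation.  The function name
and the "divider" parameter are made up for illustration; the
divide ratios (2 for the 68010, 1 for the 68020) are the ones
given above.

```python
def cycle_time_ns(input_mhz, divider):
    """Internal cycle time in ns, given the clock-pin rate in MHz
    and the ratio by which the chip divides it internally."""
    internal_mhz = input_mhz / divider
    return 1000.0 / internal_mhz  # a 1 MHz clock has a 1000 ns period

# 68010: 10 MHz input, divided by 2 to derive 4 phases
t_68010 = cycle_time_ns(10.0, 2)
# 68020: 16.67 MHz input, not divided down
t_68020 = cycle_time_ns(16.67, 1)

# How much shorter the 68020's internal cycle is
cycle_ratio = t_68010 / t_68020

print("68010 cycle time: %.0f ns" % t_68010)
print("68020 cycle time: %.0f ns" % t_68020)
print("cycle time ratio: %.2f : 1" % cycle_ratio)
```

The ratio comes out a bit over 3:1, well above the roughly 2:1
that the faster transistors alone would explain.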

It almost looks like the cycle time reduction (200ns/60ns,
about 3.3:1) accounts for all of the performance (1.5/0.4
mips, about 3.75:1), but no, the Sun-3 has wait states
(memory access = 4.5 cycles) and the Sun-2 doesn't (memory
access = 2 cycles).  The 68020 does have at least a 2:1
"architectural advantage" over the 68010, but it is in memory
cycles, not processor cycles.
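
One way to read those numbers, sketched in Python.  The mips
figures are the ones quoted from the original article, and the
cycle times and memory access counts are the ones above.  Using
"instructions per memory access time" as the yardstick is only
an illustrative interpretation, not a standard metric, and the
function names are hypothetical.

```python
# Figures from the discussion:
# (mips, internal cycle time in ns, processor cycles per memory access)
machines = {
    "68010 (Sun-2)": (0.4, 200.0, 2.0),
    "68020 (Sun-3)": (1.5, 60.0, 4.5),
}

def instr_per_cycle(mips, cycle_ns):
    # mips * 1e6 instr/s times cycle_ns * 1e-9 s = mips * cycle_ns / 1000
    return mips * cycle_ns / 1000.0

def instr_per_memory_access(mips, cycle_ns, mem_cycles):
    # Instructions completed in the time one memory access takes
    return instr_per_cycle(mips, cycle_ns) * mem_cycles

results = {}
for name, (mips, cycle_ns, mem_cycles) in machines.items():
    results[name] = {
        "per_cycle": instr_per_cycle(mips, cycle_ns),
        "per_mem": instr_per_memory_access(mips, cycle_ns, mem_cycles),
        "mem_ns": cycle_ns * mem_cycles,  # memory access time in ns
    }
    print("%s: %.3f instr/cycle, %.3f instr/memory access (%.0f ns)"
          % (name, results[name]["per_cycle"],
             results[name]["per_mem"], results[name]["mem_ns"]))

# Ratio of work done per memory access time, 68020 over 68010
mem_advantage = (results["68020 (Sun-3)"]["per_mem"]
                 / results["68010 (Sun-2)"]["per_mem"])
print("advantage per memory access time: %.2f : 1" % mem_advantage)
```

Per processor cycle the two chips come out nearly even (about
0.08 vs 0.09 instructions), while per memory access time the
68020 gets roughly 2.5 times as much done.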

It bugs me that rational people assign such significance to
the clock rate.  I have a microprocessor that uses a 60 MHz
clock - should you be impressed?  No, you shouldn't.  The
clock rate is so high only because I needed a 5X base clock
to generate all the required clock edges (the chip takes
precharging to an absurd extreme, which wasn't my idea).

Don Speck	speck@vlsi.caltech.edu	    seismo!cit-vax!speck