Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!sun!amdcad!mozart.amd.com!nucleus!davec
From: davec@nucleus.amd.com (Dave Christie)
Newsgroups: comp.arch
Subject: Re: taxonomy for superscalars/etc (LONG)
Message-ID: <1990Jul23.182546.25777@mozart.amd.com>
Date: 23 Jul 90 18:25:46 GMT
References: <9782@hubcap.clemson.edu>
Sender: usenet@mozart.amd.com (Usenet News)
Reply-To: davec@nucleus.amd.com (Dave Christie)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 115

In article <9782@hubcap.clemson.edu> mark@hubcap.clemson.edu (Mark Smotherman) writes:
>I would like to suggest a possible taxonomy to distinguish among
>the differing organizations of new processors.  Your comments
>and corrections are welcome.
>
>I see three major areas: issue parallelism, start of execution
>(which can differ from time of issue if it is the responsibility
>of the functional unit to obtain its own operands), and resource
>naming (specifically registers).
>
>Thus I would like to give a three-part classification, i/e/n,
>to each machine.  The first field would be issue parallelism:
>
>  (1) single issue per cycle
>  (s) superscalar issue of independent instructions
>  (v) vliw issue in which several opcodes or instructions (i.e. i860)
>      are grouped into a wide instruction word

How about adding (d) = decoupled superscalar issue of independent 
instructions, where instructions are issued simultaneously from two
independent streams (Smith, James E., _Decoupled_Access/Execute_
Computer_Architectures_, ACM TOCS, Nov 1984, plus a half-dozen other
papers by him).  This is considerably different from the type of 
superscalar issue we are beginning to see in advanced RISC chips, 
and has been used commercially (Astronautics Corp's short-lived ZS-1(?) 
for one).  BTW, I'm not sure what your taxonomy is intended to cover - 
features that are architecturally visible, or implementation techniques 
regardless of whether they influence the architecture/compiler?  I assume
the latter, since renaming is used to hide stuff from the compiler.

>The second field would be execution start time:
>
>  (d) data dependencies (RAW) are interlocked by issue unit, which
>      stalls until resolution; the fn unit starts upon issue since
>      issue unit provides both the op specification as well as the
>      operands
>  (c) compiler must reorder instructions to avoid data dependencies
>  (x) out-of-order execution, where the fn unit starts only after
>      obtaining its operands -- the issue unit does not stall on
>      data dependencies but forces the fn unit to resolve them

These categories aren't mutually exclusive - the R3000, which you
classify as 1/c/p, is "d" for cache misses on a load, as well as for
accesses to the HI and LO registers during multiply/divide.  Should
it be classified as "dc", or where does one draw the line?
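
One way to picture the "dc" question is to let the second field hold a
set of letters instead of exactly one.  A minimal sketch in Python; the
names Start, Machine and label are my own inventions for illustration,
not part of the proposed taxonomy:

```python
from dataclasses import dataclass
from enum import Flag, auto


class Start(Flag):
    """Execution-start letters from the second field (hypothetical encoding)."""
    D = auto()  # issue unit interlocks and stalls on a RAW dependency
    C = auto()  # compiler reorders instructions to avoid the dependency
    X = auto()  # functional unit waits for its own operands


@dataclass(frozen=True)
class Machine:
    issue: str    # "1", "s" or "v" from the first field
    start: Start  # one or more execution-start letters combined
    naming: str   # "p" or "r" from the third field

    def label(self) -> str:
        # Emit the letters in declaration order (d, c, x).
        letters = "".join(m.name.lower() for m in Start if m in self.start)
        return f"{self.issue}/{letters}/{self.naming}"


# The R3000 as described above: compiler-scheduled in general, but
# interlocked for load misses and HI/LO accesses.
r3000 = Machine("1", Start.D | Start.C, "p")
print(r3000.label())  # 1/dc/p
```

This only makes the mixed case expressible; it does not answer where
the line should be drawn.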

I also have a semantic bone of contention with these first two parts.
When instruction execution becomes separated from instruction
"issue", most literature I've seen refers to the act of sending an
instruction to a functional unit as "dispatch", and reserves "issue"
for the act of placing an instruction in execution.  I don't think
"execution" can be used in place of "issue" here, since execution can
take multiple cycles, whereas issue (and dispatch) denote single-cycle
events (to me, anyway).

This may seem pretty picky, but rigorous definition of these terms
definitely helps avoid confusion when working with this stuff.  
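
To make the dispatch/issue distinction concrete, here is a toy model of
a single functional unit's waiting station.  It is only a sketch under
my own assumptions (the names Entry and Station, and the oldest-first
selection policy, are hypothetical and not taken from any real machine):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    op: str
    sources: tuple                   # register names this operation reads
    dispatched_at: int               # cycle it was sent to the functional unit
    issued_at: Optional[int] = None  # cycle it was placed in execution


class Station:
    """Holds dispatched instructions until their operands are available."""

    def __init__(self):
        self.waiting = []

    def dispatch(self, cycle, op, sources):
        # Dispatch: a single-cycle event, the instruction leaves the
        # decoder whether or not its operands are ready.
        entry = Entry(op, tuple(sources), cycle)
        self.waiting.append(entry)
        return entry

    def issue(self, cycle, ready):
        # Issue: also a single-cycle event, the oldest entry whose
        # sources are all ready is placed in execution.
        for entry in self.waiting:
            if all(src in ready for src in entry.sources):
                entry.issued_at = cycle
                self.waiting.remove(entry)
                return entry
        return None  # nothing can start this cycle


station = Station()
add = station.dispatch(0, "add", ("r1", "r2"))
station.issue(1, {"r1"})        # r2 not ready yet, nothing issues
station.issue(2, {"r1", "r2"})  # now the add issues
print(add.dispatched_at, add.issued_at)  # 0 2
```

Execution proper would then occupy however many cycles the unit needs
after issued_at, which is why I would keep it as a separate term.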

>The third field would be resource naming, specifically register
>renaming:
>
>  (p) physical registers named in instructions -- must be concerned
>      about anti-dependencies (WAR) and output dependencies (WAW)
>  (r) hardware renames logical registers in instructions by tagging
>      or assignment of physical registers (removes WAR and WAW)

A few comments:
How about a designation for cases where possible dependencies are
eliminated mainly by using independent register sets (ref. i860
integer/fp registers, decoupled architectures)?

Dependencies between functional units can be handled via queues
between functional units - this is typical for decoupled architectures
(ref. aforementioned paper).  Moreover, this technique can be used 
in single-stream architectures with independent functional units,
and the queues can be architecturally visible (accessed via register
designators) or implemented using renaming (architecturally hidden),
where the renaming is only done for queued operands, not all operations.
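
A rough sketch of the queue idea, for readers who haven't seen the
paper.  This is a toy cycle-by-cycle model of my own, not Smith's
design: the function name run, the fixed load latency, and the
one-operation-per-cycle units are all assumptions made for illustration.

```python
from collections import deque


def run(memory, load_addrs, load_latency=3):
    """Sum memory[a] for each a in load_addrs, with the access unit
    running ahead of the execute unit through a load queue.
    Returns (total, cycles, execute_stalls)."""
    in_flight = deque()    # (ready_cycle, value) for loads already started
    load_queue = deque()   # operands waiting for the execute unit
    pending = deque(load_addrs)
    total = consumed = stalls = cycle = 0
    while consumed < len(load_addrs):
        # Memory delivers finished loads, in order, into the load queue.
        while in_flight and in_flight[0][0] <= cycle:
            load_queue.append(in_flight.popleft()[1])
        # Access unit: start one load per cycle without waiting for
        # the execute unit to consume earlier ones.
        if pending:
            addr = pending.popleft()
            in_flight.append((cycle + load_latency, memory[addr]))
        # Execute unit: take one operand from the queue, or stall if
        # the queue is empty.
        if load_queue:
            total += load_queue.popleft()
            consumed += 1
        else:
            stalls += 1
        cycle += 1
    return total, cycle, stalls


print(run([10, 20, 30, 40], [0, 1, 2, 3]))  # (100, 7, 3)
```

In this model the execute unit pays the memory latency once, at the
start, and then finds an operand waiting every cycle.  The queue is
what carries the dependency between the two units, so no register
names are shared between them for these operands.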

I have seen implementations where the renaming is permanent, such
that the architectural state of a process is maintained by a set of
pointers into a pool of registers, and others where it is only
temporary (to cover the average or maximum execution latency), with a
set of physically-addressed registers being updated in order from a
reorder buffer or similar mechanism.  (This is probably getting too
detailed for your purposes.)
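
For the "permanent" flavor, a minimal sketch of the pointer-table idea
follows.  The class name Renamer, the free-list policy, and the register
counts are my own assumptions, not a description of any particular
machine:

```python
class Renamer:
    """Logical registers are pointers into a pool of physical registers."""

    def __init__(self, n_logical, n_physical):
        # Initially logical register i lives in physical register i.
        self.map = list(range(n_logical))
        self.free = list(range(n_logical, n_physical))

    def rename(self, dest, srcs):
        # Sources read the current pointers first, so an instruction
        # that reads and writes the same logical register sees the
        # old value.
        phys_srcs = [self.map[s] for s in srcs]
        new = self.free.pop(0)  # raises IndexError if the pool is exhausted
        old = self.map[dest]
        self.map[dest] = new    # every write gets a fresh physical register
        # 'old' may be released once no earlier instruction still needs it.
        return new, phys_srcs, old

    def release(self, phys):
        self.free.append(phys)


r = Renamer(n_logical=4, n_physical=8)
i1 = r.rename(1, [2, 3])  # r1 = r2 + r3
i2 = r.rename(2, [1, 3])  # r2 = r1 + r3  (WAR on r2 against i1)
i3 = r.rename(1, [3, 3])  # r1 = r3 + r3  (WAW on r1 against i1)
print(i1, i2, i3)
# (4, [2, 3], 1) (5, [4, 3], 2) (6, [3, 3], 4)
```

i2 writes physical register 5 while i1 still reads physical register 2,
and i3 writes 6 while i1 writes 4, so neither the WAR nor the WAW
conflict constrains ordering.  Only the true (RAW) dependency of i2 on
i1 through physical register 4 remains.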

>Using this proposed taxonomy, I would classify the following machines:
>
>  Tandem Cyclone              s / d / p  ??

Maybe in a class by itself, considering how it relies on the compiler
to pair up instructions, which are then converted to a vliw-like
micrand via the control store.

>  IBM S/360 M91 (Tomasulo)    1 / x / r
>  IBM RS/6000                 s / x / r

These two use renaming in considerably different ways: in the 360/91
the tags refer to specific physical registers, and not just the
floating-point registers.  In the RS/6000, at least from what I can
tell from the literature I have, the renaming is used primarily to
implement a load operand queue, as described above.  I don't know
whether or not renaming is used to handle intra-FPU conflicts.  I'm
pretty sure renaming is not used within the fixed-point unit.  (The
RS/6000 actually has several attributes of the single-stream decoupled
access/execute architecture described in Jim Smith's aforementioned
paper.)

In any case, you should probably indicate what level of detail you wish
to represent with this taxonomy.  One can get really carried away with
this stuff [...he says nonchalantly, concluding the longest posting of
his life...:-)].

----------------------------------
Dave Christie           My opinions only.