Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!orion.oac.uci.edu!ucivax!ucla-cs!oahu.cs.ucla.edu!marc
From: marc@oahu.cs.ucla.edu (Marc Tremblay)
Newsgroups: comp.arch
Subject: Re: taxonomy for superscalars/etc (RS/6000 example)
Message-ID: <37240@shemp.CS.UCLA.EDU>
Date: 24 Jul 90 19:43:06 GMT
References: <9782@hubcap.clemson.edu> <1990Jul23.182546.25777@mozart.amd.com>
Sender: news@CS.UCLA.EDU
Organization: UCLA Computer Science Department
Lines: 51

In article <1990Jul23.182546.25777@mozart.amd.com> davec@nucleus.amd.com (Dave Christie) writes:
>>  IBM S/360 M91 (Tomasulo)    1 / x / r
>>  IBM RS/6000                 s / x / r
>
>These two use renaming in considerably different ways: in the 360/91
>the tags refer to specific physical registers, and not just the floating
>point registers.  In the RS6000, at least from what I can tell from the
>literature I have, the renaming is used primarily to implement a load
>operand queue, as described above.  I don't know whether or not renaming
>is used to handle intra-FPU conflicts.  I'm pretty sure renaming is
>not used within the fixed-point unit.

In the current version of the RS/6000, floating point arithmetic instructions
are executed in sequence, so there is no need to assign new tags to the
destination registers of these instructions. Only floating point loads
create a new assignment in the mapping table (the logical-to-physical
translation table). So the real purpose of the register renaming scheme in
the RS/6000 is to let floating point loads proceed without waiting for
earlier instructions to finish reading the old contents of the load's
destination register.
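
As a rough illustration of the mapping-table idea above, here is a toy
Python model in which only loads allocate a new physical register, while
arithmetic instructions reuse whatever mapping is current. The class name,
the method names and the register counts (32 logical, 40 physical) are
assumptions made for the sketch; it is not a description of the actual
RS/6000 hardware.

```python
class RenameTable:
    """Toy logical-to-physical map; only loads get a fresh physical register."""

    def __init__(self, n_logical=32, n_physical=40):
        # Sizes are illustrative assumptions, not taken from the post.
        self.map = list(range(n_logical))                 # logical -> physical
        self.free = list(range(n_logical, n_physical))    # unused physical regs

    def rename_load(self, dest):
        """A load takes a new physical register for its destination."""
        new_phys = self.free.pop(0)
        old_phys = self.map[dest]
        self.map[dest] = new_phys
        # old_phys may still be read by earlier instructions; the caller
        # hands it back with release() once those reads are done.
        return new_phys, old_phys

    def rename_arith(self, dest, srcs):
        """Arithmetic runs in order, so it keeps the current mappings."""
        return self.map[dest], [self.map[s] for s in srcs]

    def release(self, phys):
        """Return a physical register to the free list."""
        self.free.append(phys)
```

In this model a load into fp3 can be given its new register right away,
even if an earlier instruction has not yet read the old fp3.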

This type of out-of-order execution can require heavy hardware support
to provide precise interrupts, which can eventually slow down the clock.
Instead, the RS/6000 offers a few alternatives/tradeoffs regarding how
tightly one wants to monitor exceptions.
For example, for some applications it may not be necessary to check
for exceptions immediately after each floating point arithmetic instruction.
In this case, one can insert an instruction that polls the status register
and traps if an exception is detected.
For instance, in the following pseudo code:

	fdiv fp3,fp2,fp1
	fld  fp4,(r5)
	fld  fp6,(r7)
	...
	check_for_exceptions
	...

Why wait for the possible exception generated by the fdiv?
Instead the loads can be executed in parallel with the fdiv
and later an instruction can be inserted to check for exceptions.
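
The deferred check can be modelled in a few lines of Python. This is only
a sketch of the polling idea: the sticky-flag set, the function names and
the NaN placeholder result are assumptions made for the example, not the
RS/6000's actual status register layout or instruction behaviour.

```python
class FPStatus:
    """Toy status register: exception flags stay set until polled."""

    def __init__(self):
        self.sticky = set()


def fdiv(status, a, b):
    """Divide without trapping; record any exception in the status flags."""
    if b == 0.0:
        status.sticky.add("divide_by_zero")
        return float("nan")   # placeholder result, an assumption of this model
    return a / b


def check_for_exceptions(status):
    """Poll the flags and 'trap' (raise) if anything was recorded."""
    if status.sticky:
        pending = sorted(status.sticky)
        status.sticky.clear()
        raise FloatingPointError(", ".join(pending))
```

Work placed between the fdiv and the check (the two loads in the pseudo
code) goes ahead regardless, and the exception only surfaces at the poll.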

For debugging and for running critical code, exceptions can be checked
after every instruction by running the application in
one-instruction-at-a-time mode. This means that you basically lose the
throughput provided by the pipelined implementation of the functional units.

_________________________________________________
Marc Tremblay
internet: marc@CS.UCLA.EDU
UUCP: ...!{uunet,ucbvax,rutgers}!cs.ucla.edu!marc