Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!sun-barr!apple!usc!ucla-cs!math.ucla.edu!sonia!pmontgom
From: pmontgom@sonia.math.ucla.edu (Peter Montgomery)
Newsgroups: comp.arch
Subject: Re: hardware complex arithmetic support (long)
Message-ID: <1607@sunset.MATH.UCLA.EDU>
Date: 22 Aug 89 02:06:22 GMT
References: <kYtuglW00V4G01l=ZK@andrew.cmu.edu> <1672@crdgw1.crd.ge.com> <4781@freja.diku.dk>
Sender: news@MATH.UCLA.EDU
Reply-To: pmontgom@math.ucla.edu (Peter Montgomery)
Organization: UCLA Mathematics Department
Lines: 103

In article <4781@freja.diku.dk> njk@freja.diku.dk (Niels Jørgen Kruse) writes:
>
>Consider that it is meaningless from a numerical viewpoint to
>represent one component of a complex number with greater
>accuracy than the other.
>
>This means that a dedicated storage format need only have *one*
>exponent. Comparing such a double precision format to a conventional
>representation as 2 double precision ieee numbers, 11 exponent
>bits are saved and 2 hidden fraction bits are lost (because
>only one fraction will be normalized in general and it may be
>any of them).
>
>This leaves 9 extra bits which can be used to extend the range
>and precision, for instance 4 extra bits in each fraction and 1
>extra exponent bit. This translates to more than a full decimal
>digit of extra precision and larger range too.
>
>The hardware cost of support for such a storage format need not
>be excessive, when speed is not the main concern. A low cost
>implementation could get away with only 2 extra instructions
>for each complex format (single, double precision) :
>a load instruction that would load the components of a complex
>number into 2 regular fp registers and the reverse store
>instruction. Complex arithmetic would then be done with a
>sequence of the regular instructions. The main cost of this is
>making registers and arithmetic units wider to accommodate the
>extra precision of the complex format. However, a few bits of
>extra precision on in-register arithmetic is useful for plain
>fp arithmetic too, especially for authors of math libraries.

	If the application environment is expected to make heavy use of
complex arithmetic, then it is reasonable to supply the fundamental
arithmetic operations (e.g., load/store, load conjugate, indexing into 
complex arrays, addition, subtraction, multiplication, 
multiplication by conjugate, multiplication by real number,
absolute value, equality/inequality test) directly in hardware.  
Doing division in hardware will be especially nice if it frees
the software from concern about exponent overflow (e.g.,
a quotient z1/z2 may be well-defined even though the intermediate results
z1*CONJUGATE(z2) and z2*CONJUGATE(z2) overflow or underflow).
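
	A rough Python sketch of the overflow problem follows.  The function
names are made up for illustration, and the scaled version uses Smith's
method, one well-known software workaround; it is not necessarily what a
hardware divider would do.

```python
import math

def divide_naive(z1: complex, z2: complex) -> complex:
    """z1/z2 computed as z1*conj(z2) / (z2*conj(z2))."""
    a, b, c, d = z1.real, z1.imag, z2.real, z2.imag
    denom = c * c + d * d            # z2*conj(z2): overflows when |z2| is large
    return complex((a * c + b * d) / denom, (b * c - a * d) / denom)

def divide_scaled(z1: complex, z2: complex) -> complex:
    """Smith's method: scale by the larger component of z2 before dividing."""
    a, b, c, d = z1.real, z1.imag, z2.real, z2.imag
    if abs(c) >= abs(d):
        r = d / c                    # |r| <= 1, so d*r cannot overflow
        denom = c + d * r
        return complex((a + b * r) / denom, (b - a * r) / denom)
    r = c / d
    denom = c * r + d
    return complex((a * r + b) / denom, (b * r - a) / denom)

big = complex(1e200, 1e200)
print(divide_naive(big, big))        # intermediate products overflow to infinity
print(divide_scaled(big, big))       # the quotient itself is perfectly ordinary
```

Here the quotient is exactly 1, yet the naive version produces NaNs because
both z1*CONJUGATE(z2) and z2*CONJUGATE(z2) exceed the double precision range.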

	However, I cannot approve of storing the components of a complex
number in a format different from that used for ordinary floating point
data.  Some of my reasons are:

    i)  I dispute that the real and imaginary components of a 
	complex number will always have comparable exponents.
        For example, after taking the principal logarithm of a 
        complex number, the real part of the result can have any
        magnitude while the imaginary part will be between -PI and +PI.  
	As another example, some of my fellow number theorists are
	studying the nontrivial zeros of the Riemann zeta function; these
	zeros are known to have real part between 0 and 1 (and believed
	to always be exactly 0.5), yet the imaginary parts can get
	arbitrarily large.

   ii)  The FORTRAN language allows equivalencing of real and
        complex arrays, and requires that each component of a complex
        number have the same format as a real number.
        Many existing programs using complex arithmetic are coded in FORTRAN.
	Programs in other languages may also be dependent upon this when
	they pass a pointer to a component of a complex number.

 iii)   If the components of a complex number have more precision
        than regular numbers, it will severely complicate operations which
        look at the individual components.  For example, given a statement

               if (REAL(z1) > REAL(z2)) then ...

        will the compiler need to reduce the precision of z1 and z2 as it
        does the compares?  Niels suggests extra precision for in-register
	arithmetic, but that is inadequate unless those registers can be
	copied to memory and restored as needed.  For example, when converting 
	from ASCII to complex, how will the compiler (or run time library)
	arrange to convert one component from ASCII to extended precision, 
        save this value, convert the other component, and then merge the 
        two components into a complex number?  Different precisions
	for register arithmetic and storage arithmetic can also confuse
	the programmer; for example

		real x, y, sum

		sum = x+y
		if (sum .eq. x+y) then
		    ...

	may skip the "..." code if the compiler stores "sum" in memory
	and reloads it while recomputing x+y in an extended-precision
	register, because the store truncates the sum.

   iv)  If the representations are identical, then a smart compiler 
        can use the complex arithmetic instructions even when the 
	original source code uses non-complex data.  For example, given

		real array1(10), array2(10), scalar

		array1 = array1 + scalar*array2      (array assignment)
	
        a machine with complex arithmetic support but no vector support
        can view each array as five complex numbers rather than ten real 
        numbers, and use the complex arithmetic instructions for the operation.
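
	Points i, iii and iv can be checked numerically.  The Python sketch
below is only an illustration and all function names are made up.  For
point i it models a hypothetical shared-exponent format with 56 fraction
bits and no hidden bit (my reading of the proposal: 52 stored bits plus 4
extra); the rounding rule is an assumption.  For point iii a Python float
(a double) stands in for the wide register and a 32-bit store stands in
for memory.

```python
import math
import struct

def shared_exponent_round(re, im, frac_bits=56):
    """Point i: round both components to one common exponent (hypothetical format)."""
    biggest = max(abs(re), abs(im))
    if biggest == 0.0:
        return re, im
    _, e = math.frexp(biggest)               # biggest = m * 2**e with 0.5 <= m < 1
    ulp = math.ldexp(1.0, e - frac_bits)     # spacing shared by both components
    return round(re / ulp) * ulp, round(im / ulp) * ulp

def store_single(x):
    """Round a value to IEEE single precision, as a store to a 32-bit real would."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_survives_store(x, y):
    """Point iii: does 'sum .eq. x+y' hold after sum has been stored and reloaded?"""
    in_register = x + y                      # wider result, never written to memory
    in_memory = store_single(in_register)    # what 'sum' holds after the store
    return in_memory == in_register

def axpy_real(a1, s, a2):
    """Point iv: array1 + scalar*array2, one real element at a time."""
    return [u + s * v for u, v in zip(a1, a2)]

def axpy_via_complex(a1, s, a2):
    """Same update, treating each adjacent pair of reals as one complex number."""
    out = []
    for k in range(0, len(a1), 2):           # assumes an even number of elements
        z1 = complex(a1[k], a1[k + 1])
        z2 = complex(a2[k], a2[k + 1])
        z = z1 + complex(s * z2.real, s * z2.imag)   # complex add, real-times-complex
        out += [z.real, z.imag]
    return out

print(shared_exponent_round(0.5, 1e20))      # the real part 0.5 is lost entirely
print(sum_survives_store(1.0, 2.0 ** -30))   # the stored sum no longer equals x+y
a1 = [float(i) for i in range(10)]
a2 = [0.1 * i for i in range(10)]
print(axpy_via_complex(a1, 3.0, a2) == axpy_real(a1, 3.0, a2))
```

The complex version gives the same ten results because complex addition and
multiplication by a real number both act on the components separately,
which is only possible when the two formats agree.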
--------
        Peter Montgomery
        pmontgom@MATH.UCLA.EDU