Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!spool.mu.edu!news.nd.edu!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
From: hrubin@pop.stat.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: new instructions
Summary: y
Message-ID: <12526@mentor.cc.purdue.edu>
Date: 20 May 91 13:47:38 GMT
Article-I.D.: mentor.12526
References: <9105200213.AA05095@ucbvax.Berkeley.EDU>
Sender: news@mentor.cc.purdue.edu
Lines: 140


In article <9105200213.AA05095@ucbvax.Berkeley.EDU>, James B. Shearer
(jbs@WATSON.IBM.COM) writes:

>         Herman Rubin writes:
> Fixed point arithmetic is little used now because the hardware to support
> it reasonably well does not exist.  It is worse than the floating problems
> before hardware floating arithmetic, especially if floating is automatically
> normalized.  THAT feature of "modern" architectures is, in my opinion, a sheer
> horror.
> 
>         I think fixed point arithmetic is little used now because the vast
> majority of users find it harder to use than floating point with no
> compensating advantages.  Does anyone seriously believe if a few instructions
> were added to provide hardware support (btw what is missing?  In what way
> has fixed point support deteriorated?) fixed point usage would increase
> significantly?  Regarding unnormalized floating point what is this good
> for besides simulating multiple precision integer arithmetic?

The last question first:  How about multiple precision floating (or fixed)
arithmetic?  Considering that there are quite a few papers on this, it is
certainly a topic of interest.  I do not believe it should be necessary here
to go into the full range of situations I can list NOW where this would be
useful.  

Now what is the benefit of allowing only normalized floating point?  It 
eliminates the need for a normalization option in floating instructions,
and it provides ONE more bit of accuracy.  Is that ONE bit exactly what
is needed?  This is very unlikely.

Now what is the cost of not having forced normalization, besides the one
bit?  There would have to be a method for indicating which result of the
operation is wanted (upper, lower, normalized).  There would be little
additional hardware other than the decoding, by the floating unit, of
this information.
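
As an illustration of what selecting the upper or lower result means, here
is a minimal sketch in C.  It uses integer (fixed point) multiplication
rather than floating point: the full double-length product of two 32-bit
words is formed and both halves are handed back, so the caller picks the one
wanted.  The names product_halves and mul_halves are hypothetical, and the
sketch is not meant to match any particular machine's instruction.

```c
#include <stdint.h>

/* Hypothetical result type: both halves of a double-length product. */
typedef struct {
    uint32_t upper;  /* most significant 32 bits of the 64-bit product */
    uint32_t lower;  /* least significant 32 bits of the 64-bit product */
} product_halves;

/* Multiply two 32-bit words and return both halves of the product.
   Read as fractions scaled by 2^32, the upper half is the rounded-down
   fixed point product; read as integers, the lower half is the product
   modulo 2^32. */
static product_halves mul_halves(uint32_t a, uint32_t b)
{
    uint64_t full = (uint64_t)a * b;  /* exact: cannot overflow 64 bits */
    product_halves r;
    r.upper = (uint32_t)(full >> 32);
    r.lower = (uint32_t)full;
    return r;
}
```

With 0x80000000 standing for the fraction 1/2, mul_halves(0x80000000,
0x80000000) has upper half 0x40000000, that is 1/4, and lower half 0.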

There are algorithms which benefit from using both Boolean and arithmetic
operations on either fixed point or floats.  These are not even readily
available on many of the newer machines.
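
One small example of Boolean operations applied to floats, as a sketch
that assumes IEEE 754 64-bit doubles: an AND with a mask clears the sign
bit, and an AND/OR pair transfers a sign from one number to another.  C has
no bitwise operators on floating types, so the bits are copied into an
integer and back, which is roughly the detour the hardware imposes when the
float and integer registers are separate.  The names fabs_by_mask and
copy_sign_bits are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_BIT (UINT64_C(1) << 63)  /* sign bit of an IEEE 754 double */

/* Absolute value by clearing the sign bit with a Boolean AND. */
static double fabs_by_mask(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret, do not convert */
    bits &= ~SIGN_BIT;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Magnitude of mag combined with the sign of sgn, using AND and OR. */
static double copy_sign_bits(double mag, double sgn)
{
    uint64_t m, s;
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    m = (m & ~SIGN_BIT) | (s & SIGN_BIT);
    memcpy(&mag, &m, sizeof mag);
    return mag;
}
```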

>         Herman Rubin also writes:
> In the early FP computers, much function calculation was done in fixed point,
> to get increased accuracy at little cost.
> 
>         This only makes sense when the floating point fraction length is
> less than a full integer.  With 64-bit floating point there is little need
> for increased accuracy in any case.

So why not have increased integer accuracy?  It is no harder to do this, and
the same units can be used.  To someone with a mathematical outlook, the 
distinction is not integer/float but short integer/good arithmetic.

There are algorithms which call, at some stage at least, for fixed point 
arithmetic.  Infinite precision (no mistake here) methods of generating
non-uniform random numbers tend to be of this type.  Converting the fixed
point results to floating can be a major problem, as 0 is a possible value;
again, this is a real problem only with automatic normalization.

>         Herman Rubin also writes:
> How do you expect users who do not even know of the existence of the operations
> to use them?
> 
>         I expect the compiler to generate the instructions for them.  If
> the compiler won't generate an instruction this is a strong reason for
> not having it in the instruction set.

The chicken and the egg again.  Anyone who is willing to say that something
is not useful is either ignorant, arrogant, or stupid.  Nobody can, or should,
ever attempt to do this.  Even the best people can make big mistakes.

Leaving it to the compiler assumes that the language designers and hardware
designers perceive the need for the instructions.  This is clearly not the
case.  An example is the
attempt to add integer multiplication to the CDC 6x00 series.  The fact
that floating multiplication automatically normalized the product when
both factors were normalized made the operation far less useful than
intended.  There was too much already in the hardware to correct this.

How many current languages have user-definable operations?  That the designers
of C did not think of multiple precision arithmetic, or quotient and remainder,
or exponentiation as an operation and not a function, etc., does not mean that
these should have been omitted.  There are provisions for octal and hex
integers, but none for explicitly writing floats or fixed-point numbers to
those bases.  Even at my age, I am quite capable of operating entirely in
binary even if there is not a good reason for it, and often there is.

>         Herman Rubin also writes:
> There are many more algorithms than are in the philosophy of software, and
> especially hardware, designers.

>         The argument here seems to be that there are numerous algorithms
> which are not used at all today which suddenly would be used all over the
> place if only a few changes were made to the instruction set.  I don't
> agree.  If two algorithms have similar performance both will be in use as
> there will be some problems which are particularly suited to one or the
> other.  If an algorithm is not used at all on today's machines this means
> it is not competitive even on those problems for which it is particularly
> suited.  Making changes to the instruction set is unlikely to drastically
> change the relative performance of two algorithms, hence is unlikely
> to drastically change the amount of usage each gets.  I believe it is
> perfectly sensible for hardware designers to concentrate on speeding up
> the existing mix of applications.

Like the infamous frexp function in C?  For those not familiar with it,
frexp(x,&n) takes a floating number x and expresses it as y*2^k, where
.5 <= |y| < 1; y is returned and the value of k is stored in n.  Instead of
completely inlining this operation, which would have made the whole thing
simple and suggested the obvious alternate form

	y,n = frexp(x)

which also assigns the values to registers or memory as wanted, and avoids
a clumsy function call, they made it a library function which passes the
exponent back through a pointer.  The 4.2BSD library even used a machine
independent algorithm, which needless to say was frequently slower than
machine-dependent code by a factor of more than 100.  It even went into an
infinite loop on an argument of 0.  Of course the work has to be done in
machine-dependent code, and in the integer registers, if they are separate.
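
To make this concrete, here is a minimal sketch in C of the two-result form,
doing the work with integer operations on the bit pattern.  It assumes IEEE
754 64-bit doubles, which is an assumption of the sketch and not something
the machines discussed here necessarily provide.  The names frexp_pair and
frexp_bits are hypothetical.  Zero is tested explicitly, so there is no loop
that could fail to terminate.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: fraction and exponent returned together. */
typedef struct {
    double frac;  /* y with 0.5 <= |y| < 1, or x itself for 0, inf, NaN */
    int    exp;   /* k with x == frac * 2^k (0 for 0, inf, NaN) */
} frexp_pair;

/* Split x into fraction and exponent using integer operations only.
   Assumes IEEE 754 binary64: 1 sign bit, 11 exponent bits (bias 1023),
   52 fraction bits. */
static frexp_pair frexp_bits(double x)
{
    frexp_pair r = { x, 0 };
    uint64_t bits;
    unsigned biased;
    int adjust = 0;

    memcpy(&bits, &x, sizeof bits);
    biased = (unsigned)((bits >> 52) & 0x7FF);

    if (biased == 0x7FF || x == 0.0)
        return r;                      /* inf, NaN, zero: no normalization */

    if (biased == 0) {                 /* subnormal: scale up by 2^54 first */
        x *= 18014398509481984.0;      /* exactly 2^54 */
        adjust = -54;
        memcpy(&bits, &x, sizeof bits);
        biased = (unsigned)((bits >> 52) & 0x7FF);
    }

    r.exp = (int)biased - 1022 + adjust;
    /* Overwrite the exponent field with 1022, which places |frac| in
       [0.5, 1) while leaving the sign and fraction bits untouched. */
    bits = (bits & ~(UINT64_C(0x7FF) << 52)) | (UINT64_C(1022) << 52);
    memcpy(&r.frac, &bits, sizeof r.frac);
    return r;
}
```

For example, frexp_bits(8.0) gives frac 0.5 and exp 4, and frexp_bits(0.0)
returns at once with frac 0 and exp 0.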

Except for ALGOL and possibly COMMON LISP, I know of no language designers
who even made a real attempt to put in the variety needed.  Neither group
really succeeded.  

On vector and parallel processors, things which are of essentially no
importance on scalar machines suddenly become important.  On modern 
machines, even local transfers can be costly, and context switches more
so.  On parallel machines, how does one handle conditional transfers and
conditional calls?  They are deadly, and if one has 2^14 units (already
in existence), a condition which is one-in-a-million on each unit arises
on at least one of them about one time in 62.  If the
condition calls for major work, say 100,000 cycles, and I have examples
of this, this can occupy most of the time.  There are single operations,
which deviate from the parallel idea, but are similar to those already
in use there, which can alleviate the mess.
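
The arithmetic behind these figures can be checked with a short sketch in
C.  It assumes the units are independent and run in lockstep, so that every
unit waits whenever any one of them takes the rare path; that model, the
function names, and the 10-cycle cost used for the common path in the note
below are assumptions made for illustration only.

```c
/* Probability that at least one of `units` independent units meets a
   condition which each unit meets with probability p. */
static double prob_any(double p, unsigned units)
{
    double none = 1.0;       /* probability that no unit meets it */
    double base = 1.0 - p;

    while (units > 0) {      /* (1 - p)^units by repeated squaring */
        if (units & 1u)
            none *= base;
        base *= base;
        units >>= 1;
    }
    return 1.0 - none;
}

/* Expected cycles per step when the whole machine stalls for `rare`
   cycles each time any unit meets the condition. */
static double expected_cycles(double p, unsigned units,
                              double common, double rare)
{
    return common + prob_any(p, units) * rare;
}
```

With p = 1e-6 and 2^14 units, prob_any comes to about 0.016, or roughly one
step in 62.  With a 100,000-cycle rare path and an assumed 10-cycle common
path, expected_cycles is a little over 1,600, nearly all of it spent on the
rare case.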

-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)