Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: comp.arch
Subject: Re: What should be in hardware but isn't
Message-ID: <8646@utzoo.UUCP>
Date: Wed, 23-Sep-87 12:37:04 EDT
Article-I.D.: utzoo.8646
Posted: Wed Sep 23 12:37:04 1987
Date-Received: Wed, 23-Sep-87 12:37:04 EDT
References: <581@l.cc.purdue.edu>
Organization: U of Toronto Zoology
Lines: 66

> One situation of this type which has been discussed in this newsgroup is
> the proper treatment of quotient and remainder for integer division when
> the numbers are not both positive.  Everyone took a stand for some
> specification.  I say "let the user decide."...

Of course, on most of the RISC machines the user *does* have the choice,
since division is generally done in software rather than hardware.
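
A minimal sketch, in C, of what that choice looks like once division is a
software routine.  The names (struct qr, div_trunc, div_floor) are made up
for illustration, and the truncating version assumes a C99-or-later
compiler, where "/" is defined to round toward zero.

```c
#include <assert.h>

/* Quotient/remainder pair; the type and function names are illustrative. */
struct qr { long q, r; };

/* Truncating division: the quotient rounds toward zero and the
   remainder takes the sign of the dividend. */
struct qr div_trunc(long n, long d)
{
    struct qr x;
    assert(d != 0);
    x.q = n / d;    /* C99 and later guarantee rounding toward zero */
    x.r = n % d;
    return x;
}

/* Floored division: the quotient rounds toward minus infinity and the
   remainder takes the sign of the divisor. */
struct qr div_floor(long n, long d)
{
    struct qr x = div_trunc(n, d);
    if (x.r != 0 && ((x.r < 0) != (d < 0))) {
        x.q -= 1;   /* step the quotient down by one ... */
        x.r += d;   /* ... and compensate in the remainder */
    }
    return x;
}
```

Both routines satisfy n == q*d + r; they differ only when the operands
have opposite signs and the division is inexact.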

> Since floating point machines first came out, the much needed instruction
> to divide one floating point number by another with an integer quotient
> and a floating remainder has not, to my knowledge, appeared...

Although you don't get it bundled into one instruction, the pieces needed
to do this are present in any IEEE floating-point implementation, e.g. the
68881.  The remainder can be had with one instruction (on the 68881, FMOD
or FREM depending on exactly what you're doing); the quotient would take
two, I think (just a divide and a convert-to-integer).

> ... Another
> operation is to multiply a floating point number by a power of 2 by 
> adding to the exponent; this was on the CDC 3600...

FSCALE on the 68881.
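
For illustration, a sketch of "adding to the exponent" done by hand in C.
It assumes that double is an IEEE 754 64-bit format (1 sign bit, 11
exponent bits, 52 fraction bits) and handles only normal numbers whose
result is also normal; the name scale_pow2 is made up.  The portable
library call for the general case is ldexp().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Multiply x by 2^n by adjusting the exponent field directly.
   Assumes IEEE 754 binary64; zeros, denormals, infinities, NaNs and
   out-of-range results are rejected rather than handled. */
double scale_pow2(double x, int n)
{
    uint64_t bits;
    int e;

    memcpy(&bits, &x, sizeof bits);          /* reinterpret without aliasing trouble */
    e = (int)((bits >> 52) & 0x7FF);         /* biased exponent */
    assert(e != 0 && e != 0x7FF);            /* input must be a normal number */
    assert(n > -0x7FF && n < 0x7FF);         /* keeps e + n from overflowing int */
    assert(e + n > 0 && e + n < 0x7FF);      /* result must be normal too */

    bits &= ~((uint64_t)0x7FF << 52);        /* clear the old exponent */
    bits |= (uint64_t)(e + n) << 52;         /* store the new one */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

The sign and fraction bits are untouched, so the result is exact whenever
it is representable as a normal number.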

> ... Another is to find the distance to the next
> one in a bit stream, with an interrupt if the stream is emptied...

On most modern machines it should be possible to write a loop that will do
this at very nearly full memory bandwidth, looking at a byte or a word at
a time and using table lookup for the final bit-picking.  I am constantly
amused by people who scream for bit-flipping instructions when doing it a
byte or a word at a time, using table lookup for non-trivial functions, is
still faster.  "Work smart, not hard".

> On many machines, even if fixed point arithmetic is in the hardware,
> multiplication and division cannot be unsigned...

Again, on the RISCs you generally get your choice, because multiply is done
in tuned software rather than hardware.  (And it's usually faster than a
CISC multiply, since most multiplies are by small integer constants that a
RISC can generate custom code for.)
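
To make that concrete, a sketch in C of the kind of code involved: a
multiply by the constant 10 reduced to two shifts and an add, next to a
general shift-and-add routine of the sort a software multiply might use.
The function names are hypothetical, and real compilers and run-time
libraries use more heavily tuned sequences than this.

```c
/* x * 10 == x*8 + x*2: two shifts and one add, no multiply needed. */
unsigned long times10(unsigned long x)
{
    return (x << 3) + (x << 1);
}

/* General unsigned multiply by shift-and-add.  The loop runs once per
   bit of b up to its highest set bit, so small multipliers are cheap.
   The result wraps modulo 2^N, as unsigned arithmetic does in C. */
unsigned long mul_shift_add(unsigned long a, unsigned long b)
{
    unsigned long p = 0;
    while (b != 0) {
        if (b & 1)
            p += a;     /* add in this power-of-two multiple of a */
        a <<= 1;
        b >>= 1;
    }
    return p;
}
```

Being unsigned throughout, the routine also shows how software multiply
sidesteps the signed-only restriction complained about above.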

> I would not suggest that transcendental functions (except for the CORDIC
> routines) be hardware, as they would be merely encoding a software algorithm
> using the existing instructions as a hardware, rather than software, series
> of instructions...

Actually, there is one fairly good argument for putting the transcendentals
in hardware, to wit making a high-quality implementation available cheaply.
The transcendentals in (say) the 68881 are *better* than anything you will
come up with in software without large amounts of work.  You can buy a 68881
for far less than it would cost you to commission or license equivalent code.

> What I am suggesting is that instructions manipulating the
> bits in different ways, or using easy branching at nanocode time instead of
> slow branching when the hardware cannot use the non-restrictive nature of the
> branch, should be...

Note that many RISCs are directed quite specifically at this objective:
giving the programmer (or, more usually, compiler writer) detailed control
of the hardware, rather than putting a half-baked interpretive layer in
between.  To misquote the famous adage, "microcode stands between the user
and the hardware".
-- 
"There's a lot more to do in space   |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry