Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!iuvax!pur-ee!j.cc.purdue.edu!k.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: What should be in hardware but isn't
Message-ID: <581@l.cc.purdue.edu>
Date: Mon, 21-Sep-87 09:27:47 EDT
Article-I.D.: l.581
Posted: Mon Sep 21 09:27:47 1987
Date-Received: Tue, 22-Sep-87 05:36:22 EDT
Reply-To: cik@l.cc.purdue.edu (Herman Rubin)
Organization: Purdue University Statistics Department
Lines: 74


There are many instructions which are easy to implement in hardware, but
whose software emulation is so costly that a procedure relying on them may
be worthless.  Some of these instructions were implemented in the past and
died because ill-designed languages do not even recognize their existence.
Others have never been included, because the so-called experts do not
recognize them, and because of the stupid attitude that nothing should be
implemented unless 99.99% of the users of the machine want the instruction
_now_.  As you can tell from this article, I consider the present CISC
computers to be RISCy.

One situation of this type which has been discussed in this newsgroup is
the proper treatment of quotient and remainder for integer division when
the numbers are not both positive.  Everyone took a stand for some one
specification.  I say "let the user decide."  Even when both operands are
positive, the alternative I want for one problem may not be the one I want
for another.  Having 2-4 bits in the instruction to specify the alternative
for each sign combination should cost very little run time and little space.
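
To make the alternatives concrete, here is a minimal sketch in Python of a
division routine that takes the convention as an argument.  The function
name divide and the mode names are hypothetical, chosen only for this
illustration; they do not correspond to any existing instruction set.

```python
def divide(a, b, mode="floor"):
    """Return (q, r) with a == q*b + r under the chosen convention.

    The mode names are illustrative assumptions, not a hardware spec.
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")
    q, r = divmod(a, b)              # Python default: floor quotient
    if mode == "floor":              # remainder takes the sign of b
        return q, r
    if mode == "trunc":              # round toward zero, r has sign of a
        bump = r != 0 and (a < 0) != (b < 0)
    elif mode == "ceil":             # round toward plus infinity
        bump = r != 0
    elif mode == "euclid":           # remainder always in 0 <= r < |b|
        bump = r < 0
    else:
        raise ValueError("unknown mode: " + mode)
    if bump:
        q, r = q + 1, r - b
    return q, r


if __name__ == "__main__":
    for mode in ("floor", "trunc", "ceil", "euclid"):
        print(mode, divide(-7, 2, mode), divide(7, -2, mode))
```

Every mode satisfies a == q*b + r; only the choice of q differs.  A table
mapping each sign combination to its own mode would be the software
analogue of the per-combination bits.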

Since floating point machines first came out, the much needed instruction
to divide one floating point number by another, yielding an integer quotient
and a floating remainder, has not, to my knowledge, appeared.  If you need
to see uses of this, look at the argument reduction in any good trigonometric
or exponential subroutine.
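
As a rough software sketch of what such an instruction would deliver, the
Python below returns an integer quotient and a floating remainder, then
uses the pair for the range reduction step of a sine routine.  The names
fdivrem and sin_reduced are hypothetical, and rounding the quotient to
nearest is only one possible convention.  A production routine would also
need extra precision in the divisor, which this sketch ignores.

```python
import math


def fdivrem(x, y):
    """Return (q, r): q is the integer nearest x/y, r = x - q*y as a float."""
    r = math.remainder(x, y)     # IEEE 754 style remainder, |r| <= |y|/2
    q = round((x - r) / y)       # recover the integer quotient from r
    return q, r


def sin_reduced(x):
    """Sine via reduction to one quadrant (illustration only)."""
    q, r = fdivrem(x, math.pi / 2)
    quadrant = q % 4             # only the low two bits of q matter
    if quadrant == 0:
        return math.sin(r)
    if quadrant == 1:
        return math.cos(r)
    if quadrant == 2:
        return -math.sin(r)
    return -math.cos(r)


if __name__ == "__main__":
    print(fdivrem(7.5, 2.0))
    print(sin_reduced(10.0), math.sin(10.0))
```

In software this costs a divide, a rounding step, and a conversion;
the point of a single instruction is that the quotient and remainder come
out together.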

With the advent of floating point, fixed point operations seem to be
vanishing.  On the early floating point machines, numerical functions were
frequently done in fixed point for speed and accuracy.  The need for this
has not changed, but the availability has.  Also, it should be possible to
convert between fixed and floating point without the overhead of a multiply;
this was possible on the UNIVAC 1108 and 1110.  Another useful operation is
to multiply a floating point number by a power of 2 by adding to the
exponent; this was on the CDC 3600.  It needs to be a separate instruction
because of the possibility of overflow and/or underflow, which a plain
integer add to the exponent field would not detect.
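
For readers who want to see the operations in software terms, the sketch
below uses Python's math.ldexp, which scales by a power of 2 through the
exponent.  The wrapper names and the status strings are assumptions made
for this illustration.  The underflow check only catches a result that
collapses to zero, not a gradual loss of precision.

```python
import math


def scale_by_power_of_two(x, n):
    """Return (x * 2**n, status) using an exponent adjustment."""
    try:
        y = math.ldexp(x, n)             # raises OverflowError when too large
    except OverflowError:
        return None, "overflow"
    if x != 0.0 and y == 0.0:
        return y, "underflow"            # result vanished entirely
    return y, "ok"


def float_to_fixed(x, frac_bits):
    """Convert to fixed point with frac_bits fraction bits (truncating)."""
    return int(math.ldexp(x, frac_bits))


def fixed_to_float(i, frac_bits):
    """Convert a fixed point integer back to floating point."""
    return math.ldexp(float(i), -frac_bits)


if __name__ == "__main__":
    print(scale_by_power_of_two(1.5, 4))
    print(scale_by_power_of_two(1.0, 5000))
    print(float_to_fixed(1.75, 8), fixed_to_float(448, 8))
```

Both conversions here are exponent shifts plus a truncation; no multiply
by a scale factor is involved.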

I have run into situations in non-uniform random number generation in which
considerable time is spent on tests which would be better handled as
exceptions.  One of these is to decrement an index, use the result for a
read or write instruction if it is non-negative, and interrupt to a
user-provided exception handler if it is negative.  Another is to find the
distance to the next one in a bit stream, with an interrupt if the stream is
emptied.  There are procedures which are extremely efficient computationally,
but for which the overhead is large if these operations are not in hardware;
if a higher level language has to be used for the instruction, it would make
the cost prohibitive.  The VAXen have in hardware (at least for some
machines) a find-first-set instruction (FFS), but it requires three other
operations, one of which is a conditional, to get one result.
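
The two operations can be sketched in Python, with an exception standing
in for the hardware interrupt.  The names StreamExhausted,
next_one_distance and take_previous are hypothetical, and the bit stream
is modeled as a non-negative integer consumed from the least significant
bit, which is an assumption of this sketch.

```python
class StreamExhausted(Exception):
    """Stands in for the interrupt to a user-provided handler."""


def next_one_distance(bits):
    """Return (distance to the next one bit, remaining stream)."""
    if bits == 0:
        raise StreamExhausted("no one bit left in the stream")
    distance = (bits & -bits).bit_length() - 1   # number of trailing zeros
    return distance, bits >> (distance + 1)      # consume through the one


def take_previous(buffer, index):
    """Decrement index, then read buffer[index]; trap if it went negative."""
    index -= 1
    if index < 0:
        raise StreamExhausted("index ran below zero")
    return buffer[index], index


if __name__ == "__main__":
    stream = 0b101000
    try:
        while True:
            d, stream = next_one_distance(stream)
            print("distance", d)
    except StreamExhausted:
        print("stream emptied")
```

In software every call pays for the explicit test; the request in the text
is that the common path carry no test at all and the rare case trap.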

On many machines, even if fixed point arithmetic is in the hardware,
multiplication and division cannot be unsigned.  All of the multiple
precision software with which I am familiar is sign-magnitude.  An additional
hardware bit to say whether signed or unsigned arithmetic is to be used would
be cheap.  (It is extremely difficult to program multiple precision
arithmetic in floating point.  It is also difficult on the many machines
which do not have reasonable integer multiplication.)
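
To show where the unsigned multiply enters, here is a minimal schoolbook
multiplication on magnitudes stored as lists of 32-bit limbs, least
significant limb first.  The limb width and the helper names are
assumptions of this sketch; Python's own integers hide the issue, so the
masking is written out by hand.

```python
BASE_BITS = 32                      # assumed limb width
MASK = (1 << BASE_BITS) - 1


def mul_limbs(a, b):
    """Multiply two magnitudes given as little-endian lists of limbs."""
    result = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            # The step that needs an unsigned 32 x 32 -> 64 bit multiply:
            # limbs with the top bit set must not be read as negative.
            t = ai * bj + result[i + j] + carry
            result[i + j] = t & MASK
            carry = t >> BASE_BITS
        result[i + len(b)] = carry
    return result


def to_limbs(n, count):
    """Split a non-negative integer into count limbs."""
    return [(n >> (BASE_BITS * k)) & MASK for k in range(count)]


def from_limbs(limbs):
    """Reassemble a non-negative integer from its limbs."""
    return sum(limb << (BASE_BITS * k) for k, limb in enumerate(limbs))


if __name__ == "__main__":
    x, y = 0xFFFFFFFFFFFFFFFF, 0xFFFFFFFF00000001
    product = mul_limbs(to_limbs(x, 2), to_limbs(y, 2))
    print(from_limbs(product) == x * y)
```

The sign of the result is kept separately, as in sign-magnitude software.
With only a signed multiply available, each limb product needs correction
steps or the limbs must be shortened by a bit.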

I make no pretense that this list is complete.  While I might find them
useful, I would not suggest that transcendental functions (except for the
CORDIC routines) be put in hardware, as that would merely encode a software
algorithm built from the existing instructions as a hardware, rather than
software, series of instructions.  What I am suggesting should be in
hardware are instructions which manipulate the bits in different ways, or
which use easy branching at nanocode time in place of slow branching in
software, where the hardware cannot use the non-restrictive nature of the
branch.  The cost of the CPU is usually a small part of the cost of the
computer.


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (ARPA or UUCP) or hrubin@purccvm.bitnet