Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!tut.cis.ohio-state.edu!bloom-beacon!bu-cs!purdue!i.cc.purdue.edu!k.cc.purdue.edu!l.cc.purdue.edu!cik
From: cik@l.cc.purdue.edu (Herman Rubin)
Newsgroups: comp.arch
Subject: Re: RISC v. CISC --more misconceptions
Summary: If something is not there, it is not used
Message-ID: <1002@l.cc.purdue.edu>
Date: 1 Nov 88 11:47:14 GMT
References: <156@gloom.UUCP> <18931@apple.Apple.COM> <40@sopwith.UUCP> <19762@apple.Apple.COM>
Organization: Purdue University Statistics Department
Lines: 58

In article <19762@apple.Apple.COM>, baum@Apple.COM (Allen J. Baum) writes:
> []
> >In article <40@sopwith.UUCP> snoopy@sopwith.UUCP (Snoopy T. Beagle) writes:
> >In article <18931@apple.Apple.COM> baum@apple.UUCP (Allen Baum) writes:
> >| You may find, however, that it won't make any difference in your
> >| performance because no one needs an integer multiplier very
> >| often. Like a lot of things that RISC designers have left out.
> >
> >I humbly disagree.  Just because *you* never use an integer multiply
> >does not imply that no one else ever does.
> 
> Oh, please!
> I didn't say no one.
> I didn't say ever.
> Of course there are applications that are integer multiplication
> intensive (as opposed to floating point).
> What I did say is that they are quite rare.

They are rare because a good programmer knows that integer multiplies are
slow and difficult to program on such machines.  If an operation takes 10
lines of code, each of which expands into one or more hardware instructions,
why use it unless you cannot find a way around it?  The way around it may be
clumsy, but it is very likely to beat the unavailable operation.

> Integer multiply intensive is defined (here and now, by me) to
> be an application that will suffer a performance degradation of more
> than 3% without a fast hardware multiplier (2-3 cycles, vs. the
> average 11 cycles that HP can do in pure software). (A back of the
> envelope calculation will show that means .3%- pretty high for
> multiply) Most integer multiplies that I am aware of are used for
> index scaling and other address calculations. Good optimizing
> compilers will strength reduce these away.

If the double-precision product of two single-precision integers is required,
and only single-precision products are available, it is necessary to go to
single-precision products of half-precision numbers.  This takes about 20
instructions.  How does the poster expect to do it in an average of 11 cycles?
Many of these jobs are not being done, or are being kludged by finding ways to
accomplish more-or-less the same results in 10 instructions.  And if a
subroutine call is made, double the time.
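
To make the count concrete, here is a minimal sketch in C of the
decomposition described above, assuming 32-bit "single precision" and 16-bit
"half precision".  The names u64_pair and mul32x32 are made up for
illustration and are not taken from any particular compiler or library.

```c
#include <stdint.h>

/* Hypothetical result type: a 64-bit product held as two 32-bit words. */
typedef struct { uint32_t hi, lo; } u64_pair;

/* Full 64-bit product of two unsigned 32-bit integers, built only from
   16x16 -> 32-bit multiplies, shifts, masks and adds. */
u64_pair mul32x32(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi;   /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo;   /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi;   /* weight 2^32 */

    /* Sum of the pieces that land in bits 16..31; at most three 16-bit
       values, so it cannot overflow 32 bits. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    u64_pair r;
    r.lo = (p0 & 0xFFFFu) | (mid << 16);
    r.hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
    return r;
}
```

Counting the masks, shifts, multiplies and adds gives on the order of twenty
operations before any call overhead, and each of the four multiplies is
itself expensive on a machine without a hardware multiplier.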

Many mathematical computations should be made in fixed-point arithmetic.  If
one does not have the hardware available, the cost is much greater than that
of floating point.  If the hardware is available, it is much cheaper.  None
of the major languages supports fixed point.  So none of the hardware gurus
puts it in, so none of the machines has it, so no one programs in it, so
its inclusion is objected to as a waste of resources, etc.
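
As an illustration of why the hardware matters, here is a sketch of a
fixed-point multiply in C.  The unsigned 16.16 format and the names q16_16
and q_mul are assumptions chosen for the example; the point is only that the
operation needs the middle of a double-length product.

```c
#include <stdint.h>

/* Assumed format: unsigned 16.16 fixed point, value = raw / 65536. */
typedef uint32_t q16_16;

/* Multiply two 16.16 numbers.  The exact product has 32 fraction bits,
   so the double-length result is formed first and then shifted back.
   The result is truncated, and an integer part above 16 bits wraps. */
q16_16 q_mul(q16_16 a, q16_16 b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;  /* 32x32 -> 64 product */
    return (q16_16)(wide >> 16);
}
```

For example, q_mul(0x00018000, 0x00020000) multiplies 1.5 by 2.0 and yields
0x00030000, i.e. 3.0.  With a double-length multiply in hardware this is a
multiply and a shift; without one, the 64-bit product has to be assembled in
software first.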

Another hardware operation missing on most machines is square root.  So one
does not use algorithms requiring square roots.  Again, the circular
argument.
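
For a sense of what the software substitute looks like, here is a sketch of
the usual bit-by-bit integer square root in C.  The name isqrt32 is
hypothetical, and this is only one of several possible methods (Newton
iteration is another); it is not meant as what any given library does.

```c
#include <stdint.h>

/* Floor of the square root of a 32-bit unsigned integer, found one
   result bit at a time using only shifts, adds and compares. */
uint32_t isqrt32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit = 1u << 30;      /* largest power of four in 32 bits */

    while (bit > n)               /* skip leading zero digit pairs */
        bit >>= 2;

    while (bit != 0) {
        if (n >= root + bit) {    /* trial subtraction succeeds */
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}
```

The loop runs up to 16 times for a 32-bit argument, with a compare and
several shifts and adds on each pass, so the cost per square root is far
above that of a single hardware instruction.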

An application that makes heavy use of accurate arithmetic will spend most of
its time in multiple-precision subroutines, even with good hardware.  A time
penalty of a factor of 10 here is obviously costly.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)