Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!sun-barr!newstop!sun!quintus!ok
From: ok@quintus.UUCP (Richard A. O'Keefe)
Newsgroups: comp.arch
Subject: Re: Integer Multiply/Divide on Sparc
Message-ID: <1296@quintus.UUCP>
Date: 1 Jan 90 01:24:04 GMT
References: <84768@linus.UUCP> <8840004@hpfcso.HP.COM> <1804@l.cc.purdue.edu> <1535@cbnewsi.ATT.COM> <28548@amdcad.AMD.COM> <1989Dec30.161619.22893@phri.nyu.edu>
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 34

In article <1989Dec30.161619.22893@phri.nyu.edu> roy@phri.nyu.edu (Roy Smith) writes:
>                                                        Most (handwave)
>non-computational integer multiplies (by which I mean operations invented by
>the compiler for array subscripting, pointer arithmetic, etc) involve one
>constant factor with few ones, ideal for shift-and-add strength reductions.

Roy Smith isn't the only one to say this recently in comp.arch.
What I'm afraid of here is the coupling I see between RISC design and
crippled programming languages.  Smith's claim about array subscripting
is true in some pathetically crippled languages where array bounds are
completely known at compile time (C and Pascal, and some derivatives of
Pascal).  It wasn't true in Algol 60 or Simula 67, and it isn't true in
Fortran 77.  (The reason that the multiplications in Fortran 77 tend to
involve "hard" multiplication is that good compilers already use
strength reduction to remove multiplication entirely when they can.)
Fortran 8X and Ada also allow arrays with dynamic bounds.
Actually, there _is_ a way that addressing polynomials could be handled
efficiently on machines which have slow multiplication, and that is for
a dynamic array declaration like
	real array a[1:P,1:Q,1:R];
to generate a local "addressing subroutine" at _run_ time
	a_poly := lambda(i,j,k) ((check(1,P,i)*Q+check(1,Q,j))*R+
				  check(1,R,k))*sizeof real + &a_data;
using whatever multiply-step sequence was best.  But that, of course,
requires a cheap way of generating code on the run.  And there are some
RISC systems that make _that_ hard too.  The remaining alternative is to
precompute the offsets in some way, so that a[i,j,k] turns into
	a_data[QR_times[i] + R_times[j] + k]
which is all very well, but think about Fortran 8X cross-sections...

(This article is not a claim that fast integer multiplication and division
 is necessary to support Fortran and Ada well.  I haven't the figures to
 prove this.
)