Path: utzoo!mnetor!uunet!steinmetz!sungoddess!oconnor From: oconnor@sungoddess.steinmetz (Dennis M. O'Connor) Newsgroups: comp.arch Subject: Re: RISC is a nasty no-no! Message-ID: <9788@steinmetz.steinmetz.UUCP> Date: 4 Mar 88 16:24:06 GMT References: <4400@aw.sei.cmu.edu> Sender: news@steinmetz.steinmetz.UUCP Reply-To: sungoddess!oconnor@steinmetz.UUCP Organization: GE Corporate R&D Center Lines: 73 An article by daveb@geac.UUCP (David Collier-Brown) says: ] In article <4400@aw.sei.cmu.edu> firth@bd.sei.cmu.edu.UUCP (Robert Firth) writes: ] [discussion about array representation, and the hardware support ] appropriate to multiplicative addressing] ] ] >However, my reading of ANSI X3.9-1978, especially Section 5, on arrays, ] >leads me to conclude that array representation in column-major form, ] >and array access by chain multiplication-&-addition of subscripts, is ] >the only feasible implementation choice. ] ] My reading of the ANSI proposal leads me to the same conclusion. ] This is both good (it eliminates an ambiguity) and bad (it removes ] the freedom of an compiler-writer to do what is best applicable to ] her architecture). ] -- ] David Collier-Brown. {mnetor yunexus utgpu}!geac!daveb Why is it we have to reinvent the wheel so often ? N-dimensional arrays, whether column or row major order, can always be accessed without run-time multiplies. And can still be in a continuous block of memory. The method for doing this is at least 14 years old : I heard about it in 1974. An example will illustrate. BTW, this is NOT the most optimized algorithm, but they are simple permutations of this. Given an array that is N1 by N2 by N3 et cetera. First, allocate your block of memory for the N1*N2*N3*etc elements of the array. Then, for each dimension EXCEPT the minor one (in this example, Nn, but it could be N1 or any Nx ) allocate space for a vector of words with a number of elements equal to the cardinality of that dimension. Then simply fill in these vectors with (range 1..Cardinality(Nx)) * (whatever multiplier is used for this dimension in the multiplication-&-addition mwthod ). Note that this is either done at compile time, or ONCE when the array is allocated, and uses very little multiplication. Now, when you need to access ARRAY[a,b,c,...,n] you get the address by using ARRAY_BASE + N1VEC[a] + N2VEC[b] + N3VEC[c] ... + n. No multiplies. This, I understand, is the ORIGINAL meaning of the term "VECTORIZED ARRAY". Here's a concrete example : type FAST_ARRAY is array[1..6,1..4,1..7] of WHATEVER ; produces : type MEMBLOCK is array [ 6*4*7 ] of WHATEVER ; N1VEC : constant array [1..6] of INTEGER := ( 0, 28, 56, 84, 112, 140 ) ; N2VEC : constant array [1..4] of INTEGER := ( 0, 7, 14, 21 ) ; PRAGMA INLINE( GET_FAST_ARRAY, SET_FAST_ARRAY ) ; function GET_FAST_ARRAY( M : MEMBLOCK; A,B,C : INTEGER ) return WHATEVER is begin return M( N1VEC( A ) + N2VEC( B ) + C ) ; end GET_FAST_ARRAY ; procedure SET_FAST_ARRAY( M : in out MEMBLOCK; a,b,c : INTEGER; INPUT : WHATEVER ) is begin M( N1VEC( A ) + N2VEC( B ) + C ) := INPUT ; end SET_FAST_ARRAY ; This just an algorithmic description, I haven't compiled it. But you get the idea. And this will work for column-major, row-major, 3rd-dimension-major, whatever. It's essentially just having a look-up table for frequently used multiplies. -- Dennis O'Connor oconnor%sungod@steinmetz.UUCP ARPA: OCONNORDM@ge-crd.arpa (-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)