Path: utzoo!utgpu!watserv1!watmath!att!att!pacbell.com!ucsd!swrinde!zaphod.mps.ohio-state.edu!rpi!sci.ccny.cuny.edu!phri!cmcl2!kramden.acf.nyu.edu!brnstnd
From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.lang.fortran
Subject: Re: Fortran vs. C for numerical work (SUMMARY)
Message-ID: <2098:Nov2902:40:0790@kramden.acf.nyu.edu>
Date: 29 Nov 90 02:40:07 GMT
References: <7097@lanl.gov> <17680:Nov2806:04:1090@kramden.acf.nyu.edu> <7928@uwm.edu>
Organization: IR
Lines: 91

Here's how to set up a multidimensional array if you care about speed.

Let x be an array of foo[m][n]...[p]. Store it as a contiguous set of
m*n*...*p elements, with the [i][j]...[k] element at position
base + i+m*(j+n*(...(k)...)). This is done the same in C, Fortran, and
machine language.

To move along a row or column, you need to add or subtract multiples of
successive products of dimensions. So store m, m*n, and so on. This is
done the same in C, Fortran, and machine language.

Now here's where pointers start becoming useful and C begins to shine.

To find a random element of the array (certainly not as common as moving
by step, but still a large portion of the time on vector machines), you
need to calculate the position from the base of the array as above. Say
you need to calcalate the corner of the hyperplane with fixed i and j,
i.e., base + i + m * j. Store base + m * j in an array of pointers
indexed by j. Then finding the location of the corner takes one
addition, one memory access, and another addition, where the original
took two additions and a multiplication. On most machines available
today, a memory access (particularly a cached memory access, as this is
almost certain to be) is faster than a multiplication.

You can store that extra array of pointers in C or in machine language.
But you can't do it in Fortran. The best you can do is store (e.g.) m*j
in an array. But then you have an extra addition, and I haven't seen any
Fortran compiler smart enough to replace your array of integer offsets
with the corresponding array of pointers. So you lose speed. (Jim, do
any of your compilers manage this trick?)

In article <7928@uwm.edu> tong@convex.csd.uwm.edu (Shuk Y Tong) writes:
> In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >Fortran can't deal with a double-pointer array efficiently because it
> >doesn't have pointers. Simulating pointers efficiently is what I was
> >calling difficult. Do you disagree?
> Yes. First of all, arrays are the most common data structure in
> scientific computing. By not providing a generic array type in C(i.e.,
> one which can have a variable in its declaration in functions),
> it basically says C is not for numerical work.

That is a religious argument. How do you expect anyone to respond to it?
It's based on faith, not reason, so it can't be argued with.

> As for
> two-dim arrays, in C, it has to be simulated by a pointer array
> which is clumsy (it becomes clumsier for 6-dim arrays) and 
> runs much SLOWER on vector machines.

Would you like to run some benchmarks on a Convex?

Just because I use pointers doesn't mean I have to give up the
advantages of contiguous arrays.

> The reason
> is simply the compiler is unable to tell where each element a[i][j]
> points to. It might be posible to put tons of directives to tell
> the compiler what is going on, but why bother when you can do
> it by using a(i,j) alone ?

The Convex compiler can understand a linear expression in several
variables. It can reduce around it. In fact, I haven't seen a single
optimizing compiler that doesn't know at least a little bit about this
optimization (now commonly known as ``strength reduction'').

I don't have to ``put tons of directives to tell the compiler what is
going on''; I don't have to put even a single directive. The compiler
knows what is going on. Upon what do you base your claim?

> the CPU hog in scientifc 
> computing is LOOPS with ARRAYS in it. When a(i,j) is within a loop, 
> its address usually can be gotten by an addition (a constant stride 
> or an increment, depending on which index runs faster).

That's quite correct. There's nothing in using pointers that prohibits
you from doing exactly this.

If, in fact, moving through an array by step were the only operation,
there wouldn't be any point in having the extra pointer tables. But you
have to locate a position in an array before you zoom through a row or
column. Since zooming is twenty times faster on a Cray-style vector
machine, locating random spots suddenly becomes a noticeable percentage
of a program's run time. That's what you've failed to notice.

> Some of the common functions, like sqrt etc., is generated inline
> by a Fortran compiler, but no way of doing it in C.

This issue has been quite thoroughly addressed by previous articles in
this thread.

---Dan