Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!uunet!mcsun!hp4nl!charon!dik
From: dik@cwi.nl (Dik T. Winter)
Newsgroups: comp.lang.misc
Subject: Re: Arrays in languages (was: Anyone want to design a language?)
Message-ID: <8849@boring.cwi.nl>
Date: 28 Feb 90 02:22:11 GMT
References: <3528@tukki.jyu.fi> <14251@lambda.UUCP> <8836@boring.cwi.nl> <14255@lambda.UUCP>
Sender: news@cwi.nl (The Daily Dross)
Organization: CWI, Amsterdam
Lines: 152

In article <14255@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
 > In article <8836@boring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes:
 > > In article <14251@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
 > >  >            In fact, for programs of any really useful size, passing
 > >  > array arguments is vital.  In this respect, at least, C really _DOESN'T_
 > >  > have arrays.
 > >  > 
 > > But in this aspect Fortran (at least upto and including 77) does not have
 > > arrays either.  Let's compare some languages (for a matrix-vector product
 > > routine*):
Note (I will come back to this later) that the issue is the support of arrays.
 > 
 > You are making two errors here.  The first is to misrepresent Fortran (see
 > below).
Perhaps; but I do not think so.
 >          The second is to assume that Fortran is a particular favorite of
 > mine.  Fortran has many faults.  Although none of them are quite as serious
 > as those of C, the language is in need of many improvements.
I never assumed anything, I just did a comparison.  And I really do not
know whether Fortran is inherently better than C.
 > 
 > Since you quoted my statement out of context (without even using elipses)
 > you missed an important part of what I said.
We are in the same boat I think.
 >                                               C converts arrays into 
 > pointers AND provides no way to undo the damage!! ** _SOME_ Fortran
 > environments (alright: most) do convert arrays to pointers - unaliased
 > pointers.  All fortran environments not only _allow_ you to redeclare
 > your arrays as such in the procedure - but require it!  Only the last
 > dimension of an array argument may be left unspecified.
C provides a way to undo the damage which is no more proof against abuse
than the Fortran way; again, see below.  And I know a lot of Fortran
environments that do allow you to redeclare your arrays in a procedure
as whatever you want.  I know this is against language rules, but ...
In fact there is no more and no less safety in C than in Fortran.  In C,
when an array is passed to a procedure, the procedure receives a pointer.
You can access through that pointer only elements of the original array;
everything else is against the standard (at least ANSI, and disregarding
the possibility to calculate the address just beyond the last element).
The guarantee of unaliased pointers in Fortran is fine, but is not the
essence of support for arrays.  There are languages that have fuller
support for arrays than Fortran and that disallow aliases (Ada) and
others that allow aliases (Algol 60).  Note that in my original article
I hinted at the possibilities for optimization when aliasing is not
allowed.
 > 
 >    **Note: Fortran is deliberately designed so that parameter passing
 > (including arrays) can be carried out as either call-by-reference or
 > as call-by-value/result.  So, in a distributed environment, copy-in/out
 > may actually be the _most_ efficient parameter passing mechanism and
 > Fortran is allowed to use it.  This dual implementation possibility is
 > why aliasing is prohibited in Fortran (at least, the original reason -
 > better optimization is a _real_ good reason to keep the prohibition).
Yup, the same applies to Ada.  But they made a clean breast of it.  And
they require call-by-value/result for scalars.
 > 
 > > [...]
 > > Fortran:
 > > 	      SUBROUTINE MATVEC(M, V, W, L1M, U1M, L2M, U2M)
 > > [...]
 > > 	C     NO BOUND CHECKING POSSIBLE
 > 
 > This is false.  I haven't ever worked with a Fortran environment which
 > didn't do array bounds checking (at least as an option).  Of course, the
 > compiler _does_ have to just take the caller's word for it that the array
 > bounds that he passed are correct.  In fact, the compiler has to take the
 > caller's word for it that the array really has as many dimensions as the
 > code claims.
But if the compiler has to assume things, it is clear no checking can be
performed!  So I stand by my claim, no bound checking is possible, except
for local accesses against the locally declared bounds of course, but that
is bogus.  I know, the routine is correct, but is it called correctly?
(ellipses for once)
 > ...
 > > C (Fortran style (you can write Fortran in any language!)):
 > ...
 > > 		/* No bound checking possible. */
 > 
 > Again false.  But this time, the bounds checking must be done 'by hand'.
Again it is clear that you think 'bounds checking' is only local; it is not.
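
To illustrate the distinction, a minimal sketch of a "locally checked"
access.  The function name and its interface are hypothetical, chosen
only for this example.  The callee can test a subscript against the
bounds it was told, but it has no means of testing whether those bounds
describe the actual array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked read of v[i], where the caller CLAIMS that v has
   subscripts lo..hi.  Returns 0 and stores the element in *out when i is
   inside the claimed bounds, -1 otherwise. */
int checked_get(const double *v, long lo, long hi, long i, double *out)
{
    assert(v != NULL && out != NULL);
    if (i < lo || i > hi)
        return -1;          /* caught: outside the bounds we were told */
    *out = v[i - lo];       /* trusted: valid only if lo..hi was the truth */
    return 0;
}
```

With a three-element array a, a call such as checked_get(a, 1, 10, 7, &x)
passes the local check and still reads outside a.  Nothing inside the
callee can notice that.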
 > ...
 >                                                          In C, you also
 > have to do all the subscripting 'by hand' - which involves, at least,
 > an extra macro definition for each array argument.
What is the difference with a DIMENSION statement?  I know the syntax is
a bit less clear; but effectively they are the same.
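
A minimal sketch of what such a macro might look like.  This is an
illustration only, not the routine from the earlier article.  The
parameter names follow the MATVEC header quoted above; the column-major
layout and the zero-based storage of v and w are assumptions made here.

```c
#include <assert.h>
#include <stddef.h>

/* Plays the role of "DIMENSION M(L1M:U1M, L2M:U2M)": maps a
   two-subscript reference onto flat, column-major storage. */
#define M(i, j) m[((j) - l2m) * (u1m - l1m + 1) + ((i) - l1m)]

/* w := M * v, with v subscripted l2m..u2m and w subscripted l1m..u1m
   (both stored starting at offset 0). */
void matvec(const double *m, const double *v, double *w,
            long l1m, long u1m, long l2m, long u2m)
{
    long i, j;
    assert(m != NULL && v != NULL && w != NULL);
    for (i = l1m; i <= u1m; i++) {
        double s = 0.0;          /* inner product of row i with v */
        for (j = l2m; j <= u2m; j++)
            s += M(i, j) * v[j - l2m];
        w[i - l1m] = s;
    }
}

#undef M
```

The single #define carries the same information as the DIMENSION
statement: the bounds of each dimension, as supplied by the caller.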
 >                                                     (Some C users claim
 > that using macros for this is inefficient and that you should always
 > do the subscript expressions explicitly - presumably because many C
 > compilers miss the strength reduction on the multiplies.
This is a bogus argument.  I know a Fortran compiler that is not able to
compile DSQRT correctly.  Do I avoid all occurrences of DSQRT?  No, I
yell at the supplier.  (At least, I would yell if it were worth the
hassle, but considering the number of bugs in that compiler...)
 >                                                           I wonder if
 > Piercarlo Grandi ever uses cars or elevators or if his brand of do-it-
 > the-hard-way masochism is limited to programming only :-)
Let's avoid Piercarlo Grandi.
 >                                                            Finally,
 > there is no way in C to tell the compiler that your pointers (I mean
 > arrays) are not aliased - here is damage which simply can't be undone.
But that is of course a whole different can of worms.
 > ...
I agree on this.  (Oh, it is about RESHAPE and IDENTIFY and so on.)
 > ...
(About generating equivalent code.)
Again aliasing is a problem.  But the code I presented has no problems
with aliases unless used on multi-processor systems (only inner products
are used, so vector processors have no problems, unless they are not good
at inner products, and some are not).

Summarizing:  the support for arrays in C is only slightly worse than in
Fortran.  Problem areas are:
1.  Aliasing (as discussed ad nauseam).
2.  The pre-ANSI requirement to do everything in double precision, which no
    -f flag is able to annihilate (strangely enough, not yet discussed in
    this thread).

Many C users think that aliasing can be detected at run time, so that
multiple versions of the code can be generated depending on whether an
alias is present or not.  This is not true.  If two pointers point into
different arrays one may not compare them.  If they point into the same
array one may compare them, but there are (generally used) access methods
where a complete detection of the absence of aliasing takes as much time
as the operation itself.
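
A sketch of why complete detection can be that expensive.  Strided
access is assumed here as the access method (the text above does not
say which methods are meant), and the struct and function names are
hypothetical.  Both operands are described by subscripts into one common
base array, so no pointers into different arrays are ever compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: n elements of one common base array,
   at subscripts off, off + stride, off + 2*stride, ... */
struct slice {
    size_t off;
    size_t stride;   /* must be > 0 */
    size_t n;
};

/* Returns 1 if the two slices share at least one element, 0 otherwise.
   Walks every element of a, so it costs on the order of a.n steps. */
int slices_alias(struct slice a, struct slice b)
{
    size_t k;
    assert(a.stride > 0 && b.stride > 0);
    for (k = 0; k < a.n; k++) {
        size_t pos = a.off + k * a.stride;
        if (pos >= b.off
            && (pos - b.off) % b.stride == 0
            && (pos - b.off) / b.stride < b.n)
            return 1;
    }
    return 0;
}
```

The even and the odd elements of one array interleave without sharing
an element, so a cheap "do the address ranges overlap?" test would
report an alias where there is none.  This naive scan is as long as a
vector operation on the slice.  For access through an index vector every
index has to be inspected in any case.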

Many Fortran users think that aliasing is allowed and hence do some
defensive programming against it (or even state in the documentation that
aliasing is allowed).  This too is impossible.  Given that two parameters
are (against the rules) aliases of each other, there is no way to detect
that.  A simple case I encountered was in a long integer package.  The
routine to subtract numbers 'SUB(A, B, C)' took two input arrays A and B
and one output array C.  An attempt was made to detect whether A and B
were the same.  The test failed on at least one system.

You may now ask: "what is your favorite language?"  I would answer that I
do not know.  I use the language available to perform the task that has to
be done.  All languages I used/use have their defects.  I think there is
nothing to be done about that; I am not in the field of Herman Rubin who
proposes a universal language encompassing all current and future hardware
instructions directly available.  I will do assembler without complaining
if need arises (or even if there is no need, but then just for fun).
But, for numerical processes, Pascal is inadequate, unless it is an
implementation with conformant arrays (Level 1 of ISO Pascal); Fortran
and C are semi-adequate.  Algol 68 is very nice, but not readily available
(Herman Rubin ought to learn it and get a compiler; he could get all
the infix operators he wants), and Algol 60 is pretty good of course.
And, need I mention Ada?  Yes, she has her good points, but of course
also her defects.
-- 
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl