Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!uunet!munnari.oz.au!goanna!ok
From: ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe)
Newsgroups: comp.lang.c
Subject: Re: low level optimization
Message-ID: <5475@goanna.cs.rmit.oz.au>
Date: 1 May 91 02:22:20 GMT
References: <1991Apr17.225944.15261@zoo.toronto.edu> <22839@lanl.gov>
Organization: Comp Sci, RMIT, Melbourne, Australia
Lines: 105

In article <22839@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
> That is exactly the point I made earlier about empowerment.  C allows
> both subroutines, but most implementations pessimize both func2 and
> func3 because they can't tell whether their argument is aliased to
> thier global or not.  Fortran also restricts the user, but not in
> speed.  The Fortran compiler optimizes both func2 and func3 as if
> no aliasing has occurred - but then, the call in this example which
> causes aliasing in func2 is _ILLEGAL_.  Since deliberate aliasing 
> of this kind is rare (or can be made rare by coding around it), the
> Fortran rule permits the user to get better code generated.  

Here I think we have come to the core of the problem.
Friends, let's face it, Jim Giles' main point (that many currently
available C compilers miss chances to generate better code because
they think pointers _might_ be aliased) IS VALID.  There are some
really good compilers out there (Plan 9 sounds _wonderful_) but there
are some not-so-good compilers,  and especially on the top end machines
that I suspect Jim Giles cares most about, Fortran compilers really do
generate much better code than C.  (There are other reasons for using
Fortran on top end machines.  For example, a certain mini-super
manufacturer making what's basically an improved DAP lets the number of
processors in the actual machine "show through" in C, but gives you
however many "logical processors" you want in Fortran.  Ok, fair enough
for C to have a way of getting at the hardware directly, but there is no
advantage to customers in forbidding "virtual processors" to C programmers.)

The key difference is
    -- C permits aliasing.  The price of this is that in the absence of
       good interfile analysis, you get inferior optimisation.  This is
       a Quality of Implementation issue, because interfile analysis
       _can_ be done without violating the standard.
    -- Fortran forbids aliasing.  The price of this is that in the
       absence of good interfile analysis, the fact that the code that
       has been generated is WRONG because there _is_ aliasing goes
       quietly un-noticed.

Let me show you a typical example of aliasing in C:

	struct ListNode { struct ListNode *next; int item; }

	void ListDelete(struct ListNode **p; int item)
	    {
		struct ListNode **v = p;
		struct ListNode *q;

		for (q = *p; q != NULL; q = q->next)
		    if (q->item != item)
			*v = q, v = &(q->next);
		*v = q;
	    }

	    ...
		ListDelete(&var);
	    ...

At various times we have *v and *p as names for the same variable,
**p and *q as names for the same variable, **v and *q as names for
the same variable, and so on.  This is aliasing.

C pointers would be almost useless for managing linked structures unless
"aliasing allowed" were the default.  (This is not a defence of C
pointers.  There may well be better ways of accomplishing the same ends.)
Fortran, however, _used_ not to have pointers.  I know that Fortran
Extended _has_ pointers, but not having a current draft, I don't know
what form they now take.  Can Fortran Extended pointers be used for this
sort of linked structure manipulation, and what does that do to Fortran's
prohibition of aliases?

It should also be borne in mind that the prohibition of aliases in the
Fortran standard used to be widely ignored by Fortran programmers.  For
example, consider matrix addition:

	subroutine matadd(a, b, c, m, n)
	    integer m, n
	    real a(m,n), b(m,n), c(m,n)
	    integer i, j
	    do 20 j = 1, n
		do 10 i = 1, m
		    a(i,j) = b(i,j) + c(i,j)
10		continue		    
20	    continue
	end

I've lost count of the number of books I've seen which tell you that
it is ok to call a subroutine like this as
	call matadd(a, a, c, m, n)
or	call matadd(a, b, a, m, n)
This is _precisely_ the kind of aliasing that Fortran prohibits.  The
effect of such a call is undefined.  A Fortran compiler is perfectly
entitled to generate code that starts out
	if (iaddr(a).eq.iaddr(b)) call system('rm *')

The corresponding C code is required to work.

If Jim Giles accompanied his criticism of C for allowing aliasing
with a swipe at Fortran compiler vendors whose products do not by
default warn users when they commit the "error" of aliasing, then
I'd be impressed by his fairness.

(For what it's worth, I have no objections to using Fortran, and 
would _love_ to have access to a Fortran Extended compiler.)

-- 
Bad things happen periodically, and they're going to happen to somebody.
Why not you?					-- John Allen Paulos.