Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!cs.utexas.edu!uunet!auspex!guy
From: guy@auspex.auspex.com (Guy Harris)
Newsgroups: comp.lang.c
Subject: Re: Another silly question
Message-ID: <1677@auspex.auspex.com>
Date: 22 May 89 18:13:48 GMT
References: <17812@cup.portal.com> <607@kl-cs.UUCP> <749@mccc.UUCP> <17635@mimsy.UUCP> <756@mccc.UUCP>
Reply-To: guy@auspex.auspex.com (Guy Harris)
Organization: Auspex Systems, Santa Clara
Lines: 47

>Perhaps I've asked the wrong question.  I saw a couple of simple test
>programs that assigned 0 to each member of an array.  One used array
>subscript notation, and the other, pointer notation.  I compiled these
>on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
>of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
>Greenhills on the 386).  I ran each version and timed the execution. 
>The subscript versions had different run times from the pointer versions
>(some slower, some faster!).  I assumed - perhaps naively - that the
>differences were caused by differences in code produced by the different
>compilers (and of course the hardware differences).  Was that wrong? 
>How does one account for the differences?

Well, if the program that used subscript notation was something like:

	for (i = 0; i < LEN; i++)
		a[i] = 0;

and the program that used pointer notation was something like:

	p = &a[0];
	while (p < &a[LEN]) 
		*p++ = 0;

the answer has nothing whatsoever to do with the equivalence of "a[i]"
and "*(a + i)", since the second program doesn't use "*(a + i)" at all,
so yes, you did ask the wrong question.

It has, instead, to do with the fact that the equivalence of those two
loops is not as trivial as the equivalence of "a[i]" and "*(a + i)",
and therefore it is less likely that a given compiler will generate the
same code for both.  There may well be compilers that *do* generate the
same code for them - rewrite the first loop as:

	for (i = 0; i < LEN; i++)
		*(a + i) = 0;

and then note that on most architectures, this requires that the value
in "i" be multiplied by "sizeof a[0]" before being added to the address
of "a[0]".  Do a strength reduction on that multiplication, so that a
running pointer is bumped by "sizeof a[0]" on each pass; rewrite the
loop test in terms of that pointer; you then find the induction
variable "i" no longer used, and eliminate it.  By the time the smoke
clears you have the loop in the first example generating the same code
as the loop in the second example.  (I don't know whether there are any
compilers that do this or not.)
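
Spelled out by hand in C, the chain of transformations might look
something like the sketch below.  This is only an illustration of the
idea, not the output of any particular compiler; the function names and
the value of LEN are made up for the example.

```c
#include <stddef.h>

#define LEN 8	/* arbitrary length, chosen for illustration */

/* Step 0: the subscript loop as originally written. */
void zero_subscript(int *a)
{
	int i;

	for (i = 0; i < LEN; i++)
		a[i] = 0;
}

/* Step 1: the address arithmetic hidden in a[i], made explicit.
   Each pass multiplies i by the element size. */
void zero_explicit(int *a)
{
	int i;

	for (i = 0; i < LEN; i++)
		*(int *)((char *)a + i * sizeof a[0]) = 0;
}

/* Step 2: strength reduction.  The multiplication is replaced by a
   running byte offset that is bumped by the element size. */
void zero_reduced(int *a)
{
	int i;
	size_t off = 0;

	for (i = 0; i < LEN; i++) {
		*(int *)((char *)a + off) = 0;
		off += sizeof a[0];
	}
}

/* Step 3: the loop test is rewritten in terms of the running
   address, after which i is dead and can be eliminated.  This is
   the pointer loop from the second example. */
void zero_pointer(int *a)
{
	int *p = a;
	int *end = a + LEN;

	while (p < end)
		*p++ = 0;
}
```

Each of the four functions leaves the same LEN elements set to zero;
only the amount of work visible in the source for each pass through the
loop differs.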

If the code generated for the two constructs is different, that could
account for performance differences.
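
If you want to repeat the timing experiment, a rough harness along the
following lines would do.  It is a sketch only: the function names are
invented for the example, and it assumes that the CPU time reported by
"clock()" is good enough for the purpose.

```c
#include <stddef.h>
#include <time.h>

/* Subscript version of the loop. */
void zero_by_subscript(int *a, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		a[i] = 0;
}

/* Pointer version of the loop. */
void zero_by_pointer(int *a, size_t len)
{
	int *p = a;
	int *end = a + len;

	while (p < end)
		*p++ = 0;
}

/* Returns the CPU time, in seconds, taken by reps calls of fn on
   a[0] through a[len - 1].  (time_loop is a hypothetical helper.) */
double time_loop(void (*fn)(int *, size_t), int *a, size_t len, long reps)
{
	clock_t start, stop;
	long r;

	start = clock();
	for (r = 0; r < reps; r++)
		fn(a, len);
	stop = clock();
	return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Call "time_loop(zero_by_subscript, a, len, reps)" and
"time_loop(zero_by_pointer, a, len, reps)" on the same array and
compare the results.  An optimizing compiler may rearrange or collapse
the repeated calls, so treat the numbers with suspicion; looking at the
generated assembler code (on most UNIX compilers, "cc -S") is the more
direct way to see whether the two loops really compile differently.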