Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!ames!think!barmar
From: barmar@think.COM (Barry Margolin)
Newsgroups: comp.lang.c
Subject: Re: pointer representation (was: Re: effect of free())
Message-ID: <29250@news.Think.COM>
Date: 11 Sep 89 20:06:32 GMT
References: <319@cubmol.BIO.COLUMBIA.EDU> <3756@buengc.BU.EDU> <29171@news.Think.COM> <2079@munnari.oz.au>
Sender: news@Think.COM
Organization: Thinking Machines Corporation, Cambridge MA, USA
Lines: 85

In article <2079@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
>In article <29171@news.Think.COM>, barmar@think.COM (Barry Margolin) writes:
>> If the optimizer for an architecture where pointers may compare equal
>> but not be completely equivalent were to perform such a
>> transformation, it would be a buggy optimizer.
>Yes, it would.  But the compiler writer would have been _seduced_ into
>that mistake by the standard.  People are encouraged to think of == as
>testing for EQUALITY.  In dpANS C, it appears that == does *NOT* have
>the properties of equality, and at the very least this needs to be said
>clearly and explicitly in the Rationale.

This is only true if there can actually be non-interchangeable
representations for pointers to the same location.  I'd expect the
compiler implementor for a system to know whether this is true, and
implement the optimizer accordingly.

By the way, since the difference between these representations is
outside the scope of the C standard, there's actually no reason, as
far as ANSI C is concerned, that such optimizations couldn't be done
(yes, I've changed my position from when I concluded it would be a bug).
In the case you described, the difference between using the two
pointers to the same object would be in whether the access gets a
memory access violation because of ring numbers in the pointer; but C
doesn't specify the behavior regarding memory access violations.  If a C
implementation does perform such a transformation, it may just not be
a good implementation to use for implementing certain inner-ring code
on such machines (although I'd expect the "volatile" modifier to be of
use in preventing such unwanted optimizations).  But it would still be
ANSI-conformant.
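
To make the transformation concrete, here is a minimal sketch in C.
The function name store_if_equal and its return convention are made up
for illustration; nothing here is taken from the standard or from any
particular compiler.  The comment marks where a value-tracking
optimizer could, hypothetically, substitute q for p.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store value through p, but only when p and q
   compare equal.  Returns 1 if the store happened, 0 otherwise. */
int store_if_equal(int *p, int *q, int value)
{
    assert(p != NULL && q != NULL);

    if (p == q) {
        *p = value;   /* as written: the access goes through p */
        /* An optimizer that tracks which locations hold the "same"
           value could emit  *q = value;  here instead.  On a flat
           address space nobody can tell the difference; on a machine
           with ring numbers in the pointer, the two accesses might
           trap differently. */
        return 1;
    }
    return 0;
}
```

On an ordinary flat-address machine, calling store_if_equal(&x, &x, 42)
sets x to 42 either way; the question in this thread only arises where
the two representations can differ.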

>> You appear to be
>> assuming a system-independent optimizer; what you've done is point out
>> why such a thing is not really a good idea.
>I am _sick_ of people trying to guess what I am assuming; they are _always_
>wrong.  (At least in this thread they have been.)  I wasn't assuming a
>system-dependent optimizer at all.  The point is that it is a standard
>technique described in most good compiler books to keep track of which
>locations hold the "same" value and to use the "cheapest" location with
>the desired value.  

But someone familiar enough with the architecture to be writing an
optimizer should know that "same" and "==" aren't necessarily the same
thing for pointers.  "Standard techniques" are fine, but they must be
understood in context.
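
The distinction between "==" and "same" can be modeled in a few lines
of C.  This is only a sketch under stated assumptions: struct ring_ptr
and both function names are hypothetical, and the layout is not modeled
on any specific machine.

```c
/* Hypothetical pointer representation: a ring number plus an address. */
struct ring_ptr {
    unsigned ring;        /* access ring carried in the pointer */
    unsigned long addr;   /* location referred to */
};

/* What == would plausibly test on such a machine: same location. */
int rp_equal(struct ring_ptr a, struct ring_ptr b)
{
    return a.addr == b.addr;
}

/* What an optimizer would need to establish before substituting one
   pointer for the other: every field matches, not just the address. */
int rp_interchangeable(struct ring_ptr a, struct ring_ptr b)
{
    return a.ring == b.ring && a.addr == b.addr;
}
```

Two values with the same addr but different ring fields satisfy
rp_equal and fail rp_interchangeable, which is exactly the case an
architecture-aware optimizer has to keep apart.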

>Anyone writing an optimising compiler for C who was
>not on the committee and hasn't been reading this thread, whether he is
>trying to out-do GCC or just generate code for the HAL Whizzbang 99 is likely
>to make the natural assumption that because == is _called_ an "EQUALITY
>operator" it has the properties of equality.  

Well, since I've reversed my position on whether that optimization is
valid (it's probably still not desirable, though), I don't see a major
problem with implementors who haven't read the thread.

>The bottom line for those of us who do not write compilers is that we
>should take care that two pointers are only compared when
>EITHER (a) the two cannot refer to the same object
>  OR   (b) the two must have been generated in the same way, so that they
>	   cannot have different access rights &c

Unfortunately, this may not be good enough.  What if the generated
code does the comparison and substitution without you writing the
comparison explicitly?  I doubt there's anything in the standard
prohibiting this.  I've never heard of an optimizer doing this, but I
don't read much compiler implementation literature; if I can conceive
of it, so can actual compiler researchers.

>and that a pointer must not be referred to in any way after it has been
>free()d.  Even ptr = NULL is too risky:  what if the compiler generates a
>sequence which uses a read-modify-write cycle with the old value being
>loaded into an address register?  -- remember how CLR works on a 68000...

Uh oh, how are you ever going to assign something to an automatic
pointer?  Their initial values are indeterminate, so they might contain
invalid addresses.  I suspect the standard requires assignments into
pointer variables to always work, regardless of the validity of the
previous value; the compiler will have to avoid generating
instructions that care what the old value is.
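
A short sketch of the two assignments in question, in C.  The function
names first_element and release are hypothetical, chosen only for this
example.  In both cases the source-level assignment stores a new value
without reading the old one.

```c
#include <stdlib.h>

/* Assigning to an automatic pointer that has never held a valid value. */
int first_element(void)
{
    int data[3] = { 10, 20, 30 };
    int *p;        /* automatic, no initializer: contents indeterminate */

    p = data;      /* the old contents of p are overwritten, never read */
    return *p;
}

/* Assigning to a pointer whose old value has just been free()d. */
int *release(int *ptr)
{
    free(ptr);
    ptr = NULL;    /* same kind of store: the stale value is not used */
    return ptr;
}
```

If ptr = NULL after free() were unsafe, the assignment p = data would
have to be unsafe for the same reason, since neither old value is
known to be a valid address.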

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar