Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!snorkelwacker.mit.edu!bloom-picayune.mit.edu!news
From: scs@adam.mit.edu (Steve Summit)
Newsgroups: comp.lang.c
Subject: Re: NULL question not in FAQ
Message-ID: <1991Mar28.071834.27272@athena.mit.edu>
Date: 28 Mar 91 07:18:34 GMT
References: <1991Mar26.235643.4498@ux1.cso.uiuc.edu>
Sender: news@athena.mit.edu (News system)
Reply-To: scs@adam.mit.edu
Organization: Thermal Technologies, Cambridge, MA
Lines: 97

In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is [not necessarily a] NULL pointer?
>
>For low level code, e.g. drivers and such, where portability and machine
>independence are not issues, it would still be nice to be able to cast
>address values into pointers as desired.

Indeed.  In fact, K&R1 stated that "the mapping function [between
integral types and pointers] is... machine dependent, but is
intended to be unsurprising to those who know the addressing
structure of the machine."  (Section A14.4, p. 210.)  Of course,
in these more modern and civilized times, wanton conversions
between pointers and integers are more strongly discouraged, and
the "intended to be unsurprising" language does not appear in K&R2.

>Some machine types might very
>well have special functionality at address 0x00000000 and uses some other
>form of address for a NULL.

While such machines are certainly possible (and, with time,
perhaps more so), I would hazard a guess that they are rare at
best.  If a machine is intended to be manipulated directly via
absolute addresses, the manufacturer (and compiler architect)
will presumably make it convenient to do so.  In effect, the old
"intended to be unsurprising" language is likely to be observed.

I can think of three general ways to get a pointer that really
points at address zero.  (All are, of course, unportable, but
that's obviously okay here.)

     1. char *p = (char *)0;

     2. int zero = 0;
        char *p = (char *)zero;

     3. char *p;
        memset((char *)&p, 0, sizeof(char *));

(Chris and Henry, and others, have already described several
variations on methods 2 and 3.)
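
For anyone who wants to try the three methods side by side, here is a
minimal sketch.  The function names and the has_all_zero_bits() helper
are made up for illustration.  Method two uses uintptr_t instead of a
plain int so that the conversion does not change width on 64-bit
machines; that is a modernization, not part of the fragments above.
Nothing here dereferences the resulting pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte of the pointer's stored
   representation is zero, 0 otherwise.  It never dereferences p. */
int has_all_zero_bits(char *p)
{
    unsigned char bytes[sizeof p];
    size_t i;

    memcpy(bytes, &p, sizeof p);
    for (i = 0; i < sizeof p; i++)
        if (bytes[i] != 0)
            return 0;
    return 1;
}

/* Method 1: the source-code null pointer constant.  This is address 0
   only if the machine's null pointer happens to be address 0. */
char *addr0_from_constant(void)
{
    return (char *)0;
}

/* Method 2: run-time integer-to-pointer conversion.  The volatile
   qualifier is there to discourage the compiler from folding the value
   back into a compile-time constant. */
char *addr0_from_integer(void)
{
    volatile uintptr_t zero = 0;

    return (char *)zero;
}

/* Method 3: fill the pointer object itself with zero bytes. */
char *addr0_from_memset(void)
{
    char *p;

    memset(&p, 0, sizeof p);
    assert(has_all_zero_bits(p));   /* true by construction */
    return p;
}
```

On a machine whose null pointers are all-bits-zero, has_all_zero_bits()
should report 1 for all three results.  Only method three guarantees an
all-zero representation everywhere, and even then reading such a
pointer is only as meaningful as the machine makes it.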

Number one is the most obvious (and easiest) technique; there's
nothing "magic" or surprising about it at all.  It simply assumes
that the internal representation of a null pointer really is
address 0.  (This is, after all, a safe assumption for many
machines, frequent exhortations here to the contrary
notwithstanding.)

Number two is a bit safer, because it dodges the source code null
pointer constant rules in favor of run-time int->pointer
conversion rules, which are more likely to behave as intended, as
long as the compiler author happens to have heard or thought of
something equivalent to the old K&R1 "intended to be unsurprising"
advice.  (On a machine with nonzero internal null pointers,
number two wouldn't work if the compiler writer tried to be
"helpful" by making runtime int->pointer coercions, for zero
values, mimic compile-time null pointer generation.)
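
Whether method two yields something different from a null pointer on a
given compiler can be probed directly.  The sketch below is only an
illustration, with made-up function names.  It assumes uintptr_t is
available, and it can't tell a machine whose null pointer really is
address 0 from a compiler being "helpful" in the sense just described;
both report 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a run-time zero, converted to a pointer, compares equal
   to the null pointer; returns 0 if the two are distinct.  The volatile
   qualifier keeps the zero from being treated as a constant. */
int runtime_zero_converts_to_null(void)
{
    volatile uintptr_t zero = 0;
    char *from_runtime = (char *)zero;

    return from_runtime == NULL;
}

/* Optional hard check for code that relies on address 0 and the null
   pointer coinciding (the assumption behind method one). */
void require_zero_is_null(void)
{
    assert(runtime_zero_converts_to_null());
}
```

A result of 0 would mean the machine has nonzero null pointers and the
compiler converts run-time integers literally, which is exactly the
case where method two is needed.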

Number three (including analogous tricks using unions) is
probably the safest pure-C approach, but it is somewhat clumsy.
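
One way the union variant might look is sketched here; the union and
function names are hypothetical.  The idea is the same as the memset
version: write zero bytes through an unsigned char view, then read the
same storage back as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two views of the same storage: a pointer and its raw bytes. */
union ptr_bytes {
    char *ptr;
    unsigned char bytes[sizeof(char *)];
};

/* Builds a pointer whose representation is all zero bytes. */
char *addr0_from_union(void)
{
    union ptr_bytes u;
    size_t i;

    /* The byte view must cover the whole pointer. */
    assert(sizeof u.bytes == sizeof u.ptr);

    for (i = 0; i < sizeof u.bytes; i++)
        u.bytes[i] = 0;
    return u.ptr;
}

/* Hypothetical helper: counts the nonzero bytes in a pointer's
   representation without dereferencing it. */
size_t nonzero_bytes(char *p)
{
    unsigned char raw[sizeof p];
    size_t i, count = 0;

    memcpy(raw, &p, sizeof p);
    for (i = 0; i < sizeof p; i++)
        if (raw[i] != 0)
            count++;
    return count;
}
```

Compared with memset, the union makes the type pun visible in the
declaration, at the cost of an extra type.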

(Number four, not shown here, is to use an auxiliary assembly
language file, as Chris described, which is the very safest
technique.  "If you want assembly language, you know where to
find it.")

I always use technique one, if it works.  (So far, it always
has, for me.)

Falling-all-over-myself-disclaimer:  Obviously, these techniques
are all unportable, shamelessly violating both the information-
hiding intent of source-code null pointer constants, and also
everything comp.lang.c has been trying to teach anybody about
null pointers.  However, since accessing (for instance) a non-
maskable or power-up interrupt vector at location zero is
inherently unportable, it's obviously acceptable if unportable
code is used to do it.  (Phil understands this; I'm just
re-emphasizing the point.)

If you choose one of the simpler (but less safe) techniques for
accessing location 0 (and I wouldn't blame you for doing so),
you'll have to document that decision and be aware of future
compiler releases which might do things differently.
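
One possible way to document the decision and catch a compiler that
changes its mind is sketched below.  ADDRESS_ZERO and both function
names are invented for this example.  The check inspects only the
stored representation of the pointer and never touches location 0.

```c
#include <assert.h>
#include <string.h>

/*
 * UNPORTABLE BY DESIGN.  ADDRESS_ZERO uses technique one: it assumes
 * this compiler represents the null pointer as machine address 0.
 * Recheck that assumption whenever the compiler or target changes.
 */
#define ADDRESS_ZERO ((volatile unsigned char *)0)

/* Returns 1 if ADDRESS_ZERO is stored as all-zero bits, 0 otherwise. */
int address_zero_is_all_zero_bits(void)
{
    volatile unsigned char *p = ADDRESS_ZERO;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Call once at startup; aborts (unless NDEBUG is defined) if the
   assumption behind ADDRESS_ZERO no longer holds. */
void require_address_zero_assumption(void)
{
    assert(address_zero_is_all_zero_bits());
}
```

If the check ever fails after a compiler upgrade, that is the cue to
switch to one of the safer techniques.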

>Can someone summarize this, depending on what the real answers are, and
>include it in the FAQ in the section on NULL?  This might clear up (or
>confuse further) the distinction of NULL.

It's certainly true that broaching this essentially taboo subject
risks further confusion, and the question doesn't come up very
often, but it's worth a (small) mention in the FAQ list.

                                            Steve Summit
                                            scs@adam.mit.edu