Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!snorkelwacker!bloom-beacon!ora!minya!jc
From: jc@minya.UUCP (John Chambers)
Newsgroups: comp.unix.wizards
Subject: Re: What machines core dump on deref of NULL?
Message-ID: <418@minya.UUCP>
Date: 6 Jul 90 01:17:10 GMT
References: <444@mtndew.UUCP> <31079@cup.portal.com> <13226@smoke.BRL.MIL> <1990Jun29.132304.12550@athena.mit.edu>
Lines: 89

In article <1990Jun29.132304.12550@athena.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes:
> 
>   From K&R Second Edition, Page 102: "C guarantees that zero is never a
> valid address for data, so a return value of zero can be used to signal
> an abnormal event, in this case, no space."  ANSI C is a lot newer than
> "about a decade ago." 

And on page 192 of my C bible I find the paragraph:
	The compilers currently allow a pointer to be assigned to an integer,
	an integer to a pointer, and a pointer to a pointer of another type.
	The assignment is a pure copy operation, with no conversion.  This
	usage is nonportable, and may produce pointers which cause address
	exceptions when used.  However, it is guaranteed that assignment of
	the constant 0 to a pointer will produce a null pointer distinguish-
	able from a pointer to any object.

The trouble with this statement is that I've never seen a C compiler
that implements it.  On extant processors, it is simple to prove that
it can't be implemented.  If you examine any of the current commercial
processors (68xxx, 80x86, SPARC, MIPS, PDP11/VAX, etc.), you quickly
learn that all of them have the property that there is no bit pattern
that is guaranteed to cause a fault when used as an address.  True,
you can use the memory-management hardware to intercept attempts to
reference ranges of addresses, but this is a different issue.  The
memory-referencing hardware has no bit pattern that a compiler can
use as a "null" value, with the guarantee that its use as an address
will cause an interrupt under all circumstances.  All bit patterns
are legal (byte) addresses on these machines.

Yes, I'm aware that processors have been built that have a null pointer,
and some even have a bit pattern that is recognized as not-a-number by 
all the arithmetic opcodes.  I've written code (even assembly code) for
the Burroughs B5500 and B6700, for instance.  I also think that having
an explicit "illegal" value for all types is a Real Good Idea.  But in 
today's real world, the programmers who write the really low-level stuff 
rarely have the luxury of a well-designed processor; they are stuck with 
80386s and the like.  On such processors, C simply can't be implemented 
in conformance with such standards, no matter how much we'd like it.

>   It would seem to me that a simpler solution to the embedded processor
> problem than requiring a non-standard C compiler in order to write code
> for one would be to not have any physical memory at address 0, or to put
> program memory there (since, unless the program is self-modifying, it
> should never have to access its own memory, excluding perhaps function
> pointers).

So how do you get the code there?  

If I were designing my own processor, I'd probably try to make address 
zero illegal, since that would catch so many bugs during early testing.  
But I'm not, and I can't.  Given hardware that says that such-and-such 
is stored at address zero, you either write code that references address 
zero, or you hand the job over to someone else.  You also tend to use 
the C compilers that are available, since you need to get a product out 
the door, rather than wait until an acceptable compiler comes along and 
you get a signature on the purchase order.

BTW, perhaps this should be asked in comp.lang.c (though I recall it
being discussed a few years back, with much flamage but few answers);
can anyone show how one would portably code a statement that assigns
the value zero (not null) to a pointer?  If I am faced with hardware 
with a given structure in low memory, it'd help if I could declare:
	struct lowmem {
		...
	} *lowmem = 0;
and be guaranteed that it will point to the right place.  It'd also
be nice to be able to do the assignment at run time if necessary. 
As so many people have pointed out, the above initializer could legally
be implemented as whatever illegal value the compiler uses for null;
what I need is a way to guarantee that the pointer will be address
zero, regardless of what null is and whether zero is a legal address
at the moment.
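
A minimal sketch of the run-time case, under loud assumptions.  A
constant 0 in a pointer context yields the null pointer, whatever its
bits are, so the sketch avoids that and instead zeroes the pointer's
stored bytes.  Nothing in the language promises that an all-zero-bits
pointer means hardware address zero; that is an assumption about the
target, and it merely happens to hold on flat-address machines.  The
struct member and all three function names are invented for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the real one is dictated by the hardware. */
struct lowmem {
	unsigned long vector[8];
};

/* Return 1 if every byte of the n-byte object at obj is zero. */
static int all_bits_zero(const void *obj, size_t n)
{
	const unsigned char *b = obj;
	size_t i;

	for (i = 0; i < n; i++)
		if (b[i] != 0)
			return 0;
	return 1;
}

/*
 * Build a pointer whose stored representation is all-zero bytes, at
 * run time, without writing the constant 0 in a pointer context.
 * ASSUMPTION: on the target, that representation means address zero.
 * The pointer is only returned here, never dereferenced.
 */
static struct lowmem *lowmem_at_zero(void)
{
	struct lowmem *p;

	memset(&p, 0, sizeof p);
	assert(all_bits_zero(&p, sizeof p));
	return p;
}

/* Report whether this compiler's null pointer is all-zero bits. */
static int null_is_all_bits_zero(void)
{
	struct lowmem *np = 0;	/* constant 0: null, whatever its bits */

	return all_bits_zero(&np, sizeof np);
}
```

Where null_is_all_bits_zero() returns 1, lowmem_at_zero() gives the
same pointer as writing "= 0" and buys nothing.  Where it returns 0,
the two pointers differ, which is exactly the case the question is
about.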

BTW, this issue could come up for some people working in the Unix
kernel.  Unix isn't immune from the clever ideas of the hardware
implementors.  Very often the interrupt vector table is in low
memory, and you just might find yourself someday working with a
kernel that allows interrupt routines to be added and deleted from
a running system.  (After all, VMS can do it, as can DOS.)  How
do you plan to plug in a new level-0 interrupt routine on this
system, if you can't write to location zero?  As any 80x86 hacker
will tell you, ranting about the idiocies of the design won't help 
you get the system out the door (though it surely does feel good 
at times ;-).
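
A rough sketch of what plugging in a level-0 routine might look like,
assuming the vector table is a plain array of handler addresses.  The
table shape, NVECTORS, handler_t, set_vector() and dummy_handler() are
all hypothetical.  The table base is passed in as a parameter, so the
logic can be tried on an ordinary array; on the real machine the base
would be address zero, obtained by whatever non-portable means the
compiler allows.  Masking interrupts around the update is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_t)(void);

#define NVECTORS 8	/* hypothetical number of slots in the table */

/* Hypothetical do-nothing interrupt routine, used only as a stand-in. */
static void dummy_handler(void)
{
}

/*
 * Install fn in slot `level` of the table starting at `base`, storing
 * the previous handler in *old.  Returns 0 on success, -1 if level is
 * out of range.  Note that base is deliberately NOT compared against
 * NULL: on the machines in question the table base may be address
 * zero, which could compare equal to the null pointer.
 */
static int set_vector(handler_t *base, size_t level, handler_t fn,
		      handler_t *old)
{
	assert(old != NULL);	/* caller must supply storage for *old */

	if (level >= NVECTORS)
		return -1;
	*old = base[level];
	base[level] = fn;
	return 0;
}
```

Tried against an ordinary array, set_vector(table, 0, dummy_handler,
&old) replaces slot 0 and hands back whatever was there before.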

-- 
Typos and silly ideas Copyright (C) 1990 by:
Uucp: ...!{harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc (John Chambers)
Home: 1-617-484-6393
Work: 1-508-952-3274