Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!greg
From: greg@utcsri.UUCP
Newsgroups: comp.lang.c
Subject: Re: char (*a)[] (was: Style [++i vs i++])
Message-ID: <5391@utcsri.UUCP>
Date: Wed, 31-Dec-69 18:59:59 EDT
Article-I.D.: utcsri.5391
Posted: Wed Dec 31 18:59:59 1969
Date-Received: Sun, 13-Sep-87 17:35:18 EDT
References: <8298@brl-adm.ARPA> <587@cblpe.ATT.COM> <189@xyzzy.UUCP> <2310@mmintl.UUCP> <871@mcgill-vision.UUCP> <2348@mmintl.UUCP> <253@xyzzy.UUCP>
Reply-To: greg@utcsri.UUCP (Gregory Smith)
Organization: CSRI, University of Toronto
Lines: 95
Summary:

In article <253@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes:
>> franka@mmintl.UUCP (Frank Adams)
>> So does arithmetic on a null pointer produce undefined results? I don't
>> have a copy of the proposed standard available, so I don't know what it
>> says. This is about what it *should* say; if it doesn't, it should be
>> changed.
...
>on this point is that the standard does *NOT* say that arithmetic on the
>null pointer produces undefined results (contrary to my own
...
>> [it should be undefined because...]
>> So we have the general principle that pointer arithmetic should not be able
>> to adjust the value of the pointer outside the guaranteed "neighborhood" of
>> legal values near that pointer.
>> In the case of a null pointer, the only legal value in that neighborhood
>> is null itself; thus "(char *)0 + 1" produces an undefined result.
>> ("(char *)0 + 0" would be legal, and equivalent to "(char *)0".)

I agree that it should be illegal to do any kind of arithmetic on a null
pointer of any type.

All this stuff gets very interesting when you consider what happens on an
80286 in its native 'protected' mode (as opposed to the 'fast 8086' mode in
which most of them are warming their sockets). In this mode, a pointer is
32 bits; a 16-bit segment number, and a 16-bit offset.
It is meaningless to do arithmetic on the segment number since it is just
an index into a table maintained by the OS. Pointer arithmetic as we know
it in C affects only the offset.

The CPU supports a 'null' pointer as follows: any pointer whose segment
part is zero is considered a null pointer. It is not legal to dereference
such a pointer, and it is not legal to load one into the stack-pointer
register pair (SS:SP) or the program-counter register pair (CS:IP).
Violations cause hardware traps. It is legal to load a null pointer as a
'data pointer'. (What this really means is that you can put 0 into DS and
ES but not CS or SS.) Since arithmetic leaves the segment part alone, the
code for incrementing a pointer, when given a null pointer, will always
produce a null pointer.

The other weird bit concerns the range of these pointers. The compiler may
assign a separate segment for every data object. The segment has a size,
and any reference to that segment beyond this size causes a trap. If I
declare 'int foo[10]', I may get a 20-byte segment for foo (ints being
16 bits). Then &foo[10] is a pointer which is illegal to dereference. This
is good. There are lots of bits of code like this:

    for( p = foo; p < &foo[10]; ++p ){

which cause p to be repeatedly compared to a constant invalid pointer until
it becomes an invalid pointer itself. I can live with that.

What gets a little weird is this: pointer inequalities are done by
comparing only the offset part, since the comparison is invalid anyway if
the segment numbers are different. Also, offset arithmetic is done in 16
bits. This means that &foo[-1] is not only an invalid pointer, but it will
be 'greater than' &foo[0], since it will have an offset of 0xfffe. What
this means is that the following won't work:

    for( p = &foo[9]; p >= foo; --p ){      /* loops forever */

Furthermore, if I declare a 64K segment ( int foo64[32768] ), the
(overflowed) value of &foo64[32767] + 1 is the same as &foo64[0].
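
The wraparound described above can be imitated on any machine. What follows
is only a sketch, not real 286 code: the farptr struct, far_is_null,
far_add, far_less and the selector value used with them are all made up for
illustration, and the element size is passed in explicitly (2 bytes for an
int on the 286).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 286 protected-mode pointer (illustration only). */
typedef struct {
    uint16_t seg;   /* segment number: an index into an OS-maintained table */
    uint16_t off;   /* byte offset within that segment */
} farptr;

/* A pointer is null when its segment part is zero, whatever the offset. */
static int far_is_null(farptr p)
{
    return p.seg == 0;
}

/* Pointer arithmetic changes only the offset and wraps modulo 2^16. */
static farptr far_add(farptr p, long elems, long elem_size)
{
    p.off = (uint16_t)((long)p.off + elems * elem_size);
    return p;
}

/* Inequalities compare offsets only; differing segments are not comparable. */
static int far_less(farptr a, farptr b)
{
    assert(a.seg == b.seg);
    return a.off < b.off;
}
```

Taking foo = { 0x0008, 0 } as a stand-in for &foo[0] (the selector value is
arbitrary), far_add(foo, -1, 2).off is 0xfffe, so far_less() says that
&foo[-1] is not below &foo[0]. Stepping one 2-byte element past offset
0xfffe lands back on offset 0, and far_add() applied to a pointer with
seg == 0 still satisfies far_is_null().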
Thus not even this will work:

    for( p = foo64; p <= &foo64[32767]; ++p ){      /* loops forever */

In order to avoid these problems, then, we need a class of pointers which
cannot be dereferenced but which can be used in comparisons. It is
sufficient that these pointers be restricted to the form (&x)+1, where x is
any valid data object. (&x)+1 > (&x) must always hold for any data object x
(which rules out a full 64K-byte segment on a 286). It would be nice if
&x-1 were always less than &x, but that is not possible under this
segmentation scheme.

The ANSI standard must have something about such pointers. Does it say
roughly the same thing about them as I have in the preceding paragraph?

Sorry for all the blather, but I have noticed several previous postings
that have overlooked these considerations. These people may never have to
program on such an architecture, but it seems like it isn't too much
trouble to avoid constructs which won't port. What I am looking for is a
somewhat more concrete definition of which constructs will and won't work.

[ e.g. what about:

    p = &foo[-1];
    do{
        ++p;
        ...
    }while( p <= &foo[9] );

Does the first ++p cause p to be &foo[0]? Can I legally add 4123 to
&foo[0], and if I then subtract 4120 do I get &foo[3]? ]

P.S. I am not a segment fan, but a pragmatist recently transplanted to
the real world ( arrggg! ).
-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...