Xref: utzoo comp.sys.ibm.pc:22905 comp.sys.intel:646
Path: utzoo!attcan!uunet!husc6!rutgers!att!mtuxo!mtgzz!drutx!mayer
From: mayer@drutx.ATT.COM (gary mayer)
Newsgroups: comp.sys.ibm.pc,comp.sys.intel
Subject: Re: correct code for pointer subtraction
Summary: pointer subtraction is simple, but subtle
Keywords: C pointer math has machine dependent limitations
Message-ID: <9878@drutx.ATT.COM>
Date: 7 Jan 89 01:22:44 GMT
References: <597@mks.UUCP> <3845@pt.cs.cmu.edu> <18123@santra.UUCP> <6604@killer.DALLAS.TX.US>
Organization: AT&T, Denver, CO
Lines: 52

In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:

> The same error occurs in the following program 
> (with Turbo C 2.0 as well as MSC 5.0):
>
> main()
> {
>         static int a[30000];
>         printf("%d\n",&a[30000]-a);
> }
>
> output:  -2768

I grant that this is probably not the answer you would like, but it
is the answer you should expect once pointer arithmetic is understood.

First, pointers are addresses.  On the 8088 and related processors, memory
is addressed in bytes, and an integer is 2 bytes.  Thus the array above is
60,000 BYTES long, and the absolute difference between "a" (the address
of the first element of the array) and "&a[30000]" (the address of the
integer ONE PAST the end of the array; that is NOT a problem here and is
a very common practice) is 60,000.

Second, when doing pointer arithmetic a scaling factor is used.  In pointer
subtraction, the result is the number of objects (integers here) between
the two pointers.  The scaling factor is not visible, but is used internally
and is the number of addressing units in the given object.  In this
example, the addressing unit is a byte and the object is a 16-bit
integer, yielding a scaling factor of 2.

Third, you might expect the answer to be 30,000, the result of 60,000 / 2.
This doesn't happen because of three things: the 16-bit integer size, the
fact that the result of pointer subtraction is specified to be a signed
integer, and a "weak" but standard way of implementing the underlying code.
What happens is that the initial subtraction of the pointer addresses
gives 60,000 (0xEA60).  The division by 2 is then done as a signed
operation, so it is interpreted as "-5536 / 2" (0xEA60 as a signed 16-bit
integer is -5536), yielding the -2768 result.  It is this step of the
operation that I consider to be "weak": it mixes signed and unsigned
operations in an unfavorable way.  Done differently (for example, with an
unsigned division), it would yield the correct result in this case.

The problem is further complicated on the 8088 and related machines because
a "far" pointer allows for arrays larger than a 16-bit integer can
express.  The result of subtracting far pointers should be a long
integer, but I do not know what those compilers do.

In summary, be careful with pointers on these machines, and try to
learn about how things work "underneath".  The C language is very
close to the machine, and there are many times that this can have
an effect - understanding and avoiding these where possible is what
writing portable code is all about.