Xref: utzoo comp.sys.ibm.pc:22741 comp.sys.intel:637
Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!oliveb!olivej!rap
From: rap@olivej.olivetti.com (Robert A. Pease)
Newsgroups: comp.sys.ibm.pc,comp.sys.intel
Subject: Re: correct code for pointer subtraction
Keywords: Bug or not, deal with it.
Message-ID: <35495@oliveb.olivetti.com>
Date: 4 Jan 89 08:13:10 GMT
References: <597@mks.UUCP> <3845@pt.cs.cmu.edu> <18123@santra.UUCP> <142@bms-at.UUCP> <6604@killer.DALLAS.TX.US> <Jan.3.11.49.44.1989.22204@paul.rutgers.edu> <Jan.3.12.44.56.1989.4839@ron.rutgers.edu> <51@rpi.edu>
Sender: news@oliveb.olivetti.com
Reply-To: rap@olivej.UUCP (Robert A. Pease)
Organization: Olivetti ATC; Cupertino, Ca
Lines: 80

>Sizeof is irrelevant?  What about the following piece of code:
>
>	int	*a,b[30000];
>
>	a = &b[20000];
>	printf("%d\n", a-b);
>
>Obviously when you do the subtraction ( a-b ), the compiler only knows their
>addresses, not the number of elements they differ by.  So it HAS to calculate
>the difference in BYTES and then divide by sizeof( int ).  
>
>In the example in question ( &a[30000] - a ), the difference is 60000 bytes,
>which is -5536 as a signed 16-bit value.  Divide it by sizeof(int)==2 and you
>get the wonderful result: -2768.

In MSC these examples reduce to a constant and the  compiler
places  the  values as immediate data in the instruction.  I
make no claims as to  how  the  compiler  arrives  at  these
values.

>I am not saying it is correct.  Obviously the compiler is doing a signed
>division to get the result instead of the unsigned that it should do.
>THAT is the bug.  

The code generated by MSC 5.1 is shown below.


mov	WORD PTR _a,OFFSET DGROUP:_b+40000	; a = &b[20000];
mov	ax,WORD PTR _a				; a - b;
sub	ax,OFFSET DGROUP:_b			;
sar	ax,1					;
mov	WORD PTR _diff,ax			;

mov	WORD PTR _diff,-2768			; diff = &b[30000] - b;
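
The folded constant can be reproduced by hand.  The sketch below
is only my assumption about the arithmetic, not a claim about
what MSC does internally: it wraps the 60000-byte difference to
16 bits, reads it as signed, and halves it.  The helper names are
made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit pattern as a two's complement
   signed value without relying on implementation-defined casts. */
static long as_signed16(uint16_t bits)
{
    return (bits & 0x8000u) ? (long)bits - 65536L : (long)bits;
}

/* Byte difference for `elems` 2-byte ints, wrapped to 16 bits and
   then divided by the element size as a SIGNED quantity. */
static long signed_elem_diff(unsigned long elems)
{
    uint16_t byte_diff = (uint16_t)(elems * 2UL);  /* 60000 for 30000 */
    return as_signed16(byte_diff) / 2;             /* -5536 / 2       */
}
```

Under these assumptions signed_elem_diff(30000) comes out as
-2768, matching the immediate value in the listing.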


This  doesn't  really  show  much.   If  the  constants  are
replaced  with  variables initialized to the same value, the
code below is generated.


mov	ax,_ind2		; a = &b[ind2];	/* ind2 = 20000 */
shl	ax,1
add	ax,OFFSET DGROUP:_b
mov	WORD PTR _a,ax

mov	ax,WORD PTR _a		; diff = a - b;
sub	ax,OFFSET DGROUP:_b
sar	ax,1
mov	WORD PTR _diff,ax

mov	ax,_ind3		; diff = &b[ind3] - b; /* ind3 = 30000 */
shl	ax,1
add	ax,OFFSET DGROUP:_b
sub	ax,OFFSET DGROUP:_b
sar	ax,1
mov	WORD PTR _diff,ax


Note that the code is the same for both examples except  for
the intermediate storage.

Now, Intel's description of SHL is that it shifts  0  in  on
the  right  and if the sign bit changes then OF is set.  The
description for SAR is that it shifts the sign bit in on the
left.
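
To make the two descriptions concrete, here is a small C sketch
that models SHL and SAR by one bit on a 16-bit register.  It is
an illustration only: the function names are hypothetical, the
flags (including OF) are not modelled, and the ADD/SUB of the
base address is left out on the assumption that the two cancel
in 16-bit arithmetic.

```c
#include <stdint.h>

/* SHL by 1: a zero enters bit 0, the old bit 15 is discarded. */
static uint16_t shl1(uint16_t x)
{
    return (uint16_t)((uint32_t)x << 1);
}

/* SAR by 1: bit 15 (the sign bit) is copied back into bit 15. */
static uint16_t sar1(uint16_t x)
{
    return (uint16_t)((x >> 1) | (x & 0x8000u));
}

/* Index -> byte offset -> element count, as in the listing. */
static uint16_t index_round_trip(uint16_t index)
{
    return sar1(shl1(index));
}
```

With this model, an index of 20000 becomes 40000 (9C40h) after
SHL and CE20h after SAR; 30000 becomes 60000 (EA60h) and then
F530h.  Both results have the top bit set, so they read as
negative when treated as signed 16-bit values.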

This does begin to worry me a bit.  If I  trace  through  the
actual  values  with  CodeView, I find that the value of the
difference between pointers has changed sign between the SHL
and SAR instructions for the values used (20000 and 30000).

So, the bottom line is that when I expect the answer  to  be
20000  or  30000,  it isn't, and the reason it isn't is that
the SAR instruction shifts a copy of the sign bit in on  the
left  instead  of  a zero bit.  This is totally  contrary to
what I set out to prove.
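
For comparison, here is a sketch of the same final step done
with a logical right shift (SHR on the 8086), which shifts a
zero in on the left.  This only illustrates the bit-level
difference for the two values above; it is not a claim about
what a correct compiler must emit, since a logical shift in turn
gives the wrong answer for a negative difference.  The function
name is hypothetical.

```c
#include <stdint.h>

/* SHR by 1: a zero enters bit 15, so the result is never negative. */
static uint16_t shr1(uint16_t byte_diff)
{
    return (uint16_t)(byte_diff >> 1);
}
```

Here shr1(40000) is 20000 and shr1(60000) is 30000, the values I
expected.  A byte difference of -4 (FFFCh), however, becomes
7FFEh rather than -2.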

					Robert A. Pease
{hplabs|fortune|microsoft|amdahal|piramid|tolerant|sun|aimes}!oliveb!rap