Xref: utzoo comp.sys.ibm.pc:22741 comp.sys.intel:637
Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!oliveb!olivej!rap
From: rap@olivej.olivetti.com (Robert A. Pease)
Newsgroups: comp.sys.ibm.pc,comp.sys.intel
Subject: Re: correct code for pointer subtraction
Keywords: Bug or not, deal with it.
Message-ID: <35495@oliveb.olivetti.com>
Date: 4 Jan 89 08:13:10 GMT
References: <597@mks.UUCP> <3845@pt.cs.cmu.edu> <18123@santra.UUCP> <142@bms-at.UUCP> <6604@killer.DALLAS.TX.US> <51@rpi.edu>
Sender: news@oliveb.olivetti.com
Reply-To: rap@olivej.UUCP (Robert A. Pease)
Organization: Olivetti ATC; Cupertino, Ca
Lines: 80

>Sizeof is irrelevant? What about the following piece of code:
>
>       int *a,b[30000];
>
>       a = &b[20000];
>       printf("%d\n", a-b);
>
>Obviously when you do the subtraction ( a-b ), the compiler only knows their
>addresses, not the number of elements they differ. So it HAS to calculate the
>difference in BYTES and then divide by sizeof( int ).
>
>In example in question ( &a[30000] - a ), the difference is 60000 bytes, which
>is -5536. Divide it by sizeof(int)==2 and you get the wonderful result: -2736.

In MSC these examples reduce to a constant and the compiler places
the values as immediate data in the instruction.  I make no claims
as to how the compiler arrives at these values.

>I am not saying it is correct. Obviously the compiler is doing a signed
>division to get the result instead of the unsigned that it should do.
>THAT is the bug.

The code generated by MSC 5.1 is shown below.

        mov     WORD PTR _a,OFFSET DGROUP:_b+40000  ; a = &b[20000];
        mov     ax,WORD PTR _a                      ; a - b;
        sub     ax,OFFSET DGROUP:_b                 ;
        sar     ax,1                                ;
        mov     WORD PTR _diff,ax                   ;
        mov     WORD PTR _diff,-2768                ; diff = &b[30000] - b;

This doesn't really show much.  If the constants are replaced with
variables initialized to the same value, the code below is
generated.

;|***
        mov     ax,_ind2                ; a = &b[ind2];  /* ind2 = 20000 */
        shl     ax,1
        add     ax,OFFSET DGROUP:_b
        mov     WORD PTR _a,ax
        mov     ax,WORD PTR _a          ; diff = a - b;
        sub     ax,OFFSET DGROUP:_b
        sar     ax,1
        mov     WORD PTR _diff,ax

        mov     ax,_ind3                ; diff = &b[ind3] - b;  /* ind3 = 30000 */
        shl     ax,1
        add     ax,OFFSET DGROUP:_b
        sub     ax,OFFSET DGROUP:_b
        sar     ax,1
        mov     WORD PTR _diff,ax

Note that the code is the same for both examples except for the
intermediate storage.

Now, Intel's description of SHL is that it shifts a 0 in on the
right, and if the sign bit changes then OF is set.  The description
of SAR is that it shifts a copy of the sign bit in on the left.
This does begin to worry me a bit.

If I trace through the actual values with CodeView, I find that the
value of the difference between pointers has changed sign between
the SHL and SAR instructions for the values used (20000 and 30000).

So, the bottom line is that when I expect the answer to be 20000 or
30000, it isn't, and the reason it isn't is that the SAR instruction
shifts a sign bit in on the left instead of a zero bit.

This is totally contrary to what I set out to prove.

Robert A. Pease
{hplabs|fortune|microsoft|amdahal|piramid|tolerant|sun|aimes}!oliveb!rap
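
For anyone without CodeView handy, the SHL/SUB/SAR sequence traced
above can be approximated in portable C.  This is only a sketch: the
helper names (byte_offset, sar1, shr1, as_signed, print_differences)
are made up for illustration, AX is modelled as a uint16_t, and the
SHR variant is shown purely for comparison, not as a claim about
what MSC ought to emit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* SHL ax,1: scale an element index to a byte offset (sizeof(int) == 2). */
static uint16_t byte_offset(uint16_t index)
{
    return (uint16_t)(index << 1);
}

/* SAR ax,1: shift right one bit, copying bit 15 back into bit 15. */
static uint16_t sar1(uint16_t ax)
{
    return (uint16_t)((ax >> 1) | (ax & 0x8000u));
}

/* SHR ax,1: shift right one bit, shifting a zero into bit 15. */
static uint16_t shr1(uint16_t ax)
{
    return (uint16_t)(ax >> 1);
}

/* Read a 16-bit register value the way a 16-bit signed int would. */
static int as_signed(uint16_t ax)
{
    return ax >= 0x8000u ? (int)ax - 65536 : (int)ax;
}

/* Print the element difference both ways for one index (hypothetical helper). */
void print_differences(uint16_t index)
{
    uint16_t off = byte_offset(index);   /* (b + off) - b leaves off in AX */

    assert(shr1(off) == index);          /* holds for index < 32768 */
    printf("index %u: byte offset %u, SAR gives %d, SHR gives %u\n",
           (unsigned)index, (unsigned)off,
           as_signed(sar1(off)), (unsigned)shr1(off));
}
```

Calling print_differences(20000) and print_differences(30000) from a
main() reports -12768 and -2768 for the SAR path and 20000 and 30000
for the SHR path.  The -2768 matches the constant MSC folded into
the first listing.  Note that the zero-filling shift is only right
for differences known to be non-negative; it would break a - b when
a points below b.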