Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!sunybcs!boulder!ccncsu!longs.LANCE.ColoState.Edu!gs940971
From: gs940971@longs.LANCE.ColoState.Edu (glen sunada f84)
Newsgroups: comp.lang.c
Subject: Re: Pointer arithmetic and comparisons.
Message-ID: <3408@ccncsu.ColoState.EDU>
Date: 8 Dec 89 06:42:04 GMT
References: <257ECDFD.CDD@marob.masa.com>
Sender: news@ccncsu.ColoState.EDU
Reply-To: gs940971@longs.LANCE.ColoState.Edu (glen sunada f84)
Organization: Colorado State U. Center for Computer Assisted Engineering
Lines: 103

In article <257ECDFD.CDD@marob.masa.com>, daveh@marob.masa.com (Dave Hammond) writes:
> Machine/OS: 286-compatible/MSDOS
> Compiler: Turbo-C 1.5
> [ stuff deleted in concern of bandwidth]
> some_function(char *buffer, int len)
> {
>     char *p = buffer;
>     char *e = &buffer[len];
>
>     while ((*p++ = getchar()) != EOF && p < e) {
>         ...
>     }
>
>     ...
> }
> [ more stuff deleted ]
>
> The problem occurs when the address resulting from &buffer[len] exceeds
> 65535. For example, if &buffer[0] is 65535 and len is 100, &buffer[len]
> becomes 99, making `while (p < e)' immediately false.
>
> I was under the impression that (for n > 0) buffer[n] should not yield
> an address lower than buffer[0]. Is the pointer comparison I am doing
> non-portable, or otherwise ill-advised ?
>
> Thanks in advance.
>
> --
> Dave Hammond
> daveh@marob.masa.com

This question is probably one that a lot of people on the net have been
bitten by or will be bitten by, so I will answer over the net.

Yes, the value of the pointer to the end of the string array is supposed
to be larger than the pointer to the beginning of the string array. What
is causing the problem with the comparison in the MS-DOS environment is
that the 8086 family of microprocessors has a segmented architecture.
The net result of this is that when comparing pointers you need to
compare both the segment and the offset to determine the placement in
memory.
Because of the segmented architecture, compilers for MS-DOS machines
support many memory models. These models are listed below:

    TINY    - One segment - all pointers are a 16 bit offset
    SMALL   - One segment for code and one for data - all pointers are
              a 16 bit offset, but code is in a different segment than
              data (i.e. pointers to functions have a different segment
              than pointers to data) - usual default
              NOTE: The example you give should result in a stack
              corruption problem
    MEDIUM  - any number of code segments, one data segment; pointers
              are non-normalized (i.e. there are many segment:offset
              pairs that resolve to the same linear address; segments
              are defined on 16 byte boundaries with a size of 64K)
    COMPACT - one code segment, any number of data segments,
              non-normalized pointers
    LARGE   - any number of code segments, any number of data segments,
              non-normalized pointers
    HUGE    - any number of code segments, any number of data segments,
              normalized pointers

There is also another difference between non-normalized and normalized
pointers. Non-normalized pointers wrap around at 64K (65536 -> 0) and
do not adjust the segment. Therefore, in the code fragment you posted,
when p is incremented past the end of the segment it points to offset
zero in the same segment as the start of the array, not into the
segment starting 64K above the one that holds the start of the array.
A normalized pointer, on the other hand, has its offset automatically
reset to keep the offset less than 16 bytes. To do this the segment is
adjusted. This means that using huge pointers in the example above
should (CMA - Cover My A**) fix the problem with wrap around. There is
a disadvantage to working with huge pointers - they take more time
because of the normalization.

I hope this quick overview of the architecture of the IBM-PC helps and
is not just a waste of network bandwidth.
BTW - this segmentation is not a problem with *NIX on a PC because *NIX
automatically uses the HUGE memory model for an 8088, 8086, 80188,
80186, and 80286 and either the HUGE or the protected mode flat memory
model for the 80386 (unfortunately this model is not available to DOS
on an 80386 because DOS runs in real mode, not protected mode).

Glen U. Sunada
gs940971@longs.LANCE.ColoState.EDU