Path: utzoo!mnetor!uunet!oddjob!hao!ames!pasteur!ucbvax!ulysses!cjc
From: cjc@ulysses.homer.nj.att.com (Chris Calabrese[rs])
Newsgroups: comp.lang.c
Subject: Re: Bug in ANSI C??
Message-ID: <10095@ulysses.homer.nj.att.com>
Date: 18 Feb 88 14:56:41 GMT
References: <5331@cit-vax.Caltech.Edu> <241@oracle.UUCP> <2118@bsu-cs.UUCP> <16@dcs.UUCP>
Distribution: comp.sys.ibm.pc,comp.lang.c
Organization: AT&T Bell Laboratories, Murray Hill
Lines: 25
Keywords: memcmp, memmove, strcmp, memcmp
Summary: 8 bit international char's

In article <16@dcs.UUCP>, wnp@dcs.UUCP writes:
> In article <2118@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> >In article <241@oracle.UUCP> rbradbur@oracle.UUCP (Robert Bradbury) writes:
> >>On another note; does everyone realize that the current standard allows
> >>the results of the str/memcmp() function to be implementation defined
> >>if the characters being compared have the high-bit set?
>
> The purpose of this would be to allow the use of the "alternate" character
> set (= codes > 127) to be used for international language applications.
> Languages which have more than 26 alpha characters need the upper half
> of the eight-bit code range to implement their languages, and in that
> case ignoring the 8th bit would be very counter-productive.

If ANSI wants this to really work, they'll have to allow for 16-bit
chars, the standard in Japanese and Chinese language word processors.

There is still a problem with using the 8th bit, as many machines
generate strict parity for character work. Presumably, the lexical
ordering problem can be eliminated by stripping the 8th bit before
comparison, or better yet, by using 15-bit chars with 1 bit of parity,
or any other combination.

	Chris Calabrese
	AT&T Bell Labs
	ulysses!cjc
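
To make the tradeoff concrete, here is a minimal sketch in C of the two comparison policies discussed above: comparing bytes as unsigned values, so codes above 127 sort after plain ASCII, versus stripping the 8th bit before comparing. The function names cmp_unsigned and cmp_strip8 are hypothetical helpers invented for this illustration and are not part of any standard library. The byte 0xE9 is used only as a sample high-bit code (it is e-acute in ISO 8859-1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compare n bytes as unsigned char values (0..255),
   so a byte with the high bit set sorts after any 7-bit ASCII byte. */
int cmp_unsigned(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}

/* Hypothetical helper: compare n bytes after masking off the 8th bit,
   as one would if that bit carried parity rather than character data. */
int cmp_strip8(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        unsigned char x = p[i] & 0x7F;  /* keep the low 7 bits only */
        unsigned char y = q[i] & 0x7F;
        if (x != y)
            return x < y ? -1 : 1;
    }
    return 0;
}
```

With these definitions, cmp_unsigned("a\xE9", "az", 2) is positive, because 0xE9 is greater than 'z' (0x7A), while cmp_strip8 on the same arguments is negative, because 0xE9 & 0x7F is 0x69, which is 'i'. Stripping the bit therefore makes the ordering independent of the high bit, but it also makes 0xE9 compare equal to 'i', which is the loss the quoted article is worried about for 8-bit international character sets.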