Path: utzoo!attcan!uunet!mcvax!ukc!reading!cf-cm!cybaswan!iiit-sh
From: iiit-sh@cybaswan.UUCP (Steve Hosgood)
Newsgroups: comp.std.c
Subject: Re: Character Sets
Message-ID: <456@cybaswan.UUCP>
Date: 23 May 89 13:20:50 GMT
References: <4623@freja.diku.dk> <12.UUL1.3#5077@aussie.UUCP> <2469@ogccse.ogc.edu> <373@cybaswan.UUCP> <10194@smoke.BRL.MIL> <442@cybaswan.UUCP> <10284@smoke.BRL.MIL>
Reply-To: iiit-sh@cybaswan.UUCP (Steve Hosgood)
Organization: Institute for Industrial Information Technology
Lines: 34

In article <10284@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>However, X3J11 did mandate that the character values for '0'..'9' have
>adjacent values in ascending numerical order. That is clearly a code
>set requirement, which I argued against. The need for some way to
>map digit characters to numbers and vice versa does exist, but other
>means to meet this need could have been specified.

Seems like a job for to me. Interesting though, I had never considered
the possibility of non-contiguous numbers and alphabetics rearing its head
now that EBCDIC is dead (slight :-)).

>>The 'UCASE' hack to allow UN*X to work on silly old terminals was put
>>into the TTY handler. So I believe should this trigraph thingy.

>Not every system has such facilities, but I agree with your general
>sentiment. In fact I expect that some of the more enlightened
>implementors will take exactly this tack to deal with practical use
>of so-called "European character sets".

But if this trigraph thing gets into the standard, then *all* conforming
compilers will *have* to have the code in their lexical analysers. As you
say, enlightened (:-)) implementors will probably deal with the problem in
the handler, but the compiler carries the baggage around for evermore
*as well*.

>The new ISO code set standards should also help.

I certainly hope so. Presumably the C standard allows for 8-bit character
sets? Also, what about such things as allowable characters in identifiers
and such like?
Just yesterday, I was writing a program where I would have liked to have
used Greek characters as identifiers. Is that sort of thing permissible?
Would 'toupper' return upper-case Epsilon if given lower-case epsilon as an
argument?

It's a tricky can of worms, and it gets worse the closer you look at it.

Steve
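
For what it's worth, the digit-mapping point quoted at the top can be sketched in C. This is only an illustration, not anything taken from the draft standard: the helper names digit_value and xdigit_value are made up for the example. The first relies on the requirement that '0'..'9' have adjacent ascending values; the second shows one of the "other means", a table lookup that assumes nothing about the code values and so also copes with letters, which carry no such guarantee.

```c
#include <string.h>

/* Hypothetical helper: relies on '0'..'9' being adjacent and ascending,
 * which is the code set requirement discussed in the quoted text.
 * Returns the digit's value, or -1 if c is not a decimal digit. */
static int digit_value(int c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Hypothetical helper: assumes nothing about character code values.
 * The position of the character in a table is its numeric value, so
 * this works even where the letters are not contiguous (as in EBCDIC).
 * Returns the hex digit's value, or -1 if c is not a hex digit. */
static int xdigit_value(int c)
{
    static const char lower[] = "0123456789abcdef";
    static const char upper[] = "0123456789ABCDEF";
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    if ((p = strchr(lower, c)) != NULL)
        return (int)(p - lower);
    if ((p = strchr(upper, c)) != NULL)
        return (int)(p - upper);
    return -1;
}
```

With these definitions, digit_value('7') gives 7 and xdigit_value('c') gives 12, while a non-digit such as 'g' gives -1 from both. The table version costs a search per character, which is the price of not depending on the code set.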