Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: comp.std.internat
Subject: Re: Character representation
Message-ID: <8480@utzoo.UUCP>
Date: Tue, 25-Aug-87 11:23:51 EDT
Article-I.D.: utzoo.8480
Posted: Tue Aug 25 11:23:51 1987
Date-Received: Tue, 25-Aug-87 11:23:51 EDT
References: <15381@mordor.s1.gov> <1583@athena.TEK.COM> <8462@utzoo.UUCP>, <737@maccs.UUCP>
Organization: U of Toronto Zoology
Lines: 17

> Ummm, Arabic, Hebrew, Greek, and Cyrillic are or will shortly be taken care of
> by the same standardization process that produced ISO Latin-1. Each uses a
> different upper half of the character set...

Unfortunately, this brings us back to the old problem that the meaning of
a byte is context-dependent. There were alternate character sets for the
Latin languages before, and standard escape sequences for switching; much
good it did us. Anything with mode-switching is an order of magnitude
harder to handle intelligently than a modeless code like ASCII or ISO Latin.
Don't forget the right-to-left problems in Arabic and Hebrew, for that matter.

I don't know what the best answer is, and am not convinced that anyone
else does either. Hence "wait and see". My sympathy goes out to the
people who have compelling commercial reasons to do something about these
issues now; it can't be much fun.
--
"There's a lot more to do in space   | Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry
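
To make the "meaning of a byte is context-dependent" point concrete, here is a minimal sketch, not anything from the original article. It assumes the present-day Python codec names for the ISO 8859 family as stand-ins for the "different upper halves" mentioned in the quoted text; the names CHARSETS and interpretations are made up for this illustration.

```python
# Illustration only: modern Python codec names stand in for the ISO 8859
# "upper halves" discussed above; none of this appears in the original post.
CHARSETS = {
    "Latin-1": "iso8859_1",
    "Cyrillic": "iso8859_5",
    "Arabic": "iso8859_6",
    "Greek": "iso8859_7",
    "Hebrew": "iso8859_8",
}


def interpretations(byte_value):
    """Return {charset label: decoded character} for one byte value."""
    raw = bytes([byte_value])
    return {label: raw.decode(codec) for label, codec in CHARSETS.items()}


upper = interpretations(0xE1)  # a byte from the upper half (high bit set)
lower = interpretations(0x41)  # a byte from the shared ASCII half

# Print code points rather than glyphs so the output is plain ASCII.
for label, ch in upper.items():
    print(f"0xE1 read as {label:8}: U+{ord(ch):04X}")
for label, ch in lower.items():
    print(f"0x41 read as {label:8}: U+{ord(ch):04X}")
```

The lower-half byte decodes to the same character under every charset, while the upper-half byte yields a different character under each one. Software handed only the byte cannot tell which was meant without carrying the current charset along as extra state, which is the bookkeeping a modeless code avoids.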