Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: comp.std.internat
Subject: Re: Character representation
Message-ID: <8480@utzoo.UUCP>
Date: Tue, 25-Aug-87 11:23:51 EDT
Article-I.D.: utzoo.8480
Posted: Tue Aug 25 11:23:51 1987
Date-Received: Tue, 25-Aug-87 11:23:51 EDT
References: <15381@mordor.s1.gov> <1583@athena.TEK.COM> <8462@utzoo.UUCP>, <737@maccs.UUCP>
Organization: U of Toronto Zoology
Lines: 17

> Ummm, Arabic, Hebrew, Greek, and Cyrillic are or will shortly be taken care of
> by the same standardization process that produced ISO Latin-1.  Each uses a
> different upper half of the character set...

Unfortunately, this brings us back to the old problem that the meaning of a
byte is context-dependent.  There were alternate character sets for the Latin
languages before, and standard escape sequences for switching; much good it
did us.  Anything with mode-switching is an order of magnitude harder to
handle intelligently than a modeless code like ASCII or ISO Latin-1.  Don't
forget the right-to-left problems in Arabic and Hebrew, for that matter.
I don't know what the best answer is, and am not convinced that anyone else
does either.  Hence "wait and see".  My sympathy goes out to the people who
have compelling commercial reasons to do something about these issues now;
it can't be much fun.
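
To make the context-dependence concrete, here is a minimal sketch in Python.
It assumes the 8859 parts as Python's standard codecs implement them, and it
uses ISO-2022-JP as a stand-in for an escape-sequence scheme; the function
names are hypothetical, chosen only for illustration.

```python
# One byte value, five meanings: which character it is depends on which
# 8859 part the reader has been told (out of band) to assume.
BYTE = b"\xe1"
CODECS = ["iso8859_1", "iso8859_5", "iso8859_6", "iso8859_7", "iso8859_8"]


def meanings(byte=BYTE, codecs=CODECS):
    """Map each codec name to the character that codec assigns to `byte`."""
    return {name: byte.decode(name) for name in codecs}


def payload_without_escapes(text="\u65e5\u672c"):
    """Encode `text` with a mode-switching code, then cut out the middle.

    The encoded form is ESC $ B, four payload bytes, ESC ( B.  Slicing off
    the two 3-byte escape sequences leaves bytes whose meaning depended on
    the state set earlier in the stream.
    """
    encoded = text.encode("iso2022_jp")
    return encoded, encoded[3:-3]


if __name__ == "__main__":
    for name, ch in meanings().items():
        print(f"{name}: U+{ord(ch):04X}")

    encoded, middle = payload_without_escapes()
    print(encoded)
    # With the escapes gone the decoder stays in its initial ASCII mode,
    # so the same four bytes no longer come back as the original text.
    print(repr(middle.decode("iso2022_jp")))
```

With the 8859 parts the information about which set is in use has to travel
outside the text; with escape sequences it is inside the text, but any program
that starts reading in the middle has lost it.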
-- 
"There's a lot more to do in space   |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry