Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 alpha 4/3/85; site ukma.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!ihnp4!cbosgd!ukma!david
From: david@ukma.UUCP (David Herron, NPR Lover)
Newsgroups: net.internat
Subject: Re: Is 8-bit ASCII enough?
Message-ID: <2282@ukma.UUCP>
Date: Thu, 10-Oct-85 11:15:44 EDT
Article-I.D.: ukma.2282
Posted: Thu Oct 10 11:15:44 1985
Date-Received: Sat, 12-Oct-85 21:31:54 EDT
References: <149@ecrcvax.UUCP> <10597@ucbvax.ARPA>
Reply-To: david@ukma.UUCP (David Herron, NPR Lover)
Organization: Univ. of KY Mathematical Sciences
Lines: 28

In article <10597@ucbvax.ARPA> kupfer@ucbvax.UUCP (Mike Kupfer) writes:
>I think that 8 bits is still not enough if you want to include oriental
>or other non-Roman character sets.  So using only 8 bits is reasonable
>if you assume that a typical UNIX system will not be able to display
>these characters (so why bother with them), but you should realize that
>this assumption is being made.

There's some work being done at Xerox, etc in representing foreign
character sets and word-processing them -- Look in Sci. Am. in an issue
a year or two ago.  I think maybe that was the 'topic' for that month even.

The method (As I recall) described in one article was to define one code
as an "escape" code.  You could follow the escape code with commands to
switch character sets or whatever.  So instead of an absolute encoding,
you had a context sensitive encoding.  Which will give you greater
flexibility in the character sets you are storing.  (They are aiming
for a system whereby ALL text, regardless of language, may be
word-processed, etc).

One of the most interesting things I remember is that some languages
have characters which *surround* other characters.  This was making
for an interesting typesetting problem.

--
David Herron,  ukma!david@ANL-MCS.ARPA, cbosgd!ukma!david
(Soon -- david@UKMA.BITNET, and (hopefully) david@ukma.csnet)
Hackin's in me blood!
My mother was known as Miss Hacker before she married!
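
P.S.  For anyone who wants the escape-code idea in concrete form, here is
a rough sketch in Python.  It is only my illustration of a context
sensitive encoding, not the scheme from the article: the escape byte
(0x1B), the one-byte character set id that follows it, and the names
ESC, DEFAULT_SET, encode and decode are all assumptions made up for the
example.

```python
ESC = 0x1B        # assumed escape byte; the article's actual code is not known here
DEFAULT_SET = 0   # hypothetical id of the character set in effect at the start


def encode(chars):
    """Encode (charset_id, code) pairs into bytes.

    An ESC byte followed by a charset id is emitted only when the
    character set changes, so runs in one set cost one byte per character.
    """
    out = bytearray()
    current = DEFAULT_SET
    for charset_id, code in chars:
        if not 0 <= charset_id <= 0xFF:
            raise ValueError(f"charset id {charset_id!r} does not fit in one byte")
        if not 0 <= code <= 0xFF or code == ESC:
            raise ValueError(f"code {code!r} cannot be sent as a plain byte")
        if charset_id != current:
            out += bytes([ESC, charset_id])   # switch character sets
            current = charset_id
        out.append(code)
    return bytes(out)


def decode(data):
    """Decode bytes back into (charset_id, code) pairs."""
    out = []
    current = DEFAULT_SET
    stream = iter(data)
    for byte in stream:
        if byte == ESC:
            try:
                current = next(stream)        # the byte after ESC names the new set
            except StopIteration:
                raise ValueError("stream ends right after the escape byte") from None
        else:
            out.append((current, byte))       # meaning depends on the current set
    return out


text = [(0, 0x41), (0, 0x42), (1, 0x41), (1, 0x43), (0, 0x44)]
encoded = encode(text)
decoded = decode(encoded)
```

Note that the byte 0x41 shows up twice in the encoded stream and decodes
to two different (set, code) pairs, depending on which escape sequence
came before it.  That is the "context sensitive" part: you cannot
interpret a byte without knowing the state left by the bytes ahead of it.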