Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!brl-adm!brl-smoke!gwyn
From: gwyn@brl-smoke.ARPA (Doug Gwyn )
Newsgroups: comp.lang.c,comp.std.internat
Subject: Re: What is a byte
Message-ID: <6270@brl-smoke.ARPA>
Date: Sun, 9-Aug-87 19:56:22 EDT
Article-I.D.: brl-smok.6270
Posted: Sun Aug 9 19:56:22 1987
Date-Received: Thu, 13-Aug-87 01:33:20 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP>
Reply-To: gwyn@brl.arpa (Doug Gwyn (VLD/VMB) )
Organization: Ballistic Research Lab (BRL), APG, MD.
Lines: 18
Xref: mnetor comp.lang.c:3580 comp.std.internat:88

In article <2034@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>While we're developing nightmares about the number of bits the Japanese
>need in a char, remember for text processing that for 1 billion of the
>earth's residents, the smallest unit of text processing is the ideograph ...

I'm no expert, but I seem to recall that Chinese ideographs (which as
I understand it come in several varieties) are pretty much made from
a (relatively) small set of basic strokes placed in different positions.
I think there are even Chinese typewriters, or at least type compositors.

If this is correct, then one possibility would be to devise a suitable
(acceptable to technical Chinese) representation for ideographs in
terms of basic strokes and placement instructions, which could be treated
as text units.

After all, the letter "w" doesn't mean much when taken out of English
context; we too need the whole word-symbol, not just a letter-component
to express a meaning.  It's just that our alphabet is simpler and is
combined in 1 dimension instead of 2.
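
To make the "strokes plus placement instructions" idea a little more concrete, here is a minimal sketch in C of what such a text unit might look like. Everything in it is an assumption made for illustration only: the stroke repertoire, the 16x16 placement grid, the MAX_STROKES bound, and the names stroke_t, ideograph_t, ideograph_add, and ideograph_equal are all hypothetical and do not correspond to any existing encoding or library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stroke repertoire; a usable scheme would need a set
   agreed on by people who actually know the script. */
enum stroke_kind {
    STROKE_DOT,
    STROKE_HORIZONTAL,
    STROKE_VERTICAL,
    STROKE_LEFT_FALLING,
    STROKE_RIGHT_FALLING,
    STROKE_HOOK
};

#define MAX_STROKES 64   /* assumed upper bound on strokes per unit */
#define GRID_SIZE   16   /* assumed placement grid: 0..15 on each axis */

/* One basic stroke together with its placement instruction. */
typedef struct {
    unsigned char kind;  /* one of enum stroke_kind */
    unsigned char x, y;  /* position on the assumed grid */
} stroke_t;

/* One text unit: an ordered list of placed strokes. */
typedef struct {
    size_t   count;
    stroke_t strokes[MAX_STROKES];
} ideograph_t;

/* Append a stroke; returns 0 on success, -1 if the unit is full or the
   position lies outside the grid. */
int ideograph_add(ideograph_t *g, enum stroke_kind kind,
                  unsigned x, unsigned y)
{
    assert(g != NULL);
    if (g->count >= MAX_STROKES || x >= GRID_SIZE || y >= GRID_SIZE)
        return -1;
    g->strokes[g->count].kind = (unsigned char)kind;
    g->strokes[g->count].x = (unsigned char)x;
    g->strokes[g->count].y = (unsigned char)y;
    g->count++;
    return 0;
}

/* Two units compare equal when they hold the same strokes, in the same
   order, at the same positions.  Returns 1 if equal, 0 otherwise. */
int ideograph_equal(const ideograph_t *a, const ideograph_t *b)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (a->count != b->count)
        return 0;
    for (i = 0; i < a->count; i++) {
        if (a->strokes[i].kind != b->strokes[i].kind ||
            a->strokes[i].x != b->strokes[i].x ||
            a->strokes[i].y != b->strokes[i].y)
            return 0;
    }
    return 1;
}
```

In this sketch the whole ideograph_t, not the individual stroke, plays the role of the text unit, much as the whole word rather than the letter "w" carries meaning in English. Note that the comparison is order-sensitive, so a practical scheme would also have to fix a canonical stroke order.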