Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!ames!ucbcad!ucbvax!hplabs!hp-sdd!artecon!tony
From: tony@artecon.UUCP
Newsgroups: comp.lang.c,comp.std.internat
Subject: Re: What is a byte
Message-ID: <555@artecon.artecon.UUCP>
Date: Tue, 18-Aug-87 13:08:43 EDT
Article-I.D.: artecon.555
Posted: Tue Aug 18 13:08:43 1987
Date-Received: Thu, 20-Aug-87 01:52:11 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP> <2792@phri.UUCP> <25736@sun.uucp>
Organization: Artecon Inc., San Diego
Lines: 71
Xref: utgpu comp.lang.c:3545 comp.std.internat:117
Summary: Kanji isn't all that hard.

In article <25736@sun.uucp>, guy%gorodish@Sun.COM (Guy Harris) writes:
>In article Peter Da Silva writes:
> > In Japan programming languages are the least of the problems their written
> > language causes them. An incredible amount of data is never stored anywhere
> > but on the original form, photocopies of said form, or faxed copies of said
> > form. Even with the best tools available it's just too hard to keypunch.
> >
> > This, of course, makes it even more amazing that they have been so successful
> > in the world community. It seems likely to me, though, that at some point
> > they're going to have to break down and drop Kanji for professional use.
>
> I don't know about that. More and more machines are adding support for Kanji.
> There are a large number of Japan-only (Japan-mostly? I seem to remember Jun
> Murai saying these groups were forwarded to Carnegie-Mellon) newsgroups in which
> most of the traffic is in Japanese, represented in Kanji. (He said they added
> Kanji support to X.10, including a "jterm" variant of "xterm" that emulated a
> Kanji terminal.) The NEC PC also includes Kanji support; it is often used as a
> Kanji terminal.
>
> These machines may not be able to handle every single Kanji character, but the
> 90/10 rule may apply.
>	Guy Harris

Yes, it is true that Kanji is getting more support.
Hewlett-Packard has a new drafting plotter (HP-7595) which has a Kanji option.
When you invoke the Kanji font, you go into a two-byte mode; that is, it takes
two bytes to specify one Kanji character. Control bytes are still used as
control bytes, and only the 94 printing bytes are used in the Kanji
specification. So there are 94 * 94 = 8836 different characters you can use.
This is a good way of doing it, since you never know how your OS is going to
muck with control codes or full 8-bit binary data going to I/O devices. I
believe that this is a fairly standard way of doing this for printers.

8836 may not seem like a lot of Kanji (which is known to go to about 50000 in
Japanese), but only 1850 are needed to graduate from high school, and usually
about 3000 are used in college texts.

There are two "JIS" levels, defined in the Japanese Industrial Standard for
Kanji codes. JIS level 1 is about 3000 characters (including the basic 1850,
KataKana, HiraGana, English alphabet, Cyrillic, Greek, and special symbols),
and JIS level 2 is about 8000 (including the 3000 of JIS level 1). As a rule,
one is supposed to try to stick to JIS level 1, but use JIS level 2 for proper
names and just a few other exceptions.

So, in reply to the above:

1) You may not be able to handle all 50000 Kanji, but JIS level 2 is more
   than enough.

2) It really isn't that difficult to implement, because:

   a) It is a well-defined font, accessed easily in two-byte sequencing
      (you don't even need 8 bits; 7 will do).

   b) You can get already-masked ROMs which contain Kanji in a rasterized
      form for raster printers.

   c) The Japanese are more than happy to help you implement Kanji in your
      products. They will digitize Kanji for whatever reasonable form you
      need it.

-- Tony

BTW, I am not Japanese...but.."I think I'm turning Japanese,
I really think so!"        "Konnichi-wa"
--
**************** Insert 'Standard' Disclaimer here:  OOP ACK!
*****************
* Tony Parkhurst -- {hplabs|sdcsvax|ncr-sd|hpfcla|ihnp4}!hp-sdd!artecon!adp *
* -OR- hp-sdd!artecon!adp@nosc.ARPA                                         *
*******************************************************************************
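
A footnote on the two-byte mode described in the article: the arithmetic
(94 printing bytes per position, 94 * 94 = 8836 codes) can be sketched in C
roughly as follows. This is only an illustration, not the HP-7595's actual
command format, which the article does not give. It assumes the 94 printing
bytes are ASCII 0x21 ('!') through 0x7E ('~'); the names kj_encode and
kj_decode and the 0-based character index are made up for the example.

```c
/* Assumption: the "94 printing bytes" are ASCII 0x21..0x7E ('!'..'~').
   The article does not list them explicitly. */
enum {
    KJ_FIRST = 0x21,
    KJ_LAST  = 0x7E,
    KJ_COUNT = KJ_LAST - KJ_FIRST + 1   /* 94 */
};

/* Encode a hypothetical character index (0 .. 8835) as two printing bytes.
   Returns 0 on success, -1 if the index does not fit in 94 * 94 codes. */
int kj_encode(unsigned index, unsigned char out[2])
{
    if (index >= (unsigned)KJ_COUNT * KJ_COUNT)
        return -1;
    out[0] = (unsigned char)(KJ_FIRST + index / KJ_COUNT);
    out[1] = (unsigned char)(KJ_FIRST + index % KJ_COUNT);
    return 0;
}

/* Decode two bytes back to the index.
   Returns -1 if either byte is a control byte or has the 8th bit set,
   since such bytes are never part of a character in this scheme. */
long kj_decode(unsigned char first, unsigned char second)
{
    if (first < KJ_FIRST || first > KJ_LAST ||
        second < KJ_FIRST || second > KJ_LAST)
        return -1;
    return (long)(first - KJ_FIRST) * KJ_COUNT + (second - KJ_FIRST);
}
```

Every byte produced this way stays inside the 7-bit printing range, so
control codes pass through untouched and nothing depends on the OS leaving
the 8th bit alone, which is the advantage the article points out.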