Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!clyde!rutgers!seismo!mcvax!lambert
From: lambert@mcvax.UUCP
Newsgroups: comp.std.internat,sci.lang.japan
Subject: Re: What is a byte
Message-ID: <51@piring.cwi.nl>
Date: Sun, 16-Aug-87 06:13:53 EDT
Article-I.D.: piring.51
Posted: Sun Aug 16 06:13:53 1987
Date-Received: Sun, 16-Aug-87 22:10:56 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP> <2792@phri.UUCP> <6252@brl-smoke.ARPA> <479@sugar.UUCP>
Organization: CWI, Amsterdam
Lines: 61
Keywords: Kanji, Romaji, homonymy
Xref: utgpu comp.std.internat:107 junk:5614
Summary: Problems in dropping Kanji

[I have removed comp.std.c from the Newsgroups line and added sci.lang.japan]

In article <479@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
) This, of course, makes it even more amazing that they have been so succesful
) in the world community. It seems likely to me, though, that at some point
) they're going to have to break down and drop Kanji for professional use.

There seems to be a good reason for not doing this: after romanization, words written differently in Kanji may become the same. Although ambiguities caused by homonymy occur in all languages (like English "drill" = 1. [the use of] a tool for boring holes, metaphorically also boring exercise; 2. [[a tool for] sowing in] a furrow; 3. twilled cotton; 4. a baboon), these seem nothing compared to what the Japanese would face. For example, the word "kanji" itself can mean: 1. feeling, sensation, impression; 2. Kanji (Chinese character); 3. manager, secretary; 4. inspector, superintendent; 5. smilingly. These are all written differently now.

A particularly bad example: "ko^ka" = 1. Faculty of Engineering; 2. consideration of services; 3. a high price; 4. an official price; 5. overhead, elevated; 6. merits and demerits; 7. effect, efficiency; 8. descent, fall; 9. the marriage of an Imperial princess to a subject; 10. mineralization; 11.
colloid degeneration, gelatination; 12. hardening, cementation, vulcanization, stiffening; 13. hard money, cash; 14. a leave of absence; 15. taxes; 16. an evil effect; 17. the Yellow Peril; 18. an unfortunate slip of the tongue; 19. amalgamation; 20. a school song; 21. a high-rise building.

This high degree of ambiguity is the combined result of two characteristics of Japanese. One is that there are, say, 1850 Kanji characters in common use, each having an independent semantic content and usually a one-syllable "reading", the so-called On reading, derived from the Chinese pronunciation. There may be more than one On reading, and there are some bisyllabic On readings. There is also a Kun (original Japanese) reading, which is completely unrelated (like On = "chu^", Kun = "hiru"), and which more often than not is polysyllabic, although most single syllables also occur as a Kun reading. I haven't counted them, but say there are about ninety syllables for readings of these 1850 characters, so typically a single syllable may be the reading of 20 different characters.

The second characteristic that is important here is the ease with which compound words are formed in Japanese, often by stringing some Ons together. Thus, all the different "ko^ka"s above are the result of combining a highly ambiguous "ko^" with a highly ambiguous "ka", and there are hundreds of other potential meanings for this compound beyond the few given above (culled from a dictionary). Written in Kanji, there is no ambiguity.

Not all words are that ambiguous if spelled in Romaji, but glancing through my dictionary I estimate that about one third to one half of the entries have the same romanization as another entry, and the number of clusters of four or five homonymous entries may be as many as one thousand (as I find one on almost every page of 1000 pages, sometimes two or three).
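
For anyone who wants to check the back-of-the-envelope numbers above, here is a rough Python sketch. The figures 1850 and 90 are the guesses from the text, not counts, and the product of the two syllable ambiguities is only an upper bound on "potential" compounds, since most pairs of characters do not form a real word. The function names and the tiny sample list are made up for illustration; the sample holds a few of the glosses quoted in this article and stands in for a real dictionary file.

```python
from collections import defaultdict

# Rough estimates from the text; neither number has been counted.
KANJI_IN_COMMON_USE = 1850
DISTINCT_READING_SYLLABLES = 90


def characters_per_syllable(kanji=KANJI_IN_COMMON_USE,
                            syllables=DISTINCT_READING_SYLLABLES):
    """Average number of characters sharing one syllable, rounded down."""
    return kanji // syllables


def potential_compounds(first, second):
    """Upper bound on two-character compounds with one given romanization."""
    return first * second


def homonym_clusters(entries, min_size=2):
    """Group (romanization, gloss) pairs; keep groups with >= min_size glosses."""
    groups = defaultdict(list)
    for romaji, gloss in entries:
        groups[romaji].append(gloss)
    return {r: g for r, g in groups.items() if len(g) >= min_size}


# Hypothetical toy sample built from glosses quoted in this article.
SAMPLE = [
    ("kanji", "feeling, sensation, impression"),
    ("kanji", "Kanji (Chinese character)"),
    ("kanji", "manager, secretary"),
    ("kanji", "inspector, superintendent"),
    ("kanji", "smilingly"),
    ("ko^ka", "a high price"),
    ("ko^ka", "hard money, cash"),
    ("ko^ka", "a school song"),
]

if __name__ == "__main__":
    per_syllable = characters_per_syllable()
    print(per_syllable)
    print(potential_compounds(per_syllable, per_syllable))
    print(sorted(homonym_clusters(SAMPLE, min_size=4)))
```

Run over a machine-readable dictionary, `homonym_clusters` with `min_size=4` would give a direct count to set against the page-by-page estimate of about one thousand clusters.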
It may be that I am overestimating the problem, and that context would disambiguate romanized Japanese well enough to make it acceptable for professional use. Perhaps a Japanese reader of this article may care to comment.

--
Lambert Meertens, CWI, Amsterdam; lambert@cwi.nl