Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!mcnc!xanth!kent
From: kent@xanth.UUCP (Kent Paul Dolan)
Newsgroups: comp.lang.c
Subject: Re: What is a byte
Message-ID: <2068@xanth.UUCP>
Date: Tue, 11-Aug-87 20:07:12 EDT
Article-I.D.: xanth.2068
Posted: Tue Aug 11 20:07:12 1987
Date-Received: Fri, 14-Aug-87 04:22:14 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP>
Reply-To: kent@xanth.UUCP (Kent Paul Dolan)
Distribution: world
Organization: Old Dominion University, Norfolk Va.
Lines: 26
Keywords: 32 bit bytes! You ain't seen nothin', yet.

In article <34@piring.cwi.nl> lambert@cwi.nl (Lambert Meertens) writes:
>In article <2034@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>)
>) While we're developing nightmares about the number of bits the Japanese
>) need in a char, remember for text processing that for 1 billion of the
>) earth's residents, the smallest unit of text processing is the ideograph,
>) and that even 21 bits is probably barely sufficient to represent the number
>) of written words in Chinese.
>
>Are you suggesting that there are more than 2**20 = 1048576 different
>written words in Chinese? At typically 60 entries on a page, their
>dictionaries must have then some 17500 pages or more. I think that 16 bits
>are enough to accommodate all Chinese characters, and certainly ample for
>the about 5000 that are in actual use.
>--
>Lambert Meertens, CWI, Amsterdam; lambert@cwi.nl

Surely not! My own English active/passive vocabularies are 100,000/250,000
words. The Oxford Dictionary of the English Language fills a five foot book
shelf and contains well over a million entries. The Chinese have had a LONG
time to work with a written language; I would expect their numbers to exceed
these. There seems little chance that 65536 ideographs would suffice.

(Comments from members of the Chinese community who know the answers to such
questions would save a lot of fruitless debate here! ;-)

Kent, the man from xanth.