Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!rutgers!umnd-cs!umn-cs!ems!mark
From: mark@ems.MN.ORG (Mark H. Colburn)
Newsgroups: comp.unix.wizards
Subject: Re: Size of SysV "block" (really: byte != 8 bits)
Message-ID: <413@ems.MN.ORG>
Date: Fri, 31-Jul-87 10:19:58 EDT
Article-I.D.: ems.413
Posted: Fri Jul 31 10:19:58 1987
Date-Received: Sun, 2-Aug-87 04:46:09 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP>
Reply-To: mark@ems.UUCP (Mark H. Colburn)
Organization: EMS/McGraw-Hill, Eden Prairie, MN
Lines: 35

In article <6156@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) ) writes:
>In article <463@unisoft.UUCP> greywolf@unisoft.UUCP (The Grey Wolf @ ext 165) writes:
>>I see nothing wrong [with] eight bits for a character.
>I take it you don't pay much attention to the rest of the world, then.

I often see flaming with absolutely no explanation of why the original
poster was wrong. This is one of those cases. Rather than just saying
that an opinion is wrong, it would help to explain why it is wrong, so
that the original poster (hopefully) learns from his mistake.

Doug is right, of course. More than eight bits are needed to represent
characters in other languages. The most glaring example is Japanese,
where Kanji alone runs to literally thousands of characters. Obviously,
it would be very difficult to express that in 8 bits :-).

Less obvious examples are German, Norwegian, French and Greek. All of
these languages, and others as well, use letters that plain ASCII does
not provide. For example, German has a, o and u with an umlaut; French
has c with a cedilla and vowels with a circumflex (^), accent grave (`)
or accent aigu ('); Norwegian has the ae ligature; and Greek uses a
different alphabet altogether.

None of these characters is in the standard 7-bit ASCII character set.
Many of them are handled by 8-bit extensions to ASCII or by some other
character set standard; however, 8 bits are not enough for some of the
glyph-oriented writing systems.

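The arithmetic behind that claim can be sketched in a few lines. This is
only an illustration, not anything taken from the papers or standards
mentioned in this thread: the function names and the repertoire sizes
are made up for the example, and ord() returns Unicode code points, a
numbering that is later than this discussion and is used here purely as
a convenient way to give each character a number.

```python
def bits_needed(repertoire_size: int) -> int:
    """Smallest number of bits that gives every character its own code."""
    if repertoire_size < 1:
        raise ValueError("repertoire must contain at least one character")
    # Codes run from 0 to repertoire_size - 1, so the width of the
    # largest code is the width every code has to fit in.
    return (repertoire_size - 1).bit_length()


def fits_in_byte(code: int, bits_per_byte: int = 8) -> bool:
    """True if a character code fits in a byte of the given width."""
    return 0 <= code < (1 << bits_per_byte)


if __name__ == "__main__":
    # Illustrative repertoire sizes (assumptions, not measured figures).
    for label, size in [
        ("7-bit ASCII", 128),
        ("8-bit extension", 256),
        ("a few thousand glyphs", 2_000),
        ("a large glyph dictionary", 50_000),
    ]:
        print(f"{label:26s} {size:6d} characters -> {bits_needed(size)} bits")

    # u with an umlaut fits in 8 bits but not in 7; a Kanji does not fit in 8.
    u_umlaut = ord("\u00fc")
    kanji = ord("\u6f22")
    print(fits_in_byte(u_umlaut, 7), fits_in_byte(u_umlaut, 8), fits_in_byte(kanji, 8))
```

The point of the sketch is that each added bit only doubles the number
of codes, so an 8-bit extension covers the accented European letters
but a repertoire of thousands of glyphs needs a wider code.
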
If you would like more information on this topic, a number of good
papers on it have been presented at USENIX and have appeared in many of
the trade journals. In addition, it is addressed in the proposed POSIX
standard.
-- 
Mark H. Colburn          DOMAIN: mark@ems.MN.ORG
EMS/McGraw-Hill          UUCP:   ihnp4!meccts!ems!mark
                         AT&T:   (612) 829-8200