Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!yale!mintaka!bloom-beacon!eru!hagbard!sunic!lth.se!newsuser
From: Dan@dna.lth.se (Dan Oscarsson)
Newsgroups: news.software.b
Subject: Re: New USENET header: Language
Message-ID: <1990Dec30.095938.23011@lth.se>
Date: 30 Dec 90 09:59:38 GMT
References: <1990Dec29.093002.10739@lth.se> <91B+H%A@b-tech.uucp>
Sender: newsuser@lth.se (LTH network news server)
Organization: Computer Science, Lund Institute of Technology, Sweden
Lines: 28

In article <91B+H%A@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>>The only character set needed is ISO 10646. It covers nearly every character
>>in the world.
>
>I'll agree that using just ISO 10646 would be simple and would solve
>the current problems with other language use on usenet. Even so,
>"nearly" in the second sentence points out the potential problem with
>the first sentence.
>

True, but very few characters are missing today, and there is plenty of
space left that can be filled in with the missing characters in a revision
of the standard.

>The efficiency issue would have to be addressed also. You would probably
>end up with some compression method (which is what an escape sequence to
>switch char sets is).

ISO 10646 defines sequences for switching the character length and the
subset of the total set in use. This allows text using ASCII or ISO 8859-1
to be sent exactly as it is today, with not one character more.
ISO 10646 can use 8, 16, 24, or 32 bits per character and can switch
between them dynamically, so the standard already defines a way to send
text efficiently.

    Dan

--
Dan Oscarsson
Department of Computer Science
Lund Institute of Technology            e-mail: Dan@dna.lth.se
Box 118
S-221 00 Lund, Sweden