Path: utzoo!utgpu!watserv1!watmath!att!tut.cis.ohio-state.edu!cs.utexas.edu!sdd.hp.com!uakari.primate.wisc.edu!caen!umich!terminator!pisa.ifs.umich.edu!rees
From: rees@pisa.ifs.umich.edu (Jim Rees)
Newsgroups: news.software.b
Subject: Re: New USENET header: Language
Message-ID: <4ef9aa39.1bc5b@pisa.ifs.umich.edu>
Date: 3 Jan 91 02:38:05 GMT
References: <1990Dec29.093002.10739@lth.se> <91B+H%A@b-tech.uucp> <1990Dec30.095938.23011@lth.se>
Sender: usenet@terminator.cc.umich.edu (usenet news)
Reply-To: rees@citi.umich.edu (Jim Rees)
Organization: University of Michigan IFS Project
Lines: 39

In article <1990Dec30.095938.23011@lth.se>, Dan@dna.lth.se (Dan Oscarsson) writes:

   ISO 10646 defines sequences for switching character length and
   subset of the total set. This allows letters using ASCII or
   ISO 8859-1 to be sent exactly as today with not one character more.
   ISO 10646 can use 8, 16, 24 and 32 bits per character and can
   dynamically change between them, so the standard defines a way to
   efficiently send text.

I'll have to plead guilty to getting us off track. Brad just wanted
some way to skip articles written in French because he doesn't read French.
I now understand that this is orthogonal to the issue of character sets.

I think we should:

1. Introduce a "Language:" header. This wouldn't affect how your news
   reader displays text. You could put "Language: French" in your kill
   file if you want.

2. Adopt a standard for character set encoding. This should be easy,
   since there are so many to choose from. ISO 10646 looks good to me,
   if it isn't too hard to implement.

3. For those of us with X terminals, modify the Athena text widget so
   that it understands and can display the encoding selected in (2)
   above. This widget could then be plugged into xrn or your favorite
   X-based news reader. It could be plugged into your mailer, too.
   Unfortunately xterm doesn't use the text widget. Maybe someone
   already has an ISO 10646 widget -- anyone know?

4. People with ASCII terminals will need a reader that transliterates
   to ASCII using some kind of table. I'm not going to worry about this.

5. I don't know what to do about input. I assume there are standards
   for this too.

6. Per Brad's suggestion, we should start sending binary stuff as 8 bit
   binary. News propagates it just fine. It's a horrible waste of
   bandwidth to uuencode all that stuff. We'll want to fix our
   readers/posters to deal with this.
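
To make points 1 and 4 concrete, here is a rough sketch in Python of what a reader might do. Everything in it is an assumption for illustration: the "Language:" header is only a proposal, the names KILL_LANGUAGES, TRANSLIT, parse_article, should_skip and to_ascii are made up, and the transliteration table is a tiny sample rather than any standard mapping. It also ignores continuation lines in headers.

```python
# Hypothetical sketch: the "Language:" header is only proposed above, and
# the table below is a small illustrative sample, not a standard mapping.

KILL_LANGUAGES = {"french"}  # languages this reader wants to skip

# Sample transliteration table: a few ISO 8859-1 letters mapped to ASCII.
TRANSLIT = {
    "\u00e9": "e",   # e with acute
    "\u00e8": "e",   # e with grave
    "\u00e7": "c",   # c with cedilla
    "\u00e5": "a",   # a with ring
    "\u00e4": "a",   # a with diaeresis
    "\u00f6": "o",   # o with diaeresis
    "\u00fc": "u",   # u with diaeresis
    "\u00df": "ss",  # sharp s
}


def parse_article(text):
    """Split an article into a header dict (lower-cased names) and a body."""
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers, body


def should_skip(headers):
    """True if the article declares a language listed in KILL_LANGUAGES."""
    return headers.get("language", "").lower() in KILL_LANGUAGES


def to_ascii(body):
    """Transliterate via the table; unknown non-ASCII characters become '?'."""
    out = []
    for ch in body:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(TRANSLIT.get(ch, "?"))
    return "".join(out)
```

A reader would call parse_article on each article, drop it if should_skip says so, and run to_ascii over the body only when the terminal can't display the original characters. Note that the language test and the character handling never touch each other, which is the orthogonality mentioned above.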
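
For a rough sense of the waste in point 6: uuencode turns every 3 bytes into 4 printable characters, and each output line (at most 45 input bytes) also carries a length character and a newline. The sketch below measures one full line using Python's binascii.b2a_uu. It ignores the "begin" and "end" lines and any article headers, so treat the figure as an approximation.

```python
import binascii

# One full uuencoded line carries 45 bytes of input.
chunk = bytes(45)
line = binascii.b2a_uu(chunk)  # length char + 60 data chars + newline

raw_len = len(chunk)
encoded_len = len(line)
overhead = encoded_len / raw_len - 1

print(f"{raw_len} raw bytes -> {encoded_len} encoded bytes "
      f"({overhead:.0%} larger)")
```

So an article posted as 8 bit binary would be a bit more than a quarter smaller than the same data uuencoded.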