Xref: utzoo news.software.b:6443 soc.culture.lebanon:71 soc.culture.vietnamese:1493
Path: utzoo!utgpu!watserv1!watmath!att!tut.cis.ohio-state.edu!cs.utexas.edu!yale!umich!terminator!pisa.ifs.umich.edu!rees
From: rees@pisa.ifs.umich.edu (Jim Rees)
Newsgroups: news.software.b,soc.culture.lebanon,soc.culture.vietnamese
Subject: Re: New USENET header: Language
Message-ID: <4ec122f7.1bc5b@pisa.ifs.umich.edu>
Date: 22 Dec 90 20:58:39 GMT
References: <1990Dec22.081718.2109@looking.on.ca>
Sender: usenet@terminator.cc.umich.edu (usenet news)
Reply-To: rees@citi.umich.edu (Jim Rees)
Followup-To: news.software.b
Organization: University of Michigan IFS Project
Lines: 34

In article <1990Dec22.081718.2109@looking.on.ca>, brad@looking.on.ca
(Brad Templeton) writes:

    I propose a new USENET header item, namely "Language:"  The default,
    for historical reasons, would be "Language: English" but other
    fields would be fine.

    And sorry to be so anglo-centric, but I suspect that the language
    names should be the English names for the languages, since it is
    consistent with the use of English header names.

I like this a lot, but it's really just the tip of the iceberg.  We need
a way to feed non-latin scripts through the net.  Way back in 1983 I
posted some patches to make B news 8-bit safe.  Almost all new news
transport is now done either by nntp or uucp, both of which are 8-bit
safe, so it's just a matter of making sure your relay program doesn't
strip bits.  Are modern B-news and C-news 8-bit safe?

The other half of this is fixing the reading and posting programs.  For
example, xrn could be fixed so that if it sees "Language: Japanese" in
the header, it switches to a text widget that can display Japanese
text.  This would require agreement on which of the Japanese text
encoding schemes to use (EUC, JIS, etc.) but this shouldn't be hard if
we discuss it here and just pick one.  There are already text widgets
to display Japanese so hooking one up to xrn shouldn't be hard.
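
To make the two halves concrete, here is a rough sketch in Python of what the reader side and a relay check might look like.  It is only an illustration: the table ASSUMED_ENCODINGS, the function names, and the choice of EUC for Japanese are all made up for the example (the encoding is exactly the thing that would still need agreement), and it assumes articles use bare LF line endings.

```python
from email.parser import BytesParser  # stdlib parser for RFC 822-style headers

# Hypothetical table: the encoding a reader assumes for each Language value.
# EUC-JP for Japanese is a placeholder, not an agreed convention.
ASSUMED_ENCODINGS = {
    "english": "ascii",
    "japanese": "euc_jp",
}


def article_language(raw: bytes) -> str:
    """Return the Language header value, defaulting to English if absent."""
    headers = BytesParser().parsebytes(raw, headersonly=True)
    return headers.get("Language", "English").strip()


def decode_body(raw: bytes) -> str:
    """Decode the article body using the encoding assumed for its language."""
    encoding = ASSUMED_ENCODINGS.get(article_language(raw).lower(), "ascii")
    # The body starts after the first blank line (LF line endings assumed).
    _, _, body = raw.partition(b"\n\n")
    # Undecodable bytes become U+FFFD instead of raising an error.
    return body.decode(encoding, errors="replace")


def strips_high_bit(relay, sample: bytes = bytes(range(128, 256))) -> bool:
    """Return True if the relay callable alters bytes with the top bit set."""
    return relay(sample) != sample
```

A real reader would hand the decoded text to a widget that can display it.  The relay check just pushes every byte value from 128 to 255 through a relay function and sees whether they come out unchanged.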

Right now there is a debate going on in soc.culture.lebanon on the best
way to write Arabic using a latin script.  That misses the boat, as I
see it.  What we should be doing is sending and displaying actual Arabic
characters, not some romanised bastardisation.  This has already
happened in soc.culture.vietnamese, where postings are full of
diacriticals following the letters they should go over: "da^'u na(.ng"
is a poor substitute for real, written Vietnamese.

I'll shut up now since I'm not volunteering to do any real work.