Xref: utzoo comp.std.internat:831 comp.protocols.tcp-ip:15601
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!decwrl!mcnc!uvaarpa!murdoch!usenet
From: randall@Virginia.EDU (Randall Atkinson)
Newsgroups: comp.std.internat,comp.protocols.tcp-ip
Subject: Re: universality of Latin-1
Message-ID: <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>
Date: 10 Apr 91 17:27:56 GMT
References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp>
Sender: usenet@murdoch.acc.Virginia.EDU
Organization: University of Virginia
Lines: 60

John Gilmore originally wrote:
% And my windows all use ISO Latin 1.  If Torbj|rn would send the
% umlauted letter in that standardized character set, it would look right
% in both the States and in Sweden.

In article <1110@sranha.sra.co.jp>, Erik M. van der Poel responded:
>Have you ever tried to send yourself a message in Latin-1? Did it
>work? And even if *you* have a reasonable version of sendmail (one
>that doesn't strip the 8th bit), what makes you so certain that
>Torbj|rn's message and anyone else's won't pass through a site that
>*does* strip the 8th bit?

It does work for a fair and ever-increasing subset of the Internet.
BITNET doesn't do very well with it.

Clearly we need to move towards 8-bit, 16-bit, and 32-bit transparent
mail transport mechanisms.  Fortunately there are a number of possible
transport mechanisms out there to choose from, some of which are
already 8-bit transparent.

>Also, what's so "standardized" about ISO Latin-1? What makes it more
>standard than, say, Latin-2?

ISO 8859/1 is NOT any "more standard" than ISO 8859/2.  However, sites
in the US are in fact migrating towards ISO 8859/1 from US ASCII, and
most sites in the US are NOT migrating towards ISO 8859/2 (though they
might support it on the side as vendors begin to).
The languages that are most commonly used in the US are in ISO 8859/1,
and the languages supported by ISO 8859/2 are less commonly used
(again, in the US as a whole).

Note that ISO Latin-1 is ISO 8859/1, which is the 8-bit character set
used for Western European languages.  ISO Latin-2 is ISO 8859/2, which
is the 8-bit character set for Eastern European languages.

Clearly we need to add additional information to the header of mail
messages to indicate which character set to use.  I'm not sure of the
current state of the Internet protocols (RFC 822 et al.) with respect
to this.  If there isn't the equivalent of a "Character-set:" header
yet, serious consideration should be given to adding one with clearly
defined values for at least the existing ANSI and ISO character sets.

Character sets that should have a defined string to use with such
a header field include at least:

        ASCII
        ISO 8859/1
        ...
        ISO 8859/N  (where N is the last defined set)
        ISO 10646   (once it gets completed)

The Internet is the dominant mail transport network at present,
partly because so many other networks gateway with it.  Getting
the Internet to support such needs would be a big step in the
right direction.

Perhaps someone on the IETF can comment on their current activities
in this area?

Ran Atkinson
randall@Virginia.EDU
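
To make the two problems discussed above concrete (relays that strip
the 8th bit, and mail that does not say which character set it uses),
here is a minimal Python sketch.  It is an illustration only, not part
of RFC 822 or any other mail standard: the "Character-set:" header is
the hypothetical one suggested in this post, and the label strings,
the CODECS table, and the function names are all assumptions.

```python
# Illustrative sketch only; header name, labels, and helpers are assumptions.

# Hypothetical header label -> Python codec name, following the list above.
CODECS = {
    "ASCII": "ascii",
    "ISO 8859/1": "iso8859_1",
    "ISO 8859/2": "iso8859_2",
}


def strip_8th_bit(data: bytes) -> bytes:
    """Simulate a 7-bit mail relay that clears the high bit of every octet."""
    return bytes(b & 0x7F for b in data)


def label(body: str, charset: str) -> bytes:
    """Encode body and prefix it with a hypothetical Character-set: header."""
    header = f"Character-set: {charset}\n\n".encode("ascii")
    return header + body.encode(CODECS[charset])


def read(message: bytes) -> str:
    """Decode a labeled message using the character set named in its header."""
    header, _, body = message.partition(b"\n\n")
    charset = header.decode("ascii").split(": ", 1)[1]
    return body.decode(CODECS[charset])


if __name__ == "__main__":
    name = "Torbj\xf6rn"  # o-umlaut is octet 0xF6 in ISO 8859/1
    # A 7-bit relay turns 0xF6 into 0x76, which is the letter "v".
    print(strip_8th_bit(name.encode("iso8859_1")).decode("ascii"))
    # With a label and an 8-bit transparent path, the text round-trips.
    print(read(label(name, "ISO 8859/1")) == name)
    # The same octet is a different letter in 8859/1 and 8859/2.
    print(ascii(bytes([0xE8]).decode("iso8859_1")),
          ascii(bytes([0xE8]).decode("iso8859_2")))
```

The last line is the reason a label is needed at all: without one, a
receiver cannot tell whether octet 0xE8 was meant as the Latin-1 or
the Latin-2 letter, even if every relay on the path is 8-bit clean.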