Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!ccut!wnoc-tyo-news!dclsic!sjc!leia!harkcom
From: harkcom@spinach.pa.yokogawa.co.jp
Newsgroups: comp.std.c
Subject: Re: wchar_t values
Message-ID:
Date: 5 Apr 91 00:11:46 GMT
References: <1006@sranha.sra.co.jp> <15651@smoke.brl.mil>
Sender: news@leia.pa.yokogawa.co.jp
Organization: Yokogawa Electric Corporation, Tokyo, Japan
Lines: 52
In-reply-to: keld@login.dkuug.dk's message of 3 Apr 91 22:59:44 GMT

In article keld@login.dkuug.dk (Keld J|rn Simonsen) writes:

=}JIS X 0208 (basic Japanese 16-bit standard) /035/099

JIS X 0208 doesn't cover the ASCII characters. It has a double sized
(zenkaku) English character set though. 'c' in all three of the
popular multibyte encodings (EUC, JIS, SJIS) is 0x63 (same as ASCII).
The most common wide character format (UJIS) has 'c' as 0x0063
(ASCII in 2 bytes). I don't know the encodings for the Chinese &
Korean well, but the standards don't seem to cover 'c'...

=}None of these values have the nice property of having ASCII 'c'
=}extend into these values when loading as a 16-bit or 32-bit int.

See above...

=}think there is a problem
=}and they have not yet been able to solve it.

A problem with ISO 10646? A problem with the 'East-asian de jure'
character sets in reference to wchar_t?

=}Thus the internal widechar representation of 'c' and the external
=}multibyte representation SHOULD not be the same for character sets
=}like ISO 10646, JIS X 0208, KS C 5601 and GB 2312.
=}At least this should hold for characters in the C character set.

Huh? This doesn't follow... It doesn't even sound correct. A single
byte wide character set using values above 0x80 in addition to the
ASCII characters would become difficult...

=}The reason why the Japanese have not seen the problem before with
=}JIS X 0208, but first with 10646, is beyond my understanding.
=}Maybe some Japanese could enlighten us (me!) on this?

What 'problem' do the 'Japanese' see with ISO 10646?

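As a rough illustration of the earlier point that 'c' stays 0x63 in the
multibyte encodings and becomes 0x0063 in a 16-bit wide format, here is a
minimal C sketch. The names widen_ascii, ascii_widens_unchanged and
self_check are hypothetical, as is the 0xFFFF sentinel for bytes that would
need real EUC/JIS/SJIS decoding. It only models zero extension of 7-bit
ASCII and is not any actual library's conversion routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen one 7-bit ASCII byte to a 16-bit wide value
 * by zero extension. Bytes 0x80 and above belong to a multibyte sequence
 * and need encoding-specific decoding, so this sketch just returns an
 * assumed sentinel (0xFFFF) for them. */
uint16_t widen_ascii(unsigned char byte)
{
    return (byte < 0x80) ? (uint16_t)byte : (uint16_t)0xFFFF;
}

/* Hypothetical helper: returns 1 if every byte of a NUL-terminated string
 * keeps its numeric value after widening, 0 otherwise. */
int ascii_widens_unchanged(const char *s)
{
    for (; *s != '\0'; ++s) {
        unsigned char b = (unsigned char)*s;
        if (widen_ascii(b) != (uint16_t)b)
            return 0;
    }
    return 1;
}

/* Small self-check: 0x63 is ASCII 'c'; its wide form is 0x0063. */
void self_check(void)
{
    assert(widen_ascii(0x63) == 0x0063);
    assert(ascii_widens_unchanged("abc") == 1);
}
```

Calling self_check() from a test program exercises both helpers. A string
containing a byte such as 0xA4 (a lead byte in EUC) makes
ascii_widens_unchanged return 0, since that byte is not plain ASCII.
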
=}>No wonder there has been renewed interest in other standards such as
=}>"Unicode" (about which I know little at present other than that it
=}>has a broad base of industry support).
=}
=}Now you are talking about things that you know very little of, Doug!

Speaking of 'harsh tones'... Your apparent knowledge of the JIS
standard shows you have little room to point...

Al