Path: utzoo!attcan!uunet!mcsun!ukc!stc!inset!mikeb
From: mikeb@inset.UUCP (Mike Banahan)
Newsgroups: comp.std.c
Subject: Multibyte characters
Message-ID: <1467@inset.UUCP>
Date: 3 Jul 90 13:47:58 GMT
Reply-To: mikeb@inset.co.uk (Mike Banahan)
Organization: The Instruction Set Ltd., London, UK.
Lines: 36

On the interesting subject of wide characters, multibyte characters
and so on, I haven't noticed a discussion in this group which touches
on the following.

Let's say that I do have a multibyte execution character set which
supports, for the sake of argument, English and Greek, with Greek using
a shift-in shift-out mechanism.

A string of the form "abc@d" is valid C (using @ to represent the
Greek character `alpha'). It will contain 8 bytes, counting the
shift-in, shift-out and the null at the end.

Presumably the integral constant '@' is a three-byte constant, no
matter what it may look like? An alternative interpretation is that it
violates the constraint in 2.2.1.2 `a .. character constant .. shall
begin and end in the initial shift state', but presumably I can expect
my implementation to do the necessary good deeds and put a shift-out
in there too.

Since it is a three-byte constant (assuming I'm right), can I be sure
that I do not get overflow when I assign it to a char variable?
3.1.3.4 says that the value of a multi-character character constant
is implementation-defined, and 3.2.1.2 says (paraphrase) that demoting
an int to a char gives an implementation-defined result. So to call it
`overflow' is perhaps overstating the case, but I clearly end up in
implementation-defined territory twice over.

Sorry if this has been discussed before. If not, could someone
enlighten me as to the actual situation?

Thanks in advance,
Mike Banahan
-- 
Mike Banahan, Technical Director, The Instruction Set Ltd.
mcvax!ukc!inset!mikeb
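
The 8-byte count for "abc@d" can be checked with a minimal sketch like the one below. It is only an illustration under stated assumptions: the post does not say which byte values perform the shifts or how alpha is encoded, so the values 0x0E and 0x0F, the use of the byte `a` for alpha in the shifted state, and the names `TO_GREEK`, `TO_LATIN`, `sample` and `count_characters` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shift bytes; the post does not specify actual values. */
#define TO_GREEK '\x0e'   /* leave the initial shift state */
#define TO_LATIN '\x0f'   /* return to the initial shift state */

/*
 * "abc@d" spelled out byte by byte, assuming alpha is the byte 'a'
 * while shifted. The pieces are separate literals so that the hex
 * escapes do not swallow the letters that follow them.
 * Bytes: a b c <shift> alpha <unshift> d NUL = 8.
 */
static const char sample[] = "abc" "\x0e" "a" "\x0f" "d";

/*
 * Count characters (not bytes) in a string that uses the shift bytes
 * above. Returns -1 if the string does not end in the initial shift
 * state, which is the condition the quoted constraint is about.
 */
static long count_characters(const char *s)
{
    long count = 0;
    int shifted = 0;   /* 0 means the initial shift state */

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (*s == TO_GREEK) {
            shifted = 1;
        } else if (*s == TO_LATIN) {
            shifted = 0;
        } else {
            ++count;   /* every non-shift byte is one character here */
        }
    }
    return shifted ? -1 : count;
}
```

Under these assumptions `sizeof sample` is 8 while `count_characters(sample)` reports 5 characters, which is the gap between byte count and character count that the question turns on. The sketch says nothing about the value of the character constant '@' itself, since that remains implementation-defined.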