Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!mintaka!spdcc!ima!haddock!karl
From: karl@haddock.ima.isc.com (Karl Heuer)
Newsgroups: comp.std.c
Subject: mb functions when size is 0
Message-ID: <18433@haddock.ima.isc.com>
Date: 4 Oct 90 23:15:58 GMT
Reply-To: karl@kelp.ima.isc.com (Karl Heuer)
Organization: Interactive Systems, Cambridge, MA 02138-5302
Lines: 31

What should the program

    #include <stdio.h>
    #include <stdlib.h>
    int main(void) {
        printf("%d %d %d %d\n",
            mbtowc((wchar_t *)NULL, "x", 0), mblen("x", 0),
            mbtowc((wchar_t *)NULL, "", 0), mblen("", 0));
        return 0;
    }

output?  (You may assume there are no shift states, and that "x" is a valid
multibyte character of length 1.)

My opinion is "-1 -1 -1 -1", but the Standard seems self-contradictory here.

For the first two values, I claim that the functions return -1 because one
cannot form a valid multibyte character out of zero bytes.

For the third value, I note that 4.10.7.2 says that "at most n bytes will be
examined", and it is impossible for "x" and "" to produce different results
without examining the first byte.  Since 4.10.7.1 states that mblen() is
equivalent to this form of mbtowc(), I claim that the third and fourth values
should be the same.

On the other hand, the RETURNS section of 4.10.7.1 has a clause for *s=='\0'
before discussing n, and so a literal reading would seem to require that
mblen("", 0) return 0.  If so, what would this imply about the other three?
Or is the behavior undefined, and if so, why?

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
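
For concreteness, the "-1 in all four cases" reading can be written down as a thin wrapper. This is only a sketch under that one interpretation: the name strict_mblen is hypothetical and appears nowhere in the Standard, and the n == 0 branch is precisely the assumption under dispute, not a settled answer.

```c
#include <stdlib.h>

/* Hypothetical wrapper (not part of the Standard): encodes the reading
   that no valid multibyte character can be formed from zero bytes. */
int strict_mblen(const char *s, size_t n)
{
    if (s == NULL)
        return mblen(NULL, 0);  /* shift-state query, passed through unchanged */
    if (n == 0)
        return -1;              /* assumption: s must not be examined at all */
    return mblen(s, n);         /* otherwise defer to the implementation */
}
```

Under this wrapper, "x" and "" are indistinguishable when n is 0, which is what "at most n bytes will be examined" seems to demand. Whether a conforming mblen() is required to behave the same way is the open question.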