Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utcsri!utegc!utai!garfield!dalcs!mnetor!uunet!seismo!husc6!cca!mirror!ima!haddock!karl
From: karl@haddock.UUCP
Newsgroups: comp.lang.c,comp.std.internat
Subject: Re: What is a byte
Message-ID: <899@haddock.ISC.COM>
Date: Fri, 7-Aug-87 16:33:47 EDT
Article-I.D.: haddock.899
Posted: Fri Aug 7 16:33:47 1987
Date-Received: Sun, 9-Aug-87 18:42:46 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP>
Reply-To: karl@haddock.ima.isc.com (Karl Heuer)
Organization: Interactive Systems, Boston
Lines: 41
Xref: utgpu comp.lang.c:3377 comp.std.internat:82

[I probably should have included comp.std.internat earlier, but I didn't
think of it.  c.s.internat readers can pick up context from comp.lang.c if
desired.]

In article <6216@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <851@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>[For example,] on a bit-addressible machine in an Arabic- or Japanese-
>>language environment, one might have "short char" be 1 bit, "char" be 8,
>>and "long char" be 16.
>
>... I would prefer that a (char) be capable of holding an entire basic
>textual unit, since many applications are already based on that assumption.
>...might as well simply make (char) be the right thing and not introduce a
>new type. ... most international implementations could make (short char)
>8 bits and (char) or (long char) 16 bits.
>
>>If this is to be phased in without breaking a lot of programs, X3J11 should
>>immediately bless all three names, but insist that they all be the same size.
>>(Which restriction should be deprecated, to be removed in the next standard.)
>
>I don't think it's within the realm of practical politics to say that the
>problem will not be solved until the next issue of the standard.

The problem with your proposal is that it would break existing code that
assumes sizeof(char) == 1.
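
A minimal sketch of the kind of code at risk may help here.  It is an
illustration only, not taken from any real program, and both function names
are hypothetical.  The first routine passes a character count straight to
malloc() and memcpy(), which is only correct when sizeof(char) == 1; the
second scales the count by sizeof(char).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Typical existing code: allocates one storage unit per character,
 * silently relying on sizeof(char) == 1.  (Hypothetical name.) */
char *copy_assuming_unit_char(const char *s)
{
    size_t n;
    char *p;

    assert(s != NULL);
    n = strlen(s) + 1;          /* characters, including the '\0' */
    p = malloc(n);              /* count used directly as a size */
    if (p != NULL)
        memcpy(p, s, n);        /* same assumption again */
    return p;
}

/* Defensive variant: scales the character count by sizeof(char), so it
 * would stay correct if a char ever occupied more than one unit.
 * (Hypothetical name.) */
char *copy_scaled(const char *s)
{
    size_t n;
    char *p;

    assert(s != NULL);
    n = strlen(s) + 1;
    p = malloc(n * sizeof(char));
    if (p != NULL)
        memcpy(p, s, n * sizeof(char));
    return p;
}
```

Where sizeof(char) is 1 the two routines behave identically.  If (char)
were widened to 16 bits, and assuming sizeof then counted in 8-bit
(short char) units, only the second would allocate enough storage.
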
If a user wants to write a portable program that refers to objects smaller
than 16 bits%, he can't use (short char) because existing compilers won't
accept it, and he can't use (char) because new ones might make it too big.
That's why I suggested the temporary restriction.

Also, in the world of international text processing I don't think we have all
the questions yet, let alone the answers.  I figure X3J11 should take care of
one thing we do know (that "char" as commonly implemented nowadays won't
suffice) and pave the way for a real fix later.

(Hmm.  If I were a Japanese user, using a VAX, and I was told that, because
Japanese characters require more than 8 bits, and because (char) is the
obvious datatype for characters, and because C requires that nothing be
smaller than (char), my compiler couldn't address individual bytes, then I
think I'd start looking for a new vendor or a new programming language.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
%Assuming the implementation allows such an object to exist at all.