Path: utzoo!attcan!uunet!mcvax!ukc!dcl-cs!aber-cs!pcg
From: pcg@aber-cs.UUCP (Piercarlo Grandi)
Newsgroups: comp.lang.c
Subject: Re: signed/unsigned char/short/int/long
Summary: Why "signed char" is "better" than "char int" ? Question remains...
Message-ID: <347@aber-cs.UUCP>
Date: 8 Dec 88 13:18:58 GMT
References: <264@aber-cs.UUCP> <8982@smoke.BRL.MIL> <8983@smoke.BRL.MIL> <277@aber-cs.UUCP> <225@twwells.uucp> <330@aber-cs.UUCP> <9086@smoke.BRL.MIL>
Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Distribution: eunet,world
Organization: CS Dept., University College of Wales, Aberystwyth, UK
Lines: 167
X-Disclaimer: Any statement is purely personal.

In article <9086@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) ) writes:

    You're still inaccurate. "far" and "near" were never at any time in any draft of the proposed ANSI C standard. "noalias" was not in any draft other than the one sent out for the second public review. How could it be, when it had just been invented at the previous meeting and was retracted at the next?

Just a moment! I have just apologized saying that my inaccuracy was to imply that near and far eventually made it into dpANS. I acknowledge that they did not make it there, still at one time (fairly long ago) they were considered to be going to be part of ANSI C (yes, I know that ANSI C does not yet exist), and the usual trade interests advertised (they shouldn't have done it, I know) these as ANSI conforming features of their compiler. I am pleased that, along with noalias (that did eventually make it into just one official dpANS issue) X3J11 had enough sense to avoid them.

    Are you just making this stuff up, or do you have drug-using advisors, or what?

Maybe I have seen too many Laurel & Hardy films ["... Look at what you made me do!"] and I cannot keep them too well distinct from X3J11 work :-) :-).

More seriously, I have been using C for 8-9 years now, and following X3J11 for many years as well.
The attention I have devoted to following X3J11 in later years has not been great, as disappointment set in (volatile? signed? reserved word functions? structural equivalence only if different compilation units? etc...).

I would also like to add that you are right to ask that X3J11 be held to account only for what has been perpetrated in the latest official, published version of dpANS, but I am right too to raise again the specter of old issues that have been discussed quite seriously, as they are part of the full picture. When you want to run for President, after all, you know that people will look at whether you stole cookies when you were twelve :-)...

    >As to the last point, char has been so far just a short short; a char
    >value can be operated upon exactly as an integer.

    Except that whether it acts as signed or unsigned depends on the implementation.

Gee, I see you have indeed read the Classic edition of K&R. Let me nitpick. I said "integer", not "int", and for once I was accurate :-). You know the meaning of "integer" and "integral". What I was saying is that I cannot really see a strong enough difference in the semantics of "char" and "int/unsigned" to need an "integral" class distinct from "integer"; I think that the Classic C book can be slightly reinterpreted or amended to make char belong to "integer" (approximately, just as a modifier on int/unsigned).

The other problem with the Classic C book (that is, apart from distinguishing between "integer" and "integral"), and you seem to have understood it correctly :-), is that it only defined "char", whose signedness was implementation dependent, and "unsigned char".

What I am asking is why X3J11 did not legalize the combination "int char", hitherto not legal, but accepted by some popular compilers because of an easily explained benign mistake, to mean "signed char", WITHOUT the introduction of a new keyword with further complication of the rules for declarations.
I cannot believe they did not think of it...

A related but distinct issue, that fits in nicely with the first, is why it has not been stipulated that there are two integral types, int/unsigned, with different arithmetic properties, and three optional lengths for either of them, char/short/long, instead of writing up tables of permitted combinations, which are somewhat more complex, and less clear as to the fundamental difference in semantics between unsigned and int.

    >Historically char constants have been really the size of integer
    >constants...

    You mean "character constants"; in C they ARE integer constants specified in a certain character-oriented way.

Exactly, thank you for the nitpicking. I used this point to show that "philosophically" char in C is just a shorter type of integer. This is not surprising, considering that C is a descendant of BCPL (whose single most annoying feature is having to use putbyte() and getbyte() for string manipulation, as it has just one length of integer). In a sense C is a wonderfully balanced mix, BCPL with quite a good lot of Algol68 thrown in, and this shows through in things like some semantics (BCPL-ish) of integer types, and their syntax (Algol68-ish).

I can say this having studied in depth (several years ago) both Algol68 and BCPL; it is a pity that so many C programmers don't know either, and miss the pleasure of contemplating some important threads of history (e.g. BCPL and Algol68 are themselves related by way of CPL).

    >Now I reiterate the question: why was a new keyword introduced "signed"
    >when it just sufficed to sanction the existing practice of some
    >compilers (PCC had it, more recent BSD versions fixed this "bug") to
    >say "int char" or better "char int"?

    I have never seen a C compiler that accepted "int char";

Well, you have seen few, I surmise, or you never tried (more likely, I admit).
As I explained, the fact that some (or even several) compilers do accept "int char" is the result of an easily made mistake in a particular, but popular, parsing strategy for C declarations.

    certainly Ritchie didn't intend for it to be valid. Also, char has never been guaranteed to be signed; read K&R 1st Edition.

I am pleased that we do agree on something; indeed Ritchie never intended it to be valid, and he carefully did not specify the default signedness of char. I am also pleased that you have actually read Classic K&R, and not just the less delectable works from X3J11.

    It happened to be most efficient on the PDP-11 to make it signed, ...

    [ well known list of machines and defaults for char signedness omitted ]

    ... implementation dependence in his BSTJ article on C in 1978.

You even read the BSTJ! My, you must be quite a learned fellow. If so, you will also know that char is by default unsigned also in some 68K compilers, while most Intel compilers have it signed. Incidentally, I have even seen two compilers for the same architecture (68k) implement a different default!

Unfortunately, your precious information (that can be found, by the way, in a table in any Classic K&R book) is beside my point. Also, I am not entirely surprised/amused at your repeated assumptions that nobody has bothered to read the Classic C book.

[ By the way, for the benefit of our audience, I will add that many Classic C and Unix articles from various BSTJs etc... have been reprinted in a more easily obtained set of two volumes; if I remember correctly, "Unix Papers" by Academic Press. ]

    >Amusingly it persists even today in other compilers, among them
    >g++ 1.27, where interestingly "sizeof (char int)" is 4 and "sizeof
    >(int char)" is 1 on a 68020...

    I don't know what C++ rules for basic types really are, but if as I suspect g++ is getting it wrong, you should report this bug to the GNU project.

Well, technically this IS a mistake.
On the other hand I am not going to complain, of course... (except that I do not like the asymmetry between "char int" and "int char"). If you had read the full paragraph, you would have seen that I did say it is an unintentional "feature"; I even explained why and how this mistake is commonly made by C compiler writers.

What I am still waiting for, instead of cheap innuendo and showing off that one had read the Classic K&R (as though nobody else did), is for somebody to make a good case for:

    introducing the signed keyword and related paraphernalia instead of allowing "int char" (an existing unintentional "feature" of some compilers, by the way) to do the trick,

    NOT stipulating that there are two fundamental types with very different semantics, that can come in four different lengths, and therefore having to deal with three-word-long type specifiers, and fairly tedious tables of what is permitted, and not emphasizing the distinction between int and unsigned.

Note that both things are essentially issues of elegance and easier comprehensibility, which are damn important in a language like C, and both can be introduced into the language with essentially a slight reinterpretation and/or the removal of restrictions of existing rules.

--
Piercarlo "Peter" Grandi                    INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science    UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)