Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!tut.cis.ohio-state.edu!bloom-beacon!gatech!hubcap!ncrcae!ncr-sd!se-sd!rns
From: rns@se-sd.sandiego.ncr.com (Rick Schubert)
Newsgroups: comp.lang.c
Subject: `char' parameters: a follow-up/summary ( Long - but worth it :-) )
Summary: A plea for more help
Keywords: parameter, K&R, ANSI C, narrowing, widening, promotion
Message-ID: <1626@se-sd.sandiego.ncr.com>
Date: 8 Sep 88 19:29:13 GMT
Reply-To: rns@se-sd.sandiego.NCR.COM (Rick Schubert)
Organization: NCR Corp. Systems Engineering, San Diego
Lines: 214
References:

What have I done wrong?  I posted an article asking a question and received an underwhelming response.  Didn't I insult the right people or stomp on anyone's dogma?

In <1616.se-sd.sandiego.ncr.com> I (Rick Schubert) asked about function parameters declared to be of type `char'.  A compiler I was/am using warned that the declaration was being adjusted to be of type `int'.  I asked whether or not this was a bug, and more importantly, where in K&R I and/or the Draft ANSI C Standard this was addressed.

There were 3 or 4 follow-ups posted to comp.lang.c in addition to the 3 mail messages I received.  I got conflicting opinions on this, as well as conflicting references.  I will attempt to summarize and possibly draw a conclusion.  I would like to thank these people for their contributions and hope that none of my responses are taken as flames.

In <8408@smoke.ARPA> gwyn@smoke.ARPA (Doug Gwyn) stated that the compiler HAS to make the adjustment to `int' since the compiler passes an `int' as the argument.  He made no references to K&R I or the Draft.

In <66530@sun.uucp> swilson%thetone@Sun.COM (Scott Wilson) found a reference in K&R I (page 205, section 10.1 of Appendix A) which states:

    C converts all float actual parameters to double, so formal
    parameters declared float have their declaration adjusted to
    read double.

>Your're [sic] right though, that int vs. char is not explicitly mentioned
>in this section.

Even though `int' vs. `char' is not mentioned here, I took it as evidence (but not proof) that the same adjustment applies to `char' parameters, since C typically treats integral promotions and floating promotions analogously.  [Also, there is one relevant (but not necessarily deciding) difference between the `float' -> `double' and `char' ->? `int' situations: on a system that has different formats for `float' and `double' (other than the number of bits in the mantissa), the compiler cannot simply treat one word of the `double' (passed as the argument) as a `float'; however, the compiler CAN (technically, if not legally) treat the appropriate byte (or whatever length `char's are) of the `int' as a `char' and therefore get the correct value passed in.]

I could not find a similar reference in the Draft; however, an opposing viewpoint was presented:

In <16432@apple.Apple.COM> bgibbons@Apple.COM (Bill Gibbons) quoted the Draft:

>In section 3.7.1 of the draft standard, it says:
>    On entry to the function the value of each argument expression shall be
>    converted to the type of its corresponding parameter, as if by assignment
>    to the parameter.
>This is known as _narrowing_.  It is very important, for exactly the reason
>you point out: if the parameter is not narrowed, and you take its address,
>the pointer is to a different type than expected.
>Narrowing is very easy for a compiler to do: for CHAR and SHORT, it just
>adjusts the stack offset at which it thinks the value was passed, and changes
>the type.  (On PDP11 etc, it doesnt [sic] even change the offset.)
>Floating-point is a little harder on most machines; it needs an explicit
>conversion.  This caused me a lot of effort last fall, when I was porting a
>UNIX application to IDRIS running on an Atari box.  The Whitesmiths compiler
>did not do narrowing, and I had to modify the code (with a tool) to add
>explicit narrowing.
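
To make the narrowing and `&c' concern concrete, here is a minimal sketch (mine, not taken from any of the articles quoted here) of the two readings at issue.  It uses prototyped definitions so that each reading can be spelled out explicitly and so that it compiles with compilers that no longer accept old-style definitions; the old-style definitions under discussion are exactly where the parameter's type is in question.  All function names are hypothetical, and `unsigned char' is used for the narrowing step only to keep that conversion well defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Reading 1: the parameter keeps type char.  With a prototype in scope,
 * the argument is converted to char on entry, as if by assignment. */
size_t size_if_char(char c)
{
    return sizeof c;                /* always 1 */
}

/* Reading 2: the declaration is adjusted to int.  Modeled here by simply
 * declaring the parameter as int. */
size_t size_if_int(int c)
{
    return sizeof c;                /* sizeof(int) */
}

/* Under reading 1, &c points at the character itself. */
char first_byte_if_char(char c)
{
    char *p = &c;
    return *p;
}

/* Under reading 2, a char pointer aimed at the int sees whichever byte
 * of the int is stored first on this machine. */
char first_byte_if_int(int c)
{
    char *p = (char *)&c;           /* inspecting an object as chars is allowed */
    return *p;
}

/* Narrowing "as if by assignment": the promoted int value is converted
 * to the narrower type.  For unsigned char this reduces the value
 * modulo 2^CHAR_BIT. */
unsigned char narrow_as_if_by_assignment(int arg)
{
    unsigned char c = (unsigned char)arg;
    return c;
}

/* Small self-check of the properties claimed in the comments above. */
void check_both_readings(void)
{
    assert(size_if_char('b') == 1);
    assert(size_if_int('b') == sizeof(int));
    assert(first_byte_if_char('b') == 'b');
    if (CHAR_BIT == 8)
        assert(narrow_as_if_by_assignment(0x1278) == 0x78);
}
```

Under the `char' reading, `first_byte_if_char('b')' is always 'b'.  Under the `int' reading, on a byte-addressed machine with `int' wider than `char', `first_byte_if_int('b')' is 'b' when the low-order byte is stored first and 0 when the high-order byte is stored first, so code that passes `&c' to a routine expecting a `char *' works only on some machines.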

Gibbons interprets this to mean that the `char' parameter MUST remain a `char', although the Whitesmiths' compiler does do the adjustment.  One possible argument against this is that, if the compiler DOES do the adjustment to `int', then the "type of the corresponding parameter" is `int'.

The following responses are taken from private mail.  I have elided the names of the authors since I haven't asked their permission.  I hope I am not out-of-line in quoting from them -- I am not disclosing any confidential information and am not presenting any embarrassing information.

In private mail #1, someone wrote:

>It is perfectly legal to do so, just not very efficient on a correct
>implementation. The reason is that the value must be correctly treated
>as an int so that the expression
>    c += 1;
>must be careful to wrap from 127 to -128 on a signed char implementation.
>This means that the value cannot be treated directly as an int and so
>must be converted every time it is used.
>
>It is never legal to treat the parameter as type 'int' directly. The HCR
>'C' test suite tested this fairly severely and did of course find bugs
>in the VAX compiler. The "&c" must point to the actual character. If it
>doesn't then it is very broken since a common type of routine is
>    putc(c) char c; { write( &c, 1, 1 ); }

This points out some coding practices that assume that the parameter remains a `char', but it does not back the claim up with a reference.

In private mail #2, someone wrote:

>The pointer will point to a RANDOM byte, and the incoming parameter will
>have RANDOM value, and the compiler is warning you becuase [sic] it is an
>UNSAFE thing to do. what "really" happens depends on your hardware and
>compiler. The only safe thing to do is, unfortunately,
>
>    int foo( usra ) int usra; { char a = usra;
>
>Usually, an int pointer dereferenced as char will get the first byte:
>hi-order on IBM & Motorala [sic], (correct way to do it, not what you want)
>lo-order on Intel & DEC.
>(wrong way to do it, what you want)

[I'm not sure about pointing to a "RANDOM byte" containing a "RANDOM value": if an `int *' and a `char *' have the same representation and are, for example, both byte addresses, then I can think of only 2 possible bytes the pointer can point to, and depending on the byte, only 1 or 2 values are possible (either the pointer points to the low-order byte, which contains the `char' value, or it points to the high-order byte, which contains the padding (all 0's or all 1's) resulting from promoting the `char' argument to an `int').]

I'm not sure which side he is supporting here, unless he is saying that `char c' is not portable, so the portable way to code is as he shows.  This is probably good advice for programmers but doesn't tell the compiler writer what to do.  And again, no reference.

In private mail #3, someone wrote:

>There's a long and a short answer to your posting. The short answer is
>that you can ignore what the compiler does (at the expense of speed).
>Let's look at why.
>
>f(c)
>char c;
>{
>    some expression involving c;
>}
>
>What happens in the following call:
>
>    f('b');
>
>'b' is a char, no doubt about that. (Let's assume 32 bit ints, 8 bit
>chars). But it is widened to an int in the call to f. It would
>appear that we have a problem, since f is expecting a char, and we think it
>will pull only 8 bits off the stack. But this is not what happens. The
>parameter is also treated as an int (that is, the compiler treats the code
>as if you had written the following):
>
>    f(c)
>    int c;
>    {
>        some expression involving c: a cast is added by the compiler
>    }
>
>All uses of c are cast to the original type. So some original expression
>involving c, say
>    printf("Input: %c\n", c);
>is treated as if you had
>    printf("Input: %c\n", (char)c);
>
>since c had been widened to an int.
>
>Clear?
>See my book, [reference omitted to preserve anonymity]
>
>[What you should really do, for efficiency, is rewrite f (as I'm sure you
>know):
>
>    f(c)
>    int c;
>    . . .
>
>And be sure to call it with an int argument.]

He doesn't explain what happens with `&c', which is my main concern, and he doesn't give a reference.

I found 1 more reference in the Draft that deals with this situation.  In section 3.3.2.2 ("Function calls"), page 39/line 35 through page 40/line 4:

    If the expression that denotes the called function has a type
    that does not include a prototype, the integral promotions are
    performed on each argument[,] and arguments that have type
    `float' are promoted to `double'.  These are called the
    >default argument promotions<.  If the number of arguments does
    not agree with the number of parameters, the behavior is
    undefined.  If the function is defined with a type that does
    not include a prototype, and the types of the arguments after
    promotion are not compatible with those of the parameters after
    promotion, the behavior is undefined.

This passage reaffirms that the arguments are promoted (no question here).  It talks about the types of the parameters "after promotion" but does not explicitly state that the effective type of the parameter is the promoted type.

More input: The UNIX(tm) compiler that I have, which I believe is an AT&T compiler (PCC or PCC II based?), does not promote the `char' parameter; the effective type appears to be `char':

    1. `&c' points to the byte containing the character passed in;
    2. `c = 0x12345678' results in `c == 0x78';
    3. `sizeof(c) == 1'.

--------

I have been vacillating in what my conclusion is.  At times I believe there is more evidence in K&R I and in the Draft that promotion is the correct thing to do.  At other times I tend to disregard this evidence because it is not explicit enough with respect to my question.
In these times, lack of a clear statement that the parameter is promoted would seem to say that it remains the type specified by the programmer (no need to explicitly say this if it is so).  Also, my experience with C compilers prior to the one that is the subject of my original posting led me to believe that the parameter type was not adjusted; however, apparently Whitesmiths' compiler does adjust.

And at other times I tend to think that this falls into the category of >unspecified behavior<; in this case, either interpretation is correct and thus all compilers are correct.  However, this doesn't seem like the sort of thing that is typically unspecified.

--------

In conclusion, then, this is still an open issue in my mind.  I call on the wizards to speak up on this (not meant as a slight to anyone who has already spoken).  Karl Heuer?  Chris Torek?  Any others?  Help!

-- Rick Schubert (rns@se-sd.sandiego.NCR.COM)