Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!utfyzx!sq!msb
From: msb@sq.UUCP
Newsgroups: comp.lang.c
Subject: Re: Character types in ANSI C (long, but the meat is at the top)
Message-ID: <1987Feb19.132926.533@sq.uucp>
Date: Thu, 19-Feb-87 13:29:26 EST
Article-I.D.: sq.1987Feb19.132926.533
Posted: Thu Feb 19 13:29:26 1987
Date-Received: Fri, 20-Feb-87 01:44:30 EST
References: <472@myrias.UUCP>
Reply-To: msb@sq.UUCP (Mark Brader)
Organization: SoftQuad Inc., Toronto
Lines: 81
Checksum: 11029
Summary: 3 types

> What exactly are the compatibility rules for character types in ANSI C?
> I.e. which of the following [pointer assignments] are legal:
>
>	char *p1; unsigned char *p2; signed char *p3;
>	p1 = p2; p1 = p3; p2 = p3;

They are all illegal.  The draft specifies three different "char" types,
though in any particular implementation two of them are treated similarly.

To avoid confusion, let me add that the *character* assignments

	*p1 = *p2; *p1 = *p3; *p2 = *p3;

are all legal, and the *explicit* pointer conversions

	p1 = (char *) p2; p2 = (unsigned char *) p3;

are also legal.

Furthermore, the treatment of "signed" in conjunction with "char" is
different from its treatment in conjunction with "int" or "long".
In the latter cases, "signed" is a noise word.  Thus if "char" in the
original example was changed to "int", then p1=p3; would be legal.

In my formal submission, which was too long to post to this group,
I suggested that most of #3.1.2.5 needed editorial improvements, and
provided the following suggested text, which I believe to convey the
same facts as the existing draft is supposed to, but more understandably.
This is based on a close reading of the draft and mail conversations
with Larry Rosler.  Any errors are mine.

---
The following are always *signed integral types*: "signed char",
"short int", "int", and "long int".
For the last three types listed, the set of values of each type is a
superset of the set of values of the preceding listed type.  An object
declared as "signed char" is large enough to store any member of the
execution character set, and if any member of the required source
character set enumerated in #2.2.1 is stored in the object, its value
is guaranteed to be positive.  The size of an object declared "int" is
a natural size suggested by the architecture of the execution
environment.

Corresponding respectively to the above four types are the *unsigned
integral types*: "unsigned char", "unsigned short int", "unsigned int",
and "unsigned long int".  In each case an object of unsigned integral
type utilizes the same amount of storage as does an object of the
corresponding signed integral type, including its sign.  The set of
nonnegative values of a signed integral type is a subset of that of the
corresponding unsigned integral type, and the representation of the
same value in each type is the same.  A computation in an unsigned
integral type can never overflow, because a result that cannot be
represented in the type is reduced modulo the largest number that can
be represented in the type plus one.

The type "char" is either a signed integral type with the same set of
values as "signed char", or an unsigned integral type with the same set
of values as "unsigned char"; which of the two applies is
implementation-dependent.

Even if the implementation defines two or more types of integers to
have the same set of values, they are nevertheless different types.**

**Thus even if "char" is a signed integral type, "signed char" is a
different type.  On the other hand, as explained in #3.5.2, "signed int"
is merely an alternate way of specifying the type "int".
---

The reference to #3.5.2 is to the following text, which I would put there:

---
The keyword "signed" has no effect when specified in conjunction with
"int" or in a construction where "int" is implied.**

**Thus "signed" alone is equivalent to "int" alone.
---

Mark Brader, utzoo!sq!msb	#define	MSB(type)	(~(((unsigned type)-1)>>1))