Newsgroups: comp.lang.c
Path: utzoo!sq!msb
From: msb@sq.sq.com (Mark Brader)
Subject: Re: Portability vs. Endianness
Message-ID: <1991Mar24.005821.8399@sq.sq.com>
Organization: SoftQuad Inc., Toronto, Canada
References: <1991Mar12.105451.19488@dit.upm.es> <2628@ksr.com> <829@saxony.pa.reuter.COM>
Distribution: na
Date: Sun, 24 Mar 91 00:58:21 GMT
Lines: 44

> > 	Bytes[0] = (var >> 24) & 0xFF;
> > 	Bytes[1] = (var >> 16) & 0xFF;
> > 	Bytes[2] = (var >> 8) & 0xFF;
> > 	Bytes[3] = var & 0xFF;
> >This code is guaranteed.
>
> I don't think that the standard guarantees that chars are eight bits.
> I will agree that that's probably even more common than 4-byte longs, but
> even this assumption about word sizes cannot, IMHO, be guaranteed to be
> portable. ...

The original question asked that the output be in "68000 format", or
some such words.  Therefore it *is* correct to pull off 8 bits at a time,
even if the code is running on a machine where chars are larger than
8 bits.  (They aren't allowed to be smaller.)

However, the >> operation is implementation-defined when applied to a
signed integer variable whose value is negative.  To avoid problems, the
original value should be copied into a variable of type unsigned long.
This will also in effect convert it to 2's complement format, as desired
for the "68000 format" result, no matter what format the machine uses.
(Guaranteed in ANSI C.)

However, it isn't correct to assume that the original long fits in 32 bits.
That is, the original value may not be convertible to the desired format,
and the code should check for this.  There are several ways to do so, none
of them particularly pretty.  For instance, supposing that var is the
original input and uvar is the result of converting it to unsigned long:

	if (((uvar >> 31) >> 1) != (((var < 0)? -1UL : 0UL) >> 31) >> 1)
		complain_about_overflow();

There may be easier ways that I haven't thought of.
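
Putting the pieces above together, here is a minimal sketch, not anybody's
posted code.  The name put_be32 and its 0/-1 return convention are made up
for illustration.  It compares everything from bit 31 upward against the
sign, which is slightly stricter than the two-step shift in the expression
above: it also rejects a positive value such as 2^31, whose bit 31 would
read back as a sign bit.  The largest shift count used is 31, so no shift
reaches the width of a 32-bit long.

```c
/* Hypothetical helper (name and return convention are illustrative).
 * Stores var in out[0..3], most significant byte first, as a 32-bit
 * two's complement value.  Returns 0 on success, or -1 if var does
 * not fit in 32 bits (in which case out is left untouched). */
int put_be32(long var, unsigned char out[4])
{
    /* Conversion to unsigned long is modular, so uvar holds the two's
     * complement bit pattern whatever the machine's signed format is. */
    unsigned long uvar = (unsigned long)var;

    /* All ones for a negative input, all zeros otherwise. */
    unsigned long fill = (var < 0) ? -1UL : 0UL;

    /* Bit 31 and every bit above it must equal the sign. */
    if ((uvar >> 31) != (fill >> 31))
        return -1;

    /* Pull off 8 bits at a time, even if unsigned char is wider. */
    out[0] = (unsigned char)((uvar >> 24) & 0xFF);
    out[1] = (unsigned char)((uvar >> 16) & 0xFF);
    out[2] = (unsigned char)((uvar >> 8) & 0xFF);
    out[3] = (unsigned char)(uvar & 0xFF);
    return 0;
}
```

With an unsigned char b[4], calling put_be32(-2L, b) should return 0 and
leave FF FF FF FE in b[0] through b[3].  Where long is exactly 32 bits the
check can never fail, since every long value then fits.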

Note that it is not portable to shift right by 32 bits in one step: a
shift by the full width of the type (or more) is undefined, and that is
what a shift by 32 would be on a machine where long is exactly 32 bits.
Also, I have not tested the overflow-check expression, since I don't have
access to a machine with longs wider than 32 bits.  Old compilers will
require a cast rather than the UL suffix, but the original poster asked
for ANSI C.
-- 
Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
#define	MSB(type)	(~(((unsigned type)-1)>>1))

This article is in the public domain.