Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!ucsd!sdcsvax!ucsdhub!hp-sdd!hplabs!sri-unix!quintus!ok
From: ok@quintus.UUCP (Richard A. O'Keefe)
Newsgroups: comp.lang.c
Subject: register unions
Message-ID: <686@cresswell.quintus.UUCP>
Date: 24 Feb 88 07:53:39 GMT
Organization: Quintus Computer Systems, Mountain View, CA
Lines: 145
Keywords: attitudes

Someone (I have lost the original posting) suggested that
    'register union { foo* x; baz *y; ... }'
would be a useful construct in C.  Well, it's already legal.
According to the Oct '86 dpANS,

    "A declaration with storage-class specifier 'register' is an
    'auto' declaration, with a suggestion that the objects declared
    be stored in fast-access machine registers if possible.  The
    types of objects that are stored in such registers and the
    number of such declarations in each block that are effective
    are implementation-defined [footnote: The implementation may
    treat any 'register' declaration simply as an 'auto'
    declaration.  However, ... the unary & (address-of) operator
    may not be applied to an object declared with storage-class
    specifier 'register', whether or not a machine register is
    actually used.]"

The System V Programmer's Guide explicitly says "excess or invalid
'register' declarations are ignored."  Similar statements appear in
other C manuals.  Interestingly enough, DEC's VAX C manual explicitly
says "If the variable requires storage (for example, arrays or
structures), the object of the variable is not placed in a register."

It seems that any C compiler which rejects 'register union foo X' as
an error has always been broken: this is legal K&R C.  But a compiler
has always been within its rights to ignore 'register' in this or any
other case, though better-quality compilers print a warning message.
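To make the legality point concrete, here is a minimal sketch in
present-day C (prototypes rather than 1988-style definitions).  The
union layout and the function name are hypothetical, chosen only for
illustration; whether the union really lands in a register is up to
the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical union of pointer types, for illustration only. */
union ptr {
    char *c;
    int  *i;
};

/* Stores p in a 'register' union and reads it back through the SAME
   member.  The compiler may honour or ignore the 'register' hint. */
int read_through_register_union(int *p)
{
    register union ptr fred;   /* legal: 'register' is only a suggestion */

    assert(p != NULL);
    fred.i = p;                /* write member .i ... */
    /* Writing '&fred' here would be rejected: the address-of operator
       may not be applied to a 'register' object, whether or not a
       machine register is actually used. */
    return *fred.i;            /* ... and read member .i again */
}
```

A caller would simply pass the address of an int and get its value
back; nothing about the result depends on where the union is stored.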

The construct thus already being legal, I assumed the original poster
to be urging that compilers SHOULD put unions in registers if it is
possible for them to do so, and to be alleging that this was
particularly important for pointers.  I posted a message pointing out
that this simply doesn't make sense on some machines (specifically
including PR1MEs), and employed a familiar humorous device to stress
this.

I have received some flaming messages from people who took exception
to this commonplace observation.  Oddly enough, no-one using a PR1ME
has complained to me yet...

Here's why I feel strongly about issues like this:

(1) I did a couple of days consulting once for a company who had
    found that the only practical way of porting 4.2BSD to their
    machine was to change the microcode so that *(char*)0 == 0.

(2) I have had the unpleasant experience of porting a program which
    used pointers heavily to a machine where 'int' and 'char*' were
    not the same size.  Even changing the definition of NULL to 0L
    (which is not, strictly speaking, correct) didn't help.  I had to
    go through more lines of code than I care to remember changing 0
    to (char*)NULL.

(3) I had to port a program which assumed that, given the declaration
        union two {int a; char *b;} jim;
    the calls
        harry(jim);
    and
        harry(jim.b);
    were identical.  Suffice it to say that they weren't.

(4) I ported a middle-sized program to a machine, and watched someone
    else port a much larger program to the same machine, where
    although character pointers and word pointers were both the same
    size as an integer, they had different representations.  So, for
    example,
        int data[50];
        fwrite(data, sizeof *data, (sizeof data)/(sizeof *data), output);
    wasn't just badly typed (it has always been that), it gave the
    wrong answers.
    Again, one had to go through changing things like this to
        fwrite((char*)data, ....);

(5) I had to advise someone that a very large (and *very* useful)
    program of theirs would be too expensive to port to a PR1ME
    because they had assumed throughout that word pointers and
    character pointers were both the same size as an integer.  What
    was really tragic about this was that the program in question had
    very little real use for character pointers, but things had been
    converted to this "common currency".

What is the point of something like this:

    union ptr
        {
            char *c;
            int *i;
            long *l;
            int (*f)();
        };
    register union ptr fred;

Surely the point is to say

    fred.c = /* something */;
    ... fred.i ...

and have it go fast.  But this is going to give you major porting
headaches in the future.  Or more plausibly, it is going to give
someone else major porting headaches.  Too bad that it is already
legal...

Is there something comparable which is less trouble for porting?
YES.  Use casts.  Do something like

    #if ....
    typedef int UsualStorageUnit;
    #elif ...
    typedef short UsualStorageUnit;
    #elif ...
    typedef char UsualStorageUnit;
    #else
    /* if case not handled, syntax error in next declaration */
    #endif
    typedef UsualStorageUnit *UsualPointer;

(void*) is close, but not quite identical.  (void*) has to handle the
worst case, and is usually much the same as (char*).  UsualPointer is
to be the "native" pointer type, just as int is the "native" integer
type.

    #define AsUsualPtr(x) ((UsualPointer)(x))
    #define AsShortPtr(x) ((short*)(x))
    #define AsIfuncPtr(x) ((int (*)())(x))

and so on.  Then you can declare routines like

    void apply1(Fn, Arg)
        register UsualPointer Fn, Arg;
        {
            (*AsIfuncPtr(Fn))(*AsShortPtr(Arg));
        }

What's the difference?  Well, apart from the fact that the compiler is
more likely to put UsualPointers into registers than unions (though
putting both, and putting neither, are both ALREADY legal), the
compiler can now spot each type change, and can tell you about the
ones that aren't going to work.
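
The cast-based scheme above can be sketched in present-day C roughly
as follows.  This is a sketch under stated assumptions, not the code
from the post: UsualStorageUnit is simply assumed to be int, the
helper names twice and demo_apply1 are hypothetical, and (unlike the
K&R apply1 above) the function pointer keeps its own type instead of
travelling as a UsualPointer, because ISO C does not guarantee that
function pointers and data pointers can be converted into each other.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMPTION: int is the "usual" storage unit on this machine. */
typedef int UsualStorageUnit;
typedef UsualStorageUnit *UsualPointer;

/* Every change of pointer type is spelled out as a visible cast. */
#define AsUsualPtr(x) ((UsualPointer)(x))
#define AsShortPtr(x) ((short *)(x))

/* Hypothetical callback used only for this illustration. */
static int twice(short s)
{
    return 2 * s;
}

/* Prototype-style cousin of apply1: the data pointer travels as a
   UsualPointer and is converted back with an explicit cast. */
int apply1(int (*fn)(short), UsualPointer arg)
{
    assert(fn != NULL && arg != NULL);
    return fn(*AsShortPtr(arg));
}

/* Stores value in malloc'd memory (which is suitably aligned for any
   object type, so the short* -> int* -> short* round trip is safe),
   then passes it through apply1.  Returns -1 if allocation fails. */
int demo_apply1(short value)
{
    short *cell = malloc(sizeof *cell);
    int result;

    if (cell == NULL)
        return -1;
    *cell = value;
    result = apply1(twice, AsUsualPtr(cell));
    free(cell);
    return result;
}
```

Each conversion goes through a named macro, so a search for AsShortPtr
or AsUsualPtr finds every place a pointer changes type, which is
exactly where a port to an awkward machine needs review.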

Look *very* carefully at **any** union in your programs which is not
#ifdeffed by machine or implementation; bugs breed in them like
mosquitos in a swamp.  Using different members of a union at different
times is fine, but putting something into one member of a union and
picking it up again from another member is bad practice in any
programming language.  (My first introduction to the problem was
trying to port a Pascal program from a CDC machine to a B6700.  The
Pascal programmer had assumed that 10 characters = 1 integer, and not
only was that not the case, but the bit pattern of the first N
characters often wasn't valid as an integer.)

Frankly, I am not impressed by people who say "don't be so
condescending, this is a useful construct on MY machine."  I do not
use a PR1ME (or any of the other machines I hinted at above) myself.
I have done, and hope never to do so again.  Life is difficult enough
for these people without going out of our way to make things worse.

Oh yes, another porting problem: sizeof *main.  On at least three
machines that I can think of, the function pointer that a C program
passes around is actually a pointer to a control block, *not* a
pointer to the code...  Don't expect the bit pattern C holds for a
function to be equal to the address you see in a load map.
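
The union advice above (different members at different times is fine,
writing one member and reading another is not) can be enforced with a
tag.  A minimal sketch in present-day C; every name in it is
hypothetical, and it only loosely mirrors the 'union two' example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag recording which union member was last written. */
enum cell_kind { CELL_INT, CELL_TEXT };

struct cell {
    enum cell_kind kind;
    union {
        int a;
        const char *b;
    } u;
};

/* Build a cell holding an int; only member .a is written. */
struct cell cell_from_int(int a)
{
    struct cell c;
    c.kind = CELL_INT;
    c.u.a = a;
    return c;
}

/* Build a cell holding a string pointer; only member .b is written. */
struct cell cell_from_text(const char *b)
{
    struct cell c;
    c.kind = CELL_TEXT;
    c.u.b = b;
    return c;
}

/* Returns 1 and stores the int in *out if the cell holds an int.
   Returns 0 and leaves *out alone otherwise, so member .a is never
   read after member .b was the one written. */
int cell_get_int(const struct cell *c, int *out)
{
    assert(c != NULL && out != NULL);
    if (c->kind != CELL_INT)
        return 0;
    *out = c->u.a;
    return 1;
}
```

The tag costs a word per object, but the accessor refuses the
cross-member read instead of handing back whatever bit pattern the
machine happens to produce.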