Xref: utzoo gnu.gcc:274 comp.os.vms:12673 comp.lang.c:16979
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!adam.pika.mit.edu!scs
From: scs@adam.pika.mit.edu (Steve Summit)
Newsgroups: gnu.gcc,comp.os.vms,comp.lang.c
Subject: Re: Problems with GCC and/or VAX LINK
Keywords: globalref, extern
Message-ID: <9876@bloom-beacon.MIT.EDU>
Date: 16 Mar 89 06:35:43 GMT
References: <1680@levels.sait.edu.au>
Sender: daemon@bloom-beacon.MIT.EDU
Reply-To: scs@adam.pika.mit.edu (Steve Summit)
Lines: 198

Quotes and comments have been discussed to death, but I haven't
seen any discussion of the globalref issue.  (Perhaps it was
confined to gnu.gcc or comp.os.vms, where I wouldn't have seen
it.  Apologies for any redundancy.)

In article <1680@levels.sait.edu.au> ccdn@levels.sait.edu.au (DAVID NEWALL) writes:
>I recently tried compiling the VAX LZCMP compression program on our
>VAX.  The VAX is running VMS 5.0 and the C compiler is GNU C (don't
>know what version).  I encountered a few problems:
>
>2. The LZCMP program includes a DCL command table, which is...
>   referenced (in the program) via a
>   "globalref" variable; the declaration looks like this:
>        globalref dcl_table;    /* this is the DCL command table */
>   I assumed that "globalref" meant the same as "extern".  This turns
>   out not to be the case.  It seems that VMS has a "global" class for
>   symbols, and that extern variables aren't "global".  It also turns
>   out that extern functions _are_ global -- what I am saying is that
>   "extern dcl_table" didn't work (dcl_table didn't point to the right
>   place), but "extern dcl_table()" did!

External linkage from C under VMS is a real can of worms.  Given
some fixed historical precedents, the globalref/globaldef stuff
(and the resulting workarounds under compilers which don't have
it) is probably necessary, albeit messy and nonstandard.

First, you need to know that VMS object files, and the VMS
linker, deal with multiple Program SECTionS, or psects.
Unix uses two or three analogous segments ("text," "data," and
"bss"), but VMS typically deals with many.  Psects have a number
of attributes: whether they're executable, whether they're
writable, what their alignment is, and -- here's an interesting
one -- whether multiple psects of the same name, contributed from
different object files, concatenate or overlay each other.
Psects that overlay each other might sound useless at best or
dangerous at worst, but it turns out they're just what you want
for, say, Fortran COMMON blocks.

When you say

        extern int x;
or
        int x = 3;

under DEC's VMS C compiler, you don't get a conventional defined
or undefined global symbol in a data psect.  You get a new psect,
named "X", of size sizeof(int), marked with the overlay
attribute.  All global variables named "x" in all modules
therefore end up sharing the same storage, as expected.

I don't know for certain why this somewhat unusual and unexpected
implementation of C global variables was chosen, but I suspect it
had to do either with

     1. An attempt to maintain compatibility with one of the
        weaker models for C external linkage, suggested but not
        required by K&R, but which many existing C programs
        assume.  (The "strong" model is "exactly one defining
        instance;" i.e. all but one declaration of a global
        variable must use the word "extern."  Unfortunately,
        various "common" models are, er, common, such as
        programs that say

                int x = 3;

        in one module and

                int x;

        in another.)

     2. An attempt to make linking of C and Fortran modules
        easy, by mapping C externs to Fortran COMMON blocks.

Given the "common psect" implementation for conventional C
externs (and, for better or worse, that is the implementation),
if what you want is a regular defined global symbol in a data
psect, you've got to use globaldef (or globalref to reference
it), for that is exactly what globaldef and globalref do.  I
doubt it would be easy to add these to gcc, since they show up in
the grammar.
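
To make the overlay-versus-concatenate distinction concrete, here
is a toy model in portable C.  It is only a sketch: the struct, its
field names, and psect_size() are hypothetical, it has nothing to do
with the real VMS object format, it ignores alignment, and it
assumes every contribution to a given psect carries the same
overlay setting.  Overlaid contributions share storage, so the
psect is as big as its largest contribution; concatenated
contributions are laid end to end, so their sizes add.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model: one psect contribution from one object file. */
struct contribution {
    const char *psect;  /* psect name, e.g. "X" */
    size_t size;        /* bytes contributed by this module */
    int overlay;        /* nonzero: overlay; zero: concatenate */
};

/* Size of the named psect after "linking" n contributions.
   Returns 0 if no module contributes to that psect. */
size_t psect_size(const struct contribution *c, size_t n, const char *name)
{
    size_t total = 0;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(c[i].psect, name) != 0)
            continue;               /* some other psect */
        if (c[i].overlay) {
            if (c[i].size > total)  /* shared storage: keep the largest */
                total = c[i].size;
        } else {
            total += c[i].size;     /* end to end: sizes accumulate */
        }
    }
    return total;
}
```

In this model, two modules that each contribute a 4-byte overlaid
psect named "X" yield a 4-byte psect, which is the sharing
described above, while two concatenated contributions of 100 and
60 bytes yield 160.
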
gcc probably had to go with the common psect model for regular
externs for compatibility with VAX11C.

The reason that

        extern dcl_table();

worked is that functions do deal with conventional defined
symbols (as opposed to named psects).  Since all you did with
dcl_table was (I presume) pass its address back to the CLI
routines, the C compiler never had to generate any code other
than to push the address, so the fact that it was (incorrectly)
declared as a function didn't matter.  This is a nice workaround,
which I hadn't seen before (Did you invent it?  Congratulations!)
and it is probably the correct thing to use.

>3. My investigations into "globalref" high-lighted a problem with either
>   the VMS linker, or with both GCC and VAX C.  Essentialy, I can compile,
>   link and execute the following program:
>        extern v1;
>        int v2;
>        ...
>        printf("&v1=%d\n&v2=%d\n", &v1, &v2);
>
>   Compiling with GCC, I get &v1 == &v2.  Compiling with VAX C I get
>   &v1 + 4 == &v2.  In either case, I think it's wrong.  I think that
>   I should get a linker error complaining about an undefined external
>   variable (v1).

VAX C worked because of the way overlaid psects work -- each
declaration of (in this case) v1, whether a "defining instance"
or not, generates a reference to a psect named "V1", so even if
there never is a defining instance, the psect gets created.  This
is mildly surprising, but no more so than some of the screwball
things the Unix compilers and linkers have always let you get
away with.  (The other day I discovered that Ritchie's pdp11
compiler accepts

        extern int x = 3;

although I don't know what it means.)

gcc probably ended up with &v1==&v2 because of a misunderstanding
or bug in its implementation of the named psect nonsense.

The big problem with implementing C externs as named psects is
that the linker won't then search for undefined externals (if it
did, the "expected" error for an undefined v1 would have resulted
from the above example).
Instead, undefined externals spring into existence, as noted,
without (here is the killer) being loaded from libraries.  (This
issue would have qualified for a "frequently asked questions"
list on comp.os.vms when last I followed it.)  That is, if you
have

        extern int x;

in an explicitly-loaded object file, and an object in a library
containing only

        int x = 3;

that library member won't get loaded, and x will remain 0.  The
solutions are either to request the library member explicitly, or
to use globalref/globaldef, or to add to the library member a
definition of some other required symbol (such as a function,
which links conventionally) to force the member to be loaded.  At
one point I heard that a future version of the VMS linker would
be able to search for psects, perhaps to solve this problem; that
may have been implemented by now.

If you think globalref and globaldef are weird, have you looked
at globalvalue?  A totally unfamiliar concept to C programmers,
though useful in a VMS environment.  If you say

        globalvalue int SS$_NORMAL;
        int retval = SS$_NORMAL;

you'll end up with something like

        movl    $1, _retval

rather than

        .extern _SS$_NORMAL
        movl    _SS$_NORMAL, _retval    ; no $, no immediate constant,
                                        ; SS$_NORMAL is here an address

That is, the compiler will generate code not to dereference a
location whose address the linker will fill in, but to use an
absolute value (which the linker will fill in).  Under Unix,
predefined magic constants are typically implemented with
#defines in standard header files; under VMS the linker will fill
them in from the standard libraries.

It turns out that you can simulate globalvalue with the same kind
of trick as for globalref -- you could say

        extern int SS$_NORMAL();
        int x = SS$_NORMAL;

and presto (ignoring type clash warnings) x would be set to 1.
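
The difference between using a symbol's value directly and
dereferencing it can be mimicked in portable C.  This is a
simulation only, with no VMS keywords involved; struct symbol,
use_as_value() and use_as_ref() are hypothetical names invented for
the illustration.  The struct stands in for what the linker knows
about a symbol, namely a name and a number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a linker symbol: a name plus the value
   the linker assigns to it.  For a data symbol the value is an
   address; for a constant such as SS$_NORMAL it is the constant. */
struct symbol {
    const char *name;
    uintptr_t value;
};

/* globalvalue-style use: the symbol's value is itself the operand. */
uintptr_t use_as_value(const struct symbol *s)
{
    assert(s != NULL);
    return s->value;
}

/* globalref-style use: the symbol's value is the address of an int,
   and the operand is whatever is stored there. */
int use_as_ref(const struct symbol *s)
{
    assert(s != NULL);
    return *(const int *)s->value;
}
```

With a symbol whose value is the address of an int holding 3,
use_as_ref() yields 3 and use_as_value() yields the address; with a
symbol whose value is 1, use_as_value() yields 1 and there is
nothing sensible to dereference.
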
(This does not mean that globalref and globalvalue are equivalent
and therefore redundant; the globalref workaround replaced
something like

        globalref dcl_table;
        cli$xxx(..., &dcl_table, ...);

with

        extern dcl_table();
        cli$xxx(..., dcl_table, ...);

Note the ampersand; globalref and globalvalue differ by a level
of indirection.)

                                        Steve Summit
                                        scs@adam.pika.mit.edu