Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!snorkelwacker.mit.edu!bloom-picayune.mit.edu!news
From: scs@adam.mit.edu (Steve Summit)
Newsgroups: comp.std.c
Subject: Re: Pointers to Incomplete Types in Prototypes
Message-ID: <1991May6.044414.18120@athena.mit.edu>
Date: 6 May 91 04:44:14 GMT
References: <700@taumet.com> <683g+p#@rpi.edu>
Sender: news@athena.mit.edu (News system)
Reply-To: scs@adam.mit.edu
Organization: Thermal Technologies, Cambridge, MA
Lines: 160

People are having a hard time understanding how

    extern blort(struct piffle *);
    struct piffle { int cazart; };

creates two *different* struct piffle's, i.e. the struct
definition on the second line does *not* complete the incomplete
struct within the prototype. For those who prefer concrete
examples, I will describe how structures are typically
implemented within compilers. (I hasten to point out that such
an excursion should not, strictly speaking, be necessary -- we
are supposed to be able to answer these questions by reading the
documentation, without peeking at the source code. In this case,
the Standard is unambiguous, but understanding how its
requirements map to the compiler internals may put a few minds at
rest.)

Inside the compiler, we might have a structure which describes a
structure. It might look something like this:

    struct structure {
        char *tag;
        int flags;
        struct symtabent *members;
        int nmembers;
    };

The tag field obviously records the structure's tag name, or is
NULL for unnamed structures. The members and nmembers fields
record the number, names, and types of the structure's members,
but those details do not concern us here.

The important fact to realize is that if the compiler has two
structures lying around which it wants to test for compatibility,
it does *not* do so by comparing the tag names:

    struct structure *sp1, *sp2;
    ...
    if(strcmp(sp1->tag, sp2->tag) != 0)    /* WRONG */
        error("incompatible structures");

Rather, with one exception [footnote 1], it does so by comparing
the pointers themselves for equality:

    if(sp1 != sp2)
        error("incompatible structures");

I can't say exactly why the Standard requires compilers to behave
in this way; one reason is obviously that tag comparison can't
work for structures without tags. (I confess that I can't find
explicit language in the Standard which requires behavior such as
I have described, but if you look at section 3.1.2.6 -- "Two
types have compatible type if their types are the same" -- and
section 3.5.2.1 -- "The presence of a struct-declaration-list in
a struct-or-union-specifier declares a new type, within a
translation unit" -- it's clear that tag comparison is not used.
Section 3.5.2.3 is devoted to tags, which we'll now explore.)

The second thing to understand is the way that scopes nest, and
the way that existing names are looked up in, and new names
inserted into, these nested scopes, particularly when the name is
a structure tag. (Chris Torek has already described this process
in considerable detail; the informal treatment I present here may
be a bit easier to follow.)

When a compiler sees a structure tag without a
struct-declaration-list (the brace-enclosed list of the structure
member names and types), it looks through the current set of
nested scopes for a matching struct tag. (Unlike compatible
structure testing, this search *is* made by string comparison of
the tag names.) If it finds one, then this struct tag is a
reference to an already-declared struct, and that
already-declared struct is used as the type of whatever is being
declared now (via the standalone struct tag just encountered, at
the beginning of this paragraph). In particular, part of the
type of the thing being declared now is the pointer to the struct
structure of the struct with the matching tag. (Got that :-) ?)
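
The two operations just described, tag lookup by string comparison
and compatibility by pointer identity, might be sketched as follows.
This is only an illustration, not code from any actual compiler:
struct structure follows the layout shown earlier, but struct scope,
the fixed-size tag table, and the function names lookup_tag and
compatible are assumptions introduced for this example.

```c
#include <stddef.h>
#include <string.h>

struct symtabent;                 /* member details are not needed here */

struct structure {
    char *tag;                    /* NULL for untagged structures */
    int flags;
    struct symtabent *members;
    int nmembers;
};

/* Hypothetical scope record: a small table of struct tags declared at
   this level, plus a link to the enclosing scope (NULL at file scope). */
#define MAXTAGS 16

struct scope {
    struct scope *outer;
    struct structure *tags[MAXTAGS];
    int ntags;
};

/* Search the current scope, then each enclosing scope, for a tag.
   This is the one place where tag names are compared as strings. */
struct structure *lookup_tag(struct scope *sc, const char *tag)
{
    int i;

    for (; sc != NULL; sc = sc->outer) {
        for (i = 0; i < sc->ntags; i++) {
            struct structure *sp = sc->tags[i];
            if (sp->tag != NULL && strcmp(sp->tag, tag) == 0)
                return sp;
        }
    }
    return NULL;                  /* no matching tag in any visible scope */
}

/* Within one translation unit, two struct types are compatible only
   if they are described by the very same struct structure. */
int compatible(const struct structure *sp1, const struct structure *sp2)
{
    return sp1 == sp2;
}
```

With one struct structure tagged "piffle" entered at file scope,
lookup_tag called from a nested scope returns a pointer to it, while
a second, separately allocated struct structure that also happens to
be tagged "piffle" fails the compatible test against the first.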

If a matching struct tag is *not* found, the compiler has
encountered an incomplete struct definition. It allocates a new
struct structure, with the given tag and no members (and perhaps
an explicit indication in the flags field that this is an
incomplete struct). This incomplete structure definition must
now be entered (still in its incomplete form) into the scope
list. At which level? At the current one, just like any other
definition. No other choice would be regular, or make much
sense.

When a structure declaration with a struct-declaration-list is
encountered, whether it has a tag or not, it is the definition of
a new struct type. (See section 3.5.2.1, page 61, lines 23-24.)
This new structure definitely gets defined at the current scope
level. If this new structure has a tag, and if there is already
a structure with the same tag at this scope level, and if that
existing struct was incomplete, this declaration completes it.
(The incomplete definition's struct structure is used, so that
anything already declared using the incomplete type will remain
compatible, by the method of pointer comparison.) If the new
structure has a tag, and if there is already a structure with the
same tag at this scope level, and if that existing struct is
*not* incomplete (already has members), it's an error (an attempt
is being made to redefine the structure) [footnote 2].

Finally, given that there is a new, microscopic (but still
nested) scope active within a function prototype that is not part
of a function definition, we can see why

    extern blort(struct piffle *);
    struct piffle { int cazart; } x;
    blort(&x);

defines two different struct piffles, such that the call to blort
in the third line is in error, while

    struct piffle;
    extern blort(struct piffle *);
    struct piffle { int cazart; } x;
    blort(&x);

works as intended.
That empty struct piffle on the first line is just to get an
incomplete struct piffle entered at file scope, so that the
incomplete struct piffle in the prototype on line 2 will
reference it rather than creating a new one, and so that the
definition on line 3 will complete the same struct referenced in
the prototype, so that the call on line 4 will use a properly
compatible type.

                                        Steve Summit
                                        scs@adam.mit.edu

Footnote 1. The exception is when structures must be compatible
across translation units. Obviously, if they're compiled
separately, the compiler can't compare pointers to its run-time
data structures. In fact, the compiler isn't going to check
compatibility at all; nor, for that matter, does the linker
usually do so. Section 3.1.2.6 describes when structures are
compatible across translation units; presumably a utility like
lint might make use of it. (Obviously, the programmer must also
be aware of this information, if the programs are to work,
although the common and recommended practice of putting structure
definitions in header files of course ensures compatibility.)

Curiously, section 3.1.2.6 requires that the members have the
same types and be in the same order (obviously) and also that
they have the same names, but *not* that the structures have the
same tags. Presumably this means that two structure types with
different tags (or without tags) but with identical descriptions
would be strictly compatible across translation units.
(Obviously, the code would work correctly, under any conceivable
architecture, in any case, and nothing would be likely to go
wrong if the member names didn't match, either.)

Footnote 2. A few months ago, there was a long discussion about
incomplete types and the precise interpretation of the term "an
enclosing scope."
I don't remember if the discussion concerned structure tags (it might have been about incomplete array types), but section 3.5.2.3 states explicitly that when a tag is declared, "Subsequent declarations [with the same tag] shall omit the bracketed list."
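
Returning to the four-line example in the main text: the working
form can be tried with any conforming compiler. The sketch below
makes a few assumptions that are not in the original fragment: an
explicit int return type on blort (newer compilers reject implicit
int), a body for blort, and a wrapper function call_blort holding
the call, since a call cannot appear at file scope.

```c
struct piffle;                      /* enters an incomplete struct piffle at file scope */
extern int blort(struct piffle *);  /* tag lookup finds the file-scope struct piffle */
struct piffle { int cazart; } x;    /* same scope, same tag: completes that struct */

/* Hypothetical body, present only so the example links and runs. */
int blort(struct piffle *p)
{
    return p->cazart;
}

/* Hypothetical wrapper standing in for the call on the fourth line. */
int call_blort(void)
{
    x.cazart = 42;
    return blort(&x);
}
```

Deleting the first line reproduces the failing case: the struct
piffle in the prototype then belongs to the prototype's own scope,
so the later definition declares a different type, and a compiler
should diagnose the call blort(&x) as passing an incompatible
pointer.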