Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!ut-sally!pyramid!decwrl!sun!guy
From: guy@sun.uucp (Guy Harris)
Newsgroups: net.lang.c,net.micro.pc,net.unix
Subject: Re: My pointer stuff: C caught me again (?) but it has truths in it
Message-ID: <4609@sun.uucp>
Date: Sun, 29-Jun-86 02:11:01 EDT
Article-I.D.: sun.4609
Posted: Sun Jun 29 02:11:01 1986
Date-Received: Mon, 30-Jun-86 04:19:53 EDT
References: <1242@ncoast.UUCP> <418@dg_rtp.UUCP> <1267@ncoast.UUCP>
Organization: Sun Microsystems, Inc.
Lines: 149
Xref: watmath net.lang.c:9640 net.micro.pc:8910 net.unix:8446

> The code in question is two analogous sections:
>
> -------- section 1 ---------
>
> struct sfld (*__cursf)[] = (struct sfld (*)[]) 0;
>
> if ((__cursf = (struct sfld (*)[]) calloc(n, sizeof (struct sfld)))
>     == (struct sfld (*)[]) 0) ...
>
> ----------------------------
>
> This was intended to allocate an array and assign it to a variable of type
> ``pointer to array of (struct sfld)''.  I suspect the type is wrong but I'm
> not sure how to declare such a beastie; I suspect that it *does* *not*
> *exist* *at* *all* in C, now that I've played with it.

Wrongo.  "struct sfld (*cursf)[]" *is* a declaration of a pointer to an
array of "struct sfld".  However, it is not possible to generate a value
with that type by taking the address of an object which is an array of
"struct sfld".  You *can* generate a value of that type by using the name of
an array of arrays of "struct sfld"; such a name has the type of a pointer
to an element of that array, and hence the type "pointer to array of 'struct
sfld'".

(By the way, the casts of "0" are not necessary; the compiler knows that the
LHS of the "=" operator in the declaration, and the "==" operator in the
"if", is a pointer, and thus knows that it must coerce the "0" into a null
pointer of the appropriate type.)

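To make the two points above concrete, here is a minimal sketch.  The member
"width", the array "table" and the function name are all made up for
illustration (the post never shows what is inside "struct sfld").  It uses
the name of an array of arrays as a "pointer to array of 'struct sfld'", and
it initializes and tests the pointer against a bare 0 with no cast.

```c
/* Hypothetical stand-in: the post never shows the members of struct sfld. */
struct sfld {
    int width;
};

/* An array of arrays.  Used in an expression, the name "table" has type
   "pointer to array of 4 struct sfld". */
static struct sfld table[2][4];

/* Returns table[1][2].width, reached through a pointer to an array. */
int width_via_pointer_to_array(void)
{
    struct sfld (*cursf)[] = 0;   /* bare 0 becomes a null pointer; no cast */

    if (cursf == 0)               /* no cast needed here either */
        cursf = table + 1;        /* points at the second row of table */

    table[1][2].width = 42;
    return (*cursf)[2].width;     /* third element of the pointed-to array */
}
```

Since "cursf" points to an array of unspecified size, "(*cursf)[2]" is fine
but "cursf + 1" is not; the compiler has no idea how big a step to take.
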
The "calloc" here *allocates* an *array* of "struct sfld"; however, it
*returns* a pointer to the first element of that array.

> This could easily have been done correctly:
>
> int array[3]; -- should declare a pointer followed by 3 integers, with the
>                  pointer initialized to the 3 integers
> int array[];  -- should declare a pointer.

No, NO, *NO*, ***N*O****,

	N   N  OOOOO  !
	NN  N  O   O  !
	N N N  O   O  !
	N N N  O   O  !
	N N N  O   O  !
	N  NN  O   O
	N   N  OOOOO  !

"int array[3]" does not, and should not, declare any sort of pointer.  It
should reserve storage for three "int"s - PERIOD!  "int array[]" should, if
"array" is initialized, declare an array with as many members as appear in
the initialization; if it's not initialized, it should either be an error or
be considered an "extern" declaration of an array whose size is specified
(and whose storage is reserved) in another module.

The only pointer involved should be the *constant expression* "array", which
has type "pointer to 'int'" when it appears in an expression.  NO storage
should be reserved to hold this "pointer", because no storage NEEDS to be
reserved to hold this pointer - any more than storage needs to be reserved
(except, possibly, in the instruction stream, or maybe in a literal pool)
for the "3" in the expression "x + 3".

> C should treat ``int array[]'' as a different type from ``int *ptr'',

It does.  That's what people have been trying to tell you!

> and while ``int array[3]'' and ``int array[]'' are the same type, the sized
> array's pointer should be treated as a constant.  (This may be arguable.)

Damn straight it's arguable.  NEITHER array has a "pointer" in the sense of
a location of memory which holds a pointer to that array.  The name "array"
is, when used in an expression, a *constant* pointer to the first member of
that array - in *both* cases.

> the malloc()'ed one is type (int *), to the C compiler (to me, int [])
> the declared one is type (int []), to the C compiler
> (which defines (int []) as (int *))

No, it doesn't.
You haven't been listening.  *Start* listening.

To the C compiler, "int []" declares an array of "int"s, which is normally
implemented as a consecutive block of locations holding "int"s.  However, an
array can *not* be used as an object in an expression.  You can't do array
assignment, you can't add two arrays, you can't pass arrays to functions as
arguments, and you can't have a function which returns an array.  When the
name of an array is used in an expression, it is *reinterpreted* as a
*constant* pointer to the first element of that array.

The "malloc()'ed one" is type "int []"; however, "malloc" returns a pointer
to the first element of that array.  This is not much stranger than

	int *x;

	x = (int *) malloc(sizeof (int));

"malloc" can't very well return an "int" here, it can *only* return a
*pointer* to what it has allocated.  You *have* to declare a "pointer to
'int'" here, even though the object which "malloc" has allocated is an
"int", not a "pointer to 'int'".  The same is almost true of arrays, except
that you declare a pointer to an object of type <type>, rather than of type
"array of <type>", when "malloc"ing an array.

> and they are in fact identical in memory, so the C compiler treats them as
> identical period.

Bullshit.  A pointer to "int" and an array of "int" are in NO WAY identical
in memory.

> Come to think of it -- can malloc() or similar be typed right anyway?  I
> suspect this is why Pascal uses the ``new(pointer)'' construct, known to the
> compiler; it's type-able at compile time.  But catching the allocation of an
> (int []) (vs. an (int)) from malloc() and forcing the former to be assigned
> to a variable of type (int []) and the latter to an (int *) is nearly
> impossible even when the language considers (int []) and (int *) to be
> different.

No, no, no!  If you "malloc" an array, you don't assign the result of
"malloc" to a variable of type "int []".
What you want is to be able to assign it to a variable of type "pointer to
array of 'int'" and use that pointer to refer to that array.  If you
"malloc" an "int", you don't assign the result to a variable of type "int",
do you?

The problem here is that you don't deal with pointers to arrays in the
following fashion:

	int (*pointer_to_array)[];

	pointer_to_array =
	    (int (*)[]) malloc(number_of_array_elements * sizeof (int));

	third_element_of_malloced_array = (*pointer_to_array)[2];

If arrays had been first-class types in C, this would have been how you
would have done it.  Instead, you have to do:

	int *pointer_to_first_element_of_array;

	pointer_to_first_element_of_array =
	    (int *) malloc(number_of_array_elements * sizeof (int));

	third_element_of_malloced_array =
	    pointer_to_first_element_of_array[2];
	/* or *(pointer_to_first_element_of_array + 2) */

This is the source of infinite confusion for some C programmers, and I agree
with Wayne that it was, on balance, a mistake.  It *can't* be fixed now,
however fervently one might wish to do so.  It's *too late*.  C is *already
out there*, and changing it now would break too many programs.  If you
change it, you'll have to call the resulting language D (or P).
--
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com (or guy@sun.arpa)
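
As a footnote to the two fragments in the post, here is a minimal sketch
that puts both styles side by side on one "malloc"ed block.  The function
name, the variable names and the stored values are invented for
illustration; nothing here comes from the original code under discussion.

```c
#include <stdlib.h>

/* Allocates n ints, stores i * 10 in element i, then reads element 2 through
   a pointer to the first element and through a pointer to the whole array.
   Returns 1 if every read yields 20; returns 0 if n < 3 or malloc fails. */
int third_element_both_ways(size_t n)
{
    int *first;        /* pointer to the first element: the usual idiom */
    int (*whole)[];    /* pointer to the array as a whole */
    size_t i;
    int ok;

    if (n < 3)
        return 0;

    first = malloc(n * sizeof (int));
    if (first == 0)
        return 0;

    for (i = 0; i < n; i++)
        first[i] = (int) i * 10;

    whole = (int (*)[]) first;   /* same address, different pointer type */

    ok = (*whole)[2] == 20       /* dereference, then subscript */
      && first[2] == 20          /* subscript the element pointer */
      && *(first + 2) == 20;     /* the equivalent pointer arithmetic */

    free(first);
    return ok;
}
```

Both pointers hold the same address; only the type differs, and with it the
amount of punctuation needed to get at an element.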