Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!seismo!umcp-cs!chris
From: chris@umcp-cs.UUCP (Chris Torek)
Newsgroups: net.lang.c
Subject: Pointers and Arrays
Message-ID: <2206@umcp-cs.UUCP>
Date: Sun, 29-Jun-86 12:10:14 EDT
Article-I.D.: umcp-cs.2206
Posted: Sun Jun 29 12:10:14 1986
Date-Received: Mon, 30-Jun-86 06:03:12 EDT
References: <1242@ncoast.UUCP> <418@dg_rtp.UUCP> <1267@ncoast.UUCP> <2201@umcp-cs.UUCP>
Reply-To: chris@maryland.UUCP (Chris Torek)
Organization: University of Maryland, Dept. of Computer Sci.
Lines: 240

In article <2201@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes:
>Perhaps I just have an odd mind, but all this pointer/array stuff
>never really bothered me.

Or perhaps I simply read K&R, chapter 5, Pointers and Arrays.  I
needed to refer to K&R recently (see article <2204@umcp-cs.UUCP>),
and while I was looking at it, I just happened to stumble across
some text in this chapter that seems to me quite clear.  Let me
give some excerpts, with commentary.

(Suggestion: while reading this, imagine me grinning teasingly at
points.  I hope the tone comes across properly, but I have spent
enough time revising this now---great grief, an hour and a half
now!)

    It is also necessary to declare the variables that participate
    in all of this:

        int x, y;
        int *px;

    The declaration of x and y is what we've seen all along.  The
    declaration of the pointer px is new.

        int *px;

    is intended as a mnemonic; it says that the combination *px is
    an int, that is, if px occurs in the context *px, it is
    equivalent to a variable of type int.  In effect, the syntax
    of the declaration for a variable mimics the syntax of
    expressions in which the variable might appear.  This reasoning
    is useful in all cases involving complicated declarations.  For
    example

        double atof(), *dp;

    says that in an expression atof() and *dp have values of type
    double.

So much for understanding declarations.  K&R said it all, eight
years ago.

    ...
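The mnemonic can be tried out directly.  What follows is only a
minimal sketch, and it assumes an ANSI-style compiler (prototypes,
<stdlib.h>) rather than the 1978 dialect quoted above.  The function
names deref_example and atof_example are invented for illustration;
they appear nowhere in K&R.

```c
#include <stdlib.h>     /* declares atof() */

/* "int *px;" reads as: the expression *px is an int.
   (deref_example is a hypothetical name, not from K&R.) */
int deref_example(void)
{
    int x = 1, y = 0;
    int *px;            /* so *px may stand wherever an int may */

    px = &x;
    y = *px;            /* *px used as an int value: y becomes 1 */
    *px = y + 1;        /* *px used as an int variable: x becomes 2 */
    return x;
}

/* "double atof(), *dp;" reads as: atof(...) and *dp are both double.
   (atof_example is likewise a hypothetical name.) */
double atof_example(const char *s)
{
    double d, *dp = &d;

    *dp = atof(s);      /* both sides of the assignment are double */
    return d;
}
```

In both functions the declaration and the use have the same shape:
`*px' is declared int and is used as an int, and `*dp' is declared
double and is assigned a double.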
    Any operation which can be achieved by array subscripting can
    also be done with pointers.  The pointer version will in
    general be faster but, at least to the uninitiated, somewhat
    harder to grasp immediately.

K&R seem to have a gift for understatement.

    The correspondence between indexing and pointer arithmetic is
    evidently very close.  In fact, a reference to an array is
    converted by the compiler to a pointer to the beginning of the
    array.  The effect is that an array name *is* a pointer
    expression.  ...

(Note `expression', not `variable'.  The above does not apply to
sizeof.)

    There is one difference between an array name and a pointer
    that must be kept in mind.  A pointer is a variable, so pa=a
    and pa++ are sensible operations.  But an array name is a
    *constant*, not a variable: constructions like a=pa or a++ or
    p=&a are illegal.

`p = &a' is much like `p = &3': illegal by fiat, not because it
cannot be done.  If it were legal, `&a' would have type `pointer to
array of <type>' (compare with `a', which has type `pointer to
<type>').

    When an array name is passed to a function, what is passed is
    the location of the beginning of the array.  Within the called
    function, this argument is a variable, just like any other
    variable, and so an array name argument is truly a pointer,
    that is, a variable containing an address.  ...

    As formal parameters in a function definition,

        char s[];

    and

        char *s;

    are exactly equivalent; ...

This is all in the context of singly-dimensioned arrays, but with
the proper mindset applies to multi-dimensional arrays without
trouble.  (With the wrong mindset it leads to much confusion.)  K&R
will have more to say about this later.  Note that this is where
sizeof starts acting odd:  A compiler treats the following as
equivalent:

    array                   pointer
    -----                   -------
    f(arr)                  f(ap)
    int arr[];              int *ap;
    {                       {
    ...                     ...

    f(a2)                   f(a2p)
    int a2[][5];            int (*a2p)[5];
    {                       {
    ...                     ...

The second equivalent pointer version is neither `int **a2p' nor
`int *a2p'; nor for that matter is it `int *a2p[5]'.
This is consistent, if (painfully apparently, given recent
net.lang.c articles) confusing.

    5.7 Multi-Dimensional Arrays

    C provides for rectangular multi-dimensional arrays, though in
    practice they tend to be much less used than arrays of
    pointers.  ...

    ...  In C, by definition a two-dimensional array is really a
    one-dimensional array, each of whose elements is an array.
    Hence subscripts are written as

        day_tab[i][j]

    rather than

        day_tab[i, j]

    as in most languages.  ...

What they do *not* mention is that day_tab[i,j] is a valid
expression (the comma operator discards i, leaving day_tab[j]),
and tends to surprise people.  Lint does not, unfortunately, warn
about these.

    If a two-dimensional array is to be passed to a function, the
    argument declaration in the function *must* include the column
    dimension; the row dimension is irrelevant, since what is
    passed is, as before, a pointer.

What did I tell you?  Note that this *is* consistent.  One cannot
pass an array as an argument to a function.  Pointers, however, are
fine, *including pointers to arrays*.  Given an array of two or
more dimensions, the array `constant' is converted to a pointer to
an array of one fewer dimensions.  This is now a *pointer*, and
remains a pointer until dereferenced.  For example, in

    int day_tab[2][13] = { ... };

the following are type-correct calls:

    f2d(p) int (*p)[13]; { ... }
    f1d(p) int *p; { ... }

    proc()
    {
                                /* argument types: */
        f2d(day_tab);           /* pointer to array 13 of int */
        f2d(&day_tab[0]);       /* pointer to array 13 of int */
        f1d(day_tab[0]);        /* pointer to int */
        f1d(&day_tab[0][0]);    /* pointer to int */
    }

Calling f2d(&day_tab[0][0]) passes the right *value* but the wrong
*type*.  That it happens to work is not an excuse to do it.  If C
were different, it would be different, but it is not, so it is not.

To return to K&R:

    5.10 Pointers vs. Multi-dimensional [sic] Arrays

(So they are not consistent with capitalisation in section names.)

    Newcomers to C are sometimes confused about the difference
    between a two-dimensional array and an array of pointers, ...
Ah, a gift indeed.

    Given the declarations

        int a[10][10];
        int *b[10];

    the usage of a and b may be similar, in that a[5][5] and
    b[5][5] are both legal references to a single int.  But a is a
    true array: all 100 storage cells have been allocated, and the
    conventional rectangular subscript calculation is done to find
    any given element.  For b, however, the declaration only
    allocates 10 pointers; each must be set to point to an array
    of integers.  Assuming that each does point to a ten-element
    array, then there will be 100 storage cells set aside, plus
    the ten cells for the pointers.  Thus the array of pointers
    uses slightly more space, and may require an explicit
    initialization step.  But it has two advantages: accessing an
    element is done by indirection through a pointer rather than
    by a multiplication and addition, and the rows of the array
    may be of different lengths.  That is, each element of b need
    not point to a ten-element vector; some may point to two
    elements, some to twenty, and some to none at all.

Now for some even more horrid examples of my own, all type-correct:

    /* declare st as array 1 of array 5 of pointer to char */
    char *st[1][5] = { { "fee", "fie", "foo", "fum", "foobar" } };

    /* declare x as pointer to array 5 of pointer to char */
    char *(*x)[5] = st;

    /* declare y as array 1 of array 3 of array 4 of pointer to
       array 5 of pointer to char */
    char *(*y[1][3][4])[5] = {
        { { st, 0, 0, st }, { 0, st, st, 0 }, { 0, 0, st, st } }
    };

    /* declare p as array 2 of pointer to array 3 of array 4 of
       pointer to array 5 of pointer to char */
    char *(*(*p[2])[3][4])[5] = { y, 0 };

It does take some trickery to do this.  Given the declaration

    char *strings[5] = { ... };

the type of `strings' is `array 5 of pointer to char', which, when
used in an expression, becomes `pointer to pointer to char' (by
changing the first `array of' to `pointer to'), but for `x' and `y'
I wanted a type of `pointer to array 5 of pointer to char'.
It might be nice if I could write `&strings' to get this, but I
cannot; however, I can use the declaration above for `st' to get
`array 1 of array 5 of pointer to char'.  Changing the first `array
of' yields `pointer to array 5 of pointer to char', which was what
I wanted.  Likewise, for `p' I wanted `y' to evaluate to `pointer
to array 3 of array 4 of pointer to array 5 of pointer to char'; in
order to get that, I again used a `fake' [1] in the declaration.

`You can hack anything you want, with pointers and funny C . . .'
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:   seismo!umcp-cs!chris
CSNet:  chris@umcp-cs           ARPA:   chris@mimsy.umd.edu