Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!lll-winken!ames!ncar!tank!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.c
Subject: Re: dynamically allocating array of struct
Keywords: dynamic array struct
Message-ID: <16757@mimsy.UUCP>
Date: 5 Apr 89 17:52:47 GMT
References: <3658@uhccux.uhcc.hawaii.edu>
Distribution: usa
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 398

In article <3658@uhccux.uhcc.hawaii.edu> cs411s03@uhccux.uhcc.hawaii.edu
(Cs411s03) writes:
>I am having difficulty writing a program which dynamically allocates
>an array of structs via malloc() and casting. ...
>struct ttst (*tptr)[];

	% cdecl
	explain struct ttst (*tptr)[]
	declare tptr as pointer to array of struct ttst
	declare tptr as pointer to array of struct ttst
	Warning: Unsupported in C -- Pointer to array of unspecified dimension
	struct ttst (*tptr)[]
	%

C arrays *must* have a size. What you really want, given the rest of
your example, is `struct ttst *tptr'. Remember that a pointer to an
object that is part of an array can be used to access the entire array.

Time for some replays.

From: chris@mimsy.UUCP (Chris Torek)
Subject: Re: pointers to arrays
Date: 18 Feb 89 04:32:47 GMT

If you think you want a pointer to an array allocated with malloc(),
you are probably wrong. You really want a pointer that points *at*
(not `to') a block of memory (`array') containing a series of `char *'
objects each pointing at a block of memory containing a series of
`char's. The type of such a pointer is `char **'.

You might ask, `what is the difference between a pointer that points
``at'' a block of memory and one that points ``to'' an array?' The
distinction is somewhat artificial (and I made up the words for some
netnews posting in the past).
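
For the struct case in the original question, a pointer that points
`at' a malloc()ed block might be sketched as follows. The members of
`struct ttst' were not shown in the question, so the two fields here
are hypothetical, as are the helper names make_ttst_array() and
last_id(). This is only an illustration of the `struct ttst *tptr'
advice above, not the original poster's program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: the real members of struct ttst are not shown. */
struct ttst {
	int id;
	double value;
};

/*
 * Allocate a block of n structs and return a pointer at the first one,
 * or NULL if malloc fails.  Note the plain `struct ttst *': no
 * pointer-to-array type is needed.
 */
struct ttst *make_ttst_array(size_t n)
{
	struct ttst *tptr;
	size_t i;

	tptr = malloc(n * sizeof *tptr);
	if (tptr == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		tptr[i].id = (int)i;
		tptr[i].value = 0.5 * (double)i;
	}
	return tptr;
}

/* Reach the last element by pointer arithmetic: tptr + i is &tptr[i]. */
int last_id(const struct ttst *tptr, size_t n)
{
	assert(tptr != NULL && n > 0);
	return (tptr + (n - 1))->id;
}
```

The caller indexes the result as tptr[i] and eventually hands the same
pointer to free().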

Given a pointer to array pa:

	int a[5];
	int (*pa)[5] = &a;	/* pANS C semantics for &a */

I can get a pointer that points `at' the array instead:

	int *p = &a[0];

The latter is the more `natural' C version of the former: typically a
pointer points at the first element of a group (here 5). The rest of
the group can be reached via pointer arithmetic: *(p+3), aka p[3],
refers to the same location as a[3]. The pointer need not point at
the first element, as long as it points somewhere into the object:

	p = &a[2];

Now p[1] refers to a[3]; p[-2] refers to a[0].

To use pa to get at a[3] one must write (*pa)[3] (or, equivalently,
pa[0][3]). The thing that is most especially confusing, but that
really makes the difference, is that *pa, aka pa[0], refers to the
entire array `a'. *p refers only to one element of the array. This
can be seen in the result produced by `sizeof':
(sizeof *p)==(sizeof(int)), but
(sizeof *pa)==(sizeof(int[5]))==(5 * sizeof(int)).

Pointers to entire arrays are not particularly useful unless there
are several arrays:

	int twodim[3][5];

Now we can use pa to point to (not at) any of the three array-5-of-int
elements of twodim:

	pa = &twodim[1];	/* or pa = twodim + 1, in Classic C */

and now (*pa)[3] (or pa[0][3]) is an alias for twodim[1][3]. Note
especially that since pa[0] names the *entire* array-5-of-int at
twodim[1], pa[-1] names the entire array-5-of-int at twodim[0].
\bold{Pointer arithmetic moves by whole elements, even if those
elements are aggregates.} Thus pa[-1][2] is an alias for twodim[0][2].

This is merely a convenience, for we can do the same with p:

	p = &twodim[1][0];

Now p points to the 0'th element of the 1'th element of twodim---the
same place that pa[0][0] names. p[3] is an alias for twodim[1][3].
To get at twodim[0][2], take p[(-1 * 5) + 2], or p[-3]. Arrays are
stored in row-major order with the columns concatenated without gaps;
they can be `flattened' (viewed as linear, one-dimensional) with
impunity.
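
The claims about pa, p, and row-major layout can be checked with a
small sketch like the one below. The function names offset_of() and
check_layout() are made up for this illustration. As a conservative
choice on my part, the flattening check compares byte offsets from the
start of twodim rather than indexing p across a row boundary.

```c
#include <assert.h>
#include <stddef.h>

static int twodim[3][5];

/* Byte offset of twodim[i][j] from the start of the whole array. */
size_t offset_of(int i, int j)
{
	return (size_t)((char *)&twodim[i][j] - (char *)twodim);
}

/* Return 1 if the layout claims in the text hold here, else 0. */
int check_layout(void)
{
	int (*pa)[5] = &twodim[1];	/* points `to' one whole row */
	int *p = &twodim[1][0];		/* points `at' the same row */
	int i, j;

	/* *pa is an entire array-5-of-int; *p is a single int. */
	assert(sizeof *pa == 5 * sizeof(int));
	assert(sizeof *p == sizeof(int));

	/* pa[0][3] and p[3] both name twodim[1][3]. */
	if (&pa[0][3] != &twodim[1][3] || &p[3] != &twodim[1][3])
		return 0;
	/* pa moves by whole rows, so pa[-1][2] names twodim[0][2]. */
	if (&pa[-1][2] != &twodim[0][2])
		return 0;
	/* Row-major, no gaps: element (i, j) sits at flat index i*5 + j. */
	for (i = 0; i < 3; i++)
		for (j = 0; j < 5; j++)
			if (offset_of(i, j) != (size_t)(i * 5 + j) * sizeof(int))
				return 0;
	return 1;
}
```

The flat index of twodim[0][2] is 2 and that of twodim[1][0] is 5,
which is where the p[-3] in the text comes from.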
(The flattening concept extends to arbitrarily deep matrices,
so that a six-dimensional array can be viewed as a string of five-D
arrays, each of which can be viewed as a string of four-D arrays, and
so forth, all the way down to a string of simple values.%)

Once you understand this, and see why C guarantees that p[-3],
pa[-1][2], and twodim[0][2] are all the same, you are well on your way
to understanding C's memory model (not `paradigm': that means
`example'). You will also see why pa can only point to objects of
type `array 5 of int', not `array 17 of int', and why the size of the
array is required.

-----
% For fun: the six-D array `char big[2][3][5][4][6][10]' occupies 7200
bytes (assuming one byte is one char). If the first byte is at byte
address 0xc400, find the byte address of big[1][0][3][1][5][5]. I hid
my answer as a message-ID in the references line.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

From: chris@mimsy.UUCP (Chris Torek)
Subject: Re: char ***pointer;
Keywords: allocating space
Date: 18 Nov 88 07:40:26 GMT

	char *p;

declares an object p which has type `pointer to char' and no specific
value. (If p is static or external, it is initialised to (char *)NULL;
if it is automatic, it is full of garbage.) Similarly,

	char **p;

declares an object p which has type `pointer to pointer to char' and
no specific value. We can keep this up for days :-) and write

	char *******p;

which declares an object p which has type `pointer to pointer ... to
char' and no specific value. But we will stop with

	char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char',
and leaves its value unspecified. None of these pointers point *to*
anything, but if I say, e.g.,

	char c = '!';
	char *pc = &c;
	char **ppc = &pc;
	char ***pppc = &ppc;

then I have each pointer pointing to something. pppc points to ppc;
ppc points to pc; pc points to c; and hence, ***pppc is the character
'!'.
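
The four declarations above can be exercised directly. In this small
sketch the function names follow() and chain_demo() are invented for
the illustration; everything else is taken from the text.

```c
#include <assert.h>

/* Follow all three levels of indirection and return the char at the end. */
char follow(char ***pppc)
{
	return ***pppc;
}

/* Build the chain from the text, then store through it. */
char chain_demo(void)
{
	char c = '!';
	char *pc = &c;
	char **ppc = &pc;
	char ***pppc = &ppc;

	assert(*pppc == &pc);		/* one `*' removed: a char ** */
	assert(**pppc == &c);		/* two removed: a char * */
	assert(follow(pppc) == '!');	/* three removed: the char itself */

	***pppc = '?';			/* stores into c itself */
	return c;
}
```

Each dereference strips one level, so a store through ***pppc changes
c and nothing else.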

Now, there is a peculiar status for pointers in C: they point not only
to the object immediately at *ptr, but also to any other objects in an
array named by *(ptr+offset). (The latter can also be written as
ptr[offset].) So I could say:

	int i, j, k;
	char c[NPPC][NPC][NC];
	char *pc[NPPC][NPC];
	char **ppc[NPPC];
	char ***pppc;

	pppc = ppc;
	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];
		for (j = 0; j < NPC; j++) {
			pc[i][j] = c[i][j];
			for (k = 0; k < NC; k++)
				c[i][j][k] = '!';
		}
	}

What this means is perhaps not immediately clear%. There is a
two-dimensional array of pointers to characters pc[i][j], each of
which points to a number of characters, namely those in c[i][j][0]
through c[i][j][NC-1]. A one-dimensional array ppc[i] contains
pointers to pointers to characters; each ppc[i] points to a number of
pointers to characters, namely those in pc[i][0] through
pc[i][NPC-1]. Finally, pppc points to a number of pointers to
pointers to characters, namely those in ppc[0] through ppc[NPPC-1].
-----
% :-)
-----

The important thing to note is that each variable points to one or
more objects whose type is the type derived from removing one `*' from
the declaration of that variable. (Clear? :-) Maybe we should try it
this way:) Since pppc is `char ***pppc', what pppc points to (*pppc)
is of type `char **'---one fewer `*'. pppc points to zero or more
objects of this type; here, it points to the first of NPPC objects.

As to malloc: malloc obtains a blob of memory of unspecified shape.
The cast you put in front of malloc determines the shape of the blob.
The argument to malloc determines its size. These should agree, or
you will get into trouble later. So the first thing we need to do is
this:

	pointer = (char ***)malloc(N * sizeof(char **));
	if (pointer == NULL)
		quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'.
None of those `char **'s will have any particular value (i.e., they do
not point anywhere at all; they are garbage).
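
One way to keep the shape and the size in agreement is to let the
compiler derive the element size from the pointer itself. The sketch
below assumes a compiler where malloc() returns `void *'; the function
name alloc_table() is hypothetical. It also sets each slot to NULL so
that none of them is left as garbage.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Allocate n objects of type `char **' and return a pointer at the
 * first one, or NULL if malloc fails.  sizeof *pointer is
 * sizeof(char **), so the size always matches the pointer's type.
 */
char ***alloc_table(size_t n)
{
	char ***pointer;
	size_t i;

	assert(n > 0);
	pointer = malloc(n * sizeof *pointer);
	if (pointer == NULL)
		return NULL;
	/* Give every slot a definite value instead of garbage. */
	for (i = 0; i < n; i++)
		pointer[i] = NULL;
	return pointer;
}
```

Each pointer[i] can then be given its own block of `char *' objects.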
If we make them point somewhere---to some object(s) of type
`char *'---and make those objects point somewhere, then we will have
something useful.

Suppose we have done the one malloc above. Then if we use:

	pointer[0] = (char **)malloc(N1 * sizeof(char *));
	if (pointer[0] == NULL)
		quit("out of memory");

pointer[0] will point at N1 objects, each of type `char *'. So we can
then say, e.g.,

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
		pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy
of its string argument, and then copies the string to that space and
returns the new pointer. If malloc fails, strdup() returns NULL.)

We could write instead

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL) {
		*(*pointer)++ = strdup(buf);
		i++;
	}

Note that

	**pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the
value in `pointer', not that in pointer[0]. But using *(*pointer)++
means that we will later have to write

	pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and
strdup()ed, or else use negative subscripts to locate the strings.

Probably all of this will be somewhat clearer with a more realistic
example. The following code creates an array of arrays of lines.

/* begin code (untested) */
/* this assumes prototypes are available */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char nomem[] = "out of memory, exiting";

void
quit(char *msg)
{
	(void) fprintf(stderr, "%s\n", msg);
	exit(1);
	/* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *
readstr(FILE *f)
{
	register char *s = NULL, *p;
	int more = 1, curlen = 0, l;
	char inbuf[BUFSIZ];

	/*
	 * The following loop is not terribly efficient if you have
	 * many long input lines.
	 */
	while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
		p = strchr(inbuf, '\n');
		if (p != NULL) {	/* got it all */
			*p = 0;
			l = p - inbuf;
			more = 0;	/* signal stop */
		} else
			l = strlen(inbuf);
		/*
		 * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
		 * if your realloc does not work that way, you will
		 * have to fix this.
		 */
		s = realloc(s, curlen + l + 1);
		if (s == NULL)
			quit(nomem);
		strcpy(s + curlen, inbuf);
		if (more == 0)		/* done; stop */
			break;
		curlen += l;
	}
	/* should check for input error, actually */
	return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * The vector has n+1 entries, the last one being NULL.
 */
char **
readfile(FILE *f)
{
	register char **vec, *s;
	register int veclen;

	/*
	 * This is terribly inefficient, but it should be correct.
	 *
	 * malloc below is implicitly cast to (char **), but this
	 * depends on it returning (void *); old compilers need the
	 * cast, since malloc() returns (char *). The same applies
	 * to realloc() below.
	 */
	vec = malloc(sizeof(char *));
	if (vec == NULL)
		quit(nomem);
	veclen = 0;
	while ((s = readstr(f)) != NULL) {
		vec = realloc(vec, (veclen + 2) * sizeof(char *));
		if (vec == NULL)
			quit(nomem);
		vec[veclen++] = s;
	}
	vec[veclen] = NULL;
	return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***
readlots(register char **names)
{
	register char ***p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(char **));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		if ((f = fopen(*names, "r")) == NULL) {
			(void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
				*names, strerror(errno));
			continue;
		}
		vp = readfile(f);
		(void) fclose(f);
		p = realloc(p, (nread + 2) * sizeof(char **));
		if (p == NULL)
			quit(nomem);
		p[nread++] = vp;
	}
	p[nread] = NULL;
	return (p);
}

/* e.g., instead:
struct file_data {
	char	*fd_name;
	char	**fd_text;
};

struct file_data *
readlots(register char **names)
{
	register struct file_data *p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(*p));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		<...same file-reading code as above...>
		p = realloc(p, (nread + 2) * sizeof(*p));
		if (p == NULL)
			quit(nomem);
		p[nread].fd_name = *names;
		p[nread].fd_text = vp;
		nread++;
	}
	p[nread].fd_name = NULL;
	p[nread].fd_text = NULL;
	return (p);
}
*/
/* end of code */
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris