Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!helios!bcm!dimacs.rutgers.edu!mips!swrinde!elroy.jpl.nasa.gov!decwrl!world!ksr!jfw
From: jfw@ksr.com (John F. Woods)
Newsgroups: comp.lang.c
Subject: Re: Novice question about malloc and pointers
Message-ID: <3182@ksr.com>
Date: 17 Apr 91 21:31:06 GMT
References: <9104171614.AA14362@enuxha.eas.asu.edu>
Sender: news@ksr.com
Lines: 85

trotter@ENUXHA.EAS.ASU.EDU (Russell T. Trotter) writes:
>I am trying to get an array of strings, therefore I am using the
>following declaration: char *str[MAX] where MAX is an arbitrary
>constant.  My question is how do I allocate the memory for each
>character position?

I assume by "character position" you really mean each pointer in the
array; each one can be malloc'ed separately, one string at a time.

>Do all the characters strings for each element
>in the array need to be allocated contiguously?

Unless your application wants to run blindly off the end of one string
and onto the next (sounds like a bad idea to me), the strings need not
be contiguous with one another.  Note, though, that each string itself
occupies one contiguous block of memory; a single string is never split
into several pieces.

>The problem involves reading in lines of input.
>Each line would be stored as a string and
>the number of lines make up the number of elements in the array.

Here is where the Art of programming comes in.  The most obvious
implementation (which arbitrarily limits each line to 512 bytes plus
newline) is:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX 666
    char *str[MAX];
    ...
    void snuffle_file(FILE *f)
    {
        char buf[514];   /* 512 bytes + newline + terminating '\0' */
        int i;

        /* Read in lines from the file f */
        for (i = 0; i < MAX; i++) {
            /* Read in a line */
            if (fgets(buf, sizeof buf, f) == NULL)
                break;
            /* Treasure the line in a copy (the +1 holds the '\0') */
            if ((str[i] = malloc(strlen(buf) + 1)) == NULL) {
                fprintf(stderr, "Go buy more memory.\n");
                return;
            }
            strcpy(str[i], buf);
            /* Note that if you have strdup() the above 5 lines
             * become:
             *     if ((str[i] = strdup(buf)) == NULL) {
             *         (same error handling)
             *     }
             */
        }
    }

This isn't necessarily the best one can do, though; arbitrary line
length limits are annoying, so the first obvious improvement is to
replace the fgets-into-a-buffer strategy with something like (in
pseudo-C)

    newstring = malloc(some pittance likely to hold most lines, like 32)
    while not-yet-eof and haven't-seen-a-newline
        fgets into a buffer
        if there's not enough room in newstring to add the buffer to the end,
            newstring = realloc(newstring, current size + some)
            /* remember that memory can run out */
        add buffer contents to end of newstring
    newstring = realloc(newstring, actual size) /* free a few bytes at end */

I believe that 4.4BSD will have a "readline()" function that does
roughly that, or you can figure out how to roll your own, or dig up any
one of the innumerable versions that have been lodged in netnews
postings over the past decade.

The next annoying limit that should die is "MAX": a similar strategy of
realloc-if-out-of-room can be used to remove the arbitrary limit on the
number of lines.

Something else to ponder is: if you *know* that the file size is
relatively small in comparison to the amount of memory you have, and if
you *know* that your OS does reasonable things for large reads from
files, it *may* be worth mallocing a contiguous buffer which is large
enough to hold the file, slurping it all in with one single read, and
then picking it apart into lines.
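
A minimal sketch of that whole-file approach, under loud assumptions:
slurp() and split_lines() are hypothetical names invented here, the
stream must be seekable, ftell() must report the size in bytes (true of
ordinary files on most systems, not guaranteed for text streams
everywhere), and the text must contain no embedded '\0' bytes.  Unlike
fgets(), this version drops the newline from each line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read all of f into one malloc'ed buffer,
 * terminated with '\0'.  Stores the byte count in *len.
 * Returns NULL if the size can't be found, memory runs out,
 * or the read fails. */
char *slurp(FILE *f, size_t *len)
{
    long size;
    char *buf;
    size_t got;

    /* Assumption: seeking to the end and asking ftell() gives the size */
    if (fseek(f, 0L, SEEK_END) != 0 || (size = ftell(f)) < 0)
        return NULL;
    rewind(f);

    if ((buf = malloc((size_t)size + 1)) == NULL)
        return NULL;

    /* One single read for the whole file */
    got = fread(buf, 1, (size_t)size, f);
    if (ferror(f)) {
        free(buf);
        return NULL;
    }
    buf[got] = '\0';
    *len = got;
    return buf;
}

/* Hypothetical helper: pick buf apart in place.  Each '\n' is
 * overwritten with '\0' and the start of each line is stored in
 * lines[].  Returns the number of lines found (at most max). */
size_t split_lines(char *buf, char **lines, size_t max)
{
    size_t n = 0;
    char *p = buf, *nl;

    while (*p != '\0' && n < max) {
        lines[n++] = p;
        if ((nl = strchr(p, '\n')) == NULL)
            break;              /* last line had no newline */
        *nl = '\0';
        p = nl + 1;
    }
    return n;
}
```

The lines[] entries all point into the one buffer, so a single
free() of the pointer returned by slurp() releases every line at once.
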
Note that this strategy may perform poorly on someone else's machine
for perfectly good reasons, and worse, may misfire badly the first time
some clown gives your program a 16MB text file when you "knew" the
limit would be 16KB.  Balancing the tradeoffs well is an art.
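
For the two realloc-if-out-of-room ideas above (no limit on line
length, no limit on the number of lines), here is one possible sketch.
It is an illustration rather than the definitive version: read_line()
and read_lines() are hypothetical names invented here, the buffers
double when full instead of growing by a fixed "some", and input is
assumed to be text without embedded '\0' bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from f.
 * Returns a malloc'ed string (newline kept, as fgets does), or NULL
 * at end of file or when memory runs out (the caller can't tell
 * which; a fuller version would report the difference). */
char *read_line(FILE *f)
{
    size_t size = 32, len = 0;   /* a pittance likely to hold most lines */
    char *line, *tmp;

    if ((line = malloc(size)) == NULL)
        return NULL;

    /* Each fgets appends to whatever has been gathered so far */
    while (fgets(line + len, (int)(size - len), f) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n')
            break;                       /* seen a newline: done */
        if (len + 1 == size) {           /* buffer full: grow it */
            size *= 2;
            if ((tmp = realloc(line, size)) == NULL) {
                free(line);              /* memory can run out */
                return NULL;
            }
            line = tmp;
        }
    }
    if (len == 0) {                      /* nothing read: EOF or error */
        free(line);
        return NULL;
    }
    line[len] = '\0';

    /* Give back the unused bytes at the end; if the shrink fails,
     * the original block is still valid, so keep using it. */
    tmp = realloc(line, len + 1);
    return tmp != NULL ? tmp : line;
}

/* Hypothetical helper: read every line of f into a growing array.
 * Stores the number of lines in *count.  Returns NULL if there were
 * no lines; on memory exhaustion it stops early and returns the
 * lines gathered so far. */
char **read_lines(FILE *f, size_t *count)
{
    size_t cap = 0, n = 0;
    char **lines = NULL, **tmp, *line;

    while ((line = read_line(f)) != NULL) {
        if (n == cap) {                  /* array full: grow it */
            cap = cap ? cap * 2 : 16;
            if ((tmp = realloc(lines, cap * sizeof *lines)) == NULL) {
                free(line);
                break;
            }
            lines = tmp;
        }
        lines[n++] = line;
    }
    *count = n;
    return lines;
}
```

Doubling keeps the number of realloc calls logarithmic in the final
size; growing by a fixed increment, as in the pseudo-C, works too and
wastes less memory on short inputs.  The caller frees each string and
then the array itself.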