Path: utzoo!attcan!uunet!mcsun!unido!mikros!mwtech!martin From: martin@mwtech.UUCP (Martin Weitzel) Newsgroups: comp.lang.c Subject: Re: why is free() a void? Message-ID: <944@mwtech.UUCP> Date: 25 Oct 90 23:10:15 GMT References: <1749@meaddata.meaddata.com> Reply-To: martin@mwtech.UUCP (Martin Weitzel) Organization: MIKROS Systemware, Darmstadt/W-Germany Lines: 152 In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes: > [Q: why free() doesn't return something "useful"?] > >From K&R 2, page 252: > > "void free(void *p) > ... p MUST be a pointer to space previously allocated by calloc, > malloc, or realloc." (Emphasis on MUST by me.) > >What happens if p WASN'T allocated as it should've been?? How do we know >there was a problem?? Is there really a problem?? As the ANSI-Standard tells us in 4.10.3.2, with a NULL pointer you have a no-op, otherwise UNDEFINED BEHAVIOR. As some others (esp. Chris Torek) never seem to get tired to invent scenarios to illustrate undefined behavior, I'll also try to add one: Undefined behavior *can* mean that your machine becomes inoperatable until you have typed the following sentence 100 times at the console terminal: "The argument handed to free must be NULL or a pointer to space previously allocated by calloc, malloc, or realloc" But let's get more serious. K&R 2 contains in section 8.7 a sample implementation of a storage allocator; the code for the free()-function is on page 188. As this allocator IMHO served for long years as a modell for real implementations, looking at the effects of free()ing some arbitrary pointer with this allocator will serve as a good realistic example for undefined behavior. (And it will give you enough reason to avoid such behavior, even if someone ensures you that "undefined behavior" will neither make your equipment explode, nor result in rude mail beeing sent to your boss ...) The basic idea of K&Rs allocator (the example is essentially the same in K&R I and II), is to keep the free()ed parts of dynamically allocated memory in a linked list. For that purpose each allocation requests some additional bytes, just enough for storing a number and a pointer(%1). So, after you have allocated, say 20, bytes the memory *may* look like follows (lower adresses to the left): Bytes: 4 + 4 20 4 +-----+------+-------------------------------------------+ | XXX | 4 | 20 bytes allocated | | +-----+------+-------------------------------------------+ HEADER ^ | pointer returned be malloc, calloc or realloc (Note that the byte counts above are only given to make the example a bit more realistic - they don't mean that pointers or int-s allways occupy 4 Bytes or are equal in size on all systems!) In the above example the allocator had "really" reserved 32 Bytes. But this "waste" shouldn't be of much interest for the programmer: He or she can count on at least 20 bytes beeing available above the pointer that was returned. Immediatly below the returned pointer there is some very important information stored: The size of the "really" allocated area, meassured in units of the size of the header (8 in the example). These four bytes are the *sole* information for a later free() to know the size of the allocated chunk of memory. So we should have a look at free() now. Its basic work is to walk thru a linked list of previosly freed chunks, which are chained by the first field in the header (XXX above). The list is kept in the order of increasing adresses (for reasons we'll see in a second). After free() has determined where the new part fits in, it links it either into the list, or combines it with the immediatly adjacent chunks. (Clearly, this strategy should keep fragmentation low.) Now let's come to the question what may happen if you try the following: char *greet = "hello, world"; .... free (greet); (Again please note that in the following scenario I make some assumptions, e.g. that static data is located together with string constants in the lower parts of the memory, below the heap; these assumptions are quite realistic, but there are certainly machines and implementations for which this doesn't hold). If free() follows the K&R implementation, it will first notice that the given adress is below the lowest adress of the previously free()ed parts of dynamic memory(%2). Now some bytes immediatly *below* the adress where the character 'h' from the "hello, world"-string is stored, will be taken as int which gives the size of the chunk free() is about to link into the list. NOTE THAT THIS IS TOTALLY UNRELATED TO THE LENGHT OF THE "hello, world"-STRING! Now free() decides that the bytes which are just beeing returned to the allocator cannot be combined with something else, so it links them as an alement into the list, WHICH CHANGES SOME BYTES AT SOME DISTANCE BELOW THE ADRESS OF THE CHARACTER 'h'. Besides that, so far not much damage has happened, and there is a fair chance that the changed bytes belong to some static variable (or other string) which is not any more needed until the program ends. Let's consider the next request to the allocator now (malloc, calloc or realloc). It steps thru the list, treating the bytes above the link-pointer as size information to decide if a given chunk may be used to fullfill the request. If we are lucky, these bytes (again: the contents of some totally unrelated static variable, or some other string) tell that it is too small, and nothing will happen. With less luck, the space of the "hello, world"- string will now be reused - something the programmer who originally free()ed it might have had in mind :-) - but not only that: Remember that the "how much space is available"-information is IN NO WAY RELATED to the length of the "hello, world"-string. If less than the "available" space was requested, some space at the "end" of the chunk under consideration is returned (and of course the size information is properly adjusted). I think some graphic will help to show what this really means: Bytes: 4 4 ---------16---------- 4 4 8 +------------------------------------------------------------+ | XXX nnn hello, world\0 | YYY | mmm | ZZZ | +------------------------------------------------------------+ ^ ^ | pointer handed to free | malloc(6) - XXX are the bytes overwritten by the erreneous free(); - nnn are some bytes that may accidentially form the value 5 (if treated as an int) when the request to malloc() starts beeing processed; nnn will be changed to 3 before malloc() returns; - mmm are some bytes that will be treated as an int and set to the value 2 before malloc() returns; - ZZZ are the bytes that will be overwritten, if the "space" to which malloc() returns a pointer gets used; - YYY are some bytes that may be used as a link pointer if the space just allocated will ever be returned (It is left as an exercise to the reader (:-)) to realize that this will not happen if the free follows immediatly and that also one more malloc() with a specific size could result in changing YYY :-) I hope everyone who has followed so far will be convinced now that things become more and more unpredictable and that undefined behavior is the last thing which one would wish for some program. At least *I* definetly wouldn't want to debug such a program, as it may be very very hard to trace back from the point the error shows up to the point where things started go wrong(%3). ---------------------------------------------------------------------- %1: Note that a more "space efficient" implementation would need only the space for the size, but not for the pointer, as the latter is only required as long as the chunks are "free". The K&R implementation was choosen to implement the obscure AND UNPORTABLE feature that you can use free()ed space until the next malloc() -- a thread about this topic appeared a short time ago in this group. %2: Interestingly enough, though K&R II specifies in the appendix about the library that free(NULL) does nothing, they fail to implement this behavior in their "own" function. What can we learn from this? (No, I have no answer, take it more as a fact I observed.) %3: For debugging techniques which might help you out of such situations, you may want to read Robert Ward's book "Debugging C" or the Article "Controlling the Malloc Heap" by Eric White in the CUJ Volume 7, Number 2 (Feb 1989). -- Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83