Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!dtgcube!ed
From: ed@DTG.COM (Edward Jung)
Newsgroups: comp.sys.next
Subject: Re: [self free] is a bad example (long)
Summary: Objective-C semantics supports example
Message-ID: <1990Jan14.121250.5302@uunet!dtgcube>
Date: 14 Jan 90 12:12:50 GMT
References: <1842@opus.cs.mcgill.ca>
Sender: ed@uunet!dtgcube (Edward Jung)
Reply-To: ed@DTG.COM (Edward Jung)
Organization: The Deep Thought Group, L.P.
Lines: 214
In-Reply-To: clement@opus.cs.mcgill.ca (Clement Pellerin)

In article <1842@opus.cs.mcgill.ca>, clement@opus (Clement Pellerin) writes:
[a bunch of stuff]

This is a rather long message in reply to Clement Pellerin's question
about [self free] in Objective-C.

SOME BACKGROUND

Pure object oriented languages do not allow direct access to instance
variables; all access is mediated by messages. In Smalltalk and CLOS
(Common Lisp Object System), for example, you do not access instance
variables directly, but rather through an access method. Objective-C,
being a hybrid language anyhow, allows instances to access their own
instance variables directly, and the instance variables of other
instances that have been declared using the @public keyword or via
structs and the @defs keyword.

One side-effect of direct access to instances is that instance
variables must be fixed in size throughout the duration of an execution
of a program. You can add methods to a class definition at run-time,
but you cannot change the size or number of a class' instance
variables. (We are working on a system that will allow this via
incremental run-time relink).

HOW THE COMPILER MAKES AN OBJECT

An object (class instance) is allocated using the equivalent of
malloc(). This is performed by Object's new method. The symmetric
free() is performed by Object's free method. Since all allocation and
freeing is performed via the inheritance chain (calls to super), all
objects end up allocated and freed in this manner.
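
As a rough illustration of the allocation just described, here is a
sketch in plain C. Everything in it is an assumption made for the sake
of the example: the names class_info, my_object, object_new and
object_free are hypothetical and are not the NeXT run-time's actual
structures or entry points. It only shows the idea that an instance is
one block of a size fixed by its class, obtained from and returned to
the ordinary C allocator.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical class record: the instance size is fixed at load time. */
struct class_info {
    const char *name;
    size_t instance_size;
};

/* Hypothetical instance layout: the isa link first, then the
   instance variables at fixed offsets. */
struct my_object {
    struct class_info *isa;
    int anInstanceVariable;
    char *aPtr;
};

static struct class_info MyObjectClass = {
    "MyObject", sizeof(struct my_object)
};

/* Roughly what Object's new might do (an assumption, not NeXT's code):
   allocate one zeroed block of the class's fixed size and set isa. */
static void *object_new(struct class_info *cls)
{
    /* isa is the first member, so the block's address is also the
       address of the isa slot. */
    struct class_info **isa_slot = calloc(1, cls->instance_size);

    if (isa_slot != NULL)
        *isa_slot = cls;
    return isa_slot;
}

/* Roughly what Object's free might do: hand the block back to the
   allocator and return NULL, the way free returns nil. */
static void *object_free(void *obj)
{
    free(obj);
    return NULL;
}
```

Because the size is taken from the class record when the block is
allocated, there is no room in this scheme to grow an instance later,
which is the fixed-size restriction mentioned under SOME BACKGROUND.
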
Classes are allocated at load time, along with information about the
methods, method names, argument types, selectors, instance variables,
instance variable names, and instance variable types. All this
information is available at run-time and is embedded in the OBJC
segment in a Mach-O file. Note also that there is a provision for class
variables, but the Objective-C syntax does not handle this (yet?).

HOW THE COMPILER HANDLES INSTANCE VARIABLE ACCESS

Every reference to an object's instance variables is translated to an
offset pointer lookup to self. self is a parameter that is secretly
passed into every method (as is the selector for that method); it is
used implicitly to reference an object's instance variables.

Assume the following method in the class MyObject, and
anInstanceVariable is an instance variable of MyObject:

    - setAnInstanceVariable:(int)toThisInt
    {
        anInstanceVariable = toThisInt;
        return self;
    }

The code produced by the compiler actually looks something like this
(in ANSI C syntax):

    id setAnInstanceVariable(id self, SEL _cmd, int toThisInt)
    {
        self->anInstanceVariable = toThisInt;
        return self;
    }

So methods are converted into ordinary functions. Of course the real
code is output in object code on the NeXT, and the function name is
bound to a rather strange name, but the basic idea is the same.

HOW THE COMPILER HANDLES MESSAGE SENDS

Message sends, on the other hand, are all vectored through one of two
message dispatchers: objc_msgSend() and objc_msgSendSuper() (or
equivalent). These dispatchers are given the value of the receiver of
a message, and the message selector, perform a lookup to find the
function (the converted method, as in the above example) associated
with the receiver and selector, and jump to that function.

In this process, the dispatcher examines the "isa" link of the
receiver. If it has an invalid value, the dispatcher gives a run-time
error.
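
The dispatch step can be sketched in plain C, under loudly stated
assumptions. The names msg_send, class_info and method_entry, the table
layout, the use of a NULL isa to stand for an "invalid" link, and the
use of C strings compared with strcmp() in place of real SEL values are
all hypothetical. The real dispatchers are certainly more elaborate;
this only shows the shape of the lookup: check isa, find the function
for the selector, call it.

```c
#include <stddef.h>
#include <string.h>

struct object;  /* forward declaration for the function pointer type */

/* Hypothetical type for a converted method: self and the selector are
   passed as hidden leading arguments. */
typedef struct object *(*IMP_t)(struct object *self, const char *sel,
                                int arg);

struct method_entry {
    const char *sel;
    IMP_t imp;
};

struct class_info {
    struct class_info *super_class;
    const struct method_entry *methods;
    size_t method_count;
};

struct object {
    struct class_info *isa;
    int anInstanceVariable;
};

/* The converted method from the example above, as an ordinary function. */
static struct object *setAnInstanceVariable(struct object *self,
                                            const char *sel, int toThisInt)
{
    (void)sel;
    self->anInstanceVariable = toThisInt;
    return self;
}

static const struct method_entry my_methods[] = {
    { "setAnInstanceVariable:", setAnInstanceVariable },
};

static struct class_info MyObjectClass = { NULL, my_methods, 1 };

/* Hypothetical dispatcher. It returns NULL in the cases where a real
   run-time would report an error. */
static struct object *msg_send(struct object *receiver, const char *sel,
                               int arg)
{
    struct class_info *cls;
    size_t i;

    if (receiver == NULL || receiver->isa == NULL)
        return NULL;    /* invalid isa link: refuse to dispatch */

    /* Search the receiver's class, then each superclass in turn. */
    for (cls = receiver->isa; cls != NULL; cls = cls->super_class)
        for (i = 0; i < cls->method_count; i++)
            if (strcmp(cls->methods[i].sel, sel) == 0)
                return cls->methods[i].imp(receiver, sel, arg);

    return NULL;        /* no method found for this selector */
}
```

In this sketch only msg_send() ever looks at isa. The body of
setAnInstanceVariable() goes straight to the field through the self
pointer and never consults it.
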
WHEN YOU FREE AN OBJECT USING [anObject free], THE isa LINK IS GIVEN AN
INVALID VALUE, so the error is generated. This is to assist in trapping
the occasional bug where you send a message to a freed object.

Since access to instance variables is performed through a pointer
offset, this mechanism is not sensitive to the value of the isa link.
As this is a very fast means of accessing the instance data, the
overhead of adding a check of the isa link to every such access would
be significant (since method lookup is a rather more complex process,
the check is relatively less expensive there).

GETTING BACK TO THE QUESTION

So the example given by NeXT is correct, though perhaps ambiguous.
[self free] is the correct way to release the memory associated with an
object. [self free] is semantically equivalent to:

    char * aPtr;
    ...
    aPtr = malloc(256);
    ...
    free(aPtr);
    /* aPtr is now freed */

In the above example, you could access aPtr after freeing it, but
that's not the "right" thing to do. You can still exist in the scope of
the method that called [self free], because freeing an object does not
do anything to the method code; it does, however, invalidate certain
variables (again, just like the malloc/free example above). This is why
the safest way to write a free method in a class is as follows:

    - free
    {
        free(aPtr);
        /* free all the other junk */
        return [super free];
    }

WHAT IS WRONG WITH THE SUICIDE EXAMPLE?

You should not do anything with an object after it has been freed.
Just like a dynamically allocated pointer.

Should an object that does not exist anymore be allowed to continue
executing a method if it does not access its variables nor sends a
message to itself?

Yes. Its storage has been freed, but the methods are independent, and
might be thought of as owned by the class, which still exists.
The definition of an instance of a class is essentially a copy of the
data formed from the template defined in the class; the methods are
shared by all the instances, and thus are independent from the
existence of the instances.

I would prefer if I would get a run-time error by accessing the
variables.

This is a problem stemming from the semantics of dynamically allocated
memory without garbage collection.

Is it worth contacting NeXT?

I don't think so. Perhaps the manual could be made clearer, but the
semantics of Objective-C are correct for its model. Perhaps the thing
to wish/ask for is garbage collection, if memory management is a
headache. In C, dynamically allocated objects (instances, pointers or
otherwise) are the programmer's responsibility to track.

Note that you can do some interesting things with the new and free
semantics, such as implementing an object cache to minimize heap
fragmentation from repeated allocations of object instances and/or the
instance variables that are dynamically allocated pointers:

    #define FOP_SIZE (32)
    static id free_object_pool[FOP_SIZE];   /* freed instances, kept for reuse */
    static int num_in_fop = 0;

    + new
    {
        if (num_in_fop > 0) {
            /* reuse a pooled instance; its aPtr buffer is still allocated */
            num_in_fop--;
            self = free_object_pool[num_in_fop];
            anInt = 100;
        } else {
            self = [super new];
            anInt = 100;
            aPtr = malloc(10000);
        }
        return self;
    }

    - free
    {
        if (num_in_fop < FOP_SIZE) {
            /* pool has room: keep the instance instead of freeing it */
            free_object_pool[num_in_fop] = self;
            num_in_fop++;
            return nil;
        } else {
            free(aPtr);
            return [super free];
        }
    }

Of course in this example you may be able to continue to send messages
to instances that have been freed, but you should never do that anyhow.

There are a host of other optimizations that can be done with
Objective-C, including determining the static address of a method to
avoid message passing overhead in inner loops, etc.

Further detail about this and other Objective-C matters might be better
addressed via email to conserve news bandwidth.

-- 
Edward Jung                             The Deep Thought Group, L.P.
BIX: ejung                              3400 Swede Hill Road
NeXT or UNIX mail                       Clinton, WA. 98236
UUCP: uunet!dtgcube!ed                  Internet: ed@dtg.com