Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!cmcl2!yale!bunker!ppi!cox
From: cox@ppi.UUCP (Brad Cox)
Newsgroups: comp.lang.misc,comp.lang.smalltalk,comp.lang.c++
Subject: Re: Software ICs (long! was Re: C++ vs Objective-C)
Message-ID: <1662@ppi.UUCP>
Date: Tue, 27-Oct-87 16:13:50 EST
Article-I.D.: ppi.1662
Posted: Tue Oct 27 16:13:50 1987
Date-Received: Mon, 9-Nov-87 04:47:51 EST
References: <3405@ece-csc.UUCP> <638@its63b.ed.ac.uk> <1811@watcgl.waterloo.edu> <3179@ames.arpa>
Organization: Productivity Products Int'l, Sandy Hook, CT
Lines: 231
Summary: Dynamic binding makes Software-ICs different from libraries.
Xref: mnetor comp.lang.misc:875 comp.lang.smalltalk:411 comp.lang.c++:563

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In article <3179@ames.arpa>, fouts@orville.nas.nasa.gov (Marty Fouts) writes:
> In article <1661@ppi.UUCP> cox@ppi.UUCP (Brad Cox) writes:
>
> >. . . But the improvement has always turned out to
> >be arithmetic in impact.  The geometric improvements needed to bring our
> >productivity in line with that of hardware engineers will not result from
> >better programming languages, but by focusing our attention outside the
> >language.  For example, by learning to program by producing and reusing
> >components from large libraries of pre-tested Software-ICs.  Yes, these
> >libraries are hard to build, and expensive.  But each well-tested,
> >well-documented library component provides a geometrical improvement to the
> >productivity of each of its users, and the improvement is open-ended, unlike
> >the productivity enhancement of features that are hardwired into a new
> >programming language.
>
> I have three problems with this comment.  The one that bothers me the
> most is the standard marketing ploy of renaming something from its
> original lackluster name (library routine) to something that sounds
> exciting, like "Software-IC".
> Libraries have been around at least as
> long as programming languages; and have contributed their share to the
> productivity improvement, but they aren't the glorious path to
> geometric improvement, or we would have been seeing geometric
> improvements over the decades.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'd like to thank Marty for broaching a subject that is precisely at
the heart of where Objective-C departs from the more traditional
languages.  The distinction seems to be difficult for many people to
grasp at first, so we coined terms like `Software-IC' and `ICpak'
(library of Software-ICs) to highlight for them that dynamically bound
encapsulation and inheritance introduce something that is different,
FUNDAMENTALLY different, from programming as done via traditional
libraries.  You'd agree that the traditional library concept does not
support dynamic binding or inheritance, from which it follows that
Software-IC is not simply a fancy name for library.

It may be less obvious why these differences MATTER, e.g. why dynamic
binding and (to a lesser extent) inheritance relieve some of the
technical obstacles that have prevented the library concept from
introducing geometric improvements.  I went into all this at great
length in my book (Object-oriented Programming; An Evolutionary
Approach; Addison Wesley 1986), but I'll summarize the argument
briefly here.

Yes, libraries HAVE been around for a long time, and they have
certainly not been the glorious path to geometric improvement.  For
example, many people have worked very hard to bring about
corporate-wide `reusability' by collecting large libraries of
functions and macros, cataloging them in large databases, and
publishing them for reuse.  The projects have generally failed, or at
best brought about only arithmetic improvement.

But why?  Is it because the groups responsible for distributing the
software, or their clients, or their managers, were lazy or stupid?  No.
Was the software undocumented, or unreliable, or too slow?  No, not
usually.  Was it because the software was not published via a
fully-integrated programming environment with a glitzy iconic browser?
No.

They failed primarily because of an ordinary technical problem (and
perhaps secondarily because of the usual religio-political issues that
crop up around code reusability).  The problem is simply that code
stored in a conventional library is tightly coupled to the supplier's
problem domain, so its consumers could not easily apply it in their
own unique environments.  In other words, the code was statically
bound.

Static binding turns out in this context to be a vice, not the
universal virtue that compiler developers seem to believe.  Static
binding produces binary files that are tightly coupled to whatever was
known when the code was compiled by the code's supplier, thus taking
away the consumer's ability to install the code in his radically
different execution environment.  Late binding relieves this
restriction by loosening the coupling between a supplier's reusable
code and the environments his consumers will apply it in.

To state my position as concretely as possible, static binding is a
tool, not a panacea (Ada devotees, take note!).  Dynamic binding is
also a tool, not a panacea (Smalltalk-80 devotees, take note!).  Both
tools are specialized for particular kinds of problem and
inappropriate for others.

For example, consider the different kinds of problems in building an
automobile.  In designing the AutomobileEngine it is appropriate and
useful to state as early as design time that each EngineCylinder can
contain only instances of class Piston, and to have this decision
strictly enforced (strict type-checking) during the implementation
phase.  Static binding is the right tool for this job.  By contrast,
in designing the AutomobileTrunk, it is not desirable to make these
kinds of decisions any earlier than when the automobile is put into
service.
Dynamic binding is a far better tool for this job, because strict
typechecking is entirely the wrong idea for loosely coupled
collections like the trunk.

Now extend this example by imagining the tools needed by a distributor
of replacement automobile parts, by analogy with our Software-IC
concept.  If static binding were the only tool available for defining
replacement parts like piston, you'd have a clash between the static
binding of the piston to the cylinder and the more amorphous needs of
the distribution channel involved in putting a replacement piston into
service (e.g. how to also express CrateOfPistons, or worse,
PartsInventory?).  The piston designer could never anticipate all of
the environments into which his consumers might want or need to put
his piston, and would value a late binding tool that would move these
decisions into the hands of his consumers.

So much for the contribution of dynamic binding.  How about dynamic
binding as provided by C++ as opposed to Objective-C?  In one sense,
the C++ virtual function machinery is dynamic in that the
implementation is certainly chosen at run-time based on the recipient.
But in an important sense, the binding is static because the dispatch
is based on compile-time knowledge of the receiver's type, at least to
the extent of knowing a common supertype of all possible receivers.

By contrast, Objective-C acquired from Smalltalk a different style of
binding that is dynamic in both of these senses; binding is done
entirely at run-time.  In the current implementation, this involves
hashing the receiver's class (which is stored in the receiver at
run-time) with the message selector and using that as an index into a
cache of recently-used implementation addresses (function pointers).
When the cache doesn't contain the desired implementation, a slower
linear search mechanism kicks in to update the cache by consulting
dispatch tables stored in each class.
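
The lookup just described can be sketched in a few lines.  This is
only a toy model in Python, not the actual Objective-C runtime: the
names Class, Instance, send, slow_lookup, CACHE_SIZE and stats are all
invented for illustration, and the one-entry-per-slot cache policy is
an assumption.

```python
# Toy model of cached message dispatch. All names here are hypothetical;
# this is a sketch of the idea, not the real runtime's data structures.

class Class:
    """A class record: name, optional superclass, and a dispatch table."""
    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass
        self.dispatch_table = []  # (selector, implementation) pairs

    def add_method(self, selector, implementation):
        self.dispatch_table.append((selector, implementation))


class Instance:
    """An object; it carries its class with it at run time."""
    def __init__(self, cls):
        self.isa = cls


CACHE_SIZE = 64                    # assumed size, chosen arbitrarily
cache = [None] * CACHE_SIZE        # slots hold (class, selector, implementation)
stats = {"slow_lookups": 0}        # counts trips through the linear search


def slow_lookup(cls, selector):
    """Linear search of each dispatch table, walking up the superclass chain."""
    stats["slow_lookups"] += 1
    while cls is not None:
        for sel, imp in cls.dispatch_table:
            if sel == selector:
                return imp
        cls = cls.superclass
    raise AttributeError("message not recognized: " + selector)


def send(receiver, selector, *args):
    """Hash (class, selector) into the cache; fall back to slow_lookup on a miss."""
    cls = receiver.isa
    slot = hash((id(cls), selector)) % CACHE_SIZE
    entry = cache[slot]
    if entry is None or entry[0] is not cls or entry[1] != selector:
        entry = (cls, selector, slow_lookup(cls, selector))
        cache[slot] = entry        # the miss updates the cache
    return entry[2](receiver, *args)


# Usage: Piston inherits "describe" from Object; nothing about the
# receiver's type is known to send() until the message arrives.
Object = Class("Object")
Object.add_method("describe", lambda self: "instance of " + self.isa.name)
Piston = Class("Piston", Object)
print(send(Instance(Piston), "describe"))
```

Note that send() needs no common supertype declared for its receivers;
any object whose class chain supplies the selector will do.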

Please notice that the cache is only one of many ways to implement the
lookup mechanism.  A fully-indexed implementation that never invokes
the linear lookup is quite possible (as in C++), but was not used
because it imposes unbounded space overheads that discourage
aggressive use of inheritance.

A recent article described an example that is useful for pointing out
the advantages of totally dynamic binding.  Building a HashTable class
requires that all HashTable members provide the hash and isEqual:
methods that the HashTable needs.  But how can the HashTable supplier
(who doesn't control the members' common superclass) arrange this?

Since Objective-C style binding does not require the members to have a
common superclass, one solution would be to require each newly-written
member class to provide the two needed methods.  Newly-developed
classes will then work correctly as members even though they have no
common superclass, but older classes will not.

But a better solution is possible that automatically fixes the older
classes too.  The HashTable supplier can provide an additional class,
HashTableMember, that defines default semantics for only the hash and
isEqual: methods (for example; unless overridden, two objects are
equal if and only if they are exactly the same object).  He can direct
his consumers to incorporate this class into every application that
uses HashTable.  At startup time, the class will send itself a special
message that causes HashTableMember's dispatch table to be inserted at
the front of Object's dispatch table (taking care to update the cache
as well).  Presto, ALL objects immediately recognize the two new
methods.  Similar classes could also be provided to supply special
hash and isEqual: methods for those already-released classes that
should override the default implementation with specialized ones.

We actually solved this particular case by simply implementing hash
and isEqual: in the Object class.
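
To make the HashTableMember idea concrete, here is a rough sketch,
again as a toy Python model rather than real Objective-C.  The names
Class, Instance, send and donate_methods are hypothetical, the default
hash is assumed to be the object's identity, and the sketch has no
lookup cache, so the cache update mentioned above does not appear.

```python
# Toy model of a "method donor". All names are hypothetical; this is a
# sketch under stated assumptions, not the shipped mechanism.

class Class:
    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass
        self.dispatch_table = []  # (selector, implementation), searched front to back


class Instance:
    def __init__(self, cls):
        self.isa = cls


def send(receiver, selector, *args):
    """Uncached run-time lookup up the superclass chain."""
    cls = receiver.isa
    while cls is not None:
        for sel, imp in cls.dispatch_table:
            if sel == selector:
                return imp(receiver, *args)
        cls = cls.superclass
    raise AttributeError(receiver.isa.name + " does not recognize " + selector)


def donate_methods(donor, recipient):
    """Insert the donor's dispatch table at the front of the recipient's."""
    recipient.dispatch_table[:0] = donor.dispatch_table


Object = Class("Object")
Piston = Class("Piston", Object)  # an "older" class: no hash, no isEqual:

# The supplier's donor class: defaults for exactly the two needed methods.
HashTableMember = Class("HashTableMember")
HashTableMember.dispatch_table = [
    ("hash", lambda self: id(self)),                  # assumed default: identity
    ("isEqual:", lambda self, other: self is other),  # equal iff same object
]


class HashTable:
    """Relies only on members answering hash and isEqual:."""
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, member):
        return self.buckets[send(member, "hash") % len(self.buckets)]

    def add(self, member):
        if not self.contains(member):
            self._bucket(member).append(member)

    def contains(self, member):
        return any(send(member, "isEqual:", m) for m in self._bucket(member))
```

Until donate_methods(HashTableMember, Object) has been called, adding
a Piston to a HashTable fails with an unrecognized-message error;
after the call, every class that descends from Object can be a member
without being edited or recompiled.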

The method donor mechanism (also known as poseAs:) is generally used
for other problems, such as for repairing and/or extending code that
has already been released to the field.  For example, the most recent
case was to extend our (already released) Object class with a
mechanism for storing lists of those objects that depend on other
objects so that iconic user interfaces can be automatically updated
whenever the objects they're interfacing have changed.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The second problem I have is the analogy which isn't stated here, but
> is frequently drawn between "Software-IC" and hardware IC.  If you
> follow the component industry at all, you know that the age of TTL
> 7000 series ICs has all but ended and almost all serious design now is
> being done with semicustom or custom components.  You also know that
> hardware designers have long bemoaned their lack of productivity,
> although they refer to it as "design turn-around time", and they
> haven't been seeing geometric improvements either.  Finally, you will
> realize that the use of off the shelf components has alternated with
> the use of special purpose design, going back at least as far as the
> early sixties when the first published circuit books came out.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you're saying that late binding is not a panacea, I agree
wholeheartedly.  It is a tool; something to be picked up or laid aside
according to the job at hand.  I fault Smalltalk-80 for not providing
any tools for doing early binding, and I fault Ada for not providing
any tools for doing late binding.  Both C++ and Objective-C avoid this
trap, although C++ provides stronger tools than C (and thus
Objective-C) for doing early binding and Objective-C provides stronger
tools than C++ for doing late binding.

As you pointed out, hardware designers use tools, not panaceas, and
feel free to choose the most effective tools for any job.  At times,
they choose off-the-shelf components, and at other times they choose
to build custom logic.

Nonetheless, and in spite of the bemoaning on the part of hardware
designers, it does seem that the geometric improvement has been
realized, if not by each individual hardware designer, then certainly
by the companies that employ them.  When I was in graduate school
twenty years ago, the EE department built its own computer (the Maniac
II) from discrete components.  Then computers-on-a-chip came out and
for a while it became fashionable for departments, and soon
thereafter, individuals, to build their own computers.

Moore's law predicts a yearly doubling of the number of components per
chip.  This works out (2^20) to a million-fold improvement over these
twenty years.  I sense that change of a similar magnitude has been
demonstrated in computing power delivered to the consumer, and
possibly per man-hour consumed in delivering it.  I'd be grateful for
any hard data to support or contradict this conjecture.

I am not aware of anything like a million-fold improvement in
software.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The third problem I have is with the loose claims of 'geometric' as
> apposed to 'linear' improvements in productivity.  I've been reading
> about programmer productivity for a long time, since Marvin Minsky
> first claimed that advances in programming languages would do away
> with the need for programmers within a decade (about thirty years ago)
> through James Martin's claims to the same effect ten years ago until
> now.  Nobody even knows what programmer productivity is, yet alone at
> what kind of rates it has been improving at over the last three
> decades.
> Further, various kinds of programming have received
> differing amounts of attention, and ease of accomplishing tasks in
> some fields has improved greatly compared to others; for instance,
> using 4GL query languages like SQL, it is now possible to
> interactively ask for data in a few seconds which used to require
> hours of programming plus days of backlog waiting for a programer to
> become available to accomplish.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You've alluded to several tools that have indeed contributed geometric
improvements in specialized areas.  I'd extend this list with my own
personal favorites, the now well-established pipes/filters concept
from Unix, program generators like yacc/lex/4GL, and the
less-well-known fully-dynamic style of binding employed in typeless
languages like Smalltalk-80 and hybrid languages like Objective-C.  I
promote the latter tools more extensively than the former, not because
they're better, but because they're less well-known.

Regarding the word productivity, if you can offer a precise
definition, we'd all be glad to use it.  But that won't change the
urgency of people's need to change it, or to discuss it, any more than
other imprecisely defined terms like `the trade deficit', `the stock
market', or `Company X's image in the marketplace'.

> All in all, many things are important in the improvements that have
> been achieved and none of them alone are going to give the ultimate
> performance improvement.  Careful implementation of languages for
> maximum expressiveness has improved productivity, as has understanding
> the way programs should be laid out to aid understanding; but so have
> faster machines, interactive operating systems, and decent debuggers.
> It all needs to be worked on, and none of it is going to give us magic
> productivity enhancements.

Who said magic?  I said tools, not panaceas, and geometric
improvements, not magic.
I do believe that the proper use of all available tools CAN move programmer productivity from an arithmetic to a geometric growth curve.