Path: utzoo!attcan!uunet!lll-winken!csd4.milw.wisc.edu!uxc!uxc.cso.uiuc.edu!m.cs.uiuc.edu!p.cs.uiuc.edu!johnson
From: johnson@p.cs.uiuc.edu
Newsgroups: comp.sw.components
Subject: Re: Inheritance vs. component efficienc
Message-ID: <130200002@p.cs.uiuc.edu>
Date: 7 Jun 89 13:13:00 GMT
References: <5682@hubcap.clemson.edu>
Lines: 135
Nf-ID: #R:hubcap.clemson.edu:5682:p.cs.uiuc.edu:130200002:000:7805
Nf-From: p.cs.uiuc.edu!johnson Jun 7 08:13:00 1989

I disagree with just about everything in Bill Wolf's article on inheritance. I have done a lot of object-oriented programming, including an optimizing compiler for Smalltalk in Smalltalk and a framework for operating systems in C++.

First, inheritance is NOT primarily a way to reuse specifications. It is primarily a way to reuse code. The real question is, "what kind of code is being reused?" It turns out that the reused code need not describe only an implementation; it can also be a textual description (i.e., a code) that describes a design.

There are at least three reasonable ways of using inheritance. One is inheritance to specialize a component. This is the kind that is talked about the most.

The second is similar to taking an existing program and editing it into shape. In other words, you subclass the original class and redefine methods until it does what you want. This style of programming results in ugly class hierarchies and can lead to the inefficiencies that Bill Wolf complained about. However, it is very valuable during rapid-prototyping and helps lead to families of interchangeable components.

The third way of using inheritance is to inherit from an abstract class, also called a deferred class. This is the most elegant use of inheritance, and it essentially uses the superclass as a template to generate new classes.
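The third style can be sketched in present-day C++ roughly as follows. This is only an illustration under stated assumptions: the names Collection, VectorCollection, SetCollection and checkExamples are hypothetical and are not taken from Choices or from the Smalltalk class library. The abstract class holds no data and defers the representation to its subclasses, while the code written once in the abstract class acts as the template.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical abstract ("deferred") class. It declares no data members,
// so subclasses inherit no state that they do not need.
class Collection {
public:
    virtual ~Collection() = default;

    // Deferred operations: each subclass chooses its own representation.
    virtual void add(int value) = 0;
    virtual std::size_t size() const = 0;
    virtual int at(std::size_t index) const = 0;

    // Template code, written once in terms of the deferred operations.
    bool includes(int value) const {
        for (std::size_t i = 0; i < size(); ++i) {
            if (at(i) == value) return true;
        }
        return false;
    }
    bool isEmpty() const { return size() == 0; }
};

// Concrete subclass: fills in the deferred operations with a vector.
class VectorCollection : public Collection {
public:
    void add(int value) override { items_.push_back(value); }
    std::size_t size() const override { return items_.size(); }
    int at(std::size_t index) const override { return items_.at(index); }

private:
    std::vector<int> items_;
};

// Specialization (the first style): same representation, but duplicates
// are rejected by reusing the inherited includes().
class SetCollection : public VectorCollection {
public:
    void add(int value) override {
        if (!includes(value)) VectorCollection::add(value);
    }
};

// Small self-check that exercises both subclasses through the abstract
// interface.
inline void checkExamples() {
    VectorCollection bag;
    SetCollection set;
    Collection* all[] = {&bag, &set};
    for (Collection* c : all) {
        assert(c->isEmpty());
        c->add(3);
        c->add(3);
        assert(c->includes(3));
    }
    assert(bag.size() == 2);  // the vector keeps the duplicate
    assert(set.size() == 1);  // the set drops it
}
```

Neither subclass pays for state declared in Collection, because Collection declares none; the only per-object overhead inherited from it is whatever the compiler needs for virtual dispatch.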
> The first problem is that regardless of the fact that certain operations may have been overridden (redefined), the implementation of the other operations has not been modified to account for this fact. Thus, a large percentage of the effort expended by certain inherited operations may well be devoted to the maintenance of aspects of the state of the base component which are crucial to the correct functioning of the operations which have now been overridden. This introduces major inefficiency into each invocation of the inherited operations, and constitutes a severe performance penalty. A cost which could have been paid once and for all at development time (by designing the component efficiently) is now being paid forevermore, at run time.
>
> The second problem is that the above-mentioned useless state information also consumes space, penalizing us in both dimensions.

Abstract classes leave important parts of the implementation undefined, so subclasses rarely inherit something that they don't need. In particular, abstract classes usually don't define much state information, so there is almost NEVER a space penalty.

> The third problem is that the inherited operations were implemented with regard only to the operations provided by the base component. However, it is frequently the case that the addition of even a single operation has a dramatic impact on the nature of the best solution to the implementation problem; since the implementation of the inherited operations and the implementation of the non-inherited operations are mutually independent, the efficiencies which would have been realized had the designer implemented all the operations together are sacrificed, resulting again in time and space penalties.

All I can say is that this doesn't happen much in practice. This can certainly happen during rapid-prototyping, but it is easy to fix and is unlikely to happen in a mature class library with a lot of abstract classes.
> Since the basic rationale behind software components is the exploitation of economies of scale, it makes economic sense to seek an extremely good (or perhaps optimal) implementation with little regard to development costs; these costs are spread over thousands or millions of applications and are generally trivially recoverable. Inheritance is a mechanism which seeks to minimize development costs at the probable expense of utilization costs, and is therefore something which is of little value to the developer of software components, who seeks to sell his product into a market whose economic characteristics are essentially those of a commodity market.

This statement completely flies in the face of reality. The problem with software today is not that it is too slow, but that it is too expensive to construct. 95% of the time, the people who are worrying about efficiency are worrying about the wrong thing. We don't need to make our software faster; we need to make it more reliable, easier to use, and easier to change. In fact, most software that is written is run on only a few machines. Few programmers are writing software that will run on thousands of machines.

Object-oriented programming in general, and inheritance in particular, helps make software more reliable, easier to use, and easier to change. Moreover, it does not have to make programs any slower. In fact, object-oriented programs can be faster than conventional programs even though they are built out of components that you might criticize as being too general purpose.

The first example is the Choices operating system framework, which is written in C++. C++ implementations are quite efficient. We are just now getting to the point where we can build a Unix-compatible version, so we have not done complete system performance comparisons. However, we have been able to benchmark various pieces, such as the kernel call mechanism or the file system. These pieces are all faster than Unix running on the same hardware.
Further, we expect to use this set of components to build real-time operating systems as well, with performance equal to that of other real-time operating systems. Programmers who do not want to pay for a feature like virtual memory can easily replace the memory-management module with one that has much less overhead. On the other hand, we have been inventing all sorts of exotic virtual memory features, some of which are fairly expensive. Only those people who want the benefit have to pay the cost. Programmers can build customized versions of components by inheriting from an abstract class, which provides a template for their customized class and which ensures that the new class works with the existing classes.

The second example comes from my optimizing compiler, which is written in Smalltalk. Smalltalk is quite a bit slower than C, which is why we are building an optimizing compiler for it. However, some parts of the compiler are faster than equivalent programs written in C. In particular, our table-driven code generator converts our intermediate representation to machine language using hash tables, which are built into Smalltalk. The original C program used YACC to compare a statement in the intermediate language with the machine description. In both cases we were just reusing what was available, but the Smalltalk program is a lot faster. This is due mostly to the high quality of the Smalltalk class library. Another advantage of Smalltalk is the quality of the programming environment, which makes it easy to find the performance bottlenecks in the system and fix them.

Getting back to the issue of overemphasizing performance, we all know that most programs spend 80% of their time in 20% of the program. However, it takes just as long to write each line of the 80% as each line of the 20%. Thus, it makes sense to use techniques that produce inefficient code on the 80% and use the time you save figuring out how to make the 20% go faster.
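The 80/20 argument can be made concrete with a little arithmetic (Amdahl's law). The sketch below is only an illustration: the function names overallSpeedup and checkSpeedups are hypothetical, and the factor of two is an assumed figure chosen for the example, not a measurement. It compares doubling the speed of the hot 20% of the code (80% of the running time) with doubling the speed of everything else.

```cpp
#include <cassert>

// Overall speedup when the part of a program that accounts for `fraction`
// of the running time is made `factor` times faster (Amdahl's law).
double overallSpeedup(double fraction, double factor) {
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}

// Self-check using the 80/20 split from the text and an assumed factor of 2.
inline void checkSpeedups() {
    // Hot code: 80% of the time, made twice as fast -> 1 / (0.2 + 0.4).
    double hot = overallSpeedup(0.8, 2.0);
    // Cold code: 20% of the time, made twice as fast -> 1 / (0.8 + 0.1).
    double cold = overallSpeedup(0.2, 2.0);
    assert(hot > 1.66 && hot < 1.67);
    assert(cold > 1.11 && cold < 1.12);
    assert(hot > cold);
}
```

Under these assumptions, doubling the speed of the hot code makes the whole program about 1.67 times faster, while the same effort spent on the rest of the code yields only about 1.11 times.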
There is a lot I could say about object-oriented design and the importance of inheritance, but I don't have time. However, I will refer you to a paper I wrote with Brian Foote in V1 N2 of the Journal of Object-Oriented Programming called "Designing Reusable Classes".

To summarize, inheritance is very important, and should not be casually discarded.

Ralph Johnson - University of Illinois at Urbana-Champaign