Path: utzoo!utgpu!watmath!clyde!att!ulysses!andante!alice!bs
From: bs@alice.UUCP (Bjarne Stroustrup)
Newsgroups: comp.lang.c++
Subject: Re: Current O-O Languages as Software Engineering Tools
Message-ID: <8414@alice.UUCP>
Date: 11 Nov 88 16:54:37 GMT
References: <5155@thorin.cs.unc.edu>
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 278

coggins@retina.cs.unc.edu (Dr. James Coggins) presents 3 propositions
and a conjecture for comments. As people would expect (after reading
the original note) I'll challenge the conjecture. For good measure
I'll also challenge ALL of the propositions!

> In order to give the religious debate on C++ vs. Objective-C a more
> solid (well, a little less vaporous) foundation, consider the
> following propositions that were proposed to me recently:

> 1. Smalltalk-like languages (including Obj-C) do a better job of
> separating specification and implementation than Simula-like languages
> (including C++).

Not at all! They do a different and often inferior job of this
separation.

Let us first consider the key difference between the two styles of
language and, to simplify matters, let us compare Smalltalk and C++ as
representatives of their respective schools of thought. Each carries
scars from its history, each lacks features provided by newer and
more ``researchy'' members of its family, and each has proven itself
successful in its own core application areas. Furthermore, I like both
better than most of the other languages and systems in the OOP arena.

Naturally, I have to make some simplifications in this discussion. The
topic is more suitable for a couple of major dissertations than a
comp.lang note. Try considering the major points rather than simply
looking at each sentence to find the 5 qualifications necessary for
making it strictly true. I probably know them too, but I'm not writing
a Ph.D. -- Actually, it would be nice if someone would do a really
thorough discussion of these issues.

In my opinion the key issue/difference is the type system. Most of my
comments will relate to this:

	ST relies exclusively on run-time type checking
	C++ relies extensively on static type checking

So, what does ``separating specification and implementation'' mean?
For now, let us consider this by comparing the answers to the
following questions:

	I changed the implementation of something,
		how do I know I didn't change the specification?
		how are clients affected?
		and what do I have to do to get the program running again?

Ideally, with a perfect separation of specification and implementation
the answers are ``it's obvious,'' ``not at all,'' and ``nothing''.

C++ class declarations provide the user with the ability to specify
strongly typed interfaces. If you change the interface it is obvious -
and clients depending on the interface will fail to compile.
Furthermore you can specify separate interfaces to different kinds of
clients. In particular, you can provide one interface to the general
public, another to implementors of derived classes, and a third to
yourself (see for example Alan Snyder's paper for OOPSLA'86 or Barbara
Liskov's formulation of the same ideas in the OOPSLA'87 keynote
address). Smalltalk has problems in this area.

There are of course many subtle dependencies that are not amenable to
verification by static type checking, such as ``f() must be called
before g()''. This is a problem in every language. C++'s constructors
and destructors and initialization rules provide a few features in
this direction, and dynamic checking of all sorts of properties can
always be done. I do, however, strongly prefer declarative properties
that can be checked before execution to properties that are
essentially dynamic in nature and can only be checked when running.
The snag with run-time checking is that it is notoriously difficult to
ensure that errors detected at run-time are handled in a reasonable
manner.
Entering even the most advanced debugger after detecting a vector
range violation or a ``method not found'' is useless if there is no
programmer present.

So C++ can detect a large class of problems at compile time, but what
can it do once they are detected? This is the notorious and I think
largely misunderstood ``header file problem''. Smalltalk doesn't have
it: once you have decided what to change, you simply do it and keep
running. The effect of a change ``instantly'' affects the whole
program.

Once you make a change in a C++ program you need to recompile. If you
simply make a change to a member function, in principle you need only
recompile and link that function (and there are systems being built
that do exactly and only that). The problem comes when you make a
change to something that is part of an interface. Then you need to
recompile the clients too. That is the essential price you pay for
having the static checking. I contend that in all but the most trivial
programs it is worth it. In this, I am backed by evidence of actual
compile, debug, and integration times of medium sized (<500K lines of
code) projects.

As I have often said before, C++ needs a tool that determines the
minimal recompilation needed after a change, and preferably an
incremental compiler and linker to minimize the work of doing this
recompilation. A UNIX `make' that recompiles the world because you
changed a comma in a comment in a heavily included header file simply
isn't a suitable component in a C++ programming environment.

There is one C++ design decision that affects the amount of
recompilation that people usually fail to appreciate (I think largely
because of the lack of tools). The private part of a C++ class is part
of the declaration of the class itself and a change to it may force
the recompilation of clients. For example:

	class X {
		int a;
	public:
		f();
	};

	main ()
	{
		X x;
		x.f();
	}

If you change X to

	class X {
		int a;
		int b;
	public:
		f();
	};

main() will have to be re-compiled.
The reason is that by declaring an automatic (on the stack) variable
the client main() requires knowledge of the size of X (the compiler
needs that information to generate code for the function call and
return since that requires knowledge of the size of the stack frame).

Clearly that could be avoided (even in a C++ implementation) if we
were willing to use a smarter link environment or code generation
strategy (so that the layout of stack frames was handled dynamically),
but that would seriously hurt C++'s portability and its ability to fit
into a traditional environment.

Alternatively we could use the trick of never allocating class objects
on the stack. This is the strategy of Simula and most of its
descendants. The snag is that if we did that we would incur an
overhead of two memory management operations per function call and the
cost of indirecting every access to a class object. Measurements on
Simula indicate that this cost is at least a factor of 2 in run-time.
Most of Simula's descendants pay an even higher price. Accepting this
overhead would imply giving up large application areas to C,
assembler, and Fortran. C++ was specifically designed to preserve
efficiency in this area. The apparent cost is recompilation time.

However, if you don't use these features, that is, if you don't
declare automatic or static variables of a type, if you don't have
inline functions that depend directly on the contents of the private
part of a type, and if you don't take the sizeof a type, THEN the
users of a type are insulated from changes in the implementation of
the type EXACTLY as in languages that always use indirection in the
access to class objects. For example, had main() been written like
this:

	main()
	{
		X* p = new X;
		p->f();
	}

it would have been unaffected by the change to X's representation -
and it would of course have incurred the run-time cost.

My contention is that often run-time cost matters and often it
doesn't. C++ serves you in both cases.

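The insulation described above can be sketched in present-day C++. This
is an illustration under stated assumptions, not code from the post:
the names make_x(), destroy_x(), and client(), and the free function
f() standing in for p->f(), are hypothetical. In current compilers a
`new X' expression written in client code needs the complete type, so
the sketch moves allocation behind a factory function that lives with
the implementation. The client is compiled while X is still an
incomplete type, so it cannot depend on X's size or layout.

```cpp
#include <cassert>

// --- Interface, as a client would see it (for example in a header) ---
// X is only declared here, not defined: its size and members are unknown.
class X;
X*   make_x();          // hypothetical factory; allocation is done by the implementation
int  f(X* p);           // stands in for the post's p->f()
void destroy_x(X* p);   // hypothetical cleanup function

// --- Client code: compiles without seeing X's representation ---
int client() {
    X* p = make_x();
    assert(p != nullptr);   // defensive check on the factory's result
    int result = f(p);      // every access goes through the pointer
    destroy_x(p);
    return result;
}

// --- Implementation: free to change without recompiling client() ---
class X {
    int a;
    int b;   // the member added in the second version of X
public:
    X() : a(1), b(2) {}
    int sum() const { return a + b; }
};

X*   make_x()        { return new X; }
int  f(X* p)         { return p->sum(); }
void destroy_x(X* p) { delete p; }
```

Adding or removing private members of X changes only the implementation
part. The price is the one named above: a free-store allocation and an
indirection on every access.
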
However, the current C++ tools do not help you sufficiently. Curiously
enough I was building tools for finer grain dependency analysis and
tools for taking advantage of it 4 years ago. It is not particularly
hard, but the explosion of C++ use distracted me. What you see here is
not a design flaw in the C++ language but a deficiency in the
available support tools. This deficiency is finally being remedied.

> 2. Smalltalk-like languages are better tools for developing small
> programs because of their massive built-in class libraries and their
> more flexible (later, dynamic) binding which allows polymorphic types.

Again I must disagree. Of course you can throw a small Smalltalk
program together to do many things that would be painful to build from
scratch in C++. However, the massive libraries and the wonderful
program development environment of Smalltalk are not something that
C++ lacks because of some inherent defect. Rather, Smalltalk is about
10 years older than C++ and has had something like 100 times more
effort and resources lavished on its environment and libraries.

I am in particular looking forward to trying ParcPlace's Cynegy C++
program development environment. It has the potential of bringing some
of the Smalltalk expertise to bear on the fairly well understood
problems with C++'s (lack of) program development tools/environment.

If you consider UNIX and MS-DOS to be C++ toolsets - imperfectly
inherited from C - C++ fares a bit better, but C++ can and will
progress much further in the programming environment and standard
libraries areas.

When it comes to using a traditional tool such as a data base system,
a standard (grubby) operating system interface, a Fortran engineering
library, a C device driver, etc., C++ has an edge. If your small
program needs to run on a very small machine, an unusual machine, or a
mainframe, C++ has an edge. The ability to coexist and ease of porting
can be essential even for very small projects.

> 3. Simula-like languages are better tools for medium size software
> development because for these larger projects it is worth building
> specialized class hierarchies for the particular system, yielding a
> better match to conceptual structures and increased programmer and
> run-time efficiency.

Here I was naturally sorely tempted to agree, but #3 also is an
oversimplification that could confuse the issues. It really also
depends on how you define ``medium''. If medium is defined as programs
that take more than one person more than one year to build I agree,
but here of course the choice of language/system can radically affect
the perceived size of the project. If the problem is a good ``fit''
for a Smalltalk system on a suitably sized workstation you can see
spectacular benefits from using Smalltalk. However, if that fit isn't
there, using Smalltalk could turn into an exercise in horrendous
contortions.

For all sizes of projects it is essential to choose a suitable tool
for the problem. There is no perfect tool.

> Now for the leap...
> 4. The Smalltalk-like languages will be the better tools for large
> software development because item #1 above will kick in to make the
> project more manageable.

Well, I disagreed with #1 so I could rest my case here.

> Opinions?

There is absolutely no basis for this conjecture, and the absence of
really large projects successfully developed and supported in a
Smalltalk-like language is a contraindication. After 16 years of
Smalltalk variants and offshoots we should not be conjecturing on this
point. Smalltalk has a string of spectacular successes to its credit,
but as far as I'm aware large scale systems development isn't among
them. Nor was it mentioned among the things Smalltalk was supposed to
be especially good at. The Smalltalk imitations, hybrids, and
commercial offshoots have consistently been far inferior to ``the real
thing''.

For larger projects C++'s strong static type checking provides
essential benefits in the integration phase by providing mechanisms
for documenting interfaces in a way that can be mechanically checked.
For many-person projects the ability to plug exclusively dynamically
checked components together with great ease and no (static) checking
simply defers checking until after integration, where nobody is able
to comprehend the total system anyway.

This is the reason for all the incompatible plugs we have in the
hardware arena. You don't really want to be able to plug everything
together. People using electric shavers appreciate not being able to
plug them into high power outlets.

The value of a Smalltalk-like environment and superb debugging
features is greatly diminished when the system is operated by someone
who does not understand the total system, isn't allowed to change it
on the fly, and might not even be a programmer.

Also, it still is (and I expect it will remain) most unusual for a
large system to fit on a (single) personal workstation. The ST-style
programming environments still appear to be primarily aimed at serving
a single user on a single system. Often a large software system will
have to serve a large user-community on a large (typically ugly and
often diverse) hardware base.

Also, run-time efficiencies often matter when you start pressing
against the limits of the hardware (memory sizes, mainframe CPU's,
network connections, disc bandwidth). Here C++ comes into its own.

LARGE system development is typically a mess. The state of the art is
poor. There is much that can be done and I think that OOP (in its
various incarnations) has a vital role to play, but we have a long way
to go yet. I am convinced, however, that statically typed interfaces
that can provide a base for documentation and design methods allowing
relatively large numbers of people to cooperate are an essential part
of any remedy for LARGE system development.

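As a small illustration of what a mechanically checked interface looks
like, here is a sketch in present-day C++. The classes Counter and
StepCounter and the function use_counters() are hypothetical and do not
come from the post. The sketch shows the three interfaces mentioned
earlier (general public, implementors of derived classes, and the class
itself) and, in comments, the kinds of misuse a compiler would reject
before the program ever runs.

```cpp
#include <cassert>

class Counter {
public:                          // interface for the general public
    Counter() : count_(0) {}
    void increment() { adjust(1); }
    int  value() const { return count_; }
protected:                       // interface for implementors of derived classes
    void adjust(int delta) { count_ += delta; }
private:                         // representation, visible only to Counter itself
    int count_;
};

class StepCounter : public Counter {
public:
    // Allowed: a derived class may use the protected interface.
    void step(int n) { adjust(n); }
};

int use_counters() {
    StepCounter c;
    c.increment();               // count is now 1
    c.step(4);                   // count is now 5
    // c.adjust(10);  // would not compile here: adjust() is protected
    // c.count_ = 0;  // would not compile: count_ is private
    // c.reset();     // would not compile: no such member, so there is
    //                // no run-time ``method not found'' to handle
    assert(c.value() == 5);
    return c.value();
}
```

Each commented-out line is an integration error that is reported to the
programmer who wrote it, at compile time, rather than to whoever
happens to be operating the system when the call is executed.
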
Anyway, I consider the belief that there is exactly one right way of
doing things, and especially the belief that there is exactly one
right language for doing it in, an infantile disorder.

	``Anything works on small projects.''
		- James Coggins

Thanks, Jim, for posting these propositions.

	- Bjarne Stroustrup