Path: utzoo!utgpu!watmath!clyde!att!ulysses!andante!alice!bs
From: bs@alice.UUCP (Bjarne Stroustrup)
Newsgroups: comp.lang.c++
Subject: Re: Current O-O Languages as Software Engineering Tools
Message-ID: <8414@alice.UUCP>
Date: 11 Nov 88 16:54:37 GMT
References: <5155@thorin.cs.unc.edu>
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 278


coggins@retina.cs.unc.edu (Dr. James Coggins) presents 3 propositions
and a conjecture for comments. As people would expect (after reading
the original note) I'll challenge the conjecture. For good measure I'll
also challenge ALL of the propositions!

 > In order to give the religious debate on C++ vs. Objective-C a more
 > solid (well, a little less vaporous) foundation, consider the
 > following propositions that were proposed to me recently:

 > 1. Smalltalk-like languages (including Obj-C) do a better job of
 > separating specification and implementation than Simula-like languages
 > (including C++). 

Not at all! They do a different and often inferior job of this separation.
Let us first consider the key difference between the two styles of language
and to simplify matters let us compare Smalltalk and C++ as representatives
of their respective schools of thought. Each carries scars from its
history, each lacks features provided by newer and more ``researchy'' 
members of their family, and each has proven itself successful in its
own core application areas. Furthermore, I like both better than most
of the other languages and systems in the OOP arena.

Naturally, I have to make some simplifications in this discussion. The topic
is more suitable for a couple of major dissertations than a comp.lang note.
Try considering the major points rather than simply looking at each sentence
finding the 5 qualifications necessary for making it strictly true. I probably
know them too, but I'm not writing a Ph.D. -- Actually, it would be nice if
someone would do a really thorough discussion of these issues.

In my opinion the key issue/difference is the type system. Most of my comments
will relate to this:

	ST relies exclusively on run-time type checking
	C++ relies extensively on static type checking

So, what does ``separating specification and implementation'' mean?
For now, let us consider this by comparing the answers to the following
questions:

	I changed the implementation of something,
		how do I know I didn't change the specification?
		how are clients affected? and
		what do I have to do to get the program running again?

Ideally, with a perfect separation of specification and implementation
the answers are ``it's obvious,'' ``not at all,'' and ``nothing''.

C++ class declarations provide the user with the ability to specify strongly
typed interfaces. If you change the interface it is obvious - and clients
depending on the interface will fail to compile. Furthermore you can specify
separate interfaces to different kinds of clients. In particular, you can
provide one interface to the general public, another to implementors of
derived classes, and a third to yourself (see for example Alan Snyder's
paper for OOPSLA'86 or Barbara Liskov's formulation of the same ideas in
the OOPSLA'87 keynote address). Smalltalk has problems in this area.
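
The three interfaces mentioned above can be sketched roughly as follows.
This is only an illustration: the names Account, SavingsAccount, adjust()
and demo_interfaces() are hypothetical and do not come from the original
note.

```cpp
// Hypothetical classes, invented for illustration only.
class Account {
public:                        // interface for the general public
    explicit Account(long cents) : balance_(cents) {}
    virtual ~Account() {}
    long balance() const { return balance_; }
protected:                     // interface for implementors of derived classes
    void adjust(long delta) { balance_ += delta; }
private:                       // interface for the class itself
    long balance_;
};

class SavingsAccount : public Account {
public:
    explicit SavingsAccount(long cents) : Account(cents) {}
    // A derived class may use the protected interface.
    void add_interest(long cents) { adjust(cents); }
};

long demo_interfaces()
{
    SavingsAccount s(1000);
    s.add_interest(50);
    // s.adjust(1);      // would not compile: adjust() is protected
    // s.balance_ = 0;   // would not compile: balance_ is private
    return s.balance();
}
```

Uncommenting either of the two marked lines should turn a misuse of the
interface into a compile-time error rather than a run-time surprise.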

There are of course many subtle dependencies that are not amenable to
verification by static type checking such as ``f() must be called before g()''.
This is a problem in every language. C++'s constructors and destructors
and initialization rules provide a few features in this direction and
dynamic checking of all sorts of properties can always be done. I do,
however, strongly prefer declarative properties that can be checked before
execution to properties that are essentially dynamic in nature and can only
be checked when running.
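
As a rough sketch of how constructors and destructors can capture an
ordering rule of the ``f() must be called before g()'' kind: the
constructor performs the first step, so the second cannot be reached
without it, and the destructor undoes the first step automatically.
Device, Session, use() and demo_session() are hypothetical names assumed
for this example.

```cpp
// Hypothetical resource with an ordering rule:
// it must be opened before it is used, and closed afterwards.
struct Device {
    bool opened;
    int  uses;
    Device() : opened(false), uses(0) {}
};

class Session {
public:
    // The constructor is the "f()" step: a Session cannot exist unopened.
    explicit Session(Device& d) : dev_(d) { dev_.opened = true; }
    // The destructor undoes it on every exit path from the scope.
    ~Session() { dev_.opened = false; }
    // The "g()" step is only reachable through a constructed Session.
    void use() { ++dev_.uses; }
private:
    Device& dev_;
    Session(const Session&);             // copying disabled: declared, not defined
    Session& operator=(const Session&);
};

int demo_session(Device& d)
{
    Session s(d);    // open
    s.use();
    s.use();
    return d.uses;   // s is destroyed after this value is read, closing d
}
```

This covers only the simplest ordering rules; anything more elaborate
still needs dynamic checking.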

The snag with run-time checking is that it is notoriously difficult to
ensure that errors detected at run-time are handled in a reasonable manner.
Entering even the most advanced debugger after detecting a vector range
violation or a ``method not found'' is useless if there is no programmer
present.

So C++ can detect a large class of problems at compile time, but what can
it do once they are detected? This is the notorious and I think largely
misunderstood ``header file problem''. Smalltalk doesn't have it: once you
have decided what to change, you simply do it and keep running. The effect
of a change ``instantly'' affects the whole program.

Once you make a change in a C++ program you need to recompile. If you simply
make a change to a member function, in principle you need only to recompile
and link that function (and there are systems being built that do exactly
and only that). The problem comes when you make a change to something that
is part of an interface. Then you need to recompile the clients too.
That is the essential price you pay for having the static checking. I contend
that in all but the most trivial programs it is worth it. In this, I am backed
by evidence of actual compile, debug, and integration times of medium sized
(<500K lines of code) projects.

As I have often said before C++ needs a tool that determines the minimal
recompilation needed after a change and preferably an incremental compiler
and linker to minimize the work of doing this recompilation. A UNIX `make'
that recompiles the world because you changed a comma in a comment in a
heavily included header file simply isn't a suitable component in a C++
programming environment.

There is one C++ design decision that affects the amount of recompilation
that people usually fail to appreciate (I think largely because of the lack
of tools). The private part of a C++ class is part of the declaration of the
class itself and a change to it may force the recompilation of clients.
For example:
		class X {
			int a;
		public:
			void f();
		};

		int main()
		{
			X x;
			x.f();
		}

If you change X to

		class X {
			int a;
			int b;
		public:
			void f();
		};

main() will have to be re-compiled. The reason is that by declaring an automatic
(on the stack) variable the client main() required knowledge of the size of X
(the compiler needs that information to generate code for the function call and
return since that requires knowledge of the size of the stack frame).
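
The size dependency can be made visible with a small sketch. The two
versions of X are given distinct hypothetical names (X_v1, X_v2) so that
both can live in one file; frame_bytes_v1() and frame_bytes_v2() are
likewise invented for the example.

```cpp
#include <cstddef>

// The class before the change (hypothetical name X_v1).
class X_v1 {
    int a;
public:
    void f() {}
};

// The class after adding a private member (hypothetical name X_v2).
class X_v2 {
    int a;
    int b;
public:
    void f() {}
};

// A client holding an automatic variable needs room for the whole object,
// so the compiler must know these sizes when it compiles the client.
std::size_t frame_bytes_v1() { return sizeof(X_v1); }
std::size_t frame_bytes_v2() { return sizeof(X_v2); }
```

On typical implementations the second size is larger than the first,
which is why code compiled against the old declaration can no longer be
trusted after the change.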

Clearly that could be avoided (even in a C++ implementation) if we were
willing to use a smarter link environment or code generation strategy
(so that the layout of stack frames was handled dynamically), but that
would seriously hurt C++'s portability and its ability to fit into a
traditional environment.

Alternatively we could use the trick of never allocating class objects on
the stack. This is the strategy of Simula and most of its descendants.
The snag is that if we did that we would incur an overhead of two memory
management operations per function call and the cost of indirecting every
access to a class object. Measurements on Simula indicate that this cost
is at least a factor of 2 in run-time. Most of Simula's descendants pay
an even higher price. Accepting this overhead would imply giving up large
application areas to C, assembler, and Fortran. C++ was specifically designed
to preserve efficiency in this area. The apparent cost is recompilation time.

However, if you don't use these features, that is, if you don't declare
automatic or static variables of a type, if you don't have inline functions
that depend directly on the contents of the private part in a type, and if
you don't take the sizeof a type THEN the users of a type are insulated from
changes in the implementation of a type EXACTLY as in languages that always
use indirection in the access to class objects. For example, had main()
been written like this:

	int main()
	{
		X* p = new X;
		p->f();
	}

it would have been unaffected by the change to X's representation - and it
would of course have incurred the run-time cost.
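
One way to obtain this kind of insulation can be sketched as below. It
is a variant of the example above, not the same code: the allocation is
moved behind a hypothetical function make_x() so that the client sees
only an incomplete declaration of X. The functions make_x(), x_f(),
destroy_x() and client(), and the file names in the comments, are
assumptions made for the sketch.

```cpp
// --- What clients would see (imagine a header, say x.h) ---
class X;                    // incomplete type: size and members are hidden
X*   make_x();
int  x_f(X* p);
void destroy_x(X* p);

// --- Client code: depends only on the declarations above ---
int client()
{
    X* p = make_x();
    int r = x_f(p);         // every access goes through the pointer
    destroy_x(p);
    return r;
}

// --- The implementation (imagine a separate source file) ---
class X {
    int a;
    int b;                  // adding or removing members here leaves client() untouched
public:
    X() : a(1), b(2) {}
    int f() { return a + b; }
};

X*   make_x()        { return new X; }
int  x_f(X* p)       { return p->f(); }
void destroy_x(X* p) { delete p; }
```

The price is the one described above: a free-store allocation and an
indirection on every access.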

My contention is that often run-time cost matters and often it doesn't.
C++ serves you in both cases. However, the current C++ tools do not
help you sufficiently. Curiously enough I was building tools for finer
grain dependency analysis and tools for taking advantage of it 4 years
ago. It is not particularly hard, but the explosion of C++ use distracted
me. What you see here is not a design flaw in the C++ language but a
deficiency in the available support tools. This deficiency is finally
being remedied.

 > 2. Smalltalk-like languages are better tools for developing small
 > programs because of their massive built-in class libraries and their
 > more flexible (later, dynamic) binding which allows polymorphic types. 

Again I must disagree. Of course you can throw a small Smalltalk program
together to do many things that would be painful to build from scratch in
C++. However, the massive libraries and the wonderful program development
environment of Smalltalk are not something that C++ lacks because of some
inherent defect. Rather, Smalltalk is about 10 years older than C++ and
has had something like 100 times more effort and resources lavished on
its environment and libraries. I am in particular looking forward to
trying ParcPlace's Cynegy C++ program development environment. It has
the potential of bringing some of the Smalltalk expertise to bear on the
fairly well understood problems with C++'s (lack of) program development
tools/environment.

If you consider UNIX and MS-DOS to be C++ toolsets - imperfectly inherited
from C - C++ fares a bit better, but C++ can and will progress much further
in the programming environment and standard libraries areas. Where it comes
to using a traditional tool such as a data base system, a standard (grubby)
operating system interface, a Fortran engineering library, a C device driver,
etc. C++ has an edge. If your small program needs to run on a very small
machine, an unusual machine, or a mainframe, C++ has an edge. The ability
to coexist and ease of porting can be essential even for very small projects.

 > 3. Simula-like languages are better tools for medium size software
 > development because for these larger projects it is worth building
 > specialized class hierarchies for the particular system, yielding a
 > better match to conceptual structures and increased programmer and
 > run-time efficiency.

Here I was naturally sorely tempted to agree, but #3 also is an
oversimplification that could confuse the issues. It really also depends
on how you define ``medium''. If medium is defined as programs that it
takes more than one person more than one year to build I agree, but here
of course the choice of language/system can radically affect the perceived
size of the project. If the problem is a good ``fit'' for a Smalltalk
system on a suitably sized workstation you can see spectacular benefits
from using Smalltalk. However, if that fit isn't there using Smalltalk
could turn into an exercise in horrendous contortions.

For all sizes of projects it is essential to choose a suitable tool for the
problem. There is no perfect tool.

 > Now for the leap...

 > 4. The Smalltalk-like languages will be the better tools for large
 > software development because item #1 above will kick in to make the
 > project more manageable.

Well, I disagreed with #1 so I could rest my case here.

 > Opinions?

There is absolutely no basis for this conjecture and the absence of really
large projects successfully developed and supported in a Smalltalk-like
language is a contraindication. After 16 years of Smalltalk variants and
offshoots we should not be conjecturing on this point. Smalltalk has a
string of spectacular successes to its credit, but as far as I'm aware
large scale systems development isn't among them. Nor was it mentioned
among the things Smalltalk was supposed to be especially good at.

The Smalltalk imitations, hybrids, and commercial offshoots have consistently
been far inferior to ``the real thing''. 

For larger projects C++'s strong static type checking provides essential
benefits in the integration phase by providing mechanisms for documenting
interfaces in a way that can be mechanically checked. For many-person projects
the ability to plug exclusively dynamically checked components together with
great ease and no (static) checking simply defers checking until after
integration where nobody is able to comprehend the total system anyway.

This is the reason for all the incompatible plugs we have in the hardware
arena. You don't really want to be able to plug everything together. People
using electric shavers appreciate not being able to plug them into high
power outlets.

The value of a Smalltalk like environment and superb debugging features is
greatly diminished when the system is operated by someone who does not
understand the total system, isn't allowed to change it on the fly, and
might not even be a programmer. Also, it still is (and I expect it will remain)
most unusual for a large system to fit on a (single) personal workstation.
The ST-style programming environments still appear to be primarily aimed
at serving a single user on a single system. Often a large software system
will have to serve a large user-community on a large (typically ugly and
often diverse) hardware base.

Also run-time efficiencies often matter when you start pressing against the
limits of the hardware (memory sizes, mainframe CPU's, network connections,
disc bandwidth). Here C++ comes into its own.

LARGE system development is typically a mess. The state of the art is poor.
There is much that can be done and I think that OOP (in its various incarnations)
has a vital role to play, but we have a long way to go yet. I am convinced,
however, that statically typed interfaces that can provide a base for
documentation and design methods allowing relatively large numbers of people
to cooperate are an essential part of any remedy for LARGE system development.

Anyway, I consider the belief that there is exactly one right way of doing
things and especially the belief that there is exactly one right language
for doing it in an infantile disorder.

		``Anything works on small projects.''
			- James Coggins

Thanks, Jim, for posting these propositions. 

	- Bjarne Stroustrup