Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!seismo!mcnc!unc!rentsch
From: rentsch@unc.UUCP
Newsgroups: comp.lang.misc
Subject: Re: OOPS
Message-ID: <814@unc.unc.UUCP>
Date: Sun, 8-Feb-87 01:09:25 EST
Article-I.D.: unc.814
Posted: Sun Feb  8 01:09:25 1987
Date-Received: Mon, 9-Feb-87 01:38:59 EST
References: <364@oracle.tc.fluke.COM>
Reply-To: rentsch@unc.UUCP (Tim Rentsch)
Distribution: comp.lang.misc
Organization: CS Dept, U. of N. Carolina, Chapel Hill
Lines: 243

An advocate of dynamic binding (that's me!) enters the fray.  What
follows, however, is not a rebuttal but an attempt at explanation
and presentation of a point of view.  Kurt, you will see that your
article has been reordered somewhat as I have tried to respond --
please bear with me to the end, and see what you think.

Note to potential Follow-uppers:  if someone has something
constructive to say, fine.  But please avoid the temptation to quote
a line or two out of context and reply debate style.  You will be
wasting our time and your breath.  I realize nits can be picked, but
the points I present here are made over paragraphs, not sentences.
Clear?

In article <364@oracle.tc.fluke.COM> Kurt Guntheroth writes:
> Also, I have repeatedly heard the claim that dynamically bound
> languages are "just as fast" as static languages.  OK, make me a
> believer.  Who says so?  

I will try.  Please excuse the rather roundabout path we travel to
the destination.  

Let us for the moment ignore the possibility of "optimizing away" the
actual call in a staticly bound procedure call (which possiblity
would only help the statically bound case).  Even so, one statically
bound procedure is faster than one dynamically bound procedure call
(on the average).  

The reason for this slowdown is that the dynamic binding takes an
extra step over static binding, which must (sometimes) be done
sequentially.  [The "sometimes" is also known as the "you can't guess
right all the time" provision -- which explains the "on the average"
from the previous paragraph.] 

So, on some level, the claim is false.  But, in another way the
claim is true, and more.  Please read on.


> I read a paper by a guy who said LISP could be optimized to a C-like
> level.  [My summary: the comparison was "apples and oranges".]

It seems inherently difficult to compare early-bound and late-bound
languages, and the LISP-C comparison author apparently fell into one
of the many problem areas.  Even so, it is worth *something* to know
that compiled LISP runs as fast as PCC [the C compiler in the
comparison], because then there is no reason to choose PCC over LISP
on the basis of performance (and presumably LISP wins on other
counts, but that's a different story).  That's good to know for
those out there who have no C but PCC and prefer LISP, but it still
does not show that LISP is as fast as any C (i.e., statically bound
language).  Still, let's keep going.


> One of the touted advantages of Smalltalk when it is called object
> oriented is dynamic binding.  Specifically you can send a "foo" message
> to any class.  The designer of the system insures that all relevant
> objects provide a method to "foo" themselves.  Naturally, if an object
> doesn't know how to "foo" itself, you get a message not understood
> error.  So here's the question...What has dynamic binding bought you?

Before answering this let me give some factual answers to the next
few questions.

> Don't you have to inspect something in the object or
> remember when you created an object what type it is so you know it can
> receive a "foo"?  

If by "you" you mean the interpreter, the answer is yes.  If by "you"
you mean the programmer, the answer is no.  The Smalltalk interpreter
does indeed inspect something, which is to say the class of the
object in question, said class indeed having been stored by the
interpreter when the object was created.  The programmer is not
responsible for either storing the class or inspecting it.

> Don't you have to know pretty much what object you are sending "foo"
> messages to?  

No, and let me give an example.  Suppose we want to pass a parameter
X which acts like a stream of Floats, specifically X responds to
the message 'next' and always returns a Float.  X could be any
Stream with Floats in it (Stream being one of the supplied classes
in Smalltalk), but X could also be an instance of class Random!
Now, Stream and Random have almost nothing in common (they happen to
share the 'next' message, and their only common superclass is
Object).  This example shows a useful case where we know essentially
nothing about the object X (only that it responds to 'next' and
returns a Float).

> Ignore the boring case where ALL your objects know
> how to "foo".  You might just as well have used a language with early
> binding and inheritance (CLU, Simula, ...)  

(1) Minor correction:  CLU's provision for inheritance is weaker even
than Simula's.  (2) Binding in Simula is not exactly late, but it's
not early either -- virtual procedures require indexing the virtual
procedure vector, and hence constitute a weak form of delayed
binding.  (3) It is important to distinguish the "boring" case where
all objects know how to 'foo' *the same way* (i.e., with the same
procedure) from the "less boring" case where all object know how to
'foo' but they all do it *differently*.  Note that the second
requires more dynamic binding than the first.


> Finally, I should say that I consider languages like CLU, Simula, HOPE, etc
> to be a major improvement on C.  Inheritance and information hiding are a
> necessary part of any state-of-the-art language.  I don't think the votes
> are all in on late binding yet though.

Side comment:  I detect a hint of confusion between static binding
and static typing.  At the risk of repeating myself I caution against
such confusion -- static type checking is perfectly compatible with
dynamic binding.  This compatibility does not, however, mean that
statically checked systems can be statically bound.  And, I apologize
if in fact no confusion indeed exists.  [And, glad to see more people
realizing the value of inheritance etc as an improvement over (ugh!)
C.] (End of side comment.)  

Dynamic binding is slow on current machines for the same reason that
indexing was slow in early machines:  there is no hardware assist to
speed it up.  Note that, even in current machines, indexing is not
free; even so, the (time) cost is so small, and the benefits so
large, that no serious machine architect of today would consider
leaving it out.

Dynamic binding is similar.  Late bound procedure calls will always
cost more (measured on a per-call basis) than early bound calls.  But
with appropriate hardware support, the overhead on late binding can
be reduced to a level where the benefits clearly outweigh the costs
(sorry, no argument, just my opinion -- but read on to the advantages
and see if you don't agree with me).  Continuing the parallel with
indexing, just because late binding is provided doesn't have to mean
early binding is prohibited -- just that early binding is used only
rarely, in only those cases where the last drop of speed is
important.  


> What has dynamic binding bought you?

We finally return to the $64 question.

The most immediately obvious consequence of Smalltalk's dynamic
binding is incremental compilation.  That is, I can start running
Smalltalk, pick an arbitrary procedure (Smalltalk calls them methods)
anywhere in the system, edit it, recompile and relink it into the
running system, with immediate effect everywhere, in only a few
seconds!  My canonical example when giving introductory Smalltalk
demos is floating point multiplication (Float *), which I usually
change to return the value 50, independent of the arguments.  Bang!
The change takes effect instantly.  Live with incremental compilation
for only a little while (a week, say), and you won't want to give it
up.  

A more subtle advantage of the particular brand of dynamic binding in
Smalltalk is insensitivity to certain kinds of difficulties which
other languages usually call "errors".  Consider the example I just
gave:  Well, that returned value was *integer* 50, not Float 50.0.
Even ignoring the fact of wrong result value, most systems would have
problems because the "shape" of the answer was wrong, i.e., integer
rather than floating point.  Smalltalk does misbehave slightly
because 50 is not the right answer, but it does NOT misbehave to the
extent of interpreting 50 as some incredibly ridiculous floating
point number.  This robustness results directly from dynamic binding
-- because which operation is invoked depends on the receiver's class
at the very last moment.  (If you have access to a Smalltalk system I
urge you to try this experiment yourself -- just comment out the
method definition for Float * and substitute ^50.  The results are
mildly amusing, and certainly not catastrophic.  Change the ^50 to
^100 to further illustrate the change.  Then (of course) change it
back to be the original method.)

Once you get used to writing code in typical Smalltalk style (some
would say object oriented programming), you find that your
procedures are naturally very polymorphic (Valley girls might even
say "maximally polymorphic" :-).  This polymorphism gives tremendous
leverage in terms of code reusability, and in other ways more
difficult to explain.  A true-life example:  I wrote a tiny but
intellectually complex set of procedures to perform tensor
multiplication.  Ever try to test a general tensor product?  There
are so many numbers it is very difficult.  But, by defining String *
(multiplication) to be concatenation, and String + (addition) to be
concatenation with a "+" in the middle, I have enough symbolic
manipulation machinery to test my routines symbolically!  All that
remains is to put together the example symbolic tensors, multiply
them together, and look at the result -- because the code for
multiplication depended only on the availibility of '+' and '*' as
operations on the tensor elements.  Recompilation of the tensor code
is not required, by virtue of dynamic binding.

Even given dynamic binding, full polymorphism is difficult when
constrained by completely static typing.  (Not all the votes on
static typing are in yet, either.  :-) [Please note the ':-)' and
don't bother to flame.]  If even just a part of your system is not
type checked, dynamic binding ala Smalltalk can turn messages like
'bus error -- core dumped' into 'message not understood'.  If even
ONE 'core dumped' message is changed into something more reasonable
(and recoverable), that's a win in my book.  This is another
advantage of dynamic binding.

Finally, and saving the best for last, dynamic binding can actually
be faster than static binding, not in the small but in the large.
The argument goes as follows:

Smalltalk style polymorphism supports 'factoring', the property of
one piece of code being in only one place.  Smalltalk's dynamic
binding allows the factored source code to be compiled into only
one (factored) object-code object, which is shared among all the
various "types" which use it.  Because of factoring, code growth is
more linear as a function of system functionality, which is to say
Smalltalk object code get larger less quickly than statically bound
systems.

At some point -- exactly where depends on what you are building and
what machine you are running on -- the smaller Smalltalk code will
run faster, not because it is inherently faster but because the
smaller system will fit in main memory of the machine on which you
are running.  This has long been recognized but rarely acknowledged
-- it is system performance rather than microscopic performance that
counts.  It doesn't take a large ratio of stuff on the other side of the
"access cliff" of disk memory before the smaller system starts to
run faster because the larger system is running at disk speeds
rather than memory speeds.  Remember, with main memory speeds of
today, disk runs 5 to 6 orders of magnitude slower.  And, 5 to 6
(decimal) orders of magnitude are quite a lot.

Factoring at the source code level can be accomplished by a variety
of language designs, including statically bound ones.  But,
factoring at the object code level, which is what is needed for
smaller systems, is better supported by dynamically bound languages.
So, for any given size of main memory, more highly factored systems
can pack more functionality in the space available.  I claim that
dynamic binding makes for more factored systems.


Well, that's it.  If I didn't convince you, I hope at least you have
more to think about.  

cheers,

Tim