Xref: utzoo comp.lang.misc:7194 comp.object:2976 comp.lang.eiffel:1481
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!usc!apple!agate!stanford.edu!neon.Stanford.EDU!hoelzle
From: hoelzle@neon.Stanford.EDU (Urs Hoelzle)
Newsgroups: comp.lang.misc,comp.object,comp.lang.eiffel
Subject: Re: CHALLENGE: typing and reusability
Message-ID: <1991Mar31.215721.18687@neon.Stanford.EDU>
Date: 31 Mar 91 21:57:21 GMT
References: <22032@yunexus.YorkU.CA> <11820:Mar1923:59:3591@kramden.acf.nyu.edu> <19MAR91.22493670@uc780.umd.edu> <18271:Mar2013:19:1091@kramden.acf.nyu.edu> <1991Mar20.214231.3411@neon.Stanford.EDU> <jls.669591628@rutabaga> <1121@tetrauk.UUCP>
Organization: Computer Science Department, Stanford University, Ca , USA
Lines: 127

NOTE: the following discussion is relatively long.  If you don't want to 
read all of it, skip to the "executive summary" at the end of the article.

Several people (e.g. paj@mrcu (Paul Johnson), rick@tetrauk.UUCP (Rick
Jones) and bertrand@eiffel.UUCP (Bertrand Meyer)) have argued that the
types A and B in my example should be related since they have common
behavior.  With this change, they showed how the problem can be solved
using multiple inheritance.

This is indeed true, but my argument rests on the claim that such a
common ancestor cannot be arranged in the Real World if we assume a
high degree of code reuse.  [I originally posted the challenge because
someone had claimed that static typing has absolutely no effect on
reusability.]

Here's the scenario on which my claim is based:

1) You haven't written A and B yourself but bought them from someone.
   That is, you cannot change A or B to inherit from a common base
   class.  I consider this an absolutely unavoidable problem (more
   justification below).

   One good objection is that dynamic typing wouldn't help either:

|> If A and B had completely independent designers then even in Smalltalk
|> it is unlikely that the two classes would have this commonality:
|> different names and different arguments would probably have been
|> chosen.

   The solution to this is to subclass A and B (with classes SubA and
   SubB) in order to define the necessary "interface glue". In a
   statically typed language, SubA and SubB would both inherit from a
   common superclass (FooDisplayable).

   So far, so good - no real disadvantages for statically-typed languages.
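
The glue described in 1) might look roughly like the following Python
sketch.  A, B, SubA, SubB and FooDisplayable are the names used above;
the method names (draw, render, display) and the helper show_all are
invented purely for illustration and assume that the two vendors chose
different names and arguments.

```python
from abc import ABC, abstractmethod


class A:
    """Stand-in for a purchased class whose source we cannot change."""

    def draw(self):  # hypothetical vendor method name
        return "A drawn"


class B:
    """Stand-in for a second purchased class from an unrelated vendor."""

    def render(self, scale):  # hypothetical: different name and arguments
        return f"B rendered at scale {scale}"


class FooDisplayable(ABC):
    """Common supertype that we, the clients, introduce ourselves."""

    @abstractmethod
    def display(self):
        ...


class SubA(A, FooDisplayable):
    """Interface glue: maps display() onto A's own method."""

    def display(self):
        return self.draw()


class SubB(B, FooDisplayable):
    """Interface glue: maps display() onto B's own method."""

    def display(self):
        return self.render(1)


def show_all(items):
    """Client code that relies only on the FooDisplayable interface."""
    return [item.display() for item in items]
```

Note that the glue only covers objects we create ourselves as SubA or
SubB; instances of A or B produced inside the purchased code remain
plain A's and B's.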

2) Unfortunately, the solution above only works well for leaves in the
   inheritance tree.  If A already had subclasses C and D, it would be
   very tedious to subclass all of these, too.

   More importantly, it only works for "first-level imported types", but
   not for "second-level" types such as List[A]: in statically-typed
   languages, you cannot pass a List[SubA] in place of a List[A].  Thus,
   every piece of functionality which has second-level arguments
   *cannot be reused*.  I contend that such cases are frequent, since
   "container classes" (collections) are very useful.

   Bertrand Meyer argues that this problem (exemplified by someProc in
   my example) would not occur:

|> Of course, this means that B too must be a descendant of SPECIFIC,
|> and Mr. Hoelzle's point is probably that this is inappropriate since
|> some_proc will only be called with arguments of type LIST [A]. However
|> the problem does not arise in this way in ``pure'' object-oriented 
|> programming of the Simula/Eiffel (and, unless I am mistaken, Smalltalk) style.
|> In this world some_proc would not take an argument l of list type,
|> as in Mr. Hoelzle's statement of the problem given above, but
|> would be a procedure in a list class. 

   But this is just an artifact of my simplified example (I didn't
   want to include additional classes), and there are plenty of
   scenarios where some object takes a list of objects as one of the
   arguments of a function (e.g. scheduler.doSomething(list_of_processes)).
   Thus I still contend that the problem will show up very frequently:
   not every function which takes a list should be defined in a list
   class (actually, most of them probably shouldn't).
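
The "second-level" restriction from 2) can be sketched in Python with
type annotations.  This is only an illustration under assumptions: the
method names are hypothetical, and some_proc is written as a free
function that merely reads from its list argument.  Python does not
enforce annotations at run time, so the call below succeeds; a static
checker that treats List as invariant (mypy does, for example) would
reject the marked call even though nothing can go wrong in it.

```python
from typing import List


class A:
    """Stand-in for the imported class."""

    def name(self) -> str:  # hypothetical method
        return "A"


class SubA(A):
    """Our glue subclass; it adds behavior but is still an A."""

    def display(self) -> str:
        return "displaying " + self.name()


def some_proc(items: List[A]) -> List[str]:
    """Existing functionality declared for List[A]; it only reads."""
    return [item.name() for item in items]


subs: List[SubA] = [SubA(), SubA()]

# Runs fine dynamically.  An invariant static check refuses it, because
# List[SubA] is not considered a subtype of List[A].
result = some_proc(subs)
```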

3) The Real Problem (C) doesn't show up until you actually try to do
   this in the Real World (TM).  If you do indeed have a high degree
   of code reuse, you basically compose your new program out of
   existing pieces of functionality.  Typically, there will be dozens
   if not hundreds of these components in your program, selected from
   a library of thousands.  That is, reuse is very fine-grained.

   In this world, the "impedance mismatch" caused by the problems
   outlined in 1) and 2) becomes very annoying since the main job
   of a new program is to pass objects (and often collections of
   objects) from one subpart to another, and static typing hinders
   this communication because of the type system's restrictions.

   Furthermore, it is extremely unlikely that the designers of the
   thousands of predefined pieces of functionality have foreseen every
   possible commonality between them and factored the inheritance
   hierarchy accordingly.  In fact, I claim that there will always be
   different (and equally valid) conceptual views (and thus type
   hierarchies) of some domains.

   Therefore, we must assume that situations where the types A and B
   are unrelated (do not share the "right" supertype) are the rule
   rather than the exception, even in a well-structured world.

   As Paul Johnson correctly observes, type hierarchies must be very
   fine-grained in order to achieve true reusability.  Most pieces of
   functionality depend only on subsets of broader interfaces, and it
   is impossible to foresee all combinations of such subsets at design
   time.  And even if it were possible, the subtyping problems
   outlined above would make it hard to take advantage of the full
   potential for reusability.  Johnson's "sibling-supertype rule" would
   help here, but it still exhibits the problems with "second-level
   types" shown in 2).  Furthermore, the introduction of numerous
   "impedance-matching" types could impair programming productivity
   ("one gets buried in paperwork").

   It is here where dynamically-typed languages have a subtle but
   important advantage.  All pieces of functionality truly (by
   definition) depend only on that part of their argument's interface
   which they actually use.  Thus, programs written in such languages
   do not experience "impedance mismatch" and can achieve maximal
   reuse with minimal effort; the languages do not prevent reuse
   because of technicalities (i.e. restrictions in the type system).
   [Of course, the downside of dynamically-typed languages is that
   they also do not prevent abuse.  But this is a philosophical issue
   orthogonal to this discussion, and I don't want to start another
   language war ;-)]
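
A small Python sketch of the point made in 3), with hypothetical class
contents: A and B share no supertype other than the universal root,
yet display_all works on both because it depends only on the one
method it actually calls.  The last lines show the downside mentioned
above: misuse is detected only when the offending call is executed.

```python
class A:
    """Unrelated class number one (no common base with B)."""

    def display(self):
        return "A displayed"

    def save(self):  # extra interface that display_all never touches
        return "A saved"


class B:
    """Unrelated class number two."""

    def display(self):
        return "B displayed"

    def shutdown(self):  # extra interface that display_all never touches
        return "B stopped"


def display_all(items):
    """Depends only on display(), the part of the interface it uses."""
    return [item.display() for item in items]


mixed = display_all([A(), B()])  # a mixed collection poses no problem

# Nothing prevents abuse: an int has no display(), and we only find
# out when the call is actually made.
try:
    display_all([42])
    abuse_detected = False
except AttributeError:
    abuse_detected = True
```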

Executive summary: dynamically-typed languages facilitate the
communication between objects, whereas today's statically-typed
languages tend to create an "impedance mismatch" which can impair
communication.  In a world of high reusability, creating a new program
is mainly the task of coordinating the communication of existing
parts.  Therefore, this subtle advantage of dynamically-typed
languages can make all the difference in determining the degree of
reusability which can be achieved.

-- 
------------------------------------------------------------------------------
Urs Hoelzle                                            hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305