Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!agate!shelby!neon!craig
From: craig@Neon.Stanford.EDU (Craig D. Chambers)
Newsgroups: comp.object
Subject: Re: Examples of Multiple Inheritance?
Message-ID: <1990Dec22.220016.19624@Neon.Stanford.EDU>
Date: 22 Dec 90 22:00:16 GMT
References: <47353@apple.Apple.COM> <1990Dec17.223306.25756@Neon.Stanford.EDU> <PCG.90Dec22154707@odin.cs.aber.ac.uk>
Organization: Stanford University
Lines: 98

In article <PCG.90Dec22154707@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>On 17 Dec 90 22:33:06 GMT, craig@Neon.Stanford.EDU (Craig D. Chambers) said:
>
>craig> If you're concerned about space overheads of MI, other implementations
>craig> of MI than the one in C++ 2.0+ have low space overheads, comparable to
>craig> pre-MI C++ implementations (i.e. a word or two extra per object).
>
>This is not strictly true: the space overhead of MI in C++ 2.x when
>implemented without code thunks is not in the object, but in pointers to
>virtual member functions for the object's class. Multiple virtual
>inheritance also requires N-1 extra pointer words in the object
>for each subobject virtually inherited N times.

I was talking about per-object space overhead, i.e. extra space in the
representation of an object over the representation of its components.
There is additional per-class overhead too (in C++, the virtual
function pointer arrays; in other languages, message lookup tables);
if a class has many instances, then the per-class overhead becomes
relatively small.
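
To put rough numbers on that amortization, here is a minimal sketch.
The function name and the byte counts in the usage note are made up
for illustration; they are not measurements of any real system.

```cpp
#include <cassert>
#include <cstddef>

// Average bookkeeping bytes charged to each instance under a simple
// (hypothetical) model: every object pays its own overhead, and the
// per-class tables are shared equally among all live instances.
double overhead_per_instance(std::size_t per_object_bytes,
                             std::size_t per_class_bytes,
                             std::size_t instance_count) {
    assert(instance_count > 0);  // the model is undefined for zero instances
    return static_cast<double>(per_object_bytes)
         + static_cast<double>(per_class_bytes)
               / static_cast<double>(instance_count);
}
```

With an assumed 8 bytes per object and 400 bytes of per-class tables,
one instance is charged 408 bytes and each of 100 instances is charged
12, so the per-object term dominates once a class is heavily
instantiated.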

>craig> C++'s MI implementation is designed to trade away some space per
>craig> object to get faster message lookup
>
>No, the "problem" is that in C++, but for one case, all components of an
>object, whether inherited or not, are expanded inline, and functions are
>not dynamically overloaded by default.
>
>In the languages you mention functions are dynamically overloaded by
>default and subobjects are accessed by a very convenient extra
>level of indirection. The extra overhead of MI is thus absorbed in these
>two already existing overheads. As Dave Weinreb remarks in another
>article, there is an overhead of one pointer for every member of a CLOS
>object, for example.

I think this is a strange viewpoint.  C++ allows the programmer to
declare that certain things should be implemented more efficiently,
with a sacrifice in programming power and language simplicity.  One
thing you mentioned was allowing components to be either referenced by
pointer (what I would consider the normal case, and the only possible
case in other higher-level languages) or contained in-line.  In-line
containment sacrifices the ability to share the component, and it
makes certain other tasks of the run-time system, such as GC, harder
to implement efficiently, since you might get interior pointers.  The
other thing you mentioned is that C++ allows the programmer to
explicitly distinguish virtual from non-virtual functions, with
non-virtual functions sacrificing language power in the name of
efficiency.

In my C++ programming, I rarely use in-line components (almost always
using pointers instead) and avoid non-virtual member functions.  In
these cases, C++ has the same "overhead" as the other languages we
talked about: 1 word per component for the pointer, plus extra space
to handle the dynamic binding of virtual functions.  The programmer
can explicitly opt to sacrifice language power to get some space and
time efficiency, but I wouldn't compare C++ to other languages by
assuming that C++ programmers always perform these optimizations.
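
The two choices can be sketched as follows.  All type and function
names here are hypothetical, and the size remarks in the comments
describe what typical compilers do rather than anything the language
guarantees.

```cpp
#include <cassert>
#include <cstddef>

struct Point { int x; int y; };

// Components held in-line: no pointer words, but a Point stored here
// cannot be shared with another object.
struct InlineRect { Point origin; Point corner; };

// Components held by pointer: one extra word per component, and two
// rectangles may refer to the same Point.
struct LinkedRect { Point* origin; Point* corner; };

// No virtual functions: typical compilers add no hidden word.
struct PlainCounter {
    int n;
    int get() const { return n; }
};

// Virtual functions: typical compilers add one hidden pointer to a
// per-class table of function pointers.
struct VirtualCounter {
    int n;
    virtual ~VirtualCounter() {}
    virtual int get() const { return n; }
};

// True when both rectangles refer to the very same origin Point.
// There is no InlineRect equivalent, since each one owns its Points.
bool shares_origin(const LinkedRect& a, const LinkedRect& b) {
    return a.origin == b.origin;
}
```

Comparing sizeof(PlainCounter) with sizeof(VirtualCounter) on a given
compiler shows the cost of dynamic binding; comparing InlineRect with
LinkedRect plus the Points it refers to shows the cost of the
indirection.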

Also, all class-based languages I know of embed the representation of
their superclasses into the subclass, so there's no overhead here
(sharing of the superclass is not an issue).  Self, a classless
language, forces the programmer to explicitly define how he wants to
handle parent prototype objects, by building the appropriate object
structures.  Our current style uses separate linked objects for each
ancestor's prototype (see the "Organizing Programs without Classes"
paper for more details), leading to extra space costs, but this is up
to the programmer.  He could avoid this chaining and manually copy
down "instance variable" declarations to save space.  We don't
normally do this because we aren't concerned that much about the small
space overheads involved.

The fact still remains, then, ignoring the "overhead" of pointers and
dynamically-bound message passing, that other implementations
typically use an extra word or two of space *per object*, independent
of whether MI is used or even a feature of the language.  MI C++, on
the other hand, uses more space *per object*.  C++ pays this
space overhead to make sure that message passing is fast even in the
worst case (just an indirect procedure call, plus some extra pointer
arithmetic).  Other languages save space per object by using a slower
message passing implementation in the worst case (e.g. probing hash
tables/scanning inheritance graphs); they sometimes rely on other
techniques (e.g. in-line caching) to speed message passing in the
normal case.
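
A minimal sketch of where the per-object space and the "extra pointer
arithmetic" show up in MI C++.  The class names are hypothetical, and
the layout described in the comments is what common compilers produce,
not something the language definition promises.

```cpp
#include <cassert>
#include <cstddef>

struct Shape {
    virtual ~Shape() {}
    virtual int area() const { return 0; }
};

struct Named {
    virtual ~Named() {}
    virtual int id() const { return 0; }
};

// Single inheritance: typically one hidden table pointer per object.
struct Square : Shape {
    int side;
    int area() const { return side * side; }
};

// Multiple inheritance: typically one hidden table pointer per
// polymorphic base, so each object carries an extra word.
struct NamedSquare : Shape, Named {
    int side;
    int area() const { return side * side; }
    int id() const { return 7; }
};

// Byte distance from the start of the object to its Named part.
// The implicit upcast is where the compiler adjusts the pointer.
std::ptrdiff_t named_offset(NamedSquare& obj) {
    Named* as_named = &obj;
    return reinterpret_cast<char*>(as_named)
         - reinterpret_cast<char*>(&obj);
}
```

Calling id() through a Named* is still just an indirect call through
the second table; the price is the extra word in every NamedSquare and
the pointer adjustment on the upcast.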

Of course, post-MI C++ has more per-class overhead than pre-MI C++.
Virtual function pointer arrays are twice as big, and a single class
may have more than one of them.  But it's hard to compare per-class
costs with other systems that have radically different architectures.
In caching implementations, for example, the per-class cost can change
over time.
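
As an illustration of a per-class cost that changes over time, here is
a toy lookup scheme with a per-class cache.  This is a simpler relative
of in-line caching (which caches at each call site), it models single
inheritance only, and none of the names come from a real Smalltalk or
Self implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef int (*Method)(int);

struct ClassInfo {
    const ClassInfo* parent = nullptr;      // single parent, for brevity
    std::map<std::string, Method> methods;  // methods defined by this class
    std::map<std::string, Method> cache;    // filled lazily by lookup()
};

// Slow path: walk up the inheritance chain until the selector is found.
Method lookup_slow(const ClassInfo* cls, const std::string& selector) {
    for (; cls != nullptr; cls = cls->parent) {
        auto it = cls->methods.find(selector);
        if (it != cls->methods.end()) return it->second;
    }
    return nullptr;  // message not understood
}

// Fast path: consult the receiver class's cache first, and record the
// result of a successful slow lookup so the next send is cheap.
Method lookup(ClassInfo& cls, const std::string& selector) {
    auto hit = cls.cache.find(selector);
    if (hit != cls.cache.end()) return hit->second;
    Method found = lookup_slow(&cls, selector);
    if (found != nullptr) cls.cache[selector] = found;
    return found;
}
```

The cache starts empty and gains one entry per distinct message
actually sent to instances of the class, so the per-class space
depends on the program's history rather than on the class definition
alone.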

One final point: per-object overhead is more important than per-class
overhead because it affects the time to create new objects.  The
smaller an object is, the faster it is to create (fewer memory words
to initialize).  In a GC'd system, smaller objects also lead to less
frequent (or faster) GCs, further speeding the system.

-- Craig Chambers