Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!apollo!williams_j
From: williams_j@apollo.HP.COM (Jim Williams)
Newsgroups: comp.lang.c++
Subject: For Philosophers and Lawyers
Message-ID: <519f23cc.19260@apollo.HP.COM>
Date: 17 May 91 19:30 GMT
Organization: Hewlett-Packard Apollo Division - Chelmsford, MA
Lines: 136

I would like to throw out some ideas for discussion.  Hopefully these
concepts have not already been discussed to death.


--------------------------- Example 1 ------------------------------------------

class A;

class B;

B* f(A* ap) {return (B*)ap;}

--------------------------------------------------------------------------------


The compiler accepts the above code with no problems.  But how can the compiler
cast an A pointer to a B pointer?  The compiler doesn't know anything (yet)
about A or B.  Is A derived from B?  Is B derived from A?   If B is virtually
derived from A, the cast in f() is illegal, but the compiler, not knowing if
B is virtually derived from A, signals no error.  Does this make sense?

Actually the problem derives from the fact that casting pointers to classes
can have two VERY different meanings.  The first is simply to change the
type without changing the binary representation.  The second is the primary
means of exploiting the polymorphic nature of class objects, and may require
adjusting the pointer value.  It is an anachronism from single inheritance C++
that these two meanings had identical implementations.  Unfortunately they now
have different implementations but still have the same syntax.

The following example exhibits this ambiguity in a much more subtle way.

--------------------------- Example 2 ------------------------------------------

class A {
     int take_up_space;
     };

class C;    // this declaration is required so B::next() can be defined

class B {
     B* next_b;                            // this is a pointer to another B;
     public:
     C* next() {return ( (C*)next_b ); }   // this function accesses next_b,
     };                                    // and casts it.

class C: public A, public B {};

main() {
     C* cp;
     // ....
     cp=cp->next();                        // This statement doesn't work right!
     //...
     }

--------------------------------------------------------------------------------

The cast from a B* to a C* requires subtracting an offset since B is offset by
the size of A within C.  In the above example, the compiler doesn't subtract
the offset because it doesn't know that C is derived from B at the time that
the inline function next() is defined.
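
The effect can be reproduced in isolation.  What follows is only a sketch and
not part of the example above: the names cast_before, cast_after and
offset_demo are made up for illustration, and it assumes a compiler that
treats a cast to a not-yet-defined class as a plain reinterpretation of the
pointer, which is the behavior described above.

```cpp
#include <cassert>

struct A { int take_up_space; };
struct B { B* next_b; };

struct C;                                   // C is still incomplete here

// Compiled before C is defined: the compiler cannot know that C is derived
// from B, so the pointer value is passed through unchanged.
C* cast_before(B* bp) { return (C*)bp; }

struct C : A, B {};

// Compiled after C is defined: the same syntax is now a base-to-derived
// cast, and the compiler adjusts the pointer to the start of the C object.
C* cast_after(B* bp) { return (C*)bp; }

// Hypothetical helper: returns true if the two casts behave as described.
bool offset_demo() {
    C c;
    B* bp = &c;                             // implicit conversion to the B part
    bool unchanged = (void*)cast_before(bp) == (void*)bp;
    bool adjusted  = cast_after(bp) == &c;
    return unchanged && adjusted;
}
```

Where the compiler places B after A inside C, the two functions return
different addresses for the same argument, and only the result of cast_after
may safely be used as a C*.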

This can be easily fixed by moving the definition of next() after the
definition of C, so the compiler knows that C is derived from B when it
compiles the cast.

--------------------------- Example 2a (fixed) ---------------------------------

class A {
     int take_up_space;
     };

class C;    // this declaration is required so B::next() can be declared

class B {
     B* next_b;                            // this is a pointer to another B;
     public:
     C* next();                           // this function accesses next_b,
     };                                   // and casts it.

class C: public A, public B {};

inline C* B::next() {return ( (C*)next_b ); }    // definition moved after C

main() {
     C* cp;
     // ....
     cp=cp->next();                        // Now it works!
     //...
     }

--------------------------------------------------------------------------------

This whole scenario raises the question in my mind as to whether enough
attention was paid in the design of the language to casting pointers to
base types to pointers to derived types.  This seems like something that
many language users would want to do.

There would be a simple way to implement type safe casting
of pointers to classes.  The compiler could assign a unique type identifier
to each class type (perhaps the address of a static member).  Then
at the end of the virtual table, it would add a list of ordered pairs, each
consisting of a class type identifier and the corresponding offset that
must be added to the pointer to cast to that type.  This scheme
may not be the best way, but it is one way to accomplish this.

When a cast needs to be done at run time, the list would be searched to
find the needed offset.  If the type identifier is not found in the list,
then the cast is illegal and a run time error or exception would be signaled.

This would allow casting from pointers to base to pointers to either
non-virtual or virtual derived types.  It could also allow for casting
between unrelated types provided that the object pointed to was derived
from both types.
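
To make the scheme concrete, here is a rough hand-written sketch of it.  All
the names (CastEntry, tag, cast_table, checked_cast, checked_cast_demo) are
invented for illustration.  A virtual function stands in for the list the
compiler would append to the virtual table, the offsets are measured from
the B part of the object, and a failed cast returns a null pointer instead
of signaling an error.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One (type identifier, offset) pair, as in the scheme described above.
struct CastEntry {
    const void*    type_id;   // address of the target class's static tag
    std::ptrdiff_t offset;    // bytes to add to a B* to reach that type
};

struct A { int take_up_space; static const char tag; };

struct B {
    B* next_b;
    static const char tag;
    B() : next_b(nullptr) {}
    virtual ~B() {}
    // Stand-in for the list the compiler would append to the virtual table.
    virtual std::vector<CastEntry> cast_table() {
        std::vector<CastEntry> t;
        t.push_back(CastEntry{&B::tag, 0});
        return t;
    }
};

struct C : A, B {
    static const char tag;
    std::vector<CastEntry> cast_table() override {
        char* from = reinterpret_cast<char*>(static_cast<B*>(this));
        char* to_c = reinterpret_cast<char*>(this);
        char* to_a = reinterpret_cast<char*>(static_cast<A*>(this));
        std::vector<CastEntry> t;
        t.push_back(CastEntry{&B::tag, 0});
        t.push_back(CastEntry{&C::tag, to_c - from});
        t.push_back(CastEntry{&A::tag, to_a - from});
        return t;
    }
};

// The address of each static member serves as the unique type identifier.
const char A::tag = 0;
const char B::tag = 0;
const char C::tag = 0;

// Search the list reached through bp; a null result means the cast is illegal.
void* checked_cast(B* bp, const void* wanted) {
    if (bp == nullptr) return nullptr;
    std::vector<CastEntry> table = bp->cast_table();
    for (std::size_t i = 0; i < table.size(); ++i)
        if (table[i].type_id == wanted)
            return reinterpret_cast<char*>(bp) + table[i].offset;
    return nullptr;
}

// Hypothetical identifier for a class that the object is not derived from.
static const char unrelated_tag = 0;

// Returns true if the downcast, the cross cast and the illegal cast all
// behave as the scheme predicts.
bool checked_cast_demo() {
    C c;
    B* bp = &c;
    bool down  = checked_cast(bp, &C::tag) == static_cast<void*>(&c);
    bool cross = checked_cast(bp, &A::tag)
                 == static_cast<void*>(static_cast<A*>(&c));
    bool bad   = checked_cast(bp, &unrelated_tag) == nullptr;
    return down && cross && bad;
}
```

The second lookup is the cast between unrelated types mentioned above: A and
B know nothing of each other, yet the object pointed to is derived from both.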

There are a number of problems with this.  Because of inefficiencies, it could
make sense to have a compiler switch to turn off checking.  Often an unchecked
cast can be done far faster than a checked cast.  Also this scheme assumes
the existence of a virtual table when in fact there may not be one.  The
compiler has no way of knowing if a given class will require a virtual table
for casting.  The cost of adding a virtual table and virtual table pointer
to every class would make that an impractical solution.

Possibly the best compromise would be to use the virtual table if it is there.
If there is no virtual table, do the cast without checking if possible, and
issue a compile time warning.  If the cast can't be done without a virtual
table and there isn't one, issue a compiler error.  The programmer can
always add a member function "virtual void do_nothing() {}" to force the
creation of a virtual table if it is necessary.
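
For comparison, standard C++ later gained an operator, dynamic_cast, that
takes roughly this approach: the checked cast is available only when the
source class has at least one virtual function (otherwise the downcast is a
compile time error), and a failed pointer cast yields a null pointer.  A
small sketch follows; class D and the function name checked_downcast_demo
are made up for illustration.

```cpp
#include <cassert>

struct A { int take_up_space; };

struct B {
    virtual void do_nothing() {}    // forces a virtual table, as suggested above
};

struct C : A, B {};
struct D : B {};                    // hypothetical class unrelated to C

// Returns true if the checked casts succeed and fail where expected.
bool checked_downcast_demo() {
    C c;
    D d;
    B* in_c = &c;                   // points at the B part of a C
    B* in_d = &d;                   // points at the B part of a D

    C* ok    = dynamic_cast<C*>(in_c);   // legal: the object really is a C
    C* fail  = dynamic_cast<C*>(in_d);   // illegal: the result is null
    A* cross = dynamic_cast<A*>(in_c);   // legal: the object has an A part too

    return ok == &c && fail == nullptr && cross == static_cast<A*>(&c);
}
```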

If it is desired to simply change the type without changing the binary
representation, the following syntax could be used  "bp = (B*)(void*)ap;".
If type checking could be done, it seems within the spirit of the language
to allow implicit as well as explicit casting from pointers to base to 
pointers to derived types.

I'm sure there are other problems that I haven't thought of, but this area
seems worthy of thought and discussion.