Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!usc!apple!voder!tolerant!procase!roger
From: roger@procase.UUCP (Roger H. Scott)
Newsgroups: comp.lang.c++
Subject: Re: Chameleon objects (calling virtual functions from constructors)
Keywords: virtual functions, constructor functions.
Message-ID: <47479d37.12160@espol>
Date: 7 Dec 89 09:01:00 GMT
References: <11266@csli.Stanford.EDU>
Reply-To: roger@procase.UUCP (Roger H. Scott)
Organization: proCASE Corporation, Santa Clara, CA
Lines: 128

In article <11266@csli.Stanford.EDU> Neil%teleos.com@ai.sri.com writes:
>When a class has a constructor and virtual functions, cfront1.2 generates
>code within the constructor function to set up the virtual function pointers.
>If a class is derived from such a base class, and one of the virtual
>functions is called from the base class constructor, the wrong virtual
>function is called, since during the execution of the base class constructor,
>the virtual function pointer table is set up as for the base class,
>and is not changed to show the derived class virtual functions until
>the derived class constructor is entered.

This is one of the really nasty unsolved problems in C++.  As much as I hate
this behavior, I have to admit that it is really the only correct behavior
for virtual functions.  The function that is called as a result of a virtual
call is determined by the dynamic type of the object, and it seems pretty
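clear that the dynamic type of an object executing T::T() is T, regardless
of any subclassing: the derived-class parts have not been constructed yet,
so a derived override would be running on uninitialized members.  The
analogous thing holds true during destructors - during the execution of
T::~T() the dynamic type of an object is demoted to T.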
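
A minimal sketch of the behavior described above.  Shape, Circle, name() and
demo() are hypothetical names made up for illustration; nothing here comes
from cfront itself.  Each virtual call appends its result to a log so the
dispatch can be inspected afterwards.

```cpp
#include <string>

// Hypothetical base class: records which name() is reached from its
// constructor and destructor.
struct Shape {
    std::string *log;
    explicit Shape(std::string *l) : log(l) {
        *log += name() + " ";   // dynamic type is Shape here
    }
    virtual ~Shape() {
        *log += name();         // dynamic type is Shape again here
    }
    virtual std::string name() const { return "Shape"; }
};

// Hypothetical derived class overriding name().
struct Circle : Shape {
    explicit Circle(std::string *l) : Shape(l) {}
    std::string name() const { return "Circle"; }
};

// Builds a Circle, calls name() on the finished object, destroys it,
// and returns the accumulated log.
std::string demo() {
    std::string log;
    {
        Circle c(&log);
        log += c.name() + " ";  // fully constructed: dynamic type is Circle
    }
    return log;
}
```

Under these assumptions demo() returns "Shape Circle Shape": the override is
reached only between the end of construction and the start of destruction.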

Perhaps what is needed here is a "finalization" function that is automatically
invoked by the compiler immediately after normal construction of an object.
The programmer could declare this function to be virtual in the base class
and then redefine it in derived classes to "finalize" the object in the
appropriate way(s).  The syntax below is illustrative only, not a proposal:

    class Base {
    public:
	Base(); // do invariant Base stuff
	virtual !Base(); // [finalizer] do Base variant of finalization
	...
    };

    class Derived : public Base {
    public:
	Derived(); // do invariant Derived stuff
	!Derived(); // do Derived variant of finalization
	...
    };

    ...
    Base *p = new Derived; // p = (tmp = new Derived, tmp->!Base(), tmp)
    ...

I'm not at all thrilled with the prospect of Yet Another Language Extension,
so here's an approach that works in 2.0 C++ as-is:

    // constructors are not public so you won't "forget" to finalize ...
    // (Base's must be protected, not private, so Derived::Derived() can call it)
    class Base {
    protected:
	Base(); // do invariant Base stuff
	virtual Base *Finalize(); // do Base variant of finalization
    public:
	static Base *New() {return (new Base)->Finalize();}
	...
    };

    class Derived : public Base {
	Derived(); // do invariant Derived stuff
    protected:
	Base *Finalize(); // do Derived variant of finalization
    public:
	static Derived *New() {return (Derived *)(new Derived)->Finalize();}
	...
    };

    ...
    Base *p = Derived::New();
    ...
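
For anyone who wants to try the New()/Finalize() idea, here is one way it
might be filled in.  This is a sketch under assumptions, not the definitive
form: the state member, State() and finalizedState() are hypothetical
additions for observing the result, and Base's constructor is protected here
so that Derived's constructor can reach it.

```cpp
#include <string>

class Base {
protected:
    std::string state;                       // hypothetical, for observation
    Base() : state("constructed") {}         // invariant Base stuff
    virtual Base *Finalize() {               // Base variant of finalization
        state += "+Base::Finalize";
        return this;
    }
public:
    virtual ~Base() {}
    static Base *New() { return (new Base)->Finalize(); }
    const std::string &State() const { return state; }
};

class Derived : public Base {
protected:
    Derived() {}                             // invariant Derived stuff
    Base *Finalize() {                       // Derived variant of finalization
        state += "+Derived::Finalize";
        return this;
    }
public:
    static Derived *New() {
        // Construction has finished by the time Finalize() runs, so the
        // virtual call reaches Derived::Finalize.
        return static_cast<Derived *>((new Derived)->Finalize());
    }
};

// Creates a Derived through its factory and reports what finalization did.
std::string finalizedState() {
    Base *p = Derived::New();
    std::string s = p->State();
    delete p;
    return s;
}
```

With these definitions finalizedState() returns
"constructed+Derived::Finalize", and Base::New() would yield
"constructed+Base::Finalize".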

[Digression #1]
By the way, an advantage of using static member functions for public
construction rather than C++ constructors is that static member functions are
[more nearly] first-class entities in C++ than constructors - you can take
their address and treat (pointers to) them as variables.  You cannot take
the address of T::T().
    
    typedef Base *BaseMaker();

    // Create a Base (or a subclass of Base) and use it ...
    void makeABaseAndDoSomethingWithIt(BaseMaker *makebase) {
	...
	Base *b = (*makebase)();
	...
    }

    void foo() {
	makeABaseAndDoSomethingWithIt(&Base::New);
	// The cast in the following line should not be necessary -
	// see (***) note following.
	makeABaseAndDoSomethingWithIt((BaseMaker *)&Derived::New);
    }
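
A self-contained variant of the fragment above, for experimenting.  It is
only a sketch: Kind(), makeAndDescribe() and NewDerivedAsBase() are
hypothetical names, and the wrapper function is a swapped-in way to avoid the
cast by giving the Derived factory the exact BaseMaker signature.

```cpp
#include <string>

class Base {
protected:
    Base() {}
public:
    virtual ~Base() {}
    virtual const char *Kind() const { return "Base"; }   // hypothetical
    static Base *New() { return new Base; }
};

class Derived : public Base {
protected:
    Derived() {}
public:
    const char *Kind() const { return "Derived"; }
    static Derived *New() { return new Derived; }
};

typedef Base *BaseMaker();

// Hypothetical wrapper: has exactly the BaseMaker signature, so passing
// its address needs no cast.
Base *NewDerivedAsBase() { return Derived::New(); }

// Creates an object through the supplied factory pointer and reports
// which class it turned out to be.
std::string makeAndDescribe(BaseMaker *makebase) {
    Base *b = (*makebase)();
    std::string kind = b->Kind();
    delete b;
    return kind;
}
```

Calling makeAndDescribe(&Base::New) gives "Base" and
makeAndDescribe(&NewDerivedAsBase) gives "Derived"; the factory is passed
around like any other function pointer.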

[Digression #2 - for Language Lawyers only]
(***) Note:
"Derived *(*)()" [pointer to function returning pointer to Derived]
should be type compatible with "Base *(*)()" [pointer to function
returning pointer to Base].  These types were compatible in 1.2.
AT&T maintains that these are incompatible for the same reasons that
"Derived **" is incompatible with "Base **", but the two cases are
*not* analogous - there is no danger of "unsafe" things happening
in the former case.  It is not as if you could assign to the
"object" pointed to by a pointer-to-function and thus alter what
will be returned when the function is called through that pointer.

Genuine unsafe example:
    Base *IPointToABase = new Base;
    void f(Base **pp) {
	*pp = IPointToABase; // BECAUSE YOU *CAN* DO THIS ...
    }
    Derived *IPointToADerived = new Derived;
    void g() {
	Derived **mypp = &IPointToADerived;
	f(mypp); // ... YOU *CAN'T* DO THIS, FOR FEAR OF ...
	Derived *dp = *mypp; // ... GETTING A "Base *" HERE!
    }

Bogus pseudo-analogy:
    Base *IReturnABase() {return new Base;}
    void f(Base *(*pf)()) {
	 *pf = IReturnABase; // BECAUSE YOU *CAN'T* DO THIS ...
	  ...
    }
    Derived *IReturnADerived() {return new Derived;}
    void g() {
	Derived *(*mypf)() = &IReturnADerived;
	f(mypf); // ... YOU *SHOULD* BE ABLE TO DO THIS, SECURE
		 // IN THE KNOWLEDGE THAT ...
	Derived *dp = (*mypf)(); // ... THIS CAN'T YIELD A "Base *"!
    }