From: shap@delrey.sgi.com (Jonathan Shapiro)
Newsgroups: comp.lang.c++
Subject: Re: Variable sized objects
Message-ID: <1881@odin.SGI.COM>
Date: 9 Dec 89 21:59:16 GMT
References: <1262@amethyst.math.arizona.edu>
Sender: news@odin.SGI.COM
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 98

There is a lot of misunderstanding about the purpose of operator
new(). Let's see if I can help straighten some of it out.

There are three reasons for wanting to allocate your own storage for
objects, and they are distinct.

1. Multiple Arenas

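The new version of operator new(), which can be overloaded to accept
extra arguments, is useful for an environment with multiple heap
arenas: the caller passes the arena to allocate from as an argument
to new.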
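
As a rough sketch of what a multiple-arena setup might look like in
present-day C++: the Arena class, the two static buffers, and the
makeInFast/makeInSlow helpers below are all hypothetical names invented
for illustration, and the bump allocator is deliberately simplistic
(it never frees anything).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity bump arena; not part of the original post.
class Arena {
  public:
    Arena(char *buffer, std::size_t capacity)
        : base_(buffer), capacity_(capacity), used_(0) {}

    void *allocate(std::size_t size) {
        // Round each request up so the next block stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t rounded = (size + align - 1) / align * align;
        assert(used_ + rounded <= capacity_ && "arena exhausted");
        void *p = base_ + used_;
        used_ += rounded;
        return p;
    }

  private:
    char *base_;
    std::size_t capacity_;
    std::size_t used_;
};

// Overload of operator new that takes the arena as an extra argument.
void *operator new(std::size_t size, Arena &arena) {
    return arena.allocate(size);
}

// Matching operator delete; called only if a constructor throws.
void operator delete(void *, Arena &) noexcept {}

struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Two independent arenas; the caller picks one per allocation.
alignas(std::max_align_t) char fastStorage[256];
alignas(std::max_align_t) char slowStorage[256];
Arena fastArena(fastStorage, sizeof fastStorage);
Arena slowArena(slowStorage, sizeof slowStorage);

Point *makeInFast(int x, int y) { return new (fastArena) Point(x, y); }
Point *makeInSlow(int x, int y) { return new (slowArena) Point(x, y); }
```

Objects built this way must not be passed to the ordinary delete; in
this sketch the storage simply lives as long as the arena does.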

2. Collection Initialization Control

Operator new is also useful in managing collections, where you
allocate some number of contiguous fixed-size objects, reallocate
later to grow the collection, and want to ensure that only the *new*
objects are constructed.  Consider the following naive example:

   myClass *cp = new myClass[10];  // allocates AND CONSTRUCTS 10 of them
   ... do some computation ...
   ... decide to realloc ...

   myClass *cp2 = new myClass[newSize];
   for(int i = 0; i < 10; i++)
      cp2[i] = cp[i];

   delete [] cp;   // array form: runs the destructor of every old element
   cp = cp2;

There are three problems with this code.  First, many more
constructions are done than are necessary or desirable: every element
of cp2 is constructed, and the first 10 are then immediately
overwritten.  Second, it depends on the user getting operator= right,
which they most likely didn't.  Third, it calls destructors on the old
array.  If myClass does reference counting, lots of things will break.

Consider the following alternative:

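    #include <new.h>
    #include <string.h>
    myClass *cp = new myClass[10];
    ... decide to realloc ...
    {
        // raw storage only; no constructors run here
        myClass *cp2 = (myClass *) new char[sizeof(myClass) * newSize];
        // bitwise move of the 10 existing objects
        (void) memcpy(cp2, cp, sizeof(myClass) * 10);
        // construct only the new ones, in place
        for (int i = 10; i < newSize; i++)
            (void) new(&cp2[i]) myClass;
        cp = cp2;
    }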

No destructors are called, and only the new items are constructed.
This is a case that is handled well by the new (placement) variant of
operator new().
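
For readers who want to try the idea with a current compiler, here is a
minimal sketch.  Item is a hypothetical stand-in for myClass, and grow()
is a helper name made up for this example.  The sketch assumes the
element type is trivially copyable, since that is the only case in which
standard C++ guarantees that a memcpy relocation is well defined; it
also assumes the storage always comes from ::operator new, so that it
can be released without running destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical element type standing in for myClass.
struct Item {
    static int constructed;  // counts default constructions
    int value;
    Item() : value(-1) { ++constructed; }
};
int Item::constructed = 0;

// Bitwise relocation is only guaranteed for trivially copyable types.
static_assert(std::is_trivially_copyable<Item>::value,
              "memcpy relocation assumes a trivially copyable type");

// Grow storage from oldCount to newCount elements, constructing only
// the elements that did not exist before.  Passing (nullptr, 0, n)
// creates a fresh block of n constructed elements.
Item *grow(Item *old, std::size_t oldCount, std::size_t newCount) {
    assert(newCount >= oldCount);
    void *raw = ::operator new(sizeof(Item) * newCount);  // no constructors
    Item *fresh = static_cast<Item *>(raw);
    if (oldCount > 0)
        std::memcpy(raw, old, sizeof(Item) * oldCount);   // move old objects
    for (std::size_t i = oldCount; i < newCount; ++i)
        new (fresh + i) Item;                             // placement new
    ::operator delete(old);  // release old bytes; no destructors run
    return fresh;
}
```

Growing a block of 10 items to 25 this way runs the constructor 15
times rather than 25, and never touches operator= or a destructor.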

3. Abuse of the Construction Mechanism

This is the case of allocating truly variable-sized objects.  It is
not addressed by operator new(), nor should it be.  The existing
constructor technology does *not* support this concept.  The closest I
can come to something that works properly is as follows:

   class VarObject {
     public:
       VarObject(int bytesize);
   } ;

   VarObject *
   buildVarObject(int bytesize)
   {
       void *p = new char[bytesize];
       return new(p) VarObject(bytesize);
   }

Since one can always allocate the variable-sized component on the
heap, the objective is simply to eliminate the extra dereference.
There are several good reasons not to do this.

First, this object cannot be built on a stack, because its length
isn't known to the compiler.  The semantic implications of a heap-only
object aren't clear.

Second, the implementation of such objects tends to look convoluted to
later readers.

Finally, the savings obtained tend to be *very* small.  If you only
plan to access a single element in the variable-sized portion, the
extra load probably doesn't matter, and if you plan to iterate through
it, you probably want to load the base address of the variable portion
anyway for efficiency.

The real issue is the (largely specious) argument about the cost of
malloc().  If you are truly concerned about the cost of doing the
malloc within new, arrange for new to be overloaded so that you can
allocate and manipulate your own arena.  This is a much smarter
strategy than trying to abuse the mechanisms that are present, and
puts the complexity in a place where the rationale for the complexity
is clear.
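
A minimal sketch of that strategy, assuming a class whose instances
are all the same size: Node, the 64-slot pool, and liveNodeCount() are
hypothetical names chosen for this example, and a real arena would
need a policy for exhaustion and for derived classes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that manages its own storage.
class Node {
  public:
    explicit Node(int v) : value(v), next(nullptr) {}

    // Class-specific allocation: every "new Node" comes from the pool.
    static void *operator new(std::size_t size);
    static void operator delete(void *p) noexcept;

    int value;
    Node *next;
};

namespace {
// A slot either holds a live Node or links to the next free slot.
union Slot {
    Slot *nextFree;
    alignas(Node) unsigned char bytes[sizeof(Node)];
};

const std::size_t kPoolSlots = 64;
Slot pool[kPoolSlots];
Slot *freeList = nullptr;    // slots handed back by operator delete
std::size_t nextUnused = 0;  // first slot never handed out
std::size_t liveNodes = 0;
}  // namespace

void *Node::operator new(std::size_t size) {
    assert(size == sizeof(Node));  // derived classes are not handled here
    Slot *slot;
    if (freeList) {
        slot = freeList;           // reuse the most recently freed slot
        freeList = slot->nextFree;
    } else {
        if (nextUnused == kPoolSlots) throw std::bad_alloc();
        slot = &pool[nextUnused++];
    }
    ++liveNodes;
    return slot;
}

void Node::operator delete(void *p) noexcept {
    if (!p) return;
    Slot *slot = static_cast<Slot *>(p);
    slot->nextFree = freeList;     // push the slot onto the free list
    freeList = slot;
    --liveNodes;
}

std::size_t liveNodeCount() { return liveNodes; }
```

Client code keeps writing plain "new Node(1)" and "delete n"; the
allocation policy lives in one place, inside the class.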

Jonathan Shapiro
Silicon Graphics, Inc.