Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!think.com!hsdndev!husc6!ngo
From: ngo@tammy.harvard.edu (Tom Ngo)
Newsgroups: comp.std.c++
Subject: Final draft of ~const proposal (2.2)
Message-ID:
Date: 26 Mar 91 03:41:18 GMT
Sender: news@husc6.harvard.edu
Distribution: comp
Organization: Harvard Chemistry Department
Lines: 761

As you may know, in mid-December I posted a proposal to extend C++ by adding the ~const type-specifier. I have received a lot of good advice--thank you. Here is the final draft, which will be mailed to X3J16 in a few days.

I would appreciate hearing any criticisms. If you support this proposal, your going over it with a fine-toothed comb will make it more likely to be accepted. If you do not support it, I would be very interested to know why! I would like to save the committee as much time as possible by working out ramifications and addressing potential objections.

This version is quite different from any previous version, as I have rewritten it from scratch, changing a few basic positions and working in a number of the generalizations that were proposed on the net.

It is in LaTeX, and for me it comes out to 10 pages. If you feel inclined to comment, I would appreciate hearing from you in the next couple of days, as I hope to catch the next X3J16 extensions group mailing.

Thanks in advance,

--Tom Ngo
ngo@tammy.harvard.edu

===========================================================================

% Proposal for ~const extension to C++
% $Revision: 2.2 $ $Date: 91/03/25 23:08:48 $

\documentstyle[fullpage,11pt]{article}

\newcommand{\CC}{C$++$}
\newcommand{\this}{\protect\verb|this|}
\newcommand{\const}{\protect\verb|const|}
\newcommand{\volatile}{\protect\verb|volatile|}
\newcommand{\virtual}{\protect\verb|virtual|}
\newcommand{\inline}{\protect\verb|inline|}
\newcommand{\static}{\protect\verb|static|}
\newcommand{\nconst}{\protect\verb|\~\/const|}
\newcommand{\nvolatile}{\protect\verb|\~\/volatile|}
\newcommand{\nvirtual}{\protect\verb|\~\/virtual|}
\newcommand{\ninline}{\protect\verb|\~\/inline|}
\newcommand{\nregister}{\protect\verb|\~\/register|}
\newcommand{\typespec}{{\it type-specifier}}
\newcommand{\cvqual}{{\it cv-qualifier}}
\newcommand{\cvquals}{{\it cv-qualifiers}}
\newcommand{\pro}{{\tt [+]}}
\newcommand{\con}{{\tt [-]}}
\newcommand{\public}{\protect\verb|public|}
\newcommand{\private}{\protect\verb|private|}
\newcommand{\ie}{{\it i.e.\/}}
\newcommand{\eg}{{\it e.g.\/}}

\begin{document}

\title{Proposed extension to \CC: \nconst}
\author{J. Thomas Ngo\\$<$ngo@tammy.harvard.edu$>$}
\date{March 23, 1991 (DRAFT)}
\maketitle

\begin{abstract}
This document is presented to the ANSI committee for the standardization of \CC\ (X3J16), and is intended for consideration by the extensions group. I propose the addition of a new \typespec, \nconst. Only members of a class may be specified as \nconst. A data member that is specified as \nconst\ can be modified even if the object of which it is a part is specified as \const. A member function that is specified as \nconst\ can modify any data member in the object for which it is called, even if that object is specified as \const.
The primary reason for introducing the \nconst\ specifier would be to provide a way to specify how the bitwise representation of an abstractly \const\ object might change, entirely via specifiers in the class declaration.

In the first section I define the \nconst\ specifier in the context of the {\it Annotated \CC\ Reference Manual} (ARM) and explore its ramifications. In the second and third sections I describe additional suggestions that should be considered ancillary to the \nconst\ proposal. These potential enhancements, which emerged from discussions of \nconst\ on the Usenet group {\it comp.std.c++}, include generalizations of \nconst-like syntax, namely \nvolatile\ and \nvirtual.
\end{abstract}

\section{Core proposal}

\subsection{Motivation}

One often needs to modify the bit representation of an object without changing what the object represents. This is usually done for efficiency, practicality, or improved encapsulation. The purpose of the language construct proposed here is to improve the ability of \CC\ to accommodate such situations in an elegant and expressive manner.

The distinction between {\em bitwise} and {\em abstract} constness is essential. I draw from a valuable personal communication from Brian Kennedy $<$bmk@csc.ti.com$>$:

A \const\ object is {\em bitwise const} if the program will not modify its bit representation. It is {\em abstractly const} if the program will not modify what the object represents, but might modify its bit representation. The compiler will, for the benefit of the programmer, enforce the abstract constness of such an object, as defined by the programmer through the constness of its member functions.

(Brian informs me that he is drafting an X3J16 proposal to clarify the effects of casts from \const. An important goal in his proposal will be to define the circumstances under which an object can be bitwise const. In this document I have avoided that question, except where it is affected by the proposed construct.)
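
To make the distinction concrete, the following is a minimal sketch in the current language, not part of the proposal. The class \verb|Squares| and the function \verb|sum_via_const_ref| are hypothetical names chosen for illustration. The member function \verb|sum()| is \const\ and never changes what the object represents, yet it writes a cached result into the object through a cast from \const, so the object is abstractly const without being bitwise const.

\begin{verbatim}
#include <cassert>

// Hypothetical class, for illustration only: it represents the fixed
// number n, and caches an expensive-to-compute function of n.
class Squares {
public:
    explicit Squares(int n) : n_(n), cached_(false), sum_(0) {}

    // Abstractly const: the value represented (n) never changes.
    // Not bitwise const: the cache fields are written on first use.
    long sum() const
    {
        if (!cached_) {
            Squares *const self = (Squares *const) this; // cast from const
            long s = 0;
            for (int i = 1; i <= n_; ++i)
                s += (long) i * i;
            self->sum_ = s;
            self->cached_ = true;
        }
        return sum_;
    }

    bool cached() const { return cached_; }

private:
    int  n_;
    bool cached_;
    long sum_;
};

// The caller sees only a const reference, yet the bits of s may change.
long sum_via_const_ref(const Squares &s) { return s.sum(); }
\end{verbatim}

The sketch deliberately reaches the object through a \const\ reference to an object that was not itself defined \const, since the effect of writing through such a cast to an object that was defined \const\ is exactly the kind of question that this document leaves aside.
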
The currently accepted way to alter the bit representation of an object that is abstractly \const\ is to cast away the \const\ attribute as required. It has been pointed out that such casts can be useful in that they do call attention to code that can change the bitwise representation of a \const\ object. Furthermore, I am told that efforts are underway to define clearly the effects of casting away \const. However, the value of permitting casts from \const\ has been debated, and it is clear that the mechanism leaves much to be desired in the way of expressive capability.

This proposal outlines an alternative to casting away \const, using a new \typespec\ called \nconst. Since this new specifier can coexist with casts from \const, it is not necessarily suggested that such casts be disallowed. Rather, it is hoped that the \nconst\ mechanism will provide an alternative, more elegant way to specify in what ways the bitwise representation of a \const\ object can be altered.

\subsection{Definition in terms of the ARM}

\begin{enumerate}

\item Following ARM 7.1.6 [Type Specifiers], a \typespec\ can be:
\begin{verbatim}
    simple-type-name
    ...
    const
    volatile
    ~const
\end{verbatim}
The \nconst\ specifier may appear in the \typespec\ of a nonstatic, nonconst, nonpointer member, whether it is data or a function.

\item Following the same section, each element of a \const\ array is \const. Each nonfunction, nonstatic, nonpointer member of a \const\ class object is \const\ unless it is specified as \nconst.

\item Following ARM 8 [Declarators], a \cvqual\ can be:
\begin{verbatim}
    const
    volatile
    ~const
\end{verbatim}
The \nconst\ specifier may appear in the \cvqual\ of a pointer that is a nonstatic member. And as usual, ``The \cvquals\ apply to the pointer and not to the object pointed to.'' (ARM 8.2.1)

\item Following ARM 9.3.1 [The \this\ Pointer]: A \const\ member function may be called for \const\ and non-\const\ objects.
The type of \this\ in such a member function of a class \verb|X| is \verb|const X *const|. A non-\const\ member function that is not specified as \nconst\ may be called only for a non-\const\ object. The type of \this\ in such a member function is \verb|X *const|. A member function that is specified as \nconst\ may be called for both \const\ and non-\const\ objects. The type of \this\ in either case is \verb|X *const|.

\item The \nconst\ specifier does not create any new distinct types. It merely removes the \const\ attribute from \this\ under specific circumstances. Therefore the introduction of this specifier should require no new type matching rules, aside from the rules concerning \this\ that have been outlined above.

\end{enumerate}

\subsection{Examples}

In my own code I have found a few examples in which it was necessary to cast away \const. In each case, the \nconst\ specifier would have been an appropriate alternative.

\begin{description}

\item[Self-resizing array.] The abstract data type that this class represents is an array of infinite extent. The elements of the array are held in contiguous storage. Enough memory is allocated to store only the initialized elements. When an attempt is made to access an element that would reside beyond the memory already allocated, the class calls a private method \verb|grow()|, which allocates new memory and copies array contents as necessary. This application calls for \verb|grow()| to be \nconst. That is, the old code
\begin{verbatim}
    void DynArray::grow() const
    {
        DynArray *const t = (DynArray *const) this;
        // refer to buffer explicitly through t
    }
\end{verbatim}
would be replaced by
\begin{verbatim}
    void DynArray::grow() ~const
    {
        // refer to buffer implicitly through this
    }
\end{verbatim}

\item[Internal buffer.] I have a Trie class which represents a collection of strings. When I want to retrieve any one of those strings, I need to write the result to a character buffer.
I want the user to be able to retrieve a string from a Trie that is declared \const, but do not want him or her to have to worry about allocating the memory for the character buffer. Hence it is necessary for the buffer to be allocated and maintained by the Trie itself. Here a \nconst\ data member is appropriate to represent the character buffer. Thus:
\begin{verbatim}
    class Trie {
    private:
        // [Data members for implementing the Trie itself]
        // ...
        ~const char *~const buf;    // Could have made buf, bufsize
        ~const bufsize;             // static, defeating example
    public:
        const char* operator () () const;   // retrieves a string
    };

    const char* Trie::operator () () const
    {
        // If necessary, reallocate buf and change bufsize.
        // Write the new string to buf.
        return buf;
    }
\end{verbatim}

\item[Secondary representation.] This is a trumped-up example, because my real example is too application-specific. Say you have a container class A. Based on your intended usage, you have decided that it is best to store the elements in unsorted form. But occasionally (very rarely) you will want to know the i'th, j'th and k'th elements of some A that has been declared \const. So you have decided to do some kind of sort, but only when it is needed. It is appropriate to store the order information in \nconst\ data members.

\end{description}

\subsection{Debate}

Is it really worth adding this new feature to the language? Below, I have labeled each idea with \pro\ or \con, depending on whether I think it supports or detracts from the proposal.

\begin{itemize}

\item[\pro] Adding a \nconst\ specifier would introduce no new keywords, and would not break existing code.

\item[\pro] The semantics of \nconst\ are simple: \nconst\ breaks the propagation of the \const\ attribute from an aggregate or object to its components.

\item[\con] No program can be written using the \nconst\ specifier that cannot already be written using casts from \const\ instead.

\item[\con] In the hands of the wrong programmer, the \nconst\ specifier could lead to sloppiness. \pro\ On the other hand, all of \CC\ relies on the neatness of the programmer. For instance, a careless programmer could easily declare all members of a class public and never use \const, except where constrained to do so by existing libraries. This is no reason not to implement \const.

\item[\pro] The \nconst\ specifier decorates the class declaration with precise information about which data and function members might not preserve the bitwise representation of a \const\ object. To the compiler this might suggest opportunities for optimization. To the human reader it can express concisely what is happening in the mind of the class programmer with regard to constructs such as internal caches and buffers.

\end{itemize}

\subsection{Impact on casts from \const}

Casts which remove the \const\ attribute have enjoyed considerable debate between purists and pragmatists; their existence impacts issues ranging from code readability to storage implementation. In the end it is generally agreed that they are here to stay in one form or another, if only because of the existing body of code that would break were they to be disfavored.

I choose not to make a specific recommendation about the fate of casts from \const\ because decisions about their continued existence and the introduction of \nconst\ need not be tightly linked. Although the two constructs perform overlapping functions, they can easily coexist with each other.

Nevertheless, the two issues are related, and during discussions on {\it comp.lang.c++} a number of observations were made.

\begin{itemize}

\item As presently worded, the ARM (5.4) states that the effects of casting away the \const\ attribute are implementation dependent. However, according to Brian Kennedy $<$bmk@csc.ti.com$>$, efforts are underway for the effects of such casts to be well-defined and implementation independent.
Apparently Brian is drafting an X3J16 proposal to clarify the wording in the ARM. Therefore casts from \const\ should not be replaced on the grounds of being implementation dependent.

\item Having said that, the knowledge that a program is free of casts from \const\ could expand a compiler's latitude to perform optimizations in an implementation independent manner (see next subsection). In particular, the compiler could propagate constants and be more free to use flavors of storage such as read-only or write-once memory. (Both of these benefits accrue from the fact that the \nconst\ specifier is available in the class declaration, not just in the implementation module.) Thus, compilers could provide optional compiler flags that permit meaningful optimizations, and under these options the effects of casting away \const\ would be implementation dependent.

\item It is not clear whether there exist applications in which the cast from \const\ mechanism cannot easily be replaced by the \nconst\ mechanism. Whether or not such applications exist obviously affects the relationship between \nconst\ and casts from \const.

\item Jim Adcock $<$jimad@microsoft.UUCP$>$ made a distinction between ``enabling'' and ``supporting'' a feature. In my mind, the distinction is mostly aesthetic but it also has some practical aspects. A language construct {\em supports} a feature if one can use that feature in a way that is elegant, easily maintained, and simple to compile and optimize. Otherwise, it merely {\em enables} the feature. (Here is an extreme example: a colleague claims that C enables object-oriented programming; after all, with an appropriate set of coding conventions and good discipline one can manually implement virtual function tables and other object-oriented constructs.) I agree with Jim in his statement that cast-from-\const\ merely enables the ability to alter the bitwise representation of a \const\ object, whereas the \nconst\ mechanism would support it.

\end{itemize}

\subsection{Impact on compiler implementation}

In this subsection I explore in greater detail compiler implementation issues related to \nconst, namely optimization and choice of storage. As I am unfamiliar with compiler design, I have kept my remarks general.

An object that is \const\ might be fully, partially, or not at all bitwise const. The bitwise const parts of an object may be subject to constant propagation, even across calls to \const\ methods. If their bit representations can be determined at compile time, it may be possible to place them in read-only memory. Otherwise, it may still be possible to place them in write-once or {\em discardable} memory---\ie\ virtual memory that is paged to disk only the first time after initialization.

To my knowledge, calls to functions---even ones whose arguments are all bitwise const---cannot be subject to common subexpression elimination because \CC\ has no facility for declaring that a function has no side effects.\footnote{It is noted that GNU \CC\ contains or has contained such a facility, but that it was incompatible with the language defined by the ARM.}

In a program that is known to be free of casts from \const, the extent to which a \const\ object is bitwise const can be determined by inspection of the class declaration. A \const\ object of a class that contains one or more \nconst\ member functions is not at all bitwise const, since any of its \const\ member functions could contain a call to a \nconst\ method. If a class contains no \nconst\ member function, each data member that is not \nconst\ is bitwise const, unless that data member itself cannot be bitwise const.

\subsection{Side note: constness can be subverted during a constructor}

Reid Ellis $<$rae@utcs.toronto.edu$>$ brought up an issue that might need to be addressed more explicitly in the ARM.
Consider:
\begin{verbatim}
    class vulnerable {
    public:
        vulnerable();
        int nonconst();
        int subverter() const;
    private:
        vulnerable& vref;
    };

    vulnerable::vulnerable() : vref(*this) {}

    const vulnerable v;
    v.subverter();
\end{verbatim}
If \verb|vulnerable::subverter()| needs to alter a data member of \verb|v|, all it needs to do is refer to \verb|v| through \verb|vref|, \eg\ \verb|vref.nonconst()|. This must be a violation of ARM 8.4.3 [References], which states: ``A reference to a plain \verb|T| can be initialized only with a plain \verb|T|.''

The problem is that while a \const\ object is being constructed, it is considered non-\const\ so that its data members can be initialized and manipulated. This provides a window of opportunity to initialize a non-\const\ pointer or reference that is durable, \ie\ will still be available after the constructor is finished.

Perhaps the ARM ought to specify explicit rules regarding the initialization of non-\const\ pointers and references within a constructor. It is not enough to say that \verb|this| cannot be used to initialize such entities; at least one must also prohibit such initializations from components of \this.

\section{Other possible uses of \nconst\ syntax (ancillary)}

\subsection{Partial cast}

The following item was suggested to me, but it causes problems. Unless these problems can be resolved, I would not suggest that it be included in the language.

I am told that a possible extension to \CC\ is to have run-time type information, through a facility similar to \verb|typeof| in gcc/g++. The next few paragraphs use the syntax accepted by those compilers. Briefly, the \verb|typeof| facility permits referring to the type of an expression, and can be used any place that a type name could normally be used.
Here is a simple example:
\begin{verbatim}
    int x;
    const typeof(x) y;      // y is a const int
\end{verbatim}
The suggested extension is to permit the use of \nconst\ to remove the \const\ attribute from an unknown type, as when \verb|x| happens to be a macro parameter:
\begin{verbatim}
    #define nonconst(a) ~const typeof(a)
    nonconst(y) z;          // z is a non-const int
\end{verbatim}

The semantics of \nconst\ in this context are very different from the semantics in the core part of this proposal. Here, \nconst\ removes the \const\ attribute from the type that it modifies. In the main part of this proposal, \nconst\ breaks the propagation of the \const\ attribute from an enclosing structure. This difference in semantics leads to ambiguity if (referring to the example above) \verb|z| happens to be a member of some class, {\it e.g.}
\begin{verbatim}
    class Bar {
        ~const typeof(y) z;
    };
\end{verbatim}
Do we mean for \verb|z| to have the type of \verb|x|, except with the \const\ attribute removed? Or do we mean for \verb|z| to be modifiable even if the Bar object of which it is a part happens to be \const?

\section{Similar specifiers: \nvolatile, \nvirtual\ (ancillary)}

During Usenet discussions of \nconst, four additional specifiers were suggested:

\begin{description}
\item[\nvolatile] Do allow optimization on this object, even if the enclosing object is declared \volatile.
\item[\nvirtual] This method should not override any base class method, even if a virtual method with the same name and type signature exists.
\item[\nregister] Do not enregister this variable, even when optimizing.
\item[\ninline] Do make this method a real function call.
\end{description}

Of these four specifiers, all except \nregister\ were deemed reasonable to propose. Good arguments in favor of \nregister\ were not found.

\subsection{\nvolatile}

It was agreed that \nvolatile\ would provide little in the way of optimizability and nothing in expressiveness.
However, it was pointed out that since the \const\ and \volatile\ specifiers are generally handled together in a compiler, it would be easier than not to define \nvolatile\ in a manner symmetrical to \nconst.

Hence, following ARM 7.1.6 [Type Specifiers], a \typespec\ can be:
\begin{verbatim}
    simple-type-name
    ...
    const
    volatile
    ~const
    ~volatile
\end{verbatim}
\ldots. Each element of a \volatile\ array is \volatile. Each nonfunction, nonstatic member of a \volatile\ class object is \volatile\ unless it is declared \nvolatile.

And following ARM 8 [Declarators], a \cvqual\ can be:
\begin{verbatim}
    const
    volatile
    ~const
    ~volatile
\end{verbatim}

\subsection{\ninline}

Inline functions are inconvenient to debug using a source-level debugger. An \inline\ function must be defined in the header file of a class so that its body will be available to the compiler in every module in which it is invoked. To use a debugger with the function, one must remove the \inline\ specifier. Since this causes the function to have external linkage, one must move the function definition into the implementation module of the class so that it will be linked only once.

This procedure would be greatly simplified were it possible to leave the function definition in the header file. For nonmember functions it is sufficient to change the \inline\ specifier to \static, thus giving the function internal linkage. (According to ARM 7.1.1, ``For a nonmember function an \inline\ specifier is equivalent to a \static\ specifier for linkage purposes.'') However, for member functions, the \static\ keyword additionally removes the \verb|this| pointer from the parameter list, radically changing its meaning.

Thus, it is proposed that ARM 7.1.2 [Function Specifiers] be modified so that a {\it fct-specifier} can be:
\begin{verbatim}
    inline
    ~inline
    virtual
    ~virtual    (see below)
\end{verbatim}
A member or nonmember function with the \ninline\ specifier has default internal linkage (\S3.3).
It is guaranteed that the compiler will not inline calls to such a function.

\subsection{\nvirtual}

The purpose of \nvirtual\ would be to permit the compiler to diagnose a common {\it faux pas}. The programmer of a base class and the programmer of a derived class can unwittingly write methods with identical names and types. If the base class method is virtual, then the derived class method is placed, without warning, in the virtual function table. To make matters worse, if the base and derived class methods are declared \public\ and \private\ respectively, then the derived class method is unexpectedly \public\ when invoked through a call to the base class method. This follows from ARM 11.6 [Access to Virtual Functions].

The \nvirtual\ specifier would have the following definition. Adding to ARM 7.1.2 [Function Specifiers]:

\begin{quote}
Some specifiers can be used only in function declarations. A {\it fct-specifier} can be:
\begin{verbatim}
    inline
    ~inline
    virtual
    ~virtual
\end{verbatim}
The \virtual\ and \nvirtual\ specifiers may be used only in declarations of nonstatic member functions within a class declaration; see \S10.2.
\end{quote}

And rewording the first paragraph of ARM 10.2 [Virtual Functions],

\begin{quote}
A function \verb|vf| that is a member of a class \verb|base| is said to be {\em overridden} by a function \verb|vf| that is a member of a class \verb|derived| that is derived from \verb|base|, if the following conditions are met:
\begin{enumerate}
\item The types (\S8.2.5) of \verb|base::vf| and \verb|derived::vf| are identical.
\item The function \verb|base::vf| is declared \virtual.
\item The function \verb|derived::vf| is not declared \nvirtual.
\end{enumerate}
If all of these conditions are met, then a call of \verb|vf| for an object of class \verb|derived| invokes \verb|derived::vf| (even if the access is through a pointer or reference to \verb|base|).

It is conceivable that a method could be declared \nvirtual\ \virtual.
The two declarations do not clash; the \nvirtual\ specifier states that the method is not to be placed in the vtable of any base class, while the \virtual\ specifier causes a new vtable entry to be created.

In cases of multiple inheritance, ambiguities can arise. For example:
\begin{verbatim}
    class A { public: virtual void foo(); };
    class B : public A { public: ~virtual void foo(); };
    class C : public A, public B { public: void foo(); };
\end{verbatim}
Does \verb|C::foo| override \verb|A::foo| or not? Applying the ambiguity rules described in ARM 10.1.1, \verb|C::foo| is not virtual since the nonvirtual function \verb|B::foo| dominates \verb|A::foo|.

In general, for a given function \verb|derived::foo| the directed acyclic graph of base classes is searched for methods named \verb|foo| that have the same type as \verb|derived::foo|. If none is found, then \verb|derived::foo| is not virtual. If many are found and none dominates (\S10.1.1) all of the others, there is an ambiguity and the compiler should report an error. Otherwise, it must be that exactly one is found, or many are found but one dominates all of the others. If this function is \virtual\ then \verb|derived::foo| overrides it; otherwise \verb|derived::foo| is not virtual.
\end{quote}

It might be useful also to provide a way to make an entire class \nvirtual. This would be tantamount to specifying that all of its member functions are \nvirtual, \ie\ that no base class virtual function should be overridden. By contrast with ARM 10.5c [Virtual Base Classes], this would be a property of the class, not of the derivation. I am not in favor of permitting base classes to be \nvirtual\ because of the potential for confusion with the semantics of virtual base classes.
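
The accidental override that motivates \nvirtual\ can be reproduced in the current language. The following is a minimal sketch, not part of the proposal; the names \verb|Base|, \verb|Derived|, \verb|id|, \verb|call_id| and \verb|id_through_base| are hypothetical. \verb|Derived::id| is written as a \private\ helper, but because \verb|Base::id| is \virtual\ and has the same name and type, it silently becomes the override and is reachable through the \public\ interface of \verb|Base|.

\begin{verbatim}
#include <cassert>

// Hypothetical base class with a public virtual method.
class Base {
public:
    virtual ~Base() {}
    virtual int id() { return 1; }
    int call_id() { return id(); }   // dispatches through the vtable
};

// Hypothetical derived class whose author did not know about Base::id().
class Derived : public Base {
private:
    // Same name and type as Base::id(), so this overrides it without
    // any diagnostic, even though it is private in Derived.
    int id() { return 2; }
};

// Access is checked against Base, where id() is public.
int id_through_base(Base &b) { return b.id(); }
\end{verbatim}

Under the rewording proposed above, declaring \verb|Derived::id| as \nvirtual\ would presumably keep it out of the virtual function table, so that both calls through \verb|Base| would reach \verb|Base::id| instead.
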
Finally, it was pointed out that a declaration with complementary semantics would permit compilers to catch the opposite error, in which a derived class method is declared \virtual\ with the intention of having it override a base class method with the same name and type signature, but a small type mismatch causes the derived class method to be given its own vtable entry---silently. For example:
\begin{verbatim}
    class B { public: virtual void foo(); };
    class D : public B { public: virtual void foo() const; };
\end{verbatim}
In this case, calls to foo() through a pointer of type B* that points to an object of type D will invoke B::foo(), not D::foo(), as the programmer may have intended. Errors of this kind are generally caught only after laborious debugging.

In an independent discussion thread, I have seen John Chapin $<$jchapin@neon.stanford.edu$>$ suggest that the \verb|catch| keyword be given the following usage:
\begin{verbatim}
    class B { public: virtual void foo(); };
    class D : public B {
    public:
        catch virtual void foo() const;     // compile-time error
    };
\end{verbatim}
It would be a compiler error for a method specified with the \verb|catch| keyword not to have a base class method which it properly overrides.

\section{Flexibility of name}

As an alternative to \nconst, the name \verb|!const| was suggested. Here are two reasons why I originally chose \nconst\ instead of \verb|!const|:

\begin{itemize}

\item \verb|!const| seems to connote that the specified member is merely ``not const''. The semantics of my core proposal call for a more active {\em overriding} of constness that would otherwise be taken on by virtue of membership in an enclosing structure. I feel that this meaning is more closely suggested by \nconst\ (``destroy constness''?).

\item People are used to seeing \verb|~| in declarations (\ie\ in destructors), whereas the idea of seeing \verb|!| is more foreign.

\end{itemize}

\section{Acknowledgments}

Many thanks to all those who have criticized this proposal so thoughtfully and constructively. In particular, let me thank some major contributors:

\begin{verse}
Jim Adcock $<$jimad@microsoft.UUCP$>$ gave detailed opinions on a number of important issues, including the relationship with casts from \const, and the idea of discardable memory. Also, he is to blame for \ninline\ and \nvirtual.\\
Dag Bruck $<$dag@control.lth.se$>$ looked at several early versions, convinced me not to leave out \nconst\ member functions, and brainstormed on some possible alternative uses of the \nconst\ syntax.\\
Brian Kennedy $<$bmk@csc.ti.com$>$ explained to me the all-important distinction between bitwise and abstract constness, and shared parts of his X3J16 proposal to define more sharply the effects of casting away the \const\ attribute in terms of bitwise and abstract constness.\\
\end{verse}

\end{document}

--
Tom Ngo
ngo@harvard.harvard.edu
617/495-1768 lab number, leave message