Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!rpi!nyser!cmx!dl
From: dl@cmx.npac.syr.edu (Doug Lea)
Newsgroups: comp.lang.c++
Subject: Re: named return values
Message-ID: <1867@cmx.npac.syr.edu>
Date: 20 Aug 89 13:03:50 GMT
References: <6453@columbia.edu> <7874@ardent.UUCP>
Reply-To: dl@cmx.npac.syr.edu (Doug Lea)
Organization: Northeast Parallel Architectures Center, Syracuse NY
Lines: 127


Jerry Schwarz contemplates whether an optimizing C++ compiler should be
allowed to optimize a function like u(),

    class X { ... X(X&) { ... } ... }; 

    X u() {             // unnamed return value
        X y;
        ...
        return y;
    }

into that which would be produced via a function like n()

    X n() return x {    // named return value
        ... x ...
    }

The question is whether a compiler may *legally* skip the X(X&)
constructor to get the value of y out of function u by constructing
and dealing with y as if y itself were the return value, thus, in this
case at least, automating some of the efficiency benefits of
named return values. 

Jerry and I have indeed been through a few exchanges on such points.
To summarize some of the discussions:

The C++ reference manual does not enumerate exactly those conditions
under which X(X&) will or will not be invoked, and further, does not
impose any restrictions upon user definitions of X(X&).  Therefore, as
I demonstrated in a posting last spring, one can write an X(X&)
constructor that possesses arbitrary side effects, and the results of
the corresponding program will differ depending on whether you use a
compiler that does invoke X(X&) in situations like getting the return
value out of function u() versus one that does not.  This is of some
practical interest, since various C++ compilers `elide' X(X&)
constructors in some circumstances but not others.
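
A minimal sketch of the kind of program described above, written in
present-day C++ rather than the dialect of the original discussion.
The names Noisy, copies_made, make_noisy, and count_copies_for_one_call
are invented for illustration, and the copy constructor takes a
const reference where the text writes X(X&).  The count it reports is
an assumption-free observation only in the sense that it depends on
the compiler and its options; no particular value is claimed.

```cpp
#include <cassert>

// Hypothetical counter: records every call of the copy constructor.
static int copies_made = 0;

struct Noisy {
    int value;
    Noisy() : value(0) {}
    // Copy constructor with an observable side effect.
    Noisy(const Noisy& other) : value(other.value) { ++copies_made; }
};

// Same shape as u() in the text: build a local, then return it by value.
Noisy make_noisy() {
    Noisy y;
    y.value = 42;
    return y;   // the copy out of y may or may not be elided
}

// Reports how many times the copy constructor ran for one call.
int count_copies_for_one_call() {
    copies_made = 0;
    Noisy r = make_noisy();
    assert(r.value == 42);   // the returned value is the same either way
    return copies_made;      // compiler-dependent; 0 when every copy is elided
}
```

A program that prints count_copies_for_one_call() can therefore give
different output under different compilers, which is exactly the
portability problem at issue.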

The use of named return values happily and successfully evades this
issue, at least with respect to return values, so I didn't bother to
get into it in my previous NRV postings.  But some kind of resolution
of this part of the story could be had by tightening up the
description of X(X&) semantics just enough to better reflect the
general assumptions that are or could be made inside of cfront, g++,
and probably all other C++ compilers. Jerry and I once came up with
something along the lines of

    Given any class X, and any X object `obj', when another object
    with the value of obj is required, C++ compilers may optionally
    use obj instead of a new object, `newobj', constructed via X(obj)
    if obj is not subsequently modified differently than newobj during
    newobj's potential lifetime.  Similarly, compilers may in such
    cases use any other X object constructed via X(obj), but never
    since or subsequently modified differently than obj during obj's
    lifetime.  X(X&) constructors should be defined in such a way that
    these optional compiler-generated actions do not change the
    meanings of programs.

To be more complete, this description would have to be accompanied by
an enumeration of those cases where any such value/object is
`required'.  The current reference manual appears to list these,
although not all in one place, or in these terms. 

The idea is to spell out those situations in which a compiler may safely
(1) alias and/or reuse objects (as in the case of a local and a
return value) rather than forcing copies via X(X&), and 
(2) invoke X(X&) (say, into a register) even in cases where it is not
explicitly needed if it somehow discovered that this would improve 
overall efficiency.  

The reference manual currently touches on some of these issues in
terms of `generating temporaries', leading one to believe that it is
describing case (2) here, in situations where it is really referring
to unexploited instances of case (1), i.e., the fact that compilers
don't have to guarantee that they will optimize out logically
required, but obviously unnecessary X(X&)'s.

Part of the reason that all this is controversial is that, both from a
safety perspective and in order to enable more aggressive optimization
(safety, correctness, and optimizability are almost always just
different ways of looking at the same problem), it would be nice to
banish *all* side effects from X(X&) constructors. However, it does
not seem at all desirable to disallow `innocuous' side effects, such as
reference counting manipulations.  Additionally, a case can be made
that programmers should be allowed to write X(X&)'s laden with
non-innocuous side effects, and to have a way to disable optimizations
in order to force X(X&) constructor calls whenever they are logically
or explicitly required, in a way analogous to how ANSI C `volatile'
variables are handled, although I am no longer convinced that this is
especially worthwhile.
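
As a rough illustration of an `innocuous' side effect, here is a sketch
of a reference-counted handle in present-day C++.  The names Rep,
Handle, make_handle, and refs_after_return are hypothetical and chosen
only for this example.  The copy constructor does modify shared state,
but each extra copy is matched by the destruction of its source, so
the count seen by the caller is the same whether or not the copy out
of the function is elided.

```cpp
#include <cassert>

// Hypothetical shared representation: a count plus some payload.
struct Rep {
    int refs;
    int data;
};

class Handle {
    Rep* rep_;
public:
    explicit Handle(int data) : rep_(new Rep{1, data}) {}
    // Side effect: copying bumps the shared reference count.
    Handle(const Handle& other) : rep_(other.rep_) { ++rep_->refs; }
    Handle& operator=(const Handle&) = delete;  // left out of the sketch
    // Destruction undoes the bump and frees the Rep on the last release.
    ~Handle() { if (--rep_->refs == 0) delete rep_; }
    int data() const { return rep_->data; }
    int refs() const { return rep_->refs; }
};

Handle make_handle(int data) {
    Handle h(data);
    return h;   // copy may be elided; either way the counts balance
}

// The count observed after the call is 1 with or without elision.
int refs_after_return(int data) {
    Handle r = make_handle(data);
    assert(r.data() == data);
    return r.refs();
}
```

A copy constructor that instead logged a message or incremented a
global tally would not have this balancing property, and would fall on
the non-innocuous side of the line.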

The important practical implication is that programmers should write
X(X&) constructors that build faithful copies of their arguments
without any kinds of side effects that would cause programs to behave
differently depending on whether the constructors were actually called
or not when they are logically required.  Of course, it is impossible
for a compiler alone to prove to itself whether X(X&) side effects are
considered by the programmer to be innocuous or not, or even whether a
compiler-generated or programmer-defined X(X&) really does make a
`correct' copy, so any kind of restriction on X(X&) is mostly
unenforceable by compilers.

Again, these semantic issues do not impact one way or another my other
arguments for the utility of named return values. In particular, they
do not alter the fact that named return values provide a deterministic
guarantee that X(X&) will NOT be invoked, whereas, even with further
refinement of X(X&) semantics, the question of whether any compiler
actually can and does optimize out all return-based X(X&) constructors
remains a non-trivial compiler implementation matter.
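
The `X n() return x' form is not part of standard C++, so the sketch
below only approximates its guarantee, by having the caller supply the
result object by reference.  This is a swapped-in technique, not the
named return value mechanism itself, and the names Big, big_copies,
fill_squares, and copies_when_filling_in_place are invented for the
example.  The point it shares with named return values is that the
function works directly on the object the caller will use, so the
absence of copy constructor calls does not depend on optimization.

```cpp
#include <cassert>

// Hypothetical counter: records every call of the copy constructor.
static int big_copies = 0;

struct Big {
    int cells[64];
    Big() : cells() {}   // zero-initialize the array
    Big(const Big& other) {
        for (int i = 0; i < 64; ++i) cells[i] = other.cells[i];
        ++big_copies;
    }
};

// The result object is named and owned by the caller; the function
// writes into it directly, so no copy constructor call is needed.
void fill_squares(Big& x) {
    for (int i = 0; i < 64; ++i) x.cells[i] = i * i;
}

// Reports the number of copies made while producing one result.
int copies_when_filling_in_place() {
    big_copies = 0;
    Big result;
    fill_squares(result);
    assert(result.cells[5] == 25);
    return big_copies;   // 0 by construction
}
```

The cost of this approximation is notational: the caller must declare
the result first and cannot use the call inside an expression, which
is part of what the named return value syntax avoids.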

Why is return-value X(X&) optimization hard for a compiler to guarantee?
Consider a function with lots of local X's, and lots of conditionals,
loops, inline and non-inline function calls, returns, etc. It would
take analyses at the edge of current state-of-the-art compiler
technology to discover which variable(s) could safely be aliased to
the return value slot, and how to manage the slot. I am all in favor
of compiler implementors adding such features to C++ compilers.  But
even with these kinds of analyses, it seems naive to believe that a
compiler could always arrive at code that would always be as good as
that specified in a simple and natural manner by a programmer using
named return values. Permitting such control over how one's code gets
compiled is surely within the spirit of C and C++.
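
A small sketch of why the analysis is hard, using hypothetical names
(Item, item_copies, choose, copies_for_choose) in present-day C++.
Two locals are candidates for the return value, and which one is
returned is known only at run time, so a compiler cannot simply
construct one of them in the return slot from the start.  The number
of copies reported is compiler-dependent, and no particular value is
claimed here.

```cpp
#include <cassert>

// Hypothetical counter: records every call of the copy constructor.
static int item_copies = 0;

struct Item {
    int tag;
    explicit Item(int t) : tag(t) {}
    Item(const Item& other) : tag(other.tag) { ++item_copies; }
};

// Both a and b must exist before the choice between them is made,
// so neither is an obvious candidate for aliasing to the return slot.
Item choose(bool first) {
    Item a(1);
    Item b(2);
    if (first) return a;
    return b;
}

// Reports how many copy constructor calls one call of choose() caused.
int copies_for_choose(bool first) {
    item_copies = 0;
    Item r = choose(first);
    assert(r.tag == (first ? 1 : 2));
    return item_copies;   // compiler-dependent
}
```

Real functions add loops, early returns, and calls to other functions
on top of this, which is where the analysis becomes genuinely
difficult.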


Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl