Document number: N1690 (WG21) and 04-0130 (J16)
 
Howard E. Hinnant, hinnant@twcny.rr.com
Dave Abrahams, dave@boost-consulting.com
Peter Dimov, pdimov@mmltd.net
 
September 7, 2004

A Proposal to Add an Rvalue Reference to the C++ Language

Introduction

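We propose a new kind of reference that binds to rvalues as well as lvalues, even if not const qualified. This addition to the language enables programmers to implement move semantics and perfect forwarding, and it addresses several smaller usability problems in current code. Each of these is discussed below.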

Summary of Rvalue-Reference Proposal

N1377 introduces a new reference termed the rvalue reference. An rvalue reference to A is created with the syntax A&&. To distinguish it from the existing reference (A&), the existing reference is now termed an lvalue reference. The new rvalue reference behaves just like the existing lvalue reference except where noted.

There are eight modifications/additions to the core language proposed in N1377, and summarized below:

  1. A non-const rvalue can bind to a non-const rvalue reference.
  2. Overload resolution rules prefer binding rvalues to rvalue references and lvalues to lvalue references.
  3. Named rvalue references are treated as lvalues. See N1377 for rationale.
  4. Unnamed rvalue references are treated as rvalues.
  5. The result of casting an lvalue to an rvalue reference is treated as an rvalue.
  6. Where elision of copy constructor is currently allowed for function return values, the local object is implicitly cast to an rvalue reference.
  7. Reference collapsing rules are extended from cwg issue 106.
  8. Template deduction rules allow detection of rvalue/lvalue status of bound argument.
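
Items 2, 3, and 5 can be illustrated with a small sketch. It is written against the rvalue reference as it was later standardized in C++11, which may differ in detail from the rules proposed here. The names binding, which, make, pass_named, and pass_cast are hypothetical and exist only for this illustration.

```cpp
#include <string>

enum class binding { lvalue_ref, rvalue_ref };

// Two overloads that differ only in the kind of reference (item 2).
inline binding which(const std::string&) { return binding::lvalue_ref; }
inline binding which(std::string&&)      { return binding::rvalue_ref; }

// Returns by value, so the call expression make() is an rvalue.
inline std::string make() { return std::string("temporary"); }

// Inside this function s has a name, so the expression s is an lvalue (item 3).
inline binding pass_named(std::string&& s) { return which(s); }

// Casting an lvalue to an rvalue reference yields an rvalue (item 5).
inline binding pass_cast(std::string& s)
{
    return which(static_cast<std::string&&>(s));
}
```

Calling which with a named string selects the lvalue overload, while which(make()) selects the rvalue overload. pass_named selects the lvalue overload even though its own parameter was bound to an rvalue.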

Major Applications

Move Semantics

The rvalue reference can be used to easily add move semantics to an existing class. By this we mean that the copy constructor and assignment operator can be overloaded based on whether the argument is an lvalue or an rvalue. When the argument is an rvalue, the author of the class knows that he has a unique reference to the argument.

The overload taking an rvalue can modify the argument in whatever invariant preserving way it pleases and be guaranteed that the rest of the program will not notice!

For types that own resources, the ability to pilfer those resources from rvalues can result in tremendous (order of magnitude) performance increases. Additionally, clients can cast an lvalue object to an rvalue. This enables clients to tap into move optimizations when it is known that the lvalue source is no longer relevant in the program logic. See N1377 for several examples of move-optimized types, and even an auto_ptr-like smart pointer which is safe to put into containers.

Containers such as vector<T> can use T's move semantics internally as they insert, erase, and reallocate their elements. Member functions such as push_back can be overloaded on the rvalue reference so that heavyweight elements can be moved into the container instead of copied into it.

Sequence modifying algorithms such as remove_if, random_shuffle, and sort can move elements around instead of copying them, greatly reducing the expense of manipulating sequences of heavyweight objects.
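
As a minimal sketch of what adding move semantics to a resource-owning class might look like, consider the hypothetical buffer type below. It is not taken from N1377. It uses the syntax as later standardized in C++11 (including noexcept and std::move), and the copies() counter is an assumption added only to make the difference between copying and moving observable.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical resource-owning type, used only for illustration.
class buffer
{
public:
    explicit buffer(std::size_t n)
        : size_(n), data_(n ? new char[n]() : nullptr) {}

    // Copy constructor: allocates new storage and duplicates the contents.
    buffer(const buffer& other)
        : size_(other.size_),
          data_(other.size_ ? new char[other.size_] : nullptr)
    {
        if (size_) std::memcpy(data_, other.data_, size_);
        ++copies();
    }

    // Move constructor: pilfers the storage and leaves the source empty,
    // which keeps the invariant that size_ == 0 implies data_ == nullptr.
    buffer(buffer&& other) noexcept
        : size_(other.size_), data_(other.data_)
    {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Copy assignment: copy first, then exchange storage with the copy.
    buffer& operator=(const buffer& other)
    {
        buffer tmp(other);
        swap(tmp);
        return *this;
    }

    // Move assignment: exchange storage; the old storage is released
    // when the source object is destroyed.
    buffer& operator=(buffer&& other) noexcept
    {
        swap(other);
        return *this;
    }

    ~buffer() { delete[] data_; }

    void swap(buffer& other) noexcept
    {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }

    // Number of copy constructions performed so far.
    static int& copies() { static int n = 0; return n; }

private:
    std::size_t size_;
    char* data_;
};
```

With such a type, v.push_back(std::move(b)) on a std::vector<buffer> transfers the storage of b into the container without allocating a second block, and b is left empty.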

Forwarding

Generic function adaptors such as bind1st, bind, and function transform one function signature into another by binding some of the parameters. Ideally those parameters left unbound in the transformed function will behave identically to the corresponding parameter in the original function. For example, a const lvalue reference should bind to both lvalue and rvalue arguments, and a non-const lvalue reference should bind to a non-const lvalue, but refuse to bind to rvalues and const lvalues.

One way to accomplish this is by overloading on the free parameter with both const and non-const lvalue references. This is the approach taken by binder2nd (for example) after LWG issue 109 is applied:

template <class Operation> 
class binder2nd 
    : public unary_function<...>
{ 
    ...
public:
    ...
    result_type operator()(const first_argument_type& x);
    result_type operator()(      first_argument_type& x);
};

However, as the number of free parameters grows (as in bind and function), this solution quickly grows impractical. The number of overloads required increases exponentially with the number of parameters (2^N, where N is the number of parameters).

This proposal provides perfect forwarding using only one overload, no matter how many free parameters exist. For example:

template <class Operation> 
class binder2nd 
    : public unary_function<...>
{ 
    ...
public:
    ...
    result_type operator()(first_argument_type&& x);
};

If first_argument_type is a non-const lvalue reference, then due to the reference collapsing rules the expression first_argument_type&& is also a non-const lvalue reference, and thus will not bind to rvalues. If first_argument_type is not a reference (i.e. pass-by-value), then the parameter first_argument_type&& will bind to rvalues.
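
The collapsing behavior described above can be checked at compile time. The following sketch assumes the reference collapsing rules as later standardized in C++11; param_type is a hypothetical alias that merely spells the parameter type used in operator() above.

```cpp
#include <type_traits>

// Hypothetical alias: the parameter type formed by appending && to Arg.
template <class Arg>
using param_type = Arg&&;

// A non-const lvalue reference stays a non-const lvalue reference.
static_assert(std::is_same<param_type<int&>, int&>::value,
              "int& && collapses to int&");

// A const lvalue reference stays a const lvalue reference.
static_assert(std::is_same<param_type<const int&>, const int&>::value,
              "const int& && collapses to const int&");

// A non-reference (pass-by-value) type becomes an rvalue reference.
static_assert(std::is_same<param_type<int>, int&&>::value,
              "int && is int&&");

// Binding behavior follows from the collapsed type.
static_assert(!std::is_convertible<int, param_type<int&>>::value,
              "an rvalue does not bind to int&");
static_assert(std::is_convertible<int, param_type<const int&>>::value,
              "an rvalue binds to const int&");
static_assert(std::is_convertible<int, param_type<int>>::value,
              "an rvalue binds to int&&");
```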

Given some helper functions, forwarding and moving can be made more self documenting. Consider:

template <class T>
struct identity
{
    typedef T type;
};
 
template <class T>
inline
T&&
forward(typename identity<T>::type&& t)
{
    return t;
}
 
template <class T>
inline
typename remove_reference<T>::type&&
move(T&& t)
{
    return t;
}

Now clients can specify forward<T>(t) or move(t) depending on the intent of the function. The difference is somewhat subtle, thus the library helper functions help communicate the intent. The forward function will result in an lvalue if T is an lvalue reference, whereas the move function will always result in an rvalue reference. Template argument deduction is disabled for the forward function in order to force the client to specify the template argument. This is necessary to achieve the correct forwarding semantics.
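
The difference between forwarding and moving can be made concrete with a small sketch. It uses std::forward and std::move from the later standard library in place of the helper functions shown above; widget, receive, and the three relay functions are hypothetical names.

```cpp
#include <string>
#include <utility>

struct widget {};

// Overloads that report which kind of argument they received.
inline std::string receive(const widget&) { return "lvalue"; }
inline std::string receive(widget&&)      { return "rvalue"; }

// Preserves the lvalue/rvalue status of the caller's argument.
template <class T>
std::string relay_forward(T&& t) { return receive(std::forward<T>(t)); }

// Always passes an rvalue, whatever the caller supplied.
template <class T>
std::string relay_move(T&& t) { return receive(std::move(t)); }

// Passes the named parameter itself, which is always an lvalue.
template <class T>
std::string relay_plain(T&& t) { return receive(t); }
```

Given an lvalue w, relay_forward(w) reaches the lvalue overload while relay_move(w) reaches the rvalue overload. Given a temporary, relay_forward and relay_move both reach the rvalue overload, but relay_plain still reaches the lvalue overload.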

Below is an example two-parameter factory function, demonstrating perfect forwarding:

template <class T, class A1, class A2>
shared_ptr<T>
factory(A1&& a1, A2&& a2)
{
    return shared_ptr<T>(new T(forward<A1>(a1),
                               forward<A2>(a2)));
}
 
struct A
{
    A(int&, const double&);
};
 
int main()
{
    shared_ptr<A> sp1 = factory<A>(2, 1.414);  // does not compile
    int i = 2;
    shared_ptr<A> sp2 = factory<A>(i, 1.414);  // ok
}

In the above example the first call to factory fails because the rvalue "2" cannot bind to the int& in A's constructor. An lvalue must be passed instead. However, rvalues can be forwarded to A's second parameter. See N1385 for details on why this works.

Moving and Forwarding

Move semantics and perfect forwarding work together synergistically. A generic forwarding utility can forward to functors overloaded on rvalue/lvalue. And since the rvalue/lvalue characteristic of the client's argument is preserved, move semantics will be respected even through such forwarding. Using the forward and move helper functions, the following example shows how move and forwarding can work together in a self-documenting way:

template <class T, class A1>
inline
shared_ptr<T>
factory(A1&& a1)
{
    // If a1 is bound to an lvalue, it is forwarded as an lvalue
    // If a1 is bound to an rvalue, it is forwarded as an rvalue
    return shared_ptr<T>(new T(forward<A1>(a1)));
}
 
struct A
{
    ...
    A(const A&);  // lvalues are copied from
    A(A&&);       // rvalues are moved from
};
 
int main()
{
    A a;
    shared_ptr<A> sp1 = factory<A>(a);        // "a" copied from
    shared_ptr<A> sp2 = factory<A>(move(a));  // "a" moved from
}
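
A self-contained variant of the example above, offered as a sketch only, is shown below. It assumes std::shared_ptr, std::forward, and std::move from the later standard library; the tracked type and its how member are hypothetical and serve only to record which constructor ran.

```cpp
#include <memory>
#include <utility>

// Hypothetical type that records how each object was constructed.
struct tracked
{
    enum origin { fresh, copied, moved };
    origin how;

    tracked() : how(fresh) {}
    tracked(const tracked&) : how(copied) {}      // lvalues are copied from
    tracked(tracked&&) noexcept : how(moved) {}   // rvalues are moved from
};

template <class T, class A1>
std::shared_ptr<T> factory(A1&& a1)
{
    // The lvalue/rvalue status of the caller's argument is preserved.
    return std::shared_ptr<T>(new T(std::forward<A1>(a1)));
}
```

Here factory<tracked>(a) yields an object whose how member is copied, and factory<tracked>(std::move(a)) yields one whose how member is moved.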

Minor Problems Addressed

In addition to the major applications of move semantics and perfect forwarding, the rvalue reference also addresses several minor usability problems in current code. Below are a few examples where the rvalue reference can be used to improve the interface of current objects, yet these solutions are considered to be neither move semantics nor forwarding.

Use of Rvalue Streams

It is not uncommon for clients to want an easy way to open a log file, append a message to it, and then close it again, as conveniently as possible. Here is one possible approach:

void foo()
{
    {
        std::ofstream log_file("Log file", std::ios::app);
        log_file << "Log message\n";
    }
    ...
}

However, users strive for something more concise. A common first try is:

void foo()
{
    std::ofstream("Log file", std::ios::app) << "Log message\n";
    ...
}

This compiles, but does not do what the programmer intended. The inserter binds to the member const void* inserter, instead of to the non-member const char* inserter. And so the address of the literal is written to the file.

template <class charT, class traits = char_traits<charT> >
class basic_ostream
    : virtual public basic_ios<charT,traits>
{
public:
    ...
    basic_ostream& operator<<(const void* p);
    ...
};
...
template<class traits>
basic_ostream<char,traits>&
operator<<(basic_ostream<char,traits>&, const char*); // requires lvalue ostream

As a workaround, many people instead code:

void foo()
{
    std::ofstream("Log file", std::ios::app).flush() << "Log message\n";
    ...
}

The ".flush()" is a no-op that simply changes the rvalue ofstream into an lvalue.

With an rvalue reference we could modify the library's non-member const char* inserter to accept the stream by rvalue reference instead of by lvalue reference:

template<class traits>
basic_ostream<char,traits>&
operator<<(basic_ostream<char,traits>&&, const char*);

And now the common first try works as expected. There is no danger in binding an rvalue stream to the first parameter. It is convenient for memory based streams as well:

class A {...};
std::istream& operator>>(std::istream&& is, A& a);
std::ostream& operator<<(std::ostream&& os, const A& a);

Clients of A can now convert an A to and from std::string with simple one-liners:

A a;
std::istringstream("data for an A") >> a;
std::string s = (std::ostringstream() << a).str();
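
The same idea can be sketched with ordinary named functions instead of operators, which avoids any interaction with the inserters and extractors the library itself declares. The helpers read_int and format_pair are hypothetical, and the sketch assumes rvalue references as later standardized in C++11.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: accepts a temporary (rvalue) input stream.
inline int read_int(std::istream&& is)
{
    int value = 0;
    is >> value;   // inside the function, is is a named lvalue
    return value;
}

// Hypothetical helper: formats into a temporary memory stream and
// returns the accumulated text.
inline std::string format_pair(std::ostringstream&& os, int a, int b)
{
    os << a << ',' << b;
    return os.str();
}
```

A client can then write read_int(std::istringstream("42")) or format_pair(std::ostringstream(), 3, 4) without first declaring a named stream.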

Improved shared_ptr Interface

Consider the shared_ptr constructor:

template<class T>
template<class Y, class D>
shared_ptr<T>::shared_ptr(Y* p, D d);

Currently there is a precondition that the copy constructor of D must not throw an exception. The reason is that if it does, p is leaked. The shared_ptr constructor is designed to aggressively accept ownership of p, and delete it (call d(p)), even if the construction of the shared_ptr throws an exception. And requiring the copy constructor of D not to throw is the only weak link in that design. A superior design would not allow d to be copied before ownership of p was fully established.

Here is one possible redesign:

template<class T>
template<class Y, class D>
shared_ptr<T>::shared_ptr(Y* p, const D& d);

One problem with this is that when binding an rvalue D to the const D& parameter, the compiler may make a copy anyway, and thus there is a chance for an exception before p can be owned. Assuming that was fixed in the language, there is still another problem: The expression d(p) is allowed to require a non-const d. Thus internal to the shared_ptr constructor, one must make a copy of d before calling d_copy(p). And so you have the same problem as the pass-by-value solution: The copying of d may throw an exception before you can delete the pointer.

Here is another possible redesign:

template<class T>
template<class Y, class D>
shared_ptr<T>::shared_ptr(Y* p, D& d);

Now the shared_ptr can set up code to execute d(p) before acquiring any internal resources, and before copying d. The only problem with this solution is that passing an rvalue D to the shared_ptr constructor is a very common coding pattern. But that no longer works because that rvalue D will not bind to D&.

A perfect solution:

template<class T>
template<class Y, class D>
shared_ptr<T>::shared_ptr(Y* p, D&& d);

Using the rvalue reference, clients can pass (non-const) lvalues or rvalues of D into the constructor. The compiler does not make a copy of the lvalue or rvalue when binding to the parameter. Internal to the constructor, code is set up to execute d(p), without making a copy of d, and prior to acquiring any resources.

Improved random_shuffle

The current signature of std::random_shuffle is:

template<class RandomAccessIterator, class RandomNumberGenerator>
void
random_shuffle(RandomAccessIterator first, RandomAccessIterator last,
               RandomNumberGenerator& rand);

The reference parameter for the generator is to allow the internal state of the generator to propagate back to the calling program. This is fine except for when the client builds a temporary random number generator and does not care about its final state:

random_shuffle(c.begin(), c.end(), uniform_int<>(0, c.size()-1));

Declaring the third argument with an rvalue reference would allow clients to decide for themselves if the state of the random number generator after the shuffle is important to their program logic.

template<class RandomAccessIterator, class RandomNumberGenerator>
void
random_shuffle(RandomAccessIterator first, RandomAccessIterator last,
               RandomNumberGenerator&& rand);
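
A minimal sketch of such a signature follows. It is not the library's implementation: shuffle_sketch is a hypothetical name, the loop is a plain Fisher-Yates shuffle, and counting_generator is a hypothetical deterministic stand-in for a real random number generator, chosen so that its state can be inspected afterwards.

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical deterministic generator: rand(n) returns a value in [0, n).
struct counting_generator
{
    std::size_t calls = 0;
    std::size_t operator()(std::size_t n)
    {
        ++calls;
        return (calls * 7) % n;
    }
};

template <class RandomAccessIterator, class RandomNumberGenerator>
void shuffle_sketch(RandomAccessIterator first, RandomAccessIterator last,
                    RandomNumberGenerator&& rand)
{
    typedef typename std::iterator_traits<RandomAccessIterator>::difference_type
        diff_t;
    // Walk from the back, swapping each element with one at or before it.
    for (diff_t i = last - first - 1; i > 0; --i)
    {
        diff_t j = static_cast<diff_t>(rand(static_cast<std::size_t>(i + 1)));
        std::swap(first[i], first[j]);
    }
}
```

Passing a named generator lets its state propagate back to the caller (its calls member is updated), while passing counting_generator() as a temporary also compiles and simply discards the final state.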

Improved swap Interface

A well known idiom for reducing a vector's capacity to 0 is:

vector<A>().swap(v);

But this syntax is somewhat counterintuitive. Currently the member swap of vector is declared as:

void swap(vector&);

However, declaring it instead as:

void swap(vector&&);

would allow clients to instead write:

v.swap(vector<A>());

Overloading the namespace scope swap with rvalue references could also allow:

swap(v, vector<A>());
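
The effect of such an overload can be sketched today with a free function under a different name. swap_with is hypothetical and is not part of any library; the sketch assumes rvalue references as later standardized in C++11.

```cpp
#include <utility>
#include <vector>

// Hypothetical free function: exchanges the contents of v with those of
// a vector that the caller no longer needs, such as a temporary.
template <class T>
void swap_with(std::vector<T>& v, std::vector<T>&& other)
{
    v.swap(other);   // other is a named parameter here, hence an lvalue
}
```

A client can then write swap_with(v, std::vector<int>()) to release the storage held by v, with the object being modified appearing first in the call.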

Alternative Forwarding Solutions

N1385 explores the problems associated with generic forwarding functions such as class factories and function binders/adaptors. Seven approaches to forwarding, both within the current language, and with language extensions, are analyzed in detail. Out of those seven approaches, four do not require language changes, but one of those four is not practical as the number of free parameters increases beyond 1 or 2. The remaining three using the current language either disallow forwarding rvalues, or forward them as lvalues. Two of the three approaches requiring language modifications do allow the forwarding of rvalues, without the danger of binding them to a non-const reference. But one of those approaches would break existing code. The remaining solution is that proposed herein.

Alternative Language Syntax for Move

Since N1377 was written, various alternative syntaxes for language support of move semantics have been explored. One of the most promising alternatives involved modifying the existing overload rules between pass by value, and pass by reference. For example consider:

void foo(const A&);  // 1
void foo(A);         // 2

Today these overloads are ambiguous. But it was proposed that they be made unambiguous with lvalues preferring 1, and rvalues preferring 2. One of the most aesthetically disturbing consequences was that a move constructor would look like:

struct A
{
    A(A);  // move constructor
};

Some found the idea of a copy constructor taking its argument by value a little counter-intuitive.

A serious backwards compatibility issue was discovered which could silently break existing code:

struct A {};
 
template <class T> void f(T&);  // 1
void f(A);                      // 2
 
int main()
{
    A a;
    f(a);  // 2 now, 1 with the altered overload rules
}

And this alternative does not address the forwarding problem at all.

Library-Only Move Solution

N1377 gave a simple string example which is repeated here for convenience.

class string
{
public:
    // copy semantics
    string(const string& s);
    string& operator=(const string& s);
 
    // move semantics
    string(string&& s);
    string& operator=(string&& s);
    // ...
};

For clarity, only the interface is shown above. A library-only move solution is significantly more complicated. We will focus on the best library-only solution that has emerged since the publication of N1377. Shown below is the interface only:

class string
{
public:
    // copy semantics
    string(string& a);
    template <class T>
    string(T&, typename std::enable_if_same<T,
                        const string>::type* = 0);
 
    string& operator=(string& a);
    template <class T>
    typename std::enable_if_same
    <
        T, const string,
        string&
    >::type
    operator=(T& a);
    
    // move semantics
    string(std::rvalue_ref<string> a);
    string& operator=(std::rvalue_ref<string> a);
 
    operator std::rvalue_ref<string>()
        {return std::rvalue_ref<string>(*this);}
    // ...
};

The library solution relies on a couple of helper classes that would presumably be standardized into one of the library headers, else they would need to be reinvented on each use. Ironically the copy semantics portion of the string class is the most affected by the introduction of move semantics. Separate signatures are required for non-const lvalues and const lvalues. Each of these separate signatures would presumably forward to common copy logic. The move signatures use std::rvalue_ref<string> instead of string&&, and the underlying logic of these move operations would need to dereference a string* buried in the rvalue_ref object. Finally an implicit conversion to std::rvalue_ref is required.

In contrast, the language supported move semantics example is essentially an add-on operation when adding move semantics to an existing class. The copy semantics of an existing class are not affected at all. The author of string can simply add the move signatures. And there is no extraneous conversion operator required. In a nutshell, the syntax required for a library-only solution is daunting compared to a language supported solution.

In addition to the added complexity in the string interface for construction and assignment, there is additional complexity to be overcome in other public interface functions where you want to take advantage of the knowledge that you're dealing with an rvalue (and so you know you can pilfer it). For example N1377 shows how string+string can be optimized if it is known that one of the arguments to operator+ is an rvalue. This requires four overloads:

string operator+(const string&, const string&);
string&& operator+(string&&, const string&);
string&& operator+(const string&, string&&);
string&& operator+(string&&, string&&);

However, with the library solution shown above, there are not two kinds of argument to consider, but three: non-const lvalue, const lvalue, and rvalue. Therefore nine overloads (three kinds for each of the two parameters) are needed to cover this optimization:

string operator+(string&, string&);
 
template <class T>
typename std::enable_if_same
<
    T, const string,
    string
>::type
operator+(T&, string&);
 
template <class T>
typename std::enable_if_same
<
    T, const string,
    string
>::type
operator+(string&, T&);
 
template <class T>
typename std::enable_if_same
<
    T, const string,
    string
>::type
operator+(T&, T&);
 
string operator+(string&, std::rvalue_ref<string>);
 
string operator+(std::rvalue_ref<string>, string&);
 
template <class T>
typename std::enable_if_same
<
    T, const string,
    string
>::type
operator+(std::rvalue_ref<string>, T&);
 
template <class T>
typename std::enable_if_same
<
    T, const string,
    string
>::type
operator+(T&, std::rvalue_ref<string>);
 
string operator+(std::rvalue_ref<string>, std::rvalue_ref<string>);

This is clearly getting out of hand.

Finally it should be noted that no library solution exists today which can accomplish perfect forwarding.

As we move into an era where forwarding functions are increasingly important (std::tr1::bind, std::tr1::function, boost::lambda), the subject of perfect forwarding will also become increasingly important.

A Minor Error in N1377

N1377 describes a templated helper function called move which casts an lvalue to an rvalue so that clients do not need to read or write the A&& syntax:

template <class T>
inline
T&&         // error
move(T&& x)
{
    return static_cast<T&&>(x);
}

The return type of this function is incorrect. The reason lies in the modified template argument deduction rules associated with A&&. These rules state that when the parameter of move() is bound to an lvalue, T will be deduced as an lvalue reference type (e.g. A&). And because of the extended reference collapsing rules, the expression T&&, with T being A&, simplifies to A&.

The fix is quite easy, and relies on a type trait already voted into TR1: remove_reference (which would of course need to be extended to support rvalue references).

template <class T>
inline
typename remove_reference<T>::type&&  // correct
move(T&& x)
{
    return x;
}

Now if T is deduced as an lvalue reference, the return type of move takes this into account, stripping off the reference, and only then making the return type an rvalue reference.
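
The two return types can be compared at compile time. The sketch below assumes std::remove_reference and the reference collapsing rules as later standardized in C++11. The names naive_return, fixed_return, and move_sketch are hypothetical, and the explicit static_cast in move_sketch is an adaptation: the standardized language does not, in general, let an lvalue bind implicitly to an rvalue reference.

```cpp
#include <type_traits>
#include <utility>

struct A {};

// Return type as originally written: T&&, which collapses when T is A&.
template <class T>
struct naive_return { typedef T&& type; };

// Corrected return type: strip the reference first, then add &&.
template <class T>
struct fixed_return
{
    typedef typename std::remove_reference<T>::type&& type;
};

static_assert(std::is_same<naive_return<A&>::type, A&>::value,
              "T&& with T = A& is A&, not an rvalue reference");
static_assert(std::is_same<fixed_return<A&>::type, A&&>::value,
              "the corrected form yields A&& for lvalue arguments");
static_assert(std::is_same<fixed_return<A>::type, A&&>::value,
              "the corrected form yields A&& for rvalue arguments");

// Hypothetical helper using the corrected return type.
template <class T>
typename std::remove_reference<T>::type&& move_sketch(T&& x)
{
    return static_cast<typename std::remove_reference<T>::type&&>(x);
}
```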