Document number: N1385=02-0043
Programming Language C++
 
Peter Dimov, pdimov@mmltd.net
Howard E. Hinnant, hinnant@twcny.rr.com
Dave Abrahams, dave@boost-consulting.com
 
September 09, 2002

The Forwarding Problem: Arguments

Problem Statement

The general form of the forwarding problem is that in the current language,

For a given expression E(a1, a2, ..., an) that depends on the (generic) parameters a1, a2, ..., an, it is not possible to write a function (object) f such that f(a1, a2, ..., an) is equivalent to E(a1, a2, ..., an).

The problem has two sides: first, f must be able to accept an arbitrary argument list and forward it to E (relatively) unmodified, so that the meaning of E does not change as a result, and second, f must be able to return the result of E back to its caller.

This paper concentrates on the first aspect, argument forwarding.

Motivating Examples

Generic wrapper classes

It is sometimes necessary to wrap an instance of an arbitrary type T in a wrapper class, as shown:

template<class T> class wrapper: public wrapper_base
{
private:

    T t_;

public:

    // forwarding constructors

    wrapper(): t_() {}
    wrapper(a1): t_(a1) {}
    wrapper(a1, a2): t_(a1, a2) {}
    // ...

};

The goal might be, for example, to treat the wrapped instances polymorphically through wrapper_base.

It is clear that the constructors of wrapper need to forward an arbitrary list of arguments to T's constructor.

The Boost.Python library [5] uses this technique.

Generic factory functions

The expression new T(a1, a2, ..., an) returns a plain pointer of type T*. This is a common source of resource leaks, especially in the presence of exceptions, so the usual advice is to use factory functions returning a smart pointer. Writing class-specific factory functions quickly becomes tedious, and the solution is to employ a generic factory function that mirrors the semantics of new T(a1, a2, ..., an) but returns a smart pointer, std::auto_ptr for example:

template<class T> std::auto_ptr<T> auto_new()
{
    return std::auto_ptr<T>(new T());
}

template<class T> std::auto_ptr<T> auto_new(a1)
{
    return std::auto_ptr<T>(new T(a1));
}

template<class T> std::auto_ptr<T> auto_new(a1, a2)
{
    return std::auto_ptr<T>(new T(a1, a2));
}

// ...

Client code now uses auto_new<T>(a1, a2, ..., an) to defend against resource leaks. Again, the need to forward an arbitrary argument list to T's constructor is obvious from the example code.

Boost.Bind

The Boost.Bind library [1], a generalization of the standard binders, is able to take a function object as an argument, and create a derivative function object that calls the original. For example, the expression boost::bind(f, _2, _1) creates a function object g such that g(x, y) invokes f(y, x), and the expression boost::bind(f, 1, _1) creates a function object h such that h(x) invokes f(1, x).

The ability to forward arguments from the function object generated by boost::bind to the original function object is essential for the library to operate.
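To make the forwarding requirement concrete, the function objects g and h described above can be written by hand. This is only a sketch and does not use Boost.Bind itself: the target function f and the names swap_args, bind_first_to_1, call_g and call_h are hypothetical, and the arguments are forwarded by non-const reference.

```cpp
#include <cassert>

// Hypothetical target function; any callable taking two ints would do.
int f(int a, int b)
{
    return a - b;
}

// Hand-written stand-in for the object produced by boost::bind(f, _2, _1):
// g(x, y) invokes f(y, x).
struct swap_args
{
    template<class A1, class A2> int operator()(A1 & a1, A2 & a2) const
    {
        return f(a2, a1);
    }
};

// Hand-written stand-in for the object produced by boost::bind(f, 1, _1):
// h(x) invokes f(1, x).
struct bind_first_to_1
{
    template<class A1> int operator()(A1 & a1) const
    {
        return f(1, a1);
    }
};

int call_g(int x, int y)
{
    swap_args g;
    return g(x, y); // x and y are lvalues here, so A1 & and A2 & can bind
}

int call_h(int x)
{
    bind_first_to_1 h;
    return h(x);
}

void example()
{
    assert(call_g(10, 3) == f(3, 10));
    assert(call_h(10) == f(1, 10));
}
```

Each operator() is itself a small forwarding function, which is why the library depends on a workable answer to the forwarding problem.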

Boost.Lambda

Boost.Lambda [2], a superset of Boost.Bind, is an even more elaborate "function object factory". As such, it encounters the same problem.

As an example, the Boost.Lambda expression _1 << _2 generates a function object f such that f(x, y) is equivalent to x << y. Note that the first argument can be the integral constant 5, or it can be std::cout.

Criteria for Evaluating Solutions

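Assuming a forwarding function f(a1, a2, ..., an) that calls g(a1, a2, ..., an), a general solution must have the following three properties:

C1. For all a1, a2, ..., an, if g(a1, a2, ..., an) is well-formed, then f(a1, a2, ..., an) should be well-formed as well.

C2. For all a1, a2, ..., an, if g(a1, a2, ..., an) is ill-formed, then f(a1, a2, ..., an) should be ill-formed as well.

C3. The amount of work required to implement f should be at most linear in n.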

Current Solutions

This section presents some of the (partial) solutions to the problem within the current language semantics. The examples used to illustrate the methods assume that we need to write a forwarding function f(a1, a2, a3) that calls g(a1, a2, a3).

#1: Non-const reference

This is the method currently employed by Boost.Bind and Boost.Lambda:

template<class A1, class A2, class A3> void f(A1 & a1, A2 & a2, A3 & a3)
{
    return g(a1, a2, a3);
}

Its main deficiency is that it cannot forward a non-const rvalue. The argument deduction creates a non-const reference, and the reference cannot bind to the argument. This makes innocent examples such as

int main()
{
    f(1, 2, 3);
}

fail (violates C1).

As function objects typically take their arguments by dereferencing iterators, this approach works relatively well for Bind and Lambda; still, it is not a general solution, and some real-world iterators return an rvalue when dereferenced (the Boost.Graph library [3] has such iterators, and the incompatibility between Bind and Graph has been reported as a problem in real code).
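A two-argument sketch of method #1 shows both sides: lvalues, const or not, are forwarded with their constness intact, while an rvalue argument is rejected. The target g and the function example are hypothetical; the failing call is left commented out.

```cpp
#include <cassert>

// Hypothetical target: modifies its first argument, reads its second.
void g(int & a, int const & b)
{
    a += b;
}

// Method #1: forward by deduced non-const reference.
template<class A1, class A2> void f(A1 & a1, A2 & a2)
{
    g(a1, a2);
}

void example()
{
    int x = 1;
    int const y = 2;

    f(x, y);        // A1 = int, A2 = int const; both references bind
    assert(x == 3); // the modification made by g is visible to the caller

    // f(x, 2);     // would not compile: A2 is deduced as int, and
                    // int & cannot bind to the rvalue 2
}
```

Note that g(x, 2) is well-formed when called directly, so the rejected call is a violation of C1.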

#2: Const reference

The problem with non-const rvalues can be solved by forwarding by const reference:

template<class A1, class A2, class A3> void f(A1 const & a1, A2 const & a2, A3 const & a3)
{
    return g(a1, a2, a3);
}

This method accepts and forwards arbitrary arguments, at the cost of always treating the argument as const. It is clear that this is not a general solution; when g accepts some of its arguments by a non-const reference, the forwarding will fail (violates C1). As an example consider the Lambda function object _1 << _2 mentioned above, when its first argument is std::cout.

This solution is typically used for constructor arguments; even then, some constructors take arguments by non-const reference.

An esoteric problem with this approach is that it is not possible to form a const reference to a function type, but this will be addressed by Core Issue 295.
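The added constness of method #2 can be observed with a target that is overloaded on constness. In the sketch below, g and example are hypothetical names; the overloads of g only report which one was selected. If g had only the non-const overload, f(x) would fail to compile instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical overloaded target that reports which overload was chosen.
std::string g(int &)       { return "non-const"; }
std::string g(int const &) { return "const"; }

// Method #2: forward by const reference.
template<class A1> std::string f(A1 const & a1)
{
    return g(a1);
}

void example()
{
    int x = 0;

    assert(g(x) == "non-const"); // direct call sees a non-const lvalue
    assert(f(x) == "const");     // forwarded call: constness has been added
    assert(f(5) == "const");     // rvalues are accepted
}
```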

#3: Const + Non-const reference

For single argument forwarders, it is possible to use a combined approach, providing both overloads:

template<class A1> void f(A1 & a1)
{
    return g(a1);
}

template<class A1> void f(A1 const & a1)
{
    return g(a1);
}

Compilers have disagreed about this overloading example for some time, sometimes claiming ambiguity, but the latest generation seems to have reached a consensus that the second template is more specialized than the first according to the partial ordering rules. Unfortunately, the much bigger issue with this approach is that the N-argument case would require 2^N overloads, which immediately rules it out as a general solution (violates C3). Our three-argument case, which needs eight overloads, is shown below.

template<class A1, class A2, class A3> void f(A1 const & a1, A2 const & a2, A3 const & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 & a1, A2 const & a2, A3 const & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 const & a1, A2 & a2, A3 const & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 & a1, A2 & a2, A3 const & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 const & a1, A2 const & a2, A3 & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 & a1, A2 const & a2, A3 & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 const & a1, A2 & a2, A3 & a3)
{
    return g(a1, a2, a3);
}

template<class A1, class A2, class A3> void f(A1 & a1, A2 & a2, A3 & a3)
{
    return g(a1, a2, a3);
}
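For the single-argument case, the behavior of the overload pair can be checked with a target that reports the constness it receives. This is a sketch; g and example are hypothetical names, and it assumes a compiler that applies the partial ordering rules as described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical overloaded target that reports which overload was chosen.
std::string g(int &)       { return "non-const"; }
std::string g(int const &) { return "const"; }

// Method #3, single-argument case: both overloads are provided.
template<class A1> std::string f(A1 & a1)       { return g(a1); }
template<class A1> std::string f(A1 const & a1) { return g(a1); }

void example()
{
    int x = 0;
    int const cx = 0;

    assert(f(x) == "non-const"); // A1 & is the better match
    assert(f(cx) == "const");    // same signature from both templates;
                                 // partial ordering selects the second
    assert(f(5) == "const");     // only A1 const & can bind to the rvalue
}
```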

#4: Const reference + const_cast

Another attempt to combine the good aspects of the two approaches is for f to take arguments by const reference, but pass them to g as non-const using const_cast:

template<class A1, class A2, class A3> void f(A1 const & a1, A2 const & a2, A3 const & a3)
{
    return g(const_cast<A1 &>(a1), const_cast<A2 &>(a2), const_cast<A3 &>(a3));
}

This method allows all valid g(a1, a2, a3) uses to work through f, too. Unfortunately, due to the way it discards the argument constness, it allows many invalid uses to work as well. For example, it allows a function g that takes a non-const reference to operate on const l- and rvalues, potentially attempting to modify them, invoking undefined behavior - a serious violation of C2.

Boost.Lambda allows this as an option.

In our opinion, the fact that library writers have to resort to such measures is an indication that the problem needs to be taken seriously.
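The way method #4 discards constness can be demonstrated without invoking undefined behavior, as long as the target does not write through the reference it receives. In the sketch below, g and example are hypothetical names; the overloads of g only report which one was selected.

```cpp
#include <cassert>
#include <string>

// Hypothetical overloaded target that reports which overload was chosen.
// Neither overload modifies its argument, so the sketch stays well-defined.
std::string g(int &)       { return "non-const"; }
std::string g(int const &) { return "const"; }

// Method #4: accept by const reference, forward as non-const.
template<class A1> std::string f(A1 const & a1)
{
    return g(const_cast<A1 &>(a1));
}

void example()
{
    int const cx = 0;

    assert(g(cx) == "const");     // direct call respects constness
    assert(f(cx) == "non-const"); // forwarded call: constness is gone
    assert(f(5) == "non-const");  // an rvalue reaches g(int &) as well
}
```

A version of g(int &) that actually modified its argument would have undefined behavior when reached through f(cx).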

Future Solutions

The main problem of the current "non-const reference" forwarding method is that non-const rvalues cannot bind to the deduced non-const reference. There are two possible language changes that can fix that: either make the reference bind to the argument, or deduce a const reference. This section enumerates the solutions that take advantage of these changes.

#5: Non-const reference + modified argument deduction

With a relatively small change to 14.8.2.1, it is possible to arrange for A1 in the following snippet:

template<class A1> void f(A1 & a1)
{
}

int main()
{
    f(5);
}

to be deduced as int const, and not plain int. As a side effect, solution #1 now will (const-correctly) work for all argument types.
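The effect of the modified deduction can be emulated in the current language by supplying the template argument explicitly. The sketch below is only an emulation, not the proposed change itself; g and example are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical overloaded target that reports which overload was chosen.
std::string g(int &)       { return "non-const"; }
std::string g(int const &) { return "const"; }

// Same shape as solution #1.
template<class A1> std::string f(A1 & a1)
{
    return g(a1);
}

void example()
{
    int x = 0;
    assert(f(x) == "non-const");        // A1 deduced as int

    // f(5);                            // current rules: A1 is deduced as int,
                                        // int & cannot bind, so this fails

    assert(f<int const>(5) == "const"); // A1 spelled as int const by hand, which
                                        // is what the modified deduction would
                                        // produce for the rvalue 5
}
```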

The upside of this approach is that it requires a small, isolated language change that is relatively independent of the rest of the language, including the move proposal [4].

One downside is that the change breaks existing code:

template<class A1> void f(A1 & a1)
{
    std::cout << 1 << std::endl;
}

void f(long const &)
{
    std::cout << 2 << std::endl;
}

int main()
{
    f(5);              // prints 2 under the current rules, 1 after the change
    int const n(5);
    f(n);              // 1 in both cases
}

It is difficult to evaluate the impact of the proposed change. Overload sets that rely on the fact that A1 & will not bind to non-const rvalues are too fragile to be useful, as any other input is accepted. One might argue that the example code is already broken - the overload set f has different behavior for the literal 5 and for the constant n, and it is considered good programming style to avoid using unnamed literals as "magic constants". (Microsoft Visual C++ 6.0, a widely used compiler, actually prints 1 in response to the f(5) call, since literals are considered const-qualified.)

On the other hand, it is possible to accidentally create such an overload set in a program; combined with a proxy-based container, this leads to the following example:

// helper function in a header

template<class T> void something(T & t) // #1
{
    t.something();
}

// source

#include <vector>

void something(bool) // #2
{
}

int main()
{
    std::vector<bool> v(5);
    something(v[0]); // resolves to #2 under the current rules, #1 after the change
}

that we consider dangerously close to being typical. Do not be distracted by the side question of whether vector<bool> is a std::vector, or a standard container at all. vector<bool> is just a well-known example that uses proxy references for lazy evaluation, a technique that is being used in real world C++ code.

This kind of conflict can arise whenever a deduced non-const reference is used as an argument, a construct that is more widespread than most people think. As an example taken from the standard library, consider std::advance:

template<class InputIterator, class DistanceType> void advance(InputIterator & it, DistanceType dist);

Another downside is that rvalues are forwarded as lvalues (a common trait of all presented solutions except #7). In the current language this is rarely a problem, but in an extended language that features the ability to overload on "rvalueness" - required to support move semantics [4] - this forwarding solution may prove less than perfect (C1). Still, it would work in most cases, definitely a step forward from what we have now.

#6: Rvalue reference

The proposal to add move semantics to C++ [4] includes the concept of an rvalue reference to T, spelled T &&. A key property of this reference is its ability to bind to rvalues.

Using this new tool, we can invent yet another forwarding method:

template<class A1, class A2, class A3> void f(A1 && a1, A2 && a2, A3 && a3)
{
    return g(a1, a2, a3);
}

This approach is nearly identical to the previous solution, except that it forwards a non-const rvalue as a non-const lvalue to g. This enables some invalid uses of g to sneak through f, as a non-const reference in g can now bind to the rvalue passed to f. Non-const references are not allowed to bind to rvalues for a reason, as the absence of this rule leads to mistakes:

void incr(long & l)
{
    ++l;
}

void f()
{
    int i(5);
    incr(i);
}

The programmer expects i to have value 6 after the call to incr, but incr operates on a temporary rvalue of type long, and the changes are discarded.

The situation in our case is not that bad, as in the forwarding version of the problematic example shown below

void incr(long & l)
{
    ++l;
}

template<class A1> void fwd(A1 && a1)
{
    incr(a1);
}

void f()
{
    int i(5);
    fwd(i);    // correctly fails to compile
    fwd(1L);   // compiles, but shouldn't
}

the incr(a1) call fails to compile when fwd(i) is invoked, as a1 is an lvalue of type int. However, fwd(1L) works, since a1 is an lvalue of type long. Many still consider it unacceptable to forward an rvalue to a function that would otherwise reject it (violates C2), but this is still better than the options we have under the current language semantics.
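The fwd(1L) case can be observed on a compiler that implements rvalue references. The sketch assumes C++11 or later, where they were adopted, and exercises only the rvalue call. As an assumption made for observability, incr is changed to return the incremented value; example is a hypothetical name.

```cpp
#include <cassert>

// Variant of incr from the text that also returns the new value,
// so that the effect on the temporary can be observed.
long incr(long & l)
{
    return ++l;
}

template<class A1> long fwd(A1 && a1)
{
    return incr(a1); // a1 is a named reference, hence an lvalue
}

void example()
{
    assert(fwd(1L) == 2); // compiles: the temporary is incremented,
                          // and the change is then discarded

    // int i(5);
    // fwd(i);            // would not compile: a1 is an lvalue of type int,
                          // and long & cannot bind to it
}
```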

#7: Rvalue reference + modified argument deduction

A perfect forwarding function would forward rvalue arguments as rvalues; none of the methods described so far achieve this, since it is not possible to determine from within a function whether an argument was an l- or an rvalue.

Let us first consider the rvalue reference type, T &&, in more detail in the context of the C++ type system. What if T is itself a reference type? Template libraries have been known to create reference to reference types via typedefs or type manipulations. According to the proposed resolution to Core Defect Report 106, ordinary references obey the equality T cv1 & cv2 & == T cv12 &, where cv12 is the union of cv1 and cv2. A similar reasoning can be applied to collapse two rvalue references into one: T cv1 && cv2 && == T cv12 &&.

But what about the mixed cases? What is the meaning of T cv1 & cv2 && (an rvalue of reference type)? One possibility is to consider an "rvalue of reference type" an lvalue, and this leads us to T cv1 & cv2 && == T cv12 &.
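The three collapsing cases discussed so far can be checked at compile time on a compiler that implements them. The sketch assumes C++11 or later and deliberately leaves cv-qualifiers out; the typedef names lref and rref and the helper function are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

typedef int & lref;  // plays the role of T &
typedef int && rref; // plays the role of T &&

// Reference to reference, formed through a typedef, collapses as described.
static_assert(std::is_same<lref &, int &>::value,   "T & & is T &");
static_assert(std::is_same<rref &&, int &&>::value, "T && && is T &&");
static_assert(std::is_same<lref &&, int &>::value,  "T & && is T &");

// Runtime view of the mixed case: an "rvalue of reference type" is an lvalue.
bool mixed_case_is_lvalue_reference()
{
    return std::is_lvalue_reference<lref &&>::value;
}

void example()
{
    assert(mixed_case_is_lvalue_reference());
}
```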

The next step is to modify the argument deduction to retain information about the "rvalueness" of the argument: when deducing against a template parameter of the form A1 &&, deduce A1 as a reference type when the argument is an lvalue, and a non-reference type otherwise. According to our T cv1 & cv2 && == T cv12 & rule, this effectively means that the type of the argument will be A1 & when the argument is an lvalue, and A1 && otherwise.

The final link in the chain is that, according to the move proposal, static_cast<A1 &&> creates an unnamed rvalue reference that is treated as an rvalue by the language.

Putting it all together, we have (one argument case shown for brevity):

template<class A1> void f(A1 && a1)
{
    return g(static_cast<A1 &&>(a1));
}

When f is invoked with an lvalue of type X, A1 is deduced as X &, and g receives static_cast<X &>(a1), i.e. an lvalue of type X. When the argument is an rvalue of type X, A1 is deduced as X, g receives static_cast<X &&>(a1), an rvalue of type X. Perfect forwarding.
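The four argument categories can be traced through f with a target that is overloaded on all of them. The sketch assumes C++11 or later, where this deduction rule and the collapsing rule were adopted; X, the overloads of g, make_const_x and example are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct X {};

// Hypothetical overloaded target that reports what it received.
std::string g(X &)        { return "non-const lvalue"; }
std::string g(X const &)  { return "const lvalue"; }
std::string g(X &&)       { return "non-const rvalue"; }
std::string g(X const &&) { return "const rvalue"; }

// Method #7: rvalue reference parameter plus static_cast<A1 &&>.
template<class A1> std::string f(A1 && a1)
{
    return g(static_cast<A1 &&>(a1));
}

// Helper producing a const rvalue of type X.
X const make_const_x()
{
    return X();
}

void example()
{
    X x;
    X const cx = X();

    assert(f(x) == "non-const lvalue");          // A1 = X &
    assert(f(cx) == "const lvalue");             // A1 = X const &
    assert(f(X()) == "non-const rvalue");        // A1 = X
    assert(f(make_const_x()) == "const rvalue"); // A1 = X const
}
```

In each case g receives the same category and constness that was passed to f.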

Summary

The table below summarizes the various approaches. The four middle columns contain the argument type received by our forwarded-to function g, given that the forwarding function f receives an argument type listed in the header.

| Forwarding method | Non-const lvalue | Const lvalue | Non-const rvalue | Const rvalue | Problems | Language change | Notes |
|---|---|---|---|---|---|---|---|
| #1: Non-const reference | Non-const lvalue | Const lvalue | (fails) | Const lvalue | Fails for f(rvalue) to g(A), f(rvalue) to g(A const &) | No | Limited applicability; the best we can do currently |
| #2: Const reference | Const lvalue | Const lvalue | Const lvalue | Const lvalue | Fails for f(lvalue) to g(A &) | No | Limited applicability |
| #3: Const + non-const reference | Non-const lvalue | Const lvalue | Const lvalue | Const lvalue | Requires exponential number of overloads | No | Not a practical solution |
| #4: Const reference + const_cast | Non-const lvalue | Non-const lvalue | Non-const lvalue | Non-const lvalue | Allows f(const lvalue) to g(A &), f(rvalue) to g(A &), f(const rvalue) to g(A &) | No | Works, but very unsafe |
| #5: Non-const reference + modified argument deduction | Non-const lvalue | Const lvalue | Const lvalue | Const lvalue | Language change breaks existing code | Yes | Near-perfect forwarding in the absence of move semantics, adequate forwarding otherwise |
| #6: Rvalue reference | Non-const lvalue | Const lvalue | Non-const lvalue | Const lvalue | Allows f(rvalue) to g(A &) | Yes | Slightly inferior forwarding compared to #5 |
| #7: Rvalue reference + modified argument deduction | Non-const lvalue | Const lvalue | Non-const rvalue | Const rvalue | None known | Yes | Perfect |

Conclusion

The practical problem of forwarding arbitrary argument lists has no good solution in the current language. The main obstacle is the inability to bind a non-const reference to a non-const rvalue, an issue that is addressed by the proposal to add support for move semantics to C++. This makes the two problems closely related, and the best solution to the forwarding problem depends on whether, and to what extent, the changes required to support move are incorporated into C++.

If the move proposal is accepted, the preferred approach to address forwarding is #7, followed by #6 and #5, in that order. Otherwise, #5 remains the only possibility.

References

[1] Boost.Bind library, Peter Dimov, http://www.boost.org/libs/bind/bind.html

[2] Boost.Lambda library, Jaakko Järvi, Gary Powell, http://www.boost.org/libs/lambda/doc/

[3] Boost.Graph library, Jeremy Siek, Lie-Quan Lee, Andrew Lumsdaine, http://www.boost.org/libs/graph/doc/

[4] A Proposal to Add Move Semantics Support to the C++ Language, Howard Hinnant et al, document number N1377=02-0035

[5] Boost.Python library, Dave Abrahams et al, http://www.boost.org/libs/python/doc/

--- end ---