N4471
Revision of N3602
2015-04-12
Mike Spertus, Symantec
mike_spertus@symantec.com

Template parameter deduction for constructors (Rev. 2)

This paper proposes extending template parameter deduction for functions to constructors of template classes. The clearest way to describe the problem and solution is with some examples.

Suppose we have defined the following.

vector<int> vi1 = { 0, 1, 1, 2, 3, 5, 8 };
vector<int> vi2;
std::mutex m;
unique_lock<std::mutex> ul(m, std::defer_lock);
template<class Func>
class Foo {
public:
  Foo(Func f) : func(f) {}
  void operator()(int i) {
    os << "Calling with " << i << endl;
    func(i);
  }
private:
  Func func;
  mutex mtx;
};

Currently, if we want to instantiate template classes, we must either specify the template parameters explicitly, use an inconsistently named "make_*" wrapper that leverages template parameter deduction for functions, or punt completely:

pair<int, double> p(2, 4.5);
auto t = make_tuple(4, 3, 2.5);
copy_n(vi1.begin(), 3, back_inserter(vi2));
// Virtually impossible to pass a lambda to a template class' constructor
for_each(vi1.begin(), vi1.end(), Foo<???>([&](int i) { ...}));
lock_guard<std::mutex> lck(foo.mtx);
lock_guard<std::mutex, std::unique_lock<std::mutex>> lck2(foo.mtx, ul); // Notation from N4470
auto hasher = [](X const & x) -> size_t { /* ... */ };
unordered_map<X, int, decltype(hasher)> ximap(10, hasher);

There are several problems with the above code:

Creating “make functions” like make_tuple is confusing, artificial, extra boilerplate, and inconsistent with how non-template classes are constructed.
Since the standard library doesn't follow any consistent convention for make functions, users have to scurry through documentation to find they need to use make_tuple for tuples and back_inserter for back_insert_iterators. Of course their own template classes might not be as consistent or thoroughly-documented as the standard.
Specifying template parameters as in pair<int, double>(2, 4.5) should be unnecessary since they can be inferred from the type of the arguments, as is usual with template functions (this is the reason the standard provides make functions for many template classes in the first place!).
If we don't have a make function, we may not be able to create class objects at all as indicated by the ??? in the code above.
Make functions can't be used with classes that aren't movable like std::lock_guard.
The useful technique of replacing a large function with a class, organizing its code into methods to reduce cyclomatic complexity, can't be used for template functions.
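To make the boilerplate concrete, here is a minimal sketch of what a user must write today to pass a lambda to a class like Foo. The names make_foo and demo_make_foo are hypothetical and exist only for this illustration; the stream output of the original Foo is omitted to keep the sketch self-contained.

```cpp
#include <cassert>
#include <utility>

// Wrapper in the spirit of the Foo example above (stream output omitted).
template<class Func>
class Foo {
public:
  explicit Foo(Func f) : func(std::move(f)) {}
  void operator()(int i) { func(i); }
private:
  Func func;
};

// Hypothetical make function: it exists only so that Func can be deduced,
// because the closure type of a lambda cannot be named at the call site.
template<class Func>
Foo<Func> make_foo(Func f) {
  return Foo<Func>(std::move(f));
}

// Usage: the lambda's type is deduced by make_foo, not by Foo's constructor.
inline void demo_make_foo() {
  int total = 0;
  auto foo = make_foo([&total](int i) { total += i; });
  foo(2);
  foo(3);
  assert(total == 5);
}
```

Every template class that wants to accept a lambda needs a wrapper of this kind, and the wrapper must return by value, which is why the technique does not extend to non-movable classes.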

If we allowed the compiler to deduce the template parameters for constructors of template classes, we could replace the above with:

pair p(2, 4.5);
tuple t(4, 3, 2.5);
copy_n(vi1.begin(), 3, back_insert_iterator(vi2));
for_each(vi1.begin(), vi1.end(), Foo([&](int i) { ...})); // Now easy to do
auto lck = lock_guard(foo.mtx);
lock_guard lck2(foo.mtx, ul);
unordered_map<X, int> ximap(10, [](X const & x) -> size_t { /* ... */ });

We believe this is more consistent and simpler for both users and writers of template classes, especially for user-defined classes, which may lack carefully designed and documented make functions of the kind the standard provides for pair, tuple, and back_insert_iterator.

The Basic Idea of the Deduction Process

We propose to allow a template name referring to a class template as a simple-type-specifier in two contexts:

Functional-notation simple type conversions ([expr.type.conv]), and
Simple-declarations of the form "decl-specifier-seq id-expression initializer".

In the case of a functional-notation type conversion (e.g., "tuple(1, 2.0, false)") or a direct parenthesized or braced initialization, the initialization is resolved as follows. First, the constructors and constructor templates declared in the named template are enumerated. Let Ci be such a constructor or constructor template; together they form an overload set. A parallel overload set F of function templates is then created as follows: for each Ci, a function template is constructed whose template parameters include those of the named class template and, if Ci is a constructor template, those of that template (default arguments are included too). Its function parameters are the constructor parameters, and its return type is void. Deduction and overload resolution are then performed for a synthesized call to F with the parenthesized or braced expressions used as arguments. If that call doesn't yield a "best viable function", the program is ill-formed. Otherwise, the template name is treated as the class-name that is obtained from the named class template with the deduced arguments corresponding to that template's parameters.

Let's look at an example:

template<typename T> struct S {
  template<typename U> struct N {
    N(T);
    N(T, U);
    template<typename V> N(V, U);
  };
};
S<int>::N x{2.0, 1};

In this example, "S<int>::N" in the declaration of x is missing template arguments, so the approach above kicks in. Template arguments can only be left out this way from the "type" of the declaration, but not from any name qualifiers used in naming the template; i.e., we couldn't replace "S<int>::N" by just "S::N" using some sort of additional level of deduction. To deduce the initialized type, the compiler now creates an overload set as follows:

template<typename U> void F(S<int>::N<U> const&);
template<typename U> void F(S<int>::N<U> &&);
template<typename U> void F(int);
template<typename U> void F(int, U);
template<typename U, typename V> void F(V, U);

(The first two candidates correspond to the implicitly-declared copy and move constructors. Note that template parameter T is already known to be int and is not a template parameter in the synthesized overload set.) Then the compiler performs overload resolution for a call "F(2.0, 1)", which in this case finds a unique best candidate in the last synthesized function with U = int and V = double. The initialization is therefore treated as "S<int>::N<int> x{2.0, 1};".
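The outcome of this example can be checked with today's language by writing the synthesized overload set out by hand. The sketch below is only an emulation: the functions F return the class type they stand for, rather than void as in the proposal, so that decltype can report which candidate ordinary overload resolution selects. The alias Deduced is a name introduced for this illustration.

```cpp
#include <type_traits>

template<typename T> struct S {
  template<typename U> struct N {
    N(T) {}
    N(T, U) {}
    template<typename V> N(V, U) {}
  };
};

// Hand-written stand-ins for the synthesized set F (declared, never defined).
// Assumption for illustration: each returns the type it would select.
template<typename U> S<int>::N<U> F(S<int>::N<U> const&);
template<typename U> S<int>::N<U> F(S<int>::N<U>&&);
template<typename U> S<int>::N<U> F(int);
template<typename U> S<int>::N<U> F(int, U);
template<typename U, typename V> S<int>::N<U> F(V, U);

// Overload resolution for F(2.0, 1) picks F(V, U) with V = double, U = int.
using Deduced = decltype(F(2.0, 1));
static_assert(std::is_same<Deduced, S<int>::N<int>>::value,
              "the synthesized call deduces S<int>::N<int>");
```

The candidate F(int, U) is also viable, but it needs a double-to-int conversion for the first argument, so F(V, U) is the better match.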

Note that after the deduction process described above the initialization may still end up being ill-formed. For example, a selected constructor might be inaccessible or deleted, or the selected template instance might have been specialized or partially specialized in such a way that the candidate constructors will not match the initializer.
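A minimal sketch of the specialization case follows, again emulating the deduction step with a hand-written function template. Holder, deduce_holder, and Selected are hypothetical names used only here.

```cpp
#include <type_traits>

// Primary template: its constructor is what deduction looks at.
template<typename T> struct Holder {
  Holder(T) {}
};

// Explicit specialization that does not offer a constructor taking int.
template<> struct Holder<int> {
  Holder(const char*) {}
};

// Hand-written stand-in for the function synthesized from Holder(T).
template<typename T> Holder<T> deduce_holder(T);

// Deduction from the primary template selects Holder<int> for the argument 1 ...
using Selected = decltype(deduce_holder(1));
static_assert(std::is_same<Selected, Holder<int>>::value,
              "deduction is driven by the primary template");

// ... but the specialization cannot actually be constructed from an int.
static_assert(!std::is_constructible<Selected, int>::value,
              "the selected specialization has no matching constructor");
```

Under the proposal, an initialization such as "Holder h(1);" would therefore deduce Holder<int> and then be rejected as ill-formed.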

The case of a simple-declaration with copy-initialization syntax is treated similarly to the approach described above, except that explicit constructors and constructor templates are ignored, and the initializer expression is used as the single call argument during the deduction process.
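The effect of ignoring explicit constructors can be sketched in the same emulated style. Box, F_direct, and F_copy are hypothetical names; F_direct models the set built for direct-initialization and F_copy the smaller set built for copy-initialization.

```cpp
#include <type_traits>
#include <utility>

template<typename T> struct Box {
  explicit Box(T*) {}  // explicit: ignored for copy-initialization
  Box(T) {}            // converting constructor
};

// Direct-initialization: both constructors contribute a candidate.
template<typename T> Box<T> F_direct(T*);
template<typename T> Box<T> F_direct(T);

// Copy-initialization: only the non-explicit constructor contributes.
template<typename T> Box<T> F_copy(T);

// For an int* argument, partial ordering prefers the T* candidate.
using DirectType = decltype(F_direct(std::declval<int*>()));
using CopyType   = decltype(F_copy(std::declval<int*>()));

static_assert(std::is_same<DirectType, Box<int>>::value,
              "direct form deduces Box<int>");
static_assert(std::is_same<CopyType, Box<int*>>::value,
              "copy form deduces Box<int*>");
```

So, under the rules as described, "Box b(p);" and "Box b = p;" with p of type int* could name different specializations, which is worth keeping in mind when marking constructors explicit.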

Challenges

While we do not need to cover every possible case with this feature because one can still explicitly specify parameters or use a factory function, there are some challenges that were not adequately analyzed and/or addressed in the initial proposal.

As Richard Smith has pointed out, some implementations of the STL define their value_type typedef as follows:

template<typename T, typename Alloc=std::allocator<T>>
struct vector {
  struct iterator {
    typedef T value_type;
    /* ... */
  };
  typedef iterator::value_type value_type;
  /* ... */
};

The detour through vector<T>::iterator makes it unclear how to deduce that T is char in a constructor call like vector(5, 'c'). We would certainly like constructors like that to work, which can be addressed in several ways.

Do nothing. The first option is to just “ignore” the issue, which doesn't actually ignore it. The standard (§23.3.6.1 [vector.overview]) currently defines value_type to be T. If this proposal were to be accepted as above without changes, defining value_type to be iterator::value_type would no longer satisfy the “as-if” rule, and the typedef would need to be changed to match the standard. Fortunately, this would break neither API nor ABI, so it may be a very simple way to avoid this issue. We will get feedback from library vendors before the Lenexa meeting about the impact of this change on the standard library.
Trace primary templates. Since the proposal already calculates the overload sets based on the primary template of the class it is constructing, the above algorithm could look into the primary definition of vector<T>::iterator just like it does in vector<T> and again reject the ill-formed match in the case where, say, a specialization results in no such constructor being instantiable with the candidate types.
Typed constructors. See below.
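The difficulty with the value_type detour described above is the familiar non-deduced context of ordinary function templates. The sketch below uses a simplified, hypothetical Vec in place of vector, and two hand-written candidates: F_detour spells the parameter through the nested iterator, F_plain spells it as T. The helper detour_deduces is likewise an illustration-only name that detects whether a call to F_detour can deduce T.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

template<typename T> struct Vec {
  struct iterator { typedef T value_type; };
  typedef typename iterator::value_type value_type;  // the detour
  Vec(std::size_t, value_type const&) {}
};

// Candidate as the detour spells it: T appears only in a nested-name-specifier.
template<typename T>
Vec<T> F_detour(std::size_t, typename Vec<T>::iterator::value_type const&);

// Candidate as the standard spells value_type: T appears directly.
template<typename T>
Vec<T> F_plain(std::size_t, T const&);

// Detects whether F_detour(A, B) is callable without explicit arguments.
template<typename A, typename B>
auto detour_deduces(int)
    -> decltype((void)F_detour(std::declval<A>(), std::declval<B>()),
                std::true_type());
template<typename A, typename B>
std::false_type detour_deduces(...);

static_assert(!decltype(detour_deduces<std::size_t, char>(0))::value,
              "T cannot be deduced through the detour");
static_assert(std::is_same<decltype(F_plain(5, 'c')), Vec<char>>::value,
              "T is deduced as char when the parameter is spelled T");
```

The first option in the list above amounts to requiring the F_plain spelling; the second would have the compiler see through the detour on the programmer's behalf.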

There are a number of natural factory functions that are not deducible in the above framework. For example, one could imagine a function make_vec defined as follows:

template<typename Iter>
vector<typename Iter::value_type> make_vec(Iter b, Iter e) {
  return vector<typename Iter::value_type>(b, e);
}

There is no constructor in vector from which we can deduce the type of the vector from two iterators. Again, it may be ok to ignore the issue as the general feature retains its value even if it does not cover this case. Our goal has never been to deduce template parameters for all constructor calls (e.g. vector{}). Still, it would be appealing and natural to deduce the template parameter when constructing a vector from two iterators. See the section on typed constructors below for a way to do this.

We also need to consider the interaction with injected class names. For example, in the following code, the differing types produced by the two calls to vector(v1.begin(), v1.end()) can be surprising:

vector<X> v1 = {/* ... */};
auto v2 = vector(v1.begin(), v1.end()); // We want v2 to be a vector<X>
template<typename T> struct vector {
  void foo() { 
    auto v3 = vector(v1.begin(), v1.end()); // v3 should be vector<T> to avoid breaking change
  }
};
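The current meaning of the injected class name, which the v3 case must preserve, can be seen in a small sketch. Wrapper, from, and demo_injected_name are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <type_traits>

template<typename T>
struct Wrapper {
  T value;
  explicit Wrapper(T v) : value(v) {}

  // Inside the class, the bare name Wrapper is the injected class name and
  // means Wrapper<T>. The int argument is converted to T; nothing is deduced.
  static Wrapper from(int i) { return Wrapper(i); }
};

inline void demo_injected_name() {
  auto w = Wrapper<double>::from(3);
  static_assert(std::is_same<decltype(w), Wrapper<double>>::value,
                "Wrapper(i) inside the class names Wrapper<double>");
  assert(w.value == 3.0);
}
```

If deduction applied inside the class as it does outside, Wrapper(i) would instead name Wrapper<int>, silently changing the meaning of existing code.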

Again, we should discuss whether a change is required as the template parameters could always be specified explicitly. If we do want to deduce them, we suggest allowing something like the following in analogy with Concepts Lite.

vector<X> v1 = {/* ... */};
auto v2 = vector(v1.begin(), v1.end()); // We want v2 to be a vector<X>
    
template<typename T> struct vector {
  void foo() { 
    auto v3 = vector(v1.begin(), v1.end()); // v3 should be vector<T> to avoid breaking change
    auto v4 = vector<auto>(v1.begin(), v1.end()); // v4 is vector<X>
  }
};

This example raises the question of whether we should always put in auto when deducing constructor arguments. We believe that would be a bad choice, as the injected class case is rather rare, and it is hard to imagine that people would be any happier coding things like tuple<auto, auto, auto, auto, auto>(1, 2, 3, 4, 5) than they would be saying make_tuple<auto, auto, auto, auto, auto>(1, 2, 3, 4, 5). However, it could be a useful optional feature that could be applied to existing function templates as well (as illustrated by the make_tuple example above).

Typed constructors

One idea suggested by Richard Smith is to create a notation to allow constructors to specify their template parameters. This creates a fine-grained control that makes it easy to specify constructors like the ones above without needing to change the typedefs in the STL or trace through other templates. Possible notations for this are:

template<typename T, typename Alloc = std::allocator<T>> struct vector {
  // Option 1: Typed constructor in primary template only
  template <typename Iter> vector<typename Iter::value_type>(Iter b, Iter e);
};
// Option 2: Typed constructor given globally
template<typename Iter> vector<typename iterator_traits<Iter>::value_type>(Iter b, Iter e);
template<typename Iter> vector(Iter b, Iter e) -> vector<typename iterator_traits<Iter>::value_type>;

Note: It is only necessary to declare typed constructors. Giving a definition is not allowed as they just construct their return type from their arguments according to normal rules.
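The declaration-only nature of a typed constructor can be approximated in today's language with a function template that is declared but never defined and is used solely inside decltype. This is only an emulation under that assumption, not the proposed notation; vector_from, construct_vector, and demo_typed_constructor are hypothetical names.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Stand-in for a typed constructor: it only maps argument types to the class
// type to construct. It is declared, never defined, and never called at run time.
template<typename Iter>
std::vector<typename std::iterator_traits<Iter>::value_type>
vector_from(Iter b, Iter e);

// The object itself is built by the ordinary iterator-pair constructor of
// whatever type the declaration above names.
template<typename Iter>
auto construct_vector(Iter b, Iter e) -> decltype(vector_from(b, e)) {
  typedef decltype(vector_from(b, e)) Result;
  return Result(b, e);
}

inline void demo_typed_constructor() {
  const char s[] = "abc";
  auto v = construct_vector(s, s + 3);
  static_assert(std::is_same<decltype(v), std::vector<char>>::value,
                "value_type comes from iterator_traits");
  assert(v.size() == 3 && v[0] == 'a');
}
```

As in the note above, the mapping carries no definition of its own: construction proceeds according to the normal rules once the class type is known.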