| Document #: | N4131 | 
|---|---|
| Date: | 2014-08-09 | 
| Reply to: | Filip Roséen <filip.roseen@gmail.com> | 
| Summary: | Arguments for not allowing return {expr} to call an explicit constructor. | 
If one were to agree with the contents of N4074, the following snippet should compile without diagnostics:
struct Type1 {
    explicit Type1 (int);
};
Type1 example_f1 () {
    return { 0 };
}
    The main arguments of N4074:
      - return { expr } cannot mean anything other than that we,
        explicitly, want to initialize the return-value.
      - explicit initialization should be allowed where spelling out the
        type in the return-statement would be redundant; both the compiler
        and the developer know what is going on.
      This paper argues, along several lines, that the change to ISO C++ proposed in N4074 should not be adopted.
The meaning of explicit
      Marking a constructor as explicit is often equivalent to
      saying: "such initialization sure is possible, but it's potentially
      not what you want. If you really want to do this, go ahead, but I won't
    let it happen without your explicit consent."
    
      If a developer would like to use our explicit constructor,
      we'd like him to go the extra mile and explicitly show us that
      this is the case. We'd like him to show some effort and, more
      specifically, to consider whether this is really what he wants.
      explicit constructors are, by the invisible contract involved,
      potentially dangerous.
    
// meaning-of-explicit.example.1
std::unique_ptr<T> func () {
  static T  x;
  return { &x };     // error: chosen constructor is explicit in copy-initialization
}
    
      There's no way for an implementation to force a developer to actually walk
      around the block every time he tries to initialize an object using an
      explicit constructor. Instead, we require him to explicitly
      state his request by writing out the type he'd like to initialize at the
      point where such initialization takes place.
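
      The contract described above can be checked at compile time. The
      following is a minimal sketch, not part of N4074 or of the example
      above: T is a hypothetical empty stand-in type and make_owned is a
      hypothetical function name. It shows that the pointer constructor of
      std::unique_ptr permits direct-initialization but not implicit
      conversion, and that naming the type is what expresses consent.

```cpp
#include <memory>
#include <type_traits>

struct T { };  // hypothetical stand-in for the T used in the example above

// The pointer constructor of std::unique_ptr is explicit: initializing a
// unique_ptr<T> directly from a T* is possible, converting implicitly is not.
static_assert ( std::is_constructible<std::unique_ptr<T>, T*>::value,
                "direct-initialization from T* is allowed");
static_assert (!std::is_convertible<T*, std::unique_ptr<T>>::value,
                "implicit conversion from T* is rejected");

// Writing out the type at the point of initialization is the explicit consent.
std::unique_ptr<T> make_owned () {
  return std::unique_ptr<T> { new T };
  // return { new T };   // ill-formed today: the chosen constructor is explicit
}
```

      The commented-out line is the form that N4074 would make well-formed.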
    
"I'll refuse to do this unless you show some effort."
      N4074 will effectively make the previously described contract disappear
      in the context of return { expr }, which further
      means that we completely disregard the original intent expressed by the
      author of said constructor.
    
// meaning-of-explicit.example.1
std::unique_ptr<T> func () {
  static T  x;
  return { &x };     // compiles, but triggers undefined-behavior
}                    //           if/when the unique_ptr is destroyed
    
      If the author didn't want the user to "walk the extra mile",
      the author wouldn't have marked the constructor as explicit.
    
A braced-init-list is often referred to as a means of uniform initialization, meaning that all types can be initialized using the same syntax. It doesn't matter if we are initializing a fundamental type, or a user-defined type that is initialized with one, or several, arguments; the initialization is uniform.
      Current practice, backed up by the Standard, does not state that
      uniform initialization is a way to bypass the rules associated
      with initialization of an object of type T; we merely have a way to
      express initialization of any type.
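
      As an illustration of what "uniform" means here, the sketch below
      returns a fundamental type, an aggregate, and two library types using
      the same syntax. Point and the function names are hypothetical, chosen
      only for this example. Every initialization shown uses a constructor
      (or aggregate rule) that is available implicitly.

```cpp
#include <string>
#include <vector>

struct Point { int x; int y; };  // hypothetical aggregate

// The same return { ... } syntax, four different kinds of return-type.
constexpr int    make_int    () { return { 42 }; }       // fundamental type
constexpr Point  make_point  () { return { 1, 2 }; }     // aggregate
std::string      make_string () { return { "abc" }; }    // non-explicit constructor
std::vector<int> make_list   () { return { 1, 2, 3 }; }  // initializer_list constructor
```

      None of these bypasses a rule of initialization; the syntax is uniform,
      the rules stay the same.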
    
Another point of value is that you often hear developers state that one of the greatest perks of using a braced-init-list is that it's equivalent to saying: "Dear compiler, if you know what type I'm trying to initialize, please go ahead."
      It is important to note the usage of
      "you know"; nowhere does it imply that both the compiler and
      the developer "know the type". When an initialization requires
      the use of an explicit constructor the compiler sure knows,
      but with the meaning of explicit in mind, an implementation
      should be worried that the developer doesn't, which is why we get a
      diagnostic in such a case.
    
There are many rules to C++, some more complicated than others, but what really makes people go "hmpf" is when seemingly equivalent constructs behave differently.
      Allowing return { ... } to use an explicit
      constructor contradicts the previous, far simpler explanation:
      "Unless a braced-init-list has a {type, object, cast}
      explicitly stated where it is being used, a potential conversion must
      be one that can happen implicitly."
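
      The simple rule quoted above can be expressed as a compile-time check.
      The sketch below is an assumption-laden illustration rather than
      anything proposed by either paper: accept and copy_list_initializable
      are hypothetical names, and Type2 is a hypothetical non-explicit twin
      of Type1 from the first snippet. The trait asks whether a function
      parameter of type T can be initialized from { arg }, which is the same
      copy-list-initialization that a return-statement performs today.

```cpp
#include <type_traits>
#include <utility>

struct Type1 { explicit Type1 (int) { } };  // as in the first snippet
struct Type2 {          Type2 (int) { } };  // hypothetical non-explicit twin

// Declared only; it is named in an unevaluated context and never called.
template<class T> void accept (T);

// True when a T can be copy-list-initialized from { arg }.
template<class T, class Arg, class = void>
struct copy_list_initializable : std::false_type { };

template<class T, class Arg>
struct copy_list_initializable<T, Arg,
  decltype (accept<T> ({ std::declval<Arg> () }))> : std::true_type { };

// No type is stated next to the braces, so only implicit conversions apply.
static_assert (!copy_list_initializable<Type1, int>::value,
               "an explicit constructor is rejected");
static_assert ( copy_list_initializable<Type2, int>::value,
               "a non-explicit constructor is accepted");

// Stating the type, as in Type1 { 0 }, is what makes the explicit one usable.
static_assert ( std::is_constructible<Type1, int>::value,
               "direct-initialization may use the explicit constructor");
```

      Under the current rules the same answer holds for variables, function
      arguments and return-statements alike; N4074 would make only the last
      of these differ.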
      
    
Is the proposed change by N4074 really worth it?
      There is a very close relationship between narrowing conversions,
      and the use of a constructor marked as explicit.
    
      If a fundamental type T is initialized with a compile-time
      known value which isn't suitable for that type, or if such type is
      initialized with an object of type U which potentially can
      hold a value that isn't representable in T, a diagnostic is
      required.
    
The introduction of narrowing conversions in C++ was, and is, a very good step towards increased type-safety. It prevents developers from making mistakes that can potentially result in a program that behaves in a manner which was never intended.
// narrowing-conversions.example.1
#include <cstddef>
std::size_t
multiply (int x, int y) {
  return { x * y };  // error: non-constant-expression cannot be narrowed from
}                    //        type 'int' to 'std::size_t'
    
    
      It is certainly possible to initialize a std::size_t with the
      result of x * y, but since std::size_t cannot
      handle negative numbers this is potentially unsafe.
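
      For comparison, one way a developer might give explicit consent to
      such a conversion is sketched below. The names fits, to_size and
      multiply_checked are hypothetical and not part of either paper; the
      sketch assumes that rejecting negative values is the desired policy.

```cpp
#include <cstddef>
#include <stdexcept>

// A constant that is representable in the target type is not narrowing.
constexpr std::size_t fits { 42 };

// Hypothetical helper: the cast spells out the consent, and the check covers
// the case the diagnostic warns about (a negative value).
constexpr std::size_t to_size (int value) {
  return value < 0 ? throw std::out_of_range ("negative size")
                   : static_cast<std::size_t> (value);
}

// Note: this sketch does not guard against overflow in x * y.
constexpr std::size_t multiply_checked (int x, int y) {
  return to_size (x * y);
}
```

      The point is not the helper itself but that the conversion now appears
      in the source, where a reviewer can see it.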
    
    If we play with the idea of writing a wrapper around
    std::size_t, we could end up with something like the below:
    
// narrowing-conversions.example.2
struct SizeType {
  explicit SizeType (  signed int);
           SizeType (unsigned int);
  …
};
SizeType
multiply (int x, int y) {
  return { x * y }; // error: chosen constructor is explicit in
}                   //        copy-initialization
    
      The reason SizeType (signed int) is marked
      explicit is the same reason we rely on diagnostics
      to inform us of potential narrowing conversions: we rely on the
      compiler to tell us when we are doing something that might lead
      to unforeseen consequences.
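
      A sketch of how the wrapper could be used under the current rules
      follows. The original elides the members of SizeType, so the value
      member, the constructor bodies, and the policy of throwing on negative
      input are all assumptions made for this illustration.

```cpp
#include <stdexcept>

struct SizeType {
  // Hypothetical bodies; the original example elides them.
  constexpr explicit SizeType (  signed int v)
    : value (v < 0 ? throw std::domain_error ("negative size")
                   : static_cast<unsigned int> (v)) { }
  constexpr          SizeType (unsigned int v) : value (v) { }

  unsigned int value;  // hypothetical member
};

// Signed input: the type must be named before the explicit constructor is used.
constexpr SizeType multiply (int x, int y) {
  return SizeType { x * y };
}

// Unsigned input: the non-explicit constructor applies, no type is needed.
constexpr SizeType multiply (unsigned int x, unsigned int y) {
  return { x * y };
}
```

      The first overload makes the potentially unsafe path visible at the
      return-statement; the second stays terse because nothing dangerous is
      going on.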
    
      Since C++11 the use of return { expr } has become almost
      synonymous with "safe initialization of any return-type"; if N4074 is
      approved this will no longer be true. This would be one of the scarier
      forms of a breaking change: one that cannot be caught by
      anything other than a watchful eye.
    
return-statement
T func1 () {
  return expression-or-braced-init-list;
}
    
    As the name implies, a return-statement is used to return
    a value to the caller of a function. However, it is of the utmost importance
    that we understand that we never directly return the value of the
    expression-or-braced-init-list associated with the
    statement; we merely say that it is to be used as the initializer for the
    returned value.
    
The return-type of a function is by definition a distant type; one cannot know the actual return-type by only interpreting the expression-or-braced-init-list used to initialize it. The opposite also applies; one cannot know the initializers for the return-value by only inspecting the return-type.
With the mentioned relation between the return-type and its initializer(s), there are side-effects that one has to properly consider:
A developer should be allowed to change the return-type of a function without having to review every return-statement in its body. The expected behavior is that such a change results in a diagnostic unless every initialization of the new return-type follows the rules of strict type-safety (meaning that a potentially dangerous initialization should not implicitly apply).
In the example below, a developer mistakenly believed that "ms" was the SI symbol for microseconds; it is not (it denotes milliseconds). The error is, however, caught during compilation.
// return-statement.example.1
/*!
 *  \brief  Benchmark `f()`
 *  \return The duration in ms spent evaluating `f()`
 * */
unsigned long benchmark (std::function<void()> f) {
  …
}
commit message:
  * updating codebase to C++11, `benchmark` now returns the appropriate
    duration type from <chrono>
commit diff:
  --- benchmark.cpp  2014-07-28 03:56:32.255764544 +0200
  +++ benchmark.cpp  2014-07-28 03:56:53.175682956 +0200
  @@ -5,6 +5,6 @@
    *  \return The duration in ms spent evaluating `f()`
    * */
   
 - unsigned long benchmark (std::function<void()> f) {
 + std::chrono::microseconds benchmark (std::function<void()> f) {
     …
   }
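
The reason the change above is diagnosed can be sketched as follows. The body of benchmark is elided in the original and is left alone here; to_result and its elapsed_ms parameter are hypothetical names used only to show a return-statement that states its unit.

```cpp
#include <chrono>
#include <type_traits>

// The constructor of a duration that takes a raw tick count is explicit, so a
// bare integer cannot silently become a std::chrono::microseconds.
static_assert (!std::is_convertible<unsigned long, std::chrono::microseconds>::value,
               "a raw count does not implicitly convert to a duration");

// Hypothetical helper: the unit of the raw count is named where it is converted.
constexpr std::chrono::microseconds to_result (unsigned long elapsed_ms) {
  return std::chrono::duration_cast<std::chrono::microseconds> (
    std::chrono::milliseconds {
      static_cast<std::chrono::milliseconds::rep> (elapsed_ms) });
}
```

With the unit spelled out, a count of 5 milliseconds becomes 5000 microseconds instead of being reinterpreted as 5 microseconds.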
      A developer might not know the return-type of a function when he writes his return-statement. He should therefore have a mechanism to disable initializations that potentially do something which was never intended, no matter if such an initialization makes use of one, or several, arguments.
// return-statement.example.2
#include <chrono>
#include <initializer_list>
template<class T>
struct Vector {
  explicit Vector (int size, int capacity = 0);
           Vector (std::initializer_list<T> data);
};
template<class T, class... Ts>
Vector<T> make_vector (Ts... args) {
  return { args... };
}
int main () {
  using secs = std::chrono::seconds;
  auto x = make_vector< int> (1,5,10);
  auto y = make_vector<secs> (10, 20); // error: chosen constructor is explicit in copy-initialization
}
      Even though I agree with the opinion raised by N4074, that a developer should know the return-type and the return-paths of the function he is working on, I find it of higher value that the compiler is able to stop potential brainfarts from ever making it as far as runtime.
Neither of the two previous examples would be caught during compilation if N4074 is approved. This means that these somewhat trivial errors would leak into runtime, something which the strict type-safety of C++ has saved us from in the past.
The changes proposed by N4074 violate one of the fundamental type-safety philosophies of C++: if it's not clear that a potentially unsafe conversion can happen, we - as developers - would like the compiler to diagnose the potential error. It doesn't make sense for the rules of copy-list-initialization to differ in return-statements, since we are by definition initializing a distant type - and with that, a distant value.
      If N4074 is approved, there are other cases to which such a change needs
      to propagate for it to make sense. Following the philosophy expressed by
      N4074, private member-functions of a class are maintained by the
      same developer who is calling them (as they are implementation details);
      should we then allow explicit constructors to be used when
      the arguments of such a function are copy-list-initialized?
      After all, the developer should know what
      is going on.