Document #: N4131
Date: 2014-08-09
Reply to: Filip Roséen <filip.roseen@gmail.com>
Summary: Arguments for not allowing return {expr} to call an explicit constructor.

Another response to N4074;

explicit should never be implicit


Introduction

If one were to agree with the contents of N4074, the following snippet should compile without diagnostics:

struct Type1 {
    explicit Type1 (int);
};

Type1 example_f1 () {
    return { 0 };
}

The main arguments of N4074:

This paper tries to show why the change to ISO C++ proposed in N4074 should not be adopted, using several lines of argument, among them:


The Meaning of explicit

Marking a constructor as explicit is often equivalent to saying: "Such an initialization is certainly possible, but it is potentially not what you want. If you really want to do this, go ahead, but I won't let it happen without your explicit consent."

If a developer would like to use our explicit constructor, we'd like him to go the extra mile and explicitly show us that this is the case. We'd like him to show some effort and, more specifically, to consider whether this is really what he wants. Explicit constructors are, by the invisible contract involved, potentially dangerous.

// meaning-of-explicit.example.1

std::unique_ptr<T> func () {
  static T  x;
  return { &x };     // error: chosen constructor is explicit in copy-initialization
}

There's no way for an implementation to force a developer to actually walk around the block every time he tries to initialize an object using an explicit constructor. Instead, we require him to explicitly state his request by writing out the type he'd like to initialize at the point where the initialization takes place.

"I'll refuse to do this unless you show some effort."
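As a minimal sketch of what "writing out the type" looks like in a return-statement, consider the following. The names Resource and make_resource are hypothetical and only serve the illustration; the two static_assert checks use the standard type traits to show that the conversion exists, but only on explicit request.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical payload type, used only for this illustration.
struct Resource { int value = 0; };

// The pointer constructor of std::unique_ptr is explicit: a raw pointer is
// not implicitly convertible, although direct construction is available.
static_assert(!std::is_convertible<Resource*, std::unique_ptr<Resource>>::value,
              "no implicit conversion from a raw pointer");
static_assert(std::is_constructible<std::unique_ptr<Resource>, Resource*>::value,
              "explicit construction is available");

std::unique_ptr<Resource> make_resource() {
  // return { new Resource };                        // ill-formed today
  return std::unique_ptr<Resource>{ new Resource };  // consent is spelled out
}
```

The extra characters in the second return-statement are the "effort" the contract asks for: the reader can see at the point of initialization that ownership is being taken.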

Implications if N4074 is approved:

N4074 will effectively make the previously described contract disappear in the context of return { expr }, which further means that we completely disregard the original intent expressed by the author of said constructor.

// meaning-of-explicit.example.1

std::unique_ptr<T> func () {
  static T  x;
  return { &x };     // compiles, but triggers undefined-behavior
}                    //           if/when the unique_ptr is destroyed

If the author didn't want the user to "go the extra mile", the author wouldn't have marked the constructor as explicit.


The Current Praxis of braced-init-list

A braced-init-list is often referred to as a means of uniform initialization, meaning that all types can be initialized using the same syntax. It doesn't matter if we are initializing a fundamental type, or a user-defined type that is initialized with one, or several, arguments; the initialization is uniform.

The current praxis, backed up by the Standard, does not state that uniform initialization is a way to bypass the rules associated with initialization of an object of type T; we merely have a way to express initialization of any type.

Another point worth making is that developers often state that one of the greatest perks of using a braced-init-list is that it's equivalent to saying: "Dear compiler, if you know what type I'm trying to initialize, please go ahead."

It is important to note the usage of "you know"; nowhere does it imply that both the compiler and the developer know the type. When an initialization requires the use of an explicit constructor, the compiler certainly knows, but with the meaning of explicit in mind, an implementation should be worried that the developer doesn't, which is why we get a diagnostic in such a case.

Implications if N4074 is approved:

There are many rules to C++, some more complicated than others, but what really makes people go "hmpf" is when seemingly equivalent constructs behave differently.

Allowing return { ... } to use an explicit constructor contradicts the previous, far simpler explanation: "Unless a braced-init-list has a {type, object, cast} explicitly stated where it is being used, a potential conversion must be one that can happen implicitly."

C++ has enough rules that are cluttered with "but if this applies, that doesn't hold". We don't need another such rule, especially when it impedes type-safety and the only real gain is to prolong the lifetime of keyboards. Laziness doesn't go well with writing safe initializations.
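The rule quoted above can be sketched with a small type. Meters and the helper functions below are hypothetical names chosen for this illustration; the commented-out lines are the forms that are rejected under the current rules, one per context.

```cpp
#include <type_traits>

// Hypothetical type with a single explicit constructor.
struct Meters {
  explicit constexpr Meters(int v) : value(v) {}
  int value;
};

constexpr int length_of(Meters m) { return m.value; }

// Type stated where the braced-init-list is used.
constexpr Meters stated_type() { return Meters{ 3 }; }

// Cast stated where the conversion is used.
constexpr int stated_cast() { return length_of(static_cast<Meters>(4)); }

// Object stated: direct-list-initialization of a named object.
constexpr Meters named_object{ 5 };

// Nothing stated: copy-list-initialization, rejected in every context today.
//   Meters m = { 1 };
//   length_of({ 1 });
//   Meters f() { return { 1 }; }
static_assert(!std::is_convertible<int, Meters>::value,
              "int does not convert to Meters implicitly");
```

Under the current rules the three commented-out forms behave alike. The change proposed in N4074 would accept only the last of them, which is the kind of inconsistency this section argues against.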

Is the proposed change by N4074 really worth it?


Narrowing Conversions

There is a very close relationship between narrowing conversions and the use of a constructor marked as explicit.

If an object of a fundamental type T is list-initialized with a compile-time known value which isn't representable in that type, or with an object of type U which potentially can hold a value that isn't representable in T, a diagnostic is required.

The introduction of diagnostics for narrowing conversions in C++11 was, and is, a very good step towards increased type-safety. It prevents developers from making mistakes that can potentially result in a program that behaves in a manner which was never intended.

// narrowing-conversions.example.1

std::size_t
multiply (int x, int y) {
  return { x * y };  // error: non-constant-expression cannot be narrowed from
}                    //        type 'int' to 'std::size_t'
    

It is certainly possible to initialize a std::size_t with the result of x * y, but since std::size_t cannot represent negative numbers this is potentially unsafe.

If we play with the idea of writing a wrapper around std::size_t, we could end up with something like the following:
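One way a developer might respond to that diagnostic is sketched below. The function checked_multiply is a hypothetical name, and the sketch assumes that long long is wide enough to hold the product of two int values (true on common platforms, but not guaranteed by the Standard). The point is only that the narrowing is acknowledged with a cast after the unsafe case has been handled.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: the narrowing is made explicit with static_cast, and
// only after a negative product has been ruled out.
// Assumption: long long can represent the product of any two int values.
constexpr std::size_t checked_multiply(int x, int y) {
  return static_cast<long long>(x) * y < 0
             ? throw std::domain_error("checked_multiply: negative product")
             : static_cast<std::size_t>(static_cast<long long>(x) * y);
}
```

The diagnostic did not forbid the conversion; it forced the author to decide what should happen when the value does not fit.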

// narrowing-conversions.example.2

struct SizeType {
  explicit SizeType (  signed int);
           SizeType (unsigned int);

  …
};

SizeType
multiply (int x, int y) {
  return { x * y }; // error: chosen constructor is explicit in
}                   //        copy-initialization

The reason SizeType (signed int) is marked explicit is the same reason we rely on diagnostics to inform us of potential narrowing conversions: we rely on the compiler to tell us when we are doing something that might lead to unforeseen consequences.
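A filled-in variant of the wrapper might look as follows. The members value and from_signed, and the two functions, are assumptions made for this sketch and are not part of the example above; they only make it observable which constructor was selected.

```cpp
// Sketch of the wrapper with hypothetical members added for illustration.
struct SizeType {
  explicit constexpr SizeType(signed int v)
      : value(static_cast<unsigned int>(v)), from_signed(true) {}
  constexpr SizeType(unsigned int v) : value(v), from_signed(false) {}

  unsigned int value;
  bool from_signed;  // records which constructor ran
};

// Unsigned operands: the non-explicit constructor is chosen, so the plain
// braced form is accepted under the current rules.
constexpr SizeType multiply_unsigned(unsigned int x, unsigned int y) {
  return { x * y };
}

// Signed operands: "return { x * y };" would select the explicit constructor
// and is ill-formed today, so the type has to be written out.
constexpr SizeType multiply_signed(int x, int y) {
  return SizeType{ x * y };
}
```

With the current rules, the author of multiply_signed has to look at the type name and decide that a signed product is acceptable; with the change proposed in N4074, the braced form would compile silently.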

Implications if N4074 is approved:

Since C++11, the use of return { expr } has become almost synonymous with "safe initialization of any return-type". If N4074 is approved, this will no longer be true. This would be one of the scarier forms of a breaking change: one that cannot be caught by anything other than a watchful eye.


The return-statement

T func1 () {
  return expression-or-braced-init-list;
}

As the name implies, a return-statement is used to return a value to the caller of a function. However, it is of the utmost importance to understand that we never directly return the value of the expression-or-braced-init-list associated with the statement; we merely say that it is to be used as the initializer for the returned value.

The return-type of a function is by definition a distant type; one cannot know the actual return-type by only interpreting the expression-or-braced-init-list used to initialize it. The opposite also applies; one cannot know the initializers for the return-value by only inspecting the return-type.
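To make "distant type" concrete, here is a hypothetical illustration that is not taken from N4074 or from this paper's own examples. All names (Widget, BorrowedWidget, OwnedWidget, registry, find_borrowed, find_owned) are invented for the sketch; the aliases stand in for declarations that live far away from the function body.

```cpp
#include <memory>
#include <type_traits>

struct Widget { int id = 7; };

// Imagine these aliases are declared in a header far from the functions.
using BorrowedWidget = Widget*;
using OwnedWidget    = std::unique_ptr<Widget>;

inline Widget& registry() { static Widget w; return w; }

// The braced form is harmless here: a plain pointer is copied.
BorrowedWidget find_borrowed() { return { &registry() }; }

// The same braced form is rejected here under the current rules; if it were
// accepted, the unique_ptr would later delete an object with static storage.
OwnedWidget find_owned() {
  // return { &registry() };                     // ill-formed today
  return OwnedWidget{ new Widget(registry()) };  // ownership stated explicitly
}

static_assert(std::is_convertible<Widget*, BorrowedWidget>::value,
              "a raw pointer converts to the borrowed alias");
static_assert(!std::is_convertible<Widget*, OwnedWidget>::value,
              "a raw pointer does not convert to the owning alias");
```

Nothing in the text "return { &registry() };" reveals which of the two aliases is being initialized, which is why the diagnostic at that point is valuable.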

With the mentioned relation between the return-type and its initializer(s), there are side-effects that one has to properly consider:

Implications if N4074 is approved:

Even though I agree with the opinion raised by N4074, that a developer should know the return-type and the return-paths of the function he is working on, I find it of higher value that the compiler is able to stop potential brainfarts from ever making it as far as runtime.

Neither of the two previous examples would be caught during compilation if N4074 is approved. This means that somewhat trivial errors would leak out into the world of runtime, something which the strict type-safety of C++ has saved us from in the past.


Conclusion

The changes proposed by N4074 violate one of the fundamental type-safety philosophies of C++: if it's not clear that a potentially unsafe conversion can happen, we, as developers, would like the compiler to diagnose the potential error. It doesn't make sense for the rules of copy-list-initialization to differ in return-statements, since we are by definition initializing a distant type and, with that, a distant value.

If N4074 is approved, there are other cases to which such a change needs to propagate for it to make sense. Following the philosophy expressed by N4074, the private member-functions of a class are maintained by the same developer who is calling them (as they are implementation details). Should we then allow explicit constructors to be used when such a function is invoked with copy-list-initialization of its arguments? After all, the developer should know what is going on.