Doc. No.: WG21/N3679
Date: 2013-05-05
Reply to: Hans-J. Boehm
Phone: +1-650-857-3406
Email: Hans.Boehm@hp.com

N3679: Async() future destructors must wait

We've had repeated debates about the desirability of having futures returned by async() wait in their destructors for the underlying task to complete. See, for example, N3630 and its predecessor N3451. This has turned into a particularly sensitive issue, since future destructors do not block consistently, but only when the future was returned by async(), potentially making futures difficult to use in general-purpose code.

A number of the older papers, e.g. N3630, argued that future destructors should not block at all. Here we argue that such a change would go too far: it would introduce subtle program bugs, which are likely to be exploitable as security holes. A very similar argument was presented, in slide form, at the Bristol SG1 meeting. It contributed to an alternative proposal, N3637, which was almost voted into the working paper.

The only point of this paper is to document more of the discussion leading to N3637 in the interest of avoiding future repetition.

The basic issue

Futures returned by async() with the async launch policy wait in their destructor for the associated shared state to become ready. This prevents a situation in which the associated thread continues to run, and there is no longer a means to wait for it to complete because the associated future has been destroyed. Without heroic efforts to otherwise wait for completion, such a "run-away" thread can continue to run past the lifetime of the objects on which it depends.

As an example, consider the following pair of functions:

void f() {
  vector<int> v;
  ...
  do_parallel_foo(v);
  ...
}

void do_parallel_foo(vector<int>& v) {
  auto fut = no_join_async([&] {...  foo(v); return ...; });
  a: ...
  fut.get();
  ...
}

If no_join_async() returns a future whose destructor does not wait for the async task to complete, everything may work well until the code at a: throws an exception. At that point nothing waits for the async task to complete, and it may continue to run past the exit from both do_parallel_foo() and f(), causing the async task to access and overwrite memory previously allocated to v well past its lifetime.

The end result is likely to be a cross-thread "memory smash" of the kind described in N2802, which arises under similar conditions.

This problem is of course avoided if get() or wait() is called on no_join_async()-generated futures before they are destroyed. The difficulty, as in N2802, is that an unexpected exception may cause that code to be bypassed. Thus some sort of scope guard is usually needed to ensure safety. If the programmer forgets to add the scope guard, it appears likely that an attacker could generate e.g. a bad_alloc exception at an opportune point to take advantage of the oversight, and cause a stack to be overwritten. It may be possible to also control the data used to overwrite the stack, and thus gain control over the process. This is a sufficiently subtle error that, in our experience, it is likely to be overlooked in real code.
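The kind of guard the previous paragraph describes might look roughly as follows. This is a minimal sketch, not part of this paper's proposal: no_join_async() is emulated here with a packaged_task run on a detached thread (which yields a future whose destructor does not block), and future_joiner is an illustrative name.

#include <algorithm>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Emulation of a hypothetical no_join_async(): the returned future's
// destructor does NOT wait, because the task runs on a detached thread.
template <typename F>
auto no_join_async(F f) -> std::future<decltype(f())> {
  std::packaged_task<decltype(f())()> task(std::move(f));
  auto fut = task.get_future();
  std::thread(std::move(task)).detach();
  return fut;
}

// Scope guard: waits for the future in its destructor, so that an exception
// thrown before get() cannot leave the task running past v's lifetime.
template <typename Fut>
struct future_joiner {
  Fut& fut;
  ~future_joiner() { if (fut.valid()) fut.wait(); }
};

std::size_t do_parallel_foo(std::vector<int>& v) {
  auto fut = no_join_async([&] { std::sort(v.begin(), v.end());
                                 return v.size(); });
  future_joiner<decltype(fut)> guard{fut};  // the guard the text describes
  // ... code that may throw (the point labelled a: above) ...
  return fut.get();
}

If the programmer omits the guard (or equivalent), the code is correct only on the exception-free path, which is precisely the kind of oversight described above.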

Not all dangling pointers are created equal

It has repeatedly been argued that this is no worse than existing dangling-pointer issues, such as those introduced by lambda expressions with reference captures. Here we argue that it is in fact worse, by contrasting the two corresponding examples below. Both examples operate on a vector v passed in as a parameter. In both cases, the function foo should normally ensure that there are no references to v once foo() returns, since there is no reason to expect that v will still be around. On the left side, we assume a hypothetical no_join_async() whose returned future does not block in its destructor, as above.

Async-induced dangling reference (the "left" version):

void foo(vector<int> &v)
{
  auto f = no_join_async([&] {...
    sort(v); return v.size(); });
  a: ...
  // drop f
}

Lambda-induced dangling reference (the "right" version):

function<int()> foo(vector<int> &v) {
  function<int()> f = [&] {... sort(v); return v.size(); };
  a: ...
  return f;
}

Both pieces of code are buggy, or at least very brittle. On the left, v may be accessed after the return of foo() because the asynchronous task continues to run. On the right side, the returned lambda expression has captured v by reference. There is no guarantee that v still exists when the lambda expression is invoked.
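For concreteness (this illustration is not part of the original comparison), the right-hand bug is triggered only when a caller stores the returned closure and invokes it after v has been destroyed; make_sorter below is a hypothetical caller:

#include <algorithm>
#include <functional>
#include <vector>

// The right-hand version from above, written out in full.
std::function<int()> foo(std::vector<int>& v) {
  std::function<int()> f = [&] { std::sort(v.begin(), v.end());
                                 return static_cast<int>(v.size()); };
  return f;
}

// Hypothetical caller that lets the captured reference dangle.
std::function<int()> make_sorter() {
  std::vector<int> v{3, 1, 2};
  return foo(v);    // the closure captures v by reference
}                   // v is destroyed here

int main() {
  auto f = make_sorter();
  return f();       // undefined behaviour: the closure touches the dead v
}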

But there are several reasons to consider the version on the left significantly more hazardous: