P2300R7
std::execution

Published Proposal,

Authors:
Source:
GitHub
Issue Tracking:
GitHub
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++
Audience:
SG1, LEWG

1. Introduction

This paper proposes a self-contained design for a Standard C++ framework for managing asynchronous execution on generic execution resources. It is based on the ideas in A Unified Executors Proposal for C++ and its companion papers.

1.1. Motivation

Today, C++ software is increasingly asynchronous and parallel, a trend that is likely to continue. Asynchrony and parallelism appear everywhere, from processor hardware interfaces, to networking, to file I/O, to GUIs, to accelerators. Every C++ domain and every platform needs to deal with asynchrony and parallelism, from scientific computing to video games to financial services, from the smallest mobile devices to your laptop to GPUs in the world’s fastest supercomputer.

While the C++ Standard Library has a rich set of concurrency primitives (std::atomic, std::mutex, std::counting_semaphore, etc.) and lower-level building blocks (std::thread, etc.), we lack a Standard vocabulary and framework for asynchrony and parallelism that C++ programmers desperately need. std::async/std::future/std::promise, C++11’s intended exposure for asynchrony, is inefficient, hard to use correctly, and severely lacking in genericity, making it unusable in many contexts. We introduced parallel algorithms to the C++ Standard Library in C++17, and while they are an excellent start, they are all inherently synchronous and not composable.

This paper proposes a Standard C++ model for asynchrony, based around three key abstractions: schedulers, senders, and receivers, and a set of customizable asynchronous algorithms.

1.2. Priorities

1.3. Examples: End User

In this section we demonstrate the end-user experience of asynchronous programming directly with the sender algorithms presented in this paper. See § 4.20 User-facing sender factories, § 4.21 User-facing sender adaptors, and § 4.22 User-facing sender consumers for short explanations of the algorithms used in these code examples.

1.3.1. Hello world

using namespace std::execution;

scheduler auto sch = thread_pool.scheduler();                                 // 1

sender auto begin = schedule(sch);                                            // 2
sender auto hi = then(begin, []{                                              // 3
    std::cout << "Hello world! Have an int.";                                 // 3
    return 13;                                                                // 3
});                                                                           // 3
sender auto add_42 = then(hi, [](int arg) { return arg + 42; });              // 4

auto [i] = this_thread::sync_wait(add_42).value();                            // 5

This example demonstrates the basics of schedulers, senders, and receivers:

  1. First we need to get a scheduler from somewhere, such as a thread pool. A scheduler is a lightweight handle to an execution resource.

  2. To start a chain of work on a scheduler, we call § 4.20.1 execution::schedule, which returns a sender that completes on the scheduler. A sender describes asynchronous work and sends a signal (value, error, or stopped) to some recipient(s) when that work completes.

  3. We use sender algorithms to produce senders and compose asynchronous work. § 4.21.2 execution::then is a sender adaptor that takes an input sender and a std::invocable, and calls the std::invocable on the signal sent by the input sender. The sender returned by then sends the result of that invocation. In this case, the input sender came from schedule, so it is a void sender: it won’t send us a value, and our std::invocable therefore takes no parameters. But we return an int, which will be sent to the next recipient.

  4. Now, we add another operation to the chain, again using § 4.21.2 execution::then. This time, we get sent a value - the int from the previous step. We add 42 to it, and then return the result.

  5. Finally, we’re ready to submit the entire asynchronous pipeline and wait for its completion. Everything up until this point has been completely asynchronous; the work may not have even started yet. To ensure the work has started and then block pending its completion, we use § 4.22.2 this_thread::sync_wait, which will either return a std::optional<std::tuple<...>> with the value sent by the last sender, or an empty std::optional if the last sender sent a stopped signal, or it throws an exception if the last sender sent an error.
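The three outcomes of sync_wait described in step 5 can be told apart by the caller roughly as sketched below. This is only an illustration of the return shape: fake_sync_wait, outcome, and describe are hypothetical stand-ins written with the standard library alone, and they do not perform any asynchronous work.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <stdexcept>
#include <string>
#include <tuple>

// Hypothetical stand-in for the three ways a sender can complete.
enum class outcome { value, stopped, error };

// Hypothetical stand-in for this_thread::sync_wait. It mimics only the
// return shape described in step 5, not any real asynchronous behaviour.
inline std::optional<std::tuple<int>> fake_sync_wait(outcome o) {
  switch (o) {
    case outcome::value:   return std::make_tuple(13 + 42);  // value channel
    case outcome::stopped: return std::nullopt;              // stopped channel
    case outcome::error:   throw std::runtime_error("work failed");
  }
  return std::nullopt;  // not reached; keeps compilers quiet
}

// Shows how a caller can distinguish the three outcomes.
inline std::string describe(outcome o) {
  try {
    auto result = fake_sync_wait(o);
    if (!result) return "stopped";
    auto [i] = *result;  // same structured binding as in the example above
    return "value " + std::to_string(i);
  } catch (const std::exception& e) {
    return std::string("error: ") + e.what();
  }
}
```

Calling .value() directly on the result, as the Hello world example does, is a shortcut that is reasonable only when the pipeline is known not to complete with a stopped signal.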

1.3.2. Asynchronous inclusive scan

using namespace std::execution;

sender auto async_inclusive_scan(scheduler auto sch,                          // 2
                                 std::span<const double> input,               // 1
                                 std::span<double> output,                    // 1
                                 double init,                                 // 1
                                 std::size_t tile_count)                      // 3
{
  std::size_t const tile_size = (input.size() + tile_count - 1) / tile_count;

  std::vector<double> partials(tile_count + 1);                               // 4
  partials[0] = init;                                                         // 4

  return transfer_just(sch, std::move(partials))                              // 5
       | bulk(tile_count,                                                     // 6
           [ = ](std::size_t i, std::vector<double>& partials) {              // 7
             auto start = i * tile_size;                                      // 8
             auto end   = std::min(input.size(), (i + 1) * tile_size);        // 8
             partials[i + 1] = *--std::inclusive_scan(begin(input) + start,   // 9
                                                      begin(input) + end,     // 9
                                                      begin(output) + start); // 9
           })                                                                 // 10
       | then(                                                                // 11
           [](std::vector<double>&& partials) {
             std::inclusive_scan(begin(partials), end(partials),              // 12
                                 begin(partials));                            // 12
             return std::move(partials);                                      // 13
           })
       | bulk(tile_count,                                                     // 14
           [ = ](std::size_t i, std::vector<double>& partials) {              // 14
             auto start = i * tile_size;                                      // 14
             auto end   = std::min(input.size(), (i + 1) * tile_size);        // 14
             std::for_each(begin(output) + start, begin(output) + end,        // 14
               [&] (double& e) { e = partials[i] + e; }                       // 14
             );
           })
       | then(                                                                // 15
           [ = ](std::vector<double>&& partials) {                            // 15
             return output;                                                   // 15
           });                                                                // 15
}

This example builds an asynchronous computation of an inclusive scan:

  1. It scans a sequence of doubles (represented as the std::span<const double> input) and stores the result in another sequence of doubles (represented as std::span<double> output).

  2. It takes a scheduler, which specifies what execution resource the scan should be launched on.

  3. It also takes a tile_count parameter that controls the number of execution agents that will be spawned.

  4. First we need to allocate temporary storage needed for the algorithm, which we’ll do with a std::vector, partials. We need one double of temporary storage for each execution agent we create, plus one leading slot that holds the initial value init.

  5. Next we’ll create our initial sender with § 4.20.3 execution::transfer_just. This sender will send the temporary storage, which we’ve moved into the sender. The sender has a completion scheduler of sch, which means the next item in the chain will use sch.

  6. Senders and sender adaptors support composition via operator|, similar to C++ ranges. We’ll use operator| to attach the next piece of work, which will spawn tile_count execution agents using § 4.21.9 execution::bulk (see § 4.13 Most sender adaptors are pipeable for details).

  7. Each agent will call a std::invocable, passing it two arguments. The first is the agent’s index (i) in the § 4.21.9 execution::bulk operation, in this case a unique integer in [0, tile_count). The second argument is what the input sender sent - the temporary storage.

  8. We start by computing the start and end of the range of input and output elements that this agent is responsible for, based on our agent index.

  9. Then we do a sequential std::inclusive_scan over our elements. We store the scan result for our last element, which is the sum of all of our elements, in our temporary storage partials.

  10. After all computation in that initial § 4.21.9 execution::bulk pass has completed, every one of the spawned execution agents will have written the sum of its elements into its slot in partials.

  11. Now we need to scan all of the values in partials. We’ll do that with a single execution agent which will execute after the § 4.21.9 execution::bulk completes. We create that execution agent with § 4.21.2 execution::then.

  12. § 4.21.2 execution::then takes an input sender and a std::invocable and calls the std::invocable with the value sent by the input sender. Inside our std::invocable, we call std::inclusive_scan on partials, which the input sender will send to us.

  13. Then we return partials, which the next phase will need.

  14. Finally we do another § 4.21.9 execution::bulk of the same shape as before. In this § 4.21.9 execution::bulk, we will use the scanned values in partials to integrate the sums from other tiles into our elements, completing the inclusive scan.

  15. async_inclusive_scan returns a sender that sends the output std::span<double>. A consumer of the algorithm can chain additional work that uses the scan result. At the point at which async_inclusive_scan returns, the computation may not have completed. In fact, it may not have even started.
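To check the arithmetic of the three phases independently of any scheduler, the same algorithm can be written sequentially. The sketch below is an assumption-laden illustration, not part of the proposal: tiled_inclusive_scan is a hypothetical name, each loop over tiles stands in for one bulk pass, and empty tiles are skipped explicitly (a case the asynchronous listing does not guard against).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sequential sketch of the three-phase tiled inclusive scan.
// A tile_count of 0 is treated as 1 so that the division below is defined.
inline std::vector<double> tiled_inclusive_scan(const std::vector<double>& input,
                                                double init,
                                                std::size_t tile_count) {
  tile_count = std::max<std::size_t>(tile_count, 1);
  const std::size_t n = input.size();
  const std::size_t tile_size = (n + tile_count - 1) / tile_count;

  std::vector<double> output(n);
  std::vector<double> partials(tile_count + 1, 0.0);
  partials[0] = init;

  const double* in = input.data();
  double* out = output.data();

  // Phase 1 (first bulk): scan each tile on its own and record the tile sum.
  for (std::size_t i = 0; i < tile_count; ++i) {
    const std::size_t start = std::min(n, i * tile_size);
    const std::size_t end = std::min(n, (i + 1) * tile_size);
    if (start == end) continue;  // an empty tile contributes a sum of 0
    std::inclusive_scan(in + start, in + end, out + start);
    partials[i + 1] = out[end - 1];
  }

  // Phase 2 (then): scan the tile sums so partials[i] becomes the offset
  // of everything that precedes tile i, including init.
  std::inclusive_scan(partials.begin(), partials.end(), partials.begin());

  // Phase 3 (second bulk): add each tile's offset to its elements.
  for (std::size_t i = 0; i < tile_count; ++i) {
    const std::size_t start = std::min(n, i * tile_size);
    const std::size_t end = std::min(n, (i + 1) * tile_size);
    for (std::size_t j = start; j < end; ++j) out[j] += partials[i];
  }
  return output;
}
```

For the input 1, 2, 3, 4, 5 with init 10 and two tiles, phase 1 leaves the tile sums 6 and 9 in partials, phase 2 turns partials into 10, 16, 25, and phase 3 produces 11, 13, 16, 20, 25, which matches a plain inclusive scan seeded with 10.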

1.3.3. Asynchronous dynamically-sized read

using namespace std::execution;

sender_of<set_value_t(std::size_t)> auto async_read(                          // 1
    sender_of<set_value_t(std::span<std::byte>)> auto buffer,                 // 1
    auto handle);                                                             // 1

struct dynamic_buffer {                                                       // 3
  std::unique_ptr<std::byte[]> data;                                          // 3
  std::size_t size;                                                           // 3
};                                                                            // 3

sender_of<set_value_t(dynamic_buffer)> auto async_read_array(auto handle) {   // 2
  return just(dynamic_buffer{})                                               // 4
       | let_value([handle] (dynamic_buffer& buf) {                           // 5
           return just(std::as_writable_bytes(std::span(&buf.size, 1)))       // 6
                | async_read(handle)                                          // 7
                | then(                                                       // 8
                    [&buf] (std::size_t bytes_read) {                         // 9
                      assert(bytes_read == sizeof(buf.size));                 // 10
                      buf.data = std::make_unique<std::byte[]>(buf.size);     // 11
                      return std::span(buf.data.get(), buf.size);             // 12
                    })
                | async_read(handle)                                          // 13
                | then(
                    [&buf] (std::size_t bytes_read) {
                      assert(bytes_read == buf.size);                         // 14
                      return std::move(buf);                                  // 15
                    });
       });
}

This example demonstrates a common asynchronous I/O pattern - reading a payload of a dynamic size by first reading the size, then reading the number of bytes specified by the size:

  1. async_read is a pipeable sender adaptor. It’s a customization point object, but this is what its call signature looks like. It takes a sender parameter which must send an input buffer in the form of a std::span<std::byte>, and a handle to an I/O context. It will asynchronously read into the input buffer, up to the size of the std::span. It returns a sender which will send the number of bytes read once the read completes.

  2. async_read_array takes an I/O handle and reads a size from it, and then a buffer of that many bytes. It returns a sender that sends a dynamic_buffer object that owns the data that was sent.

  3. dynamic_buffer is an aggregate struct that contains a std::unique_ptr<std::byte[]> and a size.

  4. The first thing we do inside of async_read_array is create a sender that will send a new, empty dynamic_buffer object using § 4.20.2 execution::just. We can attach more work to the pipeline using operator| composition (see § 4.13 Most sender adaptors are pipeable for details).

  5. We need the lifetime of this dynamic_buffer object to last for the entire pipeline. So, we use let_value, which takes an input sender and a std::invocable that must return a sender itself (see § 4.21.4 execution::let_* for details). let_value sends the value from the input sender to the std::invocable. Critically, the lifetime of the sent object will last until the sender returned by the std::invocable completes.

  6. Inside of the let_value std::invocable, we have the rest of our logic. First, we want to initiate an async_read of the buffer size. To do that, we need to send a std::span pointing to buf.size. We can do that with § 4.20.2 execution::just.

  7. We chain the async_read onto the § 4.20.2 execution::just sender with operator|.

  8. Next, we pipe a std::invocable that will be invoked after the async_read completes using § 4.21.2 execution::then.

  9. That std::invocable gets sent the number of bytes read.

  10. We need to check that the number of bytes read is what we expected.

  11. Now that we have read the size of the data, we can allocate storage for it.

  12. We return a std::span<std::byte> to the storage for the data from the std::invocable. This will be sent to the next recipient in the pipeline.

  13. And that recipient will be another async_read, which will read the data.

  14. Once the data has been read, in another § 4.21.2 execution::then, we confirm that we read the right number of bytes.

  15. Finally, we move out of and return our dynamic_buffer object. It will get sent by the sender returned by async_read_array. We can attach more things to that sender to use the data in the buffer.
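The size-then-payload pattern itself is independent of asynchrony. As a rough synchronous analogue of the steps above, the sketch below reads from an in-memory byte stream; memory_stream, read_array, and read_text are hypothetical helpers introduced only for illustration, and the size header is assumed to be a native std::size_t written by the same program.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

struct dynamic_buffer {
  std::unique_ptr<std::byte[]> data;
  std::size_t size = 0;
};

// Hypothetical in-memory stand-in for the I/O handle.
struct memory_stream {
  std::vector<std::byte> bytes;
  std::size_t pos = 0;

  // Copies up to len bytes into dst and returns the number copied,
  // mirroring the "bytes read" value that async_read sends.
  std::size_t read(void* dst, std::size_t len) {
    const std::size_t n = std::min(len, bytes.size() - pos);
    if (n != 0) std::memcpy(dst, bytes.data() + pos, n);
    pos += n;
    return n;
  }
};

// Synchronous analogue of async_read_array: read the size, then the payload.
inline dynamic_buffer read_array(memory_stream& in) {
  dynamic_buffer buf;
  const std::size_t header = in.read(&buf.size, sizeof(buf.size));
  assert(header == sizeof(buf.size));                  // steps 9-10
  (void)header;
  buf.data = std::make_unique<std::byte[]>(buf.size);  // step 11
  const std::size_t body = in.read(buf.data.get(), buf.size);
  assert(body == buf.size);                            // step 14
  (void)body;
  return buf;                                          // step 15
}

// Usage helper: frames text as header plus payload, reads it back, and
// returns the payload as a string.
inline std::string read_text(const std::string& text) {
  const std::size_t size = text.size();
  memory_stream in;
  in.bytes.resize(sizeof(size) + size);
  std::memcpy(in.bytes.data(), &size, sizeof(size));
  if (size != 0) std::memcpy(in.bytes.data() + sizeof(size), text.data(), size);
  dynamic_buffer buf = read_array(in);
  return std::string(reinterpret_cast<const char*>(buf.data.get()), buf.size);
}
```

In the synchronous version buf is an ordinary local variable, so its lifetime is trivially long enough. The asynchronous version needs let_value precisely because no such enclosing stack frame survives across the two reads.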

1.4. Asynchronous Windows socket recv

To get a better feel for how this interface might be used by low-level operations, see this example implementation of a cancellable async_recv() operation for a Windows Socket.

struct operation_base : WSAOVERLAPPED {
    using completion_fn = void(operation_base* op, DWORD bytesTransferred, int errorCode) noexcept;

    // Assume IOCP event loop will call this when this OVERLAPPED structure is dequeued.
    completion_fn* completed;
};

template<typename Receiver>
struct recv_op : operation_base {
    recv_op(SOCKET s, void* data, size_t len, Receiver r)
    : receiver(std::move(r))
    , sock(s) {
        this->Internal = 0;
        this->InternalHigh = 0;
        this->Offset = 0;
        this->OffsetHigh = 0;
        this->hEvent = NULL;
        this->completed = &recv_op::on_complete;
        buffer.len = len;
        buffer.buf = static_cast<CHAR*>(data);
    }

    friend void tag_invoke(std::execution::start_t, recv_op& self) noexcept {
        // Avoid even calling WSARecv() if operation already cancelled
        auto st = std::execution::get_stop_token(
          std::get_env(self.receiver));
        if (st.stop_requested()) {
            std::execution::set_stopped(std::move(self.receiver));
            return;
        }

        // Store and cache result here in case it changes during execution
        const bool stopPossible = st.stop_possible();
        if (!stopPossible) {
            self.ready.store(true, std::memory_order_relaxed);
        }

        // Launch the operation
        DWORD bytesTransferred = 0;
        DWORD flags = 0;
        int result = WSARecv(self.sock, &self.buffer, 1, &bytesTransferred, &flags,
                             static_cast<WSAOVERLAPPED*>(&self), NULL);
        if (result == SOCKET_ERROR) {
            int errorCode = WSAGetLastError();
            if (errorCode != WSA_IO_PENDING) {
                if (errorCode == WSA_OPERATION_ABORTED) {
                    std::execution::set_stopped(std::move(self.receiver));
                } else {
                    std::execution::set_error(std::move(self.receiver),
                                              std::error_code(errorCode, std::system_category()));
                }
                return;
            }
        } else {
            // Completed synchronously (assuming FILE_SKIP_COMPLETION_PORT_ON_SUCCESS has been set)
            std::execution::set_value(std::move(self.receiver), bytesTransferred);
            return;
        }

        // If we get here then operation has launched successfully and will complete asynchronously.
        // May be completing concurrently on another thread already.
        if (stopPossible) {
            // Register the stop callback
            self.stopCallback.emplace(std::move(st), cancel_cb{self});

            // Mark as 'completed'
            if (self.ready.load(std::memory_order_acquire) ||
                self.ready.exchange(true, std::memory_order_acq_rel)) {
                // Already completed on another thread
                self.stopCallback.reset();

                BOOL ok = WSAGetOverlappedResult(self.sock, (WSAOVERLAPPED*)&self, &bytesTransferred, FALSE, &flags);
                if (ok) {
                    std::execution::set_value(std::move(self.receiver), bytesTransferred);
                } else {
                    int errorCode = WSAGetLastError();
                    std::execution::set_error(std::move(self.receiver),
                                              std::error_code(errorCode, std::system_category()));
                }
            }
        }
    }

    struct cancel_cb {
        recv_op& op;

        void operator()() noexcept {
            CancelIoEx((HANDLE)op.sock, (OVERLAPPED*)(WSAOVERLAPPED*)&op);
        }
    };

    static void on_complete(operation_base* op, DWORD bytesTransferred, int errorCode) noexcept {
        recv_op& self = *static_cast<recv_op*>(op);

        if (self.ready.load(std::memory_order_acquire) ||
            self.ready.exchange(true, std::memory_order_acq_rel)) {
            // Unsubscribe any stop-callback so we know that CancelIoEx() is not accessing 'op'
            // any more
            self.stopCallback.reset();

            if (errorCode == 0) {
                std::execution::set_value(std::move(self.receiver), bytesTransferred);
            } else {
                std::execution::set_error(std::move(self.receiver),
                                          std::error_code(errorCode, std::system_category()));
            }
        }
    }

    Receiver receiver;
    SOCKET sock;
    WSABUF buffer;
    std::optional<typename stop_callback_type_t<Receiver>
        ::template callback_type<cancel_cb>> stopCallback;
    std::atomic<bool> ready{false};
};

struct recv_sender {
    using is_sender = void;
    SOCKET sock;
    void* data;
    size_t len;

    template<typename Receiver>
    friend recv_op<Receiver> tag_invoke(std::execution::connect_t,
                                        const recv_sender& s,
                                        Receiver r) {
        return recv_op<Receiver>{s.sock, s.data, s.len, std::move(r)};
    }
};

recv_sender async_recv(SOCKET s, void* data, size_t len) {
    return recv_sender{s, data, len};
}
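The ready flag in recv_op acts as a two-party handshake between start() and on_complete(): whichever of the two reaches the flag second is the one that delivers the completion. The portable sketch below isolates just that handshake, with no Windows APIs involved; arrive and deliveries are hypothetical names used for illustration.

```cpp
#include <atomic>
#include <cassert>

// Returns true when the caller is the second party to arrive and must
// therefore deliver the completion. The first party to arrive flips the
// flag and gets false back.
inline bool arrive(std::atomic<bool>& ready) noexcept {
  return ready.load(std::memory_order_acquire) ||
         ready.exchange(true, std::memory_order_acq_rel);
}

// Simulates start() and on_complete() arriving one after the other and
// counts how many of them would deliver a completion.
inline int deliveries(bool stop_possible) {
  std::atomic<bool> ready{false};
  int count = 0;
  if (!stop_possible) {
    // start() pre-arms the flag and never arrives itself, so
    // on_complete() is always the one to deliver.
    ready.store(true, std::memory_order_relaxed);
  } else if (arrive(ready)) {
    ++count;  // start() arriving after it has registered the stop callback
  }
  if (arrive(ready)) ++count;  // on_complete() arriving
  return count;
}
```

In both configurations exactly one party delivers, which is what lets the receiver be completed exactly once even when the I/O completion races with the registration of the stop callback.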

1.4.1. More end-user examples

1.4.1.1. Sudoku solver

This example comes from Kirk Shoop, who ported an example from TBB’s documentation to sender/receiver in his fork of the libunifex repo. It is a Sudoku solver that uses a configurable number of threads to explore the search space for solutions.

The sender/receiver-based Sudoku solver can be found here. Some things that are worth noting about Kirk’s solution:

  1. Although it schedules asynchronous work onto a thread pool, and each unit of work will schedule more work, its use of structured concurrency patterns makes reference counting unnecessary. The solution does not make use of shared_ptr.

  2. In addition to eliminating the need for reference counting, the use of structured concurrency makes it easy to ensure that resources are cleaned up on all code paths. In contrast, the TBB example that inspired this one leaks memory.

For comparison, the TBB-based Sudoku solver can be found here.

1.4.1.2. File copy

This example also comes from Kirk Shoop. It uses sender/receiver to recursively copy the files in a directory tree. It demonstrates how sender/receiver can be used to do IO, using a scheduler that schedules work on Linux’s io_uring.

As with the Sudoku example, this example obviates the need for reference counting by employing structured concurrency. It uses iteration with an upper limit to avoid having too many open file handles.

You can find the example here.

1.4.1.3. Echo server

Dietmar Kuehl has a hobby project that implements networking APIs on top of sender/receiver. He recently implemented an echo server as a demo. His echo server code can be found here.

Below is part of the echo server code. This code is executed for each client that connects to the echo server. In a loop, it reads input from a socket and echoes the input back to the same socket. All of this, including the loop, is implemented with generic async algorithms.

outstanding.start(
    EX::repeat_effect_until(
          EX::let_value(
              NN::async_read_some(ptr->d_socket,
                                  context.scheduler(),
                                  NN::buffer(ptr->d_buffer))
        | EX::then([ptr](::std::size_t n){
            ::std::cout << "read='" << ::std::string_view(ptr->d_buffer, n) << "'\n";
            ptr->d_done = n == 0;
            return n;
        }),
          [&context, ptr](::std::size_t n){
            return NN::async_write_some(ptr->d_socket,
                                        context.scheduler(),
                                        NN::buffer(ptr->d_buffer, n));
          })
        | EX::then([](auto&&...){})
        , [owner = ::std::move(owner)]{ return owner->d_done; }
    )
);

In this code, NN::async_read_some and NN::async_write_some are asynchronous socket-based networking APIs that return senders. EX::repeat_effect_until, EX::let_value, and EX::then are fully generic sender adaptor algorithms that accept and return senders.

This is a good example of seamless composition of async IO functions with non-IO operations. And by composing the senders in this structured way, all the state for the composite operation -- the repeat_effect_until expression and all its child operations -- is stored all together in a single object.

1.5. Examples: Algorithms

In this section we show a few simple sender/receiver-based algorithm implementations.

1.5.1. then

namespace exec = std::execution;

template<class R, class F>
class _then_receiver
    : exec::receiver_adaptor<_then_receiver<R, F>, R> {
  friend exec::receiver_adaptor<_then_receiver, R>;
  F f_;

  // Customize set_value by invoking the callable and passing the result to the inner receiver
  template<class... As>
  void set_value(As&&... as) && noexcept try {
    exec::set_value(std::move(*this).base(), std::invoke((F&&) f_, (As&&) as...));
  } catch(...) {
    exec::set_error(std::move(*this).base(), std::current_exception());
  }

 public:
  _then_receiver(R r, F f)
   : exec::receiver_adaptor<_then_receiver, R>{std::move(r)}
   , f_(std::move(f)) {}
};

template<exec::sender S, class F>
struct _then_sender {
  using is_sender = void;
  S s_;
  F f_;

  template <class... Args>
    using _set_value_t = exec::completion_signatures<
      exec::set_value_t(std::invoke_result_t<F, Args...>)>;

  // Compute the completion signatures
  template<class Env>
  friend auto tag_invoke(exec::get_completion_signatures_t, _then_sender&&, Env)
    -> exec::make_completion_signatures<S, Env,
        exec::completion_signatures<exec::set_error_t(std::exception_ptr)>,
        _set_value_t>;

  // Connect:
  template<exec::receiver R>
  friend auto tag_invoke(exec::connect_t, _then_sender&& self, R r)
    -> exec::connect_result_t<S, _then_receiver<R, F>> {
      return exec::connect(
        (S&&) self.s_, _then_receiver<R, F>{(R&&) r, (F&&) self.f_});
  }

  friend decltype(auto) tag_invoke(exec::get_env_t, const _then_sender& self) noexcept {
    return exec::get_env(self.s_);
  }
};

template<exec::sender S, class F>
exec::sender auto then(S s, F f) {
  return _then_sender<S, F>{(S&&) s, (F&&) f};
}

This code builds a then algorithm that transforms the value(s) from the input sender with a transformation function. The result of the transformation becomes the new value. The other receiver functions (set_error and set_stopped), as well as all receiver queries, are passed through unchanged.

In detail, it does the following:

  1. Defines a receiver in terms of execution::receiver_adaptor that aggregates another receiver and an invocable that:

    • Defines a constrained tag_invoke overload for transforming the value channel.

    • Defines another constrained overload of tag_invoke that passes all other customizations through unchanged.

    The tag_invoke overloads are actually implemented by execution::receiver_adaptor; they dispatch either to named members, as shown above with _then_receiver::set_value, or to the adapted receiver.

  2. Defines a sender that aggregates another sender and the invocable, which defines a tag_invoke customization for std::execution::connect that wraps the incoming receiver in the receiver from (1) and passes it and the incoming sender to std::execution::connect, returning the result. It also defines a tag_invoke customization of get_completion_signatures that declares the sender’s completion signatures when executed within a particular environment.

1.5.2. retry

using namespace std;
namespace exec = execution;

template <class From, class To>
concept _decays_to = same_as<decay_t<From>, To>;

// _conv needed so we can emplace construct non-movable types into
// a std::optional.
template<invocable F>
  requires is_nothrow_move_constructible_v<F>
struct _conv {
  F f_;
  explicit _conv(F f) noexcept : f_((F&&) f) {}
  operator invoke_result_t<F>() && {
    return ((F&&) f_)();
  }
};

template<class S, class R>
struct _op;

// pass through all customizations except set_error, which retries the operation.
template<class S, class R>
struct _retry_receiver
  : exec::receiver_adaptor<_retry_receiver<S, R>> {
  _op<S, R>* o_;

  R&& base() && noexcept { return (R&&) o_->r_; }
  const R& base() const & noexcept { return o_->r_; }

  explicit _retry_receiver(_op<S, R>* o) : o_(o) {}

  void set_error(auto&&) && noexcept {
    o_->_retry(); // This causes the op to be retried
  }
};

// Hold the nested operation state in an optional so we can
// re-construct and re-start it if the operation fails.
template<class S, class R>
struct _op {
  S s_;
  R r_;
  optional<
      exec::connect_result_t<S&, _retry_receiver<S, R>>> o_;

  _op(S s, R r): s_((S&&)s), r_((R&&)r), o_{_connect()} {}
  _op(_op&&) = delete;

  auto _connect() noexcept {
    return _conv{[this] {
      return exec::connect(s_, _retry_receiver<S, R>{this});
    }};
  }
  void _retry() noexcept try {
    o_.emplace(_connect()); // potentially-throwing
    exec::start(*o_);
  } catch(...) {
    exec::set_error((R&&) r_, std::current_exception());
  }
  friend void tag_invoke(exec::start_t, _op& o) noexcept {
    exec::start(*o.o_);
  }
};

template<class S>
struct _retry_sender {
  using is_sender = void;
  S s_;
  explicit _retry_sender(S s) : s_((S&&) s) {}

  template <class... Ts>
    using _value_t =
      exec::completion_signatures<exec::set_value_t(Ts...)>;
  template <class>
    using _error_t = exec::completion_signatures<>;

  // Declare the signatures with which this sender can complete
  template <class Env>
  friend auto tag_invoke(exec::get_completion_signatures_t, const _retry_sender&, Env)
    -> exec::make_completion_signatures<S&, Env,
        exec::completion_signatures<exec::set_error_t(std::exception_ptr)>,
        _value_t, _error_t>;

  template<exec::receiver R>
  friend _op<S, R> tag_invoke(exec::connect_t, _retry_sender&& self, R r) {
    return {(S&&) self.s_, (R&&) r};
  }

  friend decltype(auto) tag_invoke(exec::get_env_t, const _retry_sender& self) noexcept {
    return exec::get_env(self.s_);
  }
};

template<exec::sender S>
exec::sender auto retry(S s) {
  return _retry_sender{(S&&) s};
}

The retry algorithm takes a multi-shot sender and causes it to repeat on error, passing through values and stopped signals. Each time the input sender is restarted, a new receiver is connected and the resulting operation state is stored in an optional, which allows us to reinitialize it multiple times.

This example does the following:

  1. Defines a _conv utility that takes advantage of C++17’s guaranteed copy elision to emplace a non-movable type in a std::optional.

  2. Defines a _retry_receiver that holds a pointer back to the operation state. It passes all customizations through unmodified to the inner receiver owned by the operation state except for set_error, which causes a _retry() function to be called instead.

  3. Defines an operation state that aggregates the input sender and receiver, and declares storage for the nested operation state in an optional. Constructing the operation state constructs a _retry_receiver with a pointer to the (under construction) operation state and uses it to connect to the aggregated sender.

  4. Starting the operation state dispatches to start on the inner operation state.

  5. The _retry() function reinitializes the inner operation state by connecting the sender to a new receiver, holding a pointer back to the outer operation state as before.

  6. After reinitializing the inner operation state, _retry() calls start on it, causing the failed operation to be rescheduled.

  7. Defines a _retry_sender that implements the connect customization point to return an operation state constructed from the passed-in sender and receiver.

  8. _retry_sender also implements the get_completion_signatures customization point to describe the ways this sender may complete when executed in a particular execution resource.
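The _conv trick from step 1 can be tried in isolation with the standard library alone. In the sketch below, pinned is a hypothetical non-movable type standing in for an operation state, and conv mirrors _conv. It assumes a C++17 compiler that elides the copy when a conversion function returns a prvalue of the target type, which is the same behaviour the listing above depends on.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical non-movable type standing in for an operation state.
struct pinned {
  int value;
  explicit pinned(int v) : value(v) {}
  pinned(pinned&&) = delete;  // deleting the move also suppresses the copy
};
static_assert(!std::is_move_constructible_v<pinned>);

// Wraps a callable so that converting the wrapper to the callable's result
// type invokes the callable. The result is materialized directly in the
// destination, so no move of the result is needed.
template <class F>
struct conv {
  F f_;
  explicit conv(F f) noexcept : f_(std::move(f)) {}
  operator std::invoke_result_t<F>() && { return std::move(f_)(); }
};

// Emplaces a pinned object twice, as _retry() does with the nested
// operation state, and returns the value held after the second emplace.
inline int emplace_twice() {
  std::optional<pinned> slot;
  slot.emplace(conv{[] { return pinned{1}; }});
  slot.emplace(conv{[] { return pinned{2}; }});  // destroys the first object
  return slot->value;
}
```

Passing pinned{1} to emplace directly would not compile, because that would require moving a temporary into the optional. Passing the wrapper defers construction until the optional's storage is the destination.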

1.6. Examples: Schedulers

In this section we look at some schedulers of varying complexity.

1.6.1. Inline scheduler

class inline_scheduler {
  template <class R>
    struct _op {
      [[no_unique_address]] R rec_;
      friend void tag_invoke(std::execution::start_t, _op& op) noexcept {
        std::execution::set_value((R&&) op.rec_);
      }
    };

  struct _env {
    template <class Tag>
      friend inline_scheduler tag_invoke(
          std::execution::get_completion_scheduler_t<Tag>, _env) noexcept {
        return {};
      }
  };

  struct _sender {
    using is_sender = void;
    using completion_signatures =
      std::execution::completion_signatures<std::execution::set_value_t()>;

    template <class R>
      friend auto tag_invoke(std::execution::connect_t, _sender, R&& rec)
        noexcept(std::is_nothrow_constructible_v<std::remove_cvref_t<R>, R>)
        -> _op<std::remove_cvref_t<R>> {
        return {(R&&) rec};
      }

    friend _env tag_invoke(std::execution::get_env_t, _sender) noexcept {
      return {};
    }
  };

  friend _sender tag_invoke(std::execution::schedule_t, const inline_scheduler&) noexcept {
    return {};
  }

 public:
  inline_scheduler() = default;
  bool operator==(const inline_scheduler&) const noexcept = default;
};

The inline scheduler is a trivial scheduler that completes immediately and synchronously on the thread that calls std::execution::start on the operation state produced by its sender. In other words, start(connect(schedule(inline-scheduler), receiver)) is just a fancy way of saying set_value(receiver), except that start must be passed an lvalue.

Although not a particularly useful scheduler, it serves to illustrate the basics of implementing one. The inline_scheduler:

  1. Customizes execution::schedule to return an instance of the sender type _sender.

  2. The _sender type models the sender concept and provides the metadata needed to describe it as a sender of no values and that never calls set_error or set_stopped. This metadata is provided with the help of the execution::completion_signatures utility.

  3. The _sender type customizes execution::connect to accept a receiver of no values. It returns an instance of type _op that holds the receiver by value.

  4. The operation state customizes std::execution::start to call std::execution::set_value on the receiver.
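The equivalence between start(connect(schedule(inline-scheduler), receiver)) and set_value(receiver) can be checked with a stripped-down model. The sketch below is not the proposed API: it replaces the tag_invoke customization points with plain member functions, and the names toy_inline_scheduler, recording_receiver, and inline_completion_is_synchronous are hypothetical.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical receiver that records whether and where it was completed.
struct recording_receiver {
  bool* called;
  std::thread::id* where;
  void set_value() && noexcept {
    *called = true;
    *where = std::this_thread::get_id();
  }
};

// Toy stand-in for inline_scheduler; member functions replace tag_invoke.
class toy_inline_scheduler {
  template <class R>
  struct op {
    R rec_;
    // Completes immediately on the thread that calls start().
    void start() & noexcept { std::move(rec_).set_value(); }
  };

  struct sender {
    template <class R>
    op<R> connect(R rec) const { return op<R>{std::move(rec)}; }
  };

 public:
  sender schedule() const noexcept { return {}; }
};

// Returns true if the receiver was completed inside start(), on the
// thread that called start().
inline bool inline_completion_is_synchronous() {
  bool called = false;
  std::thread::id where{};
  toy_inline_scheduler sch;
  auto op = sch.schedule().connect(recording_receiver{&called, &where});
  assert(!called);  // connecting alone does not run anything
  op.start();       // the operation state is an lvalue here
  return called && where == std::this_thread::get_id();
}
```

In this model, nothing happens at connect time; the receiver is completed only when start is called, and on the calling thread.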

1.6.2. Single thread scheduler

This example shows how to create a scheduler for an execution resource that consists of a single thread. It is implemented in terms of a lower-level execution resource called std::execution::run_loop.

class single_thread_context {
  std::execution::run_loop loop_;
  std::thread thread_;

public:
  single_thread_context()
    : loop_()
    , thread_([this] { loop_.run(); })
  {}

  ~single_thread_context() {
    loop_.finish();
    thread_.join();
  }

  auto get_scheduler() noexcept {
    return loop_.get_scheduler();
  }

  std::thread::id get_thread_id() const noexcept {
    return thread_.get_id();
  }
};

The single_thread_context owns an event loop and a thread to drive it. In the destructor, it tells the event loop to finish up what it’s doing and then joins the thread, blocking until the event loop has drained.

The interesting bits are in the execution::run_loop context implementation. It is slightly too long to include here, so we only provide a reference to it, but there is one noteworthy detail about its implementation: It uses space in its operation states to build an intrusive linked list of work items. In structured concurrency patterns, the operation states of nested operations compose statically, and in an algorithm like this_thread::sync_wait, the composite operation state lives on the stack for the duration of the operation. The end result is that work can be scheduled onto this thread with zero allocations.
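To make the intrusive-list point concrete, here is a minimal single-threaded sketch. It is not the run_loop implementation referenced above: toy_loop, task_base, and record_op are hypothetical names, and a real run_loop additionally synchronizes producers with the consumer thread.

```cpp
#include <cassert>
#include <vector>

// Every work item embeds its own queue link, so enqueuing allocates nothing.
struct task_base {
  task_base* next = nullptr;
  void (*execute)(task_base*) = nullptr;
};

// Hypothetical FIFO loop; single-threaded to keep the sketch short.
class toy_loop {
  task_base* head_ = nullptr;
  task_base* tail_ = nullptr;

 public:
  void push_back(task_base* t) {
    assert(t != nullptr);
    t->next = nullptr;
    if (tail_) tail_->next = t; else head_ = t;
    tail_ = t;
  }

  // Runs queued items in FIFO order until the queue is empty.
  void run() {
    while (task_base* t = head_) {
      head_ = t->next;
      if (!head_) tail_ = nullptr;
      t->execute(t);
    }
  }
};

// A toy operation state: the operation state *is* the queue node.
struct record_op : task_base {
  std::vector<int>* log;
  int id;
  record_op(std::vector<int>* l, int i) : log(l), id(i) {
    execute = [](task_base* self) {
      auto* op = static_cast<record_op*>(self);
      op->log->push_back(op->id);
    };
  }
  void start(toy_loop& loop) { loop.push_back(this); }
};

std::vector<int> run_three_items() {
  std::vector<int> log;  // the log allocates; the queue itself does not
  log.reserve(3);
  toy_loop loop;
  record_op a{&log, 1}, b{&log, 2}, c{&log, 3};  // all live on the stack
  a.start(loop);
  b.start(loop);
  c.start(loop);
  loop.run();
  return log;
}
```

Because each operation state carries its own next pointer, the queue needs no nodes of its own, which is how scheduling can avoid allocation when the operation states already live on the stack.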

1.7. Examples: Server theme

In this section we look at some examples of how one would use senders to implement an HTTP server. The examples ignore the low-level details of the HTTP server and look at how senders can be combined to achieve the goals of the project.

General application context:

1.7.1. Composability with execution::let_*

Example context:

Goals:

namespace ex = std::execution;

// Returns a sender that yields an http_request object for an incoming request
ex::sender auto schedule_request_start(read_requests_ctx ctx) {...}
// Sends a response back to the client; yields a void signal on success
ex::sender auto send_response(const http_response& resp) {...}
// Validate that the HTTP request is well-formed; forwards the request on success
ex::sender auto validate_request(const http_request& req) {...}

// Handle the request; main application logic
ex::sender auto handle_request(const http_request& req) {
  //...
  return ex::just(http_response{200, result_body});
}

// Transforms server errors into responses to be sent to the client
ex::sender auto error_to_response(std::exception_ptr err) {
  try {
    std::rethrow_exception(err);
  } catch (const std::invalid_argument& e) {
    return ex::just(http_response{404, e.what()});
  } catch (const std::exception& e) {
    return ex::just(http_response{500, e.what()});
  } catch (...) {
    return ex::just(http_response{500, "Unknown server error"});
  }
}
// Transforms cancellation of the server into responses to be sent to the client
ex::sender auto stopped_to_response() {
  return ex::just(http_response{503, "Service temporarily unavailable"});
}
//...
// The whole flow for transforming incoming requests into responses
ex::sender auto snd =
    // get a sender when a new request comes
    schedule_request_start(the_read_requests_ctx)
    // make sure the request is valid; throw if not
    | ex::let_value(validate_request)
    // process the request in a function that may be using a different execution resource
    | ex::let_value(handle_request)
    // If there are errors transform them into proper responses
    | ex::let_error(error_to_response)
    // If the flow is cancelled, send back a proper response
    | ex::let_stopped(stopped_to_response)
    // write the result back to the client
    | ex::let_value(send_response)
    // done
    ;
// execute the whole flow asynchronously
ex::start_detached(std::move(snd));

The example shows how one can separate out the concerns for interpreting requests, validating requests, running the main logic for handling the request, generating error responses, handling cancellation and sending the response back to the client. They are all different phases in the application, and can be joined together with the let_* functions.

All our functions return execution::sender objects, so that they can all generate success, failure and cancellation paths. For example, regardless of where an error is generated (reading the request, validating the request or handling the request), we would have one common block to handle the error, and following error flows is easy.

Also, because every step uses execution::sender objects, any of these steps may be completely asynchronous; the overall flow doesn’t care. Regardless of the execution resource on which the steps, or parts of the steps, are executed, the flow is still the same.

1.7.2. Moving between execution resources with execution::on and execution::transfer

Example context:

Goals:

namespace ex = std::execution;

size_t legacy_read_from_socket(int sock, char* buffer, size_t buffer_len) {}
void process_read_data(const char* read_data, size_t read_len) {}
//...

// A sender that just calls the legacy read function
auto snd_read = ex::just(sock, buf, buf_len) | ex::then(legacy_read_from_socket);
// The entire flow
auto snd =
    // start by reading data on the I/O thread
    ex::on(io_sched, std::move(snd_read))
    // do the processing on the worker threads pool
    | ex::transfer(work_sched)
    // process the incoming data (on worker threads)
    | ex::then([buf](size_t read_len) { process_read_data(buf, read_len); })
    // done
    ;
// execute the whole flow asynchronously
ex::start_detached(std::move(snd));

The example assumes that we need to wrap some legacy code that reads from sockets, and handle execution resource switching. (This style of reading from a socket may not be the most efficient one, but it works for our purposes.) For performance reasons, the reading from the socket needs to be done on the I/O thread, and all the processing needs to happen on a work-specific execution resource (i.e., thread pool).

Calling execution::on will ensure that the given sender will be started on the given scheduler. In our example, snd_read is going to be started on the I/O scheduler. This sender will just call the legacy code.

The completion-signal will be issued in the I/O execution resource, so we have to move it to the work thread pool. This is achieved with the help of the execution::transfer algorithm. The rest of the processing (in our case, the last call to then) will happen in the work thread pool.

The reader should notice the difference between execution::on and execution::transfer. The execution::on algorithm will ensure that the given sender will start in the specified context, and doesn’t care where the completion-signal for that sender is sent. The execution::transfer algorithm will not care where the given sender is going to be started, but will ensure that the completion-signal of that sender will be transferred to the given context.
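The difference can be simulated without any threads. The sketch below is a toy model, not the proposed algorithms: a "context" is only a string label, a sender is a callable that takes a continuation, and all names in namespace ctx_toy are hypothetical. It records the label that is current when each step runs.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace ctx_toy {
// The label of the context that is "current"; stands in for a real thread.
inline std::string& current_context() {
  static std::string name = "main";
  return name;
}

// Runs f with ctx as the current label, then restores the previous label.
template <class F>
void run_in(const std::string& ctx, F f) {
  assert(!ctx.empty());
  std::string saved = current_context();
  current_context() = ctx;
  f();
  current_context() = saved;
}

// on: start the sender s inside ctx; the completion goes wherever s sends it.
template <class S>
auto on(std::string ctx, S s) {
  return [=](auto k) { run_in(ctx, [&] { s(k); }); };
}

// transfer: start s wherever we are; deliver its completion inside ctx.
template <class S>
auto transfer(S s, std::string ctx) {
  return [=](auto k) { s([=](int v) { run_in(ctx, [&] { k(v); }); }); };
}

// then: run f wherever the predecessor completes.
template <class S, class F>
auto then(S s, F f) {
  return [=](auto k) { s([=](int v) { k(f(v)); }); };
}

std::vector<std::string> where_steps_ran(bool with_transfer) {
  std::vector<std::string> log;
  auto read = [&log](auto k) {
    log.push_back("read@" + current_context());
    k(5);
  };
  auto process = [&log](int n) {
    log.push_back("process@" + current_context());
    return n;
  };
  auto started = on("io", read);
  if (with_transfer)
    then(transfer(started, "work"), process)([](int) {});
  else
    then(started, process)([](int) {});
  return log;
}
}  // namespace ctx_toy
```

In this model, the read step always runs under the "io" label because of on. The process step runs under "work" only when transfer is present; without it, the continuation simply runs where the read completed.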

1.8. What this proposal is not

This paper is not a patch on top of A Unified Executors Proposal for C++; we are not asking to update the existing paper, we are asking to retire it in favor of this paper, which is already self-contained; any example code within this paper can be written in Standard C++, without the need to standardize any further facilities.

This paper is not an alternative design to A Unified Executors Proposal for C++; rather, we have taken the design in the current executors paper, and applied targeted fixes to allow it to fulfill the promises of the sender/receiver model, as well as provide all the facilities we consider essential when writing user code using standard execution concepts; we have also applied the guidance of removing one-way executors from the paper entirely, and instead provided an algorithm based around senders that serves the same purpose.

1.9. Design changes from P0443

  1. The executor concept has been removed and all of its proposed functionality is now based on schedulers and senders, as per SG1 direction.

  2. Properties are not included in this paper. We see them as a possible future extension, if the committee gets more comfortable with them.

  3. Senders now advertise what scheduler, if any, their evaluation will complete on.

  4. The places of execution of user code in P0443 weren’t precisely defined, whereas they are in this paper. See § 4.5 Senders can propagate completion schedulers.

  5. P0443 did not propose a suite of sender algorithms necessary for writing sender code; this paper does. See § 4.20 User-facing sender factories, § 4.21 User-facing sender adaptors, and § 4.22 User-facing sender consumers.

  6. P0443 did not specify the semantics of variously qualified connect overloads; this paper does. See § 4.7 Senders can be either multi-shot or single-shot.

  7. This paper extends the sender traits/typed sender design to support typed senders whose value/error types depend on type information provided late via the receiver.

  8. Support for untyped senders is dropped; the typed_sender concept is renamed sender; sender_traits is replaced with completion_signatures_of_t.

  9. Specific type erasure facilities are omitted, as per LEWG direction. Type erasure facilities can be built on top of this proposal, as discussed in § 5.9 Ranges-style CPOs vs tag_invoke.

  10. A specific thread pool implementation is omitted, as per LEWG direction.

  11. Some additional utilities are added:

    • run_loop: An execution resource that provides a multi-producer, single-consumer, first-in-first-out work queue.

    • receiver_adaptor: A utility for algorithm authors for defining one receiver type in terms of another.

    • completion_signatures and make_completion_signatures: Utilities for describing the ways in which a sender can complete in a declarative syntax.

1.10. Prior art

This proposal builds upon and learns from years of prior art with asynchronous and parallel programming frameworks in C++. In this section, we discuss async abstractions that have previously been suggested as a possible basis for asynchronous algorithms and why they fall short.

1.10.1. Futures

A future is a handle to work that has already been scheduled for execution. It is one end of a communication channel; the other end is a promise, used to receive the result from the concurrent operation and to communicate it to the future.

Futures, as traditionally realized, require the dynamic allocation and management of a shared state, synchronization, and typically type-erasure of work and continuation. Many of these costs are inherent in the nature of "future" as a handle to work that is already scheduled for execution. These expenses rule out the future abstraction for many uses and make it a poor choice for a basis of a generic mechanism.
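The promise/future channel described above can be seen with the existing standard library facilities. The following is a small illustration using std::promise, std::future, and std::thread; the function name promise_future_roundtrip is ours.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// The producer writes through the promise; the consumer reads through the
// future. Both are handles to one shared state.
int promise_future_roundtrip() {
  std::promise<int> p;                  // creates the shared state
  std::future<int> f = p.get_future();  // second handle to the same state
  assert(f.valid());

  // The work is launched eagerly: it may already be running, or even
  // finished, by the time the consumer looks at the future.
  std::thread producer([pr = std::move(p)]() mutable { pr.set_value(42); });

  int result = f.get();  // blocks until set_value has made the result ready
  producer.join();
  return result;
}
```

Every round trip pays for a shared state and for synchronization between the two ends, whether or not the caller needs either.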

1.10.2. Coroutines

C++20 coroutines are frequently suggested as a basis for asynchronous algorithms. It’s fair to ask why, if we added coroutines to C++, are we suggesting the addition of a library-based abstraction for asynchrony. Certainly, coroutines come with huge syntactic and semantic advantages over the alternatives.

Although coroutines are lighter weight than futures, coroutines suffer many of the same problems. Since they typically start suspended, they can avoid synchronizing the chaining of dependent work. However, in many cases, coroutine frames require an unavoidable dynamic allocation and indirect function calls. This is done to hide the layout of the coroutine frame from the C++ type system, which in turn makes possible the separate compilation of coroutines and certain compiler optimizations, such as optimization of the coroutine frame size.

Those advantages come at a cost, though. Because of the dynamic allocation of coroutine frames, coroutines in embedded or heterogeneous environments, which often lack support for dynamic allocation, require great attention to detail. And the allocations and indirections tend to complicate the job of the inliner, often resulting in sub-optimal codegen.

The coroutine language feature mitigates these shortcomings somewhat with the HALO optimization (see Halo: coroutine Heap Allocation eLision Optimization: the joint response), which leverages existing compiler optimizations such as allocation elision and devirtualization to inline the coroutine, completely eliminating the runtime overhead. However, HALO requires a sophisticated compiler, and a fair number of stars need to align for the optimization to kick in. In our experience, more often than not in real-world code, today’s compilers are not able to inline the coroutine, resulting in allocations and indirections in the generated code.

In a suite of generic async algorithms that are expected to be callable from hot code paths, the extra allocations and indirections are a deal-breaker. It is for these reasons that we consider coroutines a poor choice for a basis of all standard async.

1.10.3. Callbacks

Callbacks are the oldest, simplest, most powerful, and most efficient mechanism for creating chains of work, but suffer problems of their own. Callbacks must propagate either errors or values. This simple requirement yields many different interface possibilities. The lack of a standard callback shape obstructs generic design.

Additionally, few of these possibilities accommodate cancellation signals when the user requests upstream work to stop and clean up.
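As an illustration of the missing standard shape, the sketch below shows two plausible callback conventions for the same operation and the adapters needed to bring them to one common form. All names (read_a, read_b, outcome, from_a, from_b) are hypothetical, and the three-channel outcome type only mirrors set_value, set_error, and set_stopped in spirit.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <system_error>

// Shape A (hypothetical): one callback, error code first, value second.
void read_a(bool fail, const std::function<void(std::error_code, int)>& cb) {
  assert(cb);
  if (fail) cb(std::make_error_code(std::errc::io_error), 0);
  else cb(std::error_code{}, 7);
}

// Shape B (hypothetical): separate success and failure callbacks.
void read_b(bool fail, const std::function<void(int)>& on_value,
            const std::function<void(std::string)>& on_error) {
  if (fail) on_error("io error");
  else on_value(7);
}

// A common form with distinct value, error, and stopped channels.
struct outcome {
  int value = 0;
  bool failed = false;
  bool stopped = false;
  void set_value(int v) { value = v; }
  void set_error() { failed = true; }
  void set_stopped() { stopped = true; }
};

// Each callback shape needs its own hand-written adapter. Neither original
// shape offers any way to report cancellation, so stopped is never set.
outcome from_a(bool fail) {
  outcome o;
  read_a(fail, [&o](std::error_code ec, int v) {
    if (ec) o.set_error(); else o.set_value(v);
  });
  return o;
}

outcome from_b(bool fail) {
  outcome o;
  read_b(fail, [&o](int v) { o.set_value(v); },
         [&o](std::string) { o.set_error(); });
  return o;
}
```

A generic algorithm written against shape A cannot consume shape B without such an adapter, which is the obstruction to generic design noted above.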

1.11. Field experience

1.11.1. libunifex

This proposal draws heavily from our field experience with libunifex. Libunifex implements all of the concepts and customization points defined in this paper (with slight variations -- the design of P2300 has evolved due to LEWG feedback), many of this paper’s algorithms (some under different names), and much more besides.

Libunifex has several concrete schedulers in addition to the run_loop suggested here (where it is called manual_event_loop). It has schedulers that dispatch efficiently to epoll and io_uring on Linux and the Windows Thread Pool on Windows.

In addition to the proposed interfaces and the additional schedulers, it has several important extensions to the facilities described in this paper, which demonstrate directions in which these abstractions may be evolved over time, including:

Libunifex has seen heavy production use at Facebook. As of October 2021, it is currently used in production within the following applications and platforms:

All of these applications are making direct use of the sender/receiver abstraction as presented in this paper. One product (Instagram on iOS) is making use of the sender/coroutine integration as presented. The monthly active users of these products number in the billions.

1.11.2. Other implementations

The authors are aware of a number of other implementations of sender/receiver from this paper. These are presented here in perceived order of maturity and field experience.

1.11.3. Inspirations

This proposal also draws heavily from our experience with Thrust and Agency. It is also inspired by the needs of countless other C++ frameworks for asynchrony, parallelism, and concurrency, including:

2. Revision history

2.1. R7

The changes since R6 are as follows:

Fixes:

Enhancements:

2.2. R6

The changes since R5 are as follows:

Fixes:

Enhancements:

2.2.1. Environments and attributes

In earlier revisions, receivers, senders, and schedulers all were directly queryable. In R4, receiver queries were moved into a separate "environment" object, obtainable from a receiver with a get_env accessor. In R6, the sender queries are given similar treatment, relocating to an "attributes" object obtainable from a sender with a get_attrs accessor. This was done to solve a number of design problems with the split and ensure_started algorithms; _e.g._, see NVIDIA/stdexec#466.

Schedulers, however, remain directly queryable. As lightweight handles that are required to be movable and copyable, there is little reason to want to dispose of a scheduler and yet persist the scheduler’s queries.

This revision also makes operation states directly queryable, even though there isn’t yet a use for such. Some early prototypes of cooperative bulk parallel sender algorithms done at NVIDIA suggest the utility of forwardable operation state queries. The authors chose to make opstates directly queryable since the opstate object is itself required to be kept alive for the duration of asynchronous operation.

2.3. R5

The changes since R4 are as follows:

Fixes:

Enhancements:

2.4. R4

The changes since R3 are as follows:

Fixes:

Enhancements:

2.4.1. Dependently-typed senders

Background:

In the sender/receiver model, as with coroutines, contextual information about the current execution is most naturally propagated from the consumer to the producer. In coroutines, that means information like stop tokens, allocators and schedulers are propagated from the calling coroutine to the callee. In sender/receiver, that means that the contextual information is associated with the receiver and is queried by the sender and/or operation state after the sender and the receiver are connect-ed.

Problem:

The implication of the above is that the sender alone does not have all the information about the async computation it will ultimately initiate; some of that information is provided late via the receiver. However, the sender_traits mechanism, by which an algorithm can introspect the value and error types the sender will propagate, only accepts a sender parameter. It does not take into consideration the type information that will come in late via the receiver. The effect of this is that some senders cannot be typed senders when they otherwise could be.

Example:

To get concrete, consider the case of the "get_scheduler()" sender: when connect-ed and start-ed, it queries the receiver for its associated scheduler and passes it back to the receiver through the value channel. That sender’s "value type" is the type of the receiver’s scheduler. What then should sender_traits<get_scheduler_sender>::value_types report for the get_scheduler()'s value type? It can’t answer because it doesn’t know.

This causes knock-on problems since some important algorithms require a typed sender, such as sync_wait. To illustrate the problem, consider the following code:

namespace ex = std::execution;

ex::sender auto task =
  ex::let_value(
    ex::get_scheduler(), // Fetches scheduler from receiver.
    [](auto current_sched) {
      // Launch some nested work on the current scheduler:
      return ex::on(current_sched, nested work...);
    });

std::this_thread::sync_wait(std::move(task));

The code above is attempting to schedule some work onto the sync_wait's run_loop execution resource. But let_value only returns a typed sender when the input sender is typed. As we explained above, get_scheduler() is not typed, so task is likewise not typed. Since task isn’t typed, it cannot be passed to sync_wait which is expecting a typed sender. The above code would fail to compile.

Solution:

The solution is conceptually quite simple: extend the sender_traits mechanism to optionally accept a receiver in addition to the sender. The algorithms can use sender_traits<Sender, Receiver> to inspect the async operation’s completion-signals. The typed_sender concept would also need to take an optional receiver parameter. This is the simplest change, and it would solve the immediate problem.

Design:

Using the receiver type to compute the sender traits turns out to have pitfalls in practice. Many receivers make use of that type information in their implementation. It is very easy to create cycles in the type system, leading to inscrutable errors. The design pursued in R4 is to give receivers an associated environment object -- a bag of key/value pairs -- and to move the contextual information (schedulers, etc) out of the receiver and into the environment. The sender_traits template and the typed_sender concept, rather than taking a receiver, take an environment. This is a much more robust design.

A further refinement of this design would be to separate the receiver and the environment entirely, passing them as separate arguments along with the sender to connect. This paper does not propose that change.
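The idea of computing a sender's value type from an environment can be sketched in a few lines. This is a toy model under stated assumptions, not the proposed sender_traits or get_env machinery: get_scheduler_t is reused here only as a tag name, environments answer queries through a hypothetical query member function, and pool_scheduler, loop_scheduler, env_with_scheduler, and read_scheduler_sender are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical query tag and two unrelated scheduler types.
struct get_scheduler_t {};
struct pool_scheduler { std::string name() const { return "pool"; } };
struct loop_scheduler { std::string name() const { return "loop"; } };

// An environment is a small bag of answers keyed by query tag.
template <class Scheduler>
struct env_with_scheduler {
  Scheduler sched;
  Scheduler query(get_scheduler_t) const { return sched; }
};

// Toy counterpart of the get_scheduler() sender: its value type is unknown
// until an environment type is supplied.
struct read_scheduler_sender {
  template <class Env>
  using value_type_in =
      decltype(std::declval<const Env&>().query(get_scheduler_t{}));

  template <class Env, class Callback>
  void run(const Env& env, Callback on_value) const {
    on_value(env.query(get_scheduler_t{}));
  }
};

// The same sender reports different value types in different environments.
static_assert(std::is_same_v<
    read_scheduler_sender::value_type_in<env_with_scheduler<pool_scheduler>>,
    pool_scheduler>);
static_assert(std::is_same_v<
    read_scheduler_sender::value_type_in<env_with_scheduler<loop_scheduler>>,
    loop_scheduler>);

template <class Scheduler>
std::string scheduler_name_seen_by_sender(Scheduler s) {
  std::string seen;
  read_scheduler_sender{}.run(env_with_scheduler<Scheduler>{s},
                              [&seen](auto sched) { seen = sched.name(); });
  assert(!seen.empty());
  return seen;
}
```

Note that the type computation involves only the sender and the environment type. No receiver type appears, which is what avoids the cycles mentioned above.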

Impact:

This change, apart from increasing the expressive power of the sender/receiver abstraction, has the following impact:

"Has it been implemented?"

Yes, the reference implementation, which can be found at https://github.com/NVIDIA/stdexec, has implemented this design as well as some dependently-typed senders to confirm that it works.

Implementation experience

Although this change has not yet been made in libunifex, the most widely adopted sender/receiver implementation, a similar design can be found in Folly’s coroutine support library. In Folly.Coro, it is possible to await a special awaitable to obtain the current coroutine’s associated scheduler (called an executor in Folly).

For instance, the following Folly code grabs the current executor, schedules a task for execution on that executor, and starts the resulting (scheduled) task by enqueueing it for execution.

// From Facebook’s Folly open source library:
template <class T>
folly::coro::Task<void> CancellableAsyncScope::co_schedule(folly::coro::Task<T>&& task) {
  this->add(std::move(task).scheduleOn(co_await co_current_executor));
  co_return;
}

Facebook relies heavily on this pattern in its coroutine code. But as described above, this pattern doesn’t work with R3 of std::execution because of the lack of dependently-typed senders. The change to sender_traits in R4 rectifies that.

Why now?

The authors are loath to make any changes to the design, however small, at this stage of the C++23 release cycle. But we feel that, for a relatively minor design change -- adding an extra template parameter to sender_traits and typed_sender -- the returns are large enough to justify the change. And there is no better time to make this change than as early as possible.

One might wonder why this missing feature has not been added to sender/receiver before now. The designers of sender/receiver have long been aware of the need. What was missing was a clean, robust, and simple design for the change, which we now have.

Drive-by:

We took the opportunity to make an additional drive-by change: Rather than providing the sender traits via a class template for users to specialize, we changed it into a sender query: get_completion_signatures(sender, env). That function’s return type is used as the sender’s traits. The authors feel this leads to a more uniform design and gives sender authors a straightforward way to make the value/error types dependent on the cv- and ref-qualification of the sender if need be.

Details:

Below are the salient parts of the new support for dependently-typed senders in R4:

2.5. R3

The changes since R2 are as follows:

Fixes:

Enhancements:

2.6. R2

The changes since R1 are as follows:

2.7. R1

The changes since R0 are as follows:

2.8. R0

Initial revision.

3. Design - introduction

The following three sections describe the entirety of the proposed design.

3.1. Conventions

The following conventions are used throughout the design section:

  1. The namespace proposed in this paper is the same as in A Unified Executors Proposal for C++: std::execution; however, for brevity, the std:: part of this name is omitted. When you see execution::foo, treat that as std::execution::foo.

  2. Universal references and explicit calls to std::move/std::forward are omitted in code samples and signatures for simplicity; assume universal references and perfect forwarding unless stated otherwise.

  3. None of the names proposed here are names that we are particularly attached to; consider the names to be reasonable placeholders that can freely be changed, should the committee want to do so.

3.2. Queries and algorithms

A query is a callable that takes some set of objects (usually one) as parameters and returns facts about those objects without modifying them. Queries are usually customization point objects, but in some cases may be functions.

An algorithm is a callable that takes some set of objects as parameters and causes those objects to do something. Algorithms are usually customization point objects, but in some cases may be functions.

4. Design - user side

4.1. Execution resources describe the place of execution

An execution resource is a resource that represents the place where execution will happen. This could be a concrete resource - like a specific thread pool object, or a GPU - or a more abstract one, like the current thread of execution. Execution resources don’t need to have a representation in code; they are simply a term describing certain properties of execution of a function.

4.2. Schedulers represent execution resources

A scheduler is a lightweight handle that represents a strategy for scheduling work onto an execution resource. Since execution resources don’t necessarily manifest in C++ code, it’s not possible to program directly against their API. A scheduler is a solution to that problem: the scheduler concept is defined by a single sender algorithm, schedule, which returns a sender that will complete on an execution resource determined by the scheduler. Logic that you want to run on that execution resource can be placed in the receiver’s completion-signalling method.

execution::scheduler auto sch = thread_pool.scheduler();
execution::sender auto snd = execution::schedule(sch);
// snd is a sender (see below) describing the creation of a new execution agent
// on the execution resource associated with sch

Note that a particular scheduler type may provide other kinds of scheduling operations which are supported by its associated execution resource. It is not limited to scheduling purely using the execution::schedule API.

Future papers will propose additional scheduler concepts that extend scheduler to add other capabilities. For example:

4.3. Senders describe work

A sender is an object that describes work. Senders are similar to futures in existing asynchrony designs, but unlike futures, the work that is being done to arrive at the values they will send is also directly described by the sender object itself. A sender is said to send some values if a receiver connected (see § 5.3 execution::connect) to that sender will eventually receive said values.

The primary defining sender algorithm is § 5.3 execution::connect; this function, however, is not a user-facing API; it is used to facilitate communication between senders and various sender algorithms, but end user code is not expected to invoke it directly.

The way user code is expected to interact with senders is by using sender algorithms. This paper proposes an initial set of such sender algorithms, which are described in § 4.4 Senders are composable through sender algorithms, § 4.20 User-facing sender factories, § 4.21 User-facing sender adaptors, and § 4.22 User-facing sender consumers. For example, here is how a user can create a new sender on a scheduler, attach a continuation to it, and then wait for execution of the continuation to complete:

execution::scheduler auto sch = thread_pool.scheduler();
execution::sender auto snd = execution::schedule(sch);
execution::sender auto cont = execution::then(snd, []{
    std::fstream file{ "result.txt" };
    file << compute_result;
});

this_thread::sync_wait(cont);
// at this point, cont has completed execution
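The statement that a sender only describes work, and that nothing runs until a consumer such as sync_wait is applied, can be illustrated with a toy model. The code below does not use the proposed API; just, then, and sync_wait in namespace lazy_toy are simplified, synchronous stand-ins written for this illustration only.

```cpp
#include <cassert>
#include <utility>

namespace lazy_toy {
// Toy "sender": holds a value; run() produces it on demand.
template <class T>
struct just_sender {
  T value;
  T run() const { return value; }
};

// Toy adaptor: stores the predecessor and the continuation, runs neither.
template <class Pred, class Fn>
struct then_sender {
  Pred pred;
  Fn fn;
  auto run() const { return fn(pred.run()); }
};

template <class T>
just_sender<T> just(T v) { return {std::move(v)}; }

template <class S, class Fn>
then_sender<S, Fn> then(S s, Fn f) { return {std::move(s), std::move(f)}; }

// Toy consumer: the only place where work is actually executed.
template <class S>
auto sync_wait(const S& s) { return s.run(); }

struct lazy_demo_result {
  int calls_before_wait;
  int calls_after_wait;
  int value;
};

lazy_demo_result lazy_demo() {
  int calls = 0;
  auto work = then(just(20), [&calls](int x) { ++calls; return x + 1; });
  auto more = then(work, [&calls](int x) { ++calls; return x * 2; });
  int before = calls;           // composing has only described the work
  assert(before == 0);
  int value = sync_wait(more);  // the whole chain runs here: (20 + 1) * 2
  return {before, calls, value};
}
}  // namespace lazy_toy
```

Composing the chain invokes neither lambda; both run, in order, only inside sync_wait. This is the property that distinguishes senders from eagerly started futures.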

4.4. Senders are composable through sender algorithms

Asynchronous programming often departs from traditional code structure and control flow that we are familiar with. A successful asynchronous framework must provide an intuitive story for composition of asynchronous work: expressing dependencies, passing objects, managing object lifetimes, etc.

The true power and utility of senders is in their composability. With senders, users can describe generic execution pipelines and graphs, and then run them on and across a variety of different schedulers. Senders are composed using sender algorithms:

4.5. Senders can propagate completion schedulers

One of the goals of executors is to support a diverse set of execution resources, including traditional thread pools, task and fiber frameworks (like HPX and Legion), and GPUs and other accelerators (managed by runtimes such as CUDA or SYCL). On many of these systems, not all execution agents are created equal and not all functions can be run on all execution agents. Having precise control over the execution resource used for any given function call being submitted is important on such systems, and the users of standard execution facilities will expect to be able to express such requirements.

A Unified Executors Proposal for C++ was not always clear about the place of execution of any given piece of code. Precise control was present in the two-way execution API present in earlier executor designs, but it has so far been missing from the senders design. There has been a proposal (Towards C++23 executors: A proposal for an initial set of algorithms) to provide a number of sender algorithms that would enforce certain rules on the places of execution of the work described by a sender, but we have found those sender algorithms to be insufficient for achieving the best performance on all platforms that are of interest to us. The implementation strategies that we are aware of result in one of the following situations:

  1. trying to submit work to one execution resource (such as a CPU thread pool) from another execution resource (such as a GPU or a task framework), which assumes that all execution agents are as capable as a std::thread (which they aren’t).

  2. forcibly interleaving two adjacent execution graph nodes that are both executing on one execution resource (such as a GPU) with glue code that runs on another execution resource (such as a CPU), which is prohibitively expensive for some execution resources (such as CUDA or SYCL).

  3. having to customise most or all sender algorithms to support an execution resource, so that you can avoid the problems described in 1. and 2., which we believe is impractical and brittle based on months of field experience attempting this in Agency.

None of these implementation strategies are acceptable for many classes of parallel runtimes, such as task frameworks (like HPX) or accelerator runtimes (like CUDA or SYCL).

Therefore, in addition to the on sender algorithm from Towards C++23 executors: A proposal for an initial set of algorithms, we are proposing a way for senders to advertise what scheduler (and by extension what execution resource) they will complete on. Any given sender may have completion schedulers for some or all of the signals (value, error, or stopped) it completes with (for more detail on the completion-signals, see § 5.1 Receivers serve as glue between senders). When further work is attached to that sender by invoking sender algorithms, that work will also complete on an appropriate completion scheduler.

4.5.1. execution::get_completion_scheduler

get_completion_scheduler is a query that retrieves the completion scheduler for a specific completion-signal from a sender’s environment. For a sender that lacks a completion scheduler query for a given signal, calling get_completion_scheduler is ill-formed. If a sender advertises a completion scheduler for a signal in this way, that sender must ensure that it sends that signal on an execution agent belonging to an execution resource represented by a scheduler returned from this function. See § 4.5 Senders can propagate completion schedulers for more details.

execution::scheduler auto cpu_sched = new_thread_scheduler{};
execution::scheduler auto gpu_sched = cuda::scheduler();

execution::sender auto snd0 = execution::schedule(cpu_sched);
execution::scheduler auto completion_sch0 =
  execution::get_completion_scheduler<execution::set_value_t>(get_env(snd0));
// completion_sch0 is equivalent to cpu_sched

execution::sender auto snd1 = execution::then(snd0, []{
    std::cout << "I am running on cpu_sched!\n";
});
execution::scheduler auto completion_sch1 =
  execution::get_completion_scheduler<execution::set_value_t>(get_env(snd1));
// completion_sch1 is equivalent to cpu_sched

execution::sender auto snd2 = execution::transfer(snd1, gpu_sched);
execution::sender auto snd3 = execution::then(snd2, []{
    std::cout << "I am running on gpu_sched!\n";
});
execution::scheduler auto completion_sch3 =
  execution::get_completion_scheduler<execution::set_value_t>(get_env(snd3));
// completion_sch3 is equivalent to gpu_sched

4.6. Execution resource transitions are explicit

A Unified Executors Proposal for C++ does not contain any mechanisms for performing an execution resource transition. The only sender algorithm that can create a sender that will move execution to a specific execution resource is execution::schedule, which does not take an input sender. That means that there’s no way to construct sender chains that traverse different execution resources. Such chains are necessary to fulfill the promise of senders being able to replace two-way executors, which had this capability.

We propose that, for senders advertising their completion scheduler, all execution resource transitions must be explicit; running user code anywhere but where they defined it to run must be considered a bug.

The execution::transfer sender adaptor performs a transition from one execution resource to another:

execution::scheduler auto sch1 = ...;
execution::scheduler auto sch2 = ...;

execution::sender auto snd1 = execution::schedule(sch1);
execution::sender auto then1 = execution::then(snd1, []{
    std::cout << "I am running on sch1!\n";
});

execution::sender auto snd2 = execution::transfer(then1, sch2);
execution::sender auto then2 = execution::then(snd2, []{
    std::cout << "I am running on sch2!\n";
});

this_thread::sync_wait(then2);

4.7. Senders can be either multi-shot or single-shot

Some senders may only support launching their operation a single time, while others may be repeatable and support being launched multiple times. Executing the operation may consume resources owned by the sender.

For example, a sender may contain a std::unique_ptr that it will be transferring ownership of to the operation-state returned by a call to execution::connect so that the operation has access to this resource. In such a sender, calling execution::connect consumes the sender such that after the call the input sender is no longer valid. Such a sender will also typically be move-only so that it can maintain unique ownership of that resource.

A single-shot sender can be connected to a receiver at most once. Its implementation of execution::connect only has overloads for an rvalue-qualified sender. Callers must pass the sender as an rvalue to the call to execution::connect, indicating that the call consumes the sender.

A multi-shot sender can be connected to multiple receivers and can be launched multiple times. Multi-shot senders customise execution::connect to accept an lvalue reference to the sender. Callers can indicate that they want the sender to remain valid after the call to execution::connect by passing an lvalue reference to the sender to call these overloads. Multi-shot senders should also define overloads of execution::connect that accept rvalue-qualified senders to allow the sender to be also used in places where only a single-shot sender is required.

If the user of a sender does not require the sender to remain valid after connecting it to a receiver then it can pass an rvalue-reference to the sender to the call to execution::connect. Such usages should be able to accept either single-shot or multi-shot senders.

If the caller does wish for the sender to remain valid after the call then it can pass an lvalue-qualified sender to the call to execution::connect. Such usages will only accept multi-shot senders.

Algorithms that accept senders will typically either decay-copy an input sender and store it somewhere for later usage (for example as a data-member of the returned sender) or will immediately call execution::connect on the input sender, such as in this_thread::sync_wait or execution::start_detached.

Some multi-use sender algorithms may require that an input sender be copy-constructible but will only call execution::connect on an rvalue of each copy, which still results in effectively executing the operation multiple times. Other multi-use sender algorithms may require that the sender is move-constructible but will invoke execution::connect on an lvalue reference to the sender.

For a sender to be usable in both multi-use scenarios, it will generally be required to be both copy-constructible and lvalue-connectable.

4.8. Senders are forkable

Any non-trivial program will eventually want to fork a chain of senders into independent streams of work, regardless of whether they are single-shot or multi-shot. For instance, an incoming event to a middleware system may be required to trigger events on more than one downstream system. This requires that we provide well defined mechanisms for making sure that connecting a sender multiple times is possible and correct.

The split sender adaptor facilitates connecting to a sender multiple times, regardless of whether it is single-shot or multi-shot:

auto some_algorithm(execution::sender auto&& input) {
    execution::sender auto multi_shot = split(input);
    // "multi_shot" is guaranteed to be multi-shot,
    // regardless of whether "input" was multi-shot or not

    return when_all(
      then(multi_shot, [] { std::cout << "First continuation\n"; }),
      then(multi_shot, [] { std::cout << "Second continuation\n"; })
    );
}

4.9. Senders are joinable

Similarly to how it’s hard to write a complex program that will not eventually want to fork sender chains into independent streams, it’s also hard to write a program that does not want to eventually create join nodes, where multiple independent streams of execution are merged into a single one in an asynchronous fashion.

when_all is a sender adaptor that returns a sender that completes when the last of the input senders completes. It sends a pack of values, where the elements of said pack are the values sent by the input senders, in order. The sender returned by when_all does not advertise a completion scheduler.

transfer_when_all accepts an additional scheduler argument. It returns a sender whose value completion scheduler is the scheduler provided as an argument, but otherwise behaves the same as when_all. You can think of it as a composition of transfer(when_all(inputs...), scheduler), but one that allows for better efficiency through customization.
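
One common way to build such a join node is an atomic countdown: every input stores its value in its own slot, and whichever input arrives last runs the continuation. The sketch below illustrates that idea only. toy_join is a hypothetical name, values are restricted to int, and error and stopped completions are left out, so it should not be read as how when_all is specified or implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical join node: collects one int per input and runs a
// continuation when the last input has reported.
class toy_join {
public:
    toy_join(std::size_t inputs,
             std::function<void(const std::vector<int>&)> done)
        : remaining_(inputs), values_(inputs), done_(std::move(done)) {}

    // Called once by input number `index`, possibly from different threads.
    void set_value(std::size_t index, int value) {
        values_[index] = value;  // each input writes only its own slot
        // acq_rel: the last arrival must observe every earlier slot write.
        if (remaining_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            done_(values_);  // values are delivered in input order
        }
    }

private:
    std::atomic<std::size_t> remaining_;
    std::vector<int> values_;
    std::function<void(const std::vector<int>&)> done_;
};

inline std::vector<int> demo_join_out_of_order() {
    std::vector<int> seen;
    toy_join join(3, [&](const std::vector<int>& v) { seen = v; });
    join.set_value(2, 30);  // inputs may complete in any order
    join.set_value(0, 10);
    // Not all inputs have completed yet, so nothing was delivered.
    if (!seen.empty()) return {};
    join.set_value(1, 20);
    return seen;
}
```

Even though the inputs report in the order 2, 0, 1, the continuation sees the values in input order.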

4.10. Senders support cancellation

Senders are often used in scenarios where the application may be concurrently executing multiple strategies for achieving some program goal. When one of these strategies succeeds (or fails) it may not make sense to continue pursuing the other strategies as their results are no longer useful.

For example, we may want to try to simultaneously connect to multiple network servers and use whichever server responds first. Once the first server responds we no longer need to continue trying to connect to the other servers.

Ideally, in these scenarios, we would somehow be able to request that those other strategies stop executing promptly so that their resources (e.g. cpu, memory, I/O bandwidth) can be released and used for other work.

While the design of senders has support for cancelling an operation before it starts by simply destroying the sender or the operation-state returned from execution::connect() before calling execution::start(), there also needs to be a standard, generic mechanism to ask for an already-started operation to complete early.

The ability to cancel in-flight operations is fundamental to supporting some kinds of generic concurrency algorithms.

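For example:

  - a when_all(ops...) algorithm should cancel the other operations as soon as one operation fails;

  - a first_successful(ops...) algorithm should cancel the other operations as soon as one operation completes successfully;

  - a generic timeout(src, duration) algorithm needs to be able to cancel the src operation after the timeout duration has elapsed;

  - a stop_when(src, trigger) algorithm should cancel src if trigger completes first and cancel trigger if src completes first.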

The mechanism used for communicating cancellation-requests, or stop-requests, needs to have a uniform interface so that generic algorithms that compose sender-based operations, such as the ones listed above, are able to communicate these cancellation requests to senders that they don’t know anything about.

The design is intended to be composable so that cancellation of higher-level operations can propagate those cancellation requests through intermediate layers to lower-level operations that need to actually respond to the cancellation requests.

For example, we can compose the algorithms mentioned above so that child operations are cancelled when any one of the multiple cancellation conditions occurs:

sender auto composed_cancellation_example(auto query) {
  return stop_when(
    timeout(
      when_all(
        first_successful(
          query_server_a(query),
          query_server_b(query)),
        load_file("some_file.jpg")),
      5s),
    cancelButton.on_click());
}

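In this example, if we take the operation returned by query_server_b(query), this operation will receive a stop-request when any of the following happens:

  - the first_successful algorithm sends a stop-request because query_server_a(query) completed successfully;

  - the when_all algorithm sends a stop-request because the load_file("some_file.jpg") operation completed with an error or was stopped;

  - the timeout algorithm sends a stop-request because the operation did not complete within 5 seconds;

  - the stop_when algorithm sends a stop-request because the user clicked the button represented by cancelButton;

  - the parent operation consuming the sender returned by composed_cancellation_example() sends a stop-request.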

Note that within this code there is no explicit mention of cancellation, stop-tokens, callbacks, etc. yet the example fully supports and responds to the various cancellation sources.

The intent of the design is that the common usage of cancellation in sender/receiver-based code is primarily through use of concurrency algorithms that manage the detailed plumbing of cancellation for you. Much like algorithms that compose senders relieve the user from having to write their own receiver types, algorithms that introduce concurrency and provide higher-level cancellation semantics relieve the user from having to deal with low-level details of cancellation.

4.10.1. Cancellation design summary

The design of cancellation described in this paper is built on top of and extends the std::stop_token-based cancellation facilities added in C++20, first proposed in Composable cancellation for sender-based async operations.

At a high-level, the facilities proposed by this paper for supporting cancellation include:

In addition, there are requirements added to some of the algorithms to specify what their cancellation behaviour is and what the requirements of customisations of those algorithms are with respect to cancellation.

The key component that enables generic cancellation within sender-based operations is the execution::get_stop_token() CPO. This CPO takes a single parameter, which is the execution environment of the receiver passed to execution::connect, and returns a std::stoppable_token that the operation can use to check for stop-requests for that operation.

As the caller of execution::connect typically has control over the receiver type it passes, it is able to customise the std::get_env() CPO for that receiver to return an execution environment that hooks the execution::get_stop_token() CPO to return a stop-token that the receiver has control over and that it can use to communicate a stop-request to the operation once it has started.

4.10.2. Support for cancellation is optional

Support for cancellation is optional, both on part of the author of the receiver and on part of the author of the sender.

If the receiver’s execution environment does not customise the execution::get_stop_token() CPO then invoking the CPO on that receiver’s environment will invoke the default implementation which returns std::never_stop_token. This is a special stoppable_token type that is statically known to always return false from the stop_possible() method.

When sender code uses this stop-token, the code that handles stop-requests will in general be compiled out, leaving little to no run-time overhead.

If the sender doesn’t call execution::get_stop_token(), for example because the operation does not support cancellation, then it will simply not respond to stop-requests from the caller.

Note that stop-requests are generally racy in nature as there is often a race between an operation completing naturally and the stop-request being made. If the operation has already completed or is past the point at which it can be cancelled when the stop-request is sent then the stop-request may just be ignored. An application will typically need to be able to cope with senders that might ignore a stop-request anyway.

4.10.3. Cancellation is inherently racy

Usually, an operation will attach a stop-callback at some point inside the call to execution::start() so that a subsequent stop-request will interrupt the logic.

A stop-request can be issued concurrently from another thread. This means the implementation of execution::start() needs to be careful to ensure that, once a stop-callback has been registered, there are no data-races between a potentially concurrently-executing stop-callback and the rest of the execution::start() implementation.

An implementation of execution::start() that supports cancellation will generally need to perform (at least) two separate steps: launch the operation, subscribe a stop-callback to the receiver’s stop-token. Care needs to be taken depending on the order in which these two steps are performed.

If the stop-callback is subscribed first and then the operation is launched, care needs to be taken to ensure that a stop-request that invokes the stop-callback on another thread after the stop-callback is registered but before the operation finishes launching results in neither a missed cancellation request nor a data-race, e.g. by performing an atomic write after the launch has finished executing.

If the operation is launched first and then the stop-callback is subscribed, care needs to be taken to ensure that if the launched operation completes concurrently on another thread that it does not destroy the operation-state until after the stop-callback has been registered, e.g. by having the execution::start implementation write to an atomic variable once it has finished registering the stop-callback and having the concurrent completion handler check that variable and either call the completion-signalling operation or store the result and defer calling the receiver’s completion-signalling operation to the execution::start() call (which is still executing).

For an example of an implementation strategy for solving these data-races see § 1.4 Asynchronous Windows socket recv.

4.10.4. Cancellation design status

This paper currently includes the design for cancellation as proposed in P2175R0, "Composable cancellation for sender-based async operations". P2175R0 contains more details on the background motivation, prior art, and design rationale of this design.

It is important to note, however, that initial review of this design in the SG1 concurrency subgroup raised some concerns related to runtime overhead of the design in single-threaded scenarios and these concerns are still being investigated.

The design of P2175R0 has been included in this paper for now, despite its potential to change, as we believe that support for cancellation is a fundamental requirement for an async model and is required in some form to be able to talk about the semantics of some of the algorithms proposed in this paper.

This paper will be updated in the future with any changes that arise from the investigations into P2175R0.

4.11. Sender factories and adaptors are lazy

In an earlier revision of this paper, some of the proposed algorithms supported executing their logic eagerly; i.e., before the returned sender has been connected to a receiver and started. These algorithms were removed because eager execution has a number of negative semantic and performance implications.

We originally included this functionality in the paper because of a long-standing belief that eager execution is a mandatory feature to be included in the standard Executors facility for that facility to be acceptable for accelerator vendors. A particular concern was that we must be able to write generic algorithms that can run either eagerly or lazily, depending on the kind of input sender or scheduler that has been passed into them as arguments. We considered this a requirement, because the latency of launching work on an accelerator can sometimes be considerable.

However, in the process of working on this paper and implementations of the features proposed within, our set of requirements has shifted as we came to better understand the different implementation strategies available for the feature set of this paper. After weighing the earlier concerns against the points presented below, we have arrived at the conclusion that a purely lazy model is enough for most algorithms, and users who intend to launch work earlier may use an algorithm such as ensure_started to achieve that goal. We have also come to deeply appreciate the fact that a purely lazy model allows both the implementation and the compiler to have a much better understanding of what the complete graph of tasks looks like, allowing them to better optimize the code, including when targeting accelerators.

4.11.1. Eager execution leads to detached work or worse

One of the questions that arises with APIs that can potentially return eagerly-executing senders is "What happens when those senders are destructed without a call to execution::connect?" or similarly, "What happens if a call to execution::connect is made, but the returned operation state is destroyed before execution::start is called on that operation state?"

In these cases, the operation represented by the sender is potentially executing concurrently in another thread at the time that the destructor of the sender and/or operation-state is running. In the case that the operation has not completed executing by the time that the destructor is run we need to decide what the semantics of the destructor are.

There are three main strategies that can be adopted here, none of which is particularly satisfactory:

  1. Make this undefined-behaviour - the caller must ensure that any eagerly-executing sender is always joined by connecting and starting that sender. This approach is generally pretty hostile to programmers, particularly in the presence of exceptions, since it complicates the ability to compose these operations.

    Eager operations typically need to acquire resources when they are first called in order to start the operation early. This makes eager algorithms prone to failure. Consider, then, what might happen in an expression such as when_all(eager_op_1(), eager_op_2()). Imagine eager_op_1() starts an asynchronous operation successfully, but then eager_op_2() throws. For lazy senders, that failure happens in the context of the when_all algorithm, which handles the failure and ensures that async work joins on all code paths. In this case though -- the eager case -- the child operation has failed even before when_all has been called.

    It then becomes the responsibility, not of the algorithm, but of the end user to handle the exception and ensure that eager_op_1() is joined before allowing the exception to propagate. If they fail to do that, they incur undefined behavior.

  2. Detach from the computation - let the operation continue in the background - like an implicit call to std::thread::detach(). While this approach can work in some circumstances for some kinds of applications, in general it is also pretty user-hostile; it makes it difficult to reason about the safe destruction of resources used by these eager operations. In general, detached work necessitates some kind of garbage collection; e.g., std::shared_ptr, to ensure resources are kept alive until the operations complete, and can make clean shutdown nigh impossible.

  3. Block in the destructor until the operation completes. This approach is probably the safest to use as it preserves the structured nature of the concurrent operations, but also introduces the potential for deadlocking the application if the completion of the operation depends on the current thread making forward progress.

    The risk of deadlock might occur, for example, if a thread-pool with a small number of threads is executing code that creates a sender representing an eagerly-executing operation and then calls the destructor of that sender without joining it (e.g. because an exception was thrown). If the current thread blocks waiting for that eager operation to complete and that eager operation cannot complete until some entry enqueued to the thread-pool’s queue of work is run then the thread may wait for an indefinite amount of time. If all threads of the thread-pool are simultaneously performing such blocking operations then deadlock can result.

There are also minor variations on each of these choices. For example:

  1. A variation of (1): Call std::terminate if an eager sender is destructed without joining it. This is the approach that the std::thread destructor takes.

  2. A variation of (2): Request cancellation of the operation before detaching. This reduces the chances of operations continuing to run indefinitely in the background once they have been detached but does not solve the lifetime- or shutdown-related challenges.

  3. A variation of (3): Request cancellation of the operation before blocking on its completion. This is the strategy that std::jthread uses for its destructor. It reduces the risk of deadlock but does not eliminate it.

4.11.2. Eager senders complicate algorithm implementations

Algorithms that can assume they are operating on senders with strictly lazy semantics are able to make certain optimizations that are not available if senders can be potentially eager. With lazy senders, an algorithm can safely assume that a call to execution::start on an operation state strictly happens before the execution of that async operation. This frees the algorithm from needing to resolve potential race conditions. For example, consider an algorithm sequence that puts async operations in sequence by starting an operation only after the preceding one has completed. In an expression like sequence(a(), then(src, [] { b(); }), c()), one may reasonably assume that a(), b() and c() are sequenced and therefore do not need synchronisation. Eager algorithms break that assumption.

When an algorithm needs to deal with potentially eager senders, the potential race conditions can be resolved one of two ways, neither of which is desirable:

  1. Assume the worst and implement the algorithm defensively, assuming all senders are eager. This obviously has overheads both at runtime and in algorithm complexity. Resolving race conditions is hard.

  2. Require senders to declare whether they are eager or not with a query. Algorithms can then implement two different implementation strategies, one for strictly lazy senders and one for potentially eager senders. This addresses the performance problem of (1) while compounding the complexity problem.

4.11.3. Eager senders incur cancellation-related overhead

Another implication of the use of eager operations is with regards to cancellation. The eagerly executing operation will not have access to the caller’s stop token until the sender is connected to a receiver. If we still want to be able to cancel the eager operation then it will need to create a new stop source and pass its associated stop token down to child operations. Then when the returned sender is eventually connected it will register a stop callback with the receiver’s stop token that will request stop on the eager sender’s stop source.

As the eager operation does not know at the time that it is launched what the type of the receiver is going to be, and thus whether or not the stop token returned from execution::get_stop_token models std::unstoppable_token, the eager operation is going to need to assume it might be later connected to a receiver with a stop token that might actually issue a stop request. Thus it needs to declare space in the operation state for a type-erased stop callback and incur the runtime overhead of supporting cancellation, even if cancellation will never be requested by the caller.

The eager operation will also need to do this to support sending a stop request to the eager operation in the case that the sender representing the eager work is destroyed before it has been joined (assuming variation (2) or (3) of the destructor strategies listed in § 4.11.1 is chosen).

4.11.4. Eager senders cannot access execution resource from the receiver

In sender/receiver, contextual information is passed from parent operations to their children by way of receivers. Information like stop tokens, allocators, current scheduler, priority, and deadline are propagated to child operations with custom receivers at the time the operation is connected. That way, each operation has the contextual information it needs before it is started.

But if the operation is started before it is connected to a receiver, then there isn’t a way for a parent operation to communicate contextual information to its child operations, which may complete before a receiver is ever attached.

4.12. Schedulers advertise their forward progress guarantees

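To decide whether a scheduler (and its associated execution resource) is sufficient for a specific task, it may be necessary to know what kind of forward progress guarantees it provides for the execution agents it creates. The C++ Standard defines the following forward progress guarantees:

  - concurrent, which requires that a thread makes progress eventually;

  - parallel, which requires that a thread makes progress once it executes a step; and

  - weakly parallel, which does not require that the thread makes progress.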

This paper introduces a scheduler query function, get_forward_progress_guarantee, which returns one of the enumerators of a new enum type, forward_progress_guarantee. Each enumerator of forward_progress_guarantee corresponds to one of the aforementioned guarantees.
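
As a rough sketch of how such a query might be used, the code below declares a local stand-in enum with one enumerator for each guarantee the C++ Standard defines (concurrent, parallel, and weakly parallel) and two toy schedulers that answer the query. Everything except the name forward_progress_guarantee is hypothetical: the enumerator spellings, the scheduler types, the member-function customisation, and the guarantees each toy scheduler reports are all assumptions made for illustration.

```cpp
#include <cassert>

// Local stand-in, not the proposed header; enumerator names are assumed.
enum class forward_progress_guarantee { concurrent, parallel, weakly_parallel };

// Hypothetical schedulers that answer the query through a member function.
struct toy_new_thread_scheduler {
    constexpr forward_progress_guarantee query_forward_progress() const noexcept {
        return forward_progress_guarantee::concurrent;
    }
};

struct toy_thread_pool_scheduler {
    constexpr forward_progress_guarantee query_forward_progress() const noexcept {
        return forward_progress_guarantee::parallel;
    }
};

// Hypothetical free-function form of the query.
template <class Scheduler>
constexpr forward_progress_guarantee
toy_get_forward_progress_guarantee(const Scheduler& sch) noexcept {
    return sch.query_forward_progress();
}

// An algorithm whose agents block on one another needs every agent to make
// progress eventually, which only the concurrent guarantee provides.
template <class Scheduler>
constexpr bool can_run_blocking_agents(const Scheduler& sch) noexcept {
    return toy_get_forward_progress_guarantee(sch)
        == forward_progress_guarantee::concurrent;
}

static_assert(can_run_blocking_agents(toy_new_thread_scheduler{}));
static_assert(!can_run_blocking_agents(toy_thread_pool_scheduler{}));
```

Because the toy query is constexpr, an algorithm could also use it to select an implementation at compile time.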

4.13. Most sender adaptors are pipeable

To facilitate an intuitive syntax for composition, most sender adaptors are pipeable; they can be composed (piped) together with operator|. This mechanism is similar to the operator| composition that C++ range adaptors support and draws inspiration from piping in *nix shells. Pipeable sender adaptors take a sender as their first parameter and have no other sender parameters.

a | b will pass the sender a as the first argument to the pipeable sender adaptor b. Pipeable sender adaptors support partial application of the parameters after the first. For example, all of the following are equivalent:

execution::bulk(snd, N, [] (std::size_t i, auto d) {});
execution::bulk(N, [] (std::size_t i, auto d) {})(snd);
snd | execution::bulk(N, [] (std::size_t i, auto d) {});
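
The equivalence of the three spellings falls out of a small amount of machinery, which can be sketched without any sender library. In the sketch below a "sender" is just a callable that produces an int, and toy_then and toy_then_closure are hypothetical names. The one-argument overload performs the partial application, and operator| feeds the left-hand operand to the resulting closure.

```cpp
#include <cassert>
#include <utility>

// Full form: adaptor(sender, args...). A "sender" here is just a callable
// that produces an int when invoked.
template <class Sender, class F>
auto toy_then(Sender snd, F fn) {
    return [snd = std::move(snd), fn = std::move(fn)] { return fn(snd()); };
}

// Hypothetical closure object produced by partial application.
template <class F>
struct toy_then_closure {
    F fn;

    // closure(sender) forwards to the full form.
    template <class Sender>
    auto operator()(Sender snd) const {
        return toy_then(std::move(snd), fn);
    }
};

// Partial form: adaptor(args...) yields a closure awaiting its sender.
template <class F>
toy_then_closure<F> toy_then(F fn) {
    return {std::move(fn)};
}

// sender | closure is defined as closure(sender).
template <class Sender, class F>
auto operator|(Sender snd, const toy_then_closure<F>& closure) {
    return closure(std::move(snd));
}

inline bool demo_three_spellings_agree() {
    auto source = [] { return 20; };
    auto add_one = [](int x) { return x + 1; };
    int a = toy_then(source, add_one)();
    int b = toy_then(add_one)(source)();
    int c = (source | toy_then(add_one))();
    return a == 21 && b == 21 && c == 21;
}

inline int demo_pipeline() {
    auto pipeline = [] { return 123; }
                  | toy_then([](int i) { return i * 5; })
                  | toy_then([](int i) { return i - 5; });
    return pipeline();  // (123 * 5) - 5
}
```

Nothing runs while the pipeline is being composed; the work happens only when the final callable is invoked.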

Piping enables you to compose together senders with a linear syntax. Without it, you’d have to use either nested function call syntax, which would cause a syntactic inversion of the direction of control flow, or you’d have to introduce a temporary variable for each stage of the pipeline. Consider the following example where we want to execute first on a CPU thread pool, then on a CUDA GPU, then back on the CPU thread pool:

Syntax style: function call (nested)

auto snd = execution::then(
             execution::transfer(
               execution::then(
                 execution::transfer(
                   execution::then(
                     execution::schedule(thread_pool.scheduler()),
                     []{ return 123; }),
                   cuda::new_stream_scheduler()),
                 [](int i){ return i * 5; }),
               thread_pool.scheduler()),
             [](int i){ return i - 5; });
auto [result] = this_thread::sync_wait(snd).value();
// result == 610

Syntax style: function call (named temporaries)

auto snd0 = execution::schedule(thread_pool.scheduler());
auto snd1 = execution::then(snd0, []{ return 123; });
auto snd2 = execution::transfer(snd1, cuda::new_stream_scheduler());
auto snd3 = execution::then(snd2, [](int i){ return i * 5; });
auto snd4 = execution::transfer(snd3, thread_pool.scheduler());
auto snd5 = execution::then(snd4, [](int i){ return i - 5; });
auto [result] = *this_thread::sync_wait(snd5);
// result == 610

Syntax style: pipe

auto snd = execution::schedule(thread_pool.scheduler())
         | execution::then([]{ return 123; })
         | execution::transfer(cuda::new_stream_scheduler())
         | execution::then([](int i){ return i * 5; })
         | execution::transfer(thread_pool.scheduler())
         | execution::then([](int i){ return i - 5; });
auto [result] = this_thread::sync_wait(snd).value();
// result == 610

Certain sender adaptors are not pipeable, because using the pipeline syntax can result in confusion of the semantics of the adaptors involved. Specifically, the following sender adaptors are not pipeable.

Sender consumers could be made pipeable, but we have chosen not to do so. Since these are terminal nodes in a pipeline and nothing can be piped after them, we believe a pipe syntax may be confusing as well as unnecessary, as consumers cannot be chained. We believe sender consumers read better with function call syntax.

4.14. A range of senders represents an async sequence of data

Senders represent a single unit of asynchronous work. In many cases though, what is being modelled is a sequence of data arriving asynchronously, and you want computation to happen on demand, when each element arrives. This requires nothing more than what is in this paper and the range support in C++20. A range of senders would allow you to model such input as keystrokes, mouse movements, sensor readings, or network requests.

Given some expression R that is a range of senders, consider the following in a coroutine that returns an async generator type:

for (auto snd : R) {
  if (auto opt = co_await execution::stopped_as_optional(std::move(snd)))
    co_yield fn(*std::move(opt));
  else
    break;
}

This transforms each element of the asynchronous sequence R with the function fn on demand, as the data arrives. The result is a new asynchronous sequence of the transformed values.

Now imagine that R is the simple expression views::iota(0) | views::transform(execution::just). This creates a lazy range of senders, each of which completes immediately with monotonically increasing integers. The above code churns through the range, generating a new infinite asynchronous range of values [fn(0), fn(1), fn(2), ...].

Far more interesting would be if R were a range of senders representing, say, user actions in a UI. The above code gives a simple way to respond to user actions on demand.

4.15. Senders can represent partial success

Receivers have three ways they can complete: with success, failure, or cancellation. This raises the question of how they can be used to represent async operations that partially succeed. For example, consider an API that reads from a socket. The connection could drop after the API has filled in some of the buffer. In cases like that, it makes sense to want to report both that the connection dropped and that some data has been successfully read.

Often in the case of partial success, the error condition is not fatal nor does it mean the API has failed to satisfy its post-conditions. It is merely an extra piece of information about the nature of the completion. In those cases, "partial success" is another way of saying "success". As a result, it is sensible to pass both the error code and the result (if any) through the value channel, as shown below:

// Capture a buffer for read_socket_async to fill in
execution::just(array<byte, 1024>{})
  | execution::let_value([socket](array<byte, 1024>& buff) {
      // read_socket_async completes with two values: an error_code and
      // a count of bytes:
      return read_socket_async(socket, span{buff})
          // For success (partial and full), specify the next action:
        | execution::let_value([](error_code err, size_t bytes_read) {
            if (err) {
              // OK, partial success. Decide how to deal with the partial results
            } else {
              // OK, full success here.
            }
          });
    })

In other cases, the partial success is more of a partial failure. That happens when the error condition indicates that in some way the function failed to satisfy its post-conditions. In those cases, sending the error through the value channel loses valuable contextual information. It’s possible that bundling the error and the incomplete results into an object and passing it through the error channel makes more sense. In that way, generic algorithms will not miss the fact that a post-condition has not been met and react inappropriately.
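The two reporting styles can be contrasted with a small sketch. Everything here is hypothetical and only models the channels with plain member functions: toy_receiver, toy_read, and partial_read_error are illustrative names, not part of this proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical bundle for the "partial failure" case: the error plus
// whatever incomplete result is available.
struct partial_read_error {
  std::error_code ec;
  std::size_t bytes_read;
};

// Toy receiver that records which channel was used.
struct toy_receiver {
  std::string log;
  void set_value(std::error_code ec, std::size_t n) {
    log = (ec ? "value(partial," : "value(full,") + std::to_string(n) + ")";
  }
  void set_error(partial_read_error e) {
    log = "error(" + std::to_string(e.bytes_read) + ")";
  }
};

// Toy read: `got` bytes arrived; `dropped` says whether the connection was
// lost. `partial_is_ok` selects the reporting style: value channel when the
// post-conditions are still met, error channel when they are not.
inline void toy_read(std::size_t got, bool dropped, bool partial_is_ok,
                     toy_receiver& r) {
  std::error_code ec = dropped
      ? std::make_error_code(std::errc::connection_reset)
      : std::error_code{};
  if (!dropped || partial_is_ok)
    r.set_value(ec, got);
  else
    r.set_error({ec, got});
}

// Usage sketch.
inline void demo_partial_success() {
  toy_receiver r;
  toy_read(1024, false, true, r);
  assert(r.log == "value(full,1024)");
  toy_read(100, true, true, r);
  assert(r.log == "value(partial,100)");
  toy_read(100, true, false, r);
  assert(r.log == "error(100)");
}
```

In the last case a generic algorithm that only inspects the value channel cannot mistake the incomplete read for a success.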

Another possibility is for an async API to return a range of senders: if the API completes with full success, full error, or cancellation, the returned range contains just one sender with the result. Otherwise, if the API partially fails (doesn’t satisfy its post-conditions, but some incomplete result is available), the returned range would have two senders: the first containing the partial result, and the second containing the error. Such an API might be used in a coroutine as follows:

// Declare a buffer for read_socket_async to fill in
array<byte, 1024> buff;

for (auto snd : read_socket_async(socket, span{buff})) {
  try {
    if (optional<size_t> bytes_read =
          co_await execution::stopped_as_optional(std::move(snd))) {
      // OK, we read some bytes into buff. Process them here....
    } else {
      // The socket read was cancelled and returned no data. React
      // appropriately.
    }
  } catch (...) {
    // read_socket_async failed to meet its post-conditions.
    // Do some cleanup and propagate the error...
  }
}

Finally, it’s possible to combine these two approaches when the API can both partially succeed (meeting its post-conditions) and partially fail (not meeting its post-conditions).

4.16. All awaitables are senders

Since C++20 added coroutines to the standard, we expect that coroutines and awaitables will be how a great many will choose to express their asynchronous code. However, in this paper, we are proposing to add a suite of asynchronous algorithms that accept senders, not awaitables. One might wonder whether and how these algorithms will be accessible to those who choose coroutines instead of senders.

In truth there will be no problem because all generally awaitable types automatically model the sender concept. The adaptation is transparent and happens in the sender customization points, which are aware of awaitables. (By "generally awaitable" we mean types that don’t require custom await_transform trickery from a promise type to make them awaitable.)

For an example, imagine a coroutine type called task<T> that knows nothing about senders. It doesn’t implement any of the sender customization points. Despite that fact, and despite the fact that the this_thread::sync_wait algorithm is constrained with the sender concept, the following would compile and do what the user wants:

task<int> doSomeAsyncWork();

int main() {
  // OK, awaitable types satisfy the requirements for senders:
  auto o = this_thread::sync_wait(doSomeAsyncWork());
}

Since awaitables are senders, writing a sender-based asynchronous algorithm is trivial if you have a coroutine task type: implement the algorithm as a coroutine. If you are not bothered by the possibility of allocations and indirections as a result of using coroutines, then there is no need to ever write a sender, a receiver, or an operation state.

4.17. Many senders can be trivially made awaitable

If you choose to implement your sender-based algorithms as coroutines, you’ll run into the issue of how to retrieve results from a passed-in sender. This is not a problem. If the coroutine type opts in to sender support -- trivial with the execution::with_awaitable_senders utility -- then a large class of senders are transparently awaitable from within the coroutine.

For example, consider the following trivial implementation of the sender-based retry algorithm:

template<class S>
  requires single-sender<S&> // see [exec.as.awaitable]
task<single-sender-value-type<S>> retry(S s) {
  for (;;) {
    try {
      co_return co_await s;
    } catch(...) {
    }
  }
}
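For readers less familiar with coroutines, the control flow of retry is the same as that of an ordinary synchronous loop. The sketch below is only an analogue under that assumption: retry_sync is a hypothetical name, and a plain callable stands in for the awaited sender.

```cpp
#include <cassert>
#include <stdexcept>

// Synchronous analogue of the coroutine above: keep invoking f until it
// returns normally, swallowing every exception along the way.
template <class F>
auto retry_sync(F f) {
  for (;;) {
    try {
      return f();
    } catch (...) {
      // Ignore the failure and try again.
    }
  }
}

// Usage sketch: the operation fails twice, then succeeds.
inline void demo_retry() {
  int attempts = 0;
  int result = retry_sync([&] {
    if (++attempts < 3) throw std::runtime_error("transient failure");
    return 42;
  });
  assert(result == 42);
  assert(attempts == 3);
}
```

As in the coroutine version, nothing here handles cancellation; that case is discussed in the next section.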

Only some senders can be made awaitable directly, because callbacks are more expressive than coroutines. An awaitable expression has a single type: the result value of the async operation. In contrast, a callback can accept multiple arguments as the result of an operation. What’s more, the callback can have overloaded function call signatures that take different sets of arguments. There is no way to automatically map such senders into awaitables. The with_awaitable_senders utility recognizes as awaitables those senders that send a single value of a single type. To await another kind of sender, a user would have to first map its value channel into a single value of a single type -- say, with the into_variant sender algorithm -- before co_await-ing that sender.

4.18. Cancellation of a sender can unwind a stack of coroutines

When looking at the sender-based retry algorithm in the previous section, we can see that the value and error cases are correctly handled. But what about cancellation? What happens to a coroutine that is suspended awaiting a sender that completes by calling execution::set_stopped?

When your task type’s promise inherits from with_awaitable_senders, what happens is this: the coroutine behaves as if an uncatchable exception had been thrown from the co_await expression. (It is not really an exception, but it’s helpful to think of it that way.) Provided that the promise types of the calling coroutines also inherit from with_awaitable_senders, or more generally implement a member function called unhandled_stopped, the exception unwinds the chain of coroutines as if an exception were thrown except that it bypasses catch(...) clauses.

In order to "catch" this uncatchable stopped exception, one of the calling coroutines in the stack would have to await a sender that maps the stopped channel into either a value or an error. That is achievable with the execution::let_stopped, execution::upon_stopped, execution::stopped_as_optional, or execution::stopped_as_error sender adaptors. For instance, we can use execution::stopped_as_optional to "catch" the stopped signal and map it into an empty optional as shown below:

if (auto opt = co_await execution::stopped_as_optional(some_sender)) {
  // OK, some_sender completed successfully, and opt contains the result.
} else {
  // some_sender completed with a cancellation signal.
}

As described in the section "All awaitables are senders", the sender customization points recognize awaitables and adapt them transparently to model the sender concept. When connect-ing an awaitable and a receiver, the adaptation layer awaits the awaitable within a coroutine that implements unhandled_stopped in its promise type. The effect of this is that an "uncatchable" stopped exception propagates seamlessly out of awaitables, causing execution::set_stopped to be called on the receiver.

Obviously, unhandled_stopped is a library extension of the coroutine promise interface. Many promise types will not implement unhandled_stopped. When an uncatchable stopped exception tries to propagate through such a coroutine, it is treated as an unhandled exception and terminate is called. The solution, as described above, is to use a sender adaptor to handle the stopped exception before awaiting it. It goes without saying that any future Standard Library coroutine types ought to implement unhandled_stopped. The author of Add lazy coroutine (coroutine task) type, which proposes a standard coroutine task type, is in agreement.

4.19. Composition with parallel algorithms

The C++ Standard Library provides a large number of algorithms that offer the potential for non-sequential execution via the use of execution policies. The set of algorithms with execution policy overloads is often referred to as "parallel algorithms", although additional policies are available.

Existing policies, such as execution::par, give the implementation permission to execute the algorithm in parallel. However, the choice of execution resources used to perform the work is left to the implementation.

We will propose a customization point for combining schedulers with policies in order to provide control over where work will execute.

template<class ExecutionPolicy>
unspecified executing_on(
    execution::scheduler auto scheduler,
    ExecutionPolicy && policy
);

This function would return an object of an unspecified type which can be used in place of an execution policy as the first argument to one of the parallel algorithms. The overload selected by that object should execute its computation as requested by policy while using scheduler to create any work to be run. The expression may be ill-formed if scheduler is not able to support the given policy.

The existing parallel algorithms are synchronous; all of the effects performed by the computation are complete before the algorithm returns to its caller. This remains unchanged with the executing_on customization point.

In the future, we expect additional papers will propose asynchronous forms of the parallel algorithms which (1) return senders rather than values or void and (2) where a customization point pairing a sender with an execution policy would similarly be used to obtain an object of unspecified type to be provided as the first argument to the algorithm.

4.20. User-facing sender factories

A sender factory is an algorithm that takes no senders as parameters and returns a sender.

4.20.1. execution::schedule

execution::sender auto schedule(
    execution::scheduler auto scheduler
);

Returns a sender describing the start of a task graph on the provided scheduler. See § 4.2 Schedulers represent execution resources.

execution::scheduler auto sch1 = get_system_thread_pool().scheduler();

execution::sender auto snd1 = execution::schedule(sch1);
// snd1 describes the creation of a new task on the system thread pool

4.20.2. execution::just

execution::sender auto just(
    auto&&... values
);

Returns a sender with no completion schedulers, which sends the provided values. The input values are decay-copied into the returned sender. When the returned sender is connected to a receiver, the values are moved into the operation state if the sender is an rvalue; otherwise, they are copied. Then xvalues referencing the values in the operation state are passed to the receiver’s set_value.

execution::sender auto snd1 = execution::just(3.14);
execution::sender auto then1 = execution::then(snd1, [] (double d) {
  std::cout << d << "\n";
});

execution::sender auto snd2 = execution::just(3.14, 42);
execution::sender auto then2 = execution::then(snd2, [] (double d, int i) {
  std::cout << d << ", " << i << "\n";
});

std::vector v3{1, 2, 3, 4, 5};
execution::sender auto snd3 = execution::just(v3);
execution::sender auto then3 = execution::then(snd3, [] (std::vector<int>&& v3copy) {
  for (auto&& e : v3copy) { e *= 2; }
  return std::move(v3copy);
});
auto&& [v3copy] = this_thread::sync_wait(then3).value();
// v3 contains {1, 2, 3, 4, 5}; v3copy will contain {2, 4, 6, 8, 10}.

execution::sender auto snd4 = execution::just(std::vector{1, 2, 3, 4, 5});
execution::sender auto then4 = execution::then(std::move(snd4), [] (std::vector<int>&& v4) {
  for (auto&& e : v4) { e *= 2; }
  return std::move(v4);
});
auto&& [v4] = this_thread::sync_wait(std::move(then4)).value();
// v4 contains {2, 4, 6, 8, 10}. No vectors were copied in this example.
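The copy and move rules described above can be observed with a copy-counting type. This is a minimal sketch under simplifying assumptions: toy_just and toy_just_sender are hypothetical stand-ins, and a single connect member returning the value models the transfer into the operation state.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Counts how many times a value has been copied on its way to us.
struct counted {
  int copies = 0;
  counted() = default;
  counted(const counted& o) : copies(o.copies + 1) {}
  counted(counted&& o) noexcept : copies(o.copies) {}
};

// Toy stand-in for the sender returned by execution::just.
template <class T>
struct toy_just_sender {
  T value;  // decay-copied when the sender is created
  // Connecting an lvalue sender copies the value into the operation state.
  T connect() const& { return value; }
  // Connecting an rvalue sender moves it instead.
  T connect() && { return std::move(value); }
};

template <class T>
toy_just_sender<std::decay_t<T>> toy_just(T&& v) {
  return {std::forward<T>(v)};
}

// Usage sketch.
inline void demo_just_copies() {
  counted c;
  auto snd = toy_just(c);            // lvalue input: one copy into the sender
  assert(snd.value.copies == 1);
  counted op1 = snd.connect();       // lvalue sender: a second copy
  assert(op1.copies == 2);
  counted op2 = toy_just(counted{}).connect();  // rvalues all the way: moves only
  assert(op2.copies == 0);
}
```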

4.20.3. execution::transfer_just

execution::sender auto transfer_just(
    execution::scheduler auto scheduler,
    auto&&... values
);

Returns a sender whose value completion scheduler is the provided scheduler, which sends the provided values in the same manner as just.

execution::sender auto vals = execution::transfer_just(
    get_system_thread_pool().scheduler(),
    1, 2, 3
);
execution::sender auto snd = execution::then(vals, [](auto... args) {
    std::print("{}{}{}", args...);
});
// when snd is executed, it will print "123"

This factory is included as it greatly simplifies lifting values into senders.

4.20.4. execution::just_error

execution::sender auto just_error(
    auto && error
);

Returns a sender with no completion schedulers, which completes with the specified error. If the provided error is an lvalue reference, a copy is made inside the returned sender and a non-const lvalue reference to the copy is sent to the receiver’s set_error. If the provided error is an rvalue reference, it is moved into the returned sender and an rvalue reference to it is sent to the receiver’s set_error.

4.20.5. execution::just_stopped

execution::sender auto just_stopped();

Returns a sender with no completion schedulers, which completes immediately by calling the receiver’s set_stopped.

4.20.6. execution::read

execution::sender auto read(auto tag);

execution::sender auto get_scheduler() {
  return read(execution::get_scheduler);
}
execution::sender auto get_delegatee_scheduler() {
  return read(execution::get_delegatee_scheduler);
}
execution::sender auto get_allocator() {
  return read(execution::get_allocator);
}
execution::sender auto get_stop_token() {
  return read(execution::get_stop_token);
}

Returns a sender that reaches into a receiver’s environment and pulls out the current value associated with the customization point denoted by tag. It then sends the value read back to the receiver through the value channel. For instance, get_scheduler() (with no arguments) is a sender that asks the receiver for the currently suggested scheduler and passes it to the receiver’s set_value completion signal.

This can be useful when scheduling nested dependent work. The following sender pulls the current scheduler into the value channel and then schedules more work onto it.

execution::sender auto task =
  execution::get_scheduler()
    | execution::let_value([](auto sched) {
        return execution::on(sched, /* some nested work here */);
    });

this_thread::sync_wait( std::move(task) ); // wait for it to finish

This code uses the fact that sync_wait associates a scheduler with the receiver that it connects with task. get_scheduler() reads that scheduler out of the receiver, and passes it to let_value's receiver’s set_value function, which in turn passes it to the lambda. That lambda returns a new sender that uses the scheduler to schedule some nested work onto sync_wait's scheduler.
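The flow of information from the receiver's environment back into its value channel can be sketched in a few lines. All names below (toy_env, get_scheduler_name, toy_read, start_with) are hypothetical, and a string stands in for a scheduler; this is a model of the idea, not of the proposed wording.

```cpp
#include <cassert>
#include <string>

// Toy environment carried by a receiver.
struct toy_env {
  std::string scheduler_name;
};

// Hypothetical query tag: extracts one property from an environment.
struct get_scheduler_name_t {
  const std::string& operator()(const toy_env& env) const {
    return env.scheduler_name;
  }
};
inline constexpr get_scheduler_name_t get_scheduler_name{};

// Toy receiver exposing its environment and a value channel.
struct toy_receiver {
  toy_env env;
  std::string received;
  const toy_env& get_env() const { return env; }
  void set_value(const std::string& v) { received = v; }
};

// Toy stand-in for the sender returned by execution::read(tag): when
// started, it queries the receiver's environment with the tag and sends
// the answer straight back through set_value.
template <class Tag>
struct toy_read_sender {
  Tag tag;
  template <class R>
  void start_with(R& rcv) const {
    rcv.set_value(tag(rcv.get_env()));
  }
};

template <class Tag>
toy_read_sender<Tag> toy_read(Tag tag) {
  return {tag};
}

// Usage sketch.
inline void demo_read() {
  toy_receiver rcv{toy_env{"sync_wait run loop"}, {}};
  toy_read(get_scheduler_name).start_with(rcv);
  assert(rcv.received == "sync_wait run loop");
}
```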

4.21. User-facing sender adaptors

A sender adaptor is an algorithm that takes one or more senders, which it may execution::connect, as parameters, and returns a sender, whose completion is related to the sender arguments it has received.

Sender adaptors are lazy, that is, they are never allowed to submit any work for execution prior to the returned sender being started later on, and are also guaranteed to not start any input senders passed into them. Sender consumers such as § 4.22.1 execution::start_detached and § 4.22.2 this_thread::sync_wait start senders.

For a more implementer-centric description of starting senders, see § 5.5 Sender adaptors are lazy.

4.21.1. execution::transfer

execution::sender auto transfer(
    execution::sender auto input,
    execution::scheduler auto scheduler
);

Returns a sender describing the transition from the execution agent of the input sender to the execution agent of the target scheduler. See § 4.6 Execution resource transitions are explicit.

execution::scheduler auto cpu_sched = get_system_thread_pool().scheduler();
execution::scheduler auto gpu_sched = cuda::scheduler();

execution::sender auto cpu_task = execution::schedule(cpu_sched);
// cpu_task describes the creation of a new task on the system thread pool

execution::sender auto gpu_task = execution::transfer(cpu_task, gpu_sched);
// gpu_task describes the transition of the task graph described by cpu_task to the gpu

4.21.2. execution::then

execution::sender auto then(
    execution::sender auto input,
    std::invocable<values-sent-by(input)...> function
);

then returns a sender describing the task graph described by the input sender, with an added node of invoking the provided function with the values sent by the input sender as arguments.

then is guaranteed to not begin executing function until the returned sender is started.

execution::sender auto input = get_input();
execution::sender auto snd = execution::then(input, [](auto... args) {
    (std::cout << ... << args);
});
// snd describes the work described by input
// followed by printing all of the values sent by input

This adaptor is included as it is necessary for writing any sender code that actually performs a useful function.

4.21.3. execution::upon_*

execution::sender auto upon_error(
    execution::sender auto input,
    std::invocable<errors-sent-by(input)...> function
);

execution::sender auto upon_stopped(
    execution::sender auto input,
    std::invocable auto function
);

upon_error and upon_stopped are similar to then, but where then works with values sent by the input sender, upon_error works with errors, and upon_stopped is invoked when the "stopped" signal is sent.

4.21.4. execution::let_*

execution::sender auto let_value(
    execution::sender auto input,
    std::invocable<values-sent-by(input)...> function
);

execution::sender auto let_error(
    execution::sender auto input,
    std::invocable<errors-sent-by(input)...> function
);

execution::sender auto let_stopped(
    execution::sender auto input,
    std::invocable auto function
);

let_value is very similar to then: when it is started, it invokes the provided function with the values sent by the input sender as arguments. However, where the sender returned from then sends exactly what that function ends up returning - let_value requires that the function return a sender, and the sender returned by let_value sends the values sent by the sender returned from the callback. This is similar to the notion of "future unwrapping" in future/promise-based frameworks.

let_value is guaranteed to not begin executing function until the returned sender is started.

let_error and let_stopped are similar to let_value, but where let_value works with values sent by the input sender, let_error works with errors, and let_stopped is invoked when the "stopped" signal is sent.
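The difference between then and let_value, and the laziness of both, can be shown with a toy model in which a "sender" is just a nullary callable and starting it means invoking it. The names toy_then and toy_let_value are hypothetical; the real adaptors are considerably more general.

```cpp
#include <cassert>

// toy_then: the result of f is the result of the new "sender".
template <class S, class F>
auto toy_then(S s, F f) {
  return [=] { return f(s()); };
}

// toy_let_value: f must itself return a "sender"; the new "sender"
// completes with whatever that inner one produces (the unwrapping step).
template <class S, class F>
auto toy_let_value(S s, F f) {
  return [=] { return f(s())(); };
}

// Usage sketch.
inline void demo_then_vs_let_value() {
  int calls = 0;
  auto input = [] { return 20; };

  auto t = toy_then(input, [&calls](int x) { ++calls; return x + 1; });
  auto l = toy_let_value(input, [&calls](int x) {
    ++calls;
    return [x] { return x * 2; };  // returns another toy "sender"
  });

  assert(calls == 0);  // building the graph runs nothing
  assert(t() == 21);
  assert(l() == 40);
  assert(calls == 2);  // each function ran exactly once, when started
}
```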

4.21.5. execution::on

execution::sender auto on(
    execution::scheduler auto sched,
    execution::sender auto snd
);

Returns a sender which, when started, will start the provided sender on an execution agent belonging to the execution resource associated with the provided scheduler. This returned sender has no completion schedulers.

4.21.6. execution::into_variant

execution::sender auto into_variant(
    execution::sender auto snd
);

Returns a sender which sends a variant of tuples of all the possible sets of types sent by the input sender. Senders can send multiple sets of values depending on runtime conditions; this is a helper function that turns them into a single variant value.

4.21.7. execution::stopped_as_optional

execution::sender auto stopped_as_optional(
    single-sender auto snd
);

Returns a sender that maps the value channel from a T to an optional<decay_t<T>>, and maps the stopped channel to a value of an empty optional<decay_t<T>>.
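The channel mapping can be sketched by modelling a finished operation as a variant over the three completion kinds. The names completion, stopped_t, and toy_stopped_as_optional are hypothetical, and the model assumes the error is a std::exception_ptr that is simply rethrown.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <stdexcept>
#include <utility>
#include <variant>

struct stopped_t {
  explicit stopped_t() = default;
};

// A finished toy operation: a value, an error, or "stopped".
template <class T>
using completion = std::variant<T, std::exception_ptr, stopped_t>;

// Value -> engaged optional, stopped -> empty optional, error -> unchanged
// (modelled here by rethrowing).
template <class T>
std::optional<T> toy_stopped_as_optional(completion<T> c) {
  if (auto* v = std::get_if<T>(&c)) return std::move(*v);
  if (auto* e = std::get_if<std::exception_ptr>(&c)) std::rethrow_exception(*e);
  return std::nullopt;
}

// Usage sketch.
inline void demo_stopped_as_optional() {
  completion<int> ok{std::in_place_type<int>, 42};
  completion<int> stopped{std::in_place_type<stopped_t>};
  completion<int> failed{std::in_place_type<std::exception_ptr>,
                         std::make_exception_ptr(std::runtime_error("boom"))};

  assert(toy_stopped_as_optional<int>(ok) == 42);
  assert(!toy_stopped_as_optional<int>(stopped).has_value());

  bool threw = false;
  try {
    toy_stopped_as_optional<int>(failed);
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);  // the error channel passes through untouched
}
```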

4.21.8. execution::stopped_as_error

template<move_constructible Error>
execution::sender auto stopped_as_error(
    execution::sender auto snd,
    Error err
);

Returns a sender that maps the stopped channel to an error of err.

4.21.9. execution::bulk

execution::sender auto bulk(
    execution::sender auto input,
    std::integral auto shape,
    std::invocable<decltype(shape), values-sent-by(input)...> function
);

Returns a sender describing the task of invoking the provided function with every index in the provided shape along with the values sent by the input sender. The returned sender completes once all invocations have completed, or an error has occurred. If it completes by sending values, they are equivalent to those sent by the input sender.

No instance of function will begin executing until the returned sender is started. Each invocation of function runs in an execution agent whose forward progress guarantees are determined by the scheduler on which they are run. All agents created by a single use of bulk execute with the same guarantee. The number of execution agents used by bulk is not specified. This allows a scheduler to execute some invocations of the function in parallel.

In this proposal, only integral types are used to specify the shape of the bulk section. We expect that future papers may wish to explore extensions of the interface to explore additional kinds of shapes, such as multi-dimensional grids, that are commonly used for parallel computing tasks.

4.21.10. execution::split

execution::sender auto split(execution::sender auto sender);

If the provided sender is a multi-shot sender, returns that sender. Otherwise, returns a multi-shot sender which sends values equivalent to the values sent by the provided sender. See § 4.7 Senders can be either multi-shot or single-shot.

4.21.11. execution::when_all

execution::sender auto when_all(
    execution::sender auto ...inputs
);

execution::sender auto when_all_with_variant(
    execution::sender auto ...inputs
);

when_all returns a sender that completes once all of the input senders have completed. It is constrained to only accept senders that can complete with a single set of values (i.e., it only calls one overload of set_value on its receiver). The values sent by this sender are the values sent by each of the input senders, in order of the arguments passed to when_all. It completes inline on the execution resource on which the last input sender completes, unless stop is requested before when_all is started, in which case it completes inline within the call to start.

when_all_with_variant does the same, but it adapts all the input senders using into_variant, and so it does not constrain the input arguments as when_all does.

The returned sender has no completion schedulers.

See § 4.9 Senders are joinable.

execution::sender auto sends_1 = ...;
execution::sender auto sends_abc = ...;

execution::sender auto both = execution::when_all(
    sends_1,
    sends_abc
);

execution::sender auto final = execution::then(both, [](auto... args){
    std::cout << std::format("the two args: {}, {}", args...);
});
// when final executes, it will print "the two args: 1, abc"
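The way when_all concatenates results in argument order can be modelled with std::tuple_cat. In this sketch a "sender" is a nullary callable returning a tuple and toy_when_all is a hypothetical name; the inputs run one after another on the calling thread, so no actual concurrency or cancellation is modelled.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Combines toy "senders" into one whose result is the concatenation of
// their result tuples, in argument order.
template <class... Ss>
auto toy_when_all(Ss... ss) {
  return [=] { return std::tuple_cat(ss()...); };
}

// Usage sketch mirroring the example above.
inline void demo_when_all() {
  auto sends_1 = [] { return std::tuple{1}; };
  auto sends_abc = [] { return std::tuple{std::string("abc")}; };

  auto both = toy_when_all(sends_1, sends_abc);
  auto [n, s] = both();
  assert(n == 1);
  assert(s == "abc");
}
```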

4.21.12. execution::transfer_when_all

execution::sender auto transfer_when_all(
    execution::scheduler auto sched,
    execution::sender auto ...inputs
);

execution::sender auto transfer_when_all_with_variant(
    execution::scheduler auto sched,
    execution::sender auto ...inputs
);

Similar to § 4.21.11 execution::when_all, but returns a sender whose value completion scheduler is the provided scheduler.

See § 4.9 Senders are joinable.

4.21.13. execution::ensure_started

execution::sender auto ensure_started(
    execution::sender auto sender
);

Once ensure_started returns, it is known that the provided sender has been connected and start has been called on the resulting operation state (see § 5.2 Operation states represent work); in other words, the work described by the provided sender has been submitted for execution on the appropriate execution resources. Returns a sender which completes when the provided sender completes and sends values equivalent to those of the provided sender.

If the returned sender is destroyed before execution::connect() is called, or if execution::connect() is called but the returned operation-state is destroyed before execution::start() is called, then a stop-request is sent to the eagerly launched operation and the operation is detached and will run to completion in the background. Its result will be discarded when it eventually completes.

Note that the application will need to make sure that resources are kept alive in the case that the operation detaches, e.g., by holding a std::shared_ptr to those resources or otherwise having some out-of-band way to signal completion of the operation so that resource release can be sequenced after the completion.

4.22. User-facing sender consumers

A sender consumer is an algorithm that takes one or more senders, which it may execution::connect, as parameters, and does not return a sender.

4.22.1. execution::start_detached

void start_detached(
    execution::sender auto sender
);

Like ensure_started, but does not return a value; if the provided sender sends an error instead of a value, std::terminate is called.

4.22.2. this_thread::sync_wait

auto sync_wait(
    execution::sender auto sender
) requires (always-sends-same-values(sender))
    -> std::optional<std::tuple<values-sent-by(sender)>>;

this_thread::sync_wait is a sender consumer that submits the work described by the provided sender for execution, similarly to ensure_started, except that it blocks the current std::thread or thread of main until the work is completed, and returns an optional tuple of values that were sent by the provided sender on its completion of work. Where § 4.20.1 execution::schedule and § 4.20.3 execution::transfer_just are meant to enter the domain of senders, sync_wait is meant to exit the domain of senders, retrieving the result of the task graph.

If the provided sender sends an error instead of values, sync_wait throws that error as an exception, or rethrows the original exception if the error is of type std::exception_ptr.

If the provided sender sends the "stopped" signal instead of values, sync_wait returns an empty optional.

For an explanation of the requires clause, see § 5.8 All senders are typed. That clause also explains another sender consumer, built on top of sync_wait: sync_wait_with_variant.

Note: This function is specified inside std::this_thread, and not inside execution. This is because sync_wait has to block the current execution agent, but determining what the current execution agent is is not reliable. Since the standard does not specify any functions on the current execution agent other than those in std::this_thread, this is the flavor of this function that is being proposed. If C++ ever obtains fibers, for instance, we expect that a variant of this function called std::this_fiber::sync_wait would be provided. We also expect that runtimes with execution agents that use different synchronization mechanisms than std::thread's will provide their own flavors of sync_wait as well (assuming their execution agents have the means to block in a non-deadlock manner).

4.23. execution::execute

In addition to the three categories of functions presented above, we also propose to include a convenience function for fire-and-forget eager one-way submission of an invocable to a scheduler, to fulfil the role of one-way executors from P0443.

void execution::execute(
    execution::scheduler auto sched,
    std::invocable auto fn
);

Submits the provided function for execution on the provided scheduler, as-if by:

auto snd = execution::schedule(sched);
auto work = execution::then(snd, fn);
execution::start_detached(work);

5. Design - implementer side

5.1. Receivers serve as glue between senders

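A receiver is a callback that supports more than one channel. In fact, it supports three of them:

  * set_value, which signals successful completion of the operation;

  * set_error, which signals that an error occurred; and

  * set_stopped, which signals that the operation ended without succeeding and without failing, typically because it was cancelled.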

Once an async operation has been started exactly one of these functions must be invoked on a receiver before it is destroyed.

While the receiver interface may look novel, it is in fact very similar to the interface of std::promise, which provides the first two signals as set_value and set_exception, and it’s possible to emulate the third channel with lifetime management of the promise.

Receivers are not a part of the end-user-facing API of this proposal; they are necessary to allow unrelated senders to communicate with each other, but the only users who will interact with receivers directly are authors of senders.

Receivers are what is passed as the second argument to § 5.3 execution::connect.

5.2. Operation states represent work

An operation state is an object that represents work. Unlike senders, it is not a chaining mechanism; instead, it is a concrete object that packages the work described by a full sender chain, ready to be executed. An operation state is neither movable nor copyable, and its interface consists of a single algorithm: start, which serves as the submission point of the work represented by a given operation state.

Operation states are not a part of the user-facing API of this proposal; they are necessary for implementing sender consumers like execution::ensure_started and this_thread::sync_wait, and the knowledge of them is necessary to implement senders, so the only users who will interact with operation states directly are authors of senders and authors of sender algorithms.

The return value of § 5.3 execution::connect must satisfy the operation state concept.

5.3. execution::connect

execution::connect is a customization point which connects senders with receivers, resulting in an operation state. That operation state ensures that, if start is called on it, exactly one of the completion operations will eventually be called on the receiver passed to connect.

execution::sender auto snd = /* some input sender */;
execution::receiver auto rcv = /* some receiver */;
execution::operation_state auto state = execution::connect(snd, rcv);

execution::start(state);
// at this point, it is guaranteed that the work represented by state has been submitted
// to an execution resource, and that execution resource will eventually call one of the
// completion operations on rcv

// operation states are not movable, and therefore this operation state object must be
// kept alive until the operation finishes
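The relationship between a sender, a receiver, and the operation state produced by connecting them can be sketched with three small types. These are hypothetical toy types using member functions rather than the proposed customization points, and the toy operation completes synchronously inside start.

```cpp
#include <cassert>
#include <utility>

// Toy receiver: writes the value it receives to a caller-owned location.
struct toy_receiver {
  int* out;
  void set_value(int v) noexcept { *out = v; }
};

// Toy operation state: packages the work plus the receiver. It is neither
// copyable nor movable, so it must stay where connect() created it.
template <class R>
struct toy_operation {
  int value;
  R rcv;
  toy_operation(int v, R r) : value(v), rcv(std::move(r)) {}
  toy_operation(toy_operation&&) = delete;
  void start() noexcept { rcv.set_value(value); }
};

// Toy sender: only describes the work; connect() does not run anything.
struct toy_just_sender {
  int value;
  template <class R>
  toy_operation<R> connect(R rcv) const {
    // Returned as a prvalue, so no move is needed (guaranteed elision).
    return toy_operation<R>(value, std::move(rcv));
  }
};

// Usage sketch.
inline void demo_connect() {
  int result = 0;
  toy_just_sender snd{7};
  auto state = snd.connect(toy_receiver{&result});
  assert(result == 0);  // connected, but not yet started
  state.start();
  assert(result == 7);  // the receiver's completion was called
}
```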

5.4. Sender algorithms are customizable

Senders being able to advertise what their completion schedulers are fulfills one of the promises of senders: that of being able to customize an implementation of a sender algorithm based on what scheduler any work it depends on will complete on.

The simple way to provide customizations for functions like then, that is for sender adaptors and sender consumers, is to follow the customization scheme that has been adopted for the C++20 ranges library; to do that, we would define the expression execution::then(sender, invocable) to be equivalent to:

  1. sender.then(invocable), if that expression is well-formed; otherwise

  2. then(sender, invocable), performed in a context where this call always performs ADL, if that expression is well-formed; otherwise

  3. a default implementation of then, which returns a sender adaptor, and then define the exact semantics of said adaptor.

However, this definition is problematic. Imagine another sender adaptor, bulk, which is a structured abstraction for a loop over an index space. Its default implementation is just a for loop. However, for accelerator runtimes like CUDA, we would like sender algorithms like bulk to have specialized behavior, which invokes a kernel of more than one thread (with its size defined by the call to bulk); therefore, we would like to customize bulk for CUDA senders to achieve this. However, there’s no reason for CUDA kernels to necessarily customize the then sender adaptor, as the generic implementation is perfectly sufficient. This creates a problem, though; consider the following snippet:

execution::scheduler auto cuda_sch = cuda_scheduler{};

execution::sender auto initial = execution::schedule(cuda_sch);
// the type of initial is a type defined by the cuda_scheduler
// let’s call it cuda::schedule_sender<>

execution::sender auto next = execution::then(initial, []{ return 1; });
// the type of next is a standard-library unspecified sender adaptor
// that wraps the cuda sender
// let’s call it execution::then_sender_adaptor<cuda::schedule_sender<>>

execution::sender auto kernel_sender = execution::bulk(next, shape, [](int i){ ... });

How can we specialize the bulk sender adaptor for our wrapped schedule_sender? Well, here’s one possible approach, taking advantage of ADL (and the fact that the definition of "associated namespace" also recursively enumerates the associated namespaces of all template parameters of a type):

namespace cuda::for_adl_purposes {
template<typename... SentValues>
class schedule_sender {
    execution::operation_state auto connect(execution::receiver auto rcv);
    execution::scheduler auto get_completion_scheduler() const;
};

execution::sender auto bulk(
    execution::sender auto && input,
    execution::shape auto && shape,
    invocable<sender-values(input)> auto && fn)
{
    // return a cuda sender representing a bulk kernel launch
}
} // namespace cuda::for_adl_purposes

However, suppose the input sender is not just a then_sender_adaptor like in the example above, but another sender that overrides bulk itself, as a member function, because its author believes they know an optimization for bulk. In that case the specialization above will no longer be selected, because a member function of the first argument is a better match than the ADL-found overload.

This means that well-meant specialization of sender algorithms that are entirely scheduler-agnostic can have negative consequences. The scheduler-specific specialization - which is essential for good performance on platforms providing specialized ways to launch certain sender algorithms - would not be selected in such cases. But it’s really the scheduler that should control the behavior of sender algorithms when a non-default implementation exists, not the sender. Senders merely describe work; schedulers, however, are the handle to the runtime that will eventually execute said work, and should thus have the final say in how the work is going to be executed.

Therefore, we are proposing the following customization scheme (also modified to take § 5.9 Ranges-style CPOs vs tag_invoke into account): the expression execution::<sender-algorithm>(sender, args...), for any given sender algorithm that accepts a sender as its first argument, should be equivalent to:

  1. tag_invoke(<sender-algorithm>, get_completion_scheduler<Tag>(get_env(sender)), sender, args...), if that expression is well-formed; otherwise

  2. tag_invoke(<sender-algorithm>, sender, args...), if that expression is well-formed; otherwise

  3. a default implementation, if there exists a default implementation of the given sender algorithm.

where Tag is one of set_value, set_error, or set_stopped. For most sender algorithms, the completion scheduler for set_value would be used, but for some (like upon_error or let_stopped), one of the others would be used.

For sender algorithms which accept concepts other than sender as their first argument, we propose that the customization scheme remains as it has been in A Unified Executors Proposal for C++ so far, except it should also use tag_invoke.

5.5. Sender adaptors are lazy

Contrary to early revisions of this paper, we propose to make all sender adaptors perform strictly lazy submission, unless specified otherwise (the one notable exception in this paper is § 4.21.13 execution::ensure_started, whose sole purpose is to start an input sender).

Strictly lazy submission means that there is a guarantee that no work is submitted to an execution resource before a receiver is connected to a sender, and execution::start is called on the resulting operation state.
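The guarantee can be observed with a small model of a then-like adaptor. The sketch below is self-contained and uses hypothetical, simplified types (receivers expose only set_value(int), and connect and start are plain member functions); it is meant to illustrate laziness, not to mirror the proposed API.

```cpp
#include <utility>

// Receiver that applies fn before passing the result downstream.
template <class Fn, class Receiver>
struct then_receiver {
    Fn fn;
    Receiver next;
    void set_value(int v) && { std::move(next).set_value(fn(v)); }
};

// Simplified operation state for a sender of a single int.
template <class Receiver>
struct just_op {
    int value;
    Receiver rcv;
    void start() { std::move(rcv).set_value(value); }
};

struct just_sender {
    int value;
    template <class Receiver>
    auto connect(Receiver rcv) const {
        return just_op<Receiver>{value, std::move(rcv)};
    }
};

template <class Sender, class Fn>
struct then_sender {
    Sender upstream;
    Fn fn;
    template <class Receiver>
    auto connect(Receiver rcv) const {
        // Connecting only wraps the receiver; fn is not invoked here.
        return upstream.connect(then_receiver<Fn, Receiver>{fn, std::move(rcv)});
    }
};

struct store_receiver {
    int* out;
    void set_value(int v) && { *out = v; }
};

struct lazy_trace {
    int calls_before_start;
    int calls_after_start;
    int result;
};

lazy_trace trace_lazy_then() {
    int calls = 0;
    int result = 0;
    auto fn = [&calls](int v) { ++calls; return v + 1; };

    then_sender<just_sender, decltype(fn)> snd{just_sender{41}, fn};
    auto op = snd.connect(store_receiver{&result});
    int before = calls;  // building and connecting the chain ran no user code

    op.start();          // only now is fn invoked
    return {before, calls, result};
}
```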

5.6. Lazy senders provide optimization opportunities

Because lazy senders fundamentally describe work, instead of describing or representing the submission of said work to an execution resource, and thanks to the flexibility of the customization of most sender algorithms, they provide an opportunity for fusing multiple algorithms in a sender chain together, into a single function that can later be submitted for execution by an execution resource. There are two ways this can happen.

The first (and most common) way for such optimizations to happen is thanks to the structure of the implementation: because all the work is done within callbacks invoked on the completion of an earlier sender, recursively up to the original source of computation, the compiler is able to see a chain of work described using senders as a tree of tail calls, allowing for inlining and removal of most of the sender machinery. In fact, when work is not submitted to execution resources outside of the current thread of execution, compilers are capable of removing the senders abstraction entirely, while still allowing for composition of functions across different parts of a program.

The second way for this to occur is when a sender algorithm is specialized for a specific set of arguments. For instance, we expect that, for senders which are known to have been started already, § 4.21.13 execution::ensure_started will be an identity transformation, because the sender algorithm will be specialized for such senders. Similarly, an implementation could recognize two subsequent § 4.21.9 execution::bulks of compatible shapes, and merge them together into a single submission of a GPU kernel.

5.7. Execution resource transitions are two-step

Because execution::transfer takes a sender as its first argument, it is not actually directly customizable by the target scheduler. This is by design: the target scheduler may not know how to transition from a scheduler such as a CUDA scheduler; transitioning away from a GPU in an efficient manner requires making runtime calls that are specific to the GPU in question, and the same is usually true for other kinds of accelerators too (or for schedulers running on remote systems). To avoid this problem, specialized schedulers like the ones mentioned here can still hook into the transition mechanism, and inject a sender which will perform a transition to the regular CPU execution resource, so that any sender can be attached to it.

This, however, is a problem: because customization of sender algorithms must be controlled by the scheduler they will run on (see § 5.4 Sender algorithms are customizable), the type of the sender returned from transfer must be controllable by the target scheduler. Besides, the target scheduler may itself represent a specialized execution resource, which requires additional work to be performed to transition to it. GPUs and remote node schedulers are once again good examples of such schedulers: executing code on their execution resources requires making runtime API calls for work submission, and quite possibly for the data movement of the values being sent by the input sender passed into transfer.

To allow for such customization from both ends, we propose the inclusion of a secondary transitioning sender adaptor, called schedule_from. This adaptor is a form of schedule, but takes an additional, second argument: the input sender. This adaptor is not meant to be invoked manually by the end users; they are always supposed to invoke transfer, to ensure that both schedulers have a say in how the transitions are made. Any scheduler that specializes transfer(snd, sch) shall ensure that the return value of their customization is equivalent to schedule_from(sch, snd2), where snd2 is a successor of snd that sends values equivalent to those sent by snd.

The default implementation of transfer(snd, sched) is schedule_from(sched, snd).

5.8. All senders are typed

All senders must advertise the types they will send when they complete. This is necessary for a number of features, and writing code in a way that’s agnostic of whether an input sender is typed or not in common sender adaptors such as execution::then is hard.

The mechanism for this advertisement is similar to the one in A Unified Executors Proposal for C++; the way to query the types is through completion_signatures_of_t<S, [Env]>::value_types<tuple_like, variant_like>.

completion_signatures_of_t::value_types is a template that takes two arguments: one is a tuple-like template, the other is a variant-like template. The tuple-like argument is required to represent senders sending more than one value (such as when_all). The variant-like argument is required to represent senders that choose which specific values to send at runtime.

There’s a choice made in the specification of § 4.22.2 this_thread::sync_wait: it returns a tuple of values sent by the sender passed to it, wrapped in std::optional to handle the set_stopped signal. However, this assumes that those values can be represented as a tuple, like here:

execution::sender auto sends_1 = ...;
execution::sender auto sends_2 = ...;
execution::sender auto sends_3 = ...;

auto [a, b, c] = this_thread::sync_wait(
    execution::transfer_when_all(
        execution::get_completion_scheduler<execution::set_value_t>(get_env(sends_1)),
        sends_1,
        sends_2,
        sends_3
    )).value();
// a == 1
// b == 2
// c == 3

This works well for senders that always send the same set of arguments. If we ignore the possibility of having a sender that sends different sets of arguments into a receiver, we can specify the "canonical" (i.e. required to be followed by all senders) form of value_types of a sender which sends Types... to be as follows:

template<template<typename ...> typename TupleLike>
using value_types = TupleLike<Types...>;

If senders could only ever send one specific set of values, this would probably need to be the required form of value_types for all senders; defining it otherwise would cause very weird results and should be considered a bug.

This matter is somewhat complicated by the fact that (1) set_value for receivers can be overloaded and accept different sets of arguments, and (2) senders are allowed to send multiple different sets of values, depending on runtime conditions, the data they consumed, and so on. To accommodate this, A Unified Executors Proposal for C++ also includes a second template parameter to value_types, one that represents a variant-like type. If we permit such senders, we would almost certainly need to require that the canonical form of value_types for all senders sending the different sets of arguments Types1..., Types2..., ..., TypesN... (to ensure consistency in how they are handled, and to avoid accidentally interpreting a user-provided variant as a sender-provided one) be as follows:

template<
    template<typename ...> typename TupleLike,
    template<typename ...> typename VariantLike
>
using value_types = VariantLike<
    TupleLike<Types1...>,
    TupleLike<Types2...>,
    ...,
    TupleLike<TypesN...>
>;

This, however, introduces a couple of complications:

  1. A just(1) sender would also need to follow this structure, so the correct type for storing the value sent by it would be std::variant<std::tuple<int>> or some such. This introduces a lot of compile time overhead for the simplest senders, and this overhead effectively exists in all places in the code where value_types is queried, regardless of the tuple-like and variant-like templates passed to it. Such overhead does exist if only the tuple-like parameter exists, but is made much worse by adding this second wrapping layer.

  2. As a consequence of (1): because sync_wait needs to store the above type, it can no longer return just a std::tuple<int> for just(1); it has to return std::variant<std::tuple<int>>. C++ currently does not have an easy way to destructure this; it may get less awkward with pattern matching, but even then it seems extremely heavyweight to involve variants in this API, and for the purpose of generic code, the kind of the return type of sync_wait must be the same across all sender types.

One possible solution to (2) above is to place a requirement on sync_wait that it can only accept senders which send only a single set of values, therefore removing the need for std::variant to appear in its API; because of this, we propose to expose both sync_wait, which is a simple, user-friendly version of the sender consumer, but requires that value_types have only one possible variant, and sync_wait_with_variant, which accepts any sender, but returns an optional whose value type is the variant of all the possible tuples sent by the input sender:

auto sync_wait_with_variant(
    execution::sender auto sender
) -> std::optional<std::variant<
        std::tuple<values0-sent-by(sender)>,
        std::tuple<values1-sent-by(sender)>,
        ...,
        std::tuple<valuesn-sent-by(sender)>
    >>;

auto sync_wait(
    execution::sender auto sender
) requires (always-sends-same-values(sender))
    -> std::optional<std::tuple<values-sent-by(sender)>>;

5.9. Ranges-style CPOs vs tag_invoke

The contemporary technique for customization in the Standard Library is customization point objects. A customization point object looks for member functions and then for nonmember functions with the same name as the customization point, and calls those if they match. This is the technique used by the C++20 ranges library, and previous executors proposals (A Unified Executors Proposal for C++ and Towards C++23 executors: A proposal for an initial set of algorithms) intended to use it as well. However, it has several unfortunate consequences:

  1. It does not allow for easy propagation of customization points unknown to the adaptor to a wrapped object, which makes writing universal adaptor types much harder - and this proposal uses quite a lot of those.

  2. It effectively reserves names globally. Because neither member names nor ADL-found functions can be qualified with a namespace, every customization point object that uses the ranges scheme reserves the name for all types in all namespaces. This is unfortunate due to the sheer number of customization points already in the paper, but also ones that we are envisioning in the future. It’s also a big problem for one of the operations being proposed already: sync_wait. We imagine that if, in the future, C++ were to gain fiber support, we would want to also have std::this_fiber::sync_wait, in addition to std::this_thread::sync_wait. However, because we would want the names to be the same in both cases, we would need to make the names of the customizations not match the names of the customization points. This is undesirable.

This paper proposes to instead use the mechanism described in tag_invoke: A general pattern for supporting customisable functions: tag_invoke; the wording for tag_invoke has been incorporated into the proposed specification in this paper.

In short, instead of using globally reserved names, tag_invoke uses the type of the customization point object itself as the mechanism to find customizations. It globally reserves only a single name - tag_invoke - which itself is used the same way that ranges-style customization points are used. All other customization points are defined in terms of tag_invoke. For example, the customization for std::this_thread::sync_wait(s) will call tag_invoke(std::this_thread::sync_wait, s), instead of attempting to invoke s.sync_wait(), and then sync_wait(s) if the member call is not valid.

Using tag_invoke has the following benefits:

  1. It reserves only a single global name, instead of reserving a global name for every customization point object we define.

  2. It is possible to propagate customizations to a subobject, because the information of which customization point is being resolved is in the type of an argument, and not in the name of the function:

    // forward most customizations to a subobject
    template<typename Tag, typename ...Args>
    friend auto tag_invoke(Tag && tag, wrapper & self, Args &&... args) {
        return std::forward<Tag>(tag)(self.subobject, std::forward<Args>(args)...);
    }
    
    // but override one of them with a specific value
    friend auto tag_invoke(specific_customization_point_t, wrapper & self) {
        return self.some_value;
    }
    
  3. It is possible to pass those as template arguments to types, because the information of which customization point is being resolved is in the type. Similarly to how A Unified Executors Proposal for C++ defines a polymorphic executor wrapper which accepts a list of properties it supports, we can imagine scheduler and sender wrappers that accept a list of queries and operations they support. That list can contain the types of the customization point objects, and the polymorphic wrappers can then specialize those customization points on themselves using tag_invoke, dispatching to manually constructed vtables containing pointers to specialized implementations for the wrapped objects. For an example of such a polymorphic wrapper, see unifex::any_unique (example).
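The forwarding benefit listed above can be exercised end to end with a small, self-contained model. The sketch assumes a simplified tag_invoke object in a hypothetical namespace sketch, two invented customization points (get_id and get_priority) and invented user types; it is not the wording proposed in this paper.

```cpp
#include <utility>

namespace sketch {
namespace detail {
void tag_invoke();  // poison pill: unqualified lookup stops here, ADL finds the rest

struct tag_invoke_fn {
    template <class Tag, class... Args>
    constexpr auto operator()(Tag tag, Args&&... args) const
        -> decltype(tag_invoke(tag, std::forward<Args>(args)...)) {
        return tag_invoke(tag, std::forward<Args>(args)...);
    }
};
}  // namespace detail

inline constexpr detail::tag_invoke_fn tag_invoke{};

// Two hypothetical customization points defined in terms of tag_invoke.
struct get_id_t {
    template <class T>
    constexpr auto operator()(const T& t) const -> decltype(tag_invoke(*this, t)) {
        return tag_invoke(*this, t);
    }
};
inline constexpr get_id_t get_id{};

struct get_priority_t {
    template <class T>
    constexpr auto operator()(const T& t) const -> decltype(tag_invoke(*this, t)) {
        return tag_invoke(*this, t);
    }
};
inline constexpr get_priority_t get_priority{};
}  // namespace sketch

namespace user {
struct inner {
    friend int tag_invoke(sketch::get_id_t, const inner&) { return 7; }
    friend int tag_invoke(sketch::get_priority_t, const inner&) { return 1; }
};

struct wrapper {
    inner subobject;
    int priority;

    // Forward every customization point to the subobject, including ones
    // this wrapper has never heard of.
    template <class Tag, class... Args>
    friend auto tag_invoke(Tag tag, const wrapper& self, Args&&... args) {
        return tag(self.subobject, std::forward<Args>(args)...);
    }

    // Override one of them; the non-template overload wins overload resolution.
    friend int tag_invoke(sketch::get_priority_t, const wrapper& self) {
        return self.priority;
    }
};
}  // namespace user

int forwarded_id() { return sketch::get_id(user::wrapper{{}, 5}); }
int overridden_priority() { return sketch::get_priority(user::wrapper{{}, 5}); }
```

With ranges-style customization points the wrapper would need one forwarding function per name; here a single function template covers all of them because the customization point travels as an argument.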

6. Specification

Much of this wording follows the wording of A Unified Executors Proposal for C++.

§ 8 Library introduction [library] is meant to be a diff relative to the wording of the [library] clause of Working Draft, Standard for Programming Language C++.

§ 9 General utilities library [utilities] is meant to be a diff relative to the wording of the [utilities] clause of Working Draft, Standard for Programming Language C++. This diff applies changes from tag_invoke: A general pattern for supporting customisable functions.

§ 10 Thread support library [thread] is meant to be a diff relative to the wording of the [thread] clause of Working Draft, Standard for Programming Language C++. This diff applies changes from Composable cancellation for sender-based async operations.

§ 11 Execution control library [exec] is meant to be added as a new library clause to the working draft of C++.

7. Exception handling [except]

7.1. Special functions [except.special]

7.1.1. General [except.special.general]

7.1.1.1. The std::terminate function [except.terminate]
At the end of the bulleted list in the Note in paragraph 1, add a new bullet as follows:
  • when a callback invocation exits via an exception when requesting stop on a std::stop_source or a std::in_place_stop_source ([stopsource.mem], [stopsource.inplace.mem]), or in the constructor of std::stop_callback or std::in_place_stop_callback ([stopcallback.cons], [stopcallback.inplace.cons]).

8. Library introduction [library]

Add the header <execution> to Table 23: C++ library headers [tab:headers.cpp]

In subclause [conforming], after [lib.types.movedfrom], add the following new subclause with suggested stable name [lib.tmpl-heads].

16.4.6.17 Class template-heads
  1. If a class template’s template-head is marked with "arguments are not associated entities", any template arguments do not contribute to the associated entities ([basic.lookup.argdep]) of a function call where a specialization of the class template is an associated entity. In such a case, the class template can be implemented as an alias template referring to a templated class, or as a class template where the template arguments themselves are templated classes.

  2. [Example:

    template<class T> // arguments are not associated entities
    struct S {};
    
    namespace N {
      int f(auto);
      struct A {};
    }
    
    int x = f(S<N::A>{});  // error: N::f not a candidate
    

    The template S specified above can be implemented as

    template<class T>
    struct s-impl {
      struct type { };
    };
    
    template<class T>
    using S = typename s-impl<T>::type;
    

    or as

    template<class T>
    struct hidden {
      using type = struct _ {
        using type = T;
      };
    };
    
    template<class HiddenT>
    struct s-impl {
      using T = typename HiddenT::type;
    };
    
    template<class T>
    using S = s-impl<typename hidden<T>::type>;
    

    -- end example]

9. General utilities library [utilities]

9.1. Function objects [function.objects]

9.1.1. Header <functional> synopsis [functional.syn]

At the end of this subclause, insert the following declarations into the synopsis within namespace std:

// exposition only:
template<class Fn, class... Args>
  concept callable =
    requires (Fn&& fn, Args&&... args) {
      std::forward<Fn>(fn)(std::forward<Args>(args)...);
    };
template<class Fn, class... Args>
  concept nothrow-callable =
    callable<Fn, Args...> &&
    requires (Fn&& fn, Args&&... args) {
      { std::forward<Fn>(fn)(std::forward<Args>(args)...) } noexcept;
    };
template<class Fn, class... Args>
  using call-result-t = decltype(declval<Fn>()(declval<Args>()...));

// [func.tag_invoke], tag_invoke
namespace tag-invoke { // exposition only
  void tag_invoke();

  template<class Tag, class... Args>
    concept tag_invocable =
      requires (Tag&& tag, Args&&... args) {
        tag_invoke(std::forward<Tag>(tag), std::forward<Args>(args)...);
      };

  template<class Tag, class... Args>
    concept nothrow_tag_invocable =
      tag_invocable<Tag, Args...> &&
      requires (Tag&& tag, Args&&... args) {
        { tag_invoke(std::forward<Tag>(tag), std::forward<Args>(args)...) } noexcept;
      };

  template<class Tag, class... Args>
    using tag_invoke_result_t =
      decltype(tag_invoke(declval<Tag>(), declval<Args>()...));

  template<class Tag, class... Args>
    struct tag_invoke_result {
      using type =
        tag_invoke_result_t<Tag, Args...>; // present if and only if tag_invocable<Tag, Args...> is true
    };

  struct tag; // exposition only
}
inline constexpr tag-invoke::tag tag_invoke {};
using tag-invoke::tag_invocable;
using tag-invoke::nothrow_tag_invocable;
using tag-invoke::tag_invoke_result_t;
using tag-invoke::tag_invoke_result;

template<auto& Tag>
  using tag_t = decay_t<decltype(Tag)>;

9.1.2. tag_invoke [func.tag_invoke]

Insert this section as a new subclause, between Searchers [func.search] and Class template hash [unord.hash].

  1. Given a subexpression E, let REIFY(E) be expression-equivalent to a glvalue with the same type and value as E as if by identity()(E).

  2. The name std::tag_invoke denotes a customization point object [customization.point.object]. Given subexpressions T and A..., the expression std::tag_invoke(T, A...) is expression-equivalent [defns.expression-equivalent] to tag_invoke(REIFY(T), REIFY(A)...) with overload resolution performed in a context in which unqualified lookup for tag_invoke finds only the declaration

    void tag_invoke();
    
  3. [Note: Diagnosable ill-formed cases above result in substitution failure when std::tag_invoke(T, A...) appears in the immediate context of a template instantiation. —end note]

10. Thread support library [thread]

10.1. Stop tokens [thread.stoptoken]

10.1.1. Header <stop_token> synopsis [thread.stoptoken.syn]

At the beginning of this subclause, insert the following declarations into the synopsis within namespace std:

template<template<class> class>
  struct check-type-alias-exists; // exposition-only

template<class T>
  concept stoppable_token = see-below;

template<class T, class CB, class Initializer = CB>
  concept stoppable_token_for = see-below;

template<class T>
  concept unstoppable_token = see-below;

At the end of this subclause, insert the following declarations into the synopsis within namespace std:

// [stoptoken.never], class never_stop_token
class never_stop_token;

// [stoptoken.inplace], class in_place_stop_token
class in_place_stop_token;

// [stopsource.inplace], class in_place_stop_source
class in_place_stop_source;

// [stopcallback.inplace], class template in_place_stop_callback
template<class CB>
  class in_place_stop_callback;

template<class T, class CB>
  using stop_callback_for_t = typename T::template callback_type<CB>;

10.1.2. Stop token concepts [thread.stoptoken.concepts]

Insert this section as a new subclause between Header <stop_token> synopsis [thread.stoptoken.syn] and Class stop_token [stoptoken].

  1. The stoppable_token concept checks for the basic interface of a stop token that is copyable and allows polling to see if stop has been requested and also whether a stop request is possible. For a stop token type T and a type CB that is callable with no arguments, the type T::callback_type<CB> is valid and denotes the stop callback type to use to register a callback to be executed if a stop request is ever made on a stoppable_token of type T. The stoppable_token_for concept checks for a stop token type compatible with a given callback type. The unstoppable_token concept checks for a stop token type that does not allow stopping.

template<class T>
  concept stoppable_token =
    copyable<T> &&
    equality_comparable<T> &&
    requires (const T t) {
      { T(t) } noexcept; // see implicit expression variations ([concepts.equality])
      { t.stop_requested() } noexcept -> same_as<bool>;
      { t.stop_possible() } noexcept -> same_as<bool>;
      typename check-type-alias-exists<T::template callback_type>;
    };

template<class T, class CB, class Initializer = CB>
  concept stoppable_token_for =
    stoppable_token<T> &&
    invocable<CB> &&
    constructible_from<CB, Initializer> &&
    requires { typename stop_callback_for_t<T, CB>; } &&
    constructible_from<stop_callback_for_t<T, CB>, const T&, Initializer>;

template<class T>
  concept unstoppable_token =
    stoppable_token<T> &&
    requires {
      { bool_constant<T::stop_possible()>{} } -> same_as<false_type>;
    };
LWG directed me to replace T::stop_possible() with t.stop_possible() because of the recent constexpr changes in P2280R2. However, even with those changes, a nested requirement like requires (!t.stop_possible()), where t is an argument in the requirement-parameter-list, is ill-formed according to [expr.prim.req.nested/p2]:

A local parameter shall only appear as an unevaluated operand within the constraint-expression.

This is the subject of core issue 2517.

  1. Let t and u be distinct, valid objects of type T. The type T models stoppable_token only if:

    1. If t.stop_possible() evaluates to false then, if t and u reference the same logical shared stop state, u.stop_possible() shall also subsequently evaluate to false and u.stop_requested() shall also subsequently evaluate to false.

    2. If t.stop_requested() evaluates to true then, if t and u reference the same logical shared stop state, u.stop_requested() shall also subsequently evaluate to true and u.stop_possible() shall also subsequently evaluate to true.

  2. Let t and u be distinct, valid objects of type T and let init be an object of type Initializer. Then for some type CB, the type T models stoppable_token_for<CB, Initializer> only if:

    1. The type T::callback_type<CB> models:

      constructible_from<T, Initializer> &&
      constructible_from<T&, Initializer> &&
      constructible_from<const T, Initializer>
      
    2. Direct non-list initializing an object cb of type T::callback_type<CB> from t, init shall, if t.stop_possible() is true, construct an instance, callback, of type CB, direct-initialized with init, and register callback with t's shared stop state such that callback will be invoked with an empty argument list if a stop request is made on the shared stop state.

      1. If t.stop_requested() evaluates to true at the time callback is registered then callback can be invoked on the thread executing cb's constructor.

      2. If callback is invoked then, if t and u reference the same shared stop state, an evaluation of u.stop_requested() will be true if the beginning of the invocation of callback strongly-happens-before the evaluation of u.stop_requested().

      3. [Note: If t.stop_possible() evaluates to false then the construction of cb is not required to construct and initialize callback. --end note]

    3. Construction of a T::callback_type<CB> instance shall only throw exceptions thrown by the initialization of the CB instance from the value of type Initializer.

    4. Destruction of the T::callback_type<CB> object, cb, removes callback from the shared stop state such that callback will not be invoked after the destructor returns.

      1. If callback is currently being invoked on another thread then the destructor of cb will block until the invocation of callback returns such that the return from the invocation of callback strongly-happens-before the destruction of callback.

      2. Destruction of a callback cb shall not block on the completion of the invocation of some other callback registered with the same shared stop state.

10.1.3. Class stop_token [stoptoken]

10.1.3.1. General [stoptoken.general]

Modify the synopsis of class stop_token in section General [stoptoken.general] as follows:

namespace std {
  class stop_token {
  public:
    template<class T>
      using callback_type = stop_callback<T>;

    // [stoptoken.cons], constructors, copy, and assignment
    stop_token() noexcept;

    // ...

10.1.4. Class never_stop_token [stoptoken.never]

Insert a new subclause, Class never_stop_token [stoptoken.never], after section Class template stop_callback [stopcallback], as a new subclause of Stop tokens [thread.stoptoken].

10.1.4.1. General [stoptoken.never.general]
  1. The class never_stop_token provides an implementation of the unstoppable_token concept. It provides a stop token interface, but also provides static information that a stop is never possible nor requested.

namespace std
{
  class never_stop_token {
    // exposition only
    struct callback {
      explicit callback(never_stop_token, auto&&) noexcept {}
    };
  public:
    template<class>
      using callback_type = callback;

    [[nodiscard]] static constexpr bool stop_requested() noexcept { return false; }
    [[nodiscard]] static constexpr bool stop_possible() noexcept { return false; }

    [[nodiscard]] friend bool operator==(const never_stop_token&, const never_stop_token&) noexcept = default;
  };
}

10.1.5. Class in_place_stop_token [stoptoken.inplace]

Insert a new subclause, Class in_place_stop_token [stoptoken.inplace], after the section added above, as a new subclause of Stop tokens [thread.stoptoken].

10.1.5.1. General [stoptoken.inplace.general]
  1. The class in_place_stop_token provides an interface for querying whether a stop request has been made (stop_requested) or can ever be made (stop_possible) using an associated in_place_stop_source object ([stopsource.inplace]). An in_place_stop_token can also be passed to an in_place_stop_callback ([stopcallback.inplace]) constructor to register a callback to be called when a stop request has been made from an associated in_place_stop_source.

namespace std {
  class in_place_stop_token {
  public:
    template<class CB>
      using callback_type = in_place_stop_callback<CB>;

    // [stoptoken.inplace.cons], constructors, copy, and assignment
    in_place_stop_token() noexcept;
    ~in_place_stop_token();
    void swap(in_place_stop_token&) noexcept;

    // [stoptoken.inplace.mem], stop handling
    [[nodiscard]] bool stop_requested() const noexcept;
    [[nodiscard]] bool stop_possible() const noexcept;

    [[nodiscard]] friend bool operator==(const in_place_stop_token&, const in_place_stop_token&) noexcept = default;
    friend void swap(in_place_stop_token& lhs, in_place_stop_token& rhs) noexcept;

  private:
    const in_place_stop_source* source_; // exposition only
  };
}
10.1.5.2. Constructors, copy, and assignment [stoptoken.inplace.cons]
in_place_stop_token() noexcept;
  1. Effects: Initializes source_ with nullptr.

void swap(in_place_stop_token& rhs) noexcept;
  1. Effects: Exchanges the values of source_ and rhs.source_.

10.1.5.3. Members [stoptoken.inplace.mem]
[[nodiscard]] bool stop_requested() const noexcept;
  1. Effects: Equivalent to: return source_ != nullptr && source_->stop_requested();

  2. [Note: The behavior of stop_requested() is undefined unless the call strongly happens before the start of the destructor of the associated in_place_stop_source, if any ([basic.life]). --end note]

[[nodiscard]] bool stop_possible() const noexcept;
  1. Effects: Equivalent to: return source_ != nullptr;

  2. [Note: The behavior of stop_possible() is implementation-defined unless the call strongly happens before the end of the storage duration of the associated in_place_stop_source object, if any ([basic.stc.general]). --end note]

10.1.5.4. Non-member functions [stoptoken.inplace.nonmembers]
friend void swap(in_place_stop_token& x, in_place_stop_token& y) noexcept;
  1. Effects: Equivalent to: x.swap(y).

10.1.6. Class in_place_stop_source [stopsource.inplace]

Insert a new subclause, Class in_place_stop_source [stopsource.inplace], after the section added above, as a new subclause of Stop tokens [thread.stoptoken].

10.1.6.1. General [stopsource.inplace.general]
  1. The class in_place_stop_source implements the semantics of making a stop request, without the need for a dynamic allocation of a shared state. A stop request made on an in_place_stop_source object is visible to all associated in_place_stop_token ([stoptoken.inplace]) objects. Once a stop request has been made it cannot be withdrawn (a subsequent stop request has no effect). All uses of in_place_stop_token objects associated with a given in_place_stop_source object must happen before the start of the destructor of that in_place_stop_source object.

namespace std {
  class in_place_stop_source {
  public:
    // [stopsource.inplace.cons], constructors, copy, and assignment
    in_place_stop_source() noexcept;

    in_place_stop_source(in_place_stop_source&&) noexcept = delete;
    ~in_place_stop_source();

    // [stopsource.inplace.mem], stop handling
    [[nodiscard]] in_place_stop_token get_token() const noexcept;
    [[nodiscard]] static constexpr bool stop_possible() noexcept { return true; }
    [[nodiscard]] bool stop_requested() const noexcept;
    bool request_stop() noexcept;
  };
}
  1. An instance of in_place_stop_source maintains a list of registered callback invocations. The registration of a callback invocation either succeeds or fails. When an invocation of a callback is registered, the following happens atomically:

    • The stop state is checked. If stop has not been requested, the callback invocation is added to the list of registered callback invocations, and registration has succeeded.

    • Otherwise, registration has failed.

    When an invocation of a callback is unregistered, the invocation is atomically removed from the list of registered callback invocations. The removal is not blocked by the concurrent execution of another callback invocation in the list. If the callback invocation being unregistered is currently executing, then:

    • If the execution of the callback invocation is happening concurrently on another thread, the completion of the execution strongly happens before ([intro.races]) the end of the callback’s lifetime.

    • Otherwise, the execution is happening on the current thread. Removal of the callback invocation does not block waiting for the execution to complete.

10.1.6.2. Constructors, copy, and assignment [stopsource.inplace.cons]
in_place_stop_source() noexcept;
  1. Effects: Initializes a new stop state inside *this.

  2. Postconditions: stop_requested() is false.

10.1.6.3. Members [stopsource.inplace.mem]
[[nodiscard]] in_place_stop_token get_token() const noexcept;
  1. Returns: A new associated in_place_stop_token object.

[[nodiscard]] bool stop_requested() const noexcept;
  1. Returns: true if the stop state inside *this has received a stop request; otherwise, false.

bool request_stop() noexcept;
  1. Effects: Atomically determines whether the stop state inside *this has received a stop request, and if not, makes a stop request. The determination and making of the stop request are an atomic read-modify-write operation ([intro.races]). If the request was made, the registered invocations are executed and the evaluations of the invocations are indeterminately sequenced. If an invocation of a callback exits via an exception then terminate is invoked ([except.terminate]).

  2. Postconditions: stop_requested() is true.

  3. Returns: true if this call made a stop request; otherwise false.

10.1.7. Class template in_place_stop_callback [stopcallback.inplace]

Insert a new subclause, Class template in_place_stop_callback [stopcallback.inplace], after the section added above, as a new subclause of Stop tokens [thread.stoptoken].

10.1.7.1. General [stopcallback.inplace.general]
  1. namespace std {
      template<class Callback>
      class in_place_stop_callback {
      public:
        using callback_type = Callback;
    
        // [stopcallback.inplace.cons], constructors and destructor
        template<class C>
          explicit in_place_stop_callback(in_place_stop_token st, C&& cb)
            noexcept(is_nothrow_constructible_v<Callback, C>);
        ~in_place_stop_callback();
    
        in_place_stop_callback(in_place_stop_callback&&) = delete;
    
      private:
        Callback callback_;      // exposition only
      };
    
      template<class Callback>
        in_place_stop_callback(in_place_stop_token, Callback)
          -> in_place_stop_callback<Callback>;
    }
    
  2. Mandates: in_place_stop_callback is instantiated with an argument for the template parameter Callback that satisfies both invocable and destructible.

  3. Preconditions: in_place_stop_callback is instantiated with an argument for the template parameter Callback that models both invocable and destructible.

  4. Recommended practice: Implementations should use the storage of the in_place_stop_callback objects to store the state necessary for their association with an in_place_stop_source object.

10.1.7.2. Constructors and destructor [stopcallback.inplace.cons]
template<class C>
  explicit in_place_stop_callback(in_place_stop_token st, C&& cb)
    noexcept(is_nothrow_constructible_v<Callback, C>);
  1. Constraints: Callback and C satisfy constructible_from<Callback, C>.

  2. Preconditions: Callback and C model constructible_from<Callback, C>.

  3. Effects: Initializes callback_ with std::forward<C>(cb). Any in_place_stop_source associated with st becomes associated with *this. Registers ([stopsource.inplace.general]) the callback invocation std::forward<Callback>(callback_)() with the associated in_place_stop_source, if any. If the registration fails, evaluates the callback invocation.

  4. Throws: Any exception thrown by the initialization of callback_.

  5. Remarks: If evaluating std::forward<Callback>(callback_)() exits via an exception, then terminate is invoked ([except.terminate]).

~in_place_stop_callback();
  1. Effects: Unregisters ([stopsource.inplace.general]) the callback invocation from the associated in_place_stop_source object, if any.

  2. Remarks: A program has undefined behavior if the start of this destructor does not strongly happen before the start of the destructor of the associated in_place_stop_source object, if any.
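
The registration rules and the recommended practice of keeping association state inside the callback object can be sketched with an intrusive list. This is a single-threaded illustration under loud assumptions: it performs no synchronization, it registers against a source reference instead of a token, and callback_node, try_register, and unregister are hypothetical names that do not appear in the proposal.

```cpp
#include <cassert>
#include <utility>

namespace sketch {

template <class Callback> class stop_callback;

// List node embedded in every callback object, so registering allocates nothing.
struct callback_node {
  void (*invoke)(callback_node*) noexcept = nullptr;
  callback_node* next = nullptr;
};

class stop_source {  // hypothetical, NOT thread-safe
 public:
  stop_source() = default;
  stop_source(stop_source&&) = delete;
  bool stop_requested() const noexcept { return requested_; }
  bool request_stop() noexcept {
    if (requested_) return false;
    requested_ = true;
    while (head_ != nullptr) {  // pop each node, then run its invocation
      callback_node* node = head_;
      head_ = node->next;
      node->invoke(node);
    }
    return true;
  }

 private:
  template <class> friend class stop_callback;
  bool try_register(callback_node* node) noexcept {
    if (requested_) return false;  // registration fails after a stop request
    node->next = head_;
    head_ = node;
    return true;
  }
  void unregister(callback_node* node) noexcept {
    for (callback_node** p = &head_; *p != nullptr; p = &(*p)->next)
      if (*p == node) { *p = node->next; return; }
  }
  bool requested_ = false;
  callback_node* head_ = nullptr;
};

template <class Callback>
class stop_callback : callback_node {
 public:
  template <class C>
  explicit stop_callback(stop_source& src, C&& cb)
      : callback_(std::forward<C>(cb)), source_(&src) {
    invoke = [](callback_node* self) noexcept {
      std::forward<Callback>(static_cast<stop_callback*>(self)->callback_)();
    };
    if (!source_->try_register(this)) invoke(this);  // failed: run immediately
  }
  ~stop_callback() { source_->unregister(this); }
  stop_callback(stop_callback&&) = delete;

 private:
  Callback callback_;
  stop_source* source_;
};

template <class Callback>
stop_callback(stop_source&, Callback) -> stop_callback<Callback>;

inline int demo() {
  int calls = 0;
  stop_source src;
  {
    stop_callback early(src, [&calls] { calls += 1; });  // registered
    { stop_callback gone(src, [&calls] { calls += 100; }); }  // unregistered, never runs
    src.request_stop();  // runs `early`
  }
  stop_callback late(src, [&calls] { calls += 10; });  // runs in the constructor
  return calls;
}

}  // namespace sketch
```

Calling sketch::demo() yields 11: one increment from the callback registered before the stop request and ten from the callback constructed afterwards, while the callback destroyed before the request contributes nothing.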

11. Execution control library [exec]

11.1. General [exec.general]

  1. This Clause describes components supporting execution of function objects ([function.objects]).

  2. The following subclauses describe the requirements, concepts, and components for execution control primitives as summarized in Table 1.

Table N: Execution control library summary [tab:execution.summary]
Subclause                            Header
[exec.sched]    Schedulers           <execution>
[exec.recv]     Receivers
[exec.opstate]  Operation states
[exec.snd]      Senders
[exec.execute]  One-way execution
  1. [Note: A large number of execution control primitives are customization point objects. For an object one might define multiple types of customization point objects, for which different rules apply. Table 2 shows the types of customization point objects used in the execution control library:

Table N+1: Types of customization point objects in the execution control library [tab:execution.cpos]
Customization point object type Purpose Examples
core provide core execution functionality, and connection between core components connect, start, execute
completion functions called by senders to announce the completion of the work (success, error, or cancellation) set_value, set_error, set_stopped
senders allow the specialization of the provided sender algorithms
  • sender factories (schedule, transfer_just, read, ...)
  • sender adaptors (transfer, then, let_value, ...)
  • sender consumers (start_detached, sync_wait)
queries allow querying different properties of objects
  • general queries (get_allocator, get_stop_token, ...)
  • environment queries (get_scheduler, get_delegatee_scheduler, ...)
  • scheduler queries (get_forward_progress_guarantee, execute_may_block_caller, ...)
  • sender attribute queries (get_completion_scheduler)

-- end note]

  1. This clause makes use of the following exposition-only entities:

    1. template<class Fn, class... Args>
          requires callable<Fn, Args...>
        constexpr auto mandate-nothrow-call(Fn&& fn, Args&&... args) noexcept
          -> call-result-t<Fn, Args...> {
          return std::forward<Fn>(fn)(std::forward<Args>(args)...);
        }
      
      • Mandates: nothrow-callable<Fn, Args...> is true.

    2. template<class T>
        concept movable-value =
          move_constructible<decay_t<T>> &&
          constructible_from<decay_t<T>, T>;
      
    3. For function types F1 and F2 denoting R1(Args1...) and R2(Args2...) respectively, MATCHING-SIG(F1, F2) is true if and only if same_as<R1(Args1&&...), R2(Args2&&...)> is true.

11.2. Queries and queryables [exec.queryable]

11.2.1. General [exec.queryable.general]

  1. A queryable object is a read-only collection of key/value pairs where each key is a customization point object known as a query object. A query is an invocation of a query object with a queryable object as its first argument and a (possibly empty) set of additional arguments. The result of a query expression is valid as long as the queryable object is valid. A query imposes syntactic and semantic requirements on its invocations.

  2. Given a subexpression e that refers to a queryable object q, a query object F, and a (possibly empty) pack of subexpressions args, the expression F(e, args...) is equal to ([concepts.equality]) the expression F(c, args...) where c is a const lvalue reference to q.

  3. The type of a query expression cannot be void.

  4. The expression F(e, args...) is equality-preserving ([concepts.equality]) and does not modify the function object or the arguments.

  5. Unless otherwise specified, the value returned by the expression F(e, args...) is valid as long as e is valid.

11.2.2. queryable concept [exec.queryable.concept]

template<class T>
  concept queryable = destructible<T>;
  1. The queryable concept specifies the constraints on the types of queryable objects.

  2. Let e be an object of type E. The type E models queryable if for each callable object F and a pack of subexpressions args, if requires { F(e, args...) } is true then F(e, args...) meets any semantic requirements imposed by F.

11.3. Asynchronous operations [async.ops]

  1. An execution resource is a program entity that manages a (possibly dynamic) set of execution agents ([thread.req.lockable.general]), which it uses to execute parallel work on behalf of callers. [Example 1: The currently active thread, a system-provided thread pool, and uses of an API associated with an external hardware accelerator are all examples of execution resources. -- end example] Execution resources execute asynchronous operations. An execution resource is either valid or invalid.

  2. An asynchronous operation is a distinct unit of program execution that:

    • is explicitly created;

    • can be explicitly started; an asynchronous operation can be started once at most;

    • if started, eventually completes with a (possibly empty) set of result datums, and in exactly one of three modes: success, failure, or cancellation, known as the operation’s disposition; an asynchronous operation can only complete once; a successful completion, also known as a value completion, can have an arbitrary number of result datums; a failure completion, also known as an error completion, has a single result datum; a cancellation completion, also known as a stopped completion, has no result datum; an asynchronous operation’s async result is its disposition and its (possibly empty) set of result datums.

    • can complete on a different execution resource than that on which it started; and

    • can create and start other asynchronous operations called child operations. A child operation is an asynchronous operation that is created by the parent operation and, if started, completes before the parent operation completes. A parent operation is the asynchronous operation that created a particular child operation.

    An asynchronous operation can in fact execute synchronously; that is, it can complete during the execution of its start operation on the thread of execution that started it.

  3. An asynchronous operation has associated state known as its operation state.

  4. An asynchronous operation has an associated environment. An environment is a queryable object ([exec.queryable]) representing the execution-time properties of the operation’s caller. The caller of an asynchronous operation is its parent operation or the function that created it. An asynchronous operation’s operation state owns the operation’s environment.

  5. An asynchronous operation has an associated receiver. A receiver is an aggregation of three handlers for the three asynchronous completion dispositions: a value completion handler for a value completion, an error completion handler for an error completion, and a stopped completion handler for a stopped completion. A receiver has an associated environment. An asynchronous operation’s operation state owns the operation’s receiver. The environment of an asynchronous operation is equal to its receiver’s environment.

  6. For each completion disposition, there is a completion function. A completion function is a customization point object ([customization.point.object]) that accepts an asynchronous operation’s receiver as the first argument and the result datums of the asynchronous operation as additional arguments. The value completion function invokes the receiver’s value completion handler with the value result datums; likewise for the error completion function and the stopped completion function. A completion function has an associated type known as its completion tag that names the unqualified type of the completion function. A valid invocation of a completion function is called a completion operation.

  7. The lifetime of an asynchronous operation, also known as the operation’s async lifetime, begins when its start operation begins executing and ends when its completion operation begins executing. If the lifetime of an asynchronous operation’s associated operation state ends before the lifetime of the asynchronous operation, the behavior is undefined. After an asynchronous operation executes a completion operation, its associated operation state is invalid. Accessing any part of an invalid operation state is undefined behavior.

  8. An asynchronous operation shall not execute a completion operation before its start operation has begun executing. After its start operation has begun executing, exactly one completion operation shall execute. The lifetime of an asynchronous operation’s operation state can end during the execution of the completion operation.

  9. A sender is a factory for one or more asynchronous operations. Connecting a sender and a receiver creates an asynchronous operation. The asynchronous operation’s associated receiver is equal to the receiver used to create it, and its associated environment is equal to the environment associated with the receiver used to create it. The lifetime of an asynchronous operation’s associated operation state does not depend on the lifetimes of either the sender or the receiver from which it was created. A sender sends its results by way of the asynchronous operation(s) it produces, and a receiver receives those results. A sender is either valid or invalid; it becomes invalid when its parent sender (see below) becomes invalid.

  10. A scheduler is an abstraction of an execution resource with a uniform, generic interface for scheduling work onto that resource. It is a factory for senders whose asynchronous operations execute value completion operations on an execution agent belonging to the scheduler’s associated execution resource. A schedule-expression obtains such a sender from a scheduler. A schedule sender is the result of a schedule expression. On success, an asynchronous operation produced by a schedule sender executes a value completion operation with an empty set of result datums. Multiple schedulers can refer to the same execution resource. A scheduler can be valid or invalid. A scheduler becomes invalid when the execution resource to which it refers becomes invalid, as do any schedule senders obtained from the scheduler, and any operation states obtained from those senders.

  11. An asynchronous operation has one or more associated completion schedulers for each of its possible dispositions. A completion scheduler is a scheduler whose associated execution resource is used to execute a completion operation for an asynchronous operation. A value completion scheduler is a scheduler on which an asynchronous operation’s value completion operation can execute. Likewise for error completion schedulers and stopped completion schedulers.

  12. A sender has an associated queryable object ([exec.queryable]) known as its attributes that describes various characteristics of the sender and of the asynchronous operation(s) it produces. For each disposition, there is a query object for reading the associated completion scheduler from a sender’s attributes; i.e., a value completion scheduler query object for reading a sender’s value completion scheduler, etc. If a completion scheduler query is well-formed, the returned completion scheduler is unique for that disposition for any asynchronous operation the sender creates. A schedule sender is required to have a value completion scheduler attribute whose value is equal to the scheduler that produced the schedule sender.

  13. A completion signature is a function type that describes a completion operation. An asynchronous operation has a finite set of possible completion signatures. The completion signature’s return type is the completion tag associated with the completion function that executes the completion operation. The completion signature’s argument types are the types and value categories of the asynchronous operation’s result datums. Together, a sender type and an environment type E determine the set of completion signatures of an operation state that results from connecting the sender with a receiver whose environment has type E. The type of the receiver does not affect an asynchronous operation’s completion signatures, only the type of the receiver’s environment.

  14. A sender algorithm is a function that takes and/or returns a sender. There are three categories of sender algorithms:

    • A sender factory is a function that takes non-senders as arguments and that returns a sender.

    • A sender adaptor is a function that constructs and returns a parent sender from a set of one or more child senders and a (possibly empty) set of additional arguments. An asynchronous operation created by a parent sender is a parent to the child operations created by the child senders.

    • A sender consumer is a function that takes one or more senders and a (possibly empty) set of additional arguments, and whose return type is not the type of a sender.
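
The vocabulary of this subclause (sender, receiver, operation state, and the three categories of sender algorithms) can be made concrete with a deliberately small model. The following is a sketch under stated assumptions and not the proposed API: it uses member functions instead of customization point objects, supports only int results, has no environments or schedulers, and every name in namespace sketch is hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <utility>

namespace sketch {

// Sender factory: takes a non-sender (an int) and returns a sender.
struct just_sender {
  int value;
  template <class Receiver>
  struct operation {    // the operation state; it owns the receiver
    int value;
    Receiver receiver;
    // Completes synchronously with exactly one completion operation.
    void start() noexcept { std::move(receiver).set_value(value); }
  };
  template <class Receiver>
  operation<Receiver> connect(Receiver r) const { return {value, std::move(r)}; }
};
inline just_sender just(int v) { return {v}; }

// Sender adaptor: builds a parent sender from a child sender and a function.
template <class Child, class Fn>
struct then_sender {
  Child child;
  Fn fn;
  template <class Receiver>
  struct then_receiver {  // handed to the child, forwards to the parent's receiver
    Receiver downstream;
    Fn fn;
    void set_value(int v) && noexcept {
      try {
        std::move(downstream).set_value(fn(v));
      } catch (...) {  // fn threw before downstream was completed
        std::move(downstream).set_error(std::current_exception());
      }
    }
    void set_error(std::exception_ptr e) && noexcept {
      std::move(downstream).set_error(std::move(e));
    }
    void set_stopped() && noexcept { std::move(downstream).set_stopped(); }
  };
  template <class Receiver>
  auto connect(Receiver r) const {
    return child.connect(then_receiver<Receiver>{std::move(r), fn});
  }
};
template <class Child, class Fn>
then_sender<Child, Fn> then(Child c, Fn f) { return {std::move(c), std::move(f)}; }

// Receiver: one handler per completion disposition.
struct result_receiver {
  std::optional<int>* out;
  void set_value(int v) && noexcept { *out = v; }
  void set_error(std::exception_ptr) && noexcept { std::terminate(); }
  void set_stopped() && noexcept {}  // leaves *out empty
};

// Sender consumer: takes a sender and returns a non-sender.
template <class Sender>
std::optional<int> run(Sender s) {
  std::optional<int> result;
  auto op = s.connect(result_receiver{&result});  // explicitly created
  op.start();                                     // explicitly started, once
  return result;
}

inline int demo() {
  auto work = then(then(just(20), [](int x) { return x + 1; }),
                   [](int x) { return x * 2; });
  std::optional<int> r = run(work);
  assert(r.has_value());
  return *r;
}

}  // namespace sketch
```

In this model sketch::demo() produces 42, and no work happens until run connects the sender to a receiver and starts the resulting operation state, which mirrors the explicit create and start steps described above.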

11.4. Header <execution> synopsis [exec.syn]

namespace std {
  // [exec.general], helper concepts
  template<class T>
    concept movable-value = see-below; // exposition only

  template<class From, class To>
    concept decays-to = same_as<decay_t<From>, To>; // exposition only

  template<class T>
    concept class-type = decays-to<T, T> && is_class_v<T>;  // exposition only

  // [exec.queryable], queryable objects
  template<class T>
    concept queryable = destructible<T>;

  // [exec.queries], queries
  namespace queries { // exposition only
    struct forwarding_query_t;
    struct get_allocator_t;
    struct get_stop_token_t;
  }
  using queries::forwarding_query_t;
  using queries::get_allocator_t;
  using queries::get_stop_token_t;
  inline constexpr forwarding_query_t forwarding_query{};
  inline constexpr get_allocator_t get_allocator{};
  inline constexpr get_stop_token_t get_stop_token{};

  template<class T>
    using stop_token_of_t =
      remove_cvref_t<decltype(get_stop_token(declval<T>()))>;

  template<class T>
    concept forwarding-query = // exposition only
      forwarding_query(T{});

  namespace exec-envs { // exposition only
    struct empty_env {};
    struct get_env_t;
  }
  using exec-envs::empty_env;
  using exec-envs::get_env_t;
  inline constexpr get_env_t get_env {};

  template<class T>
    using env_of_t = decltype(get_env(declval<T>()));
}

namespace std::execution {
  // [exec.queries], queries
  enum class forward_progress_guarantee;
  namespace queries { // exposition only
    struct get_scheduler_t;
    struct get_delegatee_scheduler_t;
    struct get_forward_progress_guarantee_t;
    template<class CPO>
      struct get_completion_scheduler_t;
  }
  using queries::get_scheduler_t;
  using queries::get_delegatee_scheduler_t;
  using queries::get_forward_progress_guarantee_t;
  using queries::get_completion_scheduler_t;
  inline constexpr get_scheduler_t get_scheduler{};
  inline constexpr get_delegatee_scheduler_t get_delegatee_scheduler{};
  inline constexpr get_forward_progress_guarantee_t get_forward_progress_guarantee{};
  template<class CPO>
    inline constexpr get_completion_scheduler_t<CPO> get_completion_scheduler{};

  // [exec.sched], schedulers
  template<class S>
    concept scheduler = see-below;

  // [exec.recv], receivers
  template<class R>
    inline constexpr bool enable_receiver = see-below;

  template<class R>
    concept receiver = see-below;

  template<class R, class Completions>
    concept receiver_of = see-below;

  namespace receivers { // exposition only
    struct set_value_t;
    struct set_error_t;
    struct set_stopped_t;
  }
  using receivers::set_value_t;
  using receivers::set_error_t;
  using receivers::set_stopped_t;
  inline constexpr set_value_t set_value{};
  inline constexpr set_error_t set_error{};
  inline constexpr set_stopped_t set_stopped{};

  // [exec.opstate], operation states
  template<class O>
    concept operation_state = see-below;

  namespace op-state { // exposition only
    struct start_t;
  }
  using op-state::start_t;
  inline constexpr start_t start{};

  // [exec.snd], senders
  template<class S>
    inline constexpr bool enable_sender = see below;

  template<class S>
    concept sender = see-below;

  template<class S, class E = empty_env>
    concept sender_in = see-below;

  template<class S, class R>
    concept sender_to = see-below;

  template <class S, class Sig, class E = empty_env>
    concept sender_of = see below;

  template<class... Ts>
    struct type-list; // exposition only

  template<class S, class E = empty_env>
    using single-sender-value-type = see below; // exposition only

  template<class S, class E = empty_env>
    concept single-sender = see below; // exposition only

  // [exec.getcomplsigs], completion signatures
  namespace completion-signatures { // exposition only
    struct get_completion_signatures_t;
  }
  using completion-signatures::get_completion_signatures_t;
  inline constexpr get_completion_signatures_t get_completion_signatures {};

  template<class S, class E = empty_env>
      requires sender_in<S, E>
    using completion_signatures_of_t = call-result-t<get_completion_signatures_t, S, E>;

  template<class... Ts>
    using decayed-tuple = tuple<decay_t<Ts>...>; // exposition only

  template<class... Ts>
    using variant-or-empty = see below; // exposition only

  template<class S,
           class E = empty_env,
           template<class...> class Tuple = decayed-tuple,
           template<class...> class Variant = variant-or-empty>
      requires sender_in<S, E>
    using value_types_of_t = see below;

  template<class S,
           class E = empty_env,
           template<class...> class Variant = variant-or-empty>
      requires sender_in<S, E>
    using error_types_of_t = see below;

  template<class S, class E = empty_env>
      requires sender_in<S, E>
    inline constexpr bool sends_stopped = see below;

  // [exec.connect], the connect sender algorithm
  namespace senders-connect { // exposition only
    struct connect_t;
  }
  using senders-connect::connect_t;
  inline constexpr connect_t connect{};

  template<class S, class R>
    using connect_result_t = decltype(connect(declval<S>(), declval<R>()));

  // [exec.factories], sender factories
  namespace senders-factories { // exposition only
    struct schedule_t;
    struct transfer_just_t;
  }
  inline constexpr unspecified just{};
  inline constexpr unspecified just_error{};
  inline constexpr unspecified just_stopped{};
  using senders-factories::schedule_t;
  using senders-factories::transfer_just_t;
  inline constexpr schedule_t schedule{};
  inline constexpr transfer_just_t transfer_just{};
  inline constexpr unspecified read{};

  template<scheduler S>
    using schedule_result_t = decltype(schedule(declval<S>()));

  // [exec.adapt], sender adaptors
  namespace sender-adaptor-closure { // exposition only
    template<class-type D>
      struct sender_adaptor_closure { };
  }
  using sender-adaptor-closure::sender_adaptor_closure;

  namespace sender-adaptors { // exposition only
    struct on_t;
    struct transfer_t;
    struct schedule_from_t;
    struct then_t;
    struct upon_error_t;
    struct upon_stopped_t;
    struct let_value_t;
    struct let_error_t;
    struct let_stopped_t;
    struct bulk_t;
    struct split_t;
    struct when_all_t;
    struct when_all_with_variant_t;
    struct transfer_when_all_t;
    struct transfer_when_all_with_variant_t;
    struct into_variant_t;
    struct stopped_as_optional_t;
    struct stopped_as_error_t;
    struct ensure_started_t;
  }
  using sender-adaptors::on_t;
  using sender-adaptors::transfer_t;
  using sender-adaptors::schedule_from_t;
  using sender-adaptors::then_t;
  using sender-adaptors::upon_error_t;
  using sender-adaptors::upon_stopped_t;
  using sender-adaptors::let_value_t;
  using sender-adaptors::let_error_t;
  using sender-adaptors::let_stopped_t;
  using sender-adaptors::bulk_t;
  using sender-adaptors::split_t;
  using sender-adaptors::when_all_t;
  using sender-adaptors::when_all_with_variant_t;
  using sender-adaptors::transfer_when_all_t;
  using sender-adaptors::transfer_when_all_with_variant_t;
  using sender-adaptors::into_variant_t;
  using sender-adaptors::stopped_as_optional_t;
  using sender-adaptors::stopped_as_error_t;
  using sender-adaptors::ensure_started_t;

  inline constexpr on_t on{};
  inline constexpr transfer_t transfer{};
  inline constexpr schedule_from_t schedule_from{};

  inline constexpr then_t then{};
  inline constexpr upon_error_t upon_error{};
  inline constexpr upon_stopped_t upon_stopped{};

  inline constexpr let_value_t let_value{};
  inline constexpr let_error_t let_error{};
  inline constexpr let_stopped_t let_stopped{};

  inline constexpr bulk_t bulk{};

  inline constexpr split_t split{};
  inline constexpr when_all_t when_all{};
  inline constexpr when_all_with_variant_t when_all_with_variant{};
  inline constexpr transfer_when_all_t transfer_when_all{};
  inline constexpr transfer_when_all_with_variant_t
    transfer_when_all_with_variant{};

  inline constexpr into_variant_t into_variant{};

  inline constexpr stopped_as_optional_t stopped_as_optional{};

  inline constexpr stopped_as_error_t stopped_as_error{};

  inline constexpr ensure_started_t ensure_started{};

  // [exec.consumers], sender consumers
  namespace sender-consumers { // exposition only
    struct start_detached_t;
  }
  using sender-consumers::start_detached_t;
  inline constexpr start_detached_t start_detached{};

  // [exec.utils], sender and receiver utilities
  // [exec.utils.rcvr.adptr]
  template<
      class-type Derived,
      receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads])
    class receiver_adaptor;

  template<class Fn>
    concept completion-signature = // exposition only
      see below;

  // [exec.utils.cmplsigs]
  template<completion-signature... Fns>
    struct completion_signatures {};

  template<class... Args> // exposition only
    using default-set-value =
      completion_signatures<set_value_t(Args...)>;

  template<class Err> // exposition only
    using default-set-error =
      completion_signatures<set_error_t(Err)>;

  template<class Sigs> // exposition only
    concept valid-completion-signatures = see below;

  // [exec.utils.mkcmplsigs]
  template<
    sender Sndr,
    class Env = empty_env,
    valid-completion-signatures AddlSigs = completion_signatures<>,
    template<class...> class SetValue = see below,
    template<class> class SetError = see below,
    valid-completion-signatures SetStopped = completion_signatures<set_stopped_t()>>
      requires sender_in<Sndr, Env>
  using make_completion_signatures = completion_signatures<see below>;

  // [exec.ctx], execution resources
  class run_loop;
}

namespace std::this_thread {
  // [exec.queries], queries
  namespace queries { // exposition only
    struct execute_may_block_caller_t;
  }
  using queries::execute_may_block_caller_t;
  inline constexpr execute_may_block_caller_t execute_may_block_caller{};

  namespace this-thread { // exposition only
    struct sync-wait-env; // exposition only
    template<class S>
        requires sender_in<S, sync-wait-env>
      using sync-wait-type = see-below; // exposition only
    template<class S>
      using sync-wait-with-variant-type = see-below; // exposition only

    struct sync_wait_t;
    struct sync_wait_with_variant_t;
  }
  using this-thread::sync_wait_t;
  using this-thread::sync_wait_with_variant_t;
  inline constexpr sync_wait_t sync_wait{};
  inline constexpr sync_wait_with_variant_t sync_wait_with_variant{};
}

namespace std::execution {
  // [exec.execute], one-way execution
  namespace execute { // exposition only
    struct execute_t;
  }
  using execute::execute_t;
  inline constexpr execute_t execute{};

  // [exec.as.awaitable]
  namespace coro-utils { // exposition only
    struct as_awaitable_t;
  }
  using coro-utils::as_awaitable_t;
  inline constexpr as_awaitable_t as_awaitable{};

  // [exec.with.awaitable.senders]
  template<class-type Promise>
    struct with_awaitable_senders;
}
  1. The exposition-only type variant-or-empty<Ts...> is defined as follows:

    1. If sizeof...(Ts) is greater than zero, variant-or-empty<Ts...> names the type variant<Us...> where Us... is the pack decay_t<Ts>... with duplicate types removed.

    2. Otherwise, variant-or-empty<Ts...> names the exposition-only class type:

      struct empty-variant {
        empty-variant() = delete;
      };
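The variant-or-empty rule above can be approximated in ordinary C++. The following is only a sketch under stated assumptions: the `sketch` namespace and the helper names `type_list`, `push_unique`, `unique_types`, `to_variant`, and `empty_variant` are hypothetical and are not part of the proposed wording. It decays every type, keeps the first occurrence of each, and falls back to a non-default-constructible class when the pack is empty.

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

namespace sketch {
// Simple type list used while removing duplicates.
template <class... Ts> struct type_list {};

// Append T to the list unless it is already present.
template <class List, class T> struct push_unique;
template <class... Us, class T>
struct push_unique<type_list<Us...>, T> {
  using type = std::conditional_t<(std::is_same_v<Us, T> || ...),
                                  type_list<Us...>, type_list<Us..., T>>;
};

// Fold the pack left to right, keeping the first occurrence of each type.
template <class List, class... Ts>
struct unique_types { using type = List; };
template <class List, class T, class... Ts>
struct unique_types<List, T, Ts...>
    : unique_types<typename push_unique<List, T>::type, Ts...> {};

template <class List> struct to_variant;
template <class... Us>
struct to_variant<type_list<Us...>> { using type = std::variant<Us...>; };

// Stand-in for the exposition-only empty-variant class.
struct empty_variant { empty_variant() = delete; };

template <class... Ts>
struct variant_or_empty_impl {
  using type = typename to_variant<
      typename unique_types<type_list<>, std::decay_t<Ts>...>::type>::type;
};
template <>
struct variant_or_empty_impl<> { using type = empty_variant; };

template <class... Ts>
using variant_or_empty = typename variant_or_empty_impl<Ts...>::type;
}  // namespace sketch
```

With this sketch, `sketch::variant_or_empty<int, const int&, double>` names `std::variant<int, double>`, because `const int&` decays to `int` and the duplicate is dropped.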
      

11.5. Queries [exec.queries]

11.5.1. std::get_env [exec.get.env]

  1. get_env is a customization point object. For some subexpression o of type O, get_env(o) is expression-equivalent to

    1. tag_invoke(std::get_env, const_cast<const O&>(o)) if that expression is well-formed.

      • Mandates: The type of the expression above satisfies queryable ([exec.queryable]).

    2. Otherwise, empty_env{}.

  2. The value of get_env(o) shall be valid while o is valid.

  3. When passed a sender object, get_env returns the sender’s attributes. When passed a receiver, get_env returns the receiver’s environment.

11.5.2. std::forwarding_query [exec.fwd.env]

  1. std::forwarding_query asks a query object whether it should be forwarded through queryable adaptors.

  2. The name std::forwarding_query denotes a query object. For some query object q of type Q, std::forwarding_query(q) is expression-equivalent to:

    1. mandate-nothrow-call(tag_invoke, std::forwarding_query, q) if that expression is well-formed.

      • Mandates: The expression above has type bool and is a core constant expression if q is a core constant expression.

    2. Otherwise, true if derived_from<Q, std::forwarding_query_t> is true.

    3. Otherwise, false.

  3. For a queryable object o, let FWD-QUERIES(o) be a queryable object such that for a query object q and a pack of subexpressions as, the expression q(FWD-QUERIES(o), as...) is ill-formed if forwarding_query(q) is false; otherwise, it is expression-equivalent to q(o, as...).
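To make the filtering behavior of FWD-QUERIES more concrete, here is a simplified model. It is a sketch under several assumptions: the `sketch` namespace, the queries `get_label_t` and `get_secret_t`, the `env` type, and the `fwd_env` wrapper are all hypothetical; queries dispatch to a `query` member function instead of `tag_invoke`; and only the inheritance-based fallback of `forwarding_query` is modeled, not the `tag_invoke` customization step.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
// Stand-in for std::forwarding_query_t. Only the "derives from the tag"
// fallback is modeled here.
struct forwarding_query_t {
  template <class Q>
  constexpr bool operator()(const Q&) const noexcept {
    return std::is_base_of_v<forwarding_query_t, Q>;
  }
};
inline constexpr forwarding_query_t forwarding_query{};

// Hypothetical forwarding query: opts in by deriving from the tag.
struct get_label_t : forwarding_query_t {
  template <class Env>
  constexpr auto operator()(const Env& env) const
      -> decltype(env.query(*this)) {
    return env.query(*this);
  }
};

// Hypothetical non-forwarding query.
struct get_secret_t {
  template <class Env>
  constexpr auto operator()(const Env& env) const
      -> decltype(env.query(*this)) {
    return env.query(*this);
  }
};

// Hypothetical queryable object answering both queries.
struct env {
  int label;
  int secret;
  constexpr int query(get_label_t) const noexcept { return label; }
  constexpr int query(get_secret_t) const noexcept { return secret; }
};

// Rough analogue of FWD-QUERIES(o): only forwarding queries reach the base.
template <class Env>
struct fwd_env {
  const Env& base;
  template <class Q,
            class = std::enable_if_t<std::is_base_of_v<forwarding_query_t, Q>>>
  constexpr auto query(Q q) const -> decltype(q(base)) {
    return q(base);
  }
};
}  // namespace sketch
```

In this model, `get_label_t{}(fwd_env<env>{e})` yields the same result as `get_label_t{}(e)`, while `get_secret_t` is simply not invocable on the wrapper.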

11.5.3. std::get_allocator [exec.get.allocator]

  1. get_allocator asks an object for its associated allocator.

  2. The name get_allocator denotes a query object. For some subexpression r, get_allocator(r) is expression-equivalent to mandate-nothrow-call(tag_invoke, std::get_allocator, as_const(r)).

    • Mandates: The type of the expression above satisfies Allocator.

  3. std::forwarding_query(std::get_allocator) is true.

  4. get_allocator() (with no arguments) is expression-equivalent to execution::read(std::get_allocator) ([exec.read]).

11.5.4. std::get_stop_token [exec.get.stop.token]

  1. get_stop_token asks an object for an associated stop token.

  2. The name get_stop_token denotes a query object. For some subexpression r, get_stop_token(r) is expression-equivalent to:

    1. mandate-nothrow-call(tag_invoke, std::get_stop_token, as_const(r)), if this expression is well-formed.

      • Mandates: The type of the expression above satisfies stoppable_token.

    2. Otherwise, never_stop_token{}.

  3. std::forwarding_query(std::get_stop_token) is true.

  4. get_stop_token() (with no arguments) is expression-equivalent to execution::read(std::get_stop_token) ([exec.read]).

11.5.5. execution::get_scheduler [exec.get.scheduler]

  1. get_scheduler asks an object for its associated scheduler.

  2. The name get_scheduler denotes a query object. For some subexpression r, get_scheduler(r) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_scheduler, as_const(r)).

    • Mandates: The type of the expression above satisfies scheduler.

  3. std::forwarding_query(execution::get_scheduler) is true.

  4. get_scheduler() (with no arguments) is expression-equivalent to execution::read(get_scheduler) ([exec.read]).

11.5.6. execution::get_delegatee_scheduler [exec.get.delegatee.scheduler]

  1. get_delegatee_scheduler asks an object for a scheduler to which work can be delegated for the purpose of forward progress delegation.

  2. The name get_delegatee_scheduler denotes a query object. For some subexpression r, get_delegatee_scheduler(r) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_delegatee_scheduler, as_const(r)).

    • Mandates: The type of the expression above satisfies scheduler.

  3. std::forwarding_query(execution::get_delegatee_scheduler) is true.

  4. get_delegatee_scheduler() (with no arguments) is expression-equivalent to execution::read(get_delegatee_scheduler) ([exec.read]).

11.5.7. execution::get_forward_progress_guarantee [exec.get.forward.progress.guarantee]

enum class forward_progress_guarantee {
  concurrent,
  parallel,
  weakly_parallel
};
  1. get_forward_progress_guarantee asks a scheduler about the forward progress guarantees of execution agents created by that scheduler.

  2. The name get_forward_progress_guarantee denotes a query object. For some subexpression s, let S be decltype((s)). If S does not satisfy scheduler, get_forward_progress_guarantee(s) is ill-formed. Otherwise, get_forward_progress_guarantee(s) is expression-equivalent to:

    1. mandate-nothrow-call(tag_invoke, get_forward_progress_guarantee, as_const(s)), if this expression is well-formed.

      • Mandates: The type of the expression above is forward_progress_guarantee.

    2. Otherwise, forward_progress_guarantee::weakly_parallel.

  3. If get_forward_progress_guarantee(s) for some scheduler s returns forward_progress_guarantee::concurrent, all execution agents created by that scheduler shall provide the concurrent forward progress guarantee. If it returns forward_progress_guarantee::parallel, all execution agents created by that scheduler shall provide at least the parallel forward progress guarantee.
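The "customization if present, otherwise a default" shape of this query can be sketched without the rest of the library. The code below is illustrative only: the `sketch` namespace, the detection trait `has_customization`, and the scheduler types `thread_pool_scheduler` and `plain_scheduler` are hypothetical, the `scheduler` concept check is omitted, and the ADL call to `tag_invoke` stands in for mandate-nothrow-call.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {
enum class forward_progress_guarantee { concurrent, parallel, weakly_parallel };

struct get_forward_progress_guarantee_t;

// Detects whether an ADL-visible tag_invoke overload exists for S.
template <class S, class = void>
struct has_customization : std::false_type {};
template <class S>
struct has_customization<
    S, std::void_t<decltype(tag_invoke(
           std::declval<const get_forward_progress_guarantee_t&>(),
           std::declval<const S&>()))>> : std::true_type {};

struct get_forward_progress_guarantee_t {
  template <class S>
  constexpr forward_progress_guarantee operator()(const S& s) const noexcept {
    if constexpr (has_customization<S>::value) {
      return tag_invoke(*this, s);  // customized answer
    } else {
      return forward_progress_guarantee::weakly_parallel;  // default
    }
  }
};
inline constexpr get_forward_progress_guarantee_t
    get_forward_progress_guarantee{};

// Hypothetical scheduler that advertises the parallel guarantee.
struct thread_pool_scheduler {
  friend constexpr forward_progress_guarantee tag_invoke(
      get_forward_progress_guarantee_t, const thread_pool_scheduler&) noexcept {
    return forward_progress_guarantee::parallel;
  }
};

// Hypothetical scheduler that provides no customization.
struct plain_scheduler {};
}  // namespace sketch
```

A scheduler that says nothing is therefore treated as `weakly_parallel`, the weakest of the three guarantees, which is the conservative choice for callers.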

11.5.8. this_thread::execute_may_block_caller [exec.execute.may.block.caller]

  1. this_thread::execute_may_block_caller asks a scheduler s whether a call execute(s, f) with any invocable f may block the thread where such a call occurs.

  2. The name this_thread::execute_may_block_caller denotes a query object. For some subexpression s, let S be decltype((s)). If S does not satisfy scheduler, this_thread::execute_may_block_caller(s) is ill-formed. Otherwise, this_thread::execute_may_block_caller(s) is expression-equivalent to:

    1. mandate-nothrow-call(tag_invoke, this_thread::execute_may_block_caller, as_const(s)), if this expression is well-formed.

      • Mandates: The type of the expression above is bool.

    2. Otherwise, true.

  3. If this_thread::execute_may_block_caller(s) for some scheduler s returns false, no execute(s, f) call with some invocable f shall block the calling thread.

11.5.9. execution::get_completion_scheduler [exec.completion.scheduler]

  1. get_completion_scheduler<completion-tag> obtains the completion scheduler associated with a completion tag from a sender’s attributes.

  2. The name get_completion_scheduler denotes a query object template. For some subexpression q, let Q be decltype((q)). If the template argument Tag in get_completion_scheduler<Tag>(q) is not one of set_value_t, set_error_t, or set_stopped_t, get_completion_scheduler<Tag>(q) is ill-formed. Otherwise, get_completion_scheduler<Tag>(q) is expression-equivalent to mandate-nothrow-call(tag_invoke, get_completion_scheduler<Tag>, as_const(q)) if this expression is well-formed.

    • Mandates: The type of the expression above satisfies scheduler.

  3. If, for some sender s and completion function C that has an associated completion tag Tag, get_completion_scheduler<Tag>(get_env(s)) is well-formed and results in a scheduler sch, and the sender s invokes C(r, args...), for some receiver r that has been connected to s, with additional arguments args..., on an execution agent that does not belong to the associated execution resource of sch, the behavior is undefined.

  4. The expression forwarding_query(get_completion_scheduler<CPO>) has value true.

11.6. Schedulers [exec.sched]

  1. The scheduler concept defines the requirements of a scheduler type ([async.ops]). schedule is a customization point object that accepts a scheduler. A valid invocation of schedule is a schedule-expression.

    template<class S>
      concept scheduler =
        queryable<S> &&
        requires(S&& s, const get_completion_scheduler_t<set_value_t> tag) {
          { schedule(std::forward<S>(s)) } -> sender;
          { tag_invoke(tag, std::get_env(
              schedule(std::forward<S>(s)))) } -> same_as<remove_cvref_t<S>>;
        } &&
        equality_comparable<remove_cvref_t<S>> &&
        copy_constructible<remove_cvref_t<S>>;
    
  2. Let S be the type of a scheduler and let E be the type of an execution environment for which sender_in<schedule_result_t<S>, E> is true. Then sender_of<schedule_result_t<S>, set_value_t(), E> shall be true.

  3. None of a scheduler’s copy constructor, destructor, equality comparison, or swap member functions shall exit via an exception.

  4. None of these member functions, nor a scheduler type’s schedule function, shall introduce data races as a result of concurrent invocations of those functions from different threads.

  5. For any two (possibly const) values s1 and s2 of some scheduler type S, s1 == s2 shall return true only if both s1 and s2 share the same associated execution resource.

  6. For a given scheduler expression s, the expression get_completion_scheduler<set_value_t>(std::get_env(schedule(s))) shall compare equal to s.

  7. A scheduler type’s destructor shall not block pending completion of any receivers connected to the sender objects returned from schedule. The ability to wait for completion of submitted function objects can be provided by the associated execution resource of the scheduler.
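The value-semantic requirements in paragraphs 3 and 5 can be illustrated with a minimal handle type. This is a sketch, not a conforming scheduler: the `sketch` namespace and the names `run_queue` and `queue_scheduler` are hypothetical, and `schedule` is not modeled at all. The point is only that a scheduler is a cheap, non-throwing handle whose equality reflects identity of the underlying execution resource.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {
struct run_queue {};  // hypothetical execution resource

// Hypothetical scheduler handle: it only refers to the resource.
struct queue_scheduler {
  run_queue* resource;

  // Equal only when both handles refer to the same execution resource.
  friend bool operator==(const queue_scheduler& a,
                         const queue_scheduler& b) noexcept {
    return a.resource == b.resource;
  }
  friend bool operator!=(const queue_scheduler& a,
                         const queue_scheduler& b) noexcept {
    return !(a == b);
  }
};

// Copying, destroying, and comparing must not exit via an exception.
static_assert(std::is_nothrow_copy_constructible_v<queue_scheduler>);
static_assert(std::is_nothrow_destructible_v<queue_scheduler>);
static_assert(noexcept(std::declval<const queue_scheduler&>() ==
                       std::declval<const queue_scheduler&>()));
}  // namespace sketch
```

Because the handle does not own the resource, destroying it cannot block on outstanding work, which matches the intent of paragraph 7.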

11.7. Receivers [exec.recv]

11.7.1. Receiver concepts [exec.recv.concepts]

  1. A receiver represents the continuation of an asynchronous operation. The receiver concept defines the requirements for a receiver type ([async.ops]). The receiver_of concept defines the requirements for a receiver type that is usable as the first argument of a set of completion operations corresponding to a set of completion signatures. The get_env customization point is used to access a receiver’s associated environment.

    template<class R>
      inline constexpr bool enable_receiver =
        requires {
          typename R::is_receiver;
        };
    
    template<class R>
      concept receiver =
        enable_receiver<remove_cvref_t<R>> &&
        requires(const remove_cvref_t<R>& r) {
          { get_env(r) } -> queryable;
        } &&
        move_constructible<remove_cvref_t<R>> &&  // rvalues are movable, and
        constructible_from<remove_cvref_t<R>, R>; // lvalues are copyable
    
    template<class Signature, class R>
      concept valid-completion-for = // exposition only
        requires (Signature* sig) {
          []<class Tag, class... Args>(Tag(*)(Args...))
              requires callable<Tag, remove_cvref_t<R>, Args...>
          {}(sig);
        };
    
    template<class R, class Completions>
      concept receiver_of =
        receiver<R> &&
        requires (Completions* completions) {
          []<valid-completion-for<R>...Sigs>(completion_signatures<Sigs...>*)
          {}(completions);
        };
    
  2. Remarks: Pursuant to [namespace.std], users can specialize enable_receiver to true for cv-unqualified program-defined types that model receiver, and false for types that do not. Such specializations shall be usable in constant expressions ([expr.const]) and have type const bool.

  3. Let r be a receiver and let op_state be an operation state associated with an asynchronous operation created by connecting r with a sender. Let token be a stop token equal to get_stop_token(get_env(r)). token shall remain valid for the duration of the asynchronous operation’s lifetime ([async.ops]). This means that, unless it knows about further guarantees provided by the type of receiver r, the implementation of op_state cannot use token after it executes a completion operation. This also implies that any stop callbacks registered on token must be destroyed before the invocation of the completion operation.
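The opt-in mechanism behind enable_receiver (a nested is_receiver type, or an explicit specialization as permitted by the Remarks in paragraph 2) can be approximated as follows. This is a C++17-style sketch: the `sketch` namespace and the types `tagged_receiver`, `legacy_receiver`, and `not_a_receiver` are hypothetical, and the extra defaulted template parameter exists only because the sketch uses `std::void_t` detection instead of a requires-expression.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
// Primary: types are not receivers unless they opt in.
template <class R, class = void>
inline constexpr bool enable_receiver = false;

// Opt-in by declaring a nested type named is_receiver.
template <class R>
inline constexpr bool
    enable_receiver<R, std::void_t<typename R::is_receiver>> = true;

struct tagged_receiver { using is_receiver = void; };  // opts in via the tag
struct legacy_receiver {};                             // cannot be modified
struct not_a_receiver {};

// Opt-in for a type that lacks the nested tag, by explicit specialization.
template <>
inline constexpr bool enable_receiver<legacy_receiver, void> = true;
}  // namespace sketch
```

Each specialization is usable in constant expressions and has type `const bool`, as the Remarks require.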

11.7.2. execution::set_value [exec.set.value]

  1. set_value is a value completion function ([async.ops]). Its associated completion tag is set_value_t. The expression set_value(R, Vs...) for some subexpression R and pack of subexpressions Vs is ill-formed if R is an lvalue or a const rvalue. Otherwise, it is expression-equivalent to mandate-nothrow-call(tag_invoke, set_value, R, Vs...).

11.7.3. execution::set_error [exec.set.error]

  1. set_error is an error completion function. Its associated completion tag is set_error_t. The expression set_error(R, E) for some subexpressions R and E is ill-formed if R is an lvalue or a const rvalue. Otherwise, it is expression-equivalent to mandate-nothrow-call(tag_invoke, set_error, R, E).

11.7.4. execution::set_stopped [exec.set.stopped]

  1. set_stopped is a stopped completion function. Its associated completion tag is set_stopped_t. The expression set_stopped(R) for some subexpression R is ill-formed if R is an lvalue or a const rvalue. Otherwise, it is expression-equivalent to mandate-nothrow-call(tag_invoke, set_stopped, R).
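All three completion functions share the rule that the receiver argument must be a non-const rvalue. A minimal sketch of how a customization point object can enforce that rule is shown below for the value channel only. The `sketch` namespace and the `store_receiver` type are hypothetical, and a plain ADL call to `tag_invoke` stands in for mandate-nothrow-call.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {
struct set_value_t {
  // R is deduced as a reference type for lvalues and as a const type for
  // const rvalues; both cases are removed from overload resolution.
  template <class R, class... Vs>
  auto operator()(R&& r, Vs&&... vs) const noexcept
      -> std::enable_if_t<!std::is_reference_v<R> && !std::is_const_v<R>> {
    tag_invoke(*this, std::move(r), std::forward<Vs>(vs)...);
  }
};
inline constexpr set_value_t set_value{};

// Hypothetical receiver that stores the value it is completed with.
struct store_receiver {
  int* out;
  friend void tag_invoke(set_value_t, store_receiver&& self, int v) noexcept {
    *self.out = v;
  }
};
}  // namespace sketch
```

Requiring an rvalue makes it explicit at the call site that completing a receiver consumes it, so a caller has to write `set_value(std::move(r), 42)` rather than `set_value(r, 42)`.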

11.8. Operation states [exec.opstate]

  1. The operation_state concept defines the requirements of an operation state type ([async.ops]).

    template<class O>
      concept operation_state =
        queryable<O> &&
        is_object_v<O> &&
        requires (O& o) {
          { start(o) } noexcept;
        };
    
  2. If an operation_state object is moved during the lifetime of its asynchronous operation ([async.ops]), the behavior is undefined.

  3. Library-provided operation state types are non-movable.

11.8.1. execution::start [exec.opstate.start]

  1. The name start denotes a customization point object that starts ([async.ops]) the asynchronous operation associated with the operation state object. The expression start(O) for some subexpression O is ill-formed if O is an rvalue. Otherwise, it is expression-equivalent to:

    mandate-nothrow-call(tag_invoke, start, O)
    
  2. If the function selected by tag_invoke does not start the asynchronous operation associated with the operation state O, the behavior of calling start(O) is undefined.

11.9. Senders [exec.snd]

11.9.1. Sender concepts [exec.snd.concepts]

  1. The sender concept defines the requirements for a sender type ([async.ops]). The sender_in concept defines the requirements for a sender type that can create asynchronous operations given an associated environment type. The sender_to concept defines the requirements for a sender type that can connect with a specific receiver type. The get_env customization point object is used to access a sender’s associated attributes. The connect customization point object is used to connect ([async.ops]) a sender and a receiver to produce an operation state.

    template<class Sigs>
      concept valid-completion-signatures = see below;
    
    template<class S>
      inline constexpr bool enable_sender =
        requires { typename S::is_sender; };
    
    template<is-awaitable<env-promise<empty_env>> S> // [exec.awaitables]
      inline constexpr bool enable_sender<S> = true;
    
    template<class S>
      concept sender =
        enable_sender<remove_cvref_t<S>> &&
        requires (const remove_cvref_t<S>& s) {
          { get_env(s) } -> queryable;
        } &&
        move_constructible<remove_cvref_t<S>> &&  // rvalues are movable, and
        constructible_from<remove_cvref_t<S>, S>; // lvalues are copyable
    
    template<class S, class E = empty_env>
      concept sender_in =
        sender<S> &&
        requires (S&& s, E&& e) {
          { get_completion_signatures(std::forward<S>(s), std::forward<E>(e)) } ->
            valid-completion-signatures;
        };
    
    template<class S, class R>
      concept sender_to =
        sender_in<S, env_of_t<R>> &&
        receiver_of<R, completion_signatures_of_t<S, env_of_t<R>>> &&
        requires (S&& s, R&& r) {
          connect(std::forward<S>(s), std::forward<R>(r));
        };
    
  2. A type Sigs satisfies and models the exposition-only concept valid-completion-signatures if it names a specialization of the completion_signatures class template.

  3. Remarks: Pursuant to [namespace.std], users can specialize enable_sender to true for cv-unqualified program-defined types that model sender, and false for types that do not. Such specializations shall be usable in constant expressions ([expr.const]) and have type const bool.

  4. The sender_of concept defines the requirements for a sender type that completes with the completion signature specified for the given completion function.

    template<class> struct sender-of-helper; // exposition only
    template<class R, class... As>
      struct sender-of-helper<R(As...)> {
        using tag = R;
    
        template<class... Bs>
          using as-sig = R(Bs...);
      };
    
    template<class S, class Sig, class E = empty_env>
      concept sender_of =
        sender_in<S, E> &&
        MATCHING-SIG( // see [exec.general]
          Sig,
          gather-signatures<  // see [exec.utils.cmplsigs]
            typename sender-of-helper<Sig>::tag,
            S, E,
            sender-of-helper<Sig>::template as-sig, type_identity_t>);
    
    1. [Example:

      auto s1 = just() | then([]{});
      using S1 = decltype(s1);
      
      static_assert(sender_of<S1, set_value_t()>);
      static_assert(sender_of<S1, set_error_t(exception_ptr)>);
      static_assert(!sender_of<S1, set_stopped_t()>);
      
      auto s2 = s1 | let_error([](auto) { return just('a'); });
      using S2 = decltype(s2);
      
      static_assert(!sender_of<S2, set_value_t()>);
      static_assert(!sender_of<S2, set_value_t(char)>);
      static_assert(!sender_of<S2, set_error_t(exception_ptr)>);
      static_assert(!sender_of<S2, set_stopped_t()>);
      

      -- end example]

  5. For a type T, SET-VALUE-SIG(T) names the type set_value_t() if T is cv void; otherwise, it names the type set_value_t(T).

  6. Library-provided sender types:

    • Always expose an overload of a customization of connect that accepts an rvalue sender.

    • Only expose an overload of a customization of connect that accepts an lvalue sender if they model copy_constructible.

    • Model copy_constructible if they satisfy copy_constructible.
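The SET-VALUE-SIG mapping from paragraph 5 is small enough to write out. The following is a sketch in which the `sketch` namespace, the empty `set_value_t` tag, and the helper `set_value_sig_impl` are hypothetical stand-ins. The void case needs its own specialization because a function type cannot be formed with a dependent parameter of type cv void.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
struct set_value_t {};  // stand-in for the real completion tag

// General case: the awaited result type becomes the single argument.
template <class T, bool = std::is_void_v<T>>
struct set_value_sig_impl {
  using type = set_value_t(T);
};

// cv void: the completion carries no value.
template <class T>
struct set_value_sig_impl<T, true> {
  using type = set_value_t();
};

template <class T>
using set_value_sig = typename set_value_sig_impl<T>::type;
}  // namespace sketch
```

This is the mapping used when an awaitable is treated as a sender: a `co_await` expression of type `void` corresponds to `set_value_t()`, and any other result type `T` corresponds to `set_value_t(T)`.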

11.9.2. Awaitable helpers [exec.awaitables]

  1. The sender concepts recognize awaitables as senders. For this clause ([exec]), an awaitable is an expression that would be well-formed as the operand of a co_await expression within a given context.

  2. For a subexpression c, let GET-AWAITER(c, p) be expression-equivalent to the series of transformations and conversions applied to c as the operand of an await-expression in a coroutine, resulting in lvalue e as described by [expr.await]/3.2-4, where p is an lvalue referring to the coroutine’s promise type, P. This includes the invocation of the promise type’s await_transform member if any, the invocation of the operator co_await picked by overload resolution if any, and any necessary implicit conversions and materializations.

    I have opened cwg#250 to give these transformations a term-of-art so we can more easily refer to it here.
  3. Let is-awaitable be the following exposition-only concept:

    template<class T>
    concept await-suspend-result = see below;
    
    template<class A, class P>
    concept is-awaiter = // exposition only
      requires (A& a, coroutine_handle<P> h) {
        a.await_ready() ? 1 : 0;
        { a.await_suspend(h) } -> await-suspend-result;
        a.await_resume();
      };
    
    template<class C, class P>
    concept is-awaitable =
      requires (C (*fc)() noexcept, P& p) {
        { GET-AWAITER(fc(), p) } -> is-awaiter<P>;
      };
    

    await-suspend-result<T> is true if and only if one of the following is true:

    • T is void, or

    • T is bool, or

    • T is a specialization of coroutine_handle.

  4. For a subexpression c such that decltype((c)) is type C, and an lvalue p of type P, await-result-type<C, P> names the type decltype(GET-AWAITER(c, p).await_resume()).

  5. Let with-await-transform be the exposition-only class template:

    template<class Derived>
    struct with-await-transform {
      template<class T>
      T&& await_transform(T&& value) noexcept {
        return std::forward<T>(value);
      }
    
      template<class T>
        requires tag_invocable<as_awaitable_t, T, Derived&>
      auto await_transform(T&& value)
        noexcept(nothrow_tag_invocable<as_awaitable_t, T, Derived&>)
        -> tag_invoke_result_t<as_awaitable_t, T, Derived&> {
        return tag_invoke(as_awaitable, std::forward<T>(value), static_cast<Derived&>(*this));
      }
    };
    
  6. Let env-promise be the exposition-only class template:

    template<class Env>
    struct env-promise : with-await-transform<env-promise<Env>> {
      unspecified get_return_object() noexcept;
      unspecified initial_suspend() noexcept;
      unspecified final_suspend() noexcept;
      void unhandled_exception() noexcept;
      void return_void() noexcept;
      coroutine_handle<> unhandled_stopped() noexcept;
    
      friend const Env& tag_invoke(get_env_t, const env-promise&) noexcept;
    };
    

    Specializations of env-promise are only used for the purpose of type computation; its members need not be defined.

11.9.3. execution::get_completion_signatures [exec.getcomplsigs]

  1. get_completion_signatures is a customization point object. Let s be an expression such that decltype((s)) is S, and let e be an expression such that decltype((e)) is E. Then get_completion_signatures(s, e) is expression-equivalent to:

    1. tag_invoke_result_t<get_completion_signatures_t, S, E>{} if that expression is well-formed,

      • Mandates: valid-completion-signatures<Sigs>, where Sigs names the type tag_invoke_result_t<get_completion_signatures_t, S, E>.

    2. Otherwise, remove_cvref_t<S>::completion_signatures{} if that expression is well-formed,

      • Mandates: valid-completion-signatures<Sigs>, where Sigs names the type remove_cvref_t<S>::completion_signatures.

    3. Otherwise, if is-awaitable<S, env-promise<E>> is true, then:

      completion_signatures<
        SET-VALUE-SIG(await-result-type<S, env-promise<E>>), // see [exec.snd.concepts]
        set_error_t(exception_ptr),
        set_stopped_t()>{}
      
    4. Otherwise, get_completion_signatures(s, e) is ill-formed.

  2. Let r be an rvalue receiver of type R, and let S be the type of a sender such that sender_in<S, env_of_t<R>> is true. Let Sigs... be the template arguments of the completion_signatures specialization named by completion_signatures_of_t<S, env_of_t<R>>. Let CSO be a completion function. If sender S or its operation state cause the expression CSO(r, args...) to be potentially evaluated ([basic.def.odr]) then there shall be a signature Sig in Sigs... such that MATCHING-SIG(tag_t<CSO>(decltype(args)...), Sig) is true ([exec.general]).

11.9.4. execution::connect [exec.connect]

  1. connect connects ([async.ops]) a sender with a receiver.

  2. The name connect denotes a customization point object. For subexpressions s and r, let S be decltype((s)) and R be decltype((r)), and let DS and DR be the decayed types of S and R, respectively.

  3. Let connect-awaitable-promise be the following class:

    struct connect-awaitable-promise : with-await-transform<connect-awaitable-promise> {
      DR& rcvr; // exposition only
    
      connect-awaitable-promise(DS&, DR& r) noexcept : rcvr(r) {}
    
      suspend_always initial_suspend() noexcept { return {}; }
      [[noreturn]] suspend_always final_suspend() noexcept { std::terminate(); }
      [[noreturn]] void unhandled_exception() noexcept { std::terminate(); }
      [[noreturn]] void return_void() noexcept { std::terminate(); }
    
      coroutine_handle<> unhandled_stopped() noexcept {
        set_stopped((DR&&) rcvr);
        return noop_coroutine();
      }
    
      operation-state-task get_return_object() noexcept {
        return operation-state-task{
          coroutine_handle<connect-awaitable-promise>::from_promise(*this)};
      }
    
      friend auto tag_invoke(get_env_t, connect-awaitable-promise& self)
        noexcept(nothrow-callable<get_env_t, const DR&>) -> env_of_t<const DR&> {
        return get_env(self.rcvr);
      }
    };
    
  4. Let operation-state-task be the following class:

    struct operation-state-task {
      using promise_type = connect-awaitable-promise;
      coroutine_handle<> coro; // exposition only
    
      explicit operation-state-task(coroutine_handle<> h) noexcept : coro(h) {}
      operation-state-task(operation-state-task&& o) noexcept
        : coro(exchange(o.coro, {})) {}
      ~operation-state-task() { if (coro) coro.destroy(); }
    
      friend void tag_invoke(start_t, operation-state-task& self) noexcept {
        self.coro.resume();
      }
    };
    
  5. Let V name the type await-result-type<DS, connect-awaitable-promise>, let Sigs name the type:

    completion_signatures<
      SET-VALUE-SIG(V), // see [exec.snd.concepts]
      set_error_t(exception_ptr),
      set_stopped_t()>
    

    and let connect-awaitable be an exposition-only coroutine defined as follows:

    template<class Fun, class... Ts>
    auto suspend-complete(Fun fun, Ts&&... as) noexcept { // exposition only
      auto fn = [&, fun]() noexcept { fun(std::forward<Ts>(as)...); };
    
      struct awaiter {
        decltype(fn) fn_;
    
        static bool await_ready() noexcept { return false; }
        void await_suspend(coroutine_handle<>) noexcept { fn_(); }
        [[noreturn]] void await_resume() noexcept { unreachable(); }
      };
      return awaiter{fn};
    };
    
    operation-state-task connect-awaitable(DS s, DR r) requires receiver_of<DR, Sigs> {
      exception_ptr ep;
      try {
        if constexpr (same_as<V, void>) {
          co_await std::move(s);
          co_await suspend-complete(set_value, std::move(r));
        } else {
          co_await suspend-complete(set_value, std::move(r), co_await std::move(s));
        }
      } catch(...) {
        ep = current_exception();
      }
      co_await suspend-complete(set_error, std::move(r), std::move(ep));
    }
    
  6. If S does not satisfy sender or if R does not satisfy receiver, connect(s, r) is ill-formed. Otherwise, the expression connect(s, r) is expression-equivalent to:

    1. tag_invoke(connect, s, r) if connectable-with-tag-invoke<S, R> is modeled.

      • Mandates: The type of the tag_invoke expression above satisfies operation_state.

    2. Otherwise, connect-awaitable(s, r) if that expression is well-formed.

    3. Otherwise, connect(s, r) is ill-formed.

11.9.5. Sender factories [exec.factories]

11.9.5.1. execution::schedule [exec.schedule]
  1. schedule obtains a schedule-sender ([async.ops]) from a scheduler.

  2. The name schedule denotes a customization point object. For some subexpression s, the expression schedule(s) is expression-equivalent to:

    1. tag_invoke(schedule, s), if that expression is valid. If the function selected by tag_invoke does not return a sender whose set_value completion scheduler is equivalent to s, the behavior of calling schedule(s) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, schedule(s) is ill-formed.

11.9.5.2. execution::just, execution::just_error, execution::just_stopped [exec.just]
  1. just is a factory for senders whose asynchronous operations complete synchronously in their start operation with a value completion operation. just_error is a factory for senders whose asynchronous operations complete synchronously in their start operation with an error completion operation. just_stopped is a factory for senders whose asynchronous operations complete synchronously in their start operation with a stopped completion operation.

  2. Let just-sender be the class template:

    template<class Tag, movable-value... Ts>
    struct just-sender { // exposition only
      using is_sender = unspecified;
      using completion_signatures =
        execution::completion_signatures<Tag(Ts...)>;
    
      tuple<Ts...> vs_; // exposition only
    
      template<class R>
      struct operation { // exposition only
        tuple<Ts...> vs_; // exposition only
        R r_; // exposition only
    
        friend void tag_invoke(start_t, operation& s) noexcept {
          apply([&s](Ts&... values) {
            Tag()(std::move(s.r_), std::move(values)...);
          }, s.vs_);
        }
      };
    
      template<receiver_of<completion_signatures> R>
        requires (copy_constructible<Ts> &&...)
      friend operation<decay_t<R>> tag_invoke(connect_t, const just-sender& s, R && r) {
        return { s.vs_, std::forward<R>(r) };
      }
    
      template<receiver_of<completion_signatures> R>
      friend operation<decay_t<R>> tag_invoke(connect_t, just-sender&& s, R && r) {
        return { std::move(s.vs_), std::forward<R>(r) };
      }
    };
    
  3. The name just denotes a customization point object. For some pack of subexpressions vs, let Vs be the template parameter pack decltype((vs)).... just(vs...) is expression-equivalent to just-sender<set_value_t, remove_cvref_t<Vs>...>({vs...}).

  4. The name just_error denotes a customization point object. For some subexpression err, let Err be decltype((err)). just_error(err) is expression-equivalent to just-sender<set_error_t, remove_cvref_t<Err>>({err}).

  5. The name just_stopped denotes a customization point object. just_stopped() is expression-equivalent to just-sender<set_stopped_t>().
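To show how the pieces of just-sender fit together at run time, here is a reduced, self-contained model of the value channel. It is a sketch under assumptions: everything lives in a hypothetical `sketch` namespace, the customization point objects are bare ADL forwarders to `tag_invoke` with no concept checks, only the rvalue `connect` overload is provided, and `sum_receiver` is an invented receiver used for illustration.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {
struct set_value_t {
  template <class R, class... Vs>
  void operator()(R&& r, Vs&&... vs) const noexcept {
    tag_invoke(*this, std::forward<R>(r), std::forward<Vs>(vs)...);
  }
};
struct start_t {
  template <class O>
  void operator()(O& o) const noexcept { tag_invoke(*this, o); }
};
struct connect_t {
  template <class S, class R>
  auto operator()(S&& s, R&& r) const {
    return tag_invoke(*this, std::forward<S>(s), std::forward<R>(r));
  }
};
inline constexpr set_value_t set_value{};
inline constexpr start_t start{};
inline constexpr connect_t connect{};

// Operation state: owns the values and the receiver until start is called.
template <class R, class... Ts>
struct just_operation {
  std::tuple<Ts...> vs_;
  R r_;
  friend void tag_invoke(start_t, just_operation& self) noexcept {
    std::apply([&self](Ts&... values) {
      set_value(std::move(self.r_), std::move(values)...);
    }, self.vs_);
  }
};

template <class... Ts>
struct just_sender {
  std::tuple<Ts...> vs_;
  template <class R>
  friend just_operation<std::decay_t<R>, Ts...> tag_invoke(
      connect_t, just_sender&& s, R&& r) {
    return {std::move(s.vs_), std::forward<R>(r)};
  }
};

template <class... Vs>
just_sender<std::decay_t<Vs>...> just(Vs&&... vs) {
  return {std::tuple<std::decay_t<Vs>...>(std::forward<Vs>(vs)...)};
}

// Hypothetical receiver that records the sum of two values.
struct sum_receiver {
  int* out;
  friend void tag_invoke(set_value_t, sum_receiver&& self, int a,
                         int b) noexcept {
    *self.out = a + b;
  }
};
}  // namespace sketch
```

Connecting `sketch::just(40, 2)` to a `sum_receiver` produces an operation state and does nothing else; the receiver is completed, synchronously, only when `sketch::start` is called on that operation state.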

11.9.5.3. execution::transfer_just [exec.transfer.just]
  1. transfer_just is a factory for senders whose asynchronous operations execute value completion operations on an execution agent belonging to the execution resource associated with a specified scheduler.

  2. The name transfer_just denotes a customization point object. For some subexpression s and pack of subexpressions vs, let S be decltype((s)) and let Vs be the template parameter pack decltype((vs)).... If S does not satisfy scheduler, or any type V in Vs does not satisfy movable-value, transfer_just(s, vs...) is ill-formed. Otherwise, transfer_just(s, vs...) is expression-equivalent to:

    1. tag_invoke(transfer_just, s, vs...), if that expression is valid. Let as be a pack of rvalue subexpressions of types decay_t<Vs>... referring to objects direct-initialized from vs. If the function selected by tag_invoke does not return a sender whose asynchronous operations execute value completion operations on an execution agent belonging to the execution resource associated with s, with value result datums as, the behavior of calling transfer_just(s, vs...) is undefined.

      • Mandates: sender_of<R, set_value_t(decay_t<Vs>...), E>, where R is the type of the tag_invoke expression above, and E is the type of an environment.

    2. Otherwise, transfer(just(vs...), s).

11.9.5.4. execution::read [exec.read]
  1. read is a factory for a sender whose asynchronous operation completes synchronously in its start operation with a value completion result equal to a value read from the receiver’s associated environment.

  2. read is a customization point object of the unspecified class type:

    template<class Tag>
      struct read-sender; // exposition only
    
    struct read-t { // exposition only
      template<class Tag>
        constexpr read-sender<Tag> operator()(Tag) const noexcept {
          return {};
        }
    };
    
  3. read-sender is the exposition-only class template:

    template<class Tag>
      struct read-sender { // exposition only
        using is_sender = unspecified;
        template<class R>
          struct operation-state { // exposition only
            R r_; // exposition only
    
            friend void tag_invoke(start_t, operation-state& s) noexcept {
              TRY-SET-VALUE(std::move(s.r_), Tag{}(get_env(s.r_)));
            }
          };
    
        template<receiver R>
        friend operation-state<decay_t<R>> tag_invoke(connect_t, read-sender, R && r) {
          return { std::forward<R>(r) };
        }
    
        template<class Env>
            requires callable<Tag, Env>
          friend auto tag_invoke(get_completion_signatures_t, read-sender, Env)
            -> completion_signatures<
              set_value_t(call-result-t<Tag, Env>), set_error_t(exception_ptr)>; // not defined
    
        template<class Env>
            requires nothrow-callable<Tag, Env>
          friend auto tag_invoke(get_completion_signatures_t, read-sender, Env)
            -> completion_signatures<set_value_t(call-result-t<Tag, Env>)>; // not defined
    
        friend empty_env tag_invoke(get_env_t, const read-sender&) noexcept {
          return {};
        }
      };
    

    where TRY-SET-VALUE(r, e), for two subexpressions r and e, is equivalent to:

    try {
      set_value(r, e);
    } catch(...) {
      set_error(r, current_exception());
    }
    

    if e is potentially-throwing; or set_value(r, e) otherwise.
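    The potentially-throwing branch of TRY-SET-VALUE can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: recording_receiver, try_set_value, and run_try_set_value are hypothetical names, the receiver uses member functions instead of the tag_invoke customizations, and the expression e is modelled as a callable so that its evaluation happens inside the try block.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical toy receiver: records which completion it received.
struct recording_receiver {
  std::string* log;  // not owned
  void set_value(int v) { *log += "value:" + std::to_string(v); }
  void set_error(std::exception_ptr) noexcept { *log += "error"; }
};

// Sketch of TRY-SET-VALUE(r, e) when e is potentially-throwing.
// The query callable stands in for the expression Tag{}(get_env(r)).
template <class Receiver, class Query>
void try_set_value(Receiver r, Query query) {
  try {
    r.set_value(query());
  } catch (...) {
    r.set_error(std::current_exception());
  }
}

// Hypothetical driver: returns the completion that was delivered.
inline std::string run_try_set_value(bool should_throw) {
  std::string log;
  try_set_value(recording_receiver{&log}, [should_throw]() -> int {
    if (should_throw) throw std::runtime_error("query failed");
    return 42;
  });
  return log;
}
```

    With this sketch, a query that returns normally produces a value completion, and a query that throws is turned into an error completion carrying the current exception.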

11.9.6. Sender adaptors [exec.adapt]

11.9.6.1. General [exec.adapt.general]
  1. Subclause [exec.adapt] specifies a set of sender adaptors.

  2. The bitwise OR operator is overloaded for the purpose of creating sender chains. The adaptors also support function call syntax with equivalent semantics.

  3. Unless otherwise specified, a sender adaptor is required to not begin executing any functions that would observe or modify any of the arguments of the adaptor before the returned sender is connected with a receiver using connect, and start is called on the resulting operation state. This requirement applies to any function that is selected by the implementation of the sender adaptor.

  4. Unless otherwise specified, a parent sender ([async.ops]) with a single child sender s has an associated attributes object equal to FWD-QUERIES(get_env(s)) ([exec.fwd.env]). Unless otherwise specified, a parent sender with more than one child sender has an associated attributes object equal to empty_env{}. These requirements apply to any function that is selected by the implementation of the sender adaptor.

  5. Unless otherwise specified, when a parent sender is connected to a receiver r, any receiver used to connect a child sender has an associated environment equal to FWD-QUERIES(get_env(r)). This requirement applies to any sender returned from a function that is selected by the implementation of such sender adaptor.

  6. For any sender type, receiver type, operation state type, queryable type, or coroutine promise type that is part of the implementation of any sender adaptor in this subclause and that is a class template, the template arguments do not contribute to the associated entities ([basic.lookup.argdep]) of a function call where a specialization of the class template is an associated entity.

    [Example:

    namespace sender-adaptors { // exposition only
      template<class Sch, class S> // arguments are not associated entities ([lib.tmpl-heads])
      class on-sender {
        // ...
      };
    
      struct on_t {
        template<scheduler Sch, sender S>
        on-sender<Sch, S> operator()(Sch&& sch, S&& s) const {
          // ...
        }
      };
    }
    inline constexpr sender-adaptors::on_t on{};
    

    -- end example]

  7. If a sender returned from a sender adaptor specified in this subsection is specified to include set_error_t(E) among its set of completion signatures where decay_t<E> names the type exception_ptr, but the implementation does not potentially evaluate an error completion operation with an exception_ptr argument, the implementation is allowed to omit the exception_ptr error completion signature from the set.

11.9.6.2. Sender adaptor closure objects [exec.adapt.objects]
  1. A pipeable sender adaptor closure object is a function object that accepts one or more sender arguments and returns a sender. For a sender adaptor closure object C and an expression S such that decltype((S)) models sender, the following expressions are equivalent and yield a sender:

    C(S)
    S | C
    

    Given an additional pipeable sender adaptor closure object D, the expression C | D produces another pipeable sender adaptor closure object E:

    E is a perfect forwarding call wrapper ([func.require]) with the following properties:

    • Its target object is an object d of type decay_t<decltype((D))> direct-non-list-initialized with D.

    • It has one bound argument entity, an object c of type decay_t<decltype((C))> direct-non-list-initialized with C.

    • Its call pattern is d(c(arg)), where arg is the argument used in a function call expression of E.

    The expression C | D is well-formed if and only if the initializations of the state entities of E are all well-formed.
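    The equivalence of C(S) and S | C, and the call pattern d(c(arg)) of C | D, can be illustrated with a small stand-alone model. This is a sketch under assumptions, not the specified machinery: closure_base, closure, is_closure, and demo_pipe are hypothetical names, closure_base stands in for sender_adaptor_closure, and a "sender" is modelled as a plain int.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical CRTP base standing in for sender_adaptor_closure<T>.
template <class Derived>
struct closure_base {};

// A closure wrapping a callable that maps one "sender" to another.
template <class F>
struct closure : closure_base<closure<F>> {
  F f;
  explicit closure(F fn) : f(std::move(fn)) {}
  template <class S>
  auto operator()(S&& s) const { return f(std::forward<S>(s)); }
};

template <class T> struct is_closure : std::false_type {};
template <class F> struct is_closure<closure<F>> : std::true_type {};

// S | C is equivalent to C(S). Disabled when S is itself a closure.
template <class S, class F,
          std::enable_if_t<!is_closure<std::decay_t<S>>::value, int> = 0>
auto operator|(S&& s, const closure<F>& c) {
  return c(std::forward<S>(s));
}

// C | D yields a new closure E whose call pattern is d(c(arg)).
template <class F, class G>
auto operator|(const closure<F>& c, const closure<G>& d) {
  auto composed = [c, d](auto&& arg) {
    return d(c(std::forward<decltype(arg)>(arg)));
  };
  return closure<decltype(composed)>(composed);
}

// Hypothetical demo: the "sender" is an int.
inline int demo_pipe(int start) {
  closure add_one([](int x) { return x + 1; });
  closure twice([](int x) { return x * 2; });
  auto e = add_one | twice;  // E = C | D
  return start | e;          // same as twice(add_one(start))
}
```

    In this model, start | add_one | twice and start | (add_one | twice) give the same result, which mirrors the composition rule stated above.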

  2. An object t of type T is a pipeable sender adaptor closure object if T models derived_from<sender_adaptor_closure<T>>, T has no other base classes of type sender_adaptor_closure<U> for any other type U, and T does not model sender.

  3. The template parameter D for sender_adaptor_closure can be an incomplete type. Before any expression of type cv D appears as an operand to the | operator, D shall be complete and model derived_from<sender_adaptor_closure<D>>. The behavior of an expression involving an object of type cv D as an operand to the | operator is undefined if overload resolution selects a program-defined operator| function.

  4. A pipeable sender adaptor object is a customization point object that accepts a sender as its first argument and returns a sender.

  5. If a pipeable sender adaptor object accepts only one argument, then it is a pipeable sender adaptor closure object.

  6. If a pipeable sender adaptor object adaptor accepts more than one argument, then let s be an expression such that decltype((s)) models sender, let args... be arguments such that adaptor(s, args...) is a well-formed expression as specified in the rest of this subclause ([exec.adapt.objects]), and let BoundArgs be a pack that denotes decay_t<decltype((args))>.... The expression adaptor(args...) produces a pipeable sender adaptor closure object f that is a perfect forwarding call wrapper with the following properties:

    • Its target object is a copy of adaptor.

    • Its bound argument entities bound_args consist of objects of types BoundArgs... direct-non-list-initialized with std::forward<decltype((args))>(args)..., respectively.

    • Its call pattern is adaptor(r, bound_args...), where r is the argument used in a function call expression of f.

    The expression adaptor(args...) is well-formed if and only if the initializations of the bound argument entities of the result, as specified above, are all well-formed.
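    The partial application adaptor(args...) described above can be sketched as follows. The names bound_closure, scale_t, scale, and demo_scale are hypothetical, a "sender" is again modelled as an int, and the sketch always passes the bound arguments as const lvalues, which is a simplification of a perfect forwarding call wrapper.

```cpp
#include <tuple>
#include <utility>

// Hypothetical closure produced by adaptor(args...): it stores a copy of
// the adaptor and decay-copied bound arguments.
template <class Adaptor, class... BoundArgs>
struct bound_closure {
  Adaptor adaptor;
  std::tuple<BoundArgs...> bound_args;

  // Call pattern: adaptor(r, bound_args...).
  template <class S>
  auto operator()(S&& s) const {
    return std::apply(
        [&](const BoundArgs&... args) {
          return adaptor(std::forward<S>(s), args...);
        },
        bound_args);
  }
};

// s | adaptor(args...) forwards to the closure's call operator.
template <class S, class A, class... B>
auto operator|(S&& s, const bound_closure<A, B...>& c) {
  return c(std::forward<S>(s));
}

// Hypothetical two-argument adaptor object.
struct scale_t {
  // Full form: adaptor(s, args...).
  int operator()(int s, int factor) const { return s * factor; }
  // Partial form: adaptor(args...) returns a closure.
  bound_closure<scale_t, int> operator()(int factor) const {
    return {*this, std::make_tuple(factor)};
  }
};
inline constexpr scale_t scale{};

inline int demo_scale(int x) { return x | scale(3); }
```

    Here scale(5, 3) and 5 | scale(3) evaluate the same call, which is the relationship the call pattern above is meant to guarantee.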

11.9.6.3. execution::on [exec.on]
  1. on adapts an input sender into a sender that will start on an execution agent belonging to a particular scheduler’s associated execution resource.

  2. Let replace-scheduler(e, sch) be an expression denoting an object e' such that get_scheduler(e') returns a copy of sch, and tag_invoke(tag, e', args...) is expression-equivalent to tag(e, args...) for all arguments args... and for all tag whose type satisfies forwarding-query and is not get_scheduler_t.

  3. The name on denotes a customization point object. For some subexpressions sch and s, let Sch be decltype((sch)) and S be decltype((s)). If Sch does not satisfy scheduler, or S does not satisfy sender, on is ill-formed. Otherwise, the expression on(sch, s) is expression-equivalent to:

    1. tag_invoke(on, sch, s), if that expression is valid. If the function selected above does not return a sender which starts s on an execution agent of the associated execution resource of sch when started, the behavior of calling on(sch, s) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, constructs a sender s1. When s1 is connected with some receiver out_r, it:

      1. Constructs a receiver r such that:

        1. When set_value(r) is called, it calls connect(s, r2), where r2 is as specified below, which results in op_state3. It calls start(op_state3). If any of these throws an exception, it calls set_error on out_r, passing current_exception() as the second argument.

        2. set_error(r, e) is expression-equivalent to set_error(out_r, e).

        3. set_stopped(r) is expression-equivalent to set_stopped(out_r).

        4. get_env(r) is expression-equivalent to get_env(out_r).

      2. Calls schedule(sch), which results in s2. It then calls connect(s2, r), resulting in op_state2.

      3. op_state2 is wrapped by a new operation state, op_state1, that is returned to the caller.

      4. r2 is a receiver that wraps a reference to out_r and forwards all completion operations to it. In addition, get_env(r2) returns replace-scheduler(e, sch).

      5. When start is called on op_state1, it calls start on op_state2.

      6. The lifetime of op_state2, once constructed, lasts until either op_state3 is constructed or op_state1 is destroyed, whichever comes first. The lifetime of op_state3, once constructed, lasts until op_state1 is destroyed.

    3. Given subexpressions s1 and e, where s1 is a sender returned from on or a copy of such, let S1 be decltype((s1)), let E be decltype((e)), and let E' be decltype((replace-scheduler(e, sch))). Then the type of tag_invoke(get_completion_signatures, s1, e) shall be:

      make_completion_signatures<
        copy_cvref_t<S1, S>,
        E',
        make_completion_signatures<
          schedule_result_t<Sch>,
          E,
          completion_signatures<set_error_t(exception_ptr)>,
          no-value-completions>>;
      

      where no-value-completions<As...> names the type completion_signatures<> for any set of types As....

11.9.6.4. execution::transfer [exec.transfer]
  1. transfer adapts a sender into a sender with a different associated set_value completion scheduler. [Note: it results in a transition between different execution resources when executed. --end note]

  2. The name transfer denotes a customization point object. For some subexpressions sch and s, let Sch be decltype((sch)) and S be decltype((s)). If Sch does not satisfy scheduler, or S does not satisfy sender, transfer is ill-formed. Otherwise, the expression transfer(s, sch) is expression-equivalent to:

    1. tag_invoke(transfer, get_completion_scheduler<set_value_t>(get_env(s)), s, sch), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(transfer, s, sch), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, schedule_from(sch, s).

    If the function selected above does not return a sender which is a result of a call to schedule_from(sch, s2), where s2 is a sender which sends values equivalent to those sent by s, the behavior of calling transfer(s, sch) is undefined.

  3. For a sender t returned from transfer(s, sch), get_env(t) shall return a queryable object q such that get_completion_scheduler<CPO>(q) returns a copy of sch, where CPO is either set_value_t or set_stopped_t. The get_completion_scheduler<set_error_t> query is not implemented, as the scheduler cannot be guaranteed in case an error is thrown while trying to schedule work on the given scheduler object. For all other query objects Q whose type satisfies forwarding-query, the expression Q(q, args...) shall be equivalent to Q(get_env(s), args...).

11.9.6.5. execution::schedule_from [exec.schedule.from]
  1. schedule_from schedules work dependent on the completion of a sender onto a scheduler’s associated execution resource. [Note: schedule_from is not meant to be used in user code; it is used in the implementation of transfer. --end note]

  2. The name schedule_from denotes a customization point object. For some subexpressions sch and s, let Sch be decltype((sch)) and S be decltype((s)). If Sch does not satisfy scheduler, or S does not satisfy sender, schedule_from is ill-formed. Otherwise, the expression schedule_from(sch, s) is expression-equivalent to:

    1. tag_invoke(schedule_from, sch, s), if that expression is valid. If the function selected by tag_invoke does not return a sender that completes on an execution agent belonging to the associated execution resource of sch and completing with the same async result ([async.ops]) as s, the behavior of calling schedule_from(sch, s) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, constructs a sender s2. When s2 is connected with some receiver out_r, it:

      1. Constructs a receiver r such that when a receiver completion operation Tag(r, args...) is called, it decay-copies args... into op_state (see below) as args'... and constructs a receiver r2 such that:

        1. When set_value(r2) is called, it calls Tag(out_r, std::move(args')...).

        2. set_error(r2, e) is expression-equivalent to set_error(out_r, e).

        3. set_stopped(r2) is expression-equivalent to set_stopped(out_r).

        It then calls schedule(sch), resulting in a sender s3. It then calls connect(s3, r2), resulting in an operation state op_state3. It then calls start(op_state3). If any of these throws an exception, it catches it and calls set_error(out_r, current_exception()). If any of these expressions would be ill-formed, then Tag(r, args...) is ill-formed.

      2. Calls connect(s, r) resulting in an operation state op_state2. If this expression would be ill-formed, connect(s2, out_r) is ill-formed.

      3. Returns an operation state op_state that contains op_state2. When start(op_state) is called, calls start(op_state2). The lifetime of op_state3 ends when op_state is destroyed.

    3. Given subexpressions s2 and e, where s2 is a sender returned from schedule_from or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). Then the type of tag_invoke(get_completion_signatures, s2, e) shall be:

      make_completion_signatures<
        copy_cvref_t<S2, S>,
        E,
        make_completion_signatures<
          schedule_result_t<Sch>,
          E,
          potentially-throwing-completions,
          no-completions>,
        value-completions,
        error-completions>;
      

      where potentially-throwing-completions, no-completions, value-completions, and error-completions are defined as follows:

      template <class... Ts>
      using all-nothrow-decay-copyable =
        bool_constant<(is_nothrow_constructible_v<decay_t<Ts>, Ts> && ...)>;
      
      template <class... Ts>
      using conjunction = bool_constant<(Ts::value &&...)>;
      
      using potentially-throwing-completions =
        conditional_t<
          error_types_of_t<copy_cvref_t<S2, S>, E, all-nothrow-decay-copyable>::value &&
            value_types_of_t<copy_cvref_t<S2, S>, E, all-nothrow-decay-copyable, conjunction>::value,
          completion_signatures<>,
          completion_signatures<set_error_t(exception_ptr)>>;
      
      template <class...>
      using no-completions = completion_signatures<>;
      
      template <class... Ts>
      using value-completions = completion_signatures<set_value_t(decay_t<Ts>&&...)>;
      
      template <class T>
      using error-completions = completion_signatures<set_error_t(decay_t<T>&&)>;
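      The role of all-nothrow-decay-copyable can be checked in isolation. In the sketch below the two tag structs are hypothetical stand-ins for completion_signatures<> and completion_signatures<set_error_t(exception_ptr)>, and the selection is applied directly to a pack of types rather than through error_types_of_t and value_types_of_t.

```cpp
#include <string>
#include <type_traits>

// Mirrors all-nothrow-decay-copyable: true when every Ts can be
// decay-copied without throwing.
template <class... Ts>
using all_nothrow_decay_copyable = std::bool_constant<
    (std::is_nothrow_constructible_v<std::decay_t<Ts>, Ts> && ...)>;

// Hypothetical stand-ins for the two possible results.
struct no_extra_error {};            // completion_signatures<>
struct adds_exception_ptr_error {};  // set_error_t(exception_ptr) added

template <class... Ts>
using potentially_throwing_completions =
    std::conditional_t<all_nothrow_decay_copyable<Ts...>::value,
                       no_extra_error, adds_exception_ptr_error>;

// Scalars and moved-from strings decay-copy without throwing.
static_assert(std::is_same_v<
              potentially_throwing_completions<int, double&&>,
              no_extra_error>);
static_assert(std::is_same_v<
              potentially_throwing_completions<std::string&&>,
              no_extra_error>);
// Copying a std::string may allocate and therefore may throw.
static_assert(std::is_same_v<
              potentially_throwing_completions<const std::string&>,
              adds_exception_ptr_error>);
```

      The intent illustrated here is that the exception_ptr error completion appears only when storing some result datum in the operation state could throw.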
      
  3. For a sender t returned from schedule_from(sch, s), get_env(t) shall return a queryable object q such that get_completion_scheduler<CPO>(q) returns a copy of sch, where CPO is either set_value_t or set_stopped_t. The get_completion_scheduler<set_error_t> query is not implemented, as the scheduler cannot be guaranteed in case an error is thrown while trying to schedule work on the given scheduler object. For all other query objects Q whose type satisfies forwarding-query, the expression Q(q, args...) shall be equivalent to Q(get_env(s), args...).
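  The two-step completion performed by the default schedule_from implementation (store the result, then deliver it from the target scheduler) can be sketched with a manually driven queue. Everything here is a hypothetical model: manual_scheduler stands in for a scheduler and its execution resource, only the value channel with a single int datum is shown, and error handling is omitted.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical scheduler: a queue that the caller drains by hand.
struct manual_scheduler {
  std::deque<std::function<void()>>* queue;  // not owned
  void schedule(std::function<void()> work) const {
    queue->push_back(std::move(work));
  }
};

// Hypothetical downstream receiver (out_r).
struct log_receiver {
  std::string* log;  // not owned
  void set_value(int v) { *log += "value(" + std::to_string(v) + ")"; }
};

// Sketch of receiver r: when s completes, the datum is copied (args') and
// the forwarding step, which plays the role of r2, is scheduled on sch.
template <class OutR>
struct schedule_from_receiver {
  manual_scheduler sch;
  OutR out_r;
  void set_value(int v) {
    sch.schedule([out = out_r, v]() mutable { out.set_value(v); });
  }
};

inline std::string demo_schedule_from() {
  std::deque<std::function<void()>> queue;
  std::string log;
  schedule_from_receiver<log_receiver> r{manual_scheduler{&queue},
                                         log_receiver{&log}};
  r.set_value(7);           // s completes in its own context
  log += "[queued]";        // out_r has not been completed yet
  while (!queue.empty()) {  // drive the target execution resource
    auto work = std::move(queue.front());
    queue.pop_front();
    work();
  }
  return log;
}
```

  In this model out_r is completed only once the queue is drained, which corresponds to the requirement that the returned sender completes on an execution agent of sch's execution resource.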

11.9.6.6. execution::then [exec.then]
  1. then attaches an invocable as a continuation for an input sender’s value completion operation.

  2. The name then denotes a customization point object. For some subexpressions s and f, let S be decltype((s)), let F be the decayed type of f, and let f' be an xvalue referring to an object decay-copied from f. If S does not satisfy sender, or F does not model movable-value, then is ill-formed. Otherwise, the expression then(s, f) is expression-equivalent to:

    1. tag_invoke(then, get_completion_scheduler<set_value_t>(get_env(s)), s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(then, s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2. When s2 is connected with some receiver out_r, it:

      1. Constructs a receiver r such that:

        1. When set_value(r, args...) is called, let v be the expression invoke(f', args...). If decltype(v) is void, calls set_value(out_r); otherwise, it calls set_value(out_r, v). If any of these throw an exception, it catches it and calls set_error(out_r, current_exception()). If any of these expressions would be ill-formed, the expression set_value(r, args...) is ill-formed.

        2. set_error(r, e) is expression-equivalent to set_error(out_r, e).

        3. set_stopped(r) is expression-equivalent to set_stopped(out_r).

      2. Returns an expression-equivalent to connect(s, r).

      3. Let compl-sig-t<Tag, Args...> name the type Tag() if Args... is a template parameter pack containing the single type void; otherwise, Tag(Args...). Given subexpressions s2 and e where s2 is a sender returned from then or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). The type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

        make_completion_signatures<
          copy_cvref_t<S2, S>, E, set-error-signature,
            set-value-completions>;
        

        where set-value-completions is an alias for:

        template<class... As>
          using set-value-completions =
            completion_signatures<compl-sig-t<set_value_t, invoke_result_t<F, As...>>>;
        

        and set-error-signature is an alias for completion_signatures<set_error_t(exception_ptr)> if any of the types in the type-list named by value_types_of_t<copy_cvref_t<S2, S>, E, potentially-throwing, type-list> are true_type; otherwise, completion_signatures<>, where potentially-throwing is the template alias:

        template<class... As>
          using potentially-throwing =
            bool_constant<!is_nothrow_invocable_v<F, As...>>;
        

    If the function selected above does not return a sender that invokes f with the value result datums of s using f's return value as the sender’s value completion, and forwards the non-value completion operations unchanged, the behavior of calling then(s, f) is undefined.
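    The behavior of the receiver r constructed by the default implementation of then can be sketched as follows. This is a simplified model under assumptions: log_receiver, then_receiver, and run_then are hypothetical names, completions are member functions rather than tag_invoke customizations, and f is invoked as an lvalue rather than as the xvalue f'.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical downstream receiver (out_r) that records its completion.
struct log_receiver {
  std::string* log;  // not owned
  void set_value() { *log += "value()"; }
  void set_value(int v) { *log += "value(" + std::to_string(v) + ")"; }
  void set_error(std::exception_ptr) noexcept { *log += "error"; }
  void set_stopped() noexcept { *log += "stopped"; }
};

// Sketch of the receiver r that is connected to the input sender s.
template <class OutR, class F>
struct then_receiver {
  OutR out_r;
  F f;

  template <class... Args>
  void set_value(Args&&... args) {
    try {
      if constexpr (std::is_void_v<std::invoke_result_t<F&, Args...>>) {
        std::invoke(f, std::forward<Args>(args)...);
        out_r.set_value();  // void result: value completion with no datum
      } else {
        out_r.set_value(std::invoke(f, std::forward<Args>(args)...));
      }
    } catch (...) {
      out_r.set_error(std::current_exception());
    }
  }
  // Non-value completions are forwarded unchanged.
  void set_error(std::exception_ptr e) noexcept {
    out_r.set_error(std::move(e));
  }
  void set_stopped() noexcept { out_r.set_stopped(); }
};

// Hypothetical driver: completes r with one int and returns the log.
template <class F>
std::string run_then(int input, F f) {
  std::string log;
  then_receiver<log_receiver, F> r{log_receiver{&log}, std::move(f)};
  r.set_value(input);
  return log;
}
```

    The three branches correspond to the three outcomes described above: a non-void result is passed on, a void result becomes an empty value completion, and an exception becomes an error completion.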

11.9.6.7. execution::upon_error [exec.upon.error]
  1. upon_error maps an input sender’s error completion operation into a value completion operation using the provided invocable.

  2. The name upon_error denotes a customization point object. For some subexpressions s and f, let S be decltype((s)), let F be the decayed type of f, and let f' be an xvalue referring to an object decay-copied from f. If S does not satisfy sender, or F does not model movable-value, upon_error is ill-formed. Otherwise, the expression upon_error(s, f) is expression-equivalent to:

    1. tag_invoke(upon_error, get_completion_scheduler<set_error_t>(get_env(s)), s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(upon_error, s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2. When s2 is connected with some receiver out_r, it:

      1. Constructs a receiver r such that:

        1. set_value(r, args...) is expression-equivalent to set_value(out_r, args...).

        2. When set_error(r, e) is called, let v be the expression invoke(f', e). If decltype(v) is void, calls set_value(out_r); otherwise, it calls set_value(out_r, v). If any of these throw an exception, it catches it and calls set_error(out_r, current_exception()). If any of these expressions would be ill-formed, the expression set_error(r, e) is ill-formed.

        3. set_stopped(r) is expression-equivalent to set_stopped(out_r).

      2. Returns an expression-equivalent to connect(s, r).

      3. Let compl-sig-t<Tag, Args...> name the type Tag() if Args... is a template parameter pack containing the single type void; otherwise, Tag(Args...). Given subexpressions s2 and e where s2 is a sender returned from upon_error or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). The type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

        make_completion_signatures<
          copy_cvref_t<S2, S>, E, set-error-signature,
            default-set-value, set-error-completion>;
        

        where set-error-completion is the template alias:

        template<class E>
          using set-error-completion =
            completion_signatures<compl-sig-t<set_value_t, invoke_result_t<F, E>>>;
        

        and set-error-signature is an alias for completion_signatures<set_error_t(exception_ptr)> if any of the types in the type-list named by error_types_of_t<copy_cvref_t<S2, S>, E, potentially-throwing> are true_type; otherwise, completion_signatures<>, where potentially-throwing is the template alias:

        template<class... Es>
          using potentially-throwing =
            type-list<bool_constant<!is_nothrow_invocable_v<F, Es>>...>;
        

    If the function selected above does not return a sender which invokes f with the error result datum of s using f's return value as the sender’s value completion, and forwards the non-error completion operations unchanged, the behavior of calling upon_error(s, f) is undefined.

11.9.6.8. execution::upon_stopped [exec.upon.stopped]
  1. upon_stopped maps an input sender’s stopped completion operation into a value completion operation using the provided invocable.

  2. The name upon_stopped denotes a customization point object. For some subexpressions s and f, let S be decltype((s)), let F be the decayed type of f, and let f' be an xvalue referring to an object decay-copied from f. If S does not satisfy sender, or F does not model both movable-value and invocable, upon_stopped is ill-formed. Otherwise, the expression upon_stopped(s, f) is expression-equivalent to:

    1. tag_invoke(upon_stopped, get_completion_scheduler<set_stopped_t>(get_env(s)), s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(upon_stopped, s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2. When s2 is connected with some receiver out_r, it:

      1. Constructs a receiver r such that:

        1. set_value(r, args...) is expression-equivalent to set_value(out_r, args...).

        2. set_error(r, e) is expression-equivalent to set_error(out_r, e).

        3. When set_stopped(r) is called, let v be the expression invoke(f'). If v has type void, calls set_value(out_r); otherwise, calls set_value(out_r, v). If any of these throw an exception, it catches it and calls set_error(out_r, current_exception()). If any of these expressions would be ill-formed, the expression set_stopped(r) is ill-formed.

      2. Returns an expression-equivalent to connect(s, r).

      3. Let compl-sig-t<Tag, Args...> name the type Tag() if Args... is a template parameter pack containing the single type void; otherwise, Tag(Args...). Given subexpressions s2 and e where s2 is a sender returned from upon_stopped or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). The type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

        make_completion_signatures<
          copy_cvref_t<S2, S>, E, set-error-signature,
            default-set-value, default-set-error, set-stopped-completions>;
        

        where set-stopped-completions names the type completion_signatures<compl-sig-t<set_value_t, invoke_result_t<F>>>, and set-error-signature names the type completion_signatures<set_error_t(exception_ptr)> if is_nothrow_invocable_v<F> is false, or completion_signatures<> otherwise.

    If the function selected above does not return a sender that invokes f when s executes a stopped completion, using f's return value as the sender’s value completion, and propagates s's other completion operations unchanged, the behavior of calling upon_stopped(s, f) is undefined.
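    The compl-sig-t mapping and the choice of set-error-signature for upon_stopped can be checked at compile time. The sketch below uses hypothetical stand-ins for set_value_t, set_error_t, and completion_signatures, and it assumes the reading in which the exception_ptr error signature is added only when invoking F may throw, consistent with the corresponding rule for then.

```cpp
#include <exception>
#include <type_traits>

// Hypothetical stand-ins for the library's tag types and signature list.
struct set_value_t {};
struct set_error_t {};
template <class... Sigs>
struct completion_signatures {};

// compl-sig-t<Tag, Args...>: Tag() when Args... is the single type void;
// otherwise Tag(Args...).
template <class Tag, class... Args>
struct compl_sig { using type = Tag(Args...); };
template <class Tag>
struct compl_sig<Tag, void> { using type = Tag(); };
template <class Tag, class... Args>
using compl_sig_t = typename compl_sig<Tag, Args...>::type;

// Value completion contributed by invoking F on a stopped completion.
template <class F>
using set_stopped_completions =
    completion_signatures<compl_sig_t<set_value_t, std::invoke_result_t<F>>>;

// Assumption: the error signature is present only if F may throw.
template <class F>
using set_error_signature = std::conditional_t<
    std::is_nothrow_invocable_v<F>, completion_signatures<>,
    completion_signatures<set_error_t(std::exception_ptr)>>;

using may_throw = int (*)();
using never_throws = void (*)() noexcept;

static_assert(std::is_same_v<set_stopped_completions<may_throw>,
                             completion_signatures<set_value_t(int)>>);
static_assert(std::is_same_v<set_stopped_completions<never_throws>,
                             completion_signatures<set_value_t()>>);
static_assert(std::is_same_v<
              set_error_signature<may_throw>,
              completion_signatures<set_error_t(std::exception_ptr)>>);
static_assert(std::is_same_v<set_error_signature<never_throws>,
                             completion_signatures<>>);
```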

11.9.6.9. execution::let_value, execution::let_error, execution::let_stopped [exec.let]
  1. let_value transforms a sender’s value completion into a new child asynchronous operation. let_error transforms a sender’s error completion into a new child asynchronous operation. let_stopped transforms a sender’s stopped completion into a new child asynchronous operation.

  2. The names let_value, let_error, and let_stopped denote customization point objects. Let the expression let-cpo be one of let_value, let_error, or let_stopped. For subexpressions s and f, let S be decltype((s)), let F be the decayed type of f, and let f' be an xvalue that refers to an object decay-copied from f. If S does not satisfy sender, the expression let-cpo(s, f) is ill-formed. If F does not satisfy invocable, the expression let_stopped(s, f) is ill-formed. Otherwise, the expression let-cpo(s, f) is expression-equivalent to:

    1. tag_invoke(let-cpo, get_completion_scheduler<set_value_t>(get_env(s)), s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(let-cpo, s, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, given a receiver out_r and an lvalue out_r' referring to an object decay-copied from out_r.

      1. For let_value, let set-cpo be set_value. For let_error, let set-cpo be set_error. For let_stopped, let set-cpo be set_stopped. Let completion-function be one of set_value, set_error, or set_stopped.

      2. Let r be an rvalue of a receiver type R such that:

        1. When set-cpo(r, args...) is called, the receiver r decay-copies args... into op_state2 as args'..., then calls invoke(f', args'...), resulting in a sender s3. It then calls connect(s3, std::move(out_r')), resulting in an operation state op_state3. op_state3 is saved as a part of op_state2. It then calls start(op_state3). If any of these throws an exception, it catches it and calls set_error(std::move(out_r'), current_exception()). If any of these expressions would be ill-formed, set-cpo(r, args...) is ill-formed.

        2. completion-function(r, args...) is expression-equivalent to completion-function(std::move(out_r'), args...), when completion-function is different from set-cpo.

      3. let-cpo(s, f) returns a sender s2 such that:

        1. If the expression connect(s, r) is ill-formed, connect(s2, out_r) is ill-formed.

        2. Otherwise, let op_state2 be the result of connect(s, r). connect(s2, out_r) returns an operation state op_state that stores op_state2. start(op_state) is expression-equivalent to start(op_state2).

      4. Given subexpressions s2 and e, where s2 is a sender returned from let-cpo(s, f) or a copy of such, let S2 be decltype((s2)), let E be decltype((e)), and let DS be copy_cvref_t<S2, S>. Then the type of tag_invoke(get_completion_signatures, s2, e) is specified as follows:

        1. If sender_in<DS, E> is false, the expression tag_invoke(get_completion_signatures, s2, e) is ill-formed.

        2. Otherwise, let Sigs... be the set of template arguments of the completion_signatures specialization named by completion_signatures_of_t<DS, E>, let Sigs2... be the set of function types in Sigs... whose return type is set-cpo, and let Rest... be the set of function types in Sigs... but not Sigs2....

        3. For each Sig2_i in Sigs2..., let Vs_i... be the set of function arguments in Sig2_i and let S3_i be invoke_result_t<F, decay_t<Vs_i>&...>. If S3_i is ill-formed, or if sender_in<S3_i, E> is not satisfied, then the expression tag_invoke(get_completion_signatures, s2, e) is ill-formed.

        4. Otherwise, let Sigs3_i... be the set of template arguments of the completion_signatures specialization named by completion_signatures_of_t<S3_i, E>. Then the type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to completion_signatures<Sigs3_0..., Sigs3_1..., ..., Sigs3_{n-1}..., Rest..., set_error_t(exception_ptr)>, where n is sizeof...(Sigs2).

    If let-cpo(s, f) does not return a sender that invokes f when set-cpo is called, and makes its completion dependent on the completion of a sender returned by f, and propagates the other completion operations sent by s, the behavior of calling let-cpo(s, f) is undefined.
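    The let_value case of the receiver r described above can be sketched with a minimal sender model. All names are hypothetical: just_int is a toy sender that completes inline with one int, connect and start are member functions instead of customization points, and because completion is synchronous the stored datum and op_state3 live on the stack here rather than inside op_state2.

```cpp
#include <string>
#include <utility>

// Hypothetical downstream receiver (out_r).
struct log_receiver {
  std::string* log;  // not owned
  void set_value(int v) { *log += "value(" + std::to_string(v) + ")"; }
};

// Hypothetical minimal sender: completes inline with a single int.
struct just_int {
  int value;
  template <class R>
  struct op {
    R r;
    int value;
    void start() { r.set_value(value); }
  };
  template <class R>
  op<R> connect(R r) const { return {std::move(r), value}; }
};

// Sketch of receiver r for let_value: keep the datum, call f with an
// lvalue referring to it to obtain s3, then connect and start s3.
template <class OutR, class F>
struct let_value_receiver {
  OutR out_r;
  F f;
  void set_value(int arg) {
    int stored = arg;     // args' (kept in op_state2 in the real design)
    auto s3 = f(stored);  // f is invoked with decay_t<Vs>&
    auto op_state3 = s3.connect(std::move(out_r));
    op_state3.start();
  }
};

inline std::string demo_let_value(int input) {
  std::string log;
  auto f = [](int& x) { return just_int{x * 10}; };
  let_value_receiver<log_receiver, decltype(f)> r{log_receiver{&log}, f};
  auto op_state2 = just_int{input}.connect(std::move(r));
  op_state2.start();
  return log;
}
```

    The completion observed by out_r is the completion of the sender returned by f, not the completion of the input sender.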

11.9.6.10. execution::bulk [exec.bulk]
  1. bulk runs a task repeatedly for every index in an index space.

  2. The name bulk denotes a customization point object. For some subexpressions s, shape, and f, let S be decltype((s)), Shape be decltype((shape)), and F be decltype((f)). If S does not satisfy sender or Shape does not satisfy integral, bulk is ill-formed. Otherwise, the expression bulk(s, shape, f) is expression-equivalent to:

    1. tag_invoke(bulk, get_completion_scheduler<set_value_t>(get_env(s)), s, shape, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(bulk, s, shape, f), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2. When s2 is connected with some receiver out_r, it:

      1. Constructs a receiver r:

        1. When set_value(r, args...) is called, calls f(i, args...) for each i of type Shape from 0 to shape, then calls set_value(out_r, args...). If any of these throws an exception, it catches it and calls set_error(out_r, current_exception()).

        2. When set_error(r, e) is called, calls set_error(out_r, e).

        3. When set_stopped(r) is called, calls set_stopped(out_r).

      2. Calls connect(s, r), which results in an operation state op_state2.

      3. Returns an operation state op_state that contains op_state2. When start(op_state) is called, calls start(op_state2).

      4. Given subexpressions s2 and e where s2 is a sender returned from bulk or a copy of such, let S2 be decltype((s2)), let E be decltype((e)), let DS be copy_cvref_t<S2, S>, let Shape be decltype((shape)) and let nothrow-callable be the alias template:

        template<class... As>
          using nothrow-callable =
            bool_constant<is_nothrow_invocable_v<decay_t<F>&, Shape, As...>>;
        
        1. If any of the types in the type-list named by value_types_of_t<DS, E, nothrow-callable, type-list> are false_type, then the type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

          make_completion_signatures<
            DS, E, completion_signatures<set_error_t(exception_ptr)>>
          
        2. Otherwise, the type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to completion_signatures_of_t<DS, E>.

    4. If the function selected above does not return a sender that invokes f(i, args...) for each i of type Shape from 0 to shape where args is a pack of subexpressions referring to the value completion result datums of the input sender, or does not execute a value completion operation with said datums, the behavior of calling bulk(s, shape, f) is undefined.
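
  The default behavior of receiver r described above can be illustrated with a small synchronous model. This is a non-normative sketch in plain C++17, not the proposed API: bulk_model and bulk_outcome are hypothetical names, a single value stands in for the pack args..., and "from 0 to shape" is read as the half-open range [0, shape), which is an assumption of this sketch.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for what out_r receives: index 0 models
// set_value(out_r, value), index 1 models set_error(out_r, exception_ptr).
template <class T>
using bulk_outcome = std::variant<T, std::exception_ptr>;

// Models set_value(r, value): call f(i, value) for each i in [0, shape),
// then forward the value. Any exception is routed to the error channel.
template <class Shape, class T, class F>
bulk_outcome<T> bulk_model(Shape shape, T value, F f) {
  try {
    for (Shape i = 0; i < shape; ++i) {
      f(i, value);  // every index sees the same value
    }
    return bulk_outcome<T>(std::in_place_index<0>, std::move(value));
  } catch (...) {
    return bulk_outcome<T>(std::in_place_index<1>, std::current_exception());
  }
}

// Usage: fill a vector of squares, then show the error path.
inline void bulk_model_example() {
  std::vector<int> squares(4, 0);
  auto ok = bulk_model(4, &squares,
                       [](int i, std::vector<int>* out) { (*out)[i] = i * i; });
  assert(ok.index() == 0);
  assert(squares[3] == 9);

  auto failed =
      bulk_model(2, 0, [](int, int) { throw std::runtime_error("f failed"); });
  assert(failed.index() == 1);
}
```

  The sketch runs the iterations sequentially on the calling thread. A customization selected through tag_invoke is free to run them in parallel, provided it still invokes f for every index.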

11.9.6.11. execution::split [exec.split]
  1. split adapts an arbitrary sender into a sender that can be connected multiple times.

  2. Let split-env be the type of an environment such that, given an instance e, the expression get_stop_token(e) is well-formed and has type stop_token.

  3. The name split denotes a customization point object. For some subexpression s, let S be decltype((s)). If sender_in<S, split-env> or constructible_from<decay_t<env_of_t<S>>, env_of_t<S>> is false, split is ill-formed. Otherwise, the expression split(s) is expression-equivalent to:

    1. tag_invoke(split, get_completion_scheduler<set_value_t>(get_env(s)), s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(split, s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2, which:

      1. Creates an object sh_state that contains a stop_source, a list of pointers to operation states awaiting the completion of s, and that also reserves space for storing:

        • the operation state that results from connecting s with r described below,

        • the sets of values and errors with which s can complete, with the addition of exception_ptr, and

        • the result of decay-copying get_env(s).

      2. Constructs a receiver r such that:

        1. When set_value(r, args...) is called, decay-copies the expressions args... into sh_state. It then notifies all the operation states in sh_state's list of operation states that the results are ready. If any exceptions are thrown, the exception is caught and set_error(r, current_exception()) is called instead.

        2. When set_error(r, e) is called, decay-copies e into sh_state. It then notifies the operation states in sh_state's list of operation states that the results are ready.

        3. When set_stopped(r) is called, notifies the operation states in sh_state's list of operation states that the results are ready.

        4. get_env(r) is an expression e of type split-env such that get_stop_token(e) is well-formed and returns the results of calling get_token() on sh_state's stop source.

      3. Calls get_env(s) and decay-copies the result into sh_state.

      4. Calls connect(s, r), resulting in an operation state op_state2. op_state2 is saved in sh_state.

      5. When s2 is connected with a receiver out_r of type OutR, it returns an operation state object op_state that contains:

        • An object out_r' of type OutR decay-copied from out_r,

        • A reference to sh_state,

        • A stop callback of type optional<stop_token_of_t<env_of_t<OutR>>::callback_type<stop-callback-fn>>, where stop-callback-fn is the unspecified class type:

          struct stop-callback-fn {
            stop_source& stop_src_;
            void operator()() noexcept {
              stop_src_.request_stop();
            }
          };
          
      6. When start(op_state) is called:

        • If one of r's completion functions has executed, then let Tag be the completion function that was called. Calls Tag(out_r', args2...), where args2... is a pack of const lvalues referencing the subobjects of sh_state that have been saved by the original call to Tag(r, args...) and returns.

        • Otherwise, it emplace constructs the stop callback optional with the arguments get_stop_token(get_env(out_r')) and stop-callback-fn{stop-src}, where stop-src refers to the stop source of sh_state.

        • Otherwise, it adds a pointer to op_state to the list of operation states in sh_state. If op_state is the first such state added to the list:

          • If stop-src.stop_requested() is true, all of the operation states in sh_state's list of operation states are notified as if set_stopped(r) had been called.

          • Otherwise, start(op_state2) is called.

      7. When r completes it will notify op_state that the results are ready. Let Tag be whichever completion function was called on receiver r. op_state's stop callback optional is reset. Then Tag(std::move(out_r'), args2...) is called, where args2... is a pack of const lvalues referencing the subobjects of sh_state that have been saved by the original call to Tag(r, args...).

      8. Ownership of sh_state is shared by s2 and by every op_state that results from connecting s2 to a receiver.

    4. Given a subexpression s2 where s2 is a sender returned from split or a copy of such, get_env(s2) shall return an lvalue reference to the object in sh_state that was initialized with the result of get_env(s).

    5. Given subexpressions s2 and e where s2 is a sender returned from split or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). The type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

      make_completion_signatures<
        copy_cvref_t<S2, S>,
        E,
        completion_signatures<set_error_t(exception_ptr),
                              set_error_t(Es)...>,
        value-signatures,
        error-signatures>;
      

      where Es is a (possibly empty) template parameter pack, value-signatures is the alias template:

      template<class... Ts>
        using value-signatures =
          completion_signatures<set_value_t(const decay_t<Ts>&...)>;
      

      and error-signatures is the alias template:

      template<class E>
        using error-signatures =
          completion_signatures<set_error_t(const decay_t<E>&)>;
      
    6. Let s be a sender expression, r be an instance of the receiver type described above, s2 be a sender returned from split(s) or a copy of such, r2 be the receiver to which s2 is connected, and args be the pack of subexpressions passed to r's completion function CSO when s completes. s2 shall either invoke CSO(r2, args2...), where args2 is a pack of const lvalue references to objects decay-copied from args, or call set_error(r2, e2) for some subexpression e2. The objects passed to r2's completion operation shall be valid until after the completion of the invocation of r2's completion operation.
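
  The sharing behavior of split can be pictured with the following non-normative sketch in plain C++17. It is a single-threaded model, not the proposed API: split_model is a hypothetical name, a std::function stands in for the wrapped operation, and stop tokens, the error and stopped channels, and the list of waiting operation states are all omitted. It shows only two properties from the wording above: the wrapped work runs at most once, and every consumer receives a const lvalue referring to the same stored object.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical model of a sender returned from split, for one value of type T.
template <class T>
class split_model {
  struct state {
    std::function<T()> produce;  // stands in for op_state2
    std::optional<T> result;     // value saved by set_value(r, ...)
    int runs = 0;                // how often the producer actually ran
  };
  std::shared_ptr<state> sh_;    // shared by all copies, like sh_state

 public:
  explicit split_model(std::function<T()> produce)
      : sh_(std::make_shared<state>()) {
    sh_->produce = std::move(produce);
  }

  // Models connect followed by start: the first call runs the producer,
  // later calls reuse the stored result. The consumer sees a const lvalue.
  template <class Consumer>
  void start(Consumer&& consume) const {
    if (!sh_->result) {
      sh_->result.emplace(sh_->produce());
      ++sh_->runs;
    }
    const T& stored = *sh_->result;
    std::forward<Consumer>(consume)(stored);
  }

  int runs() const { return sh_->runs; }
};

// Usage: two copies observe the same stored object; the work runs once.
inline void split_model_example() {
  split_model<int> s([] { return 42; });
  split_model<int> copy = s;
  const int* first = nullptr;
  const int* second = nullptr;
  s.start([&](const int& v) { first = &v; });
  copy.start([&](const int& v) { second = &v; });
  assert(first == second);
  assert(*first == 42);
  assert(s.runs() == 1);
}
```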

11.9.6.12. execution::when_all [exec.when.all]
  1. when_all and when_all_with_variant both adapt multiple input senders into a sender that completes when all input senders have completed. when_all only accepts senders with a single value completion signature and on success concatenates all the input senders' value result datums into its own value completion operation. when_all_with_variant(s...) is semantically equivalent to when_all(into_variant(s)...), where s is a pack of subexpressions of sender types.

  2. The name when_all denotes a customization point object. For some subexpressions si..., let Si... be decltype((si)).... The expression when_all(si...) is ill-formed if any of the following is true:

    • If the number of subexpressions si... is 0, or

    • If any type Si does not satisfy sender.

    Otherwise, the expression when_all(si...) is expression-equivalent to:

    1. tag_invoke(when_all, si...), if that expression is valid. If the function selected by tag_invoke does not return a sender that sends a concatenation of values sent by si... when they all complete with set_value, the behavior of calling when_all(si...) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, constructs a sender w of type W. When w is connected with some receiver out_r of type OutR, it returns an operation state op_state specified as below:

      1. For each sender si, constructs a receiver ri such that:

        1. If set_value(ri, ti...) is called for every ri, op_state's associated stop callback optional is reset and set_value(out_r, t0..., t1..., ..., tn-1...) is called, where n is the number of subexpressions in si....

        2. Otherwise, set_error or set_stopped was called for at least one receiver ri. If the first such to complete did so with the call set_error(ri, e), request_stop is called on op_state's associated stop source. When all child operations have completed, op_state's associated stop callback optional is reset and set_error(out_r, e) is called.

        3. Otherwise, request_stop is called on op_state's associated stop source. When all child operations have completed, op_state's associated stop callback optional is reset and set_stopped(out_r) is called.

        4. For each receiver ri, get_env(ri) is an expression e such that get_stop_token(e) is well-formed and returns the results of calling get_token() on op_state's associated stop source, and for which tag_invoke(tag, e, args...) is expression-equivalent to tag(get_env(out_r), args...) for all arguments args... and all tag whose type satisfies forwarding-query and is not get_stop_token_t.

      2. For each sender si, calls connect(si, ri), resulting in operation states child_opi.

      3. Returns an operation state op_state that contains:

        • Each operation state child_opi,

        • A stop source of type in_place_stop_source,

        • A stop callback of type optional<stop_token_of_t<env_of_t<OutR>>::callback_type<stop-callback-fn>>, where stop-callback-fn is the unspecified class type:

          struct stop-callback-fn {
            in_place_stop_source& stop_src_;
            void operator()() noexcept {
              stop_src_.request_stop();
            }
          };
          
      4. When start(op_state) is called it:

        • Emplace constructs the stop callback optional with the arguments get_stop_token(get_env(out_r)) and stop-callback-fn{stop-src}, where stop-src refers to the stop source of op_state.

        • Then, it checks to see if stop-src.stop_requested() is true. If so, it calls set_stopped(out_r).

        • Otherwise, calls start(child_opi) for each child_opi.

      5. Given subexpressions s2 and e where s2 is a sender returned from when_all or a copy of such, let S2 be decltype((s2)), let E be decltype((e)), and let Ss... be the decayed types of the arguments to the when_all expression that created s2. Let WE be a type such that stop_token_of_t<WE> is in_place_stop_token and tag_invoke_result_t<Tag, WE, As...> names the type, if any, of call-result-t<Tag, E, As...> for all types As... and all types Tag besides get_stop_token_t. The type of tag_invoke(get_completion_signatures, s2, e) shall be as follows:

        1. For each type Si in Ss..., let DSi name the type copy_cvref_t<S2, Si>. If for any type DSi, the type completion_signatures_of_t<DSi, WE> is ill-formed, the expression tag_invoke(get_completion_signatures, s2, e) is ill-formed.

        2. Otherwise, for each type DSi, let Sigsi... be the set of template arguments in the specialization of completion_signatures named by completion_signatures_of_t<DSi, WE>, and let Ci be the count of function types in Sigsi... for which the return type is set_value_t. If any Ci is two or greater, then the expression tag_invoke(get_completion_signatures, s2, e) is ill-formed.

        3. Otherwise, let Sigs2i... be the set of function types in Sigsi... whose return types are not set_value_t, and let Ws... be the unique set of types in [Sigs20..., Sigs21..., ... Sigs2n-1..., set_stopped_t()], where n is sizeof...(Ss). If any Ci is 0, then the type of tag_invoke(get_completion_signatures, s2, e) shall be completion_signatures<Ws...>.

        4. Otherwise, let Vi... be the function argument types of the single type in Sigsi... for which the return type is set_value_t. Then the type of tag_invoke(get_completion_signatures, s2, e) shall be completion_signatures<Ws..., set_value_t(decay_t<V0>&&..., decay_t<V1>&&..., ... decay_t<Vn-1>&&...)>.

  3. The name when_all_with_variant denotes a customization point object. For some subexpressions s..., let S be decltype((s)). If any type Si in S... does not satisfy sender, when_all_with_variant is ill-formed. Otherwise, the expression when_all_with_variant(s...) is expression-equivalent to:

    1. tag_invoke(when_all_with_variant, s...), if that expression is valid. If the function selected by tag_invoke does not return a sender that, when connected with a receiver of type R, sends the types into-variant-type<S, env_of_t<R>>... when they all complete with set_value, the behavior of calling when_all_with_variant(s...) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, when_all(into_variant(s)...).

  4. For a sender s2 returned from when_all or when_all_with_variant, get_env(s2) shall return an instance of a class equivalent to empty_env.
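
  The value path of when_all, in which the value result datums of all inputs are concatenated in argument order, can be sketched with std::tuple_cat. This is a non-normative model in plain C++17, not the proposed API: when_all_model and child_result are hypothetical names, each child outcome is already known when the function is called, and an empty optional stands in for any non-value completion. The error channel and the stop request sent to the remaining children are omitted.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical child outcome: the values of one input sender, or nullopt
// when that input did not complete with set_value.
template <class... Ts>
using child_result = std::optional<std::tuple<Ts...>>;

// If every child produced values, concatenate them in argument order.
// Otherwise report a non-value completion as an empty optional.
template <class... Children>
auto when_all_model(Children... children)
    -> std::optional<decltype(std::tuple_cat(std::move(*children)...))> {
  static_assert(sizeof...(Children) > 0, "when_all needs at least one input");
  if ((... && children.has_value())) {
    return std::tuple_cat(std::move(*children)...);
  }
  return std::nullopt;
}

// Usage: (int) and (string, double) combine into (int, string, double).
inline void when_all_model_example() {
  auto all = when_all_model(
      child_result<int>{std::make_tuple(1)},
      child_result<std::string, double>{
          std::make_tuple(std::string("a"), 2.5)});
  assert(all.has_value());
  assert((*all == std::make_tuple(1, std::string("a"), 2.5)));

  auto none = when_all_model(child_result<int>{std::make_tuple(1)},
                             child_result<int>{});
  assert(!none.has_value());
}
```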

11.9.6.13. execution::transfer_when_all [exec.transfer.when.all]
  1. transfer_when_all and transfer_when_all_with_variant both adapt multiple input senders into a sender that completes when all input senders have completed, ensuring the input senders complete on the specified scheduler. transfer_when_all only accepts senders with a single value completion signature and on success concatenates all the input senders' value result datums into its own value completion operation; transfer_when_all(scheduler, input-senders...) is semantically equivalent to transfer(when_all(input-senders...), scheduler). transfer_when_all_with_variant(scheduler, input-senders...) is semantically equivalent to transfer_when_all(scheduler, into_variant(input-senders)...). These customizable composite algorithms can allow for more efficient customizations in some cases.

  2. The name transfer_when_all denotes a customization point object. For some subexpressions sch and s..., let Sch be decltype(sch) and S be decltype((s)). If Sch does not satisfy scheduler, or any type Si in S... does not satisfy sender, transfer_when_all is ill-formed. Otherwise, the expression transfer_when_all(sch, s...) is expression-equivalent to:

    1. tag_invoke(transfer_when_all, sch, s...), if that expression is valid. If the function selected by tag_invoke does not return a sender that sends a concatenation of values sent by s... when they all complete with set_value, or does not send its completion operation, other than ones resulting from a scheduling error, on an execution agent belonging to the associated execution resource of sch, the behavior of calling transfer_when_all(sch, s...) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, transfer(when_all(s...), sch).

  3. The name transfer_when_all_with_variant denotes a customization point object. For some subexpressions sch and s..., let Sch be decltype((sch)) and let S be decltype((s)). If any type Si in S... does not satisfy sender, transfer_when_all_with_variant is ill-formed. Otherwise, the expression transfer_when_all_with_variant(sch, s...) is expression-equivalent to:

    1. tag_invoke(transfer_when_all_with_variant, sch, s...), if that expression is valid. If the function selected by tag_invoke does not return a sender that, when connected with a receiver of type R, sends the types into-variant-type<S, env_of_t<R>>... when they all complete with set_value, the behavior of calling transfer_when_all_with_variant(sch, s...) is undefined.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, transfer_when_all(sch, into_variant(s)...).

  4. For a sender t returned from transfer_when_all(sch, s...), get_env(t) shall return a queryable object q such that get_completion_scheduler<CPO>(q) returns a copy of sch, where CPO is either set_value_t or set_stopped_t. The get_completion_scheduler<set_error_t> query is not implemented, as the scheduler cannot be guaranteed in case an error is thrown while trying to schedule work on the given scheduler object.

11.9.6.14. execution::into_variant [exec.into.variant]
  1. into_variant adapts a sender with multiple value completion signatures into a sender with just one consisting of a variant of tuples.

  2. The template into-variant-type computes the type sent by a sender returned from into_variant.

    template<class S, class E>
        requires sender_in<S, E>
      using into-variant-type =
        value_types_of_t<S, E>;
    
  3. into_variant is a customization point object. For some subexpression s, let S be decltype((s)). If S does not satisfy sender, into_variant(s) is ill-formed. Otherwise, into_variant(s) returns a sender s2. When s2 is connected with some receiver out_r, it:

    1. Constructs a receiver r:

      1. If set_value(r, ts...) is called, calls set_value(out_r, into-variant-type<S, env_of_t<decltype((r))>>(decayed-tuple<decltype(ts)...>(ts...))). If this expression throws an exception, calls set_error(out_r, current_exception()).

      2. set_error(r, e) is expression-equivalent to set_error(out_r, e).

      3. set_stopped(r) is expression-equivalent to set_stopped(out_r).

    2. Calls connect(s, r), resulting in an operation state op_state2.

    3. Returns an operation state op_state that contains op_state2. When start(op_state) is called, calls start(op_state2).

    4. Given subexpressions s2 and e, where s2 is a sender returned from into_variant or a copy of such, let S2 be decltype((s2)) and E be decltype((e)). Let into-variant-set-value be the class template:

      template<class S, class E>
      struct into-variant-set-value {
        template<class ...Args>
        using apply = set_value_t(into-variant-type<S, E>);
      };
      

      Let into-variant-is-nothrow be the class template:

      template<class S, class E>
      struct into-variant-is-nothrow {
        template<class... Args>
            requires constructible_from<decayed-tuple<Args...>, Args...>
          using apply = bool_constant<noexcept(
            into-variant-type<S, E>(decayed-tuple<Args...>(declval<Args>()...)))>;
      };
      

      Let INTO-VARIANT-ERROR-SIGNATURES(S, E) be completion_signatures<set_error_t(exception_ptr)> if any of the types in the type-list named by value_types_of_t<S, E, into-variant-is-nothrow<S, E>::template apply, type-list> are false_type; otherwise, completion_signatures<>.

      The type of tag_invoke(get_completion_signatures_t{}, s2, e) shall be equivalent to:

      make_completion_signatures<
          S2,
          E,
          INTO-VARIANT-ERROR-SIGNATURES(S, E),
          into-variant-set-value<S2, E>::template apply
      >
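
  As a non-normative illustration of the value transformation described above, the sketch below models a sender with two value completion signatures, set_value_t(int) and set_value_t(std::string, double). It uses plain C++17 and is not the proposed API: model_variant and into_variant_model are hypothetical names, and the variant type is written out by hand where the wording computes it with value_types_of_t.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <variant>

// What into-variant-type would name for the modeled sender: one decayed
// tuple per value completion signature, wrapped in a variant.
using model_variant =
    std::variant<std::tuple<int>, std::tuple<std::string, double>>;

// Models set_value(r, ts...): pack the arguments into a decayed tuple and
// construct the matching variant alternative from it.
template <class... Ts>
model_variant into_variant_model(Ts&&... ts) {
  using packed = std::tuple<std::decay_t<Ts>...>;
  return model_variant(std::in_place_type<packed>, std::forward<Ts>(ts)...);
}

// Usage: each value signature maps to its own alternative.
inline void into_variant_model_example() {
  model_variant a = into_variant_model(7);
  assert(a.index() == 0);
  assert(std::get<0>(std::get<0>(a)) == 7);

  model_variant b = into_variant_model(std::string("pi"), 3.14);
  assert(b.index() == 1);
  assert(std::get<0>(std::get<1>(b)) == "pi");
}
```

  A downstream receiver therefore sees a single value completion signature taking one argument, whichever of the original signatures was used.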
      
11.9.6.15. execution::stopped_as_optional [exec.stopped.as.optional]
  1. stopped_as_optional maps an input sender’s stopped completion operation into the value completion operation as an empty optional. The input sender’s value completion operation is also converted into an optional. The result is a sender that never completes with stopped, reporting cancellation by completing with an empty optional.

  2. The name stopped_as_optional denotes a customization point object. For some subexpression s, let S be decltype((s)). Let get-env-sender be an expression such that, when it is connected with a receiver r, start on the resulting operation state completes immediately by calling set_value(r, get_env(r)). The expression stopped_as_optional(s) is expression-equivalent to:

    let_value(
      get-env-sender,
      []<class E>(const E&) requires single-sender<S, E> {
        return let_stopped(
          then(s,
            []<class T>(T&& t) {
              return optional<decay_t<single-sender-value-type<S, E>>>{
                std::forward<T>(t)
              };
            }
          ),
          [] () noexcept {
            return just(optional<decay_t<single-sender-value-type<S, E>>>{});
          }
        );
      }
    )
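
  The effect of this composition on the completion channels can be modeled without senders. The following non-normative sketch uses plain C++17 and is not the proposed API: outcome, stopped_t, and stopped_as_optional_model are hypothetical names, and the error channel, which the wording passes through unchanged, is left out.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <variant>

// Hypothetical tag for a stopped completion.
struct stopped_t {};

// Hypothetical outcome of a single-value sender: a value or a stop.
template <class T>
using outcome = std::variant<T, stopped_t>;

// A value becomes an engaged optional, a stop becomes an empty optional.
// The result never reports "stopped".
template <class T>
std::optional<T> stopped_as_optional_model(outcome<T> in) {
  if (T* value = std::get_if<0>(&in)) {
    return std::optional<T>(std::move(*value));
  }
  return std::nullopt;
}

// Usage: both inputs yield a value completion of optional type.
inline void stopped_as_optional_model_example() {
  std::optional<int> got = stopped_as_optional_model(outcome<int>{5});
  assert(got.has_value());
  assert(*got == 5);

  std::optional<int> cancelled =
      stopped_as_optional_model(outcome<int>{stopped_t{}});
  assert(!cancelled.has_value());
}
```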
    
11.9.6.16. execution::stopped_as_error [exec.stopped.as.error]
  1. stopped_as_error maps an input sender’s stopped completion operation into an error completion operation as a custom error type. The result is a sender that never completes with stopped, reporting cancellation by completing with an error.

  2. The name stopped_as_error denotes a customization point object. For some subexpressions s and e, let S be decltype((s)) and let E be decltype((e)). If the type S does not satisfy sender or if the type E doesn’t satisfy movable-value, stopped_as_error(s, e) is ill-formed. Otherwise, the expression stopped_as_error(s, e) is expression-equivalent to:

    let_stopped(s, [] { return just_error(e); })
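
    The channel mapping performed by stopped_as_error can be pictured as follows. This is a non-normative sketch in plain C++17, not the proposed API: stopped_t and stopped_as_error_model are hypothetical names, the input is modeled as a three-way variant of value, error, and stopped, and the replacement error e is passed explicitly as a function argument.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical tag for a stopped completion.
struct stopped_t {};

// Values and errors pass through unchanged. A stop is replaced by the
// supplied error e, so the result has no stopped alternative.
template <class T, class E>
std::variant<T, E> stopped_as_error_model(std::variant<T, E, stopped_t> in,
                                          E e) {
  using result = std::variant<T, E>;
  switch (in.index()) {
    case 0:
      return result(std::in_place_index<0>, std::get<0>(std::move(in)));
    case 1:
      return result(std::in_place_index<1>, std::get<1>(std::move(in)));
    default:
      return result(std::in_place_index<1>, std::move(e));
  }
}

// Usage: a stop is reported as the error "cancelled".
inline void stopped_as_error_model_example() {
  using input = std::variant<int, std::string, stopped_t>;

  auto stopped =
      stopped_as_error_model(input{stopped_t{}}, std::string("cancelled"));
  assert(stopped.index() == 1);
  assert(std::get<1>(stopped) == "cancelled");

  auto value = stopped_as_error_model(input{3}, std::string("cancelled"));
  assert(value.index() == 0);
  assert(std::get<0>(value) == 3);
}
```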
    
11.9.6.17. execution::ensure_started [exec.ensure.started]
  1. ensure_started eagerly starts the execution of a sender, returning a sender that is usable as input to additional sender algorithms.

  2. Let ensure-started-env be the type of an execution environment such that, given an instance e, the expression get_stop_token(e) is well-formed and has type stop_token.

  3. The name ensure_started denotes a customization point object. For some subexpression s, let S be decltype((s)). If sender_in<S, ensure-started-env> or constructible_from<decay_t<env_of_t<S>>, env_of_t<S>> is false, ensure_started(s) is ill-formed. Otherwise, the expression ensure_started(s) is expression-equivalent to:

    1. tag_invoke(ensure_started, get_completion_scheduler<set_value_t>(get_env(s)), s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    2. Otherwise, tag_invoke(ensure_started, s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above satisfies sender.

    3. Otherwise, constructs a sender s2, which:

      1. Creates an object sh_state that contains a stop_source, an initially null pointer to an operation state awaiting completion, and that also reserves space for storing:

        • the operation state that results from connecting s with r described below,

        • the sets of values and errors with which s can complete, with the addition of exception_ptr, and

        • the result of decay-copying get_env(s).

        s2 shares ownership of sh_state with r described below.

      2. Constructs a receiver r such that:

        1. When set_value(r, args...) is called, decay-copies the expressions args... into sh_state. It then checks sh_state to see if there is an operation state awaiting completion; if so, it notifies the operation state that the results are ready. If any exceptions are thrown, the exception is caught and set_error(r, current_exception()) is called instead.

        2. When set_error(r, e) is called, decay-copies e into sh_state. If there is an operation state awaiting completion, it then notifies the operation state that the results are ready.

        3. When set_stopped(r) is called, it notifies any awaiting operation state that the results are ready.

        4. get_env(r) is an expression e of type ensure-started-env such that get_stop_token(e) is well-formed and returns the results of calling get_token() on sh_state's stop source.

        5. r shares ownership of sh_state with s2. After r has been completed, it releases its ownership of sh_state.

      3. Calls get_env(s) and decay-copies the result into sh_state.

      4. Calls connect(s, r), resulting in an operation state op_state2. op_state2 is saved in sh_state. It then calls start(op_state2).

      5. When s2 is connected with a receiver out_r of type OutR, it returns an operation state object op_state that contains:

        • An object out_r' of type OutR decay-copied from out_r,

        • A reference to sh_state,

        • A stop callback of type optional<stop_token_of_t<env_of_t<OutR>>::callback_type<stop-callback-fn>>, where stop-callback-fn is the unspecified class type:

          struct stop-callback-fn {
            stop_source& stop_src_;
            void operator()() noexcept {
              stop_src_.request_stop();
            }
          };
          

        s2 transfers its ownership of sh_state to op_state.

      6. When start(op_state) is called:

        • If r has already been completed, then let CF be whichever completion function was used to complete r. Calls CF(out_r', args2...), where args2... is a pack of xvalues referencing the subobjects of sh_state that have been saved by the original call to CF(r, args...) and returns.

        • Otherwise, it emplace constructs the stop callback optional with the arguments get_stop_token(get_env(out_r')) and stop-callback-fn{stop-src}, where stop-src refers to the stop source of sh_state.

        • Then, it checks to see if stop-src.stop_requested() is true. If so, it calls set_stopped(out_r').

        • Otherwise, it sets sh_state's operation state pointer to the address of op_state, registering itself as awaiting the result of the completion of r.

      7. When r completes it will notify op_state that the results are ready. Let CF be whichever completion function was used to complete r. op_state's stop callback optional is reset. Then CF(std::move(out_r'), args2...) is called, where args2... is a pack of xvalues referencing the subobjects of sh_state that have been saved by the original call to CF(r, args...).

      8. [Note: If sender s2 is destroyed without being connected to a receiver, or if it is connected but the operation state is destroyed without having been started, then when r completes and it releases its shared ownership of sh_state, sh_state will be destroyed and the results of the operation are discarded. -- end note]

    4. Given a subexpression s, let s2 be the result of ensure_started(s). The result of get_env(s2) shall return an lvalue reference to the object in sh_state that was initialized with the result of get_env(s).

    5. Given subexpressions s2 and e where s2 is a sender returned from ensure_started or a copy of such, let S2 be decltype((s2)) and let E be decltype((e)). The type of tag_invoke(get_completion_signatures, s2, e) shall be equivalent to:

      make_completion_signatures<
        copy_cvref_t<S2, S>,
        ensure-started-env,
        completion_signatures<set_error_t(exception_ptr&&),
                              set_error_t(Es)...>,
        set-value-signature,
        error-types>
      

      where Es is a (possibly empty) template parameter pack, set-value-signature is the alias template:

      template<class... Ts>
        using set-value-signature =
          completion_signatures<set_value_t(decay_t<Ts>&&...)>;
      

      and error-types is the alias template:

      template<class E>
        using error-types =
          completion_signatures<set_error_t(decay_t<E>&&)>;
      
  4. Let s be a sender expression, r be an instance of the receiver type described above, s2 be a sender returned from ensure_started(s) or a copy of such, r2 be the receiver to which s2 is connected, and args be the pack of subexpressions passed to r's completion function CSO when s completes. s2 shall either invoke CSO(r2, args2...), where args2 is a pack of xvalue references to objects decay-copied from args, or call set_error(r2, e2) for some subexpression e2. The objects passed to r2's completion operation shall be valid until after the completion of the invocation of r2's completion operation.
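
  The contrast with split can be seen in a small non-normative model. The sketch below is plain single-threaded C++17, not the proposed API: ensure_started_model is a hypothetical name, a std::function stands in for the wrapped operation, and stop sources and the error and stopped channels are omitted. It shows two properties from the wording above: the work starts when the sender is created, before any consumer is connected, and the single consumer receives the stored result as an xvalue.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical model of a sender returned from ensure_started.
template <class T>
class ensure_started_model {
  struct state {
    std::optional<T> result;  // value saved by set_value(r, ...)
  };
  std::shared_ptr<state> sh_;  // stands in for sh_state

 public:
  // Eager: the producer runs here, at creation time.
  explicit ensure_started_model(std::function<T()> produce)
      : sh_(std::make_shared<state>()) {
    sh_->result.emplace(produce());
  }

  // Models connect followed by start. Callable only on an rvalue because the
  // stored value is handed over as an xvalue and ownership is then released.
  template <class Consumer>
  void start(Consumer&& consume) && {
    std::forward<Consumer>(consume)(std::move(*sh_->result));
    sh_.reset();
  }
};

// Usage: the work has already run by the time a consumer is attached.
inline void ensure_started_model_example() {
  bool ran = false;
  ensure_started_model<std::string> s([&] {
    ran = true;
    return std::string("done");
  });
  assert(ran);  // ran before any consumer was connected

  std::string got;
  std::move(s).start([&](std::string&& v) { got = std::move(v); });
  assert(got == "done");
}
```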

11.9.7. Sender consumers [exec.consumers]

11.9.7.1. execution::start_detached [exec.start.detached]
  1. start_detached eagerly starts a sender without the caller needing to manage the lifetimes of any objects.

  2. The name start_detached denotes a customization point object. For some subexpression s, let S be decltype((s)). If S does not satisfy sender, start_detached is ill-formed. Otherwise, the expression start_detached(s) is expression-equivalent to:

    1. tag_invoke(start_detached, get_completion_scheduler<set_value_t>(get_env(s)), s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above is void.

    2. Otherwise, tag_invoke(start_detached, s), if that expression is valid.

      • Mandates: The type of the tag_invoke expression above is void.

    3. Otherwise:

      1. Let R be the type of a receiver, let r be an rvalue of type R, and let cr be an lvalue reference to const R such that:

        1. The expression set_value(r) is not potentially-throwing and has no effect,

        2. For any subexpression e, the expression set_error(r, e) is expression-equivalent to terminate(),

        3. The expression set_stopped(r) is not potentially-throwing and has no effect, and

        4. The expression get_env(cr) is expression-equivalent to empty_env{}.

      2. Calls connect(s, r), resulting in an operation state op_state, then calls start(op_state).

    If the function selected above does not eagerly start the sender s after connecting it with a receiver that ignores value and stopped completion operations and calls terminate() on error completions, the behavior of calling start_detached(s) is undefined.
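
    The receiver contract used by the default implementation can be summarized with the following non-normative sketch in plain C++17. It is not the proposed API: completion and start_detached_model are hypothetical names, and the work is modeled as a callable that runs synchronously and reports which completion function it would have called.

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Hypothetical summary of which completion function the work called.
enum class completion { value, error, stopped };

// Runs the work immediately. Value and stopped completions are ignored,
// and an error completion terminates the program.
template <class Work>
void start_detached_model(Work&& work) noexcept {
  switch (std::forward<Work>(work)()) {
    case completion::value:
      break;  // set_value(r): no effect
    case completion::stopped:
      break;  // set_stopped(r): no effect
    case completion::error:
      std::terminate();  // set_error(r, e): terminate()
  }
}

// Usage: the caller gets nothing back and keeps no state alive.
inline void start_detached_model_example() {
  int side_effects = 0;
  start_detached_model([&] {
    ++side_effects;
    return completion::value;
  });
  start_detached_model([&] {
    ++side_effects;
    return completion::stopped;
  });
  assert(side_effects == 2);
}
```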

11.9.7.2. this_thread::sync_wait [exec.sync.wait]
  1. this_thread::sync_wait and this_thread::sync_wait_with_variant are used to block the current thread until a sender passed as an argument has completed, and to obtain the values (if any) it completed with. sync_wait requires that the input sender has exactly one value completion signature.

  2. For any receiver r created by an implementation of sync_wait and sync_wait_with_variant, the expressions get_scheduler(get_env(r)) and get_delegatee_scheduler(get_env(r)) shall be well-formed. For a receiver created by the default implementation of this_thread::sync_wait, these expressions shall return a scheduler to the same thread-safe, first-in-first-out queue of work such that tasks scheduled to the queue execute on the thread of the caller of sync_wait. [Note: The scheduler for an instance of run_loop that is a local variable within sync_wait is one valid implementation. -- end note]

  3. The templates sync-wait-type and sync-wait-with-variant-type are used to determine the return types of this_thread::sync_wait and this_thread::sync_wait_with_variant. Let sync-wait-env be the type of the expression get_env(r) where r is an instance of the receiver created by the default implementation of sync_wait.

    template<sender_in<sync-wait-env> S>
      using sync-wait-type =
        optional<value_types_of_t<S, sync-wait-env, decayed-tuple, type_identity_t>>;
    
    template<sender_in<sync-wait-env> S>
      using sync-wait-with-variant-type = optional<into-variant-type<S, sync-wait-env>>;
    
  4. The name this_thread::sync_wait denotes a customization point object. For some subexpression s, let S be decltype((s)). If sender_in<S, sync-wait-env> is false, or the number of arguments that completion_signatures_of_t<S, sync-wait-env>::value_types passes to its Variant template parameter is not 1, this_thread::sync_wait(s) is ill-formed. Otherwise, this_thread::sync_wait(s) is expression-equivalent to:

    1. tag_invoke(this_thread::sync_wait, get_completion_scheduler<set_value_t>(get_env(s)), s), if this expression is valid.

      • Mandates: The type of the tag_invoke expression above is sync-wait-type<S, sync-wait-env>.

    2. Otherwise, tag_invoke(this_thread::sync_wait, s), if this expression is valid.

      • Mandates: The type of the tag_invoke expression above is sync-wait-type<S, sync-wait-env>.

    3. Otherwise:

      1. Constructs a receiver r.

      2. Calls connect(s, r), resulting in an operation state op_state, then calls start(op_state).

      3. Blocks the current thread until a completion operation of r is executed. When it is:

        1. If set_value(r, ts...) has been called, returns sync-wait-type<S, sync-wait-env>{decayed-tuple<decltype(ts)...>{ts...}}. If that expression exits exceptionally, the exception is propagated to the caller of sync_wait.

        2. If set_error(r, e) has been called, let E be the decayed type of e. If E is exception_ptr, calls std::rethrow_exception(e). Otherwise, if E is error_code, throws system_error(e). Otherwise, throws e.

        3. If set_stopped(r) has been called, returns sync-wait-type<S, sync-wait-env>{}.
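
The default behavior in item 3 can be pictured with a small sketch that uses only the existing standard library. It is not the proposed implementation, and it makes several simplifying assumptions: toy_receiver, wait_state, toy_sync_wait, and the three sample senders are hypothetical names, a "sender" is modeled as a callable that is handed a receiver and returns an object owning any background work, and the value type is fixed to int.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <exception>
#include <future>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <tuple>
#include <utility>
#include <variant>

struct stopped_tag {};

// Shared between the waiting thread and whichever thread completes the work.
struct wait_state {
  std::mutex mtx;
  std::condition_variable cv;
  // index 0: still running, 1: value, 2: error, 3: stopped
  std::variant<std::monostate, std::tuple<int>, std::exception_ptr, stopped_tag> result;

  template<std::size_t I, class... As>
  void complete(As&&... as) {
    {
      std::lock_guard lock(mtx);
      result.emplace<I>(std::forward<As>(as)...);
    }
    cv.notify_one();
  }
};

// Hypothetical receiver: each completion records its outcome and wakes the waiter.
struct toy_receiver {
  wait_state* state;
  void set_value(int v) { state->complete<1>(v); }
  void set_error(std::exception_ptr e) { state->complete<2>(std::move(e)); }
  void set_stopped() { state->complete<3>(); }
};

template<class Sender>
std::optional<std::tuple<int>> toy_sync_wait(Sender sender) {
  wait_state state;
  // "connect" and "start" collapsed into one call; op owns any background work
  // and is destroyed before state, so the completing thread never outlives it.
  [[maybe_unused]] auto op = sender(toy_receiver{&state});
  std::unique_lock lock(state.mtx);
  state.cv.wait(lock, [&] { return state.result.index() != 0; });
  if (auto* err = std::get_if<2>(&state.result)) std::rethrow_exception(*err);
  if (auto* val = std::get_if<1>(&state.result)) return std::move(*val);
  return std::nullopt;  // stopped
}

// Completes with a value from another thread; the returned future joins on destruction.
inline auto value_sender(int v) {
  return [v](toy_receiver r) {
    return std::async(std::launch::async, [r, v]() mutable { r.set_value(v); });
  };
}
// The next two complete inline, so they have nothing to own.
inline auto error_sender() {
  return [](toy_receiver r) {
    r.set_error(std::make_exception_ptr(std::runtime_error("boom")));
    return 0;
  };
}
inline auto stopped_sender() {
  return [](toy_receiver r) { r.set_stopped(); return 0; };
}
```

A value completion yields an engaged optional, a stopped completion yields an empty one, and an error completion surfaces as an exception in the waiting thread.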

  5. The name this_thread::sync_wait_with_variant denotes a customization point object. For some subexpression s, let S be the type of into_variant(s). If sender_in<S, sync-wait-env> is false, this_thread::sync_wait_with_variant(s) is ill-formed. Otherwise, this_thread::sync_wait_with_variant(s) is expression-equivalent to:

    1. tag_invoke(this_thread::sync_wait_with_variant, get_completion_scheduler<set_value_t>(get_env(s)), s), if this expression is valid.

      • Mandates: The type of the tag_invoke expression above is sync-wait-with-variant-type<S, sync-wait-env>.

    2. Otherwise, tag_invoke(this_thread::sync_wait_with_variant, s), if this expression is valid.

      • Mandates: The type of the tag_invoke expression above is sync-wait-with-variant-type<S, sync-wait-env>.

    3. Otherwise, this_thread::sync_wait(into_variant(s)).

11.10. execution::execute [exec.execute]

  1. execute creates fire-and-forget tasks on a specified scheduler.

  2. The name execute denotes a customization point object. For some subexpressions sch and f, let Sch be decltype((sch)) and F be decltype((f)). If Sch does not satisfy scheduler or F does not satisfy invocable, execute is ill-formed. Otherwise, execute is expression-equivalent to:

    1. tag_invoke(execute, sch, f), if that expression is valid. If the function selected by tag_invoke does not invoke the function f (or an object decay-copied from f) on an execution agent belonging to the associated execution resource of sch, or if it does not call std::terminate if an error occurs after control is returned to the caller, the behavior of calling execute is undefined.

      • Mandates: The type of the tag_invoke expression above is void.

    2. Otherwise, start_detached(then(schedule(sch), f)).

11.11. Sender/receiver utilities [exec.utils]

  1. This section makes use of the following exposition-only entities:

    // [Editorial note: copy_cvref_t as in [[P1450R3]] -- end note]
    // Mandates: is_base_of_v<T, remove_reference_t<U>> is true
    template<class T, class U>
      copy_cvref_t<U&&, T> c-style-cast(U&& u) noexcept requires decays-to<T, T> {
        return (copy_cvref_t<U&&, T>) std::forward<U>(u);
      }
    
  2. [Note: The C-style cast in c-style-cast is to disable accessibility checks. -- end note]
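
As a minimal illustration of the note above, the following sketch shows a C-style cast reaching a private base class, which static_cast is not permitted to do. The names counter, widget, as_counter, and bump_twice are hypothetical, and only the non-const lvalue case is shown (the exposition-only c-style-cast also propagates cv and reference qualifiers).

```cpp
struct counter { int hits = 0; };

// counter is a private (inaccessible) base of widget.
class widget : counter {};

constexpr counter& as_counter(widget& w) noexcept {
  // static_cast<counter&>(w) would be ill-formed here because the base is
  // inaccessible; the C-style cast performs the same conversion without
  // the access check.
  return (counter&)w;
}

constexpr int bump_twice() {
  widget w;
  as_counter(w).hits += 2;
  return as_counter(w).hits;
}
```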

11.11.1. execution::receiver_adaptor [exec.utils.rcvr.adptr]

template<
    class-type Derived,
    receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads])
  class receiver_adaptor;
  1. receiver_adaptor simplifies the implementation of one receiver type in terms of another. It defines tag_invoke overloads that forward to named members if they exist, and to the adapted receiver otherwise.

  2. If Base is an alias for the unspecified default template argument, then:

    • Let HAS-BASE be false, and

    • Let GET-BASE(d) be d.base().

    Otherwise:

    • Let HAS-BASE be true, and

    • Let GET-BASE(d) be c-style-cast<receiver_adaptor<Derived, Base>>(d).base().

    Let BASE-TYPE(D) be the type of GET-BASE(declval<D>()).

  3. receiver_adaptor<Derived, Base> is equivalent to the following:

    template<
      class-type Derived,
      receiver Base = unspecified> // arguments are not associated entities ([lib.tmpl-heads])
    class receiver_adaptor {
      friend Derived;
     public:
      using is_receiver = unspecified;
    
      // Constructors
      receiver_adaptor() = default;
      template<class B>
          requires HAS-BASE && constructible_from<Base, B>
        explicit receiver_adaptor(B&& base) : base_(std::forward<B>(base)) {}
    
     private:
      using set_value = unspecified;
      using set_error = unspecified;
      using set_stopped = unspecified;
      using get_env = unspecified;
    
      // Member functions
      template<class Self>
        requires HAS-BASE
      decltype(auto) base(this Self&& self) noexcept {
        return (std::forward<Self>(self).base_);
      }
    
      // [exec.utils.rcvr.adptr.nonmembers] Non-member functions
      template<class... As>
        friend void tag_invoke(set_value_t, Derived&& self, As&&... as) noexcept;
    
      template<class E>
        friend void tag_invoke(set_error_t, Derived&& self, E&& e) noexcept;
    
      friend void tag_invoke(set_stopped_t, Derived&& self) noexcept;
    
      friend decltype(auto) tag_invoke(get_env_t, const Derived& self)
          noexcept(see below);
    
      [[no_unique_address]] Base base_; // present if and only if HAS-BASE is true
    };
    
  4. [Note: receiver_adaptor provides tag_invoke overloads on behalf of the derived class Derived, which is incomplete when receiver_adaptor is instantiated. -- end note]

  5. [Example:

    using _int_completion =
      completion_signatures<set_value_t(int)>;
    
    template<receiver_of<_int_completion> R>
      class my_receiver : receiver_adaptor<my_receiver<R>, R> {
        friend receiver_adaptor<my_receiver, R>;
        void set_value() && {
          // Qualified call: an unqualified set_value would name this member function.
          execution::set_value(std::move(*this).base(), 42);
        }
       public:
        using receiver_adaptor<my_receiver, R>::receiver_adaptor;
      };
    

    -- end example]

11.11.1.1. Non-member functions [exec.utils.rcvr.adptr.nonmembers]
template<class... As>
  friend void tag_invoke(set_value_t, Derived&& self, As&&... as) noexcept;
  1. Let SET-VALUE be the expression std::move(self).set_value(std::forward<As>(as)...).

  2. Constraints: Either SET-VALUE is a valid expression or typename Derived::set_value denotes a type and callable<set_value_t, BASE-TYPE(Derived), As...> is true.

  3. Mandates: SET-VALUE, if that expression is valid, is not potentially-throwing.

  4. Effects: Equivalent to:

    • If SET-VALUE is a valid expression, SET-VALUE;

    • Otherwise, set_value(GET-BASE(std::move(self)), std::forward<As>(as)...).

template<class E>
  friend void tag_invoke(set_error_t, Derived&& self, E&& e) noexcept;
  1. Let SET-ERROR be the expression std::move(self).set_error(std::forward<E>(e)).

  2. Constraints: Either SET-ERROR is a valid expression or typename Derived::set_error denotes a type and callable<set_error_t, BASE-TYPE(Derived), E> is true.

  3. Mandates: SET-ERROR, if that expression is valid, is not potentially-throwing.

  4. Effects: Equivalent to:

    • If SET-ERROR is a valid expression, SET-ERROR;

    • Otherwise, set_error(GET-BASE(std::move(self)), std::forward<E>(e)).

friend void tag_invoke(set_stopped_t, Derived&& self) noexcept;
  1. Let SET-STOPPED be the expression std::move(self).set_stopped().

  2. Constraints: Either SET-STOPPED is a valid expression or typename Derived::set_stopped denotes a type and callable<set_stopped_t, BASE-TYPE(Derived)> is true.

  3. Mandates: SET-STOPPED, if that expression is valid, is not potentially-throwing.

  4. Effects: Equivalent to:

    • If SET-STOPPED is a valid expression, SET-STOPPED;

    • Otherwise, set_stopped(GET-BASE(std::move(self))).

friend decltype(auto) tag_invoke(get_env_t, const Derived& self)
  noexcept(see below);
  1. Constraints: Either self.get_env() is a valid expression or typename Derived::get_env denotes a type and callable<get_env_t, BASE-TYPE(const Derived&)> is true.

  2. Effects: Equivalent to:

    • If self.get_env() is a valid expression, self.get_env();

    • Otherwise, std::get_env(GET-BASE(self)).

  3. Remarks: The expression in the noexcept clause is:

    • If self.get_env() is a valid expression, noexcept(self.get_env());

    • Otherwise, noexcept(std::get_env(GET-BASE(self))).

11.11.2. execution::completion_signatures [exec.utils.cmplsigs]

  1. completion_signatures is a type that encodes a set of completion signatures ([async.ops]).

  2. [Example:

    class my_sender {
      using completion_signatures =
        completion_signatures<
          set_value_t(),
          set_value_t(int, float),
          set_error_t(exception_ptr),
          set_error_t(error_code),
          set_stopped_t()>;
    };
    
    // Declares my_sender to be a sender that can complete by calling
    // one of the following for a receiver expression R:
    //    set_value(R)
    //    set_value(R, int{...}, float{...})
    //    set_error(R, exception_ptr{...})
    //    set_error(R, error_code{...})
    //    set_stopped(R)
    

    -- end example]

  3. This section makes use of the following exposition-only entities:

    template<class Fn>
      concept completion-signature = see below;
    
    template<bool>
      struct indirect-meta-apply {
        template<template<class...> class T, class... As>
          using meta-apply = T<As...>; // exposition only
      };
    
    template<class...>
      concept always-true = true; // exposition only
    
    1. A type Fn satisfies completion-signature if and only if it is a function type with one of the following forms:

      • set_value_t(Vs...), where Vs is an arbitrary parameter pack.

      • set_error_t(E), where E is an arbitrary type.

      • set_stopped_t()

    template<class Tag,
              class S,
              class E,
              template<class...> class Tuple,
              template<class...> class Variant>
        requires sender_in<S, E>
      using gather-signatures = see below;
    
    1. Let Fns... be a template parameter pack of the arguments of the completion_signatures specialization named by completion_signatures_of_t<S, E>, let TagFns be a template parameter pack of the function types in Fns whose return types are Tag, and let Ts_n be a template parameter pack of the function argument types in the n-th type in TagFns. Then, given two variadic templates Tuple and Variant, the type gather-signatures<Tag, S, E, Tuple, Variant> names the type META-APPLY(Variant, META-APPLY(Tuple, Ts_0...), META-APPLY(Tuple, Ts_1...), ..., META-APPLY(Tuple, Ts_(m-1)...)), where m is the size of the parameter pack TagFns and META-APPLY(T, As...) is equivalent to:

      typename indirect-meta-apply<always-true<As...>>::template meta-apply<T, As...>;
      
    2. The purpose of META-APPLY is to make it valid to use non-variadic templates as Variant and Tuple arguments to gather-signatures.

  4. template<completion-signature... Fns>
      struct completion_signatures {};
    
    template<class S,
              class E = empty_env,
              template<class...> class Tuple = decayed-tuple,
              template<class...> class Variant = variant-or-empty>
        requires sender_in<S, E>
      using value_types_of_t =
          gather-signatures<set_value_t, S, E, Tuple, Variant>;
    
    template<class S,
              class E = empty_env,
              template<class...> class Variant = variant-or-empty>
        requires sender_in<S, E>
      using error_types_of_t =
          gather-signatures<set_error_t, S, E, type_identity_t, Variant>;
    
    template<class S, class E = empty_env>
        requires sender_in<S, E>
      inline constexpr bool sends_stopped =
          !same_as<
            type-list<>,
            gather-signatures<set_stopped_t, S, E, type-list, type-list>>;
    

11.11.3. execution::make_completion_signatures [exec.utils.mkcmplsigs]

  1. make_completion_signatures is an alias template used to adapt the completion signatures of a sender. It takes a sender, an environment, and several other template arguments that apply modifications to the sender’s completion signatures to generate a new specialization of completion_signatures.

  2. [Example:

    // Given a sender S and an environment Env, adapt S’s completion
    // signatures by lvalue-ref qualifying the values, adding an additional
    // exception_ptr error completion if it's not already there, and leaving the
    // other completion signatures alone.
    template<class... Args>
      using my_set_value_t =
        completion_signatures<
          set_value_t(add_lvalue_reference_t<Args>...)>;
    
    using my_completion_signatures =
      make_completion_signatures<
        S, Env,
        completion_signatures<set_error_t(exception_ptr)>,
        my_set_value_t>;
    

    -- end example]

  3. This section makes use of the following exposition-only entities:

    template<class... As>
      using default-set-value =
        completion_signatures<set_value_t(As...)>;
    
    template<class Err>
      using default-set-error =
        completion_signatures<set_error_t(Err)>;
    
  4. template<sender Sndr,
              class Env = empty_env,
              valid-completion-signatures AddlSigs =
                  completion_signatures<>,
              template<class...> class SetValue = default-set-value,
              template<class> class SetError = default-set-error,
              valid-completion-signatures SetStopped =
                  completion_signatures<set_stopped_t()>>
        requires sender_in<Sndr, Env>
    using make_completion_signatures =
      completion_signatures<see below>;
    
    • SetValue shall name an alias template such that for any template parameter pack As..., the type SetValue<As...> is either ill-formed or else valid-completion-signatures<SetValue<As...>> is satisfied.

    • SetError shall name an alias template such that for any type Err, SetError<Err> is either ill-formed or else valid-completion-signatures<SetError<Err>> is satisfied.

    Then:

    • Let Vs... be a pack of the types in the type-list named by value_types_of_t<Sndr, Env, SetValue, type-list>.

    • Let Es... be a pack of the types in the type-list named by error_types_of_t<Sndr, Env, error-list>, where error-list is an alias template such that error-list<Ts...> names type-list<SetError<Ts>...>.

    • Let Ss name the type completion_signatures<> if sends_stopped<Sndr, Env> is false; otherwise, SetStopped.

    Then:

    1. If any of the above types are ill-formed, then make_completion_signatures<Sndr, Env, AddlSigs, SetValue, SetError, SetStopped> is ill-formed,

    2. Otherwise, make_completion_signatures<Sndr, Env, AddlSigs, SetValue, SetError, SetStopped> names the type completion_signatures<Sigs...> where Sigs... is the unique set of types in all the template arguments of all the completion_signatures specializations in [AddlSigs, Vs..., Es..., Ss].
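
The final step, collecting the template arguments of several completion_signatures specializations into one duplicate-free specialization, might look like the sketch below. The helper names concat_sigs, unique_impl, unique_sigs, and merge_unique_t are hypothetical, the tag types are redeclared locally, and keeping signatures in first-occurrence order is a choice of this sketch (the wording above only asks for the unique set).

```cpp
#include <exception>
#include <type_traits>

struct set_value_t {};
struct set_error_t {};
struct set_stopped_t {};
template<class... Sigs> struct completion_signatures {};

// Flattens several completion_signatures<...> into one, keeping duplicates.
template<class... Lists>
struct concat_sigs { using type = completion_signatures<>; };
template<class... As>
struct concat_sigs<completion_signatures<As...>> {
  using type = completion_signatures<As...>;
};
template<class... As, class... Bs, class... Rest>
struct concat_sigs<completion_signatures<As...>, completion_signatures<Bs...>, Rest...>
    : concat_sigs<completion_signatures<As..., Bs...>, Rest...> {};

// Moves signatures from the input into Out, skipping any already present.
template<class Out, class... In>
struct unique_impl;
template<class... Out>
struct unique_impl<completion_signatures<Out...>> {
  using type = completion_signatures<Out...>;
};
template<class... Out, class Head, class... Tail>
struct unique_impl<completion_signatures<Out...>, Head, Tail...>
    : std::conditional_t<(std::is_same_v<Head, Out> || ...),
                         unique_impl<completion_signatures<Out...>, Tail...>,
                         unique_impl<completion_signatures<Out..., Head>, Tail...>> {};

template<class List>
struct unique_sigs;
template<class... Sigs>
struct unique_sigs<completion_signatures<Sigs...>>
    : unique_impl<completion_signatures<>, Sigs...> {};

template<class... Lists>
using merge_unique_t =
    typename unique_sigs<typename concat_sigs<Lists...>::type>::type;
```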

11.12. Execution contexts [exec.ctx]

  1. This section specifies some execution resources on which work can be scheduled.

11.12.1. run_loop [exec.run.loop]

  1. A run_loop is an execution resource on which work can be scheduled. It maintains a simple, thread-safe first-in-first-out queue of work. Its run() member function removes elements from the queue and executes them in a loop on whatever thread of execution calls run().

  2. A run_loop instance has an associated count that corresponds to the number of work items that are in its queue. Additionally, a run_loop has an associated state that can be one of starting, running, or finishing.

  3. Concurrent invocations of the member functions of run_loop, other than run and its destructor, do not introduce data races. The member functions pop_front, push_back, and finish execute atomically.

  4. [Note: Implementations are encouraged to use an intrusive queue of operation states to hold the work units to make scheduling allocation-free. -- end note]

    class run_loop {
      // [exec.run.loop.types] Associated types
      class run-loop-scheduler; // exposition only
      class run-loop-sender; // exposition only
      struct run-loop-opstate-base { // exposition only
        virtual void execute() = 0;
        run_loop* loop_;
        run-loop-opstate-base* next_;
      };
      template<receiver_of<completion_signatures<set_value_t()>> R>
        using run-loop-opstate = unspecified; // exposition only
    
      // [exec.run.loop.members] Member functions:
      run-loop-opstate-base* pop_front(); // exposition only
      void push_back(run-loop-opstate-base*); // exposition only
    
     public:
      // [exec.run.loop.ctor] construct/copy/destroy
      run_loop() noexcept;
      run_loop(run_loop&&) = delete;
      ~run_loop();
    
      // [exec.run.loop.members] Member functions:
      run-loop-scheduler get_scheduler();
      void run();
      void finish();
    };
    
11.12.1.1. Associated types [exec.run.loop.types]
class run-loop-scheduler;
  1. run-loop-scheduler is an unspecified type that models the scheduler concept.

  2. Instances of run-loop-scheduler remain valid until the end of the lifetime of the run_loop instance from which they were obtained.

  3. Two instances of run-loop-scheduler compare equal if and only if they were obtained from the same run_loop instance.

  4. Let sch be an expression of type run-loop-scheduler. The expression schedule(sch) is not potentially-throwing and has type run-loop-sender.

class run-loop-sender;
  1. run-loop-sender is an unspecified type such that sender_of<run-loop-sender, set_value_t()> is true. Additionally, the type reported by its error_types associated type is exception_ptr, and the value of its sends_stopped trait is true.

  2. An instance of run-loop-sender remains valid until the end of the lifetime of its associated run_loop instance.

  3. Let s be an expression of type run-loop-sender, let r be an expression such that decltype(r) models the receiver_of concept, and let C be either set_value_t or set_stopped_t. Then:

    • The expression connect(s, r) has type run-loop-opstate<decay_t<decltype(r)>> and is potentially-throwing if and only if the initialization of decay_t<decltype(r)> from r is potentially-throwing.

    • The expression get_completion_scheduler<C>(get_env(s)) is not potentially-throwing, has type run-loop-scheduler, and compares equal to the run-loop-scheduler instance from which s was obtained.

template<receiver_of<completion_signatures<set_value_t()>> R> // arguments are not associated entities ([lib.tmpl-heads])
  struct run-loop-opstate;
  1. run-loop-opstate<R> inherits unambiguously from run-loop-opstate-base.

  2. Let o be a non-const lvalue of type run-loop-opstate<R>, and let REC(o) be a non-const lvalue reference to an instance of type R that was initialized with the expression r passed to the invocation of connect that returned o. Then:

    • The object to which REC(o) refers remains valid for the lifetime of the object to which o refers.

    • The type run-loop-opstate<R> overrides run-loop-opstate-base::execute() such that o.execute() is equivalent to the following:

      if (get_stop_token(REC(o)).stop_requested()) {
        set_stopped(std::move(REC(o)));
      } else {
        set_value(std::move(REC(o)));
      }
      
    • The expression start(o) is equivalent to the following:

      try {
        o.loop_->push_back(&o);
      } catch(...) {
        set_error(std::move(REC(o)), current_exception());
      }
      
11.12.1.2. Constructor and destructor [exec.run.loop.ctor]
run_loop::run_loop() noexcept;
  1. Postconditions: count is 0 and state is starting.

run_loop::~run_loop();
  1. Effects: If count is not 0 or if state is running, invokes terminate(). Otherwise, has no effects.

11.12.1.3. Member functions [exec.run.loop.members]
run-loop-opstate-base* run_loop::pop_front();
  1. Effects: Blocks ([defns.block]) until one of the following conditions is true:

    • count is 0 and state is finishing, in which case pop_front returns nullptr; or

    • count is greater than 0, in which case an item is removed from the front of the queue, count is decremented by 1, and the removed item is returned.

void run_loop::push_back(run-loop-opstate-base* item);
  1. Effects: Adds item to the back of the queue and increments count by 1.

  2. Synchronization: This operation synchronizes with the pop_front operation that obtains item.

run-loop-scheduler run_loop::get_scheduler();
  1. Returns: an instance of run-loop-scheduler that can be used to schedule work onto this run_loop instance.

void run_loop::run();
  1. Effects: Equivalent to:

    while (auto* op = pop_front()) {
      op->execute();
    }
    
  2. Preconditions: state is starting.

  3. Postconditions: state is finishing.

  4. Remarks: While the loop is executing, state is running. When state changes, it does so without introducing data races.

void run_loop::finish();
  1. Effects: Changes state to finishing.

  2. Synchronization: This operation synchronizes with all pop_front operations on this object.
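
The queue semantics specified in this subclause can be approximated with a mutex, a condition variable, and an intrusive singly linked list. The sketch below is one possible shape under simplifying assumptions, not the specified class: toy_run_loop, op_base, record_op, and run_three_items are hypothetical names, the starting and running states and the scheduler and sender types are omitted, and pop_front and push_back are not hidden as exposition-only members.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <vector>

// Plays the role of run-loop-opstate-base: an intrusive queue node.
struct op_base {
  virtual void execute() = 0;
  op_base* next_ = nullptr;

 protected:
  ~op_base() = default;
};

class toy_run_loop {
  std::mutex mtx_;
  std::condition_variable cv_;
  op_base* head_ = nullptr;
  op_base* tail_ = nullptr;
  bool finishing_ = false;

 public:
  // Blocks until an item is available, or until the queue is empty and
  // finish() has been called, in which case nullptr is returned.
  op_base* pop_front() {
    std::unique_lock lock(mtx_);
    cv_.wait(lock, [this] { return head_ != nullptr || finishing_; });
    if (head_ == nullptr) return nullptr;
    op_base* item = head_;
    head_ = item->next_;
    if (head_ == nullptr) tail_ = nullptr;
    return item;
  }

  // Links the item at the back; no allocation is needed.
  void push_back(op_base* item) {
    {
      std::lock_guard lock(mtx_);
      item->next_ = nullptr;
      if (tail_ != nullptr) tail_->next_ = item; else head_ = item;
      tail_ = item;
    }
    cv_.notify_one();
  }

  void run() {
    while (op_base* op = pop_front()) op->execute();
  }

  void finish() {
    {
      std::lock_guard lock(mtx_);
      finishing_ = true;
    }
    cv_.notify_all();
  }
};

// Hypothetical work item that records the order in which it ran.
struct record_op final : op_base {
  std::vector<int>* out;
  int id;
  record_op(std::vector<int>* o, int i) : out(o), id(i) {}
  void execute() override { out->push_back(id); }
};

inline std::vector<int> run_three_items() {
  toy_run_loop loop;
  std::vector<int> order;
  record_op a{&order, 1}, b{&order, 2}, c{&order, 3};
  loop.push_back(&a);
  loop.push_back(&b);
  loop.push_back(&c);
  loop.finish();  // already-queued items still drain before run() returns
  loop.run();
  assert(order.size() == 3);
  return order;
}
```

Calling finish() before run() does not discard queued work: pop_front only returns nullptr once the queue is empty, so the three items execute in first-in-first-out order.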

11.13. Coroutine utilities [exec.coro.utils]

11.13.1. execution::as_awaitable [exec.as.awaitable]

  1. as_awaitable transforms an object into one that is awaitable within a particular coroutine. This section makes use of the following exposition-only entities:

    template<class S, class E>
      using single-sender-value-type = see below;
    
    template<class S, class E>
      concept single-sender =
        sender_in<S, E> &&
        requires { typename single-sender-value-type<S, E>; };
    
    template<class S, class P>
      concept awaitable-sender =
        single-sender<S, ENV-OF(P)> &&
        sender_to<S, awaitable-receiver> && // see below
        requires (P& p) {
          { p.unhandled_stopped() } -> convertible_to<coroutine_handle<>>;
        };
    
    template<class S, class P>
      class sender-awaitable;
    

    where ENV-OF(P) names the type env_of_t<P> if that type is well-formed, or empty_env otherwise.

    1. Alias template single-sender-value-type is defined as follows:

      1. If value_types_of_t<S, E, Tuple, Variant> would have the form Variant<Tuple<T>>, then single-sender-value-type<S, E> is an alias for type decay_t<T>.

      2. Otherwise, if value_types_of_t<S, E, Tuple, Variant> would have the form Variant<Tuple<>> or Variant<>, then single-sender-value-type<S, E> is an alias for type void.

      3. Otherwise, single-sender-value-type<S, E> is ill-formed.

    2. The type sender-awaitable<S, P> is equivalent to the following:

      template<class S, class P> // arguments are not associated entities ([lib.tmpl-heads])
      class sender-awaitable {
        struct unit {};
        using value_t = single-sender-value-type<S, ENV-OF(P)>;
        using result_t = conditional_t<is_void_v<value_t>, unit, value_t>;
        struct awaitable-receiver;
      
        variant<monostate, result_t, exception_ptr> result_{};
        connect_result_t<S, awaitable-receiver> state_;
      
       public:
        sender-awaitable(S&& s, P& p);
        bool await_ready() const noexcept { return false; }
        void await_suspend(coroutine_handle<P>) noexcept { start(state_); }
        value_t await_resume();
      };
      
      1. awaitable-receiver is equivalent to the following:

        struct awaitable-receiver {
          using is_receiver = unspecified;
          variant<monostate, result_t, exception_ptr>* result_ptr_;
          coroutine_handle<P> continuation_;
          // ... see below
        };
        

        Let r be an rvalue expression of type awaitable-receiver, let cr be a const lvalue that refers to r, let vs... be an arbitrary function parameter pack of types Vs..., and let err be an arbitrary expression of type Err. Then:

        1. If constructible_from<result_t, Vs...> is satisfied, the expression set_value(r, vs...) is equivalent to:

          try {
            r.result_ptr_->emplace<1>(vs...);
          } catch(...) {
            r.result_ptr_->emplace<2>(current_exception());
          }
          r.continuation_.resume();
          

          Otherwise, set_value(r, vs...) is ill-formed.

        2. The expression set_error(r, err) is equivalent to:

          r.result_ptr_->emplace<2>(AS-EXCEPT-PTR(err));
          r.continuation_.resume();
          

          where AS-EXCEPT-PTR(err) is:

          1. err if decay_t<Err> names the same type as exception_ptr,

          2. Otherwise, make_exception_ptr(system_error(err)) if decay_t<Err> names the same type as error_code,

          3. Otherwise, make_exception_ptr(err).

        3. The expression set_stopped(r) is equivalent to static_cast<coroutine_handle<>>(r.continuation_.promise().unhandled_stopped()).resume().

        4. For any expression tag whose type satisfies forwarding-query and for any pack of subexpressions as, tag_invoke(tag, get_env(cr), as...) is expression-equivalent to tag(get_env(as_const(cr.continuation_.promise())), as...) when that expression is well-formed.
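
The AS-EXCEPT-PTR mapping used by set_error above can be written as an ordinary function template. This is a sketch with hypothetical names (as_except_ptr, and describe, which exists only to make the result observable).

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <system_error>
#include <type_traits>
#include <utility>

template<class Err>
std::exception_ptr as_except_ptr(Err&& err) {
  using E = std::decay_t<Err>;
  if constexpr (std::is_same_v<E, std::exception_ptr>) {
    return std::forward<Err>(err);  // already an exception_ptr
  } else if constexpr (std::is_same_v<E, std::error_code>) {
    return std::make_exception_ptr(std::system_error(err));
  } else {
    return std::make_exception_ptr(std::forward<Err>(err));
  }
}

// Rethrows the stored exception and reports which kind it was.
inline std::string describe(std::exception_ptr p) {
  try {
    std::rethrow_exception(p);
  } catch (const std::system_error&) {
    return "system_error";
  } catch (const std::runtime_error& e) {
    return e.what();
  } catch (int) {
    return "int";
  } catch (...) {
    return "other";
  }
}
```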

      2. sender-awaitable::sender-awaitable(S&& s, P& p)

        • Effects: initializes state_ with connect(std::forward<S>(s), awaitable-receiver{&result_, coroutine_handle<P>::from_promise(p)}).

      3. value_t sender-awaitable::await_resume()

        • Effects: equivalent to:

          if (result_.index() == 2)
            rethrow_exception(get<2>(result_));
          if constexpr (!is_void_v<value_t>)
            return std::forward<value_t>(get<1>(result_));
          
  2. as_awaitable is a customization point object. For some subexpressions e and p where p is an lvalue, E names the type decltype((e)) and P names the type decltype((p)), as_awaitable(e, p) is expression-equivalent to the following:

    1. tag_invoke(as_awaitable, e, p) if that expression is well-formed.

      • Mandates: is-awaitable<A, P> is true, where A is the type of the tag_invoke expression above.

    2. Otherwise, e if is-awaitable<E, U> is true, where U is an unspecified class type that lacks a member named await_transform. The condition is not is-awaitable<E, P> as that creates the potential for constraint recursion.

      • Preconditions: is-awaitable<E, P> is true and the expression co_await e in a coroutine with promise type U is expression-equivalent to the same expression in a coroutine with promise type P.

    3. Otherwise, sender-awaitable{e, p} if awaitable-sender<E, P> is true.

    4. Otherwise, e.

11.13.2. execution::with_awaitable_senders [exec.with.awaitable.senders]

  1. with_awaitable_senders, when used as the base class of a coroutine promise type, makes senders awaitable in that coroutine type.

    In addition, it provides a default implementation of unhandled_stopped() such that if a sender completes by calling set_stopped, it is treated as if an uncatchable "stopped" exception were thrown from the await-expression. In practice, the coroutine is never resumed, and the unhandled_stopped of the coroutine caller’s promise type is called.

    template<class-type Promise>
      struct with_awaitable_senders {
        template<class OtherPromise>
          requires (!same_as<OtherPromise, void>)
        void set_continuation(coroutine_handle<OtherPromise> h) noexcept;
    
        coroutine_handle<> continuation() const noexcept { return continuation_; }
    
        coroutine_handle<> unhandled_stopped() noexcept {
          return stopped_handler_(continuation_.address());
        }
    
        template<class Value>
        see-below await_transform(Value&& value);
    
       private:
        // exposition only
        [[noreturn]] static coroutine_handle<> default_unhandled_stopped(void*) noexcept {
          terminate();
        }
        coroutine_handle<> continuation_{}; // exposition only
        // exposition only
        coroutine_handle<> (*stopped_handler_)(void*) noexcept = &default_unhandled_stopped;
      };
    
  2. void set_continuation(coroutine_handle<OtherPromise> h) noexcept

    • Effects: equivalent to:

      continuation_ = h;
      if constexpr ( requires(OtherPromise& other) { other.unhandled_stopped(); } ) {
        stopped_handler_ = [](void* p) noexcept -> coroutine_handle<> {
          return coroutine_handle<OtherPromise>::from_address(p)
            .promise().unhandled_stopped();
        };
      } else {
        stopped_handler_ = default_unhandled_stopped;
      }
      
  3. call-result-t<as_awaitable_t, Value, Promise&> await_transform(Value&& value)

    • Effects: equivalent to:

      return as_awaitable(std::forward<Value>(value), static_cast<Promise&>(*this));
      

References

Informative References

[CWG2517]
Richard Smith. Useless restriction on use of parameter in constraint-expression. 10 June 2019. open. URL: https://wg21.link/cwg2517
[HPX]
Hartmut Kaiser; et al. HPX - The C++ Standard Library for Parallelism and Concurrency. URL: https://doi.org/10.21105/joss.02352
[N4885]
Thomas Köppe. Working Draft, Standard for Programming Language C++. 17 March 2021. URL: https://wg21.link/n4885
[P0443R14]
Jared Hoberock, Michael Garland, Chris Kohlhoff, Chris Mysen, H. Carter Edwards, Gordon Brown, D. S. Hollman. A Unified Executors Proposal for C++. 15 September 2020. URL: https://wg21.link/p0443r14
[P0981R0]
Richard Smith, Gor Nishanov. Halo: coroutine Heap Allocation eLision Optimization: the joint response. 18 March 2018. URL: https://wg21.link/p0981r0
[P1056R1]
Lewis Baker, Gor Nishanov. Add lazy coroutine (coroutine task) type. 7 October 2018. URL: https://wg21.link/p1056r1
[P1895R0]
Lewis Baker, Eric Niebler, Kirk Shoop. tag_invoke: A general pattern for supporting customisable functions. 8 October 2019. URL: https://wg21.link/p1895r0
[P1897R3]
Lee Howes. Towards C++23 executors: A proposal for an initial set of algorithms. 16 May 2020. URL: https://wg21.link/p1897r3
[P2175R0]
Lewis Baker. Composable cancellation for sender-based async operations. 15 December 2020. URL: https://wg21.link/p2175r0
[P2280r2]
Barry Revzin. Using unknown references in constant expressions. 15 May 2021. URL: https://wg21.link/p2280r2