Document #: P3826R0
Date: 2025-10-05
Project: Programming Language C++
Audience: SG1 (Concurrency and Parallelism Working Group), LEWG (Library Evolution Working Group), LWG (Library Working Group)
Reply-to: Eric Niebler <eric.niebler@gmail.com>
In the current Working Draft, 33 [exec] has sender algorithms that are customizable. While the sender/receiver concepts and the algorithms themselves have been stable for several years now, the customization mechanism has seen a fair bit of recent churn. [P3718R0] is the latest effort to shore up the mechanism. Unfortunately, there are gaps in its proposed resolution. This paper details those gaps.
The problems are fixable although the fixes are non-trivial. The time for elaborate fixes has passed. This paper proposes to remove the ability to customize sender algorithms for C++26. A future paper will propose to add the feature back post-’26.
The author feels that postponing the feature will be less disruptive and safer than trying to patch it at the last minute. Most common usages of sender/receiver will not be affected.
[P3718R0] identifies real problems with the status quo of sender algorithm customization. It proposes using information from the sender about where it will complete during “early” customization, which happens when a sender algorithm constructs and returns a sender; and it proposes using information from the receiver about where the operation will start during “late” customization, when the sender and the receiver are connected.
The problem with this separation of responsibilities is that many
senders do not know where they will complete until they know where they
will be started. A simple example is the
just()
sender; it completes inline wherever it is started. And the information
about where a sender will start is not known during early customization,
when the sender is being asked for this information.
For the expression then(sndr, fn), for example, if the then CPO asks sndr where it will complete, sndr might not be able to answer, in which case no “early” customization is performed. And during “late” (connect-time) customization, only the receiver’s information about where the operation will start is used to find a customization. Presumably an algorithm like then(sndr, fn) would want to dispatch based on where the function fn will execute, but for some expressions that information cannot be determined with the API proposed in P3718.
An illustrative example is:
namespace ex = std::execution;

auto sndr = ex::starts_on(gpu, ex::just()) | ex::then(fn);
std::this_thread::sync_wait(std::move(sndr));
… where gpu is a scheduler that runs work (unsurprisingly) on a GPU. fn will execute on the GPU, so a GPU implementation of then should be used. By the proposed resolution of P3718, algorithm customization proceeds as follows:
During early customization, when starts_on(gpu, just()) | then(fn) is executing, the then CPO asks the starts_on(gpu, just()) sender where it will complete, as if by:

auto&& tmp1 = ex::starts_on(gpu, ex::just());
auto dom1 = ex::get_domain(ex::get_env(tmp1));
The starts_on sender will in turn ask the just() sender, as if by:

auto&& tmp2 = ex::just();
auto dom2 = ex::get_domain(ex::get_env(tmp2));
As discussed, the just() sender doesn’t know where it will complete until it knows where it will be started, but that information is not yet available. As a result, dom2 ends up as default_domain, which is then reported as the domain for the starts_on sender. That’s incorrect: the starts_on sender will complete on the GPU.
The then CPO uses default_domain to look up an implementation of the then algorithm, which yields the default implementation. As a result, the then CPO returns an ordinary then sender.
When that then sender is connected to sync_wait’s receiver, late customization happens. connect asks sync_wait’s receiver where the then sender will be started. It does that with get_domain(get_env(rcvr)). sync_wait starts operations on the current thread, so the get_domain query will return default_domain. As with early customization, late customization will also not find a GPU implementation.
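In code, the late-customization query amounts to the following (a sketch; rcvr stands in for the receiver that sync_wait connects to the then sender):

auto dom3 = ex::get_domain(ex::get_env(rcvr)); // default_domain; no GPU
                                               // implementation of then is found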
The end result of all of this is that a default (which is effectively a CPU) implementation will be used to evaluate the then algorithm on the GPU. That is a bad state of affairs.
OK, so there is a problem. What do we do? There are a number of different options.
Remove the std::execution additions

Although the safest option, I hope most agree that such a drastic step is not warranted by this issue. Pulling the sender abstraction and everything that depends on it would result in the removal of:
- The sender/receiver-related concepts and customization points, without which the ecosystem will have no shared async abstraction, and whose removal would set back the adoption of structured concurrency by three years.
- The sender algorithms, which capture common async patterns and make them reusable.
- execution::counting_scope and execution::simple_counting_scope, and related features for incremental adoption of structured concurrency.
- execution::parallel_scheduler and all of its related APIs.
- execution::task and execution::task_scheduler (C++26 will still not have a standard coroutine task type <heavy sigh>).
This option should only be considered if all the other options are determined to have unacceptable risk.
This option would keep all of the above library components with the exception of the customizable sender algorithms:

- then, upon_error, upon_stopped
- let_value, let_error, let_stopped
- bulk, bulk_chunked, bulk_unchunked
- starts_on, continues_on, on
- when_all, when_all_with_variant
- stopped_as_optional, stopped_as_error
- into_variant
- sync_wait
- affine_on
This would leave users with no easy standard way to start work on a given execution context, or transition to another execution context, or to execute work in parallel, or to wait for work to finish.
In fact, without the bulk algorithms, we leave no way for the parallel_scheduler to execute work in parallel!
While still delivering a standard async abstraction with minimal risk, the loss of the algorithms would make it just an abstraction. Like coroutines, adoption of senders as an async lingua franca will be hampered by lack of standard library support.
This is the option this paper proposes. We ship everything currently in the Working Draft but remove the ability to customize the algorithms. This gives us a free hand to design a better customization mechanism for C++29 – provided we have high confidence that those new customization hooks can be added without breaking existing behavior.
A fair question is: how can we have such certainty when we do not know what the customization hooks are yet?
To answer that question for myself, I implemented new customization hooks here that address the known issues. Using that design (described in Appendix A: The planned fix) as a polestar, this paper proposes wording to remove customization in such a way that will let us add it back later without breakage.
My experience implementing the solution gives me confidence that we can introduce that solution or one like it later without compatibility problems.
This option is not as reckless as it sounds. I describe the shape of a possible fix in Appendix A: The planned fix. It would not be the first time the Committee shipped a standard with known defects, and the DR process exists for just this purpose.
What gives me pause, however, is the fact that I have “fixed” this problem before only to find that my fix is broken, and not just once!
I have implemented my planned fix, and it seems to work, but it has not seen any real-world usage. In short, my confidence is not high enough to endorse this solution.
Should someone with sufficient interest come and vet my solution, I might change my mind. Shipping it as-is is certainly the least amount of work for everyone involved.
Removing algorithm customization is fairly straightforward in most regards, but there are a few parts of std::execution that need special care.
The parallel_scheduler goes to great lengths to ensure that the bulk family of algorithms – bulk, bulk_chunked, and bulk_unchunked – is executed in parallel when the user requests it and when the underlying execution context supports it.
To that end, the parallel_scheduler “provides a customized implementation” of the bulk_chunked and bulk_unchunked algorithms, but nothing is said about how those custom implementations are found or under what circumstances users can be assured that the parallel_scheduler will use them. Arguably, this is under-specified in the current Working Draft and should be addressed whether this paper is accepted or not.
We have to give users a guarantee that if conditions X, Y, and Z are met, bulk[_[un]chunked] will be run in parallel with absolute certainty.
One solution is to say that the bulk algorithms are guaranteed to execute in parallel when the immediate predecessor of the bulk operation is known to complete on the parallel_scheduler. In a sender expression such as the following:
sndr | std::execution::bulk(std::execution::par, 1024, fn)
If sndr’s attributes advertise a completion scheduler of type parallel_scheduler, then we can guarantee that the bulk operation will execute in parallel. Implementations can choose to parallelize bulk under other circumstances, but we require this one.
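For illustration, here is a minimal sketch of a sender that meets this condition. (The get_parallel_scheduler() factory is assumed from the parallel scheduler facility; the rest is illustrative only.)

#include <execution>
namespace ex = std::execution;

ex::parallel_scheduler psch = ex::get_parallel_scheduler();

// The predecessor of bulk completes on psch, so its attributes advertise a
// completion scheduler of type parallel_scheduler. Under the proposed
// guarantee, this bulk must execute in parallel.
auto sndr = ex::schedule(psch)
          | ex::bulk(ex::par, 1024, [](int i) { /* work item i */ });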
The implication of offering this guarantee is that we must preserve the guarantee going forward. Any new customization mechanism we might add must never result in parallel execution becoming serialized.
The reverse is not necessarily true, though. I maintain that a future change that parallelizes a bulk algorithm that formerly executed serially on the parallel_scheduler is an acceptable change of behavior.
If SG1 or LEWG disagrees, there are ways to avoid even this behavior change.
Library issue #4336 describes the poor interaction between task_scheduler, a type-erased scheduler, and the bulk family of algorithms; namely, that the task_scheduler always executes bulk in serial, even when it is wrapping a parallel_scheduler.
This is not a problem caused by the customization mechanism, but it is something that can be addressed as part of the customization removal process.
When we address that issue, we must avoid the parallel_scheduler pitfall of under-specifying the interaction with bulk. As with parallel_scheduler, users must have a guarantee about the conditions under which bulk is accelerated on a task_scheduler.
Fortunately, the parallel_scheduler has already given us a way to punch the bulk_chunked and bulk_unchunked algorithms through a type-erased API boundary: parallel_scheduler_backend (33.16.3 [exec.sysctxrepl.psb]). By specifying the behavior of task_scheduler in terms of parallel_scheduler_backend and bulk_item_receiver_proxy, we can give task_scheduler the ability to parallelize bulk without having to invent a new mechanism.
The bulk algorithms

Few users will ever have a need to customize an algorithm like then or let_value. The bulk algorithms are a different story. Anybody with a custom thread pool will benefit from a custom bulk implementation that can run in parallel on the thread pool. The loss of algorithm customization is particularly painful in this area. This section explores some options to address these concerns and makes a recommendation.
Remove bulk, bulk_chunked, and bulk_unchunked

This option cuts the Gordian knot, but comes at a high cost. The parallel_scheduler can hardly be called “parallel” if it does not offer a way to execute work in parallel, so cutting the bulk algorithms probably means cutting parallel_scheduler also.
In this option, we keep the bulk algorithms and the parallel_scheduler, and we say that the bulk algorithms are executed in parallel on the parallel_scheduler (and on a task_scheduler that wraps a parallel_scheduler), but we leave the mechanism unspecified.
This option is essentially the status quo, except that, as discussed in The parallel scheduler, this aspect of the parallel_scheduler is currently under-specified. The referenced section proposes a path forward. A variant of this option is to specify an exposition-only mechanism whereby bulk gets parallelized.
This option makes parallel_scheduler and task_scheduler “magic” with respect to the bulk algorithms. End users would have no standard mechanism to parallelize bulk on their own third-party thread pools in C++26. This is the approach taken by the Proposed wording below.
Customize the bulk* algorithms only

In this option, we reintroduce algorithm customization with a special-purpose API just for the bulk algorithms. For example, a scheduler might have an optional sch.bulk_transform(sndr, env) member that turns a serial bulk* sender into one that executes in parallel on scheduler sch. Whenever a bulk* sender is passed to connect, connect can check the sender’s predecessor for a completion scheduler that defines bulk_transform and use it if found.
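A minimal sketch of what such a hook could look like follows. (bulk_transform is hypothetical, as is the my_pool_scheduler type; nothing here is proposed wording.)

#include <execution>
namespace ex = std::execution;

// Hypothetical: a scheduler for a third-party thread pool that opts into
// parallel bulk by exposing a bulk_transform member.
struct my_pool_scheduler {
  using scheduler_concept = ex::scheduler_t;

  // Given a serial bulk* sender and the receiver's environment, return a
  // sender that runs the bulk work in parallel on this scheduler's pool.
  template <ex::sender Sndr, class Env>
  auto bulk_transform(Sndr&& sndr, const Env& env) const;

  // ... schedule(), operator==, and the other scheduler requirements ...
};

connect could then look for this member on the completion scheduler of the bulk sender’s predecessor and apply it before connecting.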
The downside of this approach is that we will still have to support this API even when a more general algorithm customization mechanism is available. That doesn’t seem terribly onerous to me, but that is for SG1/LEWG to decide.
Without algorithm customization, manufacturers of special-purpose hardware accelerators will not be able to ship a scheduler that both:
- works with any standard-conforming implementation of std::execution, and
- performs optimally on their hardware for all of the standard algorithms.
See Mitigating factors for some reasons why this is not as terrible as it sounds.
The loss of direct support for sender algorithm customization is a blow to power users of std::execution, but there are a few factors that mitigate the blow.
All of the senders returned from the standard algorithms are self-describing and can be unpacked into their constituent parts with structured bindings. A sufficiently motivated user can “customize” an algorithm by writing a recursive sender tree transformation, explicitly transforming senders before launching them.
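For instance, the following is a minimal sketch of that kind of unpacking (the binding names are arbitrary; the tag/data/child layout is the one used by the exposition-only make-sender machinery):

#include <concepts>
#include <execution>
namespace ex = std::execution;

auto sndr = ex::just(42) | ex::then([](int i) { return i + 1; });
static_assert(std::same_as<ex::tag_of_t<decltype(sndr)>, ex::then_t>);

// Standard senders are self-describing: destructure into tag, data, child.
auto&& [tag, fn, child] = sndr;
// A motivated user can now rebuild the operation however they like, e.g. by
// wrapping child with their own implementation of then before connecting.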
The sender concepts and customization points make it possible for users to write their own sender algorithms that interoperate with the standard ones. If a user wants to change the behavior of the then algorithm in some way, they have the option of writing their own and using it instead. I expect libraries of third-party algorithms to appear on GitHub in time, as they tend to.
Some execution contexts place extra-standard requirements on the code that executes on them. For example, NVIDIA GPUs require device-accelerated code to be annotated with its proprietary __device__ annotation. Standard libraries are unlikely to ship implementations of std::execution with such annotations. The consequence is that, rather than shipping just a GPU scheduler with some algorithm customizations, a vendor like NVIDIA is already committed to shipping its own complete implementation of std::execution (in a different namespace, of course).
For such vendors, the inability to customize standard algorithms is a moot point. Since they are implementing the standard algorithms themselves, their implementations can do whatever they want.
The approach to removing sender algorithm customization is twofold:
Remove those components that facilitate algorithm customization and their uses where it is easy to do so.
In all other cases, turn normative mechanisms into non-normative ones so we can change them later. This results in smaller and safer wording changes and preserves the already agreed-upon semantics in a way that is easy to verify.
The steps for removing algorithm customization are detailed below.
- Remove the type default_domain (33.9.5 [exec.domain.default]).
- Remove the functions transform_sender (33.9.6 [exec.snd.transform]), transform_env (33.9.7 [exec.snd.transform.env]), and apply_sender (33.9.8 [exec.snd.apply]).
- Remove the query object get_domain (33.5.5 [exec.get.domain]).
- Remove the exposition-only helpers completion-domain (33.9.2 [exec.snd.expos]/8-9), get-domain-early (33.9.2 [exec.snd.expos]/13), and get-domain-late (33.9.2 [exec.snd.expos]/14).
- Change the functions get_completion_signatures (33.9.9 [exec.getcomplsigs]) and connect (33.9.10 [exec.connect]) to operate on a sender determined as follows instead of passing the sender through transform_sender:
  - If the sender has a tag with an exposition-only transform-sender member function, pass the sender to this function with the receiver’s environment and continue the operation on the resulting sender. This preserves the behavior of calling transform_sender with the default_domain.
  - Otherwise, perform the operation on the passed-in sender.
- For the following algorithms that are currently expressed in terms of a sender transformation to a lowered form, move the lowering from alg.transform_sender(sndr, env) to alg.transform-sender(sndr, env): starts_on (33.9.12.5 [exec.starts.on]), continues_on (33.9.12.6 [exec.continues.on]), on (33.9.12.8 [exec.on]), bulk (33.9.12.11 [exec.bulk]), when_all_with_variant (33.9.12.12 [exec.when.all]), stopped_as_optional (33.9.12.14 [exec.stopped.opt]), and stopped_as_error (33.9.12.15 [exec.stopped.err]).
- For each sender adaptor algorithm in 33.9.12 [exec.adapt] that is specified to be expression-equivalent to some transform_sender invocation of the form:

  transform_sender(some-computed-domain(), make-sender(tag, {args...}, sndr));

  change the expression to:

  make-sender(tag, {args...}, sndr);

  For example, in 33.9.12.6 [exec.continues.on]/3, the following:

  transform_sender(get-domain-early(sndr), make-sender(continues_on, sch, sndr))

  would be changed to:

  make-sender(continues_on, sch, sndr)

  Additionally, if there is some caveat of the form “except that sndr is evaluated only once,” that caveat should be removed as appropriate.
- Merge the schedule_from (33.9.12.7 [exec.schedule.from]) and continues_on (33.9.12.6 [exec.continues.on]) algorithms into one algorithm called continues_on. (Currently they are separate so that they can be customized independently; by default continues_on merely dispatches to schedule_from.)
- Change 33.9.13.1 [exec.sync.wait] and 33.9.13.2 [exec.sync.wait.var] to dispatch directly to their default implementations instead of computing a domain and using apply_sender to dispatch to an implementation.
- Fix a bug in the on(sndr, sch, closure) algorithm where a write_env is incorrectly changing the “current” scheduler before its child continues_on actually transfers to that scheduler. continues_on needs to know the scheduler on which it will be started in order to find customizations correctly in the future.
- Tweak the wording of parallel_scheduler (33.15 [exec.par.scheduler]) to indicate that it (parallel_scheduler) is permitted to run the bulk family of algorithms in parallel in accordance with those algorithms’ semantics, rather than suggesting that those algorithms are “customized” for parallel_scheduler. The mechanism remains non-normative; however, we specify the conditions under which the parallel_scheduler is guaranteed to run the bulk algorithms in parallel. (This is currently under-specified.)
- Respecify task_scheduler in terms of parallel_scheduler_backend so that the bulk algorithms can be accelerated despite task_scheduler’s type-erasure. This addresses LWG#4336. As with parallel_scheduler, we specify the conditions under which task_scheduler is guaranteed to run the bulk algorithms in parallel.
- From the scheduler concept, remove the required expression:

  { auto(get_completion_scheduler<set_value_t>(get_env(schedule(std::forward<Sch>(sch))))) } -> same_as<remove_cvref_t<Sch>>;

  Instead, add a semantic requirement that if the above expression is well-formed, then it shall compare equal to sch. Additionally, require that the expression is well-formed for the parallel_scheduler, the task_scheduler, and run_loop’s scheduler, but not inline_scheduler. See inline_scheduler for the motivation behind these changes, but in short: the inline_scheduler does not know where it completes in C++26 but will in C++29.
- Optional, but recommended: Change the env<>::query member function to accept optional additional arguments after the query tag (see the sketch after this list). This restores the original design of env to that which was first proposed in [P3325R1] and which was approved by LEWG straw poll in St Louis. As described in Restoring algorithm customization in C++29, when asking a sender for its completion scheduler, the caller needs to pass extra information about where the operation will be started, and that will require env<>::query to accept extra arguments.
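The shape of the env<>::query change, extracted from the proposed wording for [exec.env] below:

// Before (C++26 Working Draft):
//   template<class QueryTag>
//     constexpr decltype(auto) query(QueryTag q) const noexcept(see below);
//
// After (proposed):
template<class QueryTag, class... Args>
  constexpr decltype(auto) query(QueryTag q, Args&&... args) const noexcept(see below);

The extra arguments are forwarded to the first sub-environment whose query member accepts them.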
This is admittedly a lot of changes, but the first 9 changes represent a simplification from the status quo, and the other changes are either neutral in terms of specification or else correct an existing Library issue.
In the final accounting, the result of these changes will be a vastly simpler specification for [exec].
Restoring algorithm customization in C++29

For C++29, we want the sender algorithms in std::execution to be customizable, with different implementations suited for different execution contexts. If we remove customization for C++26, how do we add it back without breaking code?
Recall that many senders do not know where they will complete until they know where they will be started, and that information is not currently provided when the sender is queried for its completion scheduler. This is the shoal on which algorithm customization has foundered, because without accurate information about where operations are executing, it is impossible to pick the right algorithm implementation.
Once the problem is stated plainly, the fix (or at least a major part of it) is obvious:
When asking the sender where it will complete, tell it where it will start.
The implication of this is that so-called “early” customization, performed when constructing a sender, will not be coming back. The receiver’s execution environment is not known when constructing a sender. C++29 will bring back “late” customization only.
A paper targeting C++29 will propose that we extend the get_completion_scheduler query to support an optional environment argument. Given a sender S and receiver R, the query would look like:
// Pass the sender's attributes and the receiver's environment when computing
// the completion scheduler:
auto sch = get_completion_scheduler<set_value_t>(get_env(S), get_env(R));
It will not be possible in C++26 to pass the receiver’s environment in this way, making this a conforming extension since it would not change the meaning of any existing code.
This change will also make it possible to provide a completion scheduler for the error channel in more cases. That is often not possible today since many errors are reported inline on the context on which the operation is started. The receiver’s environment knows where the operation will be started, so by passing it to the get_completion_scheduler<set_error_t> query, the error completion scheduler is knowable.
Note: The paragraph above makes it sound like this would be changing the behavior of the get_completion_scheduler<set_error_t>(get_env(sndr)) query. But that expression will behave as it always has. Only when called with the receiver’s environment will any new behavior manifest; hence, this change is a pure extension.
By the way, this extension to get_completion_scheduler motivates the change to env<>::query described above in The removal process. Although we could decide to defer that change until it is needed in C++29, it seems best to me to make the change now.
Domains

There are sender expressions that complete on an indeterminate scheduler based on runtime factors; when_all is a good example. This is the problem the get_domain query solved. So long as all of when_all’s child senders share a common domain tag – a property of the scheduler – we know the domain on which the when_all operation will complete, even though we do not know which scheduler it will complete on. The domain controls algorithm selection, not the scheduler directly.
So the plan will be to bring back a get_domain query in C++29. Additionally, just as it is necessary to have three get_completion_scheduler queries, one for each of the three completion channels, it is necessary to have three get_completion_domain queries for the times when the completion scheduler is indeterminate but the domain is known.
Note: Above we say, “So long as all of when_all’s child senders share a common domain tag […]”. This sounds like we are adding a new requirement to the when_all algorithm. However, this requirement will be met for all existing uses of when_all. Before C++29, all senders will be in the “default” domain, so they trivially all share a common domain.
Giving a non-default domain to a scheduler is the way to opt in to algorithm customization. Prior to C++29, there will be no get_*domain queries, hence the addition of those queries in C++29 will not affect any existing schedulers. And the domain queries will be so-called “forwarding” queries, meaning they will automatically be passed through layers of sender adaptors. Users will not have to change their code in order for domain information to be propagated. As a result, this change is a pure extension.
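A hypothetical sketch of what that opt-in might look like (my_domain, my_scheduler, and the environment shape shown here are assumptions; get_domain is the query this paper removes for C++26 and proposes to restore in C++29):

#include <execution>
namespace ex = std::execution;

struct my_domain {};   // tag identifying this execution context's domain

struct my_scheduler {
  using scheduler_concept = ex::scheduler_t;

  struct env_t {
    // Hypothetical C++29 opt-in: advertise a non-default domain. Because the
    // domain queries would be forwarding queries, sender adaptors layered on
    // top of this scheduler's senders pass the answer through automatically.
    constexpr my_domain query(ex::get_domain_t) const noexcept { return {}; }
  };

  // ... schedule(), operator==, and the other scheduler requirements ...
};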
connect

Since C++29 will support only late (connect-time) customization, customizing an algorithm effectively amounts to customizing that algorithm’s connect operation. By default, connect(sndr, rcvr) calls sndr.connect(rcvr), but in C++29 there will be a way to do something different depending on the sender’s attributes and the receiver’s environment.
connect will compute two domains, the “starting” domain and the (value) “completion” domain:

Domain kind | Query
---|---
Starting domain | get_domain(get_env(rcvr))
Completion domain | get_completion_domain<set_value_t>(get_env(sndr), get_env(rcvr))
How connect will use this information to select an algorithm implementation is currently under design. (See Appendix A: The planned fix for more information.) But at that point, it is only a matter of mechanism. The key point is that connect has the information it needs to dispatch accurately, and that we can make that addition without breaking code. And we can.
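A rough sketch of the kind of connect-time dispatch that becomes possible (everything here is an assumption about the eventual C++29 design: the get_completion_domain query, the select_impl step, and the wrapper function are illustrative only, not proposed wording):

#include <execution>
#include <utility>
namespace ex = std::execution;

// Hypothetical sketch of connect-time dispatch.
template <ex::sender Sndr, ex::receiver Rcvr>
auto connect_with_dispatch(Sndr&& sndr, Rcvr&& rcvr) {
  auto env = ex::get_env(rcvr);

  // Where will the operation be started?
  auto start_dom = ex::get_domain(env);

  // Where will the value completion happen, given where it starts?
  auto compl_dom =
      ex::get_completion_domain<ex::set_value_t>(ex::get_env(sndr), env);

  // Use the domains to pick an implementation of the algorithm named by the
  // sender's tag, then connect the (possibly transformed) sender.
  auto new_sndr = select_impl(ex::tag_of_t<Sndr>(), compl_dom, start_dom,
                              std::forward<Sndr>(sndr), env); // hypothetical
  return ex::connect(std::move(new_sndr), std::forward<Rcvr>(rcvr));
}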
bulk

Once we have a general mechanism for customizing algorithms, we can consider changing parallel_scheduler and task_scheduler to use that mechanism to find parallel implementations of the bulk algorithms. In C++26, it is unspecified precisely how those schedulers accelerate bulk, and we can certainly leave it that way for C++29. No change is often the safest change and always the easiest.
If we wanted to switch to using the new algorithm dispatch mechanics in C++29, I believe we can do so with minimal impact on existing code. Any behavior change would be an improvement, accelerating bulk operations that should have been accelerated but were not.
Consider the following sender:
starts_on(parallel_scheduler(), just() | bulk(fn))
In C++26, we can offer no iron-clad standard guarantee that this bulk operation will be accelerated even though it is executing on the parallel scheduler. The predecessor of bulk, just(), does not know where it will complete in C++26. There is no plumbing yet to tell it that it will be started on the parallel scheduler. As a result, it is QoI whether this bulk will execute in parallel or not.
But suppose we add a get_completion_domain<set_value_t> query to the parallel_scheduler such that the query returns an instance of a new type: parallel_domain. Now, when connecting the bulk sender, connect will ask for the predecessor’s domain, passing also the receiver’s environment. Now the just() sender is able to say where it completes: the domain where it starts, get_domain(get_env(rcvr)). This will return parallel_domain{}. connect would then use that information to find a parallel implementation of bulk.
As a result, in C++29 we could guarantee that this usage of bulk will be parallelized. For some stdlib implementations, this would be a behavior change: what once executed serially on a thread of the parallel scheduler now executes in parallel on many threads. Can that break working code? Yes, but only code that had already violated the preconditions of bulk: that fn can safely be called in parallel.
I do not believe this should be considered a breaking change, since any code that breaks is already broken.
All of the above is true also for task_scheduler, which merely adds an indirection to the call to connect. After the changes suggested by this paper, the task_scheduler accelerates bulk in the same way as parallel_scheduler.
Note: If we assign parallel_domain to the parallel_scheduler, and we also add a requirement to when_all that all of its child operations share a common domain (see Domains), does that have the potential to break existing code? It would not. We would make parallel_domain inherit from default_domain so that when_all will compute the common domain as default_domain even if one child completes in the parallel_domain.
inline_scheduler

The suggestion above to extend the get_completion_scheduler<*> query presents an intriguing possibility for the inline_scheduler: the ability for it to report the scheduler on which its scheduling operations complete! Consider the sender schedule(inline_scheduler{}). Ask it where it completes today and it will say, “I complete on the inline_scheduler,” which isn’t terribly useful. However, if you ask it, “Where will you complete – and by the way, you will be started on the parallel_scheduler?”, now that sender can report that it will complete on the parallel_scheduler.
The result is that code that uses the inline_scheduler will no longer cause the actual scheduler to be hidden.
This realization is the motivation behind the change to strike the get_completion_scheduler<set_value_t>(get_env(schedule(sch))) requirement from the scheduler concept. We want that expression to be ill-formed for the inline_scheduler. Instead, we want the following query to be well-formed (in C++29):
get_completion_scheduler<set_value_t>(get_env(schedule(inline_scheduler())), get_env(rcvr))
That expression should be equivalent to get_scheduler(get_env(rcvr)), which says that the sender of the inline_scheduler completes wherever it is started.
Note: The reason we do not want inline_scheduler to have a (largely meaningless) completion scheduler in C++26 is that we want it to have a meaningful one in C++29. And it would be strange if asking for the completion scheduler gave different answers depending on whether or not an environment was passed to the query. This follows the general principle that if you query a sender’s metadata early (sans environment) and then later query it again with an environment, the answer should not change. If the sender does not know the answer with certainty without an environment, it is better for the expression to be ill-formed than to return potentially inaccurate information.
[ Editor's note: In 33.4 [execution.syn], make the following changes: ]
… as before … namespace std::execution { // [exec.queries], queries
struct get_scheduler_t {struct get_domain_t {
unspecified
};unspecified
}; struct get_delegation_scheduler_t {unspecified
}; struct get_forward_progress_guarantee_t {unspecified
}; template<class CPO> struct get_completion_scheduler_t {unspecified
}; struct get_await_completion_adaptor_t {unspecified
};inline constexpr get_scheduler_t get_scheduler{}; inline constexpr get_delegation_scheduler_t get_delegation_scheduler{}; enum class forward_progress_guarantee; inline constexpr get_forward_progress_guarantee_t get_forward_progress_guarantee{}; template<class CPO> constexpr get_completion_scheduler_t<CPO> get_completion_scheduler{}; inline constexpr get_await_completion_adaptor_t get_await_completion_adaptor{}; … as before … // [exec.env], class template env template<queryable... Envs> struct env;inline constexpr get_domain_t get_domain{};
// [exec.domain.default], execution domains
// [exec.sched], schedulers struct scheduler_t {}; … as before … template<sender Sndr> using tag_of_t =struct default_domain;
see below
;// [exec.snd.transform], sender transformations
template<class Domain, sender Sndr, queryable... Env>
requires (sizeof...(Env) <= 1)
constexpr sender decltype(auto) transform_sender(
Domain dom, Sndr&& sndr, const Env&... env) noexcept(
see below
);// [exec.snd.transform.env], environment transformations
template<class Domain, sender Sndr, queryable Env>
constexpr queryable decltype(auto) transform_env(
Domain dom, Sndr&& sndr, Env&& env) noexcept;
// [exec.snd.apply], sender algorithm application
template<class Domain, class Tag, sender Sndr, class... Args>
constexpr decltype(auto) apply_sender(
// [exec.connect], the connect sender algorithm struct connect_t; inline constexpr connect_t connect{}; … as before …Domain dom, Tag, Sndr&& sndr, Args&&... args) noexcept(
see below
);
[ Editor's note: Remove subsection 33.5.5 [exec.get.domain]. ]
[ Editor's note: In 33.6 [exec.sched], change paragraphs 1 and 5 and strike paragraph 6 as follows: ]
The
scheduler
concept defines the requirements of a scheduler type (33.3 [exec.async.ops]).schedule
is a customization point object that accepts a scheduler. A valid invocation ofschedule
is a schedule-expression.namespace std::execution { template<class Sch> concept scheduler = <typename remove_cvref_t<Sch>::scheduler_concept, scheduler_t> && derived_from<Sch> && queryablerequires(Sch&& sch) { { schedule(std::forward<Sch>(sch)) } -> sender;
{ auto(get_completion_scheduler<set_value_t>(
get_env(schedule(std::forward<Sch>(sch))))) }
} && <remove_cvref_t<Sch>> && equality_comparable<remove_cvref_t<Sch>>; copyable}-> same_as<remove_cvref_t<Sch>>;
… as before …
- For a given scheduler expression
sch
, if the expressionauto(get_completion_scheduler<set_value_t>(get_env(schedule(sch))))
is well-formed, it shall have typeremove_cvref_t<Sch>
and shall compare equal tosch
.
- For a given scheduler expression
sch
, if the expressionget_domain(sch)
is well-formed, then the expressionget_domain(get_env(schedule(sch)))
is also well-formed and has the same type.
[ Editor's note: In 33.9.1 [exec.snd.general], change paragraph 1 as follows: ]
Subclauses 33.9.11 [exec.factories] and 33.9.12 [exec.adapt] define
customizablealgorithms that return senders.Each algorithm has a default implementation.Letsndr
be the result of an invocation of such an algorithm or an object equal to the result (18.2 [concepts.equality]), and letSndr
bedecltype((sndr))
. Letrcvr
be a receiver of typeRcvr
with associated environment env of typeEnv
such thatsender_to<Sndr, Rcvr>
istrue
.For the default implementation of the algorithm that producedConnectingsndr
, csndr
torcvr
and starting the resulting operation state (33.3 [exec.async.ops]) necessarily results in the potential evaluation (6.3 [basic.def.odr]) of a set of completion operations whose first argument is a subexpression equal torcvr
. LetSigs
be a pack of completion signatures corresponding to this set of completion operations, and letCS
be the type of the expressionget_completion_signatures<Sndr, Env>()
. ThenCS
is a specialization of the class templatecompletion_signatures
(33.10 [exec.cmplsig]), the set of whose template arguments isSigs
. If none of the types inSigs
are dependent on the typeEnv
, then the expressionget_completion_signatures<Sndr>()
is well-formed and its type isCS
.If a user-provided implementation of the algorithm that producedsndr
is selected instead of the default:
(1.1) Any completion signature that is in the set of types denoted bycompletion_signatures_of_t<Sndr, Env>
and that is not part ofSigs
shall correspond to error or stopped completion operations, unless otherwise specified.
(1.2) If none of the types inSigs
are dependent on the typeEnv
, thencompletion_signatures_of_t<Sndr>
andcompletion_signatures_of_t<Sndr, Env>
shall denote the same type.
[ Editor's note: Change 33.9.2 [exec.snd.expos] paragraph 6 as follows: ]
- For a scheduler
sch
,is
SCHED-ATTRS
(sch)an expressionequivalent too1
whose type satisfiesqueryable
such thato1.query(get_completion_scheduler<Tag>)
is an expression with the same type and value assch
MAKE-ENV
(get_completion_scheduler<set_value_t>, sch)where.Tag
is one ofset_value_t
orset_stopped_t
, and such thato1.query(get_domain)
is expression-equivalent tosch.query(get_domain)
is
SCHED-ENV
(sch)an expressionequivalent too2
whose type satisfiesqueryable
such thato2.query(get_scheduler)
is a prvalue with the same type and value assch
, and such thato2.query(get_domain)
is expression-equivalent tosch.query(get_domain)
.
MAKE-ENV
(get_scheduler, sch)
[ Editor's note: Remove
the prototype of the exposition-only
completion-domain
function just before 33.9.2
[exec.snd.expos]
paragraph 8, and with it remove paragraphs 8 and 9, which specify the
function’s behavior. ]
[ Editor's note: Remove
33.9.2
[exec.snd.expos]
paragraphs 13 and 14 and the prototypes for the
get-domain-early
and
get-domain-late
functions. ]
[ Editor's note: Remove subsection 33.9.5 [exec.domain.default]. ]
[ Editor's note: Remove subsection 33.9.6 [exec.snd.transform]. ]
[ Editor's note: Remove subsection 33.9.7 [exec.snd.transform.env]. ]
[ Editor's note: Remove subsection 33.9.8 [exec.snd.apply]. ]
[ Editor's note: Change 33.9.9 [exec.getcomplsigs] as follows: ]
Let
except
be an rvalue subexpression of an unspecified class typeExcept
such thatmove_constructible<
isExcept
> && derived_from<Except
, exception>true
. Letbe
CHECKED-COMPLSIGS
(e
)e
ife
is a core constant expression whose type satisfiesvalid-completion-signatures
; otherwise, it is the following expression:(
e
, throwexcept
, completion_signatures())Let
be expression-equivalent to
get-complsigs
<Sndr, Env...>()remove_reference_t<Sndr>::template get_completion_signatures<Sndr, Env...>()
.LetLetNewSndr
beSndr
ifsizeof...(Env) == 0
istrue
; otherwise,decltype(
wheres
)s
is the following expression:NewSndr
bedecltype(tag_of_t<Sndr>().
if that expression is well-formed, andtransform-sender
(declval<Sndr>(), declval<Env>()...))Sndr
otherwise.
transform_sender(
get-domain-late
(declval<Sndr>(), declval<Env>()...),declval<Sndr>(),
declval<Env>()...)
Constraints:
sizeof...(Env) <= 1
istrue
.Effects: Equivalent to: … as before …
[ Editor's note: Change 33.9.10 [exec.connect] as follows: ]
connect
connects (33.3 [exec.async.ops]) a sender with a receiver.The name
connect
denotes a customization point object. For subexpressionssndr
andrcvr
, letSndr
bedecltype((sndr))
andRcvr
bedecltype((rcvr))
,; letnew_sndr
be the expressiontransform_sender(decltype(
get-domain-late
(sndr, get_env(rcvr))){}, sndr, get_env(rcvr))tag_of_t<Sndr>().
if that expression is well-formed, andtransform-sender
(sndr, get_env(rcvr))sndr
otherwise; and letDS
andDR
bedecay_t<decltype((new_sndr))>
anddecay_t<Rcvr>
, respectively.Let
connect-awaitable-promise
be … as before …
[ Editor's note: Change 33.9.11.1 [exec.schedule] paragraph 4 as follows: ]
If the expression
<set_value_t>(get_env(sch.schedule()))
get_completion_scheduler== sch
is ill-formed
or well-formed and does not
evaluates to
false
sch
,
the behavior of calling schedule(sch)
is undefined.
[ Editor's note: From 33.9.12.1 [exec.adapt.general], strike paragraph (3.6) as follows: ]
Unless otherwise specified:
… as before …
(3.5) An adaptor whose child senders are all non-dependent (33.3 [exec.async.ops]) is itself non-dependent.
(3.6)
These requirements apply to any function that is selected by the implementation of the sender adaptor.(3.7) Recommended practice: Implementations should use the completion signatures of the adaptors to communicate type errors to users and to propagate any such type errors from child senders.
[ Editor's note: Change 33.9.12.5 [exec.starts.on] paragraph 3 as follows: ]
Otherwise, the expression
starts_on(sch, sndr)
is expression-equivalent to:.
make-sender
(starts_on, sch, sndr)transform_sender(
query-with-default
(get_domain, sch, default_domain()),make-sender
(starts_on, sch, sndr))
except thatsch
is evaluated only once.Let
out_sndr
andenv
be subexpressions such thatOutSndr
isdecltype((out_sndr))
. Ifis
sender-for
<OutSndr, starts_on_t>false
, then theexpressionsexpressionstarts_on.transform_env(out_sndr, env)
andstarts_on.
transform_sendertransform-sender
(out_sndr, env)areis ill-formed; otherwise it is equivalent to:auto&& [_, sch, sndr] = out_sndr; return let_value( (sch), schedule[sndr = std::forward_like<OutSndr>(sndr)]() mutable noexcept(is_nothrow_move_constructible_v<decay_t<OutSndr>>) { return std::move(sndr); });
- Let
out_sndr
be … as before …
[ Editor's note: Remove subsection 33.9.12.6 [exec.continues.on] ]
[ Editor's note: Change 33.9.12.7 [exec.schedule.from] to [exec.continues.on] and change it as follows: ]
33.9.12.
76execution::
[execschedule_from
continues_on
.schedule.from.continues.on]
schedule_from
continues_on
schedules work dependent on the completion of a sender onto a scheduler’s associated execution resource.
[Note 1:schedule_from
is not meant to be used in user code; it is used in the implementation ofcontinues_on
. — end note]The name
schedule_from
continues_on
denotes a customization point object. For some subexpressionssch
andsndr
, letSch
bedecltype((sch))
andSndr
bedecltype((sndr))
. IfSch
does not satisfy scheduler, orSndr
does not satisfysender
,schedule_from(sch, sndr)
continues_on(sndr, sch)
is ill-formed.Otherwise, the expression
schedule_from(sch, sndr)
continues_on(sndr, sch)
is expression-equivalent to:
make-sender
(continues_on, sch, sndr)transform_sender(
query-with-default
(get_domain, sch, default_domain()),make-sender
(schedule_from, sch, sndr))except that sch is evaluated only once.
The exposition-only class template
impls-for
(33.9.1 [exec.snd.general]) is specialized forschedule_from_t
continues_on_t
as follows:namespace std::execution { template<> struct
impls-for
<schedule_from_t
continues_on_t
> :default-impls
{ static constexpr autoget-attrs
=see below
; static constexpr autoget-state
=see below
; static constexpr autocomplete
=see below
; template<class Sndr, class... Env> static consteval voidcheck-types
(); }; }The member
is initialized with a callable object equivalent to the following lambda:
impls-for
<schedule_from_t
continues_on_t
>::get-attrs
[](const auto& data, const auto& child) noexcept -> decltype(auto) { return
JOIN-ENV
(SCHED-ATTRS
(data),FWD-ENV
(get_env(child))); }The member
is initialized with a callable object equivalent to the following lambda:
impls-for
<schedule_from_t
continues_on_t
>::get-state
… as before …
template<class Sndr, class... Env> static consteval void
check-types
();… as before …
The member
is initialized with a callable object equivalent to the following lambda:
impls-for
<schedule_from_t
continues_on_t
>::complete
… as before …
Let
out_sndr
be a subexpression denoting a sender returned fromschedule_from(sch, sndr)
continues_on(sndr, sch)
or one equal to such, and letOutSndr
be the typedecltype((out_sndr))
. Letout_rcvr
be … as before …
[ Editor's note: Change 33.9.12.8 [exec.on] paragraphs 3-8 as follows: ]
Otherwise, if
decltype((sndr))
satisfiessender
, the expressionon(sch, sndr)
is expression-equivalent to:.
make-sender
(on, sch, sndr)transform_sender(
query-with-default
(get_domain, sch, default_domain()),make-sender
(on, sch, sndr))except that
sch
is evaluated only once.For subexpressions
sndr
,sch
, andclosure
, if
(4.1)
decltype((sch))
does not satisfyscheduler
, or(4.2)
decltype((sndr))
does not satisfysender
, or(4.3)
closure
is not a pipeable sender adaptor closure object ([exec.adapt.obj]), the expressionon(sndr, sch, closure)
is ill-formed; otherwise, it is expression-equivalent to:.
make-sender
(on,product-type
{sch, closure}, sndr)transform_sender(
get-domain-early
(sndr),make-sender
(on,product-type
{sch, closure}, sndr))except that
sndr
is evaluated only once.Let
out_sndr
andenv
be subexpressions, letOutSndr
bedecltype((out_sndr))
, and letEnv
bedecltype((env))
. Ifis
sender-for
<OutSndr, on_t>false
, then theexpressionsexpressionon.transform_env(out_sndr, env)
andon.
transform_sender
transform-sender
(out_sndr, env)areis ill-formed.Otherwise: Let
not-a-scheduler
be an unspecified empty class type.
The expression
on.transform_env(out_sndr, env)
has effects equivalent to:auto&& [_, data, _] = out_sndr; if constexpr (scheduler<decltype(data)>) {
JOIN-ENV
(SCHED-ENV
(std::forward_like<OutSndr>(data)),FWD-ENV
(std::forward<Env>(env))); return } else { return std::forward<Env>(env); }
The expression
on.
has effects equivalent to:transform_sender
transform-sender
(out_sndr, env)auto&& [_, data, child] = out_sndr; if constexpr (scheduler<decltype(data)>) { auto orig_sch =
query-with-default
(get_scheduler, env,not-a-scheduler
()); if constexpr (same_as<decltype(orig_sch),not-a-scheduler
>) { returnnot-a-sender
{}; } else { return continues_on( (std::forward_like<OutSndr>(data), std::forward_like<OutSndr>(child)), starts_on::move(orig_sch)); std} } else { auto& [sch, closure] = data; auto orig_sch =query-with-default
( <set_value_t>, get_completion_scheduler(child), get_envquery-with-default
(get_scheduler, env,not-a-scheduler
())); if constexpr (same_as<decltype(orig_sch),not-a-scheduler
>) { returnnot-a-sender
{}; } else { returnwrite_env
continues_on
(continues_on
write_env
( ::forward_like<OutSndr>(closure)( std( continues_on(std::forward_like<OutSndr>(child),SCHED-ENV
(orig_sch)), write_env)), schorig_sch
),
SCHED-ENV
(sch)
SCHED-ENV
(sch)orig_sch
); } }
[ Editor's note: Change 33.9.12.9 [exec.then] paragraph 3 as follows: ]
Otherwise, the expression
is expression-equivalent to
then-cpo
(sndr, f):.
make-sender
(then-cpo
, f, sndr)
get-domain-early
(sndr),make-sender
(then-cpo
, f, sndr)) transform_sender(except that
sndr
is evaluated only once.
[ Editor's note: Change 33.9.12.10 [exec.let] paragraphs 2-4 as follows: ]
For
let_value
,let_error
, andlet_stopped
, letset-cpo
beset_value
,set_error
, andset_stopped
, respectively. Let the expressionlet-cpo
be one oflet_value
,let_error
, orlet_stopped
. For a subexpressionsndr
, letbe expression-equivalent to the first well-formed expression below:
let-env
(sndr)
- (2.1)
SCHED-ENV
(get_completion_scheduler
<decayed-typeof
<set-cpo
>>(get_env(sndr)))
- (2.2)
MAKE-ENV
(get_domain, get_domain(get_env(sndr)))
- (2.3)
(void(sndr), env<>{})
The names
let_value
,let_error
, andlet_stopped
denote … as before …Otherwise, the expression
is expression-equivalent to
let-cpo
(sndr, f):.
make-sender
(let-cpo
, f, sndr)
get-domain-early
(sndr),make-sender
(let-cpo
, f, sndr)) transform_sender(except that
sndr
is evaluated only once.
[ Editor's note: Change 33.9.12.11 [exec.bulk] paragraphs 3 and 4 and insert paragraphs 5 and 6 as follows: ]
Otherwise, the expression
is expression-equivalent to:
bulk-algo
(sndr, policy, shape, f)
transform_sender(
get-domain-early
(sndr),make-sender
(bulk-algo
,product-type
<see below
, Shape, Func>{policy, shape, f}, sndr))
except thatThe first template argument ofsndr
is evaluated only once.product-type
isPolicy
ifPolicy
modelscopy_constructible
, andconst Policy&
otherwise.Let
sndr
andbe an expression such thatenv
be subexpressionsSndr
isdecltype((sndr))
. Ifis
sender-for
<Sndr, bulk_t>false
, then the expressionbulk.transform_sender(sndr, env)
is ill-formed; otherwise, it is equivalent to:
as-bulk-chunked
(sndr)auto [_, data, child] = sndr; auto& [policy, shape, f] = data; auto new_f = [func = std::move(f)](Shape begin, Shape end, auto&&... vs) noexcept(noexcept(f(begin, vs...))) { while (begin != end) (begin++, vs...); func} return bulk_chunked(std::move(child), policy, shape, std::move(new_f));
[ Note: This causes thebulk(sndr, policy, shape, f)
sender to be expressed in terms ofbulk_chunked(sndr, policy, shape, f)
when it is connected to a receiverwhose execution domain does not customize. — end note ]bulk
Let
sndr
andenv
be subexpressions, letSndr
bedecltype((sndr))
, and letsch
be expression-equivalent toget_completion_scheduler<set_value_t>(get_env(sndr.
. Ifget
<2>()))is
sender-for
<Sndr,decayed-typeof
<bulk-algo
>>false
, the expressionis ill-formed; otherwise, it is expression-equivalent to:
bulk-algo
.transform-sender
(sndr, env)
[ Editor's note: Change 33.9.12.12 [exec.when.all] as follows: ]
when_all
andwhen_all_with_variant
both … as before …The names
when_all
andwhen_all_with_variant
denote customization point objects. Letsndrs
be a pack of subexpressions,and letSndrs
be a pack of the typesdecltype((sndrs))...
, and let. The expressionsCD
be the typecommon_type_t<decltype(
. Letget-domain-early
(sndrs))...>CD2
beCD
ifCD
is well-formed, anddefault_domain
otherwisewhen_all(sndrs...)
andwhen_all_with_variant(sndrs...)
are ill-formed if any of the following istrue
:The expression
when_all(sndrs...)
is expression-equivalent to:.
make-sender
(when_all, {}, sndrs...)
make-sender
(when_all, {}, sndrs...)) transform_sender(CD2(),The exposition-only class template
impls-for
(33.9.1 [exec.snd.general]) is specialized forwhen_all_t
as follows:namespace std::execution { template<> struct
impls-for
<when_all_t> :default-impls
{static constexpr autostatic constexpr auto
get-attrs
=see below
;get-env
=see below
; static constexpr autoget-state
=see below
; static constexpr autostart
=see below
; static constexpr autocomplete
=see below
; template<class Sndr, class... Env> static consteval voidcheck-types
(); }; }… as before …
- Throws: Any exception thrown as a result of evaluating the Effects
, or an exception of an unspecified type derived from.exception
whenCD
is ill-formed
The member
is initialized with a callable object equivalent to the following lambda expression:
impls-for
<when_all_t>::get-attrs
[](auto&&, auto&&... child) noexcept { if constexpr (same_as<CD, default_domain>) { return env<>(); } else {
MAKE-ENV
(get_domain, CD()); return } }… as before …
The expression
when_all_with_variant(sndrs...)
is expression-equivalent to:.
make-sender
(when_all_with_variant, {}, sndrs...)
make-sender
(when_all_with_variant, {}, sndrs...)); transform_sender(CD2(),Given subexpressions
sndr
andenv
, ifis
sender-for
<decltype((sndr)), when_all_with_variant_t>false
, then the expressionwhen_all_with_variant.
is ill-formed; otherwise, it is equivalent to:transform_sender
transform-sender
(sndr, env)auto&& [_, _, ...child] = sndr; return when_all(into_variant(std::forward_like<decltype((sndr))>(child))...);
[Note 1: This causes the
when_all_with_variant(sndrs...)
sender to becomewhen_all(into_variant(sndrs)...)
when it is connected with a receiverwhose execution domain does not customize. — end note]when_all_with_variant
[ Editor's note: Change 33.9.12.13 [exec.into.variant] paragraph 3 as follows: ]
Otherwise, the expression
into_variant(sndr)
is expression-equivalent to:.
make-sender
(into_variant, {}, sndr)
get-domain-early
(sndr),make-sender
(into_variant, {}, sndr)) transform_sender(except that
sndr
is only evaluated once.
[ Editor's note: Change 33.9.12.14 [exec.stopped.opt] paragraphs 2 and 4 as follows: ]
The name
stopped_as_optional
denotes a pipeable sender adaptor object. For a subexpressionsndr
, letSndr
bedecltype((sndr))
. The expressionstopped_as_optional(sndr)
is expression-equivalent to:.
make-sender
(stopped_as_optional, {}, sndr)
get-domain-early
(sndr),make-sender
(stopped_as_optional, {}, sndr)) transform_sender(except that
sndr
is only evaluated once.The exposition-only class template
impls-for
… as before …Let
sndr
andenv
be subexpressions such thatSndr
isdecltype((sndr))
andEnv
isdecltype((env))
. Ifis
sender-for
<Sndr, stopped_as_optional_t>false
then the expressionstopped_as_optional.
is ill-formed; otherwise, iftransform_sender
transform-sender
(sndr, env)sender_in<
ischild-type
<Sndr>,FWD-ENV-T
(Env)>false
, the expressionstopped_as_optional.
is equivalent totransform_sender
transform-sender
(sndr, env); otherwise, it is equivalent to:
not-a-sender
()auto&& [_, _, child] = sndr; using V =
single-sender-value-type
<child-type
<Sndr>,FWD-ENV-T
(Env)>; return let_stopped( (std::forward_like<Sndr>(child), then[]<class... Ts>(Ts&&... ts) noexcept(is_nothrow_constructible_v<V, Ts...>) { return optional<V>(in_place, std::forward<Ts>(ts)...); }), []() noexcept { return just(optional<V>()); });
[ Editor's note: Change 33.9.12.15 [exec.stopped.err] paragraphs 2 and 3 as follows: ]
The name
stopped_as_error
denotes a pipeable sender adaptor object. For some subexpressionssndr
anderr
, letSndr
bedecltype((sndr))
and letErr
bedecltype((err))
. If the typeSndr
does not satisfysender
or if the typeErr
does not satisfymovable-value
,stopped_as_error(sndr, err)
is ill-formed. Otherwise, the expressionstopped_as_error(sndr)
is expression-equivalent to:.
make-sender
(stopped_as_error, err, sndr)
get-domain-early
(sndr),make-sender
(stopped_as_error, err, sndr)) transform_sender(except that
sndr
is only evaluated once.Let
sndr
andenv
be subexpressions such thatSndr
isdecltype((sndr))
andEnv
isdecltype((env))
. Ifis
sender-for
<Sndr, stopped_as_error_t>false
then the expressionstopped_as_error.
is ill-formed; otherwise, it is equivalent to:transform_sender
transform-sender
(sndr, env)auto&& [_, err, child] = sndr; using E = decltype(auto(err)); return let_stopped( ::forward_like<Sndr>(child), std[err = std::forward_like<Sndr>(err)]() noexcept(is_nothrow_move_constructible_v<E>) { return just_error(std::move(err)); });
[ Editor's note: Change 33.9.12.16 [exec.associate] paragraph 10 as follows: ]
The name
associate
denotes a pipeable sender adaptor object. For subexpressionssndr
andtoken
:
(10.1) If
decltype((sndr))
does not satisfysender
, orremove_cvref_t<decltype((token))>
does not satisfyscope_token
, thenassociate(sndr, token)
is ill-formed.(10.2) Otherwise, the expression
associate(sndr, token)
is expression-equivalent to:.
make-sender
(associate,associate-data
(token, sndr))
get-domain-early
(sndr), transform_sender(make-sender
(associate,associate-data
(token, sndr)))except that
sndr
is evaluated only once.
[ Editor's note: Change 33.9.13.1 [exec.sync.wait] paragraphs 4 and 9 as follows: ]
The name
this_thread::sync_wait
denotes a customization point object. For a subexpressionsndr
, letSndr
bedecltype((sndr))
. The expressionthis_thread::sync_wait(sndr)
is expression-equivalent tothe following, except thatsndr
is evaluated only once:sync_wait.
, whereapply
(sndr)apply
is the exposition-only member function specified below.
get-domain-early
(sndr), sync_wait, sndr) apply_sender(Mandates:
(4.1)
sender_in<Sndr,
is true.sync-wait-env
>(4.2) The type
is well-formed.
sync-wait-result-type
<Sndr>
- (4.3)
same_as<decltype(
ise
),sync-wait-result-type
<Sndr>>true
, wheree
is theapply_sender
expression i>… as before …
For a subexpression
sndr
, letSndr
bedecltype((sndr))
. Ifsender_to<Sndr,
issync-wait-receiver
<Sndr>>false
, the expressionsync_wait.
is ill-formed; otherwise, it is equivalent to:apply_sender
apply
(sndr)
sync-wait-state
<Sndr> state; auto op = connect(sndr,sync-wait-receiver
<Sndr>{&state}); (op); start .loop.run(); stateif (state.error) { (std::move(state.error)); rethrow_exception} return std::move(state.result);
[ Editor's note: Change Note 1 in 33.9.13.1 [exec.sync.wait] paragraph 10.1 as follows: ]
[Note 1: The
defaultimplementation ofsync_wait
achieves forward progress guarantee delegation by providing arun_loop
scheduler via theget_delegation_scheduler
query on thesync-wait-receiver
’s environment. Therun_loop
is driven by the current thread of execution. — end note]
[ Editor's note: Change 33.9.13.2 [exec.sync.wait.var] paragraphs 1 and 2 as follows: ]
The name
this_thread::sync_wait_with_variant
denotes a customization point object. For a subexpressionsndr
, letSndr
bedecltype(into_variant(sndr))
. The expressionthis_thread::sync_wait_with_variant(sndr)
is expression-equivalent tothe following, exceptsndr
is evaluated only once:sync_wait_with_variant.
, whereapply
(sndr)apply
is the exposition-only member function specified below.apply_sender(get-domain-early(sndr), sync_wait_with_variant, sndr)
Mandates:
(1.1)
sender_in<Sndr,
issync-wait-env
>true
.(1.2) The type
is well-formed.
sync-wait-with-variant-result-type
<Sndr>
- (1.3)
same_as<decltype(
ise
),sync-wait-with-variant-result-type
<Sndr>>true
, wheree
is theapply_sender
expression i>The expression
sync_wait_with_variant.
is equivalent to:apply_sender
apply
(sndr)using result_type =
sync-wait-with-variant-result-type
<Sndr>; if (auto opt_value = sync_wait(into_variant(sndr))) { return result_type(std::move(get<0>(*opt_value))); } return result_type(nullopt);
[ Editor's note: Change Note 1 in 33.9.13.1 [exec.sync.wait] paragraph 10.1 as follows: ]
[Note 1: The
defaultimplementation ofsync_wait_with_variant
achieves forward progress guarantee delegation (6.10.2.3 [intro.progress]) by relying on the forward progress guarantee delegation provided bysync_wait
. — end note]
[ Editor's note: Change 33.11.2 [exec.env] as follows: ]

namespace std::execution {
  template<queryable... Envs>
  struct env {
    Envs0 envs0;                    // exposition only
    Envs1 envs1;                    // exposition only
      ⋮
    Envsn-1 envsn-1;                // exposition only

    template<class QueryTag, class... Args>
      constexpr decltype(auto) query(QueryTag q, Args&&... args) const noexcept(see below);
  };

  template<class... Envs>
    env(Envs...) -> env<unwrap_reference_t<Envs>...>;
}

- The class template env is used to construct a queryable object from several queryable objects. Query invocations on the resulting object are resolved by attempting to query each subobject in lexical order.

… as before …

template<class QueryTag, class... Args>
  constexpr decltype(auto) query(QueryTag q, Args&&... args) const noexcept(see below);

Let has-query be the following exposition-only concept:

template<class Env, class QueryTag, class... Args>
  concept has-query =               // exposition only
    requires (const Env& env, Args&&... args) {
      env.query(QueryTag(), std::forward<Args>(args)...);
    };

Let fe be the first element of envs0, envs1, … envsn-1 such that the expression fe.query(q, std::forward<Args>(args)...) is well-formed.

Constraints: (has-query<Envs, QueryTag, Args...> || ...) is true.

Effects: Equivalent to: return fe.query(q, std::forward<Args>(args)...);

Remarks: The expression in the noexcept clause is equivalent to noexcept(fe.query(q, std::forward<Args>(args)...)).
[ Editor's note: In 33.12.1.2 [exec.run.loop.types], add a new paragraph after paragraph 4 as follows: ]

- Let sch be an expression of type run-loop-scheduler. The expression schedule(sch) has type run-loop-sender and is not potentially-throwing if sch is not potentially-throwing.

- For type set-tag other than set_error_t, the expression get_completion_scheduler<set-tag>(get_env(schedule(sch))) == sch evaluates to true.
[ Editor's note: Change 33.13.3 [exec.affine.on] paragraph 3 as follows: ]

Otherwise, the expression affine_on(sndr, sch) is expression-equivalent to~~:~~ make-sender(affine_on, sch, sndr) ~~transform_sender(get-domain-early(sndr), make-sender(affine_on, sch, sndr))~~ except that sndr is evaluated only once.
[ Editor's note: Change paragraph 3 of 33.13.4 [exec.inline.scheduler] as follows: ]

Let sndr be an expression of type inline-sender, let rcvr be an expression such that receiver_of<decltype((rcvr)), CS> is true where CS is completion_signatures<set_value_t()>, then: [ Editor's note: Move the text of (3.1) below into this paragraph. ]

(3.1) the expression connect(sndr, rcvr) has type inline-state<remove_cvref_t<decltype((rcvr))>> and is potentially-throwing if and only if ((void)sndr, auto(rcvr)) is potentially-throwing~~, and~~.

(3.2) the expression get_completion_scheduler<set_value_t>(get_env(sndr)) has type inline_scheduler and is potentially-throwing if and only if get_env(sndr) is potentially-throwing.
[ Editor's note: Change 33.13.5 [exec.task.scheduler] as follows: ]

namespace std::execution {
  class task_scheduler {
    class ts-sender;                               // exposition only
    template<receiver R>
      class state;                                 // exposition only
    template<class Sch>
      class backend-for;                           // exposition only

  public:
    using scheduler_concept = scheduler_t;

    template<class Sch, class Allocator = allocator<void>>
      requires (!same_as<task_scheduler, remove_cvref_t<Sch>>) && scheduler<Sch>
    explicit task_scheduler(Sch&& sch, Allocator alloc = {});

    ~~ts-sender~~ see below schedule();

    template <class Sndr, class Env>               // exposition only
      see below bulk-transform(Sndr&& sndr, const Env& env);

    friend bool operator==(const task_scheduler& lhs, const task_scheduler& rhs) noexcept;
    template<class Sch>
      requires (!same_as<task_scheduler, Sch>) && scheduler<Sch>
    friend bool operator==(const task_scheduler& lhs, const Sch& rhs) noexcept;

  private:
    shared_ptr<~~void~~ parallel_scheduler_backend> sch_;    // exposition only, see [exec.sysctxrepl.psb]
  };
}

task_scheduler is a class that models scheduler (33.6 [exec.sched]). Given an object s of type task_scheduler, let SCHED(s) be the sched_ member of the object owned by s.sch_.

- For an lvalue r of type derived from receiver_proxy, let WRAP-RCVR(r) be an object of a type that models receiver and whose completion handlers result in invoking the corresponding completion handlers of r.

template<class Sch>
struct backend-for : parallel_scheduler_backend {  // exposition only
  explicit backend-for(Sch sch) : sched_(std::move(sch)) {}

  void schedule(receiver_proxy& r, span<byte> s) noexcept override;
  void schedule_bulk_chunked(size_t shape, bulk_item_receiver_proxy& r, span<byte> s) noexcept override;
  void schedule_bulk_unchunked(size_t shape, bulk_item_receiver_proxy& r, span<byte> s) noexcept override;

  Sch sched_;                                      // exposition only
};

- Let sndr be a sender whose only value completion signature is set_value_t() and for which the expression get_completion_scheduler<set_value_t>(get_env(sndr)) == sched_ is true.

void schedule(receiver_proxy& r, span<byte> s) noexcept override;

- Effects: Constructs an operation state os with connect(schedule(sched_), WRAP-RCVR(r)) and calls start(os).

void schedule_bulk_chunked(size_t shape, bulk_item_receiver_proxy& r, span<byte> s) noexcept override;

- Effects: Let chunk_size be an integer less than or equal to shape, let num_chunks be (shape + chunk_size - 1) / chunk_size, and let fn be a function object such that for an integer i, fn(i) calls r.execute(i * chunk_size, m), where m is the lesser of (i + 1) * chunk_size and shape. Constructs an operation state os as if with connect(bulk(sndr, par, num_chunks, fn), WRAP-RCVR(r)) and calls start(os).

void schedule_bulk_unchunked(size_t shape, bulk_item_receiver_proxy& r, span<byte> s) noexcept override;

- Effects: Let fn be a function object such that for an integer i, fn(i) is equivalent to r.execute(i, i + 1). Constructs an operation state os as if with connect(bulk(sndr, par, shape, fn), WRAP-RCVR(r)) and calls start(os).

template<class Sch, class Allocator = allocator<void>>
  requires (!same_as<task_scheduler, remove_cvref_t<Sch>>) && scheduler<Sch>
explicit task_scheduler(Sch&& sch, Allocator alloc = {});

- Effects: Initialize sch_ with allocate_shared<backend-for<remove_cvref_t<Sch>>>(alloc, std::forward<Sch>(sch)).

[ Editor's note: Paragraphs 3-7 are kept unmodified. Remove paragraphs 8-12 and add the following paragraphs: ]
see below schedule();

- Returns: a prvalue sndr whose type Sndr models sender such that:

  (8.1) get_completion_scheduler<set_value_t>(get_env(sndr)) is equal to *this.

  (8.2) If a receiver rcvr is connected to sndr and the resulting operation state is started, calls sch_->schedule(r, s), where

    (8.2.1) r is a proxy for rcvr with base system_context_replaceability::receiver_proxy (33.15 [exec.par.scheduler]) and

    (8.2.2) s is a preallocated backend storage for r.

template <class BulkSndr, class Env>               // exposition only
  see below bulk-transform(BulkSndr&& bulk_sndr, const Env& env);

- Constraints: sender_in<BulkSndr, Env> is true and either sender-for<BulkSndr, bulk_chunked_t> or sender-for<BulkSndr, bulk_unchunked_t> is true.

- Returns: a prvalue sndr whose type models sender such that:

  (10.1) get_completion_scheduler<set_value_t>(get_env(sndr)) is equal to *this.

  (10.2) bulk_sndr is connected to an unspecified receiver if a receiver rcvr is connected to sndr. If the resulting operation state is started,

    (10.2.1) If bulk_sndr completes with values vals, let args be a pack of lvalue subexpressions designating objects decay-copied from vals. Then

      (10.2.1.1) If bulk_sndr is the result of calling bulk_chunked(child, policy, shape, f), sch_->schedule_bulk_chunked(shape, r, s) is called where r is a bulk chunked proxy for rcvr with callable f and arguments args, and s is a preallocated backend storage for r.

      (10.2.1.2) Otherwise, bulk_sndr is the result of calling bulk_unchunked(child, policy, shape, f). Calls sch_->schedule_bulk_unchunked(shape, r, s) where r is a bulk unchunked proxy for rcvr with callable f and arguments args, and s is a preallocated backend storage for r.

    (10.2.2) All other completion operations are forwarded unchanged.
[ Editor's note: In 33.15 [exec.par.scheduler], add a new paragraph after paragraph 3, another before paragraph 10, and change paragraphs 10 and 11 as follows: ]

- The expression get_forward_progress_guarantee(sch) returns forward_progress_guarantee::parallel.

?. The expression get_completion_scheduler<set_value_t>(get_env(schedule(sch))) == sch evaluates to true.

… as before …

?. Let sch be a subexpression of type parallel_scheduler. For subexpressions sndr and env, if tag_of_t<Sndr> is neither bulk_chunked_t nor bulk_unchunked_t, the expression sch.bulk-transform(sndr, env) is ill-formed; otherwise, let child, pol, shape, and f be subexpressions equal to the arguments used to create sndr.

When the tag type of ~~parallel_scheduler provides a customized implementation of the bulk_chunked algorithm (33.9.12.11 [exec.bulk]). If a receiver rcvr is connected to the sender returned by bulk_chunked(sndr, pol, shape, f)~~ sndr is bulk_chunked_t, the expression sch.bulk-transform(sndr, env) returns a sender such that if it is connected to a receiver rcvr and the resulting operation state is started, then:

(10.1) If ~~sndr~~ child completes with values vals, let args be a pack of lvalue subexpressions designating vals, then b.schedule_bulk_chunked(shape, r, s) is called, where

(10.2) All other completion operations are forwarded unchanged.

[ Note: Customizing the behavior of bulk_chunked affects the ~~default~~ implementation of bulk. — end note ]

When the tag type of ~~parallel_scheduler provides a customized implementation of the bulk_unchunked algorithm (33.9.12.11 [exec.bulk]). If a receiver rcvr is connected to the sender returned by bulk_unchunked(sndr, pol, shape, f)~~ sndr is bulk_unchunked_t, the expression sch.bulk-transform(sndr, env) returns a sender such that if it is connected to a receiver rcvr and the resulting operation state is started, then:
Our willingness to remove algorithm customization depends on our confidence that we can add it back later without breaking code. The section "Restoring algorithm customization in C++29" describes how we would go about this; this appendix fleshes out some of the details.
A sender expression represents a task graph, the nodes of which are asynchronous operations. Every async operation is started on some execution context, the starting context, and completes on some execution context, the completing context. The two might be the same, but that is beside the point.
Note: This is a simplification. Some senders like when_all can complete on one of several contexts. We solve that problem with domains as described below.
Imagine we assign each execution resource a color. The mission then is to paint every node in the task graph with the colors of its starting and completing contexts. Once we know where each operation will start and complete, we can use that information to pick the right algorithm implementation.
With regard to customization, each color can be thought of as representing not an individual execution resource, but rather a set of algorithm implementations. Two different execution resources might use the same set of algorithm implementations, so they would have the same "color". In fact, most execution resources will use the default set of algorithm implementations, in which case they all have the same color.
That’s not always the case though. A thread pool would not want to
use the default implementation of
bulk
for example – that would be
serial. The thread pool would have a different color corresponding to
its set of preferred algorithm implementations.
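To make the contrast concrete, here is a rough sketch (not the normative wording) of what the uncustomized bulk amounts to; the names serial_bulk, shape, and f below are purely illustrative:

// Roughly what the default, uncustomized bulk does: invoke the function for
// each index in [0, shape) in a plain loop, on whatever context the child
// sender completed on. A thread pool would rather partition this range
// across its worker threads.
auto serial_bulk = [](auto shape, auto f, auto&&... vals) {
  for (decltype(shape) i = 0; i < shape; ++i) {
    f(i, vals...);
  }
};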
In std::execution today, this notion of color is called a "domain". A domain is a tag type that is used to select a set of algorithm implementations. Schedulers, which are stand-ins for execution resources, advertise their domain with the get_domain query.
Completing the mission requires two things:
Identifying the starting and completing domain of every operation in the task graph, and
Using that information to select the preferred implementation for the algorithm that operation represents.
Let’s take these two separately.
So-called “early” customization, which determines the return type of
then(sndr, fn)
for example, is predicated on the fact that senders know the domain on
which they will complete. As discussed above, that’s false. Many senders
only know where they will complete once they know where they will start,
which isn’t known until the sender is connected to a receiver.
So early customization is irreparably broken. There is no plan to add it back.
That leaves late customization, which is performed by the connect customization point. The receiver, which is an extension of the caller, knows where the operation will start. If the sender is given this information – that is, if the sender is told where it will start – it can accurately report where it will complete. This is the key insight.
When
connect
queries a sender’s attributes for its domain, it should pass the
receiver’s environment. That way a sender has all the information
available when computing its completion domain.
get_completion_domain
It is sometimes the case that a sender’s value and error completions can happen on different domains. For example, imagine trying to schedule work on a GPU. If it succeeds, you are in the GPU domain, and Bob’s your uncle. If scheduling fails, however, the error cannot be reported on the GPU because we failed to make it there!
So asking a sender for a singular completion domain is not flexible
enough. We have three separate queries for a sender’s completion
scheduler: get_completion_scheduler<set_[value|error|stopped]_t>
.
Similarly, we should have three separate queries for a sender’s
completion domain: get_completion_domain<set_[value|error|stopped]_t>
.
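Here is a hedged sketch of how the proposed queries might be used for the GPU-scheduling example above; gpu is the scheduler from the motivating example, env stands for the receiver's environment, and ex aliases std::execution as in the earlier examples:

// The value and error channels of a schedule(gpu) sender can report different
// domains: success completes on the GPU, but a scheduling failure is reported
// back where the operation was started.
auto sndr  = ex::schedule(gpu);
auto attrs = ex::get_env(sndr);

auto value_dom = ex::get_completion_domain<ex::set_value_t>(attrs, env);   // the GPU's domain
auto error_dom = ex::get_completion_domain<ex::set_error_t>(attrs, env);   // the starting domain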
Note: If we have the
get_completion_scheduler
queries,
why do we need
get_completion_domain
? We can ask
the completion scheduler for its domain, right? The answer is that a
sender like when_all(s1, s2)
doesn’t know what scheduler it will complete on. It completes on the
context of whichever sender, s1
or
s2
, finishes last. But if
s1
and
s2
have the same completion
domain, it doesn’t matter that we do not know the completion
scheduler. The domain determines the preferred set of algorithm
implementations. Hence we need separate queries for the completion
domain. (Additionally, when_all
must
require that all of its child senders share a common domain.)
The addition of the completion domain queries creates a nice symmetry as shown in the table below (with additions in green):
| | Receiver | Sender |
|---|---|---|
| Query for scheduler | get_scheduler | get_completion_scheduler<set_value_t>, get_completion_scheduler<set_error_t>, get_completion_scheduler<set_stopped_t> |
| Query for domain | get_domain | get_completion_domain<set_value_t>, get_completion_domain<set_error_t>, get_completion_domain<set_stopped_t> |
For a sender sndr
and an
environment env
, we can get the
sender’s completion domain as follows:
auto completion_domain = get_completion_domain<set_value_t>(get_env(sndr), env);
A sender like
just()
would
implement this query as follows:
template <class... Values>
class just_sender {
private:
  struct attrs {
    template <class Env>
    auto query(get_completion_domain_t<set_value_t>, const Env& env) const noexcept {
      // just(...) completes where it starts. the domain of the environment is where
      // the sender will start, so return that.
      return get_domain(env);
    }
    //...
  };

public:
  attrs get_env() const noexcept {
    return attrs{};
  }
  //...
};
Note: A query that accepts an additional argument is
novel in std::execution
,
but the query system was designed to support this usage. See
33.2.2
[exec.queryable.concept].
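As a sketch of the mechanics only (not proposed wording), a get_completion_domain query object could simply forward the extra environment argument through the queryable protocol:

// Illustrative definition of the query object: it invokes the queryable's
// query member, passing the receiver's environment along as an extra argument.
template <class Tag>
struct get_completion_domain_t {
  template <class Attrs, class Env>
  constexpr decltype(auto) operator()(const Attrs& attrs, const Env& env) const
    noexcept(noexcept(attrs.query(*this, env)))
  {
    return attrs.query(*this, env);
  }
};

template <class Tag>
inline constexpr get_completion_domain_t<Tag> get_completion_domain{};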
connect
With the addition of the get_completion_domain<...>
queries that can accept the receiver’s environment,
connect
can
now “paint” the operation with its starting and completing colors, aka
domains. When passed arguments sndr
and rcvr
, the starting domain
is:
// Get the operation's starting domain: auto starting_domain = get_domain(get_env(rcvr));
To get the completion domain (when the operation completes successfully):
// Get the operation's completion domain for the value channel: auto completion_domain = get_completion_domain<set_value_t>(get_env(sndr), get_env(rcvr));
Now
connect
has
all the information it needs to select the correct algorithm
implementation. Great!
But this presents the connect function with a dilemma: how does it use two domains to pick one algorithm implementation?
Consider that the starting domain might want a say in how
start
works, and the completing
domain might want a say in how
set_value
works. So should we let
the starting domain customize start
and the completing domain customize
set_value
?
No. start
and
set_value
are bookends around an
async operation; they must match. Often
set_value
needs state that is set up
in start
. Customizing the two
independently is madness.
Note: The following is more speculative than what has been described so far.
A possible solution I have been exploring is to bring back sender transforms. Each domain can apply its transform in turn. I do not yet have reason to believe the order matters, but it is important that when asked to transform a sender, a domain knows whether it is the “starting” domain or the “completing” domain.
Here is how a domain might customize
bulk
when it is the completing
domain:
struct thread_pool_domain {
  template <sender-for<bulk_t> Sndr, class Env>
  auto transform_sender(set_value_t, Sndr&& sndr, const Env& env) const {
    //...
  }
};
Since it has set_value_t
as its
first argument, this transform is only applied when
thread_pool_domain
is an operation’s
completion domain. Had the first argument been
start_t
, the transform would only be
used when thread_pool_domain
is a
starting domain.
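For contrast, here is a hedged sketch of the same domain registering a transform that applies only when it is the starting domain; the choice of starts_on senders as the target is purely illustrative:

struct thread_pool_domain {
  // completing-domain transform, as shown above:
  template <sender-for<bulk_t> Sndr, class Env>
  auto transform_sender(set_value_t, Sndr&& sndr, const Env& env) const;

  // starting-domain transform: applied only when the operation starts on the pool.
  template <sender-for<starts_on_t> Sndr, class Env>
  auto transform_sender(start_t, Sndr&& sndr, const Env& env) const {
    // e.g. substitute a pool-specific way of launching the child sender
    // ...
  }
};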
transform_sender
In this reimagined customization design, the
connect
CPO
does a few things:
Determines the starting and completing domains,
Applies the completing domain’s transform (if any),
Applies the starting domain’s transform (if any) to the resulting sender,
Connects the twice-transformed sender to the receiver.
The first three steps are doing something different than connecting a
sender and receiver, so it makes sense to factor them out into their own
utility. I call it transform_sender
here, but it does not need to be normative since only
connect
will
call it.
The new transform_sender
looks
like this:
template <class Domain, class Tag, class Sndr, class Env>
concept has-sender-transform-for =        // exposition only
  requires (Sndr(*make_sndr)(), const Env env) {
    Domain().transform_sender(Tag(), make_sndr(), env);
  };

template <class Domain, class Tag>
constexpr auto transform-sender-recurse = overload-set{
  []<class Self, class Sndr, class Env>(this Self self, Sndr&& sndr, const Env& env) -> decltype(auto)
    requires has-sender-transform-for<Domain, Tag, Sndr, Env>
  {
    return self(Domain().transform_sender(Tag(), std::forward<Sndr>(sndr), env), env);
  },
  []<class Sndr, class Env>(Sndr&& sndr, const Env&) -> Sndr {
    return std::forward<Sndr>(sndr);
  }
};

template <class Sndr, class Env>
auto transform_sender(Sndr&& sndr, const Env& env) {
  auto starting_domain   = get_domain(env);
  auto completing_domain = get_completion_domain<set_value_t>(get_env(sndr), env);

  auto starting_transform   = transform-sender-recurse<decltype(starting_domain), start_t>;
  auto completing_transform = transform-sender-recurse<decltype(completing_domain), set_value_t>;

  return starting_transform(completing_transform(std::forward<Sndr>(sndr), env), env);
}
With this definition of
transform_sender
, connect(sndr, rcvr)
is equivalent to transform_sender(sndr, get_env(rcvr)).connect(rcvr)
.
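A minimal sketch of that layering, assuming senders expose a connect member as in the expression above; this is not proposed wording:

// Just the shape of the new connect: determine the domains and apply both
// transforms (via transform_sender above), then connect the twice-transformed
// sender to the receiver.
template <class Sndr, class Rcvr>
auto connect(Sndr&& sndr, Rcvr&& rcvr) {
  return transform_sender(std::forward<Sndr>(sndr), ex::get_env(rcvr))
           .connect(std::forward<Rcvr>(rcvr));
}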
Let’s see how this new approach addresses the problems noted in the motivating example above. The troublesome code is:
namespace ex = std::execution;
auto sndr = ex::starts_on(gpu, ex::just()) | ex::then(fn);
std::this_thread::sync_wait(std::move(sndr));
The section "The problem with P3718" describes how the current design and the "fixed" one proposed in [P3718R0] go off the rails while determining the domain in which the function fn will execute, causing a CPU implementation of then to be used instead of a GPU one.
In the new design, when the then
sender is being connected to
sync_wait
’s receiver, the starting
domain will still be the
default_domain
, but when asking the
sender where it will complete, the answer will be different. Let’s see
how:
When asked for its completion domain, the
then
sender will ask the
starts_on
sender where it will
complete, as if by:
auto&& tmp1 = ex::starts_on(gpu, ex::just());
auto dom1 = ex::get_completion_domain<ex::set_value_t>(ex::get_env(tmp1), ex::get_env(rcvr));
In turn, the starts_on
sender
asks the
just()
sender where it will complete, telling it where it will start.
(This is the new bit.) It looks like:
auto&& tmp2 = ex::just();

// ask for the gpu scheduler's domain:
auto gpu-dom = ex::get_completion_domain<ex::set_value_t>(gpu);

// construct an env that reflects the fact that tmp2 will be started on the gpu:
auto env2 = ex::env{ex::prop{ex::get_scheduler, gpu},
                    ex::prop{ex::get_domain, gpu-dom},
                    ex::get_env(rcvr)};

// pass the new env when asking `just()` for its completion domain:
auto dom2 = ex::get_completion_domain<ex::set_value_t>(ex::get_env(tmp2), env2);
The just() sender, when asked where it will complete, will respond with the domain on which it is started. That information is provided by the env2 environment passed to the query: get_domain(env2). That will return gpu-dom.
Having correctly determined that the
then
sender will start on the
default domain and complete on the GPU domain,
connect
can
select the right implementation for the
then
algorithm. It does that by
calling:
return ex::transform_sender(sndr, ex::get_env(rcvr)).connect(rcvr);
The transform_sender
call will
execute the following (simplified):
ex::default_domain().transform_sender(ex::start,
    gpu-dom.transform_sender(ex::set_value, sndr, ex::get_env(rcvr)),
    ex::get_env(rcvr))
The default_domain
does not apply
any transformation to then
senders,
so this expression reduces to:
gpu-dom.transform_sender(ex::set_value, sndr, ex::get_env(rcvr))
So, in the new customization scheme, the GPU domain gets a crack at
transforming the then
sender before
it is connected to a receiver, as it should.