Skip to content

Implement P3149R11 and P3815R1#1713

Draft
ispeters wants to merge 8 commits intoNVIDIA:mainfrom
ispeters:p3149_plus_p3815
Draft

Implement P3149R11 and P3815R1#1713
ispeters wants to merge 8 commits intoNVIDIA:mainfrom
ispeters:p3149_plus_p3815

Conversation

@ispeters
Copy link

@ispeters ispeters commented Dec 14, 2025

This PR will implement P3149R11 patched with P3815R1.

  • [exec.scope.concepts]
    • concept scope_association
    • concept scope_token
  • [exec.associate]
    • associate algorithm
  • [exec.stop.when]
    • stop-when algorithm
  • [exec.spawn]
    • spawn algorithm
  • [exec.spawn.future]
    • spawn_future algorithm
  • [exec.counting.scopes]
    • simple_counting_scope
    • counting_scope

@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 14, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ericniebler
Copy link
Collaborator

/ok to test fd9d04b

@ispeters
Copy link
Author

ispeters commented Dec 16, 2025

The latest push includes an implementation of stdexec::associate with some tests, but the tests are incomplete so I haven't marked associate as done in the summary.

I have also tried to address the build failures on the previous iteration, but I don't have a local test environment for the failing build types so I haven't tested the fixes locally, other than to confirm that they still build with my local Clang (i.e. Homebrew clang version 21.1.7).

@ispeters
Copy link
Author

Rebased on #1717.

@ispeters ispeters force-pushed the p3149_plus_p3815 branch 2 times, most recently from 96d4661 to f786c00 Compare December 22, 2025 06:42
@ericniebler
Copy link
Collaborator

/ok to test f786c00

@ispeters ispeters force-pushed the p3149_plus_p3815 branch 3 times, most recently from 67262ee to 97d79ee Compare December 23, 2025 21:08
@ericniebler
Copy link
Collaborator

/ok to test 97d79ee

@ispeters
Copy link
Author

I managed to get GCC 12 installed on my Mac and it repro'd the ICE and the build failure in test_stop_when.cpp; I think I've addressed the build failures in GCC, nvcc, and MSVC.

@ispeters ispeters force-pushed the p3149_plus_p3815 branch 2 times, most recently from 52e1755 to 521d849 Compare December 24, 2025 08:06
@ispeters
Copy link
Author

simple_counting_scope still needs more tests, but I did discover that [exec.simple.counting.mem], paragraph 9 has a bug: we need to move the scope to the joined state and return true if count is zero, regardless of the value of state.

@ericniebler
Copy link
Collaborator

/ok to test cc867f4

@ispeters ispeters force-pushed the p3149_plus_p3815 branch 2 times, most recently from a1dd153 to 9263dab Compare December 25, 2025 07:32
@ispeters
Copy link
Author

I think I've fixed the build breaks in old gcc, nvc++, and MSVC. Getting a repro of the gcc 11 builds was quite a pain—I can't figure out how to get gcc 11 to run properly on macOS Tahoe so I had to use container, which I learned about here. Fixing MSVC is a matter of guessing. I can't run nvc++ on my Mac, but it turns out it's available on Godbolt, and this was very helpful.

I've updated __counting_scopes.hpp rather extensively; I renamed some of the variables in simple_counting_scope to make the naming more consistent, and I've thoroughly documented both the paragraphs in the standard being implemented and my justifications for weaker-than-seq_cst memory orderings.

@ispeters
Copy link
Author

Pull the guts of simple_counting_scope into a __base_scope and implement both simple_counting_scope and counting_scope in terms of it.

@ispeters
Copy link
Author

Do some template hoisting and renaming, and replace a single virtual function with a function pointer to cut down on RTTI.

@ispeters
Copy link
Author

Add spawn_future and rebase on #1682.

@ericniebler
Copy link
Collaborator

/ok to test 4717c70

@ispeters
Copy link
Author

I think I've fixed both the gcc build errors and the memory leak. The GPUs seems to be on holiday, though—I haven't seen any CI results from the GPU jobs in a while.

@ericniebler
Copy link
Collaborator

/ok to test 6596a20

@ericniebler
Copy link
Collaborator

/ok to test 7f04df1

@ispeters ispeters force-pushed the p3149_plus_p3815 branch 2 times, most recently from 0ece92d to a7c40ee Compare January 1, 2026 08:47
@ispeters
Copy link
Author

ispeters commented Jan 1, 2026

  • Rebase on a bunch of recent commits.
  • Incorporate @ericniebler's "add missing include" change into 401944f and update authorship attribution
  • Simplify the implementation of forwarding stop requests by only invoking __abandon if the operation is never started
  • Still haven't addressed the ASAN error, though

@ericniebler
Copy link
Collaborator

/ok to test a7c40ee

@ericniebler
Copy link
Collaborator

/ok to test 3c160e1

@ericniebler
Copy link
Collaborator

/ok to test c5ee582

@ccotter ccotter mentioned this pull request Jan 6, 2026
ispeters and others added 7 commits January 18, 2026 18:58
This diff adds a definition for `concept stdexec::scope_association`
plus tests confirming it accepts and rejects the expected things.
This diff adds a definition for `concept stdexec::scope_token` plus
tests confirming it accepts and rejects the expected things.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
This diff defines `stdexec::associate` and adds some initial tests to
confirm it works properly. Still a work in progress.
This diff defines `stdexec::__stop_when` as the implementation of
_`stop-when`_ and adds tests to validate the algorithm.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
This diff adds `stdexec::spawn` and its tests.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
This diff adds `stdexec::simple_counting_scope` and
`stdexec::counting_scope` plus tests for both. The tests found a spec
bug that's been filed as an LWG issue.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
This diff adds `stdexec::spawn_future` and some basic tests.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
@ispeters
Copy link
Author

I just force-pushed after rebasing on main.

This diff probably implements the intended design that stop requests
sent to a started future-sender get forwarded to the spawned work. The
existing tests still pass, but I don't have new tests to confirm the new
behaviour.

Co-authored-by: Eric Niebler <eniebler@nvidia.com>
@ispeters
Copy link
Author

I've re-jiggered the implementation of stop request handling in spawn_future; I need to hand over the computer to my kid now, so I haven't had a chance to re-review my own code like I usually do, but I think this cleans up the implementation some, at the cost of another virtual function on __future_state_base.

Copy link
Collaborator

@ericniebler ericniebler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is looking amazing. minor comments only. lmk if you want a hand with merge conflicts.

// but the above code ICEs gcc 11 and 12 (and maybe MSVC)
// so we declare a named callable
struct __deleter {
constexpr void operator()(__wrap_sender_t* p) const noexcept {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all identifier names need to be uglified.

Suggested change
constexpr void operator()(__wrap_sender_t* p) const noexcept {
constexpr void operator()(__wrap_sender_t* __p) const noexcept {

Comment on lines +97 to +98
__associate_data(__associate_data&& __other)
noexcept(__nothrow_move_constructible<__wrap_sender_t>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move operations should all be noexcept unconditionally. you can then static_assert that __nothrow_move_constructible<__wrap_sender_t> is true.

}
};

using __sender_ref = std::unique_ptr<__wrap_sender_t, __deleter>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using a unique_ptr for this seems like overkill. my preference would be to use the __scope_guard in __detail/__scope.hpp instead. something like:

__scope_guard __g{std::destroy_at<_Sender>, &__sndr};

}

std::pair<__assoc_t, __sender_ref> release() && noexcept {
__sender_ref u(__assoc_ ? std::addressof(__sndr_) : nullptr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think you can pass nullptr to std::destroy_at the way you can with delete.

}

private:
__associate_data(std::pair<__assoc_t, __sender_ref> __parts)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
__associate_data(std::pair<__assoc_t, __sender_ref> __parts)
explicit __associate_data(std::pair<__assoc_t, __sender_ref> __parts)

// to avoid invoking request_stop when it's unnecessary. It might be.
//
// We *must* invoke request_stop before returning from the function if our CAS
// succeeds (because we arrived before either __complete or __coneume), or if the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// succeeds (because we arrived before either __complete or __coneume), or if the
// succeeds (because we arrived before either __complete or __consume), or if the

void __do_consume(auto& __rcvr) noexcept {
using __variant_t = decltype(this->__result_);

__variant_t::visit(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
__variant_t::visit(
__visit(

Comment on lines +611 to +624
auto* op = traits::allocate(alloc, 1);

try {
traits::construct(
alloc,
op,
alloc,
static_cast<_Sender&&>(__sndr),
static_cast<_Token&&>(__tkn),
__spawn_common::__choose_senv(__env, get_env(__sndr)));
} catch (...) {
traits::deallocate(alloc, op, 1);
throw;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

god i hate having to deal with allocators. scope guard here instead of try/catch.

// __data_of<_Sender> is a unique_ptr specialization
using __unique_ptr_t = __data_of<std::remove_cvref_t<_Sender>>;

__future_operation_base(__unique_ptr_t&& __future) noexcept
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit

Comment on lines +158 to +163
return __apply(
[&__rcvr](__ignore, auto&& token, __ignore) noexcept {
return __make_token_fn{}(
static_cast<decltype(token)>(token), get_stop_token(STDEXEC::get_env(__rcvr)));
},
static_cast<_Self&&>(__self));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this can now be (untested):

auto &[__tag, __token, __child] = __self;
return __make_token_fn{}(__forward_like<_Self>(__token),
                         get_stop_token(STDEXEC::get_env(__rcvr)));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants