gh-144356: Avoid races when computing set_iterator.__length_hint__ under no-gil #144357

Open
hyongtao-code wants to merge 12 commits into python:main from hyongtao-code:fix-data-race

Conversation

Contributor

@hyongtao-code hyongtao-code commented Jan 31, 2026

Long log:

setiter_len() was reading so->used without atomic access while concurrent
mutations update it atomically under Py_GIL_DISABLED.

In free-threaded builds, setiter_len() could race with concurrent set
mutation and iterator exhaustion.

Use an atomic load for so->used to avoid a data race. This preserves the
existing semantics of __length_hint__ while making the access thread-safe.

Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
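
For context, here is a small sketch of the `__length_hint__` semantics that the description says are preserved, as observed on a set iterator in current CPython: the hint counts down as items are consumed and drops to 0 once the set's size no longer matches the size recorded when the iterator was created. The helper name `hint_trace` is made up for this illustration and is not part of the PR.

```python
import operator


def hint_trace():
    """Record a set iterator's length hint at three points in its life."""
    s = {1, 2, 3}
    it = iter(s)
    trace = [operator.length_hint(it)]      # fresh iterator: every item remains
    next(it)
    trace.append(operator.length_hint(it))  # one item has been consumed
    s.add(4)                                # size now differs from what the iterator recorded
    trace.append(operator.length_hint(it))  # the hint falls back to 0
    return trace


print(hint_trace())
```

The race discussed in this PR concerns the comparison against the set's current size, which is the step that reads `so->used`.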

@hyongtao-code hyongtao-code changed the title from "gh-144356: fix data race in setiter_len() under no-gil" to "gh-144356: Avoid races when computing set_iterator.__length_hint__ under no-gil" Feb 1, 2026
@hyongtao-code
Contributor Author

Thanks for the review. I’ve decided to address both issues in this PR. I also added a corresponding test case for the issue you pointed out.

setiterobject *si = (setiterobject*)op;
Py_ssize_t len = 0;
if (si->si_set != NULL && si->si_used == si->si_set->used)
#ifdef Py_GIL_DISABLED
Contributor

This might work for setiter_len, but setiter_iternext itself is not yet thread safe (partly because it sets si->si_set to NULL).

For several other iterators, the approach is to keep the reference si->si_set but use another attribute to signal exhaustion of the iterator. Examples are itertools.cycle and the reversed operator.

Note: I tried to create a minimal example where concurrent iteration fails, but I have not succeeded yet (the example does not crash, although I have not run thread sanitizer on it yet).

Test for concurrent iteration on set iterator
import unittest
from threading import Thread, Barrier


class TestSetIter(unittest.TestCase):
    def test_set_iter(self):
        """Test concurrent iteration over a set"""

        NUM_LOOPS = 10_000
        NUM_THREADS = 4

        for ii in range(NUM_LOOPS):
            if ii % 1000 == 0:
                print(f'test_set_iter {ii}')
            barrier = Barrier(NUM_THREADS)

            # make sure the underlying set is uniquely referenced by the iterator
            iterator = iter(set((1, 2)))

            def worker():
                barrier.wait()
                while True:
                    iterator.__length_hint__()
                    try:
                        next(iterator)
                    except StopIteration:
                        break

            threads = [Thread(target=worker) for _ in range(NUM_THREADS)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()

            # once every thread has seen StopIteration, the hint must be 0
            self.assertEqual(iterator.__length_hint__(), 0)

if __name__ == "__main__":
    unittest.main()

Contributor Author

Thank you. I think your points make a lot of sense, and I really appreciate the two links you shared—they helped me get a more complete picture of the iterator-related data race.
I’ll try to construct the case you mentioned under a TSan environment.
If it turns out to be appropriate, it would be great to address it fully in this PR. Of course, this will take some time.

Contributor

Yes, we should fix this the way we have fixed the others and, as Sam suggested, only clear the associated set in non-free-threading builds. The current code is incorrect because it uses try-incref, which can fail spuriously if the set object is not marked to enable try-incref.

Py_END_CRITICAL_SECTION();
si->si_pos = i+1;
if (key == NULL) {
#ifdef Py_GIL_DISABLED
Contributor

I think we should follow the pattern that we use in other iterators: don't clear si->si_set when the iterator is exhausted in the free-threaded build.

That will keep other things simpler.
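
To illustrate why keeping the reference can simplify things, here is a toy Python model of the pattern: the iterator never drops its container and records exhaustion in a separate flag, so other methods can use the container without first checking whether it is still there. The class `KeepRefIterator` and its attribute names are hypothetical, and this sketch is deliberately not thread-safe; it only shows the shape of the pattern, not the C implementation.

```python
class KeepRefIterator:
    """Toy iterator that never drops its reference to the container."""

    def __init__(self, items):
        self._items = list(items)   # stays set for the iterator's whole life
        self._pos = 0
        self._exhausted = False     # separate exhaustion marker

    def __iter__(self):
        return self

    def __next__(self):
        if self._exhausted or self._pos >= len(self._items):
            self._exhausted = True  # mark done; self._items is left alone
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

    def __length_hint__(self):
        # No "is the container still there?" check is needed here.
        return 0 if self._exhausted else len(self._items) - self._pos
```

For example, after `it = KeepRefIterator("ab")` and `list(it)`, the hint is 0 while `it._items` is still the original list.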

Contributor

@eendebakpt eendebakpt left a comment

I left some more review comments. I think we can get this right, but a simpler approach here would be to put a critical section on the set iterator itself.

Comment on lines 1154 to 1158
#ifdef Py_GIL_DISABLED
FT_ATOMIC_STORE_SSIZE_RELAXED(si->si_pos, i + 1);
#else
si->si_pos = i + 1;
#endif
Contributor

Suggested change
- #ifdef Py_GIL_DISABLED
- FT_ATOMIC_STORE_SSIZE_RELAXED(si->si_pos, i + 1);
- #else
- si->si_pos = i + 1;
- #endif
+ FT_ATOMIC_STORE_SSIZE_RELAXED(si->si_pos, i + 1);

On the normal build the macro will expand to si->si_pos = i + 1;

#ifdef Py_GIL_DISABLED
/* free-threaded: keep si_set; just mark exhausted */
FT_ATOMIC_STORE_SSIZE_RELAXED(si->si_pos, -1);
si->len = 0;
Contributor

This (and some other places) should also be atomic?

else {
#ifdef Py_GIL_DISABLED
/* free-threaded: keep si_set; just mark exhausted */
FT_ATOMIC_STORE_SSIZE_RELAXED(si->si_pos, -1);
Contributor

The value -1 written here could be overwritten by a concurrent thread (at line 1155). That means that, after exhaustion, the set iterator can be brought back to life. This does not lead to overflows or other issues (as far as I can see), but it is somewhat odd behaviour.
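
The ordering described above can be replayed sequentially in Python. This is only a model under assumptions: no real threads are involved, the "table" is a plain list, and the names `scan`, `replay_interleaving`, and `EXHAUSTED` are invented for the sketch. It assumes each thread scans for the next entry first and stores the new position afterwards, with nothing preventing another thread from running in between.

```python
EXHAUSTED = -1


def scan(table, start):
    """Return the index of the first live entry at or after start, or None."""
    for i in range(start, len(table)):
        if table[i] is not None:
            return i
    return None


def replay_interleaving():
    """Replay one unlucky ordering of position stores from two threads."""
    table = ["a", "b"]   # stand-in for the set's entry table
    pos = 1              # shared iterator position; "b" is still pending
    history = []

    # Thread A scans, finds "b" at index 1, but is paused before its store.
    a_found = scan(table, pos)

    # Thread B runs a full step: it finds the same entry and stores index + 1.
    b_found = scan(table, pos)
    pos = b_found + 1
    history.append(pos)

    # Thread B steps again: nothing is left, so it marks the iterator exhausted.
    assert scan(table, pos) is None
    pos = EXHAUSTED
    history.append(pos)

    # Thread A resumes and performs its delayed store, overwriting the marker.
    pos = a_found + 1
    history.append(pos)
    return history


print(replay_interleaving())
```

The final position is non-negative again even though the exhaustion marker had already been written, which is the "restored to life" effect.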


if (key == NULL) {
si->si_set = NULL;
#ifndef Py_GIL_DISABLED
Contributor

I think in the normal build you still have to set si->si_set = NULL; otherwise si->si_set is decref'ed again in setiter_dealloc.

@hyongtao-code
Contributor Author

put a critical section on the set iterator

Thanks for the suggestion. I simplified the implementation by protecting both the iterator and the set with Py_BEGIN_CRITICAL_SECTION2(self, so), so concurrent use of the same iterator is properly serialized. This keeps the logic aligned with other iterators and avoids the races.

It looks like there is an unrelated flaky test case.
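
As a rough Python analogy for locking both the iterator and the set, the sketch below runs every step of a shared iterator while holding two locks. It is a model only: `locked_pair`, `SerializedIter`, and `drain_concurrently` are hypothetical names, `threading.Lock` stands in for the per-object locks, and acquiring in `id()` order is an assumed stand-in for whatever ordering the C macro uses to avoid deadlock. It also assumes the two locks are distinct objects.

```python
import threading
from contextlib import contextmanager


@contextmanager
def locked_pair(lock_a, lock_b):
    """Hold two distinct locks, always acquired in one global order (by id)."""
    first, second = sorted((lock_a, lock_b), key=id)
    with first, second:
        yield


class SerializedIter:
    """Toy shared iterator whose every step runs under both locks."""

    def __init__(self, items, container_lock):
        self._items = list(items)
        self._pos = 0
        self._lock = threading.Lock()          # lock for the iterator itself
        self._container_lock = container_lock  # lock for the underlying container

    def next_or_none(self):
        with locked_pair(self._lock, self._container_lock):
            if self._pos >= len(self._items):
                return None                    # exhausted
            item = self._items[self._pos]
            self._pos += 1
            return item


def drain_concurrently(n_items=1000, n_threads=4):
    """Drain one shared iterator from several threads; return the sorted items."""
    it = SerializedIter(range(n_items), threading.Lock())
    seen = []

    def worker():
        while (item := it.next_or_none()) is not None:
            seen.append(item)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(seen)
```

Because the read of the position and its update happen inside one locked region, each item is handed out exactly once, which is the serialization property described above.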

Contributor

@eendebakpt eendebakpt left a comment

@hyongtao-code Thanks for your work on this! The core idea of the PR has shifted a bit, which is why I have some more review comments, but we are getting closer I believe.

if (so != NULL) {
Py_BEGIN_CRITICAL_SECTION2(op, so);
if (si->si_pos >= 0 &&
si->si_used == FT_ATOMIC_LOAD_SSIZE_RELAXED(so->used))
Contributor

Since we are in a critical section on so, the atomic load here is not required.

Py_ssize_t len = 0;
if (si->si_set != NULL && si->si_used == si->si_set->used)

#ifdef Py_GIL_DISABLED
Contributor

If we also set si->si_pos to -1 in the non-free-threading build, then I think we can keep the code for the FT and non-FT builds the same. For the non-FT build we would then set si->si_set to NULL and si->si_pos to -1 at exhaustion, but the code will be simpler.

Also the two code paths in setiter_iternext will be more similar then.

hyongtao-code and others added 2 commits February 9, 2026 19:09
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
for _ in range(NUM_LOOPS):
    s.add(i)
    s.discard(i - 1)
    i += 1
Contributor

I suspect the reader here will be finished much faster than the writer. (the length hint seems way cheaper than the add and discard). There are several ways to make this more balanced. Here is a suggestion, but there might be better ones.

Suggested change
i += 1
def reader():
    barrier.wait()
    while s:
        it.__length_hint__()

def writer():
    barrier.wait()
    i = 0
    for _ in range(NUM_LOOPS):
        s.add(i)
        s.discard(i - 1)
        i += 1
    s.clear()
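
A self-contained version of this suggestion might look like the sketch below. The names `s`, `it`, `barrier`, `reader`, and `writer` come from the suggestion; the wrapper `run_reader_writer`, the initial set contents `{-1}`, the use of exactly two threads, and the recording of observed hints are assumptions added so the snippet runs on its own. The set starts non-empty so that the reader's `while s` loop does not exit immediately.

```python
from threading import Barrier, Thread


def run_reader_writer(num_loops=1000):
    """Run one reader and one writer against the same set and iterator."""
    s = {-1}                 # assumed non-empty start so the reader's loop begins
    it = iter(s)
    barrier = Barrier(2)
    observed = set()         # distinct hint values the reader saw

    def reader():
        barrier.wait()
        while s:             # keeps polling until the writer clears the set
            observed.add(it.__length_hint__())

    def writer():
        barrier.wait()
        i = 0
        for _ in range(num_loops):
            s.add(i)
            s.discard(i - 1)
            i += 1
        s.clear()            # empties the set, which ends the reader's loop

    threads = [Thread(target=reader), Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed
```

Since the iterator is never advanced, the hint should only ever be 1 (set size matches the size recorded at creation) or 0 (size differs), so the reader keeps exercising the read for as long as the writer is mutating the set.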

Contributor Author

Sorry for the delayed response. I will check them one by one. Really, thanks for your guidance.


4 participants