Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nogil inconsistent empty list while sorting #126559

Open
luisggpina opened this issue Nov 7, 2024 · 6 comments
Open

nogil inconsistent empty list while sorting #126559

luisggpina opened this issue Nov 7, 2024 · 6 comments
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@luisggpina
Copy link

luisggpina commented Nov 7, 2024

Bug report

Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found a violation of atomicity on the current nogil build when using concurrent operations on the same list. The program below shows the wrong behavior.

from threading import Thread,Barrier,Lock

def t1(b1,l,r):
    b1.wait()
    r[0] = l.__len__()

def t2(b1,l,r):
    b1.wait()
    r[0] = l.__str__()

def t3(b1,l):
    b1.wait()
    l.sort()

def normalSetTest(i):

    sharedList = [1,2,3] * 100
    barrier = Barrier(2)
    res = [ None ]
    threads = [
            Thread(target= t1, args=(barrier,sharedList,res,)),
            #Thread(target= t2, args=(barrier,sharedList,res,)),
            Thread(target= t3, args=(barrier,sharedList,)),
            ]


    for t in threads:
        t.start()
    for t in threads:
        t.join()

    if res[0] in [ 0, '[]' ]:
        print("\tfound bug: " + str(res[0]))

print("test begin...")

for n in range(0,10):
    threads = []

    for i in range(0,1):
        threads.append(Thread(target= normalSetTest, args=(n,)))

    for t in threads:
        t.start()

    for t in threads:
        t.join()

print("test Done")

A list with 300 integers is sorted in t3 while another thread either gets the size of the list t1 or turns the list into a string t2. Running the code above will show threads t1 and t2 finding an inconsistent empty list with length 0 or turned into the string "[]".
Our tool did not find any other interesting values: either it's an expected value (original list or sorted list) OR it's empty.

Sample output:

test begin...
        found bug: 0
        found bug: 0
        found bug: 0
        found bug: 0
        found bug: 0
test Done

We're happy to provide more details about this bug, and to help developers reproducing it.

Output of python -VV: Python 3.14.0a1+ experimental free-threading build (heads/main:faa3272fb8d, Oct 29 2024, 09:14:25) [GCC 14.2.1 20240805]

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

I believe this issue is part of the ongoing conversation on #126136 about acceptable behaviors of containers operated by many threads concurrently, described in the original nogil PEP: https://peps.python.org/pep-0703/#container-thread-safety

CPython versions tested on:

3.13, 3.14, CPython main branch

Operating systems tested on:

Linux

@luisggpina luisggpina added the type-bug An unexpected behavior, bug, or error label Nov 7, 2024
@ZeroIntensity ZeroIntensity added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.13 bugs and security fixes topic-free-threading 3.14 new features, bugs and security fixes labels Nov 7, 2024
@colesbury
Copy link
Contributor

I don't think this is worth changing. Python, since at least Python 2.x days, has temporarily made the list appear empty when sorting in place. This can be visible to other threads depending on the types of objects or if the element comparison is implemented in Python.

cpython/Objects/listobject.c

Lines 2831 to 2839 in 09d6f5d

/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
saved_ob_size = Py_SIZE(self);
saved_ob_item = self->ob_item;
saved_allocated = self->allocated;
Py_SET_SIZE(self, 0);

@ZeroIntensity
Copy link
Member

I'm not sure we would need to mess with how the list gets sorted--the easier fix would be to lock the list during execution of __len__ and __str__ (that fixes the reproducer on my end), but that would slow things down.

@luisggpina
Copy link
Author

Is it possible to move this:

cpython/Objects/listobject.c

Lines 2831 to 2839 in 09d6f5d

/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
saved_ob_size = Py_SIZE(self);
saved_ob_item = self->ob_item;
saved_allocated = self->allocated;
Py_SET_SIZE(self, 0);

To after the pre-sort check here:

/* End of pre-sort check: ms is now set properly! */

And only if the comparator used is custom?

That would still cause other threads to see the list as empty while the list is being sorted using a Python comparator, but that feels less surprising and more consistent than the current approach. It can also be documented as part of the expected behavior when using Python comparators.

@colesbury
Copy link
Contributor

Yes, we could do something like that. It would mean that sorting lists of strings, ints, and floats would essentially be atomic, but list of other types (or lists of mixed types) would still not be atomic, even if they are atomic in the GIL-enabled build.

@luisggpina
Copy link
Author

Hmm, not sure if that's a good option.

I understand the value of not locking to read the size of a list, that sounds like a very common operation that should be fast.

Here's a concurrent-friendly alternative:

  1. Pick a large number that lists are unlikely to be the length of (e.g., 12348, maybe even regenerated at compile time)
  2. When sorting, set the length to that with an atomic write
  3. When getting the length, use an atomic read.
    1. I's not the special number, return (most common, no locks, one atomic read)
    2. It's the special number, get the lock and read again
      1. It's not the special number. Someone else was sorting the list, they are done, and we hold the lock now. Return the new length.
      2. It's still the special number. Either the list actually is that length and we got super unlucky, or we are sorting the list and should return zero. Check which, and return the correct value.

This seems to be fast in the common case, backwards compatible with the zero length behavior, and allows for sorting to be atomic.

@colesbury
Copy link
Contributor

I'm not enthusiastic about further changes to the length field. That has the potential to leak complexity into other list APIs or direct accesses to ob_size and I don't think that's worth it.

For additional context, the "empty list behavior" is documented:

CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python makes the list appear empty for the duration, and raises ValueError if it can detect that the list has been mutated during a sort.

https://docs.python.org/3/library/stdtypes.html#list.sort:~:text=makes%20the%20list-,appear%20empty,-for%20the%20duration

Here's a sketch of an alternative. I still don't think it's worthwhile, but it keeps the complexity confined to places that I'm more comfortable with.

The underlying ideas are:

  • We only need to make the list appear empty if some escaping call during sorting, like a comparison, attempts to modify the list (or releases the list lock to allow another thread to do so).
  • We don't want to make the list appear empty if we're not making any escaping calls, because those are the cases that appeared atomic with the GIL.
  • It's difficult to determine which case we're in at the beginning of PyList_Sort(), but releasing the list's lock is an indication that we made an escaping call. If PyList_Sort() doesn't release the list's lock, then no other code can modify the list, and we don't need to make it appear empty.

To implement this:

  • Extend PyCriticalSection with an optional callback function that is invoked immediately before the critical section is suspended (i.e., before the lock is temporarily released).
  • In PyList_Sort(), the callback makes the list appear empty.

As I wrote above, I still don't think this is worthwhile. I don't want to extend PyCriticalSection with new features just for a single use case, but if we run into further issues like this I'd reconsider that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants