Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F: Ungil on nightly insufficient for soundness of allow_threads, even without scoped-tls or send_wrapper #3658

Open
steffahn opened this issue Dec 16, 2023 · 8 comments · May be fixed by #3646

Comments

@steffahn
Copy link

steffahn commented Dec 16, 2023

use std::cell::Cell;
use std::thread;

use pyo3::prelude::*;

#[pyclass]
struct Class(Cell<u8>);

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let class_1 = Py::new(py, Class(Cell::new(0)))?;
        let class_2 = class_1.clone_ref(py);
        let cell: &Cell<u8> = &class_1.borrow(py).0;
        py.allow_threads(|| {
            let t = thread::spawn(move || {
                Python::with_gil(|py| {
                    let cell: &Cell<u8> = &class_2.borrow(py).0;
                    let init = cell.get();
                    loop {
                        let gotten = cell.get();
                        if init != gotten {
                            println!("concurrently modified! - {init} vs {gotten}");
                            break;
                        }
                    }
                });
            });
            loop {
                cell.set(cell.get().wrapping_add(1));
                if t.is_finished() {
                    break;
                }
            }
            t.join().unwrap();
        });
        PyResult::Ok(())
    })?;
    PyResult::Ok(())
}
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/…`
concurrently modified! - 63 vs 66

(concurrent modification of a Cell is UB; I skipped the additional step where you could abuse this data race to cause more obvious UB such as reading memory out of bounds)

This is issue is possibly irrelevant in light of existing issues like #3640 (and fixed by #3646 which abandons Ungil). Also, it might in principle also be fixable by making .borrow do a dynamic check that we’re in the same thread as previous .borrow; or by demanding Sync for pyclass; or letting the user choose between either of the two options (dynamic check or Sync).

I’m raising this issue as another point of argument that nightly Ungil isn’t as good as it claims to be, and this would need to be addressed in case the argument would win that we should keep Ungil (with documented unsoundness warnings) for its benefits of not conservatively requiring Send.

@mejrs
Copy link
Member

mejrs commented Dec 16, 2023

To support things like subinterpreters and "no gil" pyclasses must be Sync, so we might be requiring that at some point anyway.

@steffahn
Copy link
Author

And now, here’s another argument why allow_threads should better require Send, even if pyclass enforced Sync: The GILProtected wrapper!

use std::cell::Cell;
use std::thread;

use pyo3::prelude::*;
use pyo3::sync::GILProtected;

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let protected = GILProtected::new(Cell::new(0));
        let cell: &Cell<u8> = protected.get(py);
        py.allow_threads(|| {
            thread::scope(|s| {
                let t = s.spawn(|| {
                    Python::with_gil(|py| {
                        let cell: &Cell<u8> = protected.get(py);
                        let init = cell.get();
                        loop {
                            let gotten = cell.get();
                            if init != gotten {
                                println!("concurrently modified! - {init} vs {gotten}");
                                break;
                            }
                        }
                    });
                });
                loop {
                    cell.set(cell.get().wrapping_add(1));
                    if t.is_finished() {
                        break;
                    }
                }
            })
        });
        PyResult::Ok(())
    })?;
    PyResult::Ok(())
}

@davidhewitt
Copy link
Member

davidhewitt commented Dec 17, 2023

Thank you @steffahn, your scrutiny here is invaluable!

Given that nogil Python is now seemingly on track for Python 3.13, I would be in favour of moving fast here and requiring #[pyclass] to beSync. (Except for #[pyclass(unsendable)], which already does thread affinity checks on .borrow()).

If I follow the reasoning on the second example, this implies that Ungil isn't as good as the Send bound (ignoring #3640 which is broken with both auto traits), as GILProtected is much more likely to interact with allow_threads than SendWrapper.

(Although a slight weakening of that point is that with Python nogil I assume we need to rework GILProtected, potentially this work can be brought forward if we wanted to eliminate this negative interaction with Ungil.)

@mejrs
Copy link
Member

mejrs commented Dec 17, 2023

To make subinterpreters and no-gil work we'd need to get rid of GilProtected too

@adamreichold
Copy link
Member

Somewhat off-topic for subinterpreters, I guess GILProtected and GILOnceCell could become something like "interpreter-local storage" in analogy to TLS?

As for nogil, I suspect we'll want to provide a sort of "GIL emulation" using a normal mutex so that Rust-build extensions could work the same for gil and nogil builds. Or we drop any GIL-based synchronization outright, but I guess that would force everybody into paying the cost even if they are happy with single-threaded Python.

(At the danger of outing myself, I will miss Python as a single-threaded language avoiding the complexities of parallelism. I am not convinced that emphasizing throughput over efficiency is the right thing to do for almost all current Python users, but mainly for large organizations with AI-related investments into Python-based technology stacks.)

@davidhewitt
Copy link
Member

Thinking further, it seems to me the root cause of the problems we're running into here are down to the GIL marker being overloaded, it's currently in use as "safe to call Python APIs" as well as a synchronization primitive. In a nogil-only world in the long run it may simplify things to be able to remove that latter use.

Somewhat off-topic for subinterpreters, I guess GILProtected and GILOnceCell could become something like "interpreter-local storage" in analogy to TLS?

Ooof, maybe, I was sort of hoping that we wouldn't change how they worked for subinterpreters, but I guess if we want interpreter isolation, that's a possible way to do it.

At the danger of outing myself, [...]

You're not the only one I know who holds this view, even if I myself am quite optimistic for nogil. I think the SC would never have accepted the proposal if the belief wasn't that the single thread efficiency losses are minimal, and that most Python code won't need to have new adjustments to be thread safe.

@davidhewitt
Copy link
Member

As for nogil, I suspect we'll want to provide a sort of "GIL emulation" using a normal mutex so that Rust-build extensions could work the same for gil and nogil builds. Or we drop any GIL-based synchronization outright, but I guess that would force everybody into paying the cost even if they are happy with single-threaded Python.

I suspect that we will need to keep something around, because I think mutex + GIL has been proven too many times that it's a recipe for deadlocks.

That said, we could just offer GILSafeMutex which unlocks the gil while blocking for the mutex, and on nogil world is just a noop wrapper on a normal mutex. Though actually we should check the mechanics of the stop-the-world GC, because we might want the same thing in nogil anyway. (I.e. avoid deadlocks between mutexes and the GC pauses)

@adamreichold
Copy link
Member

adamreichold commented Dec 18, 2023

if the belief wasn't that the single thread efficiency losses are minimal,

This is going off the rails where do we find the space to discuss this stuff? I think the above is realistic, i.e. the quantitative losses will be bearable. But I also think the real price is something different, namely the reduced audience for people making changes to Python, i.e. to mostly people doing this in a professional capacity and therefore having the context window to learn and manage the new complexities around e.g. reference counting. But I also admit that this change is happening already independently of the nogil changes, e.g. the JIT-like late optimizations are probably a similar barrier to entry, implemented for similar reasons as the nogil changes and similarly well-received in the community. So my main gripe is that I suspect that many Python users will loose the ability to fully understand and shape their own tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants