`F: Ungil` on nightly insufficient for soundness of `allow_threads`, even without `scoped-tls` or `send_wrapper` #3658

steffahn · 2023-12-16T16:51:12Z

use std::cell::Cell;
use std::thread;

use pyo3::prelude::*;

#[pyclass]
struct Class(Cell<u8>);

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let class_1 = Py::new(py, Class(Cell::new(0)))?;
        let class_2 = class_1.clone_ref(py);
        let cell: &Cell<u8> = &class_1.borrow(py).0;
        py.allow_threads(|| {
            let t = thread::spawn(move || {
                Python::with_gil(|py| {
                    let cell: &Cell<u8> = &class_2.borrow(py).0;
                    let init = cell.get();
                    loop {
                        let gotten = cell.get();
                        if init != gotten {
                            println!("concurrently modified! - {init} vs {gotten}");
                            break;
                        }
                    }
                });
            });
            loop {
                cell.set(cell.get().wrapping_add(1));
                if t.is_finished() {
                    break;
                }
            }
            t.join().unwrap();
        });
        PyResult::Ok(())
    })?;
    PyResult::Ok(())
}

$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/…`
concurrently modified! - 63 vs 66

(concurrent modification of a Cell is UB; I skipped the additional step where you could abuse this data race to cause more obvious UB such as reading memory out of bounds)

This is issue is possibly irrelevant in light of existing issues like #3640 (and fixed by #3646 which abandons Ungil). Also, it might in principle also be fixable by making .borrow do a dynamic check that we’re in the same thread as previous .borrow; or by demanding Sync for pyclass; or letting the user choose between either of the two options (dynamic check or Sync).

I’m raising this issue as another point of argument that nightly Ungil isn’t as good as it claims to be, and this would need to be addressed in case the argument would win that we should keep Ungil (with documented unsoundness warnings) for its benefits of not conservatively requiring Send.

The text was updated successfully, but these errors were encountered:

mejrs · 2023-12-16T22:28:00Z

To support things like subinterpreters and "no gil" pyclasses must be Sync, so we might be requiring that at some point anyway.

steffahn · 2023-12-16T23:06:31Z

And now, here’s another argument why allow_threads should better require Send, even if pyclass enforced Sync: The GILProtected wrapper!

use std::cell::Cell;
use std::thread;

use pyo3::prelude::*;
use pyo3::sync::GILProtected;

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let protected = GILProtected::new(Cell::new(0));
        let cell: &Cell<u8> = protected.get(py);
        py.allow_threads(|| {
            thread::scope(|s| {
                let t = s.spawn(|| {
                    Python::with_gil(|py| {
                        let cell: &Cell<u8> = protected.get(py);
                        let init = cell.get();
                        loop {
                            let gotten = cell.get();
                            if init != gotten {
                                println!("concurrently modified! - {init} vs {gotten}");
                                break;
                            }
                        }
                    });
                });
                loop {
                    cell.set(cell.get().wrapping_add(1));
                    if t.is_finished() {
                        break;
                    }
                }
            })
        });
        PyResult::Ok(())
    })?;
    PyResult::Ok(())
}

davidhewitt · 2023-12-17T00:22:40Z

Thank you @steffahn, your scrutiny here is invaluable!

Given that nogil Python is now seemingly on track for Python 3.13, I would be in favour of moving fast here and requiring #[pyclass] to beSync. (Except for #[pyclass(unsendable)], which already does thread affinity checks on .borrow()).

If I follow the reasoning on the second example, this implies that Ungil isn't as good as the Send bound (ignoring #3640 which is broken with both auto traits), as GILProtected is much more likely to interact with allow_threads than SendWrapper.

(Although a slight weakening of that point is that with Python nogil I assume we need to rework GILProtected, potentially this work can be brought forward if we wanted to eliminate this negative interaction with Ungil.)

mejrs · 2023-12-17T16:45:06Z

To make subinterpreters and no-gil work we'd need to get rid of GilProtected too

adamreichold · 2023-12-17T17:09:46Z

Somewhat off-topic for subinterpreters, I guess GILProtected and GILOnceCell could become something like "interpreter-local storage" in analogy to TLS?

As for nogil, I suspect we'll want to provide a sort of "GIL emulation" using a normal mutex so that Rust-build extensions could work the same for gil and nogil builds. Or we drop any GIL-based synchronization outright, but I guess that would force everybody into paying the cost even if they are happy with single-threaded Python.

(At the danger of outing myself, I will miss Python as a single-threaded language avoiding the complexities of parallelism. I am not convinced that emphasizing throughput over efficiency is the right thing to do for almost all current Python users, but mainly for large organizations with AI-related investments into Python-based technology stacks.)

davidhewitt · 2023-12-18T07:54:47Z

Thinking further, it seems to me the root cause of the problems we're running into here are down to the GIL marker being overloaded, it's currently in use as "safe to call Python APIs" as well as a synchronization primitive. In a nogil-only world in the long run it may simplify things to be able to remove that latter use.

Somewhat off-topic for subinterpreters, I guess GILProtected and GILOnceCell could become something like "interpreter-local storage" in analogy to TLS?

Ooof, maybe, I was sort of hoping that we wouldn't change how they worked for subinterpreters, but I guess if we want interpreter isolation, that's a possible way to do it.

At the danger of outing myself, [...]

You're not the only one I know who holds this view, even if I myself am quite optimistic for nogil. I think the SC would never have accepted the proposal if the belief wasn't that the single thread efficiency losses are minimal, and that most Python code won't need to have new adjustments to be thread safe.

davidhewitt · 2023-12-18T08:00:06Z

As for nogil, I suspect we'll want to provide a sort of "GIL emulation" using a normal mutex so that Rust-build extensions could work the same for gil and nogil builds. Or we drop any GIL-based synchronization outright, but I guess that would force everybody into paying the cost even if they are happy with single-threaded Python.

I suspect that we will need to keep something around, because I think mutex + GIL has been proven too many times that it's a recipe for deadlocks.

That said, we could just offer GILSafeMutex which unlocks the gil while blocking for the mutex, and on nogil world is just a noop wrapper on a normal mutex. Though actually we should check the mechanics of the stop-the-world GC, because we might want the same thing in nogil anyway. (I.e. avoid deadlocks between mutexes and the GC pauses)

adamreichold · 2023-12-18T08:19:02Z

if the belief wasn't that the single thread efficiency losses are minimal,

This is going off the rails where do we find the space to discuss this stuff? I think the above is realistic, i.e. the quantitative losses will be bearable. But I also think the real price is something different, namely the reduced audience for people making changes to Python, i.e. to mostly people doing this in a professional capacity and therefore having the context window to learn and manage the new complexities around e.g. reference counting. But I also admit that this change is happening already independently of the nogil changes, e.g. the JIT-like late optimizations are probably a similar barrier to entry, implemented for similar reasons as the nogil changes and similarly well-received in the community. So my main gripe is that I suspect that many Python users will loose the ability to fully understand and shape their own tools.

adamreichold linked a pull request Dec 22, 2023 that will close this issue

Close TLS-based soundness holes by running closure on a newly created thread. #3646

Draft

9 tasks

This was referenced Sep 13, 2024

Tracking issue for no-gil/freethreaded work #4265

Open

0.23 Release #4554

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`F: Ungil` on nightly insufficient for soundness of `allow_threads`, even without `scoped-tls` or `send_wrapper` #3658

`F: Ungil` on nightly insufficient for soundness of `allow_threads`, even without `scoped-tls` or `send_wrapper` #3658

steffahn commented Dec 16, 2023 •

edited

Loading

mejrs commented Dec 16, 2023 •

edited

Loading

steffahn commented Dec 16, 2023

davidhewitt commented Dec 17, 2023 •

edited

Loading

mejrs commented Dec 17, 2023

adamreichold commented Dec 17, 2023

davidhewitt commented Dec 18, 2023

davidhewitt commented Dec 18, 2023

adamreichold commented Dec 18, 2023 •

edited

Loading

F: Ungil on nightly insufficient for soundness of allow_threads, even without scoped-tls or send_wrapper #3658

F: Ungil on nightly insufficient for soundness of allow_threads, even without scoped-tls or send_wrapper #3658

Comments

steffahn commented Dec 16, 2023 • edited Loading

mejrs commented Dec 16, 2023 • edited Loading

steffahn commented Dec 16, 2023

davidhewitt commented Dec 17, 2023 • edited Loading

mejrs commented Dec 17, 2023

adamreichold commented Dec 17, 2023

davidhewitt commented Dec 18, 2023

davidhewitt commented Dec 18, 2023

adamreichold commented Dec 18, 2023 • edited Loading

`F: Ungil` on nightly insufficient for soundness of `allow_threads`, even without `scoped-tls` or `send_wrapper` #3658

`F: Ungil` on nightly insufficient for soundness of `allow_threads`, even without `scoped-tls` or `send_wrapper` #3658

steffahn commented Dec 16, 2023 •

edited

Loading

mejrs commented Dec 16, 2023 •

edited

Loading

davidhewitt commented Dec 17, 2023 •

edited

Loading

adamreichold commented Dec 18, 2023 •

edited

Loading