Merged
2 changes: 1 addition & 1 deletion Architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ Thus, when copying a Rust struct to a Python object, we first allocate `PyClassO
move `T` into it.

The primary way to interact with Python objects implemented in Rust is through the `Bound<'py, T>` smart pointer.
By having the `'py` lifetime of the `Python<'py>` token, this ties the lifetime of the `Bound<'py, T>` smart pointer to the lifetime of the GIL and allows PyO3 to call Python APIs at maximum efficiency.
Because it carries the `'py` lifetime of the `Python<'py>` token, the `Bound<'py, T>` smart pointer is tied to the period for which the thread is attached to the Python interpreter, which allows PyO3 to call Python APIs at maximum efficiency.

`Bound<'py, T>` requires that `T` implements `PyClass`.
This trait is somewhat complex and derives many traits, but the most important one is `PyTypeInfo`
Expand Down
3 changes: 2 additions & 1 deletion Cargo.toml
Expand Up @@ -127,7 +127,8 @@ generate-import-lib = ["pyo3-ffi/generate-import-lib"]
# Changes `Python::attach` to automatically initialize the Python interpreter if needed.
auto-initialize = []

# Enables `Clone`ing references to Python objects `Py<T>` which panics if the GIL is not held.
# Enables `Clone`ing references to Python objects `Py<T>` which panics if the
# thread is not attached to the Python interpreter.
py-clone = []

# Adds `OnceExt` and `MutexExt` implementations to the `parking_lot` types
Expand Down
28 changes: 18 additions & 10 deletions guide/src/class.md
Expand Up @@ -203,9 +203,10 @@ mod my_module {

It is often useful to turn a `#[pyclass]` type `T` into a Python object and access it from Rust code. The [`Py<T>`] and [`Bound<'py, T>`] smart pointers are the ways to represent a Python object in PyO3's API. More detail can be found about them [in the Python objects](./types.md#pyo3s-smart-pointers) section of the guide.

Most Python objects do not offer exclusive (`&mut`) access (see the [section on Python's memory model](./python-from-rust.md#pythons-memory-model)). However, Rust structs wrapped as Python objects (called `pyclass` types) often *do* need `&mut` access. Due to the GIL, PyO3 *can* guarantee exclusive access to them.
Most Python objects do not offer exclusive (`&mut`) access (see the [section on Python's memory model](./python-from-rust.md#pythons-memory-model)). However, Rust structs wrapped as Python objects (called `pyclass` types) often *do* need `&mut` access.
Yet the Rust borrow checker cannot reason about `&mut` references once an object's ownership has been passed to the Python interpreter.

The Rust borrow checker cannot reason about `&mut` references once an object's ownership has been passed to the Python interpreter. This means that borrow checking is done at runtime using with a scheme very similar to `std::cell::RefCell<T>`. This is known as [interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html).
To solve this, PyO3 does borrow checking at runtime using a scheme very similar to `std::cell::RefCell<T>`. This is known as [interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html).

Users who are familiar with `RefCell<T>` can use `Py<T>` and `Bound<'py, T>` just like `RefCell<T>`.
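
Since the scheme is described as very similar to `RefCell<T>`, a plain-Rust sketch of those rules may help. This example uses only the standard library, not PyO3, and `Counter` is a hypothetical stand-in for the Rust data inside a `#[pyclass]`. It shows that shared borrows may coexist, while a mutable borrow is refused at runtime as long as any other borrow is alive.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the Rust data inside a `#[pyclass]`.
struct Counter {
    value: i32,
}

/// Returns (shared borrows coexist, mutable borrow refused, final value).
fn demo() -> (bool, bool, i32) {
    let cell = RefCell::new(Counter { value: 0 });

    // Any number of shared borrows may be alive at once.
    let first = cell.borrow();
    let shared_ok = cell.try_borrow().is_ok();

    // A mutable borrow is refused at runtime while a shared borrow is alive.
    let mut_refused = cell.try_borrow_mut().is_err();
    drop(first);

    // With no outstanding borrows, exclusive access succeeds.
    cell.borrow_mut().value += 1;
    let value = cell.borrow().value;

    (shared_ok, mut_refused, value)
}
```

On a `Bound<'py, T>`, the corresponding methods are `borrow`, `borrow_mut`, `try_borrow` and `try_borrow_mut`; as with `RefCell<T>`, the non-`try` variants panic when the borrow cannot be granted.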

Expand Down Expand Up @@ -685,7 +686,8 @@ impl MyClass {
}
```

Calls to these methods are protected by the GIL, so both `&self` and `&mut self` can be used.
Both `&self` and `&mut self` can be used, due to the use of [runtime borrow checking](#bound-and-interior-mutability).

The return type must be `PyResult<T>` or `T` for some `T` that implements `IntoPyObject`;
the latter is allowed if the method cannot raise Python exceptions.

Expand Down Expand Up @@ -828,7 +830,12 @@ impl MyClass {

## Classes as function arguments

Free functions defined using `#[pyfunction]` interact with classes through the same mechanisms as the self parameters of instance methods, i.e. they can take Python-bound references, Python-bound reference wrappers or Python-independent references:
Class objects can be used as arguments to `#[pyfunction]`s and `#[pymethods]` in the same way as the self parameters of instance methods, i.e. they can be passed as:
- `Py<T>` or `Bound<'py, T>` smart pointers to the class Python object,
- `&T` or `&mut T` references to the Rust data contained in the Python object, or
- `PyRef<T>` and `PyRefMut<T>` reference wrappers.
Contributor

Should we also mention PyClassGuard here? Or do we wait with that until we deprecate PyRef eventually?

Member Author

I was wondering about it. I think it would be good to introduce PyClassGuard as part of 0.27 if we can figure out how to sequence it in! I would prefer to defer that from this PR for the moment, though.


Examples of each of these below:

```rust,no_run
# #![allow(dead_code)]
Expand All @@ -838,29 +845,30 @@ struct MyClass {
    my_field: i32,
}

// Take a reference when the underlying `Bound` is irrelevant.
// Take a reference to Rust data when the Python object is irrelevant.
#[pyfunction]
fn increment_field(my_class: &mut MyClass) {
    my_class.my_field += 1;
}

// Take a reference wrapper when borrowing should be automatic,
// but interaction with the underlying `Bound` is desired.
// but access to the Python object is still needed
#[pyfunction]
fn print_field(my_class: PyRef<'_, MyClass>) {
fn print_field_and_return_me(my_class: PyRef<'_, MyClass>) -> PyRef<'_, MyClass> {
    println!("{}", my_class.my_field);
    my_class
}

// Take a reference to the underlying Bound
// when borrowing needs to be managed manually.
// Take (a reference to) a Python object smart pointer when borrowing needs to be managed manually.
#[pyfunction]
fn increment_then_print_field(my_class: &Bound<'_, MyClass>) {
    my_class.borrow_mut().my_field += 1;

    println!("{}", my_class.borrow().my_field);
}

// Take a GIL-independent reference when you want to store the reference elsewhere.
// When the Python object smart pointer needs to be stored elsewhere prefer `Py<T>` over `Bound<'py, T>`
// to avoid the lifetime restrictions.
#[pyfunction]
fn print_refcnt(my_class: Py<MyClass>, py: Python<'_>) {
    println!("{}", my_class.get_refcnt(py));
Expand Down
6 changes: 3 additions & 3 deletions guide/src/conversions/tables.md
Expand Up @@ -51,9 +51,9 @@ It is also worth remembering the following special types:

| What | Description |
| ---------------- | ------------------------------------- |
| `Python<'py>` | A GIL token, used to pass to PyO3 constructors to prove ownership of the GIL. |
| `Bound<'py, T>` | A Python object connected to the GIL lifetime. This provides access to most of PyO3's APIs. |
| `Py<T>` | A Python object isolated from the GIL lifetime. This can be sent to other threads. |
| `Python<'py>` | A token used to prove attachment to the Python interpreter. |
| `Bound<'py, T>` | A Python object whose lifetime is bound to the thread's attachment to the Python interpreter. This provides access to most of PyO3's APIs. |
| `Py<T>` | A Python object not connected to any lifetime of attachment to the Python interpreter. This can be sent to other threads. |
| `PyRef<T>` | A `#[pyclass]` borrowed immutably. |
| `PyRefMut<T>` | A `#[pyclass]` borrowed mutably. |

Expand Down
3 changes: 2 additions & 1 deletion guide/src/faq.md
Expand Up @@ -166,7 +166,8 @@ print(f"a: {a}\nb: {b}")
a: <builtins.Inner object at 0x0000020044FCC670>
b: <builtins.Inner object at 0x0000020044FCC670>
```
The downside to this approach is that any Rust code working on the `Outer` struct now has to acquire the GIL to do anything with its field.
The downside to this approach is that any Rust code working on the `Outer` struct potentially has to attach to the Python interpreter to do anything with the `inner` field. (If `Inner` is `#[pyclass(frozen)]` and implements `Sync`, then `Py::get`
may be used to access the `Inner` contents from `Py<Inner>` without needing to attach to the interpreter.)

## I want to use the `pyo3` crate re-exported from dependency but the proc-macros fail!

Expand Down
4 changes: 2 additions & 2 deletions guide/src/features.md
Expand Up @@ -67,11 +67,11 @@ This is a first step towards adding first-class support for generating type anno

### `py-clone`

This feature was introduced to ease migration. It was found that delayed reference counts cannot be made sound and hence `Clon`ing an instance of `Py<T>` must panic without the GIL being held. To avoid migrations introducing new panics without warning, the `Clone` implementation itself is now gated behind this feature.
This feature was introduced to ease migration. It was found that delayed reference counting (which PyO3 used historically) could not be made sound and hence `Clone`-ing an instance of `Py<T>` is impossible when not attached to the Python interpreter (it will panic). To avoid migrations introducing new panics without warning, the `Clone` implementation itself is now gated behind this feature.

### `pyo3_disable_reference_pool`

This is a performance-oriented conditional compilation flag, e.g. [set via `$RUSTFLAGS`][set-configuration-options], which disabled the global reference pool and the associated overhead for the crossing the Python-Rust boundary. However, if enabled, `Drop`ping an instance of `Py<T>` without the GIL being held will abort the process.
This is a performance-oriented conditional compilation flag, e.g. [set via `$RUSTFLAGS`][set-configuration-options], which disables the global reference pool and the associated overhead for crossing the Python-Rust boundary. However, if enabled, `Drop`ping an instance of `Py<T>` when not attached to the Python interpreter will abort the process.

### `macros`

Expand Down
42 changes: 20 additions & 22 deletions guide/src/free-threading.md
@@ -1,26 +1,26 @@
# Supporting Free-Threaded CPython

CPython 3.13 introduces an experimental "free-threaded" build of CPython that
does not rely on the [global interpreter
lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)
(often referred to as the GIL) for thread safety. As of version 0.23, PyO3 also
has preliminary support for building Rust extensions for the free-threaded
Python build and support for calling into free-threaded Python from Rust.

If you want more background on free-threaded Python in general, see the [what's
new](https://docs.python.org/3/whatsnew/3.13.html#whatsnew313-free-threaded-cpython)
entry in the 3.13 release notes, the [free-threading HOWTO
guide](https://docs.python.org/3/howto/free-threading-extensions.html#freethreading-extensions-howto)
in the CPython docs, the [extension porting
guide](https://py-free-threading.github.io/porting-extensions/) in the
community-maintained Python free-threading guide, and [PEP
703](https://peps.python.org/pep-0703/), which provides the technical background
CPython 3.14 declared support for the "free-threaded" build of CPython that
does not rely on the [global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)
(often referred to as the GIL) for thread safety. Since version 0.23, PyO3
supports building Rust extensions for the free-threaded Python build and
calling into free-threaded Python from Rust.

If you want more background on free-threaded Python in general, see the
[what's new](https://docs.python.org/3/whatsnew/3.13.html#whatsnew313-free-threaded-cpython)
entry in the 3.13 release notes (when the "free-threaded" build was first added as an experimental
mode), the
[free-threading HOWTO guide](https://docs.python.org/3/howto/free-threading-extensions.html#freethreading-extensions-howto)
in the CPython docs, the
[extension porting guide](https://py-free-threading.github.io/porting-extensions/)
in the community-maintained Python free-threading guide, and
[PEP 703](https://peps.python.org/pep-0703/), which provides the technical background
for the free-threading implementation in CPython.

In the GIL-enabled build, the global interpreter lock serializes access to the
Python runtime. The GIL is therefore a fundamental limitation to parallel
scaling of multithreaded Python workflows, due to [Amdahl's
law](https://en.wikipedia.org/wiki/Amdahl%27s_law), because any time spent
In the GIL-enabled build (the only choice before the "free-threaded" build was introduced),
the global interpreter lock serializes access to the Python runtime. The GIL is therefore
a fundamental limitation to parallel scaling of multithreaded Python workflows, due to
[Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law), because any time spent
executing a parallel processing task on only one execution context fundamentally
cannot be sped up using parallelism.
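
To make the Amdahl's law argument concrete, here is a small sketch of the speedup bound `1 / ((1 - p) + p / n)`, where `p` is the fraction of the runtime that can run in parallel and `n` is the number of threads. The function name and the figures in the note below are illustrative assumptions, not measurements of any real workload.

```rust
/// Upper bound on the speedup predicted by Amdahl's law.
///
/// `parallel_fraction` is the share of the runtime (0.0 to 1.0) that can run
/// in parallel; the remainder is serialized, for example by the GIL.
fn amdahl_speedup(parallel_fraction: f64, threads: u32) -> f64 {
    let serial_fraction = 1.0 - parallel_fraction;
    1.0 / (serial_fraction + parallel_fraction / f64::from(threads))
}
```

For a workload that is 90% parallel, eight threads give at most about a 4.7x speedup, and no number of threads can exceed 10x, because the serialized 10% always remains.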

Expand Down Expand Up @@ -123,9 +123,7 @@ free-threaded build.

The free-threaded interpreter does not have a GIL. Many existing extensions
providing mutable data structures relied on the GIL to lock Python objects and
make interior mutability thread-safe. Historically, PyO3's API was designed
around the same strong assumptions, but is transitioning towards more general
APIs applicable for both builds.
make interior mutability thread-safe.

Calling into the CPython C API is only legal when an OS thread is explicitly
attached to the interpreter runtime. In the GIL-enabled build, this happens when
Expand Down
13 changes: 11 additions & 2 deletions guide/src/parallelism.md
@@ -1,8 +1,15 @@
# Parallelism

CPython has the infamous [Global Interpreter Lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) (GIL), which prevents several threads from executing Python bytecode in parallel. This makes threading in Python a bad fit for [CPU-bound](https://en.wikipedia.org/wiki/CPU-bound) tasks and often forces developers to accept the overhead of multiprocessing. There is an experimental "free-threaded" version of CPython 3.13 that does not have a GIL, see the PyO3 docs on [free-threaded Python](./free-threading.md) for more information about that.
Historically, CPython was limited by the [global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) (GIL), which only allowed a single thread to drive the Python interpreter at a time. This made threading in Python a bad fit for [CPU-bound](https://en.wikipedia.org/wiki/CPU-bound) tasks and often forced developers to accept the overhead of multiprocessing.

Rust is well-suited to multithreaded code, and libraries like [`rayon`] can help you leverage safe parallelism with minimal effort. The [`Python::detach`] method can be used to allow the Python interpreter to do other work while the Rust work is ongoing.

To enable full parallelism in your application, consider also using [free-threaded Python](./free-threading.md) which is supported since Python 3.14.

## Parallelism under the Python GIL

Let's take a look at our [word-count](https://github.com/PyO3/pyo3/blob/main/examples/word-count/src/lib.rs) example, where we have a `search` function that utilizes the [`rayon`] crate to count words in parallel.

In PyO3 parallelism can be easily achieved in Rust-only code. Let's take a look at our [word-count](https://github.com/PyO3/pyo3/blob/main/examples/word-count/src/lib.rs) example, where we have a `search` function that utilizes the [rayon](https://github.com/rayon-rs/rayon) crate to count words in parallel.
```rust,no_run
# #![allow(dead_code)]
use pyo3::prelude::*;
Expand Down Expand Up @@ -32,6 +39,7 @@ fn search(contents: &str, needle: &str) -> usize {
```

But let's assume you have a long running Rust function which you would like to execute several times in parallel. For the sake of example let's take a sequential version of the word count:

```rust,no_run
# #![allow(dead_code)]
# fn count_line(line: &str, needle: &str) -> usize {
Expand Down Expand Up @@ -175,3 +183,4 @@ collecting the results from the worker threads. You should always call
cases where worker threads need to acquire the GIL, to prevent deadlocks.

[`Python::detach`]: {{#PYO3_DOCS_URL}}/pyo3/marker/struct.Python.html#method.detach
[`rayon`]: https://github.com/rayon-rs/rayon
13 changes: 13 additions & 0 deletions guide/src/performance.md
Expand Up @@ -98,6 +98,7 @@ impl PartialEq<Foo> for FooBound<'_> {
```

## Calling Python callables (`__call__`)

CPython supports multiple calling protocols: [`tp_call`] and [`vectorcall`]. [`vectorcall`] is a more efficient protocol unlocking faster calls.
PyO3 will try to dispatch Python `call`s using the [`vectorcall`] calling convention to achieve maximum performance if possible, falling back to [`tp_call`] otherwise.
This is implemented using the (internal) `PyCallArgs` trait. It defines how Rust types can be used as Python `call` arguments. This trait is currently implemented for
Expand All @@ -110,6 +111,18 @@ Rust tuples may make use of [`vectorcall`] where as `Bound<'_, PyTuple>` and `Py
[`tp_call`]: https://docs.python.org/3/c-api/call.html#the-tp-call-protocol
[`vectorcall`]: https://docs.python.org/3/c-api/call.html#the-vectorcall-protocol

## Detach from the interpreter for long-running Rust-only work

When executing Rust code which does not need to interact with the Python interpreter, use [`Python::detach`] to allow the Python interpreter to proceed without waiting for the current thread.

On the GIL-enabled build, this is crucial for best performance as only a single thread may ever be attached at a time.

On the free-threaded build, this is still best practice as there are several "stop the world" events (such as garbage collection) where all threads attached to the Python interpreter are forced to wait.

As a rule of thumb, attaching and detaching from the Python interpreter takes less than a millisecond, so any work which is expected to take multiple milliseconds can likely benefit from detaching from the interpreter.
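
The effect on the GIL-enabled build can be pictured with a standard-library model. This is only a sketch under stated assumptions: it does not use PyO3, a `Mutex<()>` stands in for "being attached to the interpreter", and all function names are hypothetical. It shows that a second thread cannot attach while the first one keeps the lock during its Rust-only work, but can attach once the first thread has released it.

```rust
use std::sync::Mutex;
use std::thread;

// Stand-in for Rust-only work; any pure computation would do.
fn rust_only_work(n: u64) -> u64 {
    (1..=n).sum()
}

// Probes from a second thread whether it could "attach" right now.
fn other_thread_can_attach(interpreter: &Mutex<()>) -> bool {
    thread::scope(|s| {
        s.spawn(|| interpreter.try_lock().is_ok())
            .join()
            .expect("probe thread panicked")
    })
}

/// Stays "attached" (holds the lock) for the whole computation.
fn work_while_attached(interpreter: &Mutex<()>, n: u64) -> (u64, bool) {
    let _attached = interpreter.lock().unwrap();
    let free = other_thread_can_attach(interpreter);
    (rust_only_work(n), free)
}

/// "Detaches" (releases the lock) before computing, then re-attaches.
fn work_while_detached(interpreter: &Mutex<()>, n: u64) -> (u64, bool) {
    let attached = interpreter.lock().unwrap();
    drop(attached); // model of detaching
    let free = other_thread_can_attach(interpreter);
    let result = rust_only_work(n);
    let _reattached = interpreter.lock().unwrap(); // model of re-attaching
    (result, free)
}
```

In real PyO3 code the release and re-acquire steps are not written by hand: the Rust-only work is passed as a closure to `Python::detach`, which re-attaches before returning the closure's result.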

[`Python::detach`]: {{#PYO3_DOCS_URL}}/pyo3/marker/struct.Python.html#method.detach

## Disable the global reference pool

PyO3 uses global mutable state to keep track of deferred reference count updates implied by `impl<T> Drop for Py<T>` being called without being attached to the interpreter. The necessary synchronization to obtain and apply these reference count updates when PyO3-based code next attaches to the interpreter is somewhat expensive and can become a significant part of the cost of crossing the Python-Rust boundary.
Expand Down