Start adding a performance section to the guide. #3304

Merged
merged 2 commits into from
Jul 11, 2023
1 change: 1 addition & 0 deletions guide/src/SUMMARY.md
@@ -24,6 +24,7 @@
- [Debugging](debugging.md)
- [Features reference](features.md)
- [Memory management](memory.md)
- [Performance](performance.md)
- [Advanced topics](advanced.md)
- [Building and distribution](building_and_distribution.md)
- [Supporting multiple Python versions](building_and_distribution/multiple_python_versions.md)
94 changes: 94 additions & 0 deletions guide/src/performance.md
@@ -0,0 +1,94 @@
# Performance

To achieve the best possible performance, it is useful to be aware of several tricks and sharp edges concerning PyO3's API.

## `extract` versus `downcast`

Pythonic APIs implemented using PyO3 are often polymorphic, i.e. they accept `&PyAny` and try to convert it into one of several more concrete types, to which the requested operation is then applied. This often leads to chains of calls to `extract`, e.g.

```rust
# #![allow(dead_code)]
# use pyo3::prelude::*;
# use pyo3::{exceptions::PyTypeError, types::PyList};

fn frobnicate_list(list: &PyList) -> PyResult<&PyAny> {
    todo!()
}

fn frobnicate_vec(vec: Vec<&PyAny>) -> PyResult<&PyAny> {
    todo!()
}

#[pyfunction]
fn frobnicate(value: &PyAny) -> PyResult<&PyAny> {
    if let Ok(list) = value.extract::<&PyList>() {
        frobnicate_list(list)
    } else if let Ok(vec) = value.extract::<Vec<&PyAny>>() {
        frobnicate_vec(vec)
    } else {
        Err(PyTypeError::new_err("Cannot frobnicate that type."))
    }
}
```

This is suboptimal because the `FromPyObject` trait requires `extract` to return a `Result<T, PyErr>`. For native types like `PyList`, it is faster to use `downcast` (which `extract` calls internally) when the error value is ignored. This avoids the costly conversion of a `PyDowncastError` to a `PyErr` required to fulfil the `FromPyObject` contract, i.e.

```rust
# #![allow(dead_code)]
# use pyo3::prelude::*;
# use pyo3::{exceptions::PyTypeError, types::PyList};
# fn frobnicate_list(list: &PyList) -> PyResult<&PyAny> { todo!() }
# fn frobnicate_vec(vec: Vec<&PyAny>) -> PyResult<&PyAny> { todo!() }
#
#[pyfunction]
fn frobnicate(value: &PyAny) -> PyResult<&PyAny> {
    // Use `downcast` instead of `extract` as turning `PyDowncastError` into `PyErr` is quite costly.
    if let Ok(list) = value.downcast::<PyList>() {
        frobnicate_list(list)
    } else if let Ok(vec) = value.extract::<Vec<&PyAny>>() {
        frobnicate_vec(vec)
    } else {
        Err(PyTypeError::new_err("Cannot frobnicate that type."))
    }
}
```
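
The reason for the difference can be illustrated without PyO3. The following is only a rough, dependency-free model of the two error paths, not PyO3's actual implementation: `CheapDowncastError`, `CostlyError`, `downcast_list` and `extract_list` are hypothetical stand-ins for `PyDowncastError`, `PyErr`, `downcast` and `extract`, and the allocation in the `From` impl is an assumed placeholder for whatever work the real conversion performs.

```rust
use std::any::Any;

/// Hypothetical stand-in for `PyDowncastError`: only stores a static type name.
#[derive(Debug)]
struct CheapDowncastError {
    expected: &'static str,
}

/// Hypothetical stand-in for `PyErr`: building it allocates a formatted message.
#[derive(Debug)]
struct CostlyError {
    message: String,
}

impl From<CheapDowncastError> for CostlyError {
    fn from(err: CheapDowncastError) -> Self {
        // Assumed placeholder for the expensive part of the conversion.
        CostlyError {
            message: format!("object cannot be converted to '{}'", err.expected),
        }
    }
}

/// Models `downcast`: a type check that yields a cheap error on failure.
fn downcast_list(value: &dyn Any) -> Result<&Vec<i64>, CheapDowncastError> {
    value
        .downcast_ref::<Vec<i64>>()
        .ok_or(CheapDowncastError { expected: "list" })
}

/// Models `extract`: the same check, but the error is converted eagerly,
/// even if the caller is going to discard it.
fn extract_list(value: &dyn Any) -> Result<&Vec<i64>, CostlyError> {
    downcast_list(value).map_err(CostlyError::from)
}

/// Polymorphic entry point that only pays for a costly error on the final branch.
fn frobnicate(value: &dyn Any) -> Result<i64, CostlyError> {
    if let Ok(list) = downcast_list(value) {
        Ok(list.iter().sum())
    } else if let Some(text) = value.downcast_ref::<String>() {
        Ok(text.len() as i64)
    } else {
        Err(CostlyError {
            message: "Cannot frobnicate that type.".to_owned(),
        })
    }
}
```

In this model, a failed `downcast_list` call in the first branch discards a value that was trivial to build, whereas routing the same check through `extract_list` would format and allocate a message only to throw it away.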

## Access to GIL-bound reference implies access to GIL token

Calling `Python::with_gil` is effectively a no-op when the GIL is already held, but checking that this is the case still has a cost. If an existing GIL token cannot be accessed, for example when implementing a pre-existing trait, but a GIL-bound reference is available, this cost can be avoided by exploiting the fact that access to a GIL-bound reference gives zero-cost access to a GIL token via `PyAny::py`.

For example, instead of writing

```rust
# #![allow(dead_code)]
# use pyo3::prelude::*;
# use pyo3::types::PyList;

struct Foo(Py<PyList>);

struct FooRef<'a>(&'a PyList);

impl PartialEq<Foo> for FooRef<'_> {
    fn eq(&self, other: &Foo) -> bool {
        Python::with_gil(|py| self.0.len() == other.0.as_ref(py).len())
    }
}
```

use the more efficient

```rust
# #![allow(dead_code)]
# use pyo3::prelude::*;
# use pyo3::types::PyList;
# struct Foo(Py<PyList>);
# struct FooRef<'a>(&'a PyList);
#
impl PartialEq<Foo> for FooRef<'_> {
    fn eq(&self, other: &Foo) -> bool {
        // Access to `&'a PyAny` implies access to `Python<'a>`.
        let py = self.0.py();
        self.0.len() == other.0.as_ref(py).len()
    }
}
```
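
The underlying idea can also be sketched without PyO3. The model below is an illustration under stated assumptions, not how PyO3 is implemented: `Token`, `with_token`, `BoundList` and `OwnedList` are hypothetical stand-ins for `Python<'py>`, `Python::with_gil`, a GIL-bound `&'py PyList` and `Py<PyList>`, and the thread-local counter merely represents the check that `with_gil` has to perform.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    /// Counts how often the (hypothetical) lock state had to be checked.
    static CHECKS: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for `Python<'py>`: a zero-sized proof that the lock is held.
#[derive(Clone, Copy)]
struct Token<'py>(PhantomData<&'py ()>);

/// Models `Python::with_gil`: always pays for a check before handing out a token.
fn with_token<F, R>(f: F) -> R
where
    F: for<'py> FnOnce(Token<'py>) -> R,
{
    CHECKS.with(|checks| checks.set(checks.get() + 1));
    f(Token(PhantomData))
}

/// Hypothetical stand-in for a GIL-bound reference: it can only be created from a token.
struct BoundList<'py> {
    items: &'py [i64],
    token: Token<'py>,
}

impl<'py> BoundList<'py> {
    fn new(token: Token<'py>, items: &'py [i64]) -> Self {
        BoundList { items, token }
    }

    /// Models `PyAny::py`: recovers the token from the reference without any check.
    fn token(&self) -> Token<'py> {
        self.token
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

/// Hypothetical stand-in for `Py<PyList>`: usable only together with a token.
struct OwnedList(Vec<i64>);

impl OwnedList {
    fn len(&self, _token: Token<'_>) -> usize {
        self.0.len()
    }
}

/// Re-acquires a token even though `a` already proves that one exists.
fn eq_slow(a: &BoundList<'_>, b: &OwnedList) -> bool {
    with_token(|token| a.len() == b.len(token))
}

/// Reuses the token implied by the bound reference, so no check is performed.
fn eq_fast(a: &BoundList<'_>, b: &OwnedList) -> bool {
    let token = a.token();
    a.len() == b.len(token)
}

/// Number of checks performed on the current thread so far.
fn checks() -> u32 {
    CHECKS.with(|checks| checks.get())
}
```

Calling `eq_fast` leaves the value returned by `checks()` unchanged, while every call to `eq_slow` increments it by one, which mirrors the cost the `PartialEq` implementation above avoids.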
1 change: 1 addition & 0 deletions newsfragments/3304.added.md
@@ -0,0 +1 @@
Added a "performance" section to the guide collecting performance-related tricks and problems.
1 change: 1 addition & 0 deletions src/lib.rs
@@ -496,6 +496,7 @@ pub mod doc_test {
"guide/src/migration.md" => guide_migration_md,
"guide/src/module.md" => guide_module_md,
"guide/src/parallelism.md" => guide_parallelism_md,
"guide/src/performance.md" => guide_performance_md,
"guide/src/python_from_rust.md" => guide_python_from_rust_md,
"guide/src/python_typing_hints.md" => guide_python_typing_hints_md,
"guide/src/rust_cpython.md" => guide_rust_cpython_md,