Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add FromPyObjectBound trait for extracting &str without GIL Refs #3928

Merged
merged 8 commits into from
Mar 8, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions guide/src/migration.md
Original file line number Diff line number Diff line change
Expand Up @@ -211,6 +211,8 @@ Interactions with Python objects implemented in Rust no longer need to go though

To minimise breakage of code using the GIL-Refs API, the `Bound<T>` smart pointer has been introduced by adding complements to all functions which accept or return GIL Refs. This allows code to migrate by replacing the deprecated APIs with the new ones.

To identify what to migrate, temporarily switch off the `gil-refs` feature to see deprecation warnings on all uses of APIs accepting and producing GIL Refs. Over one or more PRs it should be possible to follow the deprecation hints to update code. Depending on your development environment, switching off the `gil-refs` feature may introduce [some very targeted breakages](#deactivating-the-gil-refs-feature), so you may need to fixup those first.

For example, the following APIs have gained updated variants:
- `PyList::new`, `PyTyple::new` and similar constructors have replacements `PyList::new_bound`, `PyTuple::new_bound` etc.
- `FromPyObject::extract` has a new `FromPyObject::extract_bound` (see the section below)
Expand Down Expand Up @@ -272,7 +274,15 @@ impl<'py> FromPyObject<'py> for MyType {

The expectation is that in 0.22 `extract_bound` will have the default implementation removed and in 0.23 `extract` will be removed.

### Deactivating the `gil-refs` feature

As a final step of migration, deactivating the `gil-refs` feature will set up code for best performance and is intended to set up a forward-compatible API for PyO3 0.22.

There is one notable API removed when this feature is disabled. `FromPyObject` trait implementations for types which borrow directly from the input data cannot be implemented by PyO3 without GIL Refs (while the migration is ongoing). These types are `&str`, `Cow<'_, str>`, `&[u8]`, `Cow<'_, u8>`.

To ease pain during migration, these types instead implement a new temporary trait `FromPyObjectBound` which is the expected future form of `FromPyObject`. The new temporary trait ensures is that `obj.extract::<&str>()` continues to work (with the new constraint that the extracted value now depends on the input `obj` lifetime), as well for these types in `#[pyfunction]` arguments.

An unfortunate final point here is that PyO3 cannot offer this new implementation for `&str` on `abi3` builds older than Python 3.10. On code which needs to build for 3.10 or older, many cases of `.extract::<&str>()` may need to be replaced with `.extract::<PyBackedStr>()`, which is string data which borrows from the Python `str` object. Alternatively, use `.extract::<Cow<str>>()`, `.extract::<String>()` to copy the data into Rust for these versions.
davidhewitt marked this conversation as resolved.
Show resolved Hide resolved

## from 0.19.* to 0.20

Expand Down
55 changes: 55 additions & 0 deletions src/conversion.rs
Original file line number Diff line number Diff line change
Expand Up @@ -237,6 +237,61 @@ pub trait FromPyObject<'py>: Sized {
}
}

/// Expected form of [`FromPyObject`] to be used in a future PyO3 release.
///
/// The difference between this and `FromPyObject` is that this trait takes an
/// additional lifetime `'a`, which is the lifetime of the input `Bound`.
///
/// This allows implementations for `&'a str` and `&'a [u8]`, which could not
/// be expressed by the existing `FromPyObject` trait once the GIL Refs API was
/// removed.
///
/// # Usage
///
/// Users are generally advised against implementing this trait, instead they should
/// implement the normal `FromPyObject` trait. This trait has a blanket implementation
/// for `T: FromPyObject`.
///
/// The only case where this trait should be implemented is when the lifetime of the
/// extracted value is tied to the lifetime `'a` of the input `Bound` instead of the
/// GIL lifetime `py`, as is the case for the `&'a str` implementation.
///
/// Similarly, users should typically not call these trait methods and should instead
/// use this via the `extract` method on `Bound` and `Py`.
pub trait FromPyObjectBound<'a, 'py>: Sized {
/// Extracts `Self` from the bound smart pointer `obj`.
///
/// Users are advised against calling this method directly: instead, use this via
/// [`Bound<'_, PyAny>::extract`] or [`Py::extract`].
fn from_py_object_bound(ob: &'a Bound<'py, PyAny>) -> PyResult<Self>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also already call this from_py_object, but maybe it's better to have a clear distinction for now...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one question which made me hesitate on encouraging users to use this trait too much (and also make this name clunky) is that I think to resolve the TODO in instance.rs:

        // TODO it might be possible to relax this bound in future, to allow
        // e.g. `.extract::<&str>(py)` where `py` is short-lived.
        'py: 'a,

we would need to make this trait accept ob: Borrowed<'a, 'py, PyAny> as the input, as that's basically &Bound with additional flexibility on the relationship between 'a and 'py.

It felt to me that this is a hurdle we can cross down the line, and working with Bound for now keeps it simple...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, using Borrowed there would certainly be an option. I agree that this is a question best answered once we actually removed the gil-refs and do the final migration. We could also seal this trait for now (but maybe that's a bit drastic). Removing a seal in a patch release, if it turns out to be necessary, would also be an option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's seal this, and leave a note saying that "if you have a need to use this trait, please let us know and help us work out the design".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I pushed an additional commit to seal the trait.


/// Extracts the type hint information for this type when it appears as an argument.
///
/// For example, `Vec<u32>` would return `Sequence[int]`.
/// The default implementation returns `Any`, which is correct for any type.
///
/// For most types, the return value for this method will be identical to that of [`IntoPy::type_output`].
/// It may be different for some types, such as `Dict`, to allow duck-typing: functions return `Dict` but take `Mapping` as argument.
#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
TypeInfo::Any
}
}

impl<'py, T> FromPyObjectBound<'_, 'py> for T
where
T: FromPyObject<'py>,
{
fn from_py_object_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self> {
Self::extract_bound(ob)
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
<T as FromPyObject>::type_input()
}
}

/// Identity conversion: allows using existing `PyObject` instances where
/// `T: ToPyObject` is expected.
impl<T: ?Sized + ToPyObject> ToPyObject for &'_ T {
Expand Down
43 changes: 38 additions & 5 deletions src/conversions/std/slice.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,11 @@ use std::borrow::Cow;

#[cfg(feature = "experimental-inspect")]
use crate::inspect::types::TypeInfo;
#[cfg(not(feature = "gil-refs"))]
use crate::types::PyBytesMethods;
use crate::{
types::{PyByteArray, PyBytes},
FromPyObject, IntoPy, Py, PyAny, PyObject, PyResult, Python, ToPyObject,
types::{PyAnyMethods, PyByteArray, PyByteArrayMethods, PyBytes},
Bound, IntoPy, Py, PyAny, PyObject, PyResult, Python, ToPyObject,
};

impl<'a> IntoPy<PyObject> for &'a [u8] {
Expand All @@ -18,7 +20,8 @@ impl<'a> IntoPy<PyObject> for &'a [u8] {
}
}

impl<'py> FromPyObject<'py> for &'py [u8] {
#[cfg(feature = "gil-refs")]
impl<'py> crate::FromPyObject<'py> for &'py [u8] {
fn extract(obj: &'py PyAny) -> PyResult<Self> {
Ok(obj.downcast::<PyBytes>()?.as_bytes())
}
Expand All @@ -29,20 +32,50 @@ impl<'py> FromPyObject<'py> for &'py [u8] {
}
}

#[cfg(not(feature = "gil-refs"))]
impl<'a> crate::conversion::FromPyObjectBound<'a, '_> for &'a [u8] {
fn from_py_object_bound(obj: &'a Bound<'_, PyAny>) -> PyResult<Self> {
Ok(obj.downcast::<PyBytes>()?.as_bytes())
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
Self::type_output()
}
}

/// Special-purpose trait impl to efficiently handle both `bytes` and `bytearray`
///
/// If the source object is a `bytes` object, the `Cow` will be borrowed and
/// pointing into the source object, and no copying or heap allocations will happen.
/// If it is a `bytearray`, its contents will be copied to an owned `Cow`.
impl<'py> FromPyObject<'py> for Cow<'py, [u8]> {
fn extract(ob: &'py PyAny) -> PyResult<Self> {
#[cfg(feature = "gil-refs")]
impl<'py> crate::FromPyObject<'py> for Cow<'py, [u8]> {
fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self> {
if let Ok(bytes) = ob.downcast::<PyBytes>() {
return Ok(Cow::Borrowed(bytes.clone().into_gil_ref().as_bytes()));
}

let byte_array = ob.downcast::<PyByteArray>()?;
Ok(Cow::Owned(byte_array.to_vec()))
}
}

#[cfg(not(feature = "gil-refs"))]
impl<'a> crate::conversion::FromPyObjectBound<'a, '_> for Cow<'a, [u8]> {
fn from_py_object_bound(ob: &'a Bound<'_, PyAny>) -> PyResult<Self> {
if let Ok(bytes) = ob.downcast::<PyBytes>() {
return Ok(Cow::Borrowed(bytes.as_bytes()));
}

let byte_array = ob.downcast::<PyByteArray>()?;
Ok(Cow::Owned(byte_array.to_vec()))
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
Self::type_output()
}
}

impl ToPyObject for Cow<'_, [u8]> {
Expand Down
55 changes: 46 additions & 9 deletions src/conversions/std/string.rs
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,8 @@ impl<'a> IntoPy<PyObject> for &'a String {
}

/// Allows extracting strings from Python objects.
/// Accepts Python `str` and `unicode` objects.
/// Accepts Python `str` objects.
#[cfg(feature = "gil-refs")]
impl<'py> FromPyObject<'py> for &'py str {
fn extract(ob: &'py PyAny) -> PyResult<Self> {
ob.downcast::<PyString>()?.to_str()
Expand All @@ -125,6 +126,42 @@ impl<'py> FromPyObject<'py> for &'py str {
}
}

#[cfg(all(not(feature = "gil-refs"), any(Py_3_10, not(Py_LIMITED_API))))]
impl<'a> crate::conversion::FromPyObjectBound<'a, '_> for &'a str {
fn from_py_object_bound(ob: &'a Bound<'_, PyAny>) -> PyResult<Self> {
ob.downcast::<PyString>()?.to_str()
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
<String as crate::FromPyObject>::type_input()
}
}

#[cfg(feature = "gil-refs")]
impl<'py> FromPyObject<'py> for Cow<'py, str> {
fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self> {
ob.extract().map(Cow::Owned)
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
<String as crate::FromPyObject>::type_input()
}
}

#[cfg(not(feature = "gil-refs"))]
impl<'a> crate::conversion::FromPyObjectBound<'a, '_> for Cow<'a, str> {
fn from_py_object_bound(ob: &'a Bound<'_, PyAny>) -> PyResult<Self> {
ob.downcast::<PyString>()?.to_cow()
}

#[cfg(feature = "experimental-inspect")]
fn type_input() -> TypeInfo {
<String as crate::FromPyObject>::type_input()
}
}

/// Allows extracting strings from Python objects.
/// Accepts Python `str` and `unicode` objects.
impl FromPyObject<'_> for String {
Expand Down Expand Up @@ -169,9 +206,9 @@ mod tests {
Python::with_gil(|py| {
let s = "Hello Python";
let py_string: PyObject = Cow::Borrowed(s).into_py(py);
assert_eq!(s, py_string.extract::<&str>(py).unwrap());
assert_eq!(s, py_string.extract::<Cow<'_, str>>(py).unwrap());
let py_string: PyObject = Cow::<str>::Owned(s.into()).into_py(py);
assert_eq!(s, py_string.extract::<&str>(py).unwrap());
assert_eq!(s, py_string.extract::<Cow<'_, str>>(py).unwrap());
})
}

Expand All @@ -180,9 +217,9 @@ mod tests {
Python::with_gil(|py| {
let s = "Hello Python";
let py_string = Cow::Borrowed(s).to_object(py);
assert_eq!(s, py_string.extract::<&str>(py).unwrap());
assert_eq!(s, py_string.extract::<Cow<'_, str>>(py).unwrap());
let py_string = Cow::<str>::Owned(s.into()).to_object(py);
assert_eq!(s, py_string.extract::<&str>(py).unwrap());
assert_eq!(s, py_string.extract::<Cow<'_, str>>(py).unwrap());
})
}

Expand All @@ -201,7 +238,7 @@ mod tests {
let s = "Hello Python";
let py_string = s.to_object(py);

let s2: &str = py_string.bind(py).extract().unwrap();
let s2: Cow<'_, str> = py_string.bind(py).extract().unwrap();
assert_eq!(s, s2);
})
}
Expand Down Expand Up @@ -238,19 +275,19 @@ mod tests {
assert_eq!(
s,
IntoPy::<PyObject>::into_py(s3, py)
.extract::<&str>(py)
.extract::<Cow<'_, str>>(py)
.unwrap()
);
assert_eq!(
s,
IntoPy::<PyObject>::into_py(s2, py)
.extract::<&str>(py)
.extract::<Cow<'_, str>>(py)
.unwrap()
);
assert_eq!(
s,
IntoPy::<PyObject>::into_py(s, py)
.extract::<&str>(py)
.extract::<Cow<'_, str>>(py)
.unwrap()
);
})
Expand Down
19 changes: 16 additions & 3 deletions src/impl_/extract_argument.rs
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
use crate::{
conversion::FromPyObjectBound,
exceptions::PyTypeError,
ffi,
pyclass::boolean_struct::False,
types::{any::PyAnyMethods, dict::PyDictMethods, tuple::PyTupleMethods, PyDict, PyTuple},
Borrowed, Bound, FromPyObject, PyAny, PyClass, PyErr, PyRef, PyRefMut, PyResult, PyTypeCheck,
Python,
Borrowed, Bound, PyAny, PyClass, PyErr, PyRef, PyRefMut, PyResult, PyTypeCheck, Python,
};

/// Helper type used to keep implementation more concise.
Expand All @@ -27,7 +27,7 @@ pub trait PyFunctionArgument<'a, 'py>: Sized + 'a {

impl<'a, 'py, T> PyFunctionArgument<'a, 'py> for T
where
T: FromPyObject<'py> + 'a,
T: FromPyObjectBound<'a, 'py> + 'a,
{
type Holder = ();

Expand All @@ -52,6 +52,19 @@ where
}
}

#[cfg(all(Py_LIMITED_API, not(any(feature = "gil-refs", Py_3_10))))]
impl<'a> PyFunctionArgument<'a, '_> for &'a str {
type Holder = Option<std::borrow::Cow<'a, str>>;

#[inline]
fn extract(
obj: &'a Bound<'_, PyAny>,
holder: &'a mut Option<std::borrow::Cow<'a, str>>,
) -> PyResult<Self> {
Ok(holder.insert(obj.extract()?))
}
}

/// Trait for types which can be a function argument holder - they should
/// to be able to const-initialize to an empty value.
pub trait FunctionArgumentHolder: Sized {
Expand Down
9 changes: 6 additions & 3 deletions src/impl_/pymodule.rs
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,10 @@ impl<T: PyTypeInfo> PyAddToModule for T {

#[cfg(test)]
mod tests {
use std::sync::atomic::{AtomicBool, Ordering};
use std::{
borrow::Cow,
sync::atomic::{AtomicBool, Ordering},
};

use crate::{
types::{any::PyAnyMethods, module::PyModuleMethods, PyModule},
Expand All @@ -176,15 +179,15 @@ mod tests {
module
.getattr("__name__")
.unwrap()
.extract::<&str>()
.extract::<Cow<'_, str>>()
.unwrap(),
"test_module",
);
assert_eq!(
module
.getattr("__doc__")
.unwrap()
.extract::<&str>()
.extract::<Cow<'_, str>>()
.unwrap(),
"some doc",
);
Expand Down
5 changes: 4 additions & 1 deletion src/inspect/types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -464,7 +464,10 @@ mod conversion {
assert_display(&String::type_input(), "str");

assert_display(&<&[u8]>::type_output(), "bytes");
assert_display(&<&[u8]>::type_input(), "bytes");
assert_display(
&<&[u8] as crate::conversion::FromPyObjectBound>::type_input(),
"bytes",
);
}

#[test]
Expand Down
7 changes: 5 additions & 2 deletions src/instance.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1315,9 +1315,12 @@ impl<T> Py<T> {
/// Extracts some type from the Python object.
///
/// This is a wrapper function around `FromPyObject::extract()`.
pub fn extract<'py, D>(&self, py: Python<'py>) -> PyResult<D>
pub fn extract<'a, 'py, D>(&'a self, py: Python<'py>) -> PyResult<D>
where
D: FromPyObject<'py>,
D: crate::conversion::FromPyObjectBound<'a, 'py>,
// TODO it might be possible to relax this bound in future, to allow
// e.g. `.extract::<&str>(py)` where `py` is short-lived.
'py: 'a,
{
self.bind(py).as_any().extract()
}
Expand Down
Loading
Loading