Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.21.0-beta.0: can't extract to Vec<Cow<'_, [u8]>> without gil-refs feature #3990

Closed
bmerry opened this issue Mar 24, 2024 · 11 comments · Fixed by #4031
Closed

0.21.0-beta.0: can't extract to Vec<Cow<'_, [u8]>> without gil-refs feature #3990

bmerry opened this issue Mar 24, 2024 · 11 comments · Fixed by #4031
Labels

Comments

@bmerry
Copy link

bmerry commented Mar 24, 2024

Bug Description

Even though I'm using the new Bound API (in a new project), it seems I have to enable the gil-refs feature for Bound::extract::<Vec<Cow<'_, [u8]>>> to compile. I don't know if that's expected, but it was surprising to me, since the documentation led me to believe the feature should only be needed for backwards compatibility.

Steps to Reproduce

In src/lib.rs:

use pyo3::prelude::*;
use pyo3::types::PyList;
use std::borrow::Cow;

#[pyclass]
struct Foo {}

#[pymethods]
impl Foo {
    fn bar(&self, x: Bound<'_, PyList>) -> PyResult<()> {
        let _value: Vec<Cow<'_, [u8]>> = x.extract()?;
        Ok(())
    }
}

In Cargo.toml:

[package]
name = "extract"
edition = "2021"

[lib]
name = "_lib"
crate-type = ["cdylib", "rlib"]  # rlib included just for testing

[dependencies]
pyo3 = { version = "0.21.0-beta.0", features = ["extension-module"] }

Run cargo build.

Output:

error[E0277]: the trait bound `Vec<Cow<'_, [u8]>>: pyo3::FromPyObject<'_>` is not satisfied
    --> src/lib.rs:11:44
     |
11   |         let _value: Vec<Cow<'_, [u8]>> = x.extract()?;
     |                                            ^^^^^^^ the trait `pyo3::FromPyObject<'_>` is not implemented for `Vec<Cow<'_, [u8]>>`
     |
     = help: the trait `pyo3::FromPyObject<'py>` is implemented for `Vec<T>`
     = note: required for `Vec<Cow<'_, [u8]>>` to implement `FromPyObjectBound<'_, '_>`
note: required by a bound in `pyo3::prelude::PyAnyMethods::extract`
    --> /home/bruce/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.21.0-beta.0/src/types/any.rs:1647:12
     |
1645 |     fn extract<'a, T>(&'a self) -> PyResult<T>
     |        ------- required by a bound in this associated function
1646 |     where
1647 |         T: FromPyObjectBound<'a, 'py>;
     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `PyAnyMethods::extract`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `extract` (lib) due to previous error

Backtrace

No response

Your operating system and version

Ubuntu 22.04

Your Python version (python --version)

3.10.12

Your Rust version (rustc --version)

rustc 1.75.0 (82e1608df 2023-12-21)

Your PyO3 version

0.21.0-beta.0

How did you install python? Did you use a virtualenv?

apt

Additional Info

When enabling the gil-refs feature, the sample code compiles without warnings.

I should add that I'm using Vec<Cow<'_, [u8]>> because it's the natural fit for the Rust code I'm trying to bind, rather than expecting it to borrow the memory from the bytes objects stored in the PyList. I can imagine that might not be practical without using a GILPool to keep them alive.

@bmerry bmerry added the bug label Mar 24, 2024
@adamreichold
Copy link
Member

I think the additional Vec layer might obstruct our workaround to keep this available with GIL refs disabled but without make FromPyObjectBound the new FromPyObject, i.e. we are missing the impl

impl<'py, T> FromPyObjectBound<'_, 'py> for Vec<T>
where
  T: FromPyObjectBound<'_, 'py>

which is not provided by the blanket impl based on FromPyObject which does not "lift" the "bound variant" into the interiors of these types.

In your case, could you try extract the Cow<'_, [u8]> from the list items directly and manually collecting them into a Vec?

@adamreichold
Copy link
Member

We could provide those impls but it would be a bit of a game of whack-a-mole as the same reasoning applies to e.g. HashMap or any other container and the problem should go away with 0.22 when FromPyObjectBound becomes the new FromPyObject.

@adamreichold
Copy link
Member

adamreichold commented Mar 24, 2024

In your case, could you try extract the Cow<'_, [u8]> from the list items directly and manually collecting them into a Vec?

Of course, this does not work as the new extract method yields a reduced lifetime, i.e.

error[E0515]: cannot return value referencing function parameter `item`
  --> src/lib.rs:11:62
   |
11 |         let _value: Vec<Cow<'_, [u8]>> = x.iter().map(|item| item.extract()).collect::<PyResult<_>>()?;
   |                                                              ----^^^^^^^^^^
   |                                                              |
   |                                                              returns a value referencing data owned by the current function
   |                                                              `item` is borrowed here

happens. So I suspect this will require PyBackedBytes in any case and even the additional impl I suggested above would not help because it cannot magically extend the lifetimes of the list items.

@adamreichold
Copy link
Member

So indeed

use pyo3::prelude::*;
use pyo3::pybacked::PyBackedBytes;
use pyo3::types::PyList;

#[pyclass]
struct Foo {}

#[pymethods]
impl Foo {
    fn bar(&self, x: Bound<'_, PyList>) -> PyResult<()> {
        let _value: Vec<PyBackedBytes> = x.extract()?;
        Ok(())
    }
}

should be the closest thing to the previous pattern you can do without the GIL pool.

@davidhewitt
Copy link
Member

I agree with the above, probably the documentation could do better to emphasise this case has changed.

In general the additional 'a lifetime in FromPyObjectBound is useful for getting data out of single Python objects but has no real benefit for collections where reading them necessarily requires temporary ownership of the inner values.

It might be that we choose never to have FromPyObjectBound and just embrace owning structures like PyBackedBytes. I think the next couple of release cycles will tell us what feels best.

@bmerry
Copy link
Author

bmerry commented Mar 24, 2024

Thanks, I wasn't aware of PyBackedBytes! It sounds useful. I can probably write code to accept Vec<T> where T: Deref<Target = [u8]> (or an iterator or whatever) which can then consume Vec<PyBackedBytes> without any need to use Cow.

@adamreichold in your example, are the PyBackedBytes objects backed by the bytes objects from Python or do they get copied into a Box?

@davidhewitt should I leave this open while the documentation is figured out?

@davidhewitt
Copy link
Member

Sure thing, yes.

Thanks, I wasn't aware of PyBackedBytes! It sounds useful. I can probably write code to accept Vec<T> where T: Deref<Target = [u8]> (or an iterator or whatever) which can then consume Vec<PyBackedBytes> without any need to use Cow.

Reading this I immediately wonder if we should implement AsRef[u8]> if we didn't already, as I think that's more idiomatic than using Deref as the bound.

@bmerry
Copy link
Author

bmerry commented Mar 24, 2024

Reading this I immediately wonder if we should implement AsRef[u8]> if we didn't already, as I think that's more idiomatic than using Deref as the bound.

I'm still fairly new to Rust and I'm still figuring out when to use AsRef vs Deref vs Borrow, but that sounds good to me.

@adamreichold
Copy link
Member

in your example, are the PyBackedBytes objects backed by the bytes objects from Python or do they get copied into a Box?

The clone only a reference to the bytes object, so indeed the data itself is a reference into the bytes, i.e. backed by the bytes which is what we tried to convey by the name.

@adamreichold
Copy link
Member

Reading this I immediately wonder if we should implement AsRef[u8]> if we didn't already, as I think that's more idiomatic than using Deref as the bound.

I agree that as a bound AsRef would be more natural, but types which really hold only a T often implement both Deref and AsRef and I would also suggest we provide both impls here.

@davidhewitt
Copy link
Member

Agreed both implementations are reasonable here; I wasn't thinking to remove the Deref implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants