-
Notifications
You must be signed in to change notification settings - Fork 783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add second lifetime to FromPyObject
#4390
base: main
Are you sure you want to change the base?
Conversation
I propose that we add a |
Thanks! I think that's a great idea. It's essentially the same thing, but it's much easier to grasp by giving it a distinct name, plus we can better document it in the API docs. I will adapt this to make use of that. |
c960124
to
a1c0bed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing on with this! Overall it looks good to me, and I only have a few docs suggestions.
Before we definitely commit to this, I want to take a brief moment just to reflect on the choice of Borrowed
over &Bound
. There's at least two good technical reasons, which is the lifetime constraint and performance by avoiding pointer-to-pointer. The codspeed branch does show a few slight perf improvements in the ~4% range on tuple extraction and we have further possibilities like PyRef
containing Borrowed
once we do this.
That said, we deliberately kept the Borrowed
smart pointer out of the way as much as possible in the original 0.21 design to try to keep new concepts down. Is it fine to increase its use? I think the upsides justify this, though we might want to increase the quality of documentation and examples on Borrowed
(which I'd left quite minimal so far).
@@ -616,8 +616,9 @@ pub fn build_derive_from_pyobject(tokens: &DeriveInput) -> Result<TokenStream> { | |||
let ident = &tokens.ident; | |||
Ok(quote!( | |||
#[automatically_derived] | |||
impl #trait_generics #pyo3_path::FromPyObject<#lt_param> for #ident #generics #where_clause { | |||
fn extract_bound(obj: &#pyo3_path::Bound<#lt_param, #pyo3_path::PyAny>) -> #pyo3_path::PyResult<Self> { | |||
impl #trait_generics #pyo3_path::FromPyObject<'_, #lt_param> for #ident #generics #where_clause { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be an interesting question for later if we ever support the 'a
lifetime in our #[derive(FromPyObject)]
. Definitely not for now 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, something to experiment with in the future, but out of scope for now for sure 😅
src/conversion.rs
Outdated
/// the normal `FromPyObject` trait. This trait has a blanket implementation | ||
/// for `T: FromPyObject`. | ||
/// Note: depending on the implementation, the lifetime of the extracted result may | ||
/// depend on the lifetime of the `obj` or the `prepared` variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a
rather than prepared
here and below? Also maybe written as 'py
or 'a
, as we do elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was a bit lazy and just uncommented the docs that we temporarily removed during gil-ref migration.
How about
/// Note: depending on the implementation, the extracted result may
/// depend on the Python lifetime `'py` or the input lifetime `'a` of `obj`.
Also the example below that seem a bit confusing, what do you think about using Cow<str>
as an example, which may or may not borrow from the input ('a
) depending on the Python runtime type and maybe Bound<'py, PyString>
as an example that depends on the Python lifetime? Maybe we also add an example of a collection type (maybe Vec
) to introduce FromPyObjectOwned
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of that sounds great, yes please 👍
impl<'py, T> FromPyObject<'_, 'py> for PyRef<'py, T> | ||
where | ||
T: PyClass, | ||
{ | ||
fn extract_bound(obj: &Bound<'py, PyAny>) -> PyResult<Self> { | ||
fn extract(obj: Borrowed<'_, 'py, PyAny>) -> PyResult<Self> { | ||
obj.downcast::<T>()?.try_borrow().map_err(Into::into) | ||
} | ||
} | ||
|
||
impl<'py, T> FromPyObject<'py> for PyRefMut<'py, T> | ||
impl<'py, T> FromPyObject<'_, 'py> for PyRefMut<'py, T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then the follow-up future PR will be what to do about PyRef
. Should we split into PyRef
/ PyRefBorrowed
, or keep the one type PyRef<'a, 'py, T>
and accept some breakage? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think I we want to provide both variants, I would prefer PyRef<'a, 'py, T>
and PyRefOwned<'py, T>
since that would be more consistent with FromPyObjectOwned
.
In any case we would need to find a good way to differentiate their constructors. Neither obj.borrow_borrowed()
nor obj.borrow_owned()
feel to appealing to me 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it's tough to find names. I'm somewhat tempted to wait until after 0.23 and to first see how all these trait changes we've already made play out before we change PyRef
.
Thanks for the review! You are right, the choice of
I agree, if we put |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the recently added FromPyObject
documentation is too "implementation detail"-y.
I'd prefer if this documentation is kept short and concise, one paragraph max.
For example:
Depending on the python version and implementation some `FromPyObject`
implementations may produce Rust types that point into the Python type.
For example `PyString` and `PyBytes` may convert into `str` and `[u8]` references
that borrow from the original Python object.
Types that do not borrow from the input should use `FromPyObjectOwned` instead.
src/conversion.rs
Outdated
/// borrow from the input lifetime `'a`. The behavior depends on the runtime | ||
/// type of the Python object. For a Python byte string, the existing string | ||
/// data can be borrowed (lifetime: `'a`) into a [`Cow::Borrowed`]. For a Python | ||
/// Unicode string, the data may have to be reencoded to UTF-8, and copied into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or not, depending on the python version the string will be interned in the unicode object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm aware of that, but that felt like too much detail for me, and is not really relevant for the point I am trying to make here. That's why I tried to phrase it in a rather vague way.
Thanks for the feedback! I took another pass over it and tried to make it a bit more concise. I do however think that it's worth to talk about the individual lifetimes at play here, since this trait is right in the center of PyO3. I moved the paragraph about collections into the Let me know what you think about this. |
9cd0a41
to
91ef5fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for keeping this branch alive, I guess let's move forward with this before it gets too painful to keep updating.
I had yet another reflection on &Bound
vs Borrowed
and I'm feeling reasonably comfortable that Borrowed
is the correct choice.
@@ -656,8 +656,8 @@ macro_rules! tuple_conversion ({$length:expr,$(($refN:ident, $n:tt, $T:ident)),+ | |||
} | |||
} | |||
|
|||
impl<'py, $($T: FromPyObject<'py>),+> FromPyObject<'py> for ($($T,)+) { | |||
fn extract_bound(obj: &Bound<'py, PyAny>) -> PyResult<Self> | |||
impl<'a, 'py, $($T: FromPyObject<'a, 'py>),+> FromPyObject<'a, 'py> for ($($T,)+) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for the record and the decision to stick with Borrowed
, the reason is that if we used &'a Bound
here for the input argument, then tuple extraction would be forced to restrict elements to FromPyObjectOwned
.
That maybe wouldn't be a terrible thing, as this is currently a special case when compared with every other container, which is forced to use FromPyObjectOwned
due to their mutability.
But I think it would be a breaking change, so it's probably not a good change overall. (E.g. users would lose the ability to extract tuples containing &str
, &Bound
and Borrowed
)
I think as a secondary effect, by using Borrowed
here we might be able to change extract_argument.rs
to make more use of Borrowed
, which IIRC is a possible performance win.
I believe this now also finishes the |
Fantastic, thank you so much for your help on pushing that over the line. When I get some time I'll try to write up a list of what we might have left for 0.23 (hopefully not much). |
Ah, I just started playing with this locally and found one additional finding: having Is that a problem? I think not, because that's what |
Oh, and one more case which comes up in local testing. At the moment |
It feels to me like we might want to consider macro tricks such that users can use both |
That's true, but as you mention below, I actually think this a good thing and makes a clearer use case between
Hmm, good question. I guess that is part of the question about how prominent we want |
Yes agreed, it's definitely a follow-up to make changes to expand I wonder, are there any other options we have here? E.g. does it work to have a trait with both I am still feeling a little uneasy about the choice to commit |
And (sorry to keep dangling ideas on here), inspired by the new
Both changes would significantly complicate the trait. I wonder, is there a way that we can support a simple trait like All food for thought and I think worth discussing while we're committing to breaking changes here. |
No problem 😄
If we're willing to do more breaking changes here, that sounds like a good idea to me. The only downsides I can see
This would be really cool indeed. I guess we would maybe somehow need to take advantage of method precedence between trait and inherent methods, to make the fall through to |
If we're breaking |
That would make a very nice symmetry. I think if we do it, then it should work for all containers, which would probably mean one hidden method per container (vec, smallvec, arrays), maybe these could be unified with "return position impl trait in trait" (with something like |
0205631
to
2dbfa84
Compare
2dbfa84
to
2e2cf36
Compare
I've looked into this again. I think we would (for example) want something like this trait FromPyObject<'a, 'py>: Sized {
// snip
#[doc(hidden)]
fn extract_array<const N: usize>(
obj: Borrowed<'_, 'py, PyAny>,
_: private::Token,
) -> PyResult<[Self; N]>
where
Self: FromPyObjectOwned<'py>,
{
todo!()
}
} which we then use inside of Unfortunately this currently does not compile. I believe this is the corresponding |
Ah, nice research 👍. Agreed we can make a reminder for ourselves to improve bytes specializations once MSRV uses the next solve (gonna be a while 😂). Given the early reception of This branch really deserves to be merged asap, so please forgive me the delay, I think this now is highest thing on my to-do perhaps excluding a 0.23.1 patch release. I will try to play around with this branch and see how I feel about the ergonomics & perf, maybe once the kids are in bed tonight. |
I tend to agree here. We can maybe think about adding the
Absolutely no worries :) |
One idea which struck me, what if we keep The idea would be that here we are currently breaking the main trait to get to two lifetimes, and we are introducing But I think unlike This idea might have the nice boost that it's probably less breaking than adding the second lifetime to I feel like there's some detail about |
Interesting thought, I have't played around with it yet, but I opened with #4720 based on this including the changes adding the second lifetime to Maybe we can use that to play around with and judge whats the best option here. My initial feeling says that the one single trait approach might have a bigger hurdle in the beginning but is easier in the long run (less traits, using common patters) but I haven't tried the other approach yet. |
Co-authored-by: David Hewitt <mail@davidhewitt.dev>
2e2cf36
to
7636f0b
Compare
I did play around with the idea a bit. It looks like it is possible to stick the |
Hmm, interesting. One nice thing about
is that I'd guess most users would choose to implement it anyway as it's the simpler to implement. It's quite appealing to me that the default is simple. Plus it's less breaking than adding the lifetime & using I also feel like it's probably quite rare to have a case where I'm quite happy to be wrong here, though my gut feeling is that if cheap-and-cheerful |
This adds the second lifetime to
FromPyObject
and removesFromPyObjectBound
. I hope I adjusted all the bounds correctly. I believe the equivalent to our current implementation is use HRTB for the input lifetime on any generic bound. But this should only be necessary for containers that create temporary Python references during extraction. For wrapper types it should be possible to just forward the relaxed bound.For easier trait bounds
FromPyObjectOwned
is introduced. This is blanket implemented for anyFromPyObject
type that does not depend on the input lifetime. It is intended to be used as a trait bound, the idea is inspired byserde
(Deserialize
<=>DeserializeOwned
)I tried to document different cases in the migration guide, but it can probably still be extended.
Changelog entry is still missing, will write that tomorrow.Breaking changes:
extract
method without defaultextract_bound
to make the default work