Support all receiver types for all protocol methods #1206

davidhewitt · 2020-09-23T06:39:04Z

At the moment the protocol methods are in an inconsistent state: some of them can take PyRef or PyRefMut, and some of them take &self or &mut self.

This is confusing to users and also gets in the way in certain cases like needing to access Python inside a protocol method (which can be obtained from PyRef::py() for example) or wanting to return PyRef<Self>.

The simple solution is to change all protocol methods to use the TryFromPyCell trait. However, this is a breaking change.

The better solution is to change all #[pyproto] methods to support any of the five of &PyCell, PyRef<Self>, PyRefMut<Self>, &self and &mut self, just like we support for #[pymethods].

I've had some ideas how to approach this second point so would like to take a shot at it soon.

kngwyu · 2020-09-23T08:19:25Z

> The better solution is to change all #[pyproto] methods to support any of the five of &PyCell, PyRef<Self>, PyRefMut<Self>, &self and &mut self, just like we support for #[pymethods].

So what would the trait definition be?
Or do we remove the __dunder__ methods?

davidhewitt · 2020-09-23T08:23:35Z

I think I see a way to make a PyMethodReceiver trait which is implemented by all five types.

davidhewitt · 2020-09-28T08:16:57Z

Quick update from me: over the weekend I had a little time to play with a trait along the lines of:

pub trait PyMethodReceiver<'a>: Sized {
    fn receive<T, U>(slf: &'a PyAny, f: impl FnOnce(Self) -> T) -> PyResult<U>
    where
        T: IntoPyCallbackOutput<U>,
        U: 'static;
}

It almost works, but there are some lifetime challenges around making sure the receiver lifetime is scoped correctly and safely. For the &self and &mut self receivers, I currently have some issues:

// THIS IMPL ISN'T SAFE BECAUSE &'a C IN THE CLOSURE CAN OUTLIVE THE `PYREF` GUARD
impl<'a, C: PyClass + 'a> PyMethodReceiver<'a> for &'a C {
    fn receive<T, U>(slf: &'a PyAny, f: impl FnOnce(Self) -> T) -> PyResult<U>
    where
        T: IntoPyCallbackOutput<U>,
        U: 'static
    {
        let cell: &PyCell<C> = slf.extract()?;

        // Introduce a PyRef to hold the guard. The lifetime of this is not long enough.
        let _ref = cell.try_borrow()?;

        // XXX: Use of unsafe below is not sound; was a hack to get compilation but the lifetime inferred outlives the guard.
        f(unsafe { cell.try_borrow_unguarded()? }).convert(slf.py())
    }
}

I'm hopeful that if I continue experimenting with designs in this space I'll be able to come up with a definition which is sound and gets us what we want.
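For readers who want to see the scoping problem outside PyO3, here is a minimal sketch using only the standard library. `RefCell` stands in for `PyCell` and its `Ref` guard for `PyRef`; `receive_ref` and `demo` are hypothetical names, not PyO3 APIs, and this is not necessarily the design the thread ends up with. The point it illustrates: if the closure must accept a reference of any lifetime, its result cannot hold on to that reference, so the guard can be dropped safely when the function returns.

```rust
use std::cell::{BorrowError, RefCell};

/// Hypothetical helper: runs `f` with a shared borrow of the cell's contents.
/// The higher-ranked bound (`for<'b>`) means `T` cannot contain the `&'b C`,
/// so nothing borrowed from the guard escapes this function.
fn receive_ref<C, T>(
    cell: &RefCell<C>,
    f: impl for<'b> FnOnce(&'b C) -> T,
) -> Result<T, BorrowError> {
    // The guard lives until the end of this function body.
    let guard = cell.try_borrow()?;
    Ok(f(&*guard))
}

/// Usage: the borrow lasts only for the duration of the closure call.
fn demo() -> (i32, bool) {
    let cell = RefCell::new(41);
    let answer = receive_ref(&cell, |value| *value + 1).unwrap();

    // While a mutable borrow is live, the shared borrow is refused
    // instead of silently aliasing it.
    let _exclusive = cell.borrow_mut();
    let refused = receive_ref(&cell, |value| *value).is_err();

    (answer, refused)
}
```

The impl quoted above fixes the closure argument to `&'a C` instead, which is why the reference is allowed to outlive the guard there.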

kngwyu · 2020-09-28T08:25:36Z

So you mean you're extending the current fn __dunder__(slf: Self::Receiver...) to accept &Self or &mut Self type?
Interesting, but I'm not sure we really need this.
Both fn __dunder__(slf: &Self) and fn __dunder__(slf: PyRef<Self>) are not straightforward to write, so I think the usability gain is not so much.

davidhewitt · 2020-10-01T07:26:34Z

I think that the most important thing is that we support fn __dunder__(slf: PyRef<Self>), for a couple of reasons:

- It's possible to return slf: PyRef<Self> from methods, which is not possible with &self.
- It's possible to get Python from slf.py(), which is also not possible with &self.

So for 0.13 we could change all dunder methods to use TryFromPyCell trait, even if we can't support the full set.

n1t0 · 2020-12-08T19:41:51Z

I'm also very interested in this, for another use case: PyRef<Self> gives access to all inherited classes. I don't think there is any way (even unsafe) to access them with the provided &self.

I started to implement this type then quickly found myself blocked on PyO3/pyo3#1205 / PyO3/pyo3#1206 due to not being able to return `Self` from `__enter__`. We'll have to wait for a future pyo3 release before we can finish the Rust port.

davidhewitt · 2021-01-14T23:47:35Z

I took another look at this today, with a hope that after #1328 it should be possible to do further refactoring to support all of the receiver types listed above.

The answer is that if we change the traits to have slf: Self::Receiver as the first argument, e.g. like the existing PyIterProtocol trait:

pub trait PyIterProtocol<'p>: PyClass {
    fn __iter__(slf: Self::Receiver) -> Self::Result
    where
        Self: PyIterIterProtocol<'p>,
    {
        unimplemented!()
    }

    fn __next__(slf: Self::Receiver) -> Self::Result
    where
        Self: PyIterNextProtocol<'p>,
    {
        unimplemented!()
    }
}

... then for these traits it's not possible to call self.__iter__(). It has to be called as e.g. PyIterProtocol::__iter__(self).

As a consequence, supporting &self syntax in #[pyproto] for these traits is confusing in my opinion. If I see this code:

#[pyproto]
impl PyIterProtocol for MyClass {
    fn __iter__(slf: PyRef<Self>) -> PyRef<Self>
    {
        slf
    }

    fn __next__(&mut self) -> Option<i32>
    {
        unimplemented!()
    }
}

Then I would expect I should be able to call self.__next__() on MyClass instances. But actually I can't, because the trait definition above doesn't have an &mut self receiver. I have to call it as MyClass::__next__(self).
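This call-syntax limitation is ordinary Rust behaviour and can be reproduced without PyO3. In the minimal sketch below, `Guard`, `IterLike`, `Counter` and `demo` are hypothetical stand-ins (for `PyRefMut`, a protocol trait, a `#[pyclass]` type and a caller, respectively):

```rust
/// Hypothetical stand-in for a guard type such as `PyRefMut<Self>`.
struct Guard<'a, T>(&'a mut T);

/// A trait whose method takes the guard as an ordinary first argument
/// rather than as a `self` receiver.
trait IterLike: Sized {
    fn next_item(slf: Guard<'_, Self>) -> Option<i32>;
}

struct Counter {
    n: i32,
}

impl IterLike for Counter {
    fn next_item(slf: Guard<'_, Self>) -> Option<i32> {
        // Unwrap the guard to reach the exclusive reference inside it.
        let Guard(inner) = slf;
        inner.n += 1;
        Some(inner.n)
    }
}

fn demo() -> Option<i32> {
    let mut counter = Counter { n: 0 };
    // `counter.next_item()` does not compile: `next_item` is an associated
    // function, not a method, because its first parameter is not `self`.
    Counter::next_item(Guard(&mut counter))
}
```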

This problem might eventually be fixed with the arbitrary_self_types feature in a far-future Rust version. In the interest of providing a solution now, I think we have two options:

1. Just support TryFromPyCell receivers for all protocols.
   That's the solution discussed elsewhere in this thread. It's breaking for all existing #[pyproto] implementations which use &self or &mut self receivers. At least all protocols will be consistent with each other after this.
2. Merge #[pyproto] and #[pymethods].
   This idea is a bit more radical, but I actually think it could be quite nice. Basically we let users write slot methods in #[pymethods] and the proc macro can detect these and handle them specially.

Migrating for users should not be too hard because all they will have to do is merge their #[pyproto] blocks into their #[pymethods].

Doing this would remove a lot of the current things we do to make #[pyproto] work (e.g. inserting lifetimes, lots of extra protocol traits). So the pyo3 codebase would probably be smaller and easier to maintain. From a user perspective, pyo3 would have a smaller API.

There are some nice advantages to using traits though, like documentation and grouping (e.g. force both buffer methods to be implemented together). We can always make sure the guide has good docs.

... I'm tempted to hack around in the near future and see what this feels like in practice.

birkenfeld · 2021-04-13T05:08:44Z

Without knowing too much about the implementation specifics, idea #2 sounds good to me. It will feel natural to Python users, since it mirrors how special methods are defined there.

On the Rust side though, the "trait-ness" of these interfaces is lost. Do people use the traits on the Rust side to handle different objects with common traits - even if the traits only contain Python related functionality? (If desperately needed this could still be mitigated by still having the traits and implementing them automatically from the pymethods macro, with its methods just forwarding to the related pymethod.)
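The forwarding idea in the parenthesis can be sketched in plain Rust. `StrProtocol`, `Point` and `describe` are hypothetical names, and the trait impl is written by hand here where a macro would have to generate it:

```rust
/// Hypothetical protocol trait kept around for Rust-side generic code.
trait StrProtocol {
    fn __str__(&self) -> String;
}

struct Point {
    x: i32,
    y: i32,
}

impl Point {
    /// The method as a user would write it in a `#[pymethods]`-style block.
    fn __str__(&self) -> String {
        format!("({}, {})", self.x, self.y)
    }
}

/// What a macro could emit: a trait impl that only forwards.
impl StrProtocol for Point {
    fn __str__(&self) -> String {
        // `Point::__str__` resolves to the inherent method first,
        // so this call does not recurse into the trait method.
        Point::__str__(self)
    }
}

/// Generic Rust code can still be written against the trait.
fn describe<T: StrProtocol>(value: &T) -> String {
    value.__str__()
}
```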

davidhewitt · 2021-04-13T05:56:00Z

> Do people use the traits on the Rust side to handle different objects with common traits - even if the traits only contain Python related functionality?

It's possible, but I haven't seen this in practice. Note that a Rust extension module will only have these protocol traits defined for its own pyclasses, and not for any builtin or third-party Python object. So most consumption of the Python protocols will still need to go via Python's dynamic type system / attribute lookup at runtime.

Having drafted the implementation for option 1 in #1561, I also now strongly prefer option 2. I agree about just having #[pymethods] being natural for Python users, and the migration for existing PyO3 projects will be easier. Option 2 will require cut-and-pasting all #[pyproto] methods into #[pymethods]. Option 1 actually forces quite a significant re-write of all #[pyproto] methods.

Either way, I think release 0.14 is due soon as we've got a lot of changes piling up, so I'm regrettably going to shift this to the 0.15 milestone. In my eyes this is one of the top things to sort for 0.15, along with #1056.

kngwyu · 2021-04-13T08:49:59Z

I think idea 2 could be better for those who already know Python well, but for Rust users who don't know Python very well, the trait is much better. I actually didn't know Python very well when I had to write some Rust extensions for Python. I'm not sure it's a major case, though.
How about converting user-defined trait methods to take slf: PyRef<T>? I agree that this is certainly ugly, but in the future, we can use arbitrary self.

Also, even if we remove protocol traits, I think we have to leave some traits (e.g., buffer). So, if we are going to remove protocol traits, we need to clarify which traits should remain.

davidhewitt · 2021-04-14T06:51:05Z

> How about converting user-defined trait methods to take slf: PyRef<T>? I agree that this is certainly ugly, but in the future, we can use arbitrary self.

This is exactly what I've drafted in #1561. As well as being a bit ugly it's also a big migration for all existing code to have to change.

> I think idea 2 could be better for those who already know Python well, but for Rust users who don't know Python very well, the trait is much better. I actually didn't know Python very well when I had to write some Rust extensions for Python. I'm not sure it's a major case, though.

What I think I'm hearing from this is that the traits provide useful grouping and documentation. I think with good documentation in the guide we can have most of the same benefit with #[pymethods].

kngwyu · 2021-04-14T15:49:18Z

I meant rewriting trait methods by proc macro.

davidhewitt · 2021-04-15T07:23:09Z

Ahh I see!

Yes, it's true #[pyproto] could rewrite all &self to slf: PyRef<Self>. It'd also have to rewrite all uses of self to slf inside of the function body. The result would mean that there would be less hard-breaking migration for users implementing these traits.

The downsides of this approach:

- Users calling these traits would still have to change call syntax from self.__str__() to Self::__str__(slf) (with slf being a PyRef).
- This would add even more magic to the #[pyproto] macro (which I think already does too much). I don't think we'd want to document this rewriting of self as a supported feature. Instead we should say it is a temporary bridge which would be removed after a couple of PyO3 versions.
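To make the call-syntax change concrete, here is a minimal sketch in plain Rust, with `std::cell::Ref` standing in for `PyRef` and a hypothetical `Greeter` type. The first method is what a user writes with a `&self` receiver; the second is roughly what such a rewrite would leave behind:

```rust
use std::cell::{Ref, RefCell};

struct Greeter {
    name: String,
}

impl Greeter {
    /// As written by the user, with a `&self` receiver.
    fn greet(&self) -> String {
        format!("hello, {}", self.name)
    }

    /// After the hypothetical rewrite: `&self` becomes `slf: Ref<Self>`
    /// and every `self` in the body becomes `slf`.
    fn greet_rewritten(slf: Ref<'_, Self>) -> String {
        format!("hello, {}", slf.name)
    }
}

fn demo() -> (String, String) {
    let cell = RefCell::new(Greeter {
        name: String::from("pyo3"),
    });
    // Method syntax works for the original form...
    let before = cell.borrow().greet();
    // ...but the rewritten form has to be called as an associated function.
    let after = Greeter::greet_rewritten(cell.borrow());
    (before, after)
}
```

The bodies behave identically; only the way callers invoke them changes, which is the first downside listed above.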

If people think that option 1 is the better final design than option 2, then I could support migrating to option 1 via this. However I still think that option 2 is probably nicer overall. I need to try and draft an implementation of option 2 and see what it looks like in reality!

davidhewitt · 2021-04-21T19:43:09Z

Another interesting point for the discussion: a user this week tried to implement __str__ and __repr__ in #[pymethods] and then was confused why they did not work correctly. I had to point them to #[pyproto] docs on Gitter: https://kushaldas.in/posts/adding-dunder-methods-to-a-python-class-written-in-rust.html

This further makes me think just having the one macro would help avoid confusion.

kngwyu · 2021-04-22T02:18:20Z

OK, now I'm also inclined to use pymethod for everything. Raising compile errors and preparing documentation would be big stuff, though.

davidhewitt · 2021-04-22T05:53:54Z

Yes, I'm willing to put in the effort to build all of the necessary implementation for 0.15!

mejrs · 2021-04-27T18:08:56Z

> a user this week tried to implement __str__ and __repr__ in #[pymethods] and then was confused why they did not work

I've had this problem too.

> preparing documentation

I'm on board to do this. I know my way around the magic methods (in Python) pretty well.

davidhewitt · 2021-04-27T18:50:12Z

Help with the documentation when we're ready would be really amazing. I'll probably start experimenting with this in about a month's time, once the 0.14 release is done and any relevant bugfixes are out of the way.

I imagine that what this will end up being like is that the #[pymethods] which have constraints in the types and arguments will be the ones that need documentation.

For example, we might need to document constraints like:

`__setattr__` 
  - Takes a receiver (`&self`, `&mut self`, `PyRef<Self>`, `Py<Self>` etc.)
  - Takes two arguments
  - Returns `()` or `PyResult<()>`

`__str__`
  - Takes a receiver (`&self`, `&mut self`, `PyRef<Self>`, `Py<Self>` etc.)
  - Takes no arguments
  - _Should_ return a string type (`&str`, `String`, `&PyString`, `&PyAny`, `PyObject` etc.), optionally wrapped in `PyResult`.

I imagine that it might need a couple iterations to make this all easy to read.

davidhewitt · 2021-09-24T21:34:42Z

The initial implementation for #[pymethods] is merged in #1864 and the remaining work is tracked in #1884. The plan is not to change #[pyproto] any more, and to eventually deprecate. Anyone who has a need for this functionality is encouraged to try the experimental #[pymethods] implementation in 0.15 once that releases (or on main already).

davidhewitt added the enhancement label Sep 23, 2020

davidhewitt self-assigned this Sep 23, 2020

davidhewitt mentioned this issue Sep 23, 2020

Trouble using PyContextProtocol #1205

Closed

davidhewitt added this to the 0.13 milestone Oct 1, 2020

kngwyu mentioned this issue Nov 5, 2020

Refactor pyproto internals #1117

Closed

n1t0 mentioned this issue Nov 25, 2020

Python - Mutable components huggingface/tokenizers#530

Merged

davidhewitt mentioned this issue Dec 19, 2020

pyproto: remove inventory from implementation #1328

Merged

davidhewitt modified the milestones: 0.13, 0.14 Dec 22, 2020

davidhewitt mentioned this issue Jan 15, 2021

pyproto: small refactoring to backend macro #1386

Merged

n1t0 mentioned this issue Mar 18, 2021

Improve PySequence objects huggingface/tokenizers#659

Open

This was referenced Apr 12, 2021

pyproto: deprecate py_methods #1560

Merged

pyproto: no self receivers #1561

Closed

davidhewitt modified the milestones: 0.14, 0.15 Apr 13, 2021

This was referenced Jun 29, 2021

Ensure all APIs are documented (#![deny(missing_docs)]) #306

Closed

Guide docs for all protocol methods #1032

Closed

davidhewitt mentioned this issue Sep 18, 2021

pymethods: add support for protocol methods #1864

Merged

davidhewitt closed this as completed Sep 24, 2021
