-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SequenceProtocol and MappingProtocol descriptions to the guide #1546
Conversation
Supposedly resolved by PyO3#1107 Fix a typo in the subsection header.
Updates the class customization guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for giving some much needed love to our documentation!
I have a couple of suggestions and questions; comments below.
Also, a more general question I have wondered about: the PyResult
bit of all the return values is optional - e.g. for a proto which returns PyResult<T>
, just T
is also an acceptable return value. Do you think it's better if:
- we include
PyResult<T>
on all these return values, with a note in an# Error Handling
section saying PyResult is optional - we just have
T
on all the return value, with a note in an# Error Handling
section saying all return types can be wrapped in PyResult
I think that just having T
on the return value is easier to read, but it's maybe less obvious for new users to then figure out how to return errors unless they read the whole document and find the Error Handling
section!
*Note:* Negative integer indexes are handled as follows: if `__len__()` is defined, | ||
it is called and the sequence length is used to compute a positive index, | ||
which is passed to `__getitem__()`. | ||
If `__len__()` is not defined, the index is passed as is to the function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This differs from https://docs.python.org/3/reference/datamodel.html#object.__getitem__, which says that negative integers are are passed directly to __getitem__
which then can interpret the negative numbers as it likes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume PyO3 plugs the __getitem__
method implementation into the PySequenceMethods.sq_item
slot. The C function in that slot has a slightly different interface from the Python-native __getitem__()
method, as described in the C-API docs linked above. PyObject_GetItem()
and PySequence_GetItem()
handle the negative indexes themselves when __len__()
is also implemented.
My implementations of __getitem__()
always start with assert!(idx >= 0);
because I also implement __len__()
, and they indeed work with the negative indexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, this is a really good point. Perhaps in the documentation here we should link all these methods to their relevant slot information?
All the relationships between these methods and their slots is defined in https://github.com/PyO3/pyo3/blob/main/pyo3-macros-backend/src/defs.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps in the documentation here we should link all these methods to their relevant slot information?
I like this idea, but the mapping between methods and slots looks not so straightforward. There are methods that do not have a C function slot like __bytes__()
, __format__()
and __reversed__()
. And then there are methods that share the same slot like __setitem__()
and __delitem__()
. I think we could add a symbolic tag like PySequenceMethods.sq_item
to each trait method.
Another interesting issue here is the implicit slot method probing order: for example the evaluation of obj[i]
Python expression probes PyMappingMethods.mp_subscript
before PySequenceMethods.sq_item
. This is relevant when someone implements both of these traits for some reason.
* `fn __reversed__(&self) -> PyResult<impl ToPyObject>` | ||
|
||
Called (if present) by the `reversed()` built-in to implement reverse iteration. | ||
It should return a new iterator object that iterates over all the objects in | ||
the container in reverse order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm we should probably move __reversed__
to the PyIterProtocol
trait? (I can do it in a separate PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, __reversed__()
is a standalone method, not a PyMappingMethods
slot, so there is no particular reason it should be a part of PyMappingProtocol
.
However, implementing __reversed__()
for sequences is not very useful, because the default implementation seems sufficient for most cases.
I personally do not like the idea of adding anything to PyIterProtocol
though. It's already semantically overloaded as it encompasses both IntoIterator
and Iterator
Rust traits. But then the user is expected to implement a different subset of methods for each of the intended uses (as an iterator or as an iterable).
Currently, we also have to write mandatory boilerplate code like
fn __iter__(slf: PyRef<Self>) -> PyRef<Self> {
slf
}
for every custom iterator class.
Most Rust programmers would expect a blanket implementation like
impl<I> IntoIterator for I where
I: Iterator,
from the stdlib to exist instead.
I'm sorry about the long rant, but PyIterProtocol
was the biggest road bump for me when working with PyO3.
I really wish it was two distinct traits, like for example PyIterProtocol
and PyIntoIterProtocol
, so that anyone who knows how Rust iterators work could easily implement Python iterators without scratching their head about "why do I need to return PyRef<Self>
instead Self
here?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I completely agree that the current situation is confusing in many ways. I wonder if alternatively here I should remove __reversed__
from #[pyproto]
completely. In my opinion #[pymethods]
are very easy to learn, and implementing __reversed__
in #[pymethods]
will already work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea very much. There are other non-slotted methods that are included in the protocol traits, like __bytes__()
. And then there are C function slots that have no corresponding PyObjectProtocol
method like tp_call
aka __call__()
.
I'm all for implementing true non-slotted methods as #[pymethods]
. This way we can avoid surprises like #1465 in the future.
- Style Co-authored-by: David Hewitt <1939362+davidhewitt@users.noreply.github.com>
Hmm, either way is fine with me. I mostly followed the convention |
This is very true. I think in that case it's better to leave the documentation with
I agree. There's a lot of magic going on in these trait implementations, which I really wish we could simplify. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the discussion here, I'm happy that this is good to be merged to the guide as-is. Thanks very much for this; the #[pyproto]
documentation is desperately in need of love!
References: