Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax input requirements in deserialize_str #40

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ShaneMurphy2
Copy link

@ShaneMurphy2 ShaneMurphy2 commented Mar 22, 2023

Instead of attempting to downcast the input to a PyString, rely on the fact that for str objects, calling __str__ is a no-op, while for other objects it may allow us to support neat use cases.

An example of one such use case is turning Python's uuid.UUID into Rust's uuid::Uuid. If you call Uuid::depythonize you end up in <Uuid as Deserialize>::deserialize (if you enabled its serde feature). Since Depythonizer doesn't override Deserialize::is_human_readable, the implementation tries to deserialize with deserialize_str. At that point, the Deserializer has a value of type uuid.UUID (actually PyAny, but it really is uuid.UUID), and the visitor expects a value of type &str. We can make that happen by using str.

Instead of attempting to downcast the input to a PyString, rely on the
fact that for str objects, calling __str__ is a no-op, while for other
objects it may allow us to support neat use cases.

An example of one such use case is turning Python's uuid.UUID into
Rust's uuid::Uuid. If you call Uuid::depythonize you end up in <Uuid as
Deserialize>::deserialize (if you enabled its serde feature). Since
Depythonizer doesn't override Deserialize::is_human_readable, the
implementation tries to deserialize with deserialize_str. At that
point, the Deserializer has a value of type uuid.UUID (actually PyAny,
but it really is uuid.UUID), and the visitor expects a value of type
&str. We can make that happen by using __str__.
@ShaneMurphy2
Copy link
Author

Opened this PR but I'm not sure this is the right way to do this, or if it's even broadly useful. Wanted to get some feedback.

Generally, I think this approach is sound, as it mirrors the way deserialize_bool relies on is_true as opposed to a strict cast.

An alternate idea I had would be to maintain a list of "known types" that we want to stringify, but that could get expensive.

@ShaneMurphy2
Copy link
Author

ShaneMurphy2 commented Mar 22, 2023

I just realized there's no way to implement the serialization half of this, since there is nothing in the serde data model that tells us the original Rust type was a Uuid. This might still be a good change, but I'll wait for feedback.

@davidhewitt
Copy link
Owner

Hello, thanks for the PR!

So I think this use-case is reasonable but I'm a bit nervous about introducing this coercion globally. Instead would you be prepared to amend this PR to add a depythonize_custom API which takes a setting to enable this behaviour? That way we can have the default be unchanged.

The is_true comparison you draw above is incorrect IIRC - I believe PyAny::is_true() only checks for the True singleton rather than truthy objects.

@ShaneMurphy2
Copy link
Author

Hello, thanks for the PR!

:)

So I think this use-case is reasonable but I'm a bit nervous about introducing this coercion globally. Instead would you be prepared to amend this PR to add a depythonize_custom API which takes a setting to enable this behaviour? That way we can have the default be unchanged.

Totally, I was considering feature flags/custom for this.

The is_true comparison you draw above is incorrect IIRC - I believe PyAny::is_true() only checks for the True singleton rather than truthy objects.

The docs say "This is equivalent to the Python expression bool(self)".

@davidhewitt
Copy link
Owner

The docs say "This is equivalent to the Python expression bool(self)".

Right you are; I had peeked at the implementation and mixed up PyObject_IsTrue (which is equivalent to the above), versus Py_IsTrue (which would be equivalent to self is True).

@ShaneMurphy2
Copy link
Author

I did some further pondering about the usefulness of this change with regards to UUIDs. I got wrapped around the axle with serde, pythonize, and {To/From}PyObject. Preserving non serde data model types through serialization isn't really in scope for pythonize I think. Really what I was looking for was a custom implementation of FromPyObject.

That being said, I think that being more permissive with what is considered a str coming from Python is sensible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants