-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Generic dataclasses #259
base: master
Are you sure you want to change the base?
Conversation
Exciting PR, I've been waiting a long time for generic support in dataclasses :) Do you have any update please? |
* Raise descriptive error for unbound fields.
4203601
to
8443336
Compare
…notated generics, partials, and callables
I came up with 1 additional usecase, with 3 solutions that are somewhat related to Annotated and Generics. DelimitedListStr = NewType('DelimitedListStr', list[str], field=lambda *args, **kwargs: DelimitedList(mf.String(), *args, **kwargs)) I came up with these 3 solutions for this. One uses generics, the other doesn't.
DelimitedListStr = Annotated[List[str], DelimitedList[mf.String]]
DelimitedListStr = Annotated[List[str], functools.partial(DelimitedList, mf.String)]
DelimitedListStr = Annotated[List[str], lambda *args, **kwargs: DelimitedList(mf.String, *args, **kwargs)] Support for each has been added, including unittests. |
As API choices, these make me nervous. Recognizing generic callables as (potential) schema field annotations, and then testing them with a zero-arg call, as is done here marshmallow_dataclass/marshmallow_dataclass/__init__.py Lines 1090 to 1097 in 4531c35
seems both confusing and dangerous.
Can the same functionality be obtained by creating a custom Marshmallow Field subclass? E.g. something like: class DelimitedStrListField(DelimitedList):
def __init__(self, *args, **kwargs):
super().__init__(mf.String, *args, **kwargs)
DelimitedListStr = Annotated[List[str], DelimitedStrListField] |
Right.. We don't yet know if the annotation is supposed to be a marshmallow Field compatible callable. So we could be calling anything including The class approach certainly works, I was trying to use a one liner. But the DelimitedListStr = Annotated[List[str], MaFieldCallable(lambda *args, **kwargs: DelimitedList(mf.String, *args, **kwargs))] but I don't see the value in that. over using either the |
This approach was unsafe. See PR lovasoa#259 for more details
I hadn't noticed that you explicitly check for a partial applied to a Field (here). That is safe. I still think it adds unnecessary clutter to the API. Also, allowing a
A value of that would be that it is more explicitly readable: it makes the intent of the annotation more clear to the unindoctrinated reader. I'm in favor of stopping at recognizing annotations that are either subclasses or instances of I don't really see the extra three lines required to define a custom Field subclass as too verbose. (As well, doing so provides a good place to add a docstring, if that would help clarify the intent.) |
I'll switch to using the generic approach myself. So I'm fine with removing the
Agreed. I did investigate further but couldn't find a satisfactor way to get typehints of lambdas, even when using the Callable typehint. While I agree that 3 lines isn't bad, given that I have multiple one lines currently, each would take up 3 + 2 blank lines (style guide) So instead of 5 lines, it'd be 25 lines. Which is harder to read at a glance. But again, generics fixes my usecase and I can't find a satisfactory answer to the valid concerns being brought up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that this review is complete, but I'm about to leave on a road-trip, so wanted to get this out.
marshmallow_dataclass/__init__.py
Outdated
obj, | ||
schema_ctx: _SchemaContext, | ||
) -> type: | ||
"""typing.get_type_hints doesn't work with generic aliasses. But this 'hack' works.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this function should be renamed _resolve_forward_type_refs
(or similar).
Get_type_hints
returns a dict containing the type hints for all members of a class (while resolving forward type references found along the way).
The purpose of this function, by way of contrast, is to resolve forward type references in a single type hint. (If there are no forward type references in obj
, this function (I think) returns obj
unchanged.)
(Spelling nit: "aliases")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the forward refs have already been resolved. This really exists purely to get the typehint for generics.
maybe _get_type_hint_of_generic_object
?
import typing
T = typing.TypeVar('T')
class A(Generic[T]):
a: T
class B(A[int]):
pass
print(typing.get_type_hints(B))
print(typing.get_type_hints(A[int]))
>=========================
{'a': ~T}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[51], [line 12](vscode-notebook-cell:?execution_count=51&line=12)
[9](vscode-notebook-cell:?execution_count=51&line=9) pass
[11](vscode-notebook-cell:?execution_count=51&line=11) print(typing.get_type_hints(B))
---> [12](vscode-notebook-cell:?execution_count=51&line=12) print(typing.get_type_hints(A[int]))
File ~\.pyenv\pyenv-win\versions\3.11.3\Lib\typing.py:2347, in get_type_hints(obj, globalns, localns, include_extras)
~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2345) return {}
~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2346) else:
-> ~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2347) raise TypeError('{!r} is not a module, class, method, '
~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2348) 'or function.'.format(obj))
~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2349) hints = dict(hints)
~/.pyenv/pyenv-win/versions/3.11.3/Lib/typing.py:2350) for name, value in hints.items():
TypeError: __main__.A[int] is not a module, class, method, or function.
def _get_generic_type_hints(obj) -> type:
"""typing.get_type_hints doesn't work with generic aliases. But this 'hack' works."""
class X:
x: obj # type: ignore[name-defined]
return typing.get_type_hints(X)['x']
print(_get_generic_type_hints(B))
print(_get_generic_type_hints(A[int]))
>=========================
<class '__main__.B'>
__main__.A[int]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your last examples, _get_generic_type_hints
is just the identity, right?
>>> _get_generic_type_hints(B) is B
True
>>> _get_generic_type_hints(A[int]) is A[int]
True
But, the value of _get_generic_type_hints
comes in resolving forward references:
>>> _get_generic_type_hints(A["int"])
__main__.A[int]
>>> _get_generic_type_hints(A["int"]) is A[int]
False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe what is really wanted is (untested)
def _get_generic_type_hints(generic_alias):
class X(generic_alias): pass
return typing.get_type_hints(X)
Or, in one line:
return typing.get_type_hints(types.new_class("_", bases=(generic_alias,)))
That version of _get_generic_type_hints
then matches the signature of typing.get_type_hints
. As things stand, it does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, good catch! You're absolutely right.
- This just worked 🤦♂️ I clearly spend too long trying to make it work to see the obvious solution.
(
type_hints[field.name]
if not is_generic_type(clazz)
else field.type
)
- But didn't work for
A["int"]
and no tests caught that yet. So I'll add a tests and rename the function as per your recommendation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, if that works, maybe just roll this into _get_type_hints
so that there's a single _get_type_hints
function that can be used for any dataclass, be it generic or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _get_generic_type_hints(generic_alias): class X(generic_alias): pass return typing.get_type_hints(X)
Does not work.
print(_get_generic_type_hints(A[int]))
>=====================
{'a': ~T}
Instead of {'a': __main__.A[int]}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does not work.
That's too bad.
I still think that refactoring so that _get_type_hints
will work with generic aliases of dataclasses as well as plain dataclasses is probably worth it. It will move all the is_generic_type
special-casing out of
marshmallow_dataclass/marshmallow_dataclass/__init__.py
Lines 583 to 603 in 7ac088d
type_hints = {} | |
if not is_generic_type(clazz): | |
type_hints = _get_type_hints(clazz, schema_ctx) | |
attributes.update( | |
( | |
field.name, | |
_field_for_schema( | |
( | |
type_hints[field.name] | |
if not is_generic_type(clazz) | |
else _get_generic_type_hints(field.type, schema_ctx) | |
), | |
_get_field_default(field), | |
field.metadata, | |
base_schema, | |
), | |
) | |
for field in fields | |
if field.init or include_non_init | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree but I've already sunk too many hours into this particular problem. If it is even possible to do, it'll have to be someone else as I don't have the knowledge to take this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dairiki I found a way around it by resolving the forward refs when we get the dataclass fields. We now no longer call get_type_hints
at all anymore as it's relevant code is internalized.
We were already looping over the mro and fields just like get_type_hints
.
…ead of duplicating logic
I don't want to loop over the fields multiple times so internalized the relevant code from typing.get_type_hints into the generic_resolver
Bump. |
+1 to this PR, love this @mvanderlee -- we were just recently dealing with this issue and ended up creating our own bespoke field that handled this, but if this gets merged in we'd love to bump our version and use this. |
@mvanderlee I'm really sorry, I don't have a lot of bandwidth. @dairiki , maybe you can review this? |
I apologize as well. For the past month I've been mostly out of town and swamped with other concerns and activities. |
Gentle bump |
@dairiki , feel free to merge if you think it's okay |
@lovasoa @mvanderlee I apologize profusely for my lack of attention to this. (In my defense, I have been busy: travel, family and pet medical issues, and a lost month due to COVID infection.) I've attempted to start a review on this a couple of times. Each time I get bogged down by what seems to be mostly a re-implementation of I haven't found a great way to phrase this, so I will just say it: I could be wrong about that. (I haven't yet fully digested the new code in Or perhaps, the rest of you are not as bothered as I am with attempting to keep a level of abstraction (that is not under our maintenance) between our code and the implementation details of Feel free to discuss... But that is what I've been getting stuck on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few other minor comments I've had in the queue for awhile.
@@ -268,6 +289,8 @@ def add_schema(_cls=None, base_schema=None, cls_frame=None, stacklevel=1): | |||
""" | |||
|
|||
def decorator(clazz: Type[_U], stacklevel: int = stacklevel) -> Type[_U]: | |||
_check_decorated_type(clazz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_check_decorate_type
requires only that isinstance(clazz, type)
.
Do we want to require that clazz is a dataclass (isinstance(clazz, type) and dataclasses.is_dataclass(clazz)
) here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because we already have a big warning when a class is not a dataclass in _internal_class_schema
So non-dataclasses are allowed, but not supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just run a simple test.
You are correct in that decorating non-data class classes with add_schema
works — at least in simple cases. (That doesn't necessarily mean that we should allow it.)
It appears, however, that, as things stand in this PR, no big warning is emitted in that case. Further investigation reveals that get_resolved_dataclass_fields
"just works" (with no warnings emitted) even if its argument is not a dataclass. (I have some suspicion that perhaps fields from the classes __mro__
are not properly handled in that case, but I haven't looked close enough to say for sure.)
In any case, I think that either:
- We should disallow using the
add_schema
decorator on non-dataclasses. (Why allow it if it's unsupported/untested?) - Or, we need a test to ensure the warning is, in fact, emitted when it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had to do some digging.
- In today's version, 8.7.1, I can call
class_schema(NonDataclass)
just fine. It only shows the warning if one of it's fields is not a dataclass. i.e.: Nested non dataclasses. - This goes back to when the warning was originally added: e31faa8
- The behaviour still works the same with this PR.
I don't disagree with removing support for non-dataclasses, but don't see why that should be part of this PR.
As an example, how well does this PR cope with PEP 696 (coming October 7 in Python 3.13)? In any case, since the 3.13 release is imminent, we should be testing with that, too. |
The generic_resolver itself does a lot more than The merging with The above 'merge' does break in 3.13 due to a warning being raised. The added type_params from 3.12 that will be required as an arg in Adding support for TypeVars with defaults would probably be a fair bit of work mainly because of the recursion caused by allowing TypeVars as defaults. Or if the there are multiple defaults that should override each other. |
@dairiki I have a second branch with the If that matches your expectations I can update this PR with that. |
Fixes: #230, possbily #267, #269
Building on the fantastic work of @onursatici and @dairiki in PRs #172 and #232
Currently broken on py3.6, but once #257 is merged I'll rebase and thus remove py3.6 and py3.7 support.
I've tested this with tox on python versions 38, 39, 310, 311, 312