-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add signature for dataclasses.replace #14849
Changes from 15 commits
6dbaf9c
4dcbe44
7ed3741
89257b5
32b1d47
1f08816
c456a5f
8118d29
789cb2b
9f0974c
a37e406
367c0e9
0e84c4f
9cfc081
7b907cf
985db60
15dbb7b
3227fde
2dbf249
b32881c
26056a4
40315b7
c005895
d71bc21
d914b94
2cb6dee
5735726
d38897e
04f0ee3
f402b86
306c3f3
9b491f5
283fe3d
29780e9
9c43ab6
94024ba
70240c4
99bf973
8fe75a7
bea50e8
cd35951
d71a7c0
957744e
6abf6ff
4c4fc94
be4a290
ee0ae21
3f656f8
65d6e89
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -7,6 +7,7 @@ | |||
|
||||
from mypy import errorcodes, message_registry | ||||
from mypy.expandtype import expand_type | ||||
from mypy.messages import format_type_bare | ||||
from mypy.nodes import ( | ||||
ARG_NAMED, | ||||
ARG_NAMED_OPT, | ||||
|
@@ -23,6 +24,7 @@ | |||
Context, | ||||
DataclassTransformSpec, | ||||
Expression, | ||||
FuncDef, | ||||
IfStmt, | ||||
JsonDict, | ||||
NameExpr, | ||||
|
@@ -37,7 +39,7 @@ | |||
TypeVarExpr, | ||||
Var, | ||||
) | ||||
from mypy.plugin import ClassDefContext, SemanticAnalyzerPluginInterface | ||||
from mypy.plugin import ClassDefContext, FunctionSigContext, SemanticAnalyzerPluginInterface | ||||
from mypy.plugins.common import ( | ||||
_get_decorator_bool_argument, | ||||
add_attribute_to_class, | ||||
|
@@ -74,6 +76,7 @@ | |||
frozen_default=False, | ||||
field_specifiers=("dataclasses.Field", "dataclasses.field"), | ||||
) | ||||
_INTERNAL_REPLACE_SYM_NAME = "__mypy_replace" | ||||
|
||||
|
||||
class DataclassAttribute: | ||||
|
@@ -326,6 +329,7 @@ def transform(self) -> bool: | |||
add_attribute_to_class(self._api, self._cls, "__match_args__", match_args_type) | ||||
|
||||
self._add_dataclass_fields_magic_attribute() | ||||
self._add_internal_replace_method(attributes) | ||||
|
||||
info.metadata["dataclass"] = { | ||||
"attributes": [attr.serialize() for attr in attributes], | ||||
|
@@ -334,6 +338,35 @@ def transform(self) -> bool: | |||
|
||||
return True | ||||
|
||||
def _add_internal_replace_method(self, attributes: list[DataclassAttribute]) -> None: | ||||
""" | ||||
Stashes the signature of 'dataclasses.replace(...)' for this specific dataclass | ||||
to be used later whenever 'dataclasses.replace' is called for this dataclass. | ||||
""" | ||||
arg_types: list[Type] = [Instance(self._cls.info, [])] | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Btw this (and also the return type below, etc) look not very careful about generic dataclasses. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're right. I'd be better off not adding the first arg and ret_type in this phase, and instead add them in the hook: |
||||
arg_kinds = [ARG_POS] | ||||
arg_names: list[str | None] = [None] | ||||
for attr in attributes: | ||||
assert attr.type is not None | ||||
arg_types.append(attr.type) | ||||
arg_kinds.append( | ||||
ARG_NAMED if attr.is_init_var and not attr.has_default else ARG_NAMED_OPT | ||||
) | ||||
arg_names.append(attr.name) | ||||
|
||||
signature = CallableType( | ||||
arg_types=arg_types, | ||||
arg_kinds=arg_kinds, | ||||
arg_names=arg_names, | ||||
ret_type=Instance(self._cls.info, []), | ||||
fallback=self._api.named_type("builtins.function"), | ||||
name=f"replace of {self._cls.info.name}", | ||||
) | ||||
|
||||
self._cls.info.names[_INTERNAL_REPLACE_SYM_NAME] = SymbolTableNode( | ||||
kind=MDEF, node=FuncDef(typ=signature), plugin_generated=True | ||||
) | ||||
|
||||
def add_slots( | ||||
self, info: TypeInfo, attributes: list[DataclassAttribute], *, correct_version: bool | ||||
) -> None: | ||||
|
@@ -787,3 +820,43 @@ def _is_dataclasses_decorator(node: Node) -> bool: | |||
if isinstance(node, RefExpr): | ||||
return node.fullname in dataclass_makers | ||||
return False | ||||
|
||||
|
||||
def replace_function_sig_callback(ctx: FunctionSigContext) -> CallableType: | ||||
""" | ||||
Returns a signature for the 'dataclasses.replace' function that's dependent on the type | ||||
of the first positional argument. | ||||
""" | ||||
if len(ctx.args) != 2: | ||||
# Ideally the name and context should be callee's, but we don't have it in FunctionSigContext. | ||||
ctx.api.fail(f'"{ctx.default_signature.name}" has unexpected type annotation', ctx.context) | ||||
return ctx.default_signature | ||||
|
||||
if len(ctx.args[0]) != 1: | ||||
return ctx.default_signature # leave it to the type checker to complain | ||||
|
||||
obj_arg = ctx.args[0][0] | ||||
|
||||
# <hack> | ||||
from mypy.checker import TypeChecker | ||||
|
||||
assert isinstance(ctx.api, TypeChecker) | ||||
obj_type = ctx.api.expr_checker.accept(obj_arg) | ||||
# </hack> | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW this kind of hack is quite common (I myself used it several times, for old sqlalchemy plugin, and internally). I would propose to instead add another plugin hook, e.g. cc @JukkaL what do you think? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This proposal still stands. Do we have an issue for this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||
|
||||
obj_type = get_proper_type(obj_type) | ||||
if not isinstance(obj_type, Instance): | ||||
return ctx.default_signature | ||||
|
||||
replace_func = obj_type.type.get_method(_INTERNAL_REPLACE_SYM_NAME) | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like this hack at all :(
So, what's the problem? Looks like we have it in the metadata anyway: info.metadata["dataclass"] = {
"attributes": [attr.serialize() for attr in attributes], There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Metadata should contain serialized JSON, not ad hoc type objects that I don't mean to serialize. That's why I'm saying I'd like if there'd be a way to stash runtime arbitrary data belonging to a plugin on a TypeInfo. I can't easily serialize it again from the metadata at the type checker stage, and anyway the analysis stage works much better to construct the signature, but then I have to put it somewhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can deserialize attributes with mypy/mypy/plugins/dataclasses.py Line 149 in 9944d5f
replace_function_sig_callback and do not store anything extra.
Or am I missing something? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can look it up again, but I believe I couldn't do it with the API provided by that hook, but only with the semantic analysis API. Also
BTW, I considered implementing it differently, by replacing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, sounds reasonable. But, other plugins do not do this. I don't think that this use-case should be the first one to do something like this.
This can quickly get out of control, if you have a lot of arguments including optional ones. Plus, I agree with your point that overloads can generate unreadable error messages.
I cannot see anything specific about it. So, the algorithm would be: def replace_sig_hook(ctx):
attributes = deserialize_metadata()
signature = compute_from_dataclass_attributes(attributes)
return signature Please, in case you have any problems - let me know, I would be happy to help! In case this would be very slow (and users would complain) we can find ways to optimize it in the future 👍 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think this code from pynamodb-mypy illustrates the problem: As you see, I need to use "private API" to deserialize the type at the type-checker phase. I remember wishing there was a "plugin_ephemeral_storage" when I was working on that plugin, since I'm literally deserializing something that I just serialized in a previous phase.
My main concern is indeed not the slowness (cannot imagine
|
||||
if replace_func is None: | ||||
obj_type_str = format_type_bare(obj_type) | ||||
ctx.api.fail( | ||||
f'Argument 1 to "replace" has incompatible type "{obj_type_str}"; expected a dataclass', | ||||
ctx.context, | ||||
) | ||||
return ctx.default_signature | ||||
|
||||
signature = get_proper_type(replace_func.type) | ||||
assert isinstance(signature, CallableType) | ||||
return signature |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2002,7 +2002,6 @@ e: Element[Bar] | |
reveal_type(e.elements) # N: Revealed type is "typing.Sequence[__main__.Element[__main__.Bar]]" | ||
[builtins fixtures/dataclasses.pyi] | ||
|
||
|
||
[case testIfConditionsInDefinition] | ||
# flags: --python-version 3.11 --always-true TRUTH | ||
from dataclasses import dataclass | ||
|
@@ -2036,4 +2035,38 @@ Foo( | |
present_4=4, | ||
present_5=5, | ||
) | ||
|
||
[builtins fixtures/dataclasses.pyi] | ||
|
||
[case testReplace] | ||
from dataclasses import dataclass, replace, InitVar | ||
from typing import ClassVar | ||
|
||
@dataclass | ||
class A: | ||
x: int | ||
q: InitVar[int] | ||
q2: InitVar[int] = 0 | ||
c: ClassVar[int] | ||
|
||
|
||
a = A(x=42, q=7) | ||
a2 = replace(a) # E: Missing named argument "q" for "replace" of "A" | ||
a2 = replace(a, q=42) | ||
a2 = replace(a, x=42, q=42) | ||
a2 = replace(a, x=42, q=42, c=7) # E: Unexpected keyword argument "c" for "replace" of "A" | ||
a2 = replace(a, x='42', q=42) # E: Argument "x" to "replace" of "A" has incompatible type "str"; expected "int" | ||
a2 = replace(a, q='42') # E: Argument "q" to "replace" of "A" has incompatible type "str"; expected "int" | ||
reveal_type(a2) # N: Revealed type is "__main__.A" | ||
|
||
[builtins fixtures/dataclasses.pyi] | ||
|
||
[case testReplaceNotDataclass] | ||
from dataclasses import replace | ||
|
||
replace(5) # E: Argument 1 to "replace" has incompatible type "int"; expected a dataclass | ||
|
||
class C: | ||
pass | ||
|
||
replace(C()) # E: Argument 1 to "replace" has incompatible type "C"; expected a dataclass | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would propose to add tests that cover:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To clarify on the fine grained (daemon) mode, I want to double-check that if there is only a replace call in a different module (and nothing else), it will be correctly rechecked (e.g. a new error will appear) if you change one of the dataclass field types. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good idea, did I do this right? d71bc21 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose to use a syntactically invalid name for internal things, like
mypy-replace
, we do this in several other places, see e.g.get_unique_redefinition_name()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Good idea using an invalid name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about adding an
ephemeral_data
dict toTypeInfo
, likemetadata
but ephemeral and intended for passing data between phases rather than between invocations, thus it doesn't have to be serializable?I was thinking something like
ephemeral_data: Dict[Plugin, object]
so that the data could remain compartmentalized by plugins and plugins wouldn't need to make up their sentinels.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be not worth it. But if this will keep coming, yes, we can add something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a similar need in my pynamodb plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
deserialize_type
call:https://github.com/pynamodb/pynamodb-mypy/blob/a9684e20b413c39b447e54238003c2d99a5c0c73/pynamodb_mypy/plugin.py#L50-L60
... even though... I can't quite remember, maybe that's a good thing? i.e.maybe due to caching semanal is skipped and
metadata
is what I'm supposed to consult?But, as you can see, in the checker phase I don't have a public API to really "deserialize" what's in the metadata back into a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ilevkivskyi Can you advice whether
get_class_decorator_hook_2
will be called when there's already a cache?