TYP: Correct type annotation for to_dict. #55130
The `into` argument of DataFrame.to_dict and Series.to_dict can be either a dict class or an instance of one; this is covariant, so subclasses of dict can also be used. The argument was annotated as `type[dict]`, though, so type checkers flagged passing an initialized object (required for collections.defaultdict) as an incorrect argument type. Fix by annotating `into` to accept either a subclass of dict or an initialized instance of a subclass of dict.
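To make the class-or-instance distinction concrete, here is a minimal runtime sketch that does not use pandas. The names `build_mapping` and `MappingT` are hypothetical and chosen for illustration; this is not the pandas implementation, only one way a function could accept both forms, assuming Python 3.9 or later.

```python
from collections import OrderedDict, defaultdict
from collections.abc import Iterable, MutableMapping
from typing import Any, TypeVar, Union

# Hypothetical names, for illustration only; not the pandas implementation.
MappingT = TypeVar("MappingT", bound=MutableMapping)


def build_mapping(
    pairs: Iterable[tuple[Any, Any]],
    into: Union[type[MappingT], MappingT],
) -> MappingT:
    """Collect key/value pairs into the mapping type described by `into`."""
    if isinstance(into, type):
        # A class was passed: instantiate it with no arguments.
        result = into()
    elif isinstance(into, defaultdict):
        # A defaultdict carries its factory on the instance, which is why
        # it has to be passed initialized rather than as a bare class.
        result = type(into)(into.default_factory)
    else:
        # Any other instance: create a fresh, empty mapping of the same type.
        result = type(into)()
    for key, value in pairs:
        result[key] = value
    return result


pairs = [("a", 1), ("b", 2)]
from_class = build_mapping(pairs, OrderedDict)
from_instance = build_mapping(pairs, defaultdict(list))
```

An annotation of `type[dict]` would accept the first call but reject the second, which is the mismatch described above.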
afbc799 to 449eb46
Unfortunately a generic type annotation with a default triggers an existing mypy limitation (python/mypy#3737). The current workaround is to use overloads and then not annotate the implementation containing the default parameter; this still enables mypy to deduce correct return types. Two overloads are added for Series.to_dict, even though they could be combined using a Union type, as at least two overloads are required for a single method.
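The overload workaround can be sketched roughly as follows. `Record` is a hypothetical stand-in class, not pandas code, and the signatures are an assumption about the general shape of the pattern rather than a copy of what was merged.

```python
from collections import OrderedDict
from collections.abc import MutableMapping
from typing import Any, TypeVar, Union, overload

MappingT = TypeVar("MappingT", bound=MutableMapping)


class Record:
    """Hypothetical stand-in for a class exposing a to_dict method."""

    def __init__(self, data: dict[str, Any]) -> None:
        self._data = dict(data)

    # Overload 1: the caller passes `into`, so the return type follows it.
    @overload
    def to_dict(self, *, into: Union[type[MappingT], MappingT]) -> MappingT: ...

    # Overload 2: `into` is omitted (or is plain dict), so a dict is returned.
    @overload
    def to_dict(self, *, into: type[dict] = ...) -> dict: ...

    # The implementation holding the real default is left unannotated, which
    # sidesteps the generic-parameter-with-default limitation.
    def to_dict(self, *, into=dict):
        result = into() if isinstance(into, type) else type(into)()
        result.update(self._data)
        return result


record = Record({"a": 1})
plain = record.to_dict()
ordered = record.to_dict(into=OrderedDict)
reused = record.to_dict(into=OrderedDict())
```

Type checkers consult only the overload signatures, so callers still get a precise return type even though the implementation itself carries no annotations.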
I'm going to defer to @twoertwein on reviewing this one
pandas/core/frame.py (outdated)
    index: bool = True,
    ) -> dict | list[dict]:
    ):
Can you add the annotations for the non-overload definition? If it triggers a mypy error within the function, it is okay to add an ignore comment.
edit: having the ignore comment will function as a TODO enforced by mypy (when a future mypy allows a TypeVar-annotated parameter with a default)
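A rough sketch of what an annotated implementation with an ignore comment might look like. `to_mapping` and `MappingT` are hypothetical names, and the `assignment` error code is an assumption about how mypy categorizes the incompatible-default error. The "enforced TODO" behavior relies on mypy's `warn_unused_ignores` option, which reports an ignore comment once it is no longer needed.

```python
from collections import OrderedDict
from collections.abc import Mapping, MutableMapping
from typing import Any, TypeVar, Union

MappingT = TypeVar("MappingT", bound=MutableMapping)


def to_mapping(
    data: Mapping[Any, Any],
    # mypy rejects the `dict` default against the TypeVar-based annotation;
    # the ignore silences that one error and can be dropped once it is unused.
    into: Union[type[MappingT], MappingT] = dict,  # type: ignore[assignment]
) -> MappingT:
    """Hypothetical helper: copy `data` into the mapping type given by `into`."""
    result = into() if isinstance(into, type) else type(into)()
    result.update(data)
    return result


default_result = to_mapping({"a": 1})
ordered_result = to_mapping({"a": 1}, into=OrderedDict)
```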
@jsspencer I pushed to your branch so that pd.DataFrame([]).to_dict("dict") works correctly
Should now work for all cases:
    import pandas as pd
    from typing import MutableMapping

    class MyDict(dict): ...

    def test(into: MutableMapping):
        # MutableMapping
        reveal_type(pd.DataFrame([]).to_dict("dict", into=into))
        reveal_type(pd.Series([]).to_dict(into=into))
        # dict
        reveal_type(pd.DataFrame([]).to_dict("dict"))
        reveal_type(pd.Series([]).to_dict())
        # MyDict
        reveal_type(pd.DataFrame([]).to_dict("dict", into=MyDict))
        reveal_type(pd.Series([]).to_dict(into=MyDict))
I created pandas-dev/pandas-stubs#784 so we make a similar change in the stubs
    # error: Incompatible default for argument "into" (default has type "type[
    # dict[Any, Any]]", argument has type "type[MutableMappingT] | MutableMappingT")
    @deprecate_nonkeyword_arguments(
        version="3.0", allowed_args=["self"], name="to_dict"
Is this okay @mroeschke? It would make it consistent with DataFrame.to_dict (and allow the dict default case for typing) but it might be a bit "odd" to require keyword-only for a function that takes exactly one argument (need to also fix the failing doc test for this change)
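For readers unfamiliar with this kind of deprecation, the sketch below shows the general idea with a simplified, hypothetical decorator. `warn_nonkeyword` and `Record` are invented for illustration and are not pandas' `deprecate_nonkeyword_arguments`; the warning text and category are assumptions.

```python
import functools
import warnings


def warn_nonkeyword(allowed: int):
    """Hypothetical decorator: warn if more than `allowed` positional args are passed."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > allowed:
                warnings.warn(
                    f"Passing arguments to {func.__name__} positionally is "
                    "deprecated; use keyword arguments instead.",
                    FutureWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Record:
    """Hypothetical stand-in for a class with a single-argument to_dict."""

    def __init__(self, data):
        self._data = dict(data)

    @warn_nonkeyword(allowed=1)  # only `self` may be passed positionally
    def to_dict(self, into=dict):
        return into(self._data)


record = Record({"a": 1})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    positional_result = record.to_dict(dict)   # warns: positional `into`
    keyword_result = record.to_dict(into=dict)  # no warning
```

Once the deprecation period ends and `into` becomes keyword-only, the default can be expressed through an overload that omits the argument entirely, which is the typing benefit mentioned above.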
Yup the consistency argument makes sense to go forward with this deprecation
Green now
Thanks @jsspencer