TYP overload fillna #40737 #40887

Merged: 15 commits, Apr 15, 2021
115 changes: 115 additions & 0 deletions pandas/core/frame.py
@@ -5007,6 +5007,121 @@ def rename(
            errors=errors,
        )

    @overload
Contributor

can we not put these in .pyi?

Contributor Author

I was told to add them directly to these files since existing overloads were already there

Member

@jreback yes, that would be great! Thing is though, the .pyi files require you to define all methods of a module

Given the sheer number of methods this module has, I'd suggest taking this PR with the overloads here, and then moving the overloads (along with annotations for all other methods) to a pandas/core/frame.pyi file

Member

@LarWong I'll get back to you on typing value

Member

> can we not put these in .pyi?

I don't think we want to go the route of using stubs for python files

> @jreback yes, that would be great! Thing is though, the .pyi files require you to define all methods of a module

not that I think it should be done here, but it is possible to partially type a module using stubs.

https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#incomplete-stubs

> Partial modules (i.e. modules that are missing some or all classes, functions, or attributes) must include a top-level __getattr__() function marked with an # incomplete comment (see example below).

Member

Thanks @simonjayhawkins, didn't know that was possible. Why not use partial stubs for overloads? Because some methods, like Series.drop

pandas/pandas/core/series.py

Lines 4478 to 4487 in 84d9c5e

    def drop(
        self,
        labels=None,
        axis=0,
        index=None,
        columns=None,
        level=None,
        inplace=False,
        errors="raise",
    ) -> Series:

have 5 (five!) arguments with defaults before inplace, leading to... 34 overloads 🤯! And even if we disallowed labels being passed as None in that one, that would still leave us with 18 overloads!
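
The counts quoted here can be reproduced with simple arithmetic, assuming the same pattern as the fillna overloads in this PR: one overload for inplace: Literal[False], one catch-all for inplace: bool, and one inplace: Literal[True] overload for every way of passing or omitting the defaulted arguments that come before inplace. For n such arguments that gives 2^n + 2. The helper name below is made up for illustration.

```python
from itertools import combinations


def count_inplace_overloads(args_before_inplace):
    """Count overloads under the pattern assumed above (hypothetical helper)."""
    n = len(args_before_inplace)
    # One Literal[True] overload per subset of the preceding arguments
    # that the caller passes explicitly (2**n subsets in total).
    literal_true = sum(
        1 for r in range(n + 1) for _ in combinations(args_before_inplace, r)
    )
    # Plus the Literal[False] overload and the bool catch-all.
    return literal_true + 2


fillna_count = count_inplace_overloads(["value", "method", "axis"])
drop_count = count_inplace_overloads(["labels", "axis", "index", "columns", "level"])
drop_labels_required = count_inplace_overloads(["axis", "index", "columns", "level"])
print(fillna_count, drop_count, drop_labels_required)  # 10 34 18
```

The first number matches the ten fillna overloads added per class in this PR; the other two match the 34 and 18 mentioned for Series.drop.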


For typing value here, do you think Scalar | Mapping[Hashable, Scalar] | Series | DataFrame | None would be correct?

Member (simonjayhawkins, Apr 14, 2021)

The problem is that if a stub file is present, it takes precedence over the Python file, so we cannot ensure internal consistency and would need to duplicate the type annotations to be able to check the functions in the module itself.

Member

> have 5 (five!) arguments with defaults before inplace, leading to...34 overloads

I think we are planning to drop the inplace argument, hopefully in pandas 2.0, and then we won't need all these overloads. #16529

    def fillna(
        self,
        value=...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: Literal[False] = ...,
        limit=...,
        downcast=...,
    ) -> DataFrame:
        ...

    @overload
    def fillna(
        self,
        value,
        method: str | None,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        method: str | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        method: str | None,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        *,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        method: str | None,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value=...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: bool = ...,
        limit=...,
        downcast=...,
    ) -> DataFrame | None:
        ...

    @doc(NDFrame.fillna, **_shared_doc_kwargs)
    def fillna(
        self,
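
To see how a type checker resolves overloads of this shape, here is a minimal, self-contained sketch. ToyFrame is a hypothetical stand-in with a single defaulted argument before inplace; it is not pandas code, and only the Literal[True] / Literal[False] / bool pattern is taken from the diff above.

```python
from typing import Literal, Optional, overload


class ToyFrame:
    """Hypothetical stand-in for DataFrame, used only to illustrate the pattern."""

    def __init__(self, data):
        self.data = list(data)

    @overload
    def fillna(self, value: float = ..., inplace: Literal[False] = ...) -> "ToyFrame": ...
    @overload
    def fillna(self, value: float, inplace: Literal[True]) -> None: ...
    @overload
    def fillna(self, *, inplace: Literal[True]) -> None: ...
    @overload
    def fillna(self, value: float = ..., inplace: bool = ...) -> "Optional[ToyFrame]": ...

    def fillna(self, value=0.0, inplace=False):
        # Runtime implementation; the overloads above only affect type checking.
        filled = [value if x is None else x for x in self.data]
        if inplace:
            self.data = filled
            return None
        return ToyFrame(filled)


frame = ToyFrame([1.0, None, 3.0])
filled_copy = frame.fillna(9.0)            # a checker should infer ToyFrame here
result = frame.fillna(9.0, inplace=True)   # a checker should infer None here
```

Without the overloads, both calls would be typed as Optional[ToyFrame], and callers would have to narrow the result before chaining further methods on it.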
115 changes: 115 additions & 0 deletions pandas/core/series.py
@@ -4581,6 +4581,121 @@ def drop(
            errors=errors,
        )

    @overload
    def fillna(
        self,
        value=...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: Literal[False] = ...,
        limit=...,
        downcast=...,
    ) -> Series:
        ...

    @overload
    def fillna(
        self,
        value,
        method: str | None,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        method: str | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        *,
        method: str | None,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        *,
        axis: Axis | None,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value,
        method: str | None,
        *,
        inplace: Literal[True],
        limit=...,
        downcast=...,
    ) -> None:
        ...

    @overload
    def fillna(
        self,
        value=...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: bool = ...,
        limit=...,
        downcast=...,
    ) -> Series | None:
        ...

    @doc(NDFrame.fillna, **_shared_doc_kwargs)
    def fillna(
        self,
3 changes: 0 additions & 3 deletions pandas/core/strings/object_array.py
@@ -5,7 +5,6 @@
     Pattern,
     Set,
     Union,
-    cast,
 )
 import unicodedata
 import warnings
@@ -371,9 +370,7 @@ def _str_get_dummies(self, sep="|"):
         try:
             arr = sep + arr + sep
         except TypeError:
-            arr = cast(Series, arr)
             arr = sep + arr.astype(str) + sep
-            arr = cast(Series, arr)

         tags: Set[str] = set()
         for ts in Series(arr).str.split(sep):
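
A plausible reading of this cleanup, offered as an assumption since the diff does not spell it out: once fillna has overloads, a call without inplace=True is inferred as Series rather than Series | None, so the casts that narrowed arr are no longer needed. Removing them cannot change behavior, because typing.cast returns its argument unchanged at runtime. A small illustration with a hypothetical function:

```python
from typing import Optional, cast


def maybe_text(flag: bool) -> Optional[str]:
    """Hypothetical function whose declared return type is wider than needed."""
    return "a|b" if flag else None


value = maybe_text(True)
# cast() only informs the type checker; at runtime it hands back the same object.
narrowed = cast(str, value)
same_object = narrowed is value
tags = narrowed.split("|")
```

If maybe_text were annotated precisely (for example through overloads on flag), the cast line could be dropped and a type checker would still accept the split call.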