Skip to content

TYP: mostly DataFrame return overloads #56739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 13, 2024
Merged

TYP: mostly DataFrame return overloads #56739

merged 6 commits into from
Feb 13, 2024

Conversation

twoertwein
Copy link
Member

@twoertwein twoertwein commented Jan 5, 2024

  • DataFrame.any/bool return bool when axis=None and Series otherwise
  • DataFrame.sum/prod currently always returns a Series (but will return anything when axis=None in the future)
  • DataFrame.mean/median/var/std/skew/kurt return Series when axis is not None and potentially anything when axis=None

I assume it is fine to directly use keyword-only arguments in NDFrame as it is private(?), and Series+DataFrame overwrite the methods in question.

@@ -11538,9 +11538,42 @@ def _reduce_axis1(self, name: str, func, skipna: bool) -> Series:
res_ser = self._constructor_sliced(result, index=self.index, copy=False)
return res_ser

@doc(make_doc("any", ndim=2))
# error: Signature of "any" incompatible with supertype "NDFrame"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be fine removing # error: ... when we do not expect that they can be fixed (it doesn't make sense for NDFrame to have those overloads, so there will always be a mismatch between them.)

@@ -392,7 +392,7 @@ def dtype_counts(self) -> Mapping[str, int]:

@property
@abstractmethod
def non_null_counts(self) -> Sequence[int]:
def non_null_counts(self) -> list[int] | Series:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Series is not a Sequence as Series.__hash__ = None

@twoertwein twoertwein marked this pull request as ready for review January 5, 2024 04:24
@twoertwein
Copy link
Member Author

The keyword-only changes do not seem to be needed by mypy/pyright: can revert those changes (DataFrame.any is already keyword-only, but DataFrame.all/min//sum/... aren't)

@mroeschke mroeschke added DataFrame DataFrame data structure Typing type annotations, mypy/pyright type checking labels Jan 5, 2024
@twoertwein
Copy link
Member Author

The keyword-only changes do not seem to be needed by mypy/pyright: can revert those changes (DataFrame.any is already keyword-only, but DataFrame.all/min//sum/... aren't)

xref #57087 (I'm also in favor of keyword-only)

@twoertwein
Copy link
Member Author

@mroeschke Let me know if I should try to outsource some of the changes to a separate PR to make it easier to review. The keyword-only changes caused many changes: typing overloads and fixing positional usage in the tests and docs.

@mroeschke
Copy link
Member

Yeah could you break out the keyword-only deprecation into it's own PR separate from the misc typing changes?

@twoertwein
Copy link
Member Author

Yeah could you break out the keyword-only deprecation into it's own PR separate from the misc typing changes?

I removed the keyword-deprecation decorators

@@ -11805,6 +11805,7 @@ def _logical_func(

def any(
self,
*,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be in your follow up PR? Does it change the current behavior of e.g. DataFrame/Series.any()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be fine having those changes: NDFrame is private and all those methods are overwritten by DataFrame and Series, so these changes are not user-facing. Let me revert them to keep it less controversial.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind, I will revert this change back - see failing mypy&stubtest


@deprecate_nonkeyword_arguments(version="3.0", allowed_args=["self"], name="std")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be in your follow up too?

@mroeschke mroeschke added this to the 3.0 milestone Feb 13, 2024
@mroeschke mroeschke merged commit baf98b8 into pandas-dev:main Feb 13, 2024
@mroeschke
Copy link
Member

Thanks @twoertwein

@twoertwein twoertwein deleted the dateframe_overloads branch April 7, 2024 10:59
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
* TYP: mostly DataFrame return overloads

* remove keyword-only enforcement

* revert keyword-only in NDFrame

* revert revert

* remove decorators in Series
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFrame DataFrame data structure Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants