BUG/TST: non-numeric EA reductions #59234

Merged: 4 commits, Jul 13, 2024


lukemanley
Member

Fixes a few bugs related to non-numeric EA reductions and expands testing by un-skipping a number of existing tests.

> pytest -k test_reduce pandas/tests/extension/

# main branch
2160 passed, 1222 skipped, 154 xfailed

# PR
2507 passed, 886 skipped, 143 xfailed

BUG fix example:

In [1]: import pandas as pd

In [2]: arr = pd.array([None], dtype="duration[ms][pyarrow]")

In [3]: pd.DataFrame({"a": arr}).min().dtype

# main branch
Out[3]: int64[pyarrow]

# PR
Out[3]: duration[ms][pyarrow]

@lukemanley added the Bug, ExtensionArray, and Reduction Operations labels on Jul 12, 2024
@lukemanley added this to the 3.0 milestone on Jul 12, 2024
@@ -525,7 +525,10 @@ def _reduce(
         self, name: str, *, skipna: bool = True, axis: AxisInt | None = 0, **kwargs
     ):
         if name in ["min", "max"]:
-            return getattr(self, name)(skipna=skipna, axis=axis)
+            result = getattr(self, name)(skipna=skipna, axis=axis)
+            if kwargs.get("keepdims"):
Member

Could you include keepdims in the _reduce signature instead?

Member Author

updated
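
For readers following the thread, here is a minimal sketch of what moving `keepdims` out of `**kwargs` and into the `_reduce` signature can look like. `ToyArray` and its helper `_min_max` are hypothetical stand-ins invented for illustration; this is not the pandas code from the PR, and it uses `None` as the only missing value.

```python
from __future__ import annotations


class ToyArray:
    """Hypothetical stand-in for an extension array; not pandas code."""

    def __init__(self, values: list) -> None:
        self._values = list(values)

    def _min_max(self, func, skipna: bool):
        present = [v for v in self._values if v is not None]
        # With skipna=False, any missing value makes the result missing.
        if not skipna and len(present) != len(self._values):
            return None
        return func(present) if present else None

    def min(self, *, skipna: bool = True):
        return self._min_max(min, skipna)

    def max(self, *, skipna: bool = True):
        return self._min_max(max, skipna)

    def _reduce(
        self, name: str, *, skipna: bool = True, keepdims: bool = False, **kwargs
    ):
        if name in ["min", "max"]:
            result = getattr(self, name)(skipna=skipna)
            if keepdims:
                # Wrap the scalar in a length-1 array of the same type.
                return type(self)([result])
            return result
        raise TypeError(f"ToyArray does not support reduction {name!r}")


arr = ToyArray([3, None, 1])
print(arr._reduce("min"))
print(arr._reduce("max", keepdims=True)._values)
```

An explicit keyword makes the default visible in the signature and avoids a silent `kwargs.get` lookup, which seems to be the point of the review comment.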

if name == "std":
    from pandas.core.arrays import TimedeltaArray

    return TimedeltaArray._from_sequence(result)
Member

Does this preserve the resolution of the input? E.g., the std of a ms-resolution datetime array should return ms resolution.

Member Author

yes, the resolution is preserved in nanops.nanstd.
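
To illustrate what "preserving the resolution" means here, a rough NumPy-only sketch follows. It is not how `nanops.nanstd` is implemented; `std_keep_unit` is a hypothetical helper, and it assumes the input has no `NaT` values. The idea is to compute the spread on the integer representation and wrap the result in a `timedelta64` with the same unit as the input.

```python
import numpy as np


def std_keep_unit(values: np.ndarray, ddof: int = 1) -> np.timedelta64:
    """Standard deviation of a datetime64 array, returned in the input's unit.

    Hypothetical helper for illustration; assumes no NaT values.
    """
    unit, _ = np.datetime_data(values.dtype)  # e.g. "ms"
    # Work on the underlying integer counts (ms since epoch for "M8[ms]").
    ints = values.view("i8").astype("f8")
    spread = ints.std(ddof=ddof)
    # Re-wrap using the input's unit rather than a fixed one such as "ns".
    return np.timedelta64(int(round(spread)), unit)


stamps = np.array(["2024-01-01", "2024-01-02", "2024-01-03"], dtype="M8[ms]")
result = std_keep_unit(stamps)
print(result, result.dtype)
```

Three daily timestamps have a sample standard deviation of one day, so the result should be one day expressed in milliseconds.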

@mroeschke merged commit 39bd3d3 into pandas-dev:main on Jul 13, 2024
45 checks passed
@mroeschke
Member

Thanks @lukemanley

lithomas1 added a commit to lithomas1/pandas that referenced this pull request Sep 9, 2024
jorisvandenbossche pushed a commit to WillAyd/pandas that referenced this pull request Oct 2, 2024
jorisvandenbossche pushed a commit to WillAyd/pandas that referenced this pull request Oct 3, 2024
jorisvandenbossche pushed a commit to WillAyd/pandas that referenced this pull request Oct 7, 2024