Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: DataFrame with Int64 columns casts to float64 with .max()/.min() #34210

1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1128,6 +1128,7 @@ ExtensionArray
- Fixed bug where :meth:`StringArray.memory_usage` was not implemented (:issue:`33963`)
- Fixed bug where :meth:`DataFrameGroupBy` would ignore the ``min_count`` argument for aggregations on nullable boolean dtypes (:issue:`34051`)
- Fixed bug that `DataFrame(columns=.., dtype='string')` would fail (:issue:`27953`, :issue:`33623`)
- Fixed bug where :meth:`DataFrame.min` and :meth:`DataFrame.max` with Int64 columns casts to float64 (:issue:`32651`)

Other
^^^^^
Expand Down
4 changes: 3 additions & 1 deletion pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -8508,7 +8508,9 @@ def _get_data(axis_matters):
raise NotImplementedError(msg)
return data

if numeric_only is not None and axis in [0, 1]:
is_numeric = self._mgr.blocks and all(b.is_numeric for b in self._mgr.blocks)

if (is_numeric or numeric_only is not None) and axis is not None:
df = self
if numeric_only is True:
df = _get_data(axis_matters=True)
Expand Down
13 changes: 13 additions & 0 deletions pandas/tests/frame/test_analytics.py
Original file line number Diff line number Diff line change
Expand Up @@ -912,6 +912,19 @@ def test_mean_extensionarray_numeric_only_true(self):
expected = pd.DataFrame(arr).mean()
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("numeric_only", [True, False, None])
@pytest.mark.parametrize("method", ["min", "max"])
def test_minmax_extensionarray(self, method, numeric_only):
# https://github.com/pandas-dev/pandas/issues/32651
int64_info = np.iinfo("int64")
ser = Series([int64_info.max, None, int64_info.min], dtype=pd.Int64Dtype())
df = DataFrame({"Int64": ser})
result = getattr(df, method)(numeric_only=numeric_only)
expected = Series(
[getattr(int64_info, method)], index=pd.Index(["Int64"], dtype="object")
)
tm.assert_series_equal(result, expected)

def test_stats_mixed_type(self, float_string_frame):
# don't blow up
float_string_frame.std(1)
Expand Down