BUG: make Series.agg aggregate when possible #53324

Closed
wants to merge 3 commits into from
26 changes: 25 additions & 1 deletion doc/source/whatsnew/v2.1.0.rst
@@ -108,7 +108,31 @@ Notable bug fixes

 These are bug fixes that might have notable behavior changes.

-.. _whatsnew_210.notable_bug_fixes.notable_bug_fix1:
+.. _whatsnew_210.notable_bug_fixes.series.agg:
+
+Previously, :meth:`Series.agg` did not necessarily aggregate, even if given an aggregation function:
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+    In [1]: ser = pd.Series([1, 2, 3])
+    In [2]: ser.agg(np.sum)
+    0    1
+    1    2
+    2    3
+    dtype: int64
+
+Now it will always aggregate when passed an aggregation function:
+
+*New behavior*:
+
+.. ipython:: python
+
+    ser = pd.Series([1, 2, 3])
+    ser.agg(np.sum)
+
+More generally, the result from :meth:`Series.agg` will now always be the same as the single-column result from :meth:`DataFrame.agg` (:issue:`53324`).

 notable_bug_fix1
 ^^^^^^^^^^^^^^^^
15 changes: 1 addition & 14 deletions pandas/core/apply.py
@@ -1084,22 +1084,9 @@ def agg(self):
         result = super().agg()
         if result is None:
             f = self.f
-
             # string, list-like, and dict-like are entirely handled in super
             assert callable(f)
-
-            # try a regular apply, this evaluates lambdas
-            # row-by-row; however if the lambda is expected a Series
-            # expression, e.g.: lambda x: x-x.quantile(0.25)
-            # this will fail, so we can try a vectorized evaluation
-
-            # we cannot FIRST try the vectorized evaluation, because
-            # then .agg and .apply would have different semantics if the
-            # operation is actually defined on the Series, e.g. str
-            try:
-                result = self.obj.apply(f)
-            except (ValueError, AttributeError, TypeError):
-                result = f(self.obj)
+            result = f(self.obj)

         return result
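
To make the effect of this hunk easier to reason about, here is a minimal sketch of the two dispatch strategies in plain Python. It is a toy model, not pandas code: `old_agg`, `new_agg`, and `flexible_sum` are hypothetical names, and a list stands in for a Series. It only illustrates why a function that accepts both scalars and containers stopped short of aggregating under the removed try/except.

```python
# Toy model of the dispatch change in this hunk (hypothetical names, not pandas).

def old_agg(values, f):
    """Removed strategy: try f element by element, fall back to f(values)."""
    try:
        return [f(v) for v in values]
    except (ValueError, AttributeError, TypeError):
        return f(values)


def new_agg(values, f):
    """Proposed strategy: always hand the whole container to f."""
    return f(values)


def flexible_sum(x):
    """Accepts a scalar or a list, loosely mimicking a function that works on both."""
    return sum(x) if isinstance(x, list) else x


values = [1, 2, 3]

# The element-wise attempt succeeds, so the old path never aggregates.
print(old_agg(values, flexible_sum))  # [1, 2, 3]
print(new_agg(values, flexible_sum))  # 6

# The built-in sum raises TypeError on a scalar int, so the old path
# falls back and both strategies reduce.
print(old_agg(values, sum))  # 6
print(new_agg(values, sum))  # 6

# str works on scalars and on containers, so the two strategies diverge.
print(old_agg(values, str))  # ['1', '2', '3']
print(new_agg(values, str))  # [1, 2, 3]
```

The last case corresponds to what the removed `test_agg_apply_evaluate_lambdas_the_same` test exercised: with the fallback gone, a callable such as `str` is applied to the whole object rather than to each element.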

17 changes: 0 additions & 17 deletions pandas/tests/apply/test_series_apply.py
@@ -346,18 +346,6 @@ def test_demo():
     tm.assert_series_equal(result, expected)


-def test_agg_apply_evaluate_lambdas_the_same(string_series):
-    # test that we are evaluating row-by-row first
-    # before vectorized evaluation
-    result = string_series.apply(lambda x: str(x))
-    expected = string_series.agg(lambda x: str(x))
-    tm.assert_series_equal(result, expected)
-
-    result = string_series.apply(str)
-    expected = string_series.agg(str)
-    tm.assert_series_equal(result, expected)
-
-
 def test_with_nested_series(datetime_series):
     # GH 2316
     # .agg with a reducer and a transform, what to do
@@ -370,11 +358,6 @@ def test_with_nested_series(datetime_series):
     expected = DataFrame({"x": datetime_series, "x^2": datetime_series**2})
     tm.assert_frame_equal(result, expected)

-    with tm.assert_produces_warning(FutureWarning, match=msg):
-        # GH52123
-        result = datetime_series.agg(lambda x: Series([x, x**2], index=["x", "x^2"]))
-    tm.assert_frame_equal(result, expected)
-

 def test_replicate_describe(string_series):
     # this also tests a result set that is all scalars