TST: Add test for col names during groupby().agg() #43244

calvh · 2021-08-27T03:26:17Z

Column names should consistently be retained when using df.groupby().agg()

closes BUG: groupby().agg() loses column names for an empty dataframe with 'idxmax' as an aggregation function #42332
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

What's new:
Add a test to pandas/tests/groupby/aggregate/test_aggregate.py

The issue appears to be fixed now:

Tested on 1.4.0.dev0+508.g5fe02971f8
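For reference, a minimal sketch of how this check could be reproduced. It assumes the same empty frame and named aggregations used in this PR's test; the variable names are illustrative, and behavior on pandas versions other than the one listed here has not been confirmed in this thread.

```python
import pandas as pd

# Empty frame with integer columns, grouped by two key columns.
df = pd.DataFrame(columns=["id1", "id2", "time", "values"], dtype="int")
grouped = df.groupby(["id1", "id2"])

# Named aggregation that includes "idxmax", the function named in GH42332.
aggregated = grouped.agg(
    start=pd.NamedAgg(column="time", aggfunc="min"),
    peak_time=pd.NamedAgg(column="values", aggfunc="idxmax"),
)

# The group keys should survive as index level names even with no rows.
index_names = list(aggregated.index.names)
print(index_names)
```

If the bug were still present, the index level names would be lost instead of matching the groupby keys.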

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 5fe0297
python : 3.9.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-31-generic
Version : #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8

pandas : 1.4.0.dev0+508.g5fe02971f8
numpy : 1.20.3
pytz : 2021.1
dateutil : 2.8.2
pip : 20.3.4
setuptools : 57.4.0
Cython : 0.29.24
pytest : 6.2.4
hypothesis : 6.14.9
sphinx : 4.1.2
blosc : 1.10.4
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.26.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 2021.05.0
fastparquet : 0.7.1
gcsfs : 2021.05.0
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 5.0.0
pyxlsb : None
s3fs : 2021.05.0
scipy : 1.7.1
sqlalchemy : 1.4.23
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.54.0

phofl · 2021-08-27T12:10:57Z

pandas/tests/groupby/aggregate/test_aggregate.py

+        ["id1", "id2"]
+    )
+
+    df_sum_idx = df.sum().index.names


Thanks for the tests. Could you please check the whole DataFrame? Additionally, it would be good if you could parametrize here.

@phofl sorry for the delay

After some experimenting, here is what I came up with to parametrize the test. I'm unsure of how to parametrize the sum function.

Also, could you elaborate on what you mean by the "whole DataFrame"? I am not sure what you meant.

Thank you.

@pytest.mark.parametrize(
    "agg_params",
    [
        {"start": pd.NamedAgg(column="time", aggfunc="min")},
        {
            "start": pd.NamedAgg(column="time", aggfunc="min"),
            "peak_time": pd.NamedAgg(column="values", aggfunc="idxmax"),
        },
        {"peak_time": pd.NamedAgg(column="values", aggfunc="idxmax")},
    ],
)
def test_groupby_agg_column_names(agg_params):
    # GH42332
    grouped = (
        DataFrame(columns=["id1", "id2", "time", "values"], dtype="int")
        .groupby(["id1", "id2"])
    )
    aggregated = grouped.agg(**agg_params)
    assert (
        grouped.sum().index.names
        == aggregated.index.names
        == ["id1", "id2"]
    )

jreback · 2021-08-31T22:44:30Z

pandas/tests/groupby/aggregate/test_aggregate.py

+
+    expected = ["id1", "id2"]
+
+    assert df_sum_idx == expected


use tm.assert_frame_equal and construct the actual expected value.
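A rough sketch of what this suggestion could look like, under some assumptions: it uses a small non-empty frame (not the empty frame from GH42332) so the expected values are easy to verify by hand, and it calls the public `pandas.testing.assert_frame_equal` rather than the `tm` alias for `pandas._testing` used inside the pandas test suite. The data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the PR itself targets an empty frame.
df = pd.DataFrame(
    {
        "id1": [1, 1, 2],
        "id2": [10, 10, 20],
        "time": [5, 3, 7],
        "values": [0.1, 0.9, 0.4],
    }
)

result = df.groupby(["id1", "id2"]).agg(
    start=pd.NamedAgg(column="time", aggfunc="min"),
    peak_time=pd.NamedAgg(column="values", aggfunc="idxmax"),
)

# Construct the full expected frame: values, column labels, and the
# MultiIndex with its level names, instead of comparing names only.
expected = pd.DataFrame(
    {"start": [3, 7], "peak_time": [1, 2]},
    index=pd.MultiIndex.from_tuples([(1, 10), (2, 20)], names=["id1", "id2"]),
)

# Raises AssertionError if any part of the frame differs.
pd.testing.assert_frame_equal(result, expected)
```

Comparing whole frames this way covers the index level names as well as the aggregated values, so a regression in either would fail the test.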

github-actions · 2021-10-01T00:04:33Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

calvh · 2021-10-01T13:41:30Z

Sorry, yes I am still interested in working on this

mroeschke · 2021-10-31T00:52:18Z

Appears this PR has been dormant for a while and still needs updates so closing. If interested in continuing, please merge master, address related comments and we can reopen.

TST: Add test for col names during groupby().agg()

5fe0297

Column names should consistently be retained when using df.groupby().agg()

phofl reviewed Aug 27, 2021

jreback added this to the 1.4 milestone Aug 31, 2021

jreback added the Groupby and Testing labels Aug 31, 2021

jreback requested changes Aug 31, 2021

github-actions bot added the Stale label Oct 1, 2021

mroeschke closed this Oct 31, 2021
