Aggregating groupby with multiple functions all return the same value #16904
Comments
Could you include a code sample to create the DataFrame?
I don't understand. The output in #7186 is obviously not the same:
The reason they look so similar is because they're defined by
What am I missing?
Regardless of how this turns out, it does seem like another test might be needed to shore this one up for good (@dsm054: your examples would be a good starting point unless @HristoBuyukliev has additional code to provide).
Yeah, my bad. My code had a bug in it, and I didn't notice that the old issue's results do have some differences. I'm closing the issue now.
Actually, I would like to keep the issue open until we can confirm coverage for this issue. @dsm054, would you like to add your example code to tests?
@HristoBuyukliev: feel free to contribute a test on that front as well, so that you can remain assured that this won't be an issue for you again 😄
Yeah, I mean, say it turned out that when you have a numpy function and multiple lambdas in an agg call, the last lambda function dominated the others for some reason. Would any of us really have been shocked? Surprised, maybe, but usually there's about a bug a week where I'm genuinely startled no one noticed before. I'll add a test this weekend if no one else gets to it.
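A minimal sketch of the kind of regression test discussed here, not the test that was eventually added. It assumes a pandas version with named aggregation (0.25 or later, so newer than the 0.20.3 reported in this issue); the frame, the column names `group` and `value`, and the helper `aggregate_with_mixed_functions` are hypothetical.

```python
import numpy as np
import pandas as pd


def aggregate_with_mixed_functions(df):
    """Aggregate `value` per `group` with one NumPy function and two lambdas."""
    return df.groupby("group")["value"].agg(
        nonzero=np.count_nonzero,      # NumPy function
        smallest=lambda s: s.min(),    # first lambda
        largest=lambda s: s.max(),     # second lambda
    )


def test_each_function_keeps_its_own_result():
    # Hypothetical data chosen so that all three aggregations differ.
    df = pd.DataFrame({"group": ["a", "a", "b", "b"], "value": [1, 4, 0, 7]})
    result = aggregate_with_mixed_functions(df)

    # If the last lambda "dominated", every column would equal `largest`.
    assert result["nonzero"].tolist() == [2, 1]
    assert result["smallest"].tolist() == [1, 0]
    assert result["largest"].tolist() == [4, 7]


test_each_function_keeps_its_own_result()
```

The data is picked so that no two output columns coincide, which is what lets the assertions catch one function overwriting another.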
@dsm054: Go right ahead and add it! You were the one who helped resolve this.
closing, but @dsm054 if you have a repro with a test pls comment (or open a new issue).
So, I want to aggregate a groupby object by two criteria: count of observations, and count of nonzero observations:
Problem description
While this code is referenced in issue 7186, and that issue is closed, there is a bug: all the output columns are the same. That is evident even in issue 7186, and I'm shocked that nobody picked it up.
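The reporter's snippet is not preserved in this copy of the thread. As a rough illustration only, the aggregation described above (count of observations and count of nonzero observations per group) might look like the following. It assumes pandas 0.25 or later for named aggregation; the data and the column names `group`, `value`, `n_obs`, and `n_nonzero` are made up.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "value": [0, 3, 5, 0, 0],
})

# Named aggregation: each output column is tied to its own function,
# so the two results cannot be confused with one another.
result = df.groupby("group")["value"].agg(
    n_obs="count",                            # number of observations
    n_nonzero=lambda s: int((s != 0).sum()),  # number of nonzero observations
)

print(result)
```

With this data the two columns differ for both groups, which makes it easy to see whether each function was applied separately.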
Output of pd.show_versions()
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-31-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.0.1
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.4.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: 1.0b3
sqlalchemy: 1.0.14
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.7.3
s3fs: None
pandas_gbq: None
pandas_datareader: None