ENH: groupby.apply for Categorical should preserve categories (closes… #10142
Does this change anything for groupby.apply in the transform case, rather than the aggregate case? It would be nice to add a test for transform-like apply with categoricals if we don't have one already.
Oh good point ... the following actually doesn't reindex:
I'll dig into the code path for this.
@mortada I actually think not reindexing is the right behavior here for transform-like grouped operations: the result index should be identical to the input index.
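A minimal sketch of the aggregate/transform distinction discussed above. The data and column names are hypothetical, and the built-in `sum` and `transform` methods stand in for `apply` here (an assumption made so that the two cases are explicit); `observed=False` is passed so that unobserved categories are kept.

```python
import pandas as pd

# Hypothetical data: category "c" is declared but never observed.
key = pd.Categorical(["a", "a", "b"], categories=["a", "b", "c"])
df = pd.DataFrame({"key": key, "values": [1, 2, 3]})

grouped = df.groupby("key", observed=False)["values"]

# Aggregation: one row per category, including the unobserved "c".
totals = grouped.sum()

# Transformation: one row per input row, indexed like the input.
centered = grouped.transform(lambda s: s - s.mean())

print(totals)
print(centered)
```

The aggregated result is indexed by the categories, while the transformed result keeps the original row index, which is the behavior argued for in this thread.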
Ah cool, thanks guys. Indeed, I see that in the docs about transform. I'll add a test case for this behavior.
d8b49d5 to bef6932
OK, looks good to me. Can you add a release note?
bef6932 to aeef3c2
Ah, absolutely. I'm not quite sure which section this should go in, so I just added it under "Other enhancements".
I'd probably call this a bug fix?
Yeah, I was actually thinking the same thing, but the original issue was created as an enhancement.
@@ -26,6 +26,8 @@ New features
Other enhancements
^^^^^^^^^^^^^^^^^^

- groupby.apply aggregation for Categorical now preserves categories (:issue:`10138`)
Yeah, why don't you move it to bug fixes.
Sure, will do.
aeef3c2 to 659bbec
@@ -2596,6 +2596,34 @@ def get_stats(group):
        result = self.df.groupby(cats).D.apply(get_stats)
        self.assertEqual(result.index.names[0], 'C')

    def test_apply_categorical_data(self):
        # GH 10138
        dense = Categorical(list('abc'))
This is ordered=False; can you add a test for ordered=True as well?
Sure, I'll change this to test both cases.
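Covering both the ordered and unordered cases could look roughly like the loop below. This is only a sketch with hypothetical data and names, not the test added in this PR; it uses `sum` with `observed=False` as a stand-in aggregation and checks that the declared categories and the `ordered` flag survive grouping.

```python
import pandas as pd

results = {}
for ordered in (False, True):
    # Hypothetical data: "b" is a declared category with no rows.
    key = pd.Categorical(list("aaa"), categories=["a", "b"], ordered=ordered)
    df = pd.DataFrame({"key": key, "values": [0, 1, 2]})

    # Aggregate per category, keeping the unobserved category "b".
    out = df.groupby("key", observed=False)["values"].sum()
    results[ordered] = out

for ordered, out in results.items():
    print(ordered, list(out.index.categories), out.index.ordered)
```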
659bbec to eb33cf2
@@ -68,6 +68,7 @@ Bug Fixes

- Bug in ``mean()`` where integer dtypes can overflow (:issue:`10172`)
- Bug in groupby.apply aggregation for Categorical not preserving categories (:issue:`10138`)
move this to 0.16.2
eb33cf2 to c8bf1c4
@jreback moved this to |
@mortada great, thanks!
closes #10138