fixed transform docs #43058

Closed
wants to merge 3 commits into from

Conversation

willie3838
Contributor

@pep8speaks

pep8speaks commented Aug 15, 2021

Hello @willie3838! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-08-17 16:34:43 UTC

@simonjayhawkins simonjayhawkins added Apply Apply, Aggregate, Transform, Map Docs labels Aug 16, 2021
@jreback jreback added this to the 1.4 milestone Aug 17, 2021
- return a %(klass)s having the same indexes as the original object
- filled with the transformed values
+ Apply function ``func`` column-by-column to the GroupBy object and return a %(klass)s
+ with the same length as the group.
Contributor

Can you say "index" here? (You can mention the length too.)

Contributor Author

done!

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR! The change to the argument name (f -> func) is great, but the summary is not quite accurate. Some thoughts below.

- Call function producing a like-indexed %(klass)s on each group and
- return a %(klass)s having the same indexes as the original object
- filled with the transformed values
+ Apply function ``func`` column-by-column to the GroupBy object and return a %(klass)s
Member

This is sometimes true, but not always. For example:

import pandas as pd

def foo(x):
    # Print the object that transform passes in, then return it unchanged
    print(x)
    return x

df = pd.DataFrame({'a': [1, 2], 'b': [2, 3], 'c': [3, 4]})
df.groupby('a').transform(foo)

gives (after clipping the output from the first group, which is used to determine slowpath vs fastpath):

   b  c
0  2  3
   b  c
1  3  4

In this case, the transform is evaluated on the entire group, not column-by-column.

Member

@rhshadrach rhshadrach Aug 17, 2021

My recommendation here would be to combine the previous version (calling on the group) along with what you have (column-by-column). If calling on the first group is successful, then transform will operate group-by-group. Otherwise, it falls back to column-by-column (within each group).
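
The two ways of calling the function described above can be observed with a small sketch. The data and the helper names `record` and `series_only` are made up for illustration and are not part of the PR. Which objects are passed, and in what order, is an internal detail that may differ between pandas versions, so the sketch only records the types it sees rather than assuming them.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 3, 5, 9], "c": [2, 4, 6, 10]})

seen = []  # type name of every object handed to the function


def record(x):
    # Hypothetical helper: note what transform passed in, return it unchanged.
    seen.append(type(x).__name__)
    return x


whole = df.groupby("a").transform(record)


def series_only(x):
    # Hypothetical helper that rejects whole groups (DataFrames),
    # so only the per-column calls can succeed.
    if isinstance(x, pd.DataFrame):
        raise TypeError("expected a Series")
    return x - x.mean()


per_column = df.groupby("a").transform(series_only)

print(sorted(set(seen)))  # which input types the function received
print(per_column)         # each column demeaned within each group
```

In both cases the result keeps the original row index and contains the non-grouping columns `b` and `c`.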

Contributor Author

Thanks for the input! However, I think combining the group-by-group and column-by-column descriptions is a bit confusing. How would a user then differentiate between the .apply() and .transform() functions?

Member

@rhshadrach rhshadrach Aug 22, 2021

@willie3838 - Agreed this is a part of the API that needs cleaning up, but I'd recommend leaving that aspect out of scope for this PR and documenting the behavior as it currently exists. If you agree with my assessment that the current documentation in this PR is not correct, then it needs to be fixed. On the other hand, if you think my assessment is not right, then let me know how!

- return a %(klass)s having the same indexes as the original object
- filled with the transformed values
+ Apply function ``func`` column-by-column to the GroupBy object and return a %(klass)s
+ with the same number of indices as the group.
Member

This part of the phrase refers to the entire return value of the transform method. Thus, saying "as the group" here doesn't make sense: there can be multiple groups, so there is no "the group". Also, @jreback was commenting that the return must have the same index (not just the same number of elements!) as the input.
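
To illustrate the index point, here is a minimal sketch with made-up data (the column names `key` and `val` and the row labels are assumptions for illustration only). It contrasts `transform`, whose result is aligned to the input's index, with an aggregation, whose result is indexed by group key.

```python
import pandas as pd

# Deliberately non-default, unordered row labels to make the index visible.
df = pd.DataFrame(
    {"key": ["x", "y", "x", "y"], "val": [1, 2, 3, 4]},
    index=["r3", "r1", "r0", "r2"],
)

# transform: one row per input row, carrying the same labels as df
transformed = df.groupby("key").transform("sum")

# aggregation: one row per group, labelled by the group keys
aggregated = df.groupby("key").sum()

print(transformed)
print(aggregated)
```

Here `transformed` carries the labels `r3, r1, r0, r2` in the original order, while `aggregated` is indexed by `x` and `y`.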

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Sep 22, 2021
@mroeschke
Member

Thanks for the PR, but it appears to have gone stale. Closing for now, but if you are interested in continuing, let us know and we can reopen.

Successfully merging this pull request may close these issues.

DOC: DataFrameGroupBy.transform
6 participants