DOC: DataFrame.agg and Series.agg documentation is unclear on which function we can use #49528

Open
1 task done
arnaudlegout opened this issue Nov 4, 2022 · 2 comments
Labels
Apply Apply, Aggregate, Transform, Map Docs

Comments

arnaudlegout commented Nov 4, 2022

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.agg.html
https://pandas.pydata.org/docs/reference/api/pandas.Series.agg.html

Documentation problem

During the discussion of issue #49352, we found that the documentation of the agg function for both DataFrame and Series is unclear on:

  • which functions will work
  • what will be passed to the function
  • how function names are looked up (and so which optional parameters can be used if we use function names)

Suggested fix for documentation

For DataFrame.agg, it must be explained that:

The passed function must either accept a Series and return a scalar, or accept a DataFrame and return a Series. If the axis is 0 and the function accepts a Series, the passed Series will be a column; otherwise, it will be a row. If function names are passed, the called function will be the first function found according to the MRO from a DataFrame object (that is, the result of getattr(df, 'func_name')). For instance, passing mean will match DataFrame.mean.
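To make the proposed wording concrete, here is a minimal sketch of what it describes. The frame `df` and the helper `span` are made up for illustration, and the exact calling convention may differ between pandas versions, so treat this as an example of the intended behavior rather than a statement about the implementation.

```python
import pandas as pd

# Illustrative data; the names df and span are not from the pandas docs.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

def span(s):
    # s is expected to be a Series: one column for axis=0, one row for axis=1.
    return s.max() - s.min()

by_column = df.agg(span, axis=0)  # one value per column: a -> 2, b -> 20
by_row = df.agg(span, axis=1)     # one value per row: 9, 18, 27

# A function name is resolved on the DataFrame itself.
by_name = df.agg("mean")
direct = getattr(df, "mean")()
print(by_name.equals(direct))
```

With a string, any extra keyword arguments would be those accepted by the resolved method (here DataFrame.mean), which is why the lookup rule matters for the documentation.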

For Series.agg, it must be explained that:

The passed function must accept a Series and return a scalar. If function names are passed, the called function will be the first function found according to the MRO from a Series object (that is, the result of getattr(s, 'func_name')).
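A short sketch of the Series case, under the same caveats. The Series `s` and the helper `first_to_last` are hypothetical names chosen for illustration; the helper only makes sense when it receives the whole Series.

```python
import pandas as pd

s = pd.Series([3, 1, 4, 1, 5])

def first_to_last(x):
    # Needs the whole Series: difference between last and first element.
    return x.iloc[-1] - x.iloc[0]

change = s.agg(first_to_last)  # 5 - 3 = 2

# A function name is resolved on the Series itself.
by_name = s.agg("max")
direct = getattr(s, "max")()
print(change, by_name, direct)
```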

@arnaudlegout arnaudlegout added the Docs and Needs Triage labels Nov 4, 2022
@arnaudlegout

I can propose a PR for this issue, but I need someone to check that the behavior I described is correct, because I am not sure. What I described corresponds to my understanding of core.apply._try_aggregate_string_function, which I believe is called to convert a function name to a function object.

arnaudlegout commented Nov 4, 2022

I missed that, after searching the DataFrame MRO, the lookup falls back to numpy. So here is the updated suggestion.

For DataFrame.agg, it must be explained that:

The passed function must either accept a Series and return a scalar, or accept a DataFrame and return a Series. If the axis is 0 and the function accepts a Series, the passed Series will be a column; otherwise, it will be a row. If function names are passed, the called function will be the first function found according to the MRO from a DataFrame object (that is, the result of getattr(df, 'func_name')). For instance, passing mean will match DataFrame.mean. If no function is found with this name, the name is then looked up in numpy.

For Series.agg, it must be explained that:

The passed function must accept a Series and return a scalar. If function names are passed, the called function will be the first function found according to the MRO from a Series object (that is, the result of getattr(s, 'func_name')). If no function is found with this name, the name is then looked up in numpy.
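The lookup order described above can be sketched as a small standalone function. `resolve_by_name` is a hypothetical helper written for this comment, not a pandas API, and it only mirrors my understanding of the order (object first, then numpy); it does not show how pandas calls the function once it is found.

```python
import numpy as np
import pandas as pd

def resolve_by_name(obj, name):
    """Hypothetical helper: look up `name` on the object, then on numpy."""
    found = getattr(obj, name, None)
    if found is not None:
        return found  # e.g. the bound method Series.mean
    found = getattr(np, name, None)
    if found is not None:
        return found  # e.g. the module-level function numpy.log
    raise AttributeError(f"{name!r} was found neither on the object nor in numpy")

s = pd.Series([1.0, 2.0, 4.0])

mean_func = resolve_by_name(s, "mean")  # found on the Series
mean_value = mean_func()

log_func = resolve_by_name(s, "log")    # Series has no `log`, so numpy is used
print(mean_value, log_func is np.log)
```

Note that the two results are not called the same way: a method found on the object is already bound to it, while a function found in numpy still needs the object as an argument.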

@mroeschke mroeschke added the Apply label and removed the Needs Triage label Jul 16, 2024