ENH: value_counts to produce both count and normalized #60385

Open
2 of 3 tasks
Keramatfar opened this issue Nov 21, 2024 · 1 comment
Labels
Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Closing Candidate (May be closeable, needs more eyeballs), Enhancement

Comments

@Keramatfar

Keramatfar commented Nov 21, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I would like pandas to offer a way to see both the counts and the relative frequencies of a Series's values at once.

Feature Description

Currently, I can get the counts with value_counts and the relative frequencies by passing normalize=True to the same function. Sometimes it is more useful to see both at once, perhaps through a new parameter on this function. Alternatively, the normalize parameter could be extended to accept three states: raw, relative, or both.

Alternative Solutions

Calling value_counts twice, once with normalize=False and once with normalize=True, provides the same information.
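For illustration, the two-call approach could be sketched as follows. The sample data and the column labels passed via `keys` are assumptions made for this example, not part of the request; the labels are set explicitly so the result does not depend on how a given pandas version names the returned Series.

```python
import pandas as pd

# Illustrative data, not taken from a real dataset.
ser = pd.Series([1, 1, 2, 1, 3, 3, 1])

# First call: raw counts. Second call: relative frequencies.
counts = ser.value_counts()
shares = ser.value_counts(normalize=True)

# Align the two results on their shared index and label the columns explicitly.
both = pd.concat([counts, shares], axis=1, keys=["count", "proportion"])
print(both)
```

This works with the existing API, but it counts the values twice, which is the cost the request hopes to avoid.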

Additional Context

No response

@Keramatfar Keramatfar added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 21, 2024
@rhshadrach
Member

Thanks for the request. I think two calls to value_counts would be unnecessary, and less performant than:

import pandas as pd

ser = pd.Series([1, 1, 2, 1, 3, 3, 1])

result = ser.value_counts().to_frame()
result["normalized"] = result["count"] / result["count"].sum()
print(result)
#    count  normalized
# 1      4    0.571429
# 3      2    0.285714
# 2      1    0.142857

This seems readily doable via the existing API, and therefore I am negative on adding it.
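If the pattern is needed often, it could be wrapped in a small user-side helper. The sketch below is only that: `value_counts_both` is a hypothetical name, not a pandas function, and the `dropna` pass-through is an assumption about what such a helper might expose.

```python
import pandas as pd


def value_counts_both(ser: pd.Series, dropna: bool = True) -> pd.DataFrame:
    """Hypothetical helper (not part of pandas): counts plus normalized counts."""
    # Count once, then derive the relative frequencies from those counts.
    out = ser.value_counts(dropna=dropna).rename("count").to_frame()
    out["normalized"] = out["count"] / out["count"].sum()
    return out


# Illustrative usage; the missing value is excluded because dropna defaults to True.
result = value_counts_both(pd.Series(["a", "b", "a", None]))
print(result)
```

Renaming the counted Series to "count" keeps the column label stable regardless of the name pandas assigns to the value_counts result.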

@rhshadrach rhshadrach added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Closing Candidate May be closeable, needs more eyeballs and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 21, 2024
@rhshadrach rhshadrach changed the title from "ENH:" to "ENH: value_counts to produce both count and normalized" Nov 21, 2024