BUG: pivot_table downcasting dtypes even if not necessary #47971

Closed
3 tasks done
phofl opened this issue Aug 4, 2022 · 5 comments · Fixed by #60374
Labels
Bug good first issue Needs Tests Unit test(s) needed to prevent regressions Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@phofl
Member

phofl commented Aug 4, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"x": "a", "y": "b", "age": [20, 40]})

# dropna=True: the downcast path is hit
result = df.pivot_table(
    index='x', columns='y', values='age', aggfunc='mean', dropna=True
)

# dropna=False: no downcast is attempted
result = df.pivot_table(
    index='x', columns='y', values='age', aggfunc='mean', dropna=False
)

Issue Description

With dropna=True this returns int64 dtype, while dropna=False returns float64. This happens because we try to downcast when dropna is set, since the all-NaN rows that we drop would have cast our dtypes to float.

But the downcast path is also hit when there are no all-NaN rows, in which case the aggregation function returned the correct dtype all along.
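To illustrate why a float mean such as 30.0 can end up as int64, here is a minimal sketch of a value-preserving downcast check. It assumes the downcast is only applied when the round trip through the integer dtype loses nothing. The helper name `can_downcast_losslessly` is hypothetical and this is not the pandas internal code.

```python
import numpy as np


def can_downcast_losslessly(values: np.ndarray, target: str = "int64") -> bool:
    """Hypothetical helper: True if casting to `target` and back changes no value."""
    casted = values.astype(target)
    return bool(np.array_equal(casted.astype(values.dtype), values))


# The mean of [20, 40] is exactly 30.0, so a cast to int64 loses nothing.
whole_mean = np.array([30.0])
# A mean with a fractional part cannot be represented as an integer.
fractional_mean = np.array([30.5])

print(can_downcast_losslessly(whole_mean))
print(can_downcast_losslessly(fractional_mean))
```

Under this assumption, the dtype of the result would depend on whether the aggregated values happen to be whole numbers, which is the inconsistency described above.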

Expected Behavior

I think both cases should be consistent if no NaNs are dropped, i.e. we should not try to downcast.

If we want to do this, we should probably deprecate it or change it in 2.0, but not in a minor release.

Installed Versions

main
@phofl phofl added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode Needs Discussion Requires discussion from core team before further action Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 4, 2022
@phofl phofl changed the title BUG: pivot downcasting dtypes even if not necessary BUG: pivot_table downcasting dtypes even if not necessary Aug 4, 2022
@rhshadrach
Member

Related: #53521

@rhshadrach rhshadrach removed the Needs Triage Issue that has not been reviewed by a pandas team member label Jan 13, 2024
@jorisvandenbossche
Member

Both cases seem to result now in a float result:

In [14]: df = pd.DataFrame({"x": "a", "y": "b", "age": [20, 40]})
    ...: result = df.pivot_table(
    ...:     index='x', columns='y', values='age', aggfunc='mean', dropna=True
    ...: )
    ...: 

In [15]: result
Out[15]: 
y     b
x      
a  30.0

In [16]: result = df.pivot_table(
    ...:     index='x', columns='y', values='age', aggfunc='mean', dropna=False
    ...: )

In [17]: result
Out[17]: 
y     b
x      
a  30.0

So we could close this issue after adding an explicit test for this.

@jorisvandenbossche jorisvandenbossche added Needs Tests Unit test(s) needed to prevent regressions good first issue and removed Needs Discussion Requires discussion from core team before further action labels Nov 5, 2024
@Swati-Sneha
Contributor

take

@AbhishekChaudharii
Contributor

Hi @jorisvandenbossche, it seems that this issue is the same as #47477. A test case was added along with that bug fix, but the test case and the whatsnew entry written by @phofl mention #47477 instead of this one (#47971). I think we can close this issue.

@rhshadrach
Member

rhshadrach commented Nov 19, 2024

@AbhishekChaudharii - the test added in #47477 is with Int64 and not int64. I think we should also have a test with int64.
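The distinction matters because `Int64` is the nullable extension dtype while `int64` is the plain NumPy dtype, and the two can take different code paths. As a rough sketch, the helper below (the name `pivot_mean_dtypes` is hypothetical) builds the same frame with either dtype and reports the resulting column dtype for both `dropna` settings. The exact dtypes printed depend on the installed pandas version.

```python
import pandas as pd


def pivot_mean_dtypes(values_dtype: str) -> dict:
    """Hypothetical helper: map each dropna setting to the result column dtype."""
    df = pd.DataFrame(
        {
            "x": ["a", "a"],
            "y": ["b", "b"],
            "age": pd.Series([20, 40], dtype=values_dtype),
        }
    )
    dtypes = {}
    for dropna in (True, False):
        result = df.pivot_table(
            index="x", columns="y", values="age", aggfunc="mean", dropna=dropna
        )
        # The pivoted frame has a single column, "b".
        dtypes[dropna] = str(result["b"].dtype)
    return dtypes


# Compare the NumPy dtype with the nullable extension dtype.
for values_dtype in ("int64", "Int64"):
    print(values_dtype, pivot_mean_dtypes(values_dtype))
```

A test covering both dtypes would then assert that the two entries of each returned mapping are equal.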

5 participants