Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: pivot_table sometimes returns Series (#4386) #14534

Closed
wants to merge 1 commit into from

Conversation

mroeschke
Copy link
Member

The pivot_table docs mentions that a DataFrame should be returned. In some cases, a Series is returned. I had to also alter some existing tests that expected pivot_table to return a Series

Any feedback welcomed, thanks!

@codecov-io
Copy link

codecov-io commented Oct 29, 2016

Current coverage is 85.28% (diff: 100%)

Merging #14534 into master will decrease coverage by <.01%

@@             master     #14534   diff @@
==========================================
  Files           140        140          
  Lines         50693      50695     +2   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43235      43236     +1   
- Misses         7458       7459     +1   
  Partials          0          0          

Powered by Codecov. Last update 726efc7...02ef0a8

@@ -170,6 +170,9 @@ def pivot_table(data, values=None, index=None, columns=None, aggfunc='mean',
if len(index) == 0 and len(columns) > 0:
table = table.T

if isinstance(table, Series):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you see if you can find the actual root cause here?

e.g. may be as simple as making sure that the column/index seletors are list-like and not scalars

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jreback. I found the root cause (here). I'll work on this and the tests tonight.

Once fixed, do you still want to keep the isinstance(table, Series) check or should I drop it?

Copy link
Contributor

@jreback jreback Nov 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I think a you can simply always make aggfunc a list (if its not None nor a list-like) and this will just work. use is_list_like to test

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your solution works for the specific scenario in #4386, but a Seriesis still returned regardless when pivoting with only the columns argument or pivoting with no values or cols passed due to the groupby and unstack operations respectively. In these cases, I think to_frame() might be necessary if pivot_table should always return a DataFrame.

Or should I just fix the specific scenario in #4386 and change the docs to mention that pivot_table can return a Series?

@@ -350,7 +350,7 @@ def _check_output(result, values_col, index=['A', 'B'],
# no rows
rtable = self.data.pivot_table(columns=['AA', 'BB'], margins=True,
aggfunc=np.mean)
tm.assertIsInstance(rtable, Series)
tm.assertIsInstance(rtable, DataFrame)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you compare this with an expected result.

@@ -485,7 +485,7 @@ def test_margins_no_values_no_cols(self):
# Regression test on pivot table: no values or cols passed.
result = self.data[['A', 'B']].pivot_table(
index=['A', 'B'], aggfunc=len, margins=True)
result_list = result.tolist()
result_list = result.values.flatten().tolist()
self.assertEqual(sum(result_list[:-1]), result_list[-1])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here (I know this is old / not yours) .....

@jreback jreback added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 2, 2016
@mroeschke
Copy link
Member Author

mroeschke commented Nov 9, 2016

Hi @jreback:

I added a fix to address the original bug. However, I believe to_frame() will be necessary because two scenarios (pivoting with only the columns argument and pivoting with no values or cols passed) will always return Series from the groupby and unstack operations respectively.

BUG: pivot_table someitmes returns Series (pandas-dev#4386)

BUG: pivot_table sometimes returns Series (pandas-dev#4386)

BUG: pivot_table sometimes returns Series (pandas-dev#4386)

pep 8 fixes

Restructure condional and update whatsnew
@jreback
Copy link
Contributor

jreback commented Nov 18, 2016

this looks like its duplicating #13554. Can you see if you tests are any different (maybe there is another case).

@mroeschke
Copy link
Member Author

Ah yes, I wanted to move #13554 along since it looked inactive for a while (but it looks active now). I can look into adding more tests and building upon #13554.

@jreback
Copy link
Contributor

jreback commented Dec 26, 2016

can you rebase

@mroeschke mroeschke closed this Dec 30, 2016
@mroeschke
Copy link
Member Author

Closing since this PR duplicated #13554 when it was inactive. Will reopen if #13554 stalls.

@mroeschke mroeschke deleted the fix_4386 branch December 20, 2017 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC/BUG: pivot_table returns Series in specific circumstance
3 participants