DOC/BUG: pivot_table returns Series in specific circumstance #4386

Closed
davidshinn opened this issue Jul 28, 2013 · 6 comments
Labels
Docs Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Milestone

Comments

@davidshinn
Contributor

The docstrings and other documentation say that the pivot_table function returns a DataFrame. However, this likely leads to confusion like #4371, because under narrow circumstances, passing a certain set of argument dtypes results in the function returning a Series (see ipython examples at end):

  1. values is a single string (not a list, not even a single-element list)
  2. cols=None
  3. aggfunc is a single string/function (not a list, not even a single-element list)

Unfortunately, this is not clear from the docs or from normal use (except for condition 1).
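
The three conditions can be written down as a small predicate. This is only a sketch of the rule as described in this report, not pandas code: `predicts_series` is a hypothetical helper, its default arguments are assumptions, and it ignores any other situation that may also produce a Series.

```python
def predicts_series(values, cols=None, aggfunc="mean"):
    """Return True when all three conditions from the report hold.

    Hypothetical helper for illustration only; it mirrors the description
    in this issue and does not call pandas.
    """
    single_values = isinstance(values, str)         # condition 1: bare label, not a list
    no_cols = cols is None                          # condition 2: cols left unspecified
    single_aggfunc = not isinstance(aggfunc, list)  # condition 3: bare string/function
    return single_values and no_cols and single_aggfunc


# The three cases from the IPython session at the end of this report.
case_1 = predicts_series("col1", cols=None, aggfunc=sum)    # Series expected
case_2 = predicts_series("col1", cols="col2", aggfunc=sum)  # DataFrame expected
case_3 = predicts_series("col1", cols=None, aggfunc=[sum])  # DataFrame expected
```

Changing any one of the three arguments (wrapping `values` or `aggfunc` in a list, or passing `cols`) is enough to make the predicate false.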

Should this:

  1. eventually be fixed to only return a DataFrame no matter the circumstances to be less confusing
  2. be documented correctly (seems a little difficult to convey in the docstring and other docs without a lot of bulk).

I think changing the function to return only a DataFrame in future versions (> 0.13), with a deprecation warning in the meantime, is better than trying to explain this in the docs.

I would be happy to provide the deprecation warning and document notes as a pull request.

Thanks.

Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) 
Type "copyright", "credits" or "license" for more information.

IPython 1.0.dev -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
%guiref   -> A brief reference about the graphical user interface.

In [1]: import pandas as pd
   ...: import numpy as np
   ...: pd.__version__
   ...: 
Out[1]: '0.12.0-57-g7bf2a7d'

In [2]: df = pd.DataFrame({'col1': [3, 4, 5], 'col2': ['C', 'D', 'E'], 'col3': [1, 3, 9]})

In [3]: df
Out[3]: 
   col1 col2  col3
0     3    C     1
1     4    D     3
2     5    E     9

In [4]: # Case 1: (a) values is single string label (b) cols is unspecified
   ...: #         (c) aggfunc is single label/function (not a list)
   ...: # Expect: Series type
   ...: pivoted_1 = df.pivot_table('col1', rows=['col3', 'col2'], aggfunc=np.sum)
   ...: print pivoted_1
   ...: print type(pivoted_1)
col3  col2
1     C       3
3     D       4
9     E       5
Name: col1, dtype: int64
<class 'pandas.core.series.Series'>

In [5]: # Case 2: (a) values is single string label (b) cols is single string label
   ...: # Expected: DataFrame
   ...: pivoted_2 = df.pivot_table('col1', rows='col3', cols='col2', aggfunc=np.sum)
   ...: print pivoted_2
   ...: print type(pivoted_2)
col2   C   D   E
col3            
1      3 NaN NaN
3    NaN   4 NaN
9    NaN NaN   5
<class 'pandas.core.frame.DataFrame'>

In [6]: # Case 3: (a) values is single string label (b) cols is unspecified
   ...: #         (c) aggfunc is a list
   ...: # Expect: DataFrame
   ...: pivoted_3 = df.pivot_table('col1', rows='col3', aggfunc=[np.sum])
   ...: print pivoted_3
   ...: print type(pivoted_3)
      sum
col3     
1       3
3       4
9       5
<class 'pandas.core.frame.DataFrame'>
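
For code that needs a DataFrame regardless of which type comes back, one possible workaround is to convert explicitly. The sketch below is not part of the original report. It assumes Python 3 and a pandas release where `pivot_table` takes `index`/`columns` in place of the `rows`/`cols` keywords used in the session above. The `isinstance` guard is meant to give the same result whether the call returns a Series or a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'col1': [3, 4, 5],
                   'col2': ['C', 'D', 'E'],
                   'col3': [1, 3, 9]})

# Same arguments as Case 1: single values label, no columns, single aggfunc.
pivoted = df.pivot_table(values='col1', index=['col3', 'col2'], aggfunc='sum')

# If a Series came back, promote it to a one-column DataFrame.
# Series.to_frame() uses the Series name ('col1') as the column label.
if isinstance(pivoted, pd.Series):
    pivoted = pivoted.to_frame()

print(type(pivoted))
```

Either way, downstream code can then rely on `pivoted['col1']` being available.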
@jreback
Contributor

jreback commented Sep 28, 2013

@davidshinn want to do a PR for this?

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Apr 4, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@youlyst

youlyst commented Jan 8, 2016

Hi,
I want to ask about this part of the comment above:
"passing a certain set of argument dtypes results in the function returning a Series"

Calling df.pivot_table with an empty index also returns a Series object instead of a DataFrame.
Is there a reason for this behaviour?
Thanks.

@jreback
Contributor

jreback commented Jan 8, 2016

If you would like to implement this as indicated above, that would be great.

@davidshinn
Contributor Author

@youlyst, feel free to do a PR to fix this. Sorry I've been out of touch since I originally posted. If no one gets to this by February, I'll submit a PR so that pivot_table always returns a DataFrame for consistency.

@youlyst

youlyst commented Jan 9, 2016

I don't feel up to it yet; I have been working with pandas for only a short time, but that could change in a month or so.
My question was mainly to make sure that I am not doing something wrong.

@jreback
Contributor

jreback commented Jan 9, 2016

yui-knk added a commit to yui-knk/pandas that referenced this issue Jul 4, 2016
Before this commit, if

* `values` is not list-like
* `columns` is `None`
* `aggfunc` is not an instance of `list`

`pivot_table` returns a `Series`.

This commit adds a check that `columns.nlevels` is
greater than 1, which prevents `table` from being
cast to a `Series`.

This will fix pandas-dev#4386.
mroeschke added a commit to mroeschke/pandas that referenced this issue Oct 29, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 8, 2016
BUG: pivot_table sometimes returns Series (pandas-dev#4386)
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 9, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 9, 2016
BUG: pivot_table sometimes returns Series (pandas-dev#4386)
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 17, 2016
BUG: pivot_table sometimes returns Series (pandas-dev#4386)

pep 8 fixes

Restructure conditional and update whatsnew
yui-knk added a commit to yui-knk/pandas that referenced this issue Nov 18, 2016
yui-knk added a commit to yui-knk/pandas that referenced this issue Dec 27, 2016
yui-knk added a commit to yui-knk/pandas that referenced this issue Mar 6, 2017
@jreback jreback modified the milestones: 0.20.0, Next Major Release Apr 18, 2017
analyticalmonk pushed a commit to analyticalmonk/pandas that referenced this issue Apr 20, 2017

DOC: add docs for pandas-dev#13554
3 participants