BUG: Add squeeze keyword to groupby to allow reduction in returned type #3599

Merged
merged 1 commit into from
May 15, 2013

@jreback jreback commented May 14, 2013

This allows the returned type to be reduced from DataFrame to Series
if the groups are unique.

This is a fix for a regression from 0.10.1.

Allows the functionality in #2893 by specifying squeeze=True
in the groupby call. The #3596 functionality is back as the default.

This returns a Series because we are passing squeeze=True

In [9]: df2 = DataFrame([{"val1": 1, "val2": 20}, {"val1": 1, "val2": 19},
   ...:                  {"val1": 1, "val2": 27}, {"val1": 1, "val2": 12}])

In [10]: df2
Out[10]: 
   val1  val2
0     1    20
1     1    19
2     1    27
3     1    12

In [11]: def func(dataf):
   ....:     return dataf["val2"]  - dataf["val2"].mean()
   ....: 

In [12]: df2.groupby("val1", squeeze=True).apply(func)
Out[12]: 
0    0.5
1   -0.5
2    7.5
3   -7.5
Name: 1, dtype: float64
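
For readers on newer pandas releases, where the `squeeze` keyword was later deprecated and then removed from `groupby`, a similar reduction can be sketched by calling `.squeeze()` on the result of `apply`. This is only a minimal sketch under that assumption, not the code path this PR adds; the `warnings` filter and the names `result` and `reduced` are illustrative choices.

```python
import warnings

import pandas as pd

# Same data as the example above: every row falls in one group (val1 == 1).
df2 = pd.DataFrame(
    [
        {"val1": 1, "val2": 20},
        {"val1": 1, "val2": 19},
        {"val1": 1, "val2": 27},
        {"val1": 1, "val2": 12},
    ]
)


def func(dataf):
    # Deviation of each val2 from the mean of its group.
    return dataf["val2"] - dataf["val2"].mean()


with warnings.catch_warnings():
    # Some pandas versions warn when apply() sees the grouping column.
    warnings.simplefilter("ignore")
    result = df2.groupby("val1").apply(func)

# Assumption: squeeze() drops any length-1 axis, standing in for the
# reduction that squeeze=True requested in this PR.
reduced = result.squeeze()

print(type(result).__name__, "->", type(reduced).__name__)
print(reduced.tolist())
```

The exact type of `result` may differ between pandas versions, which is why the sketch prints it instead of assuming it.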

By default (implicitly squeeze=False), a DataFrame is returned, even though
the groups are unique

In [13]: df = DataFrame([[1,1],[1,1]],columns=['X','Y'])

In [14]: df
Out[14]: 
   X  Y
0  1  1
1  1  1

In [15]: df.groupby('X').count()
Out[15]: 
   X  Y
X      
1  2  2

jreback commented May 14, 2013

@wesm pls take a look. I think the prior behavior is correct; most often you would not want the reduction (which is data dependent). The new keyword allows it.

hayd commented May 15, 2013

This is a good feature/fix.
Note: squeeze is what's used in read_csv to do this... perhaps we should use the same name here?

jreback commented May 15, 2013

that's a good idea! will update. thanks

jreback commented May 15, 2013

@hayd better? (and is the example in v0.11.1 adequate?)

… from

    DataFrame -> Series if groups are unique. Regression from 0.10.1,
    partial revert on (GH2893_) with (GH3596_)

CLN: renamed reduce_if_possible -> squeeze

DOC: added v0.11.1 example
jreback added a commit that referenced this pull request May 15, 2013
BUG: Add squeeze keyword to groupby to allow reduction in returned type
@jreback jreback merged commit e82003f into pandas-dev:master May 15, 2013
@jreback jreback mentioned this pull request Sep 22, 2013