It's difficult to predict what DataFrame.groupby().apply() will return:

I found it difficult to predict what `DataFrame.groupby().apply()` will return. The result depends on both the type of the returned object and its index. For example:

```
import pandas as pd
df = pd.DataFrame({"a":[1, 2, 1, 2], "b":[1, 2, 3, 4], "c":[5, 6, 7, 8]})
```

When the function returns a `DataFrame` that has the same index object as its argument, there are no group keys in the result:

```
print(df.groupby("a").apply(lambda x: x))
```

The output is:

```
   a  b  c
0  1  1  5
1  2  2  6
2  1  3  7
3  2  4  8
```

If the index is not the same object, the result contains the group keys, even when the index values are the same:

```
print(df.groupby("a").apply(lambda x: x[:]))
```

The output is:

```
     a  b  c
a           
1 0  1  1  5
  2  1  3  7
2 1  2  2  6
  3  2  4  8
```
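The distinction between "same index object" and "same index values" can be checked directly. This is a minimal sketch, not part of the original report: `compare_index` is a hypothetical helper, and whether a full slice shares its index object with the original may depend on the pandas version, so no output is shown.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 1, 2], "b": [1, 2, 3, 4], "c": [5, 6, 7, 8]})


def compare_index(original, returned):
    """Report whether two objects share one index object and whether the labels match."""
    return {
        # Identity: both names refer to the very same Index instance
        "same_object": returned.index is original.index,
        # Equality: the labels are the same, regardless of identity
        "equal_values": bool(returned.index.equals(original.index)),
    }


print(compare_index(df, df))     # the object itself
print(compare_index(df, df[:]))  # a full slice of it
```

Calling the helper inside the function passed to `apply` (on the group `x` and on the value about to be returned) shows which of the two cases a given lambda falls into.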

If the function returns `Series` objects whose indexes do not all have the same values, the index of the result is a `MultiIndex`:

```
print(df.groupby("a").apply(lambda x: x.b + x.c))
```

The output is:

```
a   
1  0     6
   2    10
2  1     8
   3    12
dtype: int64
```

If all the `Series` objects have the same index values, each `Series` becomes a row of the result:

```
print(df.groupby("a").apply(lambda x: (x.b + x.c).reset_index(drop=True)))
```

The output is:

```
   0   1
a       
1  6  10
2  8  12
```

Here are more examples.

Because the index is the same object:

```
print(df.groupby("a").apply(lambda x: (x.b + x.c).to_frame()))
```

there are no group keys in the output:

```
    0
0   6
1   8
2  10
3  12
```

If we slice the return value, the index is no longer the same object:

```
print(df.groupby("a").apply(lambda x: (x.b + x.c).to_frame()[:]))
```

and the output contains the group keys:

```
      0
a      
1 0   6
  2  10
2 1   8
  3  12
```

```
print(df.groupby("a").apply(lambda x: x[["b", "c"]]))
```

There are no group keys here because the index object is the same (but using `x[:]` would produce the group keys):

```
   b  c
0  1  5
1  2  6
2  3  7
3  4  8
```

This behavior does not seem to be documented anywhere.
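One way to make the presence of group keys explicit, rather than dependent on the returned index, may be the `group_keys` parameter of `groupby`. The sketch below assumes a recent pandas (2.x); `apply_slice` is a hypothetical helper, and selecting the value columns before `apply` is my assumption to keep the grouping column out of the function.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 1, 2], "b": [1, 2, 3, 4], "c": [5, 6, 7, 8]})


def apply_slice(group_keys):
    """Apply the same slicing lambda with group keys switched on or off."""
    # Assumption: select "b" and "c" so the lambda never sees column "a"
    grouped = df.groupby("a", group_keys=group_keys)[["b", "c"]]
    return grouped.apply(lambda x: x[:])


without_keys = apply_slice(False)
with_keys = apply_slice(True)

# Number of index levels: a flat index versus group key plus original label
print(without_keys.index.nlevels)
print(with_keys.index.nlevels)
```

This does not explain the behavior reported above; it only pins the choice down at the call site.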

