QST: Related to working of as_index in pandas groupby  #40591

Closed
@ninjakx

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.


Question about pandas

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
df.A = df.agg('City_{0[A]}'.format, axis=1)
df.index = (df.index)*int(np.random.randint(6,23)**0.5)+int(np.random.randint(2,23))

This is my df:

 A B C D
City_2 3 7 3
City_0 4 8 9
City_1 1 2 1
City_5 5 0 9
City_5 1 6 0
City_0 3 8 6
City_7 2 6 6
City_1 6 0 2
City_8 6 2 4
City_2 2 5 6

If I do

df.groupby([df.index//5,df.A], as_index=True).mean()
or
df.groupby([df.index//5,df.A]).mean()

Result:

           B  C  D
  A
1 City_2   3  7  3
2 City_0   4  8  9
3 City_1   1  2  1
  City_5   5  0  9
4 City_0   3  8  6
  City_5   1  6  0
5 City_7   2  6  6
6 City_1   6  0  2
  City_8   6  2  4
7 City_2   2  5  6

So if I then call .reset_index()

I get this (the index level that has no name becomes a column named level_0):

  level_0 A B C D
1 City_2 3 7 3
2 City_0 4 8 9
3 City_1 1 2 1
3 City_5 5 0 9
4 City_0 3 8 6
4 City_5 1 6 0
5 City_7 2 6 6
6 City_1 6 0 2
6 City_8 6 2 4
7 City_2 2 5 6
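
A minimal sketch of one way to avoid the level_0 name: give the unnamed level a name with rename_axis before calling reset_index. The name "bucket" is hypothetical, and the index values below are an assumption (the real index is random and not shown above); they are chosen so that index // 5 matches the first four groups in the result.

```python
import pandas as pd

# First four rows of the example frame. The index values are assumed,
# picked so that index // 5 yields the groups 1, 2, 3, 3.
df = pd.DataFrame(
    {"A": ["City_2", "City_0", "City_1", "City_5"], "B": [3, 4, 1, 5]},
    index=[9, 12, 15, 18],
)

# Group by the derived key and by column "A"; the first index level has no name.
grouped = df.groupby([df.index // 5, "A"]).mean()

# Name both levels ("bucket" is a hypothetical name), then turn them into columns.
out = grouped.rename_axis(index=["bucket", "A"]).reset_index()
print(out)
```

With the levels named first, reset_index produces a "bucket" column instead of level_0.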

This is not the case with as_index=False.
df.groupby([df.index//5,df.A], as_index=False).mean() gives me

 A B C D
City_2 3 7 3
City_0 4 8 9
City_1 1 2 1
City_5 5 0 9
City_0 3 8 6
City_5 1 6 0
City_7 2 6 6
City_1 6 0 2
City_8 6 2 4
City_2 2 5 6

I lose the df.index//5 group key.

My question:
Shouldn't as_index=False also give the df.index//5 column? Or is as_index designed to work this way?
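
A possible workaround, sketched under assumptions rather than as a statement of how as_index is meant to behave: materialise the derived key as a real column first, then group by column labels, so that as_index=False returns both keys as columns. The column name "bucket" is hypothetical. The data matches the table above, but the index values are an assumption (the original index is random and not shown); they are chosen so that index // 5 reproduces the groups 1 to 7 in the result.

```python
import pandas as pd

# Fixed copy of the example frame. The index (9, 12, ..., 36) is assumed,
# chosen so that index // 5 gives the groups shown in the grouped result.
df = pd.DataFrame(
    {
        "A": ["City_2", "City_0", "City_1", "City_5", "City_5",
              "City_0", "City_7", "City_1", "City_8", "City_2"],
        "B": [3, 4, 1, 5, 1, 3, 2, 6, 6, 2],
        "C": [7, 8, 2, 0, 6, 8, 6, 0, 2, 5],
        "D": [3, 9, 1, 9, 0, 6, 6, 2, 4, 6],
    },
    index=range(9, 37, 3),
)

# Store the derived key as a column ("bucket" is a hypothetical name),
# then group by labels so both keys are kept as regular columns.
result = (
    df.assign(bucket=df.index // 5)
      .groupby(["bucket", "A"], as_index=False)
      .mean()
)
print(result)
```

Because both keys are now columns of the frame, the result keeps "bucket" and "A" alongside the aggregated B, C and D.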
