Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
 
Problem
I would like to create a dictionary from a pandas DataFrame using groupby, where each key is a group key used in the groupby and each value is a list of that group's values.
# create a dataframe 
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4], 'C': [3, 4, 5]})

When subsetting a single column after groupby, it works well:
# get a single column after groupby
dict(df.groupby('A')['B'].apply(list))
# output
{1: [2], 2: [3], 3: [4]}

However, when subsetting multiple columns, it returns the list of the DataFrame's column labels instead of the values:
# get multiple columns after groupby
dict(df.groupby('A')[['B', 'C']].apply(list))
# output
{1: ['B', 'C'], 2: ['B', 'C'], 3: ['B', 'C']}

I also checked the raw output of iterating over the groupby after subsetting columns, and found that each group still contains all columns rather than only the selected ones:
# get raw output when subsetting columns after groupby
list(df.groupby('A')[['B', 'C']])
# output
[(1,
     A  B  C
  0  1  2  3),
 (2,
     A  B  C
  1  2  3  4),
 (3,
     A  B  C
  2  3  4  5)]

Expected Output
For the first example:
# get multiple columns after groupby
dict(df.groupby('A')[['B', 'C']].apply(list))
# output
{1: [2, 3], 2: [3, 4], 3: [4, 5]}

And for the second example:
# get raw output when subsetting columns after groupby
list(df.groupby('A')[['B', 'C']])
# output
[(1,
     B  C
  0  2  3),
 (2,
     B  C
  1  3  4),
 (3,
     B  C
  2  4  5)]

Output of pd.show_versions()
INSTALLED VERSIONS
commit           : 2a7d332
python           : 3.8.5.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.15.0-29-generic
Version          : #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : en_US.UTF-8
pandas           : 1.1.2
numpy            : 1.19.1
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1.1
setuptools       : 47.1.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.18.1
pandas_datareader: None
bs4              : 4.9.3
bottleneck       : None
fsspec           : 0.8.4
fastparquet      : 0.4.1
gcsfs            : None
matplotlib       : 3.3.1
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 0.16.0
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : 1.5.3
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : 0.51.2
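
Possible workaround
The first result above seems to come from the fact that iterating over a DataFrame yields its column labels, so `list` applied to a multi-column group returns the labels rather than the values. Below is a minimal sketch of one way to get the values instead, by subsetting the columns inside the loop over the groups. The names `rows_by_key`, `flat_by_key` and `frames_by_key` are only illustrative, and this is not necessarily the intended pandas idiom.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4], 'C': [3, 4, 5]})

# Iterating a DataFrame yields column labels, which is why list() on a
# multi-column group gives ['B', 'C'] instead of the values.
assert list(df[['B', 'C']]) == ['B', 'C']

# One nested list of row values per group key.
rows_by_key = {
    key: group[['B', 'C']].to_numpy().tolist()
    for key, group in df.groupby('A')
}

# Flattened (row-major) values per group key; this matches the expected
# output in this report because each group here has exactly one row.
flat_by_key = {
    key: group[['B', 'C']].to_numpy().ravel().tolist()
    for key, group in df.groupby('A')
}

# Per-group frames restricted to the selected columns, subsetting
# inside the loop rather than on the groupby object.
frames_by_key = [(key, group[['B', 'C']]) for key, group in df.groupby('A')]

print(rows_by_key)
print(flat_by_key)
```

For groups with more than one row, `rows_by_key` keeps the rows separate, while `flat_by_key` concatenates them row by row, so which shape is appropriate depends on the use case.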