BUG: Get wrong result when groupby category column with dropna=False #36327
Labels: Bug; Categorical (Categorical Data Type); Groupby; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
Comments
Hey, have you checked #35646? This looks like a duplicate.
It looks different from #35646 to me: this one is specifically about categoricals, and happens with >1 groupby columns?
Your code here
Output
Expected output
Just a +1 report of hitting this issue, still present on
Thanks for the report! Further investigations and PRs to fix are welcome.
I have confirmed this bug exists on the master branch of pandas.
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2a7d332
python : 3.7.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.1.2
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.2.0.post20200210
Cython : 0.29.15
pytest : 5.3.5
hypothesis : 5.5.4
sphinx : 2.4.0
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.2
fsspec : 0.6.2
fastparquet : None
gcsfs : None
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.48.0
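The reporter's original snippet is not preserved in this thread, so the following is only a minimal sketch of the kind of check the title describes, not the code from the report. The column names `key` and `value` and the data are made up for illustration. It groups the same keys once as plain objects (where `dropna=False` is expected to keep a NaN group) and once as a categorical, so the two results can be compared on a given pandas version.

```python
import pandas as pd

# Hypothetical data: a categorical key with two missing entries.
df = pd.DataFrame(
    {
        "key": pd.Categorical(["a", None, "a", None], categories=["a", "b"]),
        "value": [1, 2, 3, 4],
    }
)

# Reference: the same keys as plain objects; dropna=False keeps a NaN group.
reference = (
    df.assign(key=df["key"].astype(object))
    .groupby("key", dropna=False)["value"]
    .sum()
)

# The same aggregation on the categorical key, which is the case under discussion.
categorical = df.groupby("key", dropna=False, observed=True)["value"].sum()

print(reference)
print(categorical)

# True if the categorical result also carries a NaN group; this may differ by version.
print("NaN group kept for categorical key:", bool(categorical.index.isna().any()))
```

If the last line prints `False` while the reference result contains a NaN group, the installation likely still shows the behaviour reported here.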