Inconsistent behavior when DataFrame with strings and None is created from lists or dictionary. #32218
Comments
None is converted to 'None' when the NumPy array is made from the passed list (pandas/pandas/core/internals/construction.py, line 173 in 54b4001).
Should NumPy handle this, or can we do a workaround? I can work on a PR.
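To illustrate the NumPy behavior this comment refers to, here is a minimal sketch. It does not reproduce the pandas line referenced above; the sample values and variable names are assumptions chosen for illustration.

```python
import numpy as np

values = ["a", None]  # hypothetical sample data

# Asking NumPy for a str dtype stringifies every element, including None.
as_str = np.array(values, dtype=str)

# An object array keeps the original Python objects untouched.
as_obj = np.array(values, dtype=object)

print(repr(as_str[1]))  # 'None' as a string
print(repr(as_obj[1]))  # None
```

Under this reading, a workaround on the pandas side would need to build an object array first and convert only the non-missing entries, but that is a guess at the approach, not what the eventual fix necessarily did.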
But isn't the "bigger" issue here that the behavior is different for two seemingly equivalent ways of instantiating a DataFrame?
I am assuming one is wrong: the case where None becomes "None".
I tested against master (895f0b4) and the issue is now gone. I think this can be closed now.
We would take a PR with a validation test.
take
Hey, I'm looking for first issues and I'm wondering what needs to be done?
Hey, can I do this? Can you guide me on how to do this, please?
Code Sample
Problem description
None is not cast consistently for DataFrames with None values and dtype set to str.
If the DataFrame is created from a list, None is cast to str: None -> "None".
If the DataFrame is created from a dict, None stays NoneType: None -> None.
IMO, the latter is the preferred behavior. I hope you consider this when continuing your work on string types and na types in future versions.
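The original code sample did not survive on this page, so the following is only a sketch of the comparison described above, not the reporter's code. The column names, the data, and the helper `describe_cell` are assumptions. The result depends on the pandas version: the report is against 1.0.1, and a later comment says the difference is gone on master.

```python
import pandas as pd

# Hypothetical data: one string and one None, built two ways with dtype=str.
from_list = pd.DataFrame([["a", None]], columns=["x", "y"], dtype=str)
from_dict = pd.DataFrame({"x": ["a"], "y": [None]}, dtype=str)


def describe_cell(df):
    """Report whether the None cell was stringified or kept as a missing value."""
    value = df.iloc[0, 1]
    return "string" if isinstance(value, str) else "missing"


print("from list:", describe_cell(from_list))
print("from dict:", describe_cell(from_dict))
```

If the two lines print different results, the installed version shows the inconsistency reported here.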
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 41.2.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None