BUG GH23224 Allow integer_array to be initialized with all None #23237

Merged (6 commits) on Oct 23, 2018

JustinZhengBC
Contributor

Allows initialization of integer_array with [None], lists of None, and lists made up only of None and other NA types.
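For context, a masked integer array pairs integer data with a boolean mask that marks the missing entries, so an input containing only missing values is still representable. The following is a minimal pure-Python sketch of that idea, not the pandas implementation: `is_missing` and `to_masked_ints` are hypothetical helpers, and the placeholder value 1 for masked slots is an assumption made for illustration.

```python
import math


def is_missing(x):
    """Hypothetical stand-in for an NA check: None or a float NaN."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def to_masked_ints(values):
    """Split a sequence into (data, mask); mask[i] is True where the entry is missing."""
    mask = [is_missing(v) for v in values]
    # Masked slots get an arbitrary placeholder (assumed here to be 1);
    # the mask, not the data, records that they are missing.
    data = [1 if m else int(v) for v, m in zip(values, mask)]
    return data, mask


# An input made only of missing values is valid: every slot is masked.
print(to_masked_ints([None, float("nan")]))
print(to_masked_ints([1, None, 3]))
```

In this sketch an all-None input is no different from a partially missing one; only the mask changes.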

@pep8speaks

Hello @JustinZhengBC! Thanks for submitting the PR.

@@ -174,8 +174,12 @@ def coerce_to_array(values, dtype, mask=None, copy=False):
        inferred_type = infer_dtype(values)
        if inferred_type not in ['floating', 'integer',
                                 'mixed-integer', 'mixed-integer-float']:
            raise TypeError("{} cannot be converted to an IntegerDtype".format(
                values.dtype))
Contributor

Put this check before the

if inferred_type not in

test, and make that one an elif.

    [
        (integer_array([None]), integer_array([np.nan])),
        (integer_array([None, np.nan]), integer_array([np.nan, np.nan]))])
def test_to_integer_array_none(result, expected):
Contributor

Can you also try an array of all np.nan?

            raise TypeError("{} cannot be converted to an IntegerDtype".format(
                values.dtype))
        if inferred_type is 'mixed' and all(isna(x) for x in values):
            values = np.array([np.nan] * len(values))  # GH 23224
Contributor

You can use

isna(values).any(), as we always return an array.

@@ -174,8 +174,12 @@ def coerce_to_array(values, dtype, mask=None, copy=False):
        inferred_type = infer_dtype(values)
        if inferred_type not in ['floating', 'integer',
                                 'mixed-integer', 'mixed-integer-float']:
Contributor

I don't recall whether we check for exactly ndim == 1 right after we convert to an ndarray.

Can you add that check as well (and some tests)?
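A check of the kind requested here might look like the sketch below. It assumes NumPy is available; `require_1d` is a hypothetical helper and the error message is an assumption, not text taken from this PR.

```python
import numpy as np


def require_1d(values):
    """Convert to an ndarray and reject anything that is not exactly 1-D."""
    arr = np.asarray(values)
    if arr.ndim != 1:
        raise TypeError("values must be a 1D list-like")
    return arr


require_1d([1, 2, 3])  # accepted: a flat list becomes a 1-D array

for bad in (5, [[1, 2], [3, 4]]):  # a 0-D scalar and a 2-D nested list
    try:
        require_1d(bad)
    except TypeError as exc:
        print(type(bad).__name__, "->", exc)
```

Testing `ndim != 1` rather than `ndim > 1` also rejects scalars, which convert to 0-D arrays.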

@codecov

codecov bot commented Oct 19, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@cb8a67e).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #23237   +/-   ##
=========================================
  Coverage          ?   92.19%           
=========================================
  Files             ?      169           
  Lines             ?    50974           
  Branches          ?        0           
=========================================
  Hits              ?    46997           
  Misses            ?     3977           
  Partials          ?        0
Flag        Coverage Δ
#multiple   90.62% <100%> (?)
#single     42.27% <0%> (?)

Impacted Files                   Coverage Δ
pandas/core/arrays/integer.py    95.12% <100%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cb8a67e...14399b1.

@jreback added this to the 0.24.0 milestone Oct 19, 2018
@jreback added the Bug, Missing-data, and ExtensionArray labels Oct 19, 2018
@jreback
Contributor

jreback commented Oct 19, 2018

lgtm. i think you need to merge master and push again. ping on green.

@JustinZhengBC
Contributor Author

@jreback green now. I also moved my test cases to another more appropriate test that someone added since I last pulled.

        if inferred_type not in ['floating', 'integer',
                                 'mixed-integer', 'mixed-integer-float']:
        if inferred_type is 'mixed' and isna(values).any():
            values = np.array([np.nan] * len(values))  # GH 23224
Contributor

use

values = np.empty(len(values))
values.fill(np.nan)
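As a rough illustration of this suggestion, the sketch below (assuming NumPy is available) builds the same all-NaN array three ways: via a temporary Python list as in the diff, via `np.empty` plus `fill` as suggested, and via `np.full`, which is an alternative not mentioned in this thread.

```python
import numpy as np

n = 3

# Builds a Python list of n floats first, then converts it to an array.
from_list = np.array([np.nan] * n)

# Allocates an uninitialized float64 buffer and fills it in place.
filled = np.empty(n)
filled.fill(np.nan)

# np.full does the allocate-and-fill in a single call.
full = np.full(n, np.nan)

for arr in (from_list, filled, full):
    print(arr.dtype, arr.shape, np.isnan(arr).all())
```

All three produce a float64 array of NaN; the `np.empty` and `np.full` variants avoid creating the intermediate list.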

@jreback merged commit c8b45e0 into pandas-dev:master Oct 23, 2018
@jreback
Contributor

jreback commented Oct 23, 2018

thanks @JustinZhengBC

jorisvandenbossche added a commit that referenced this pull request Oct 23, 2018
@@ -173,8 +173,11 @@ def coerce_to_array(values, dtype, mask=None, copy=False):
    values = np.array(values, copy=copy)
    if is_object_dtype(values):
        inferred_type = lib.infer_dtype(values)
        if inferred_type not in ['floating', 'integer',
                                 'mixed-integer', 'mixed-integer-float']:
        if inferred_type is 'mixed' and isna(values).any():
Member

Shouldn't this any be all?
Because now this allows something like this:

In [14]: pd.core.arrays.integer_array([np.nan, {'a':1}])
Out[14]: IntegerArray([nan, nan], dtype='Int64')

A corner case, but it applies to basically any Python object that is not a datetime or string (anything that numpy will convert to an object array).

Contributor

good catch, yep need to have all here
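The difference between the two conditions can be sketched in pure Python. This is an illustration only, not the pandas code: `is_missing`, `accepts_with_any`, and `accepts_with_all` are hypothetical helpers, and the sketch compares the inferred type with `==` because `is` on a string literal depends on string interning.

```python
import math


def is_missing(x):
    """Hypothetical stand-in for a scalar NA check: None or a float NaN."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def accepts_with_any(values, inferred_type="mixed"):
    # Mirrors the merged condition: a single missing value is enough.
    return inferred_type == "mixed" and any(is_missing(v) for v in values)


def accepts_with_all(values, inferred_type="mixed"):
    # The proposed fix: every value must be missing.
    return inferred_type == "mixed" and all(is_missing(v) for v in values)


bad = [float("nan"), {"a": 1}]  # the corner case from the comment above
good = [None, float("nan")]     # only missing values

print(accepts_with_any(bad), accepts_with_all(bad))
print(accepts_with_any(good), accepts_with_all(good))
```

With `any`, the dict in `bad` slips through and would be silently replaced by NaN; with `all`, it falls through to the branch that raises `TypeError`.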

@jorisvandenbossche
Member

@JustinZhengBC feel free to open a follow-up PR

Labels: Bug, ExtensionArray, Missing-data
Projects: None yet
Successfully merging this pull request may close these issues: Support None in integer_array
4 participants