BUG: IntervalIndex constructor inconsistencies #18421

Closed
jschendel opened this issue Nov 22, 2017 · 1 comment · Fixed by #18424
Labels: Bug, Dtype Conversions, Interval

@jschendel
Member

Code Sample, a copy-pastable example if possible

  1. IntervalIndex constructor ignores closed parameter for purely NA data:
In [3]: pd.IntervalIndex([np.nan], closed='both')
Out[3]:
IntervalIndex([nan]
              closed='right',
              dtype='interval[float64]')

In [4]: pd.IntervalIndex([np.nan, np.nan], closed='neither')
Out[4]:
IntervalIndex([nan, nan]
              closed='right',
              dtype='interval[float64]')

This only occurs on master and appears to be an over-correction resulting from the fix in #18340.


  2. IntervalIndex also ignores closed when it conflicts with how the input data is closed:
In [6]: ivs = [pd.Interval(0, 1, closed='both'), pd.Interval(10, 20, closed='both')]

In [7]: pd.IntervalIndex(ivs, closed='neither')
Out[7]:
IntervalIndex([[0, 1], [10, 20]]
              closed='both',
              dtype='interval[int64]')

The behavior above occurs on master and is a result of #18340. Before that fix the opposite happened: intervals were always coerced to match the closed specified by the constructor. The constructor should probably raise when the specified closed conflicts with the closed inferred from the data.


  3. Inconsistent dtype for empty IntervalIndex depending on the method of construction:
In [2]: pd.IntervalIndex([]).dtype
Out[2]: interval[object]

In [3]: pd.IntervalIndex.from_intervals([]).dtype
Out[3]: interval[object]

In [4]: pd.IntervalIndex.from_breaks([]).dtype
Out[4]: interval[float64]

In [5]: pd.IntervalIndex.from_tuples([]).dtype
Out[5]: interval[float64]

In [6]: pd.IntervalIndex.from_arrays([], []).dtype
Out[6]: interval[float64]

Expected Output

  1. IntervalIndex constructor should not ignore the closed parameter for purely NA data (since it can't infer closed from the input data).

  2. IntervalIndex should raise when given conflicting closed vs. inferred closed from data.

  3. IntervalIndex should have the same dtype for empty data regardless of the method of construction. It's not immediately clear to me which dtype should be used, but my feeling is interval[object] since that's the behavior of the constructor/from_intervals.
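
To make expected behaviors 1 and 2 concrete, here is a minimal pure-Python sketch of the resolution rule. It is not the pandas implementation: the Interval class, is_na, and resolve_closed are hypothetical stand-ins invented for illustration, and the 'right' default is an assumption taken from the outputs shown above.

```python
import math

VALID_CLOSED = {"left", "right", "both", "neither"}


class Interval:
    """Hypothetical stand-in for pd.Interval: two endpoints and a closed side."""

    def __init__(self, left, right, closed="right"):
        if closed not in VALID_CLOSED:
            raise ValueError(f"invalid closed: {closed!r}")
        self.left, self.right, self.closed = left, right, closed


def is_na(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def resolve_closed(data, closed=None, default="right"):
    """Pick the closed side for a new index (hypothetical helper).

    `data` is expected to hold Interval objects and/or missing values.
    """
    if closed is not None and closed not in VALID_CLOSED:
        raise ValueError(f"invalid closed: {closed!r}")

    # Collect the closed sides actually present in the non-missing data.
    inferred = {iv.closed for iv in data if not is_na(iv)}
    if len(inferred) > 1:
        raise ValueError("intervals must all be closed on the same side")

    if not inferred:
        # Empty or purely NA data: nothing to infer, so honor the argument.
        return closed if closed is not None else default

    (data_closed,) = inferred
    if closed is not None and closed != data_closed:
        # Conflict between the argument and the data: raise, do not coerce.
        raise ValueError(
            f"conflicting closed: got {closed!r}, data is closed {data_closed!r}"
        )
    return data_closed
```

With this rule, purely NA input keeps the requested closed, matching input is accepted, and the conflicting call from sample 2 raises a ValueError instead of silently picking one side.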

@jreback
Contributor

jreback commented Nov 22, 2017

  3. should be object

jreback added the Bug, Difficulty Intermediate, Dtype Conversions, and Interval labels Nov 22, 2017
jreback added this to the Next Major Release milestone Nov 22, 2017
jreback modified the milestones: Next Major Release, 0.22.0 Nov 23, 2017