Series.drop() with boolean index: inconsistent behaviour #8530

Closed
urraca opened this issue Oct 10, 2014 · 6 comments
Labels: Bug, Dtype Conversions (Unexpected or buggy dtype conversions)

Comments

urraca commented Oct 10, 2014

The behaviour below occurs in versions '0.15.0rc1-21-g32c5016' and '0.14.1'.

When the label passed to the drop method of a Series is not in the index:
(a) if the index is not all booleans then an error is raised
(b) if the index is all booleans and has length > 1 then no error is raised; the original series is returned
(c) if the index is all booleans and has length == 1 then an error is raised

I propose that:

  1. the difference between the behaviour in (a) and (b) should be documented
  2. the behaviour in (c) should be changed to match that in (b)

Examples of current behaviour:
(a)

>>> pd.Series([1, 1], index=['a', True]).drop(False)
ValueError: labels [False] not contained in axis

(b)

>>> pd.Series([1, 1], index=[True, True]).drop(False)
True    1
True    1
dtype: int64

(c)

>>> pd.Series([1], index=[True]).drop(False)
ValueError: labels [False] not contained in axis
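
The three cases above can be re-run on any installed pandas with a small script. This is only a sketch: the helper name `drop_outcome` is made up for illustration, and it catches both `ValueError` and `KeyError` on the assumption that the exception type may differ between pandas versions.

```python
import pandas as pd


def drop_outcome(index, label=False):
    """Return the name of the exception raised by Series.drop, or None if it succeeds."""
    s = pd.Series([1] * len(index), index=index)
    try:
        s.drop(label)
    except (KeyError, ValueError) as exc:
        return type(exc).__name__
    return None


# The three indexes from cases (a), (b) and (c).
cases = {
    "a": ["a", True],    # mixed labels, unique
    "b": [True, True],   # all booleans, duplicated
    "c": [True],         # all booleans, length 1
}

outcomes = {name: drop_outcome(idx) for name, idx in cases.items()}
print(pd.__version__, outcomes)
```

If all three entries name an exception, the installed version treats the cases consistently; a `None` for "b" reproduces the behaviour reported here.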
urraca changed the title from "'drop' method for Series with boolean index: inconsistent behaviour" to "Series.drop() with boolean index: inconsistent behaviour" on Oct 10, 2014
jreback (Contributor) commented Oct 10, 2014

feel free to submit a PR for 1) docs

why would c) be incorrect? A value NOT in the axis by definition raises

as an aside, using a fully boolean index by definition almost always has duplicates which makes it a bit tricky to work with it.

jreback (Contributor) commented Oct 10, 2014

#6599 is an attempt to consolidate the drop behavior, feel free to pick this up / comment as appropriate as well.

jreback added the API Design and Dtype Conversions labels on Oct 10, 2014
urraca (Author) commented Oct 10, 2014

Thanks for your comments.

My key point is that (b) and (c) are inconsistent in a way users would not expect.

Personally, I find it convenient that errors are not raised in (b) but I suppose that it is most important that the behaviour be consistent.

To change the behaviour so that (b) raises an error would have the advantage of making the behaviour consistent across all index types and lengths, perhaps making further documentation unnecessary.
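
For readers who want the lenient behaviour of (b) on purpose, later pandas releases accept an `errors` argument on `Series.drop`. The sketch below assumes a pandas version that has this argument (it did not exist in the versions reported above); the wrapper name `drop_if_present` is hypothetical.

```python
import pandas as pd


def drop_if_present(series, label):
    """Drop `label` if it is in the index; otherwise return the data unchanged."""
    # errors="ignore" suppresses the exception for labels missing from the index.
    return series.drop(label, errors="ignore")


# The same three indexes as cases (a), (b) and (c).
indexes = [["a", True], [True, True], [True]]
series_list = [pd.Series([1] * len(idx), index=idx) for idx in indexes]

# False is absent from every index, so each result should match its input.
results = [drop_if_present(s, False) for s in series_list]

# When the label is present it is still removed as usual.
present = drop_if_present(pd.Series([1, 2], index=[True, False]), False)
```

Opting in this way keeps the default (raising on a missing label) uniform across index types and lengths, while still offering the convenience described above.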

jreback (Contributor) commented Oct 10, 2014

sorry, you are right. I think b) should raise, that looks like a bug (as I said before, just be careful with bool indexes).

jreback added the Bug label and removed the API Design label on Oct 10, 2014
jreback added this to the 0.15.1 milestone on Oct 10, 2014
jreback (Contributor) commented Oct 10, 2014

@urraca yes this stems from the fact that b) has a duplicate index, while c) does not. Have to think about this.
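
The duplicate/unique distinction can be checked directly with `Index.is_unique`. This is a small illustration using the indexes from the report; the variable names are chosen here and are not from the original post.

```python
import pandas as pd

# Indexes from cases (a), (b) and (c).
index_a = pd.Index(["a", True])
index_b = pd.Index([True, True])
index_c = pd.Index([True])

uniqueness = {
    "a": index_a.is_unique,
    "b": index_b.is_unique,
    "c": index_c.is_unique,
}
print(uniqueness)

# A boolean index has only two possible labels, so any such index
# with three or more entries must contain a duplicate.
long_bool = pd.Index([True, False, True])
print(long_bool.is_unique)
```

Only (b) has a non-unique index, which matches the one case where no error was raised.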

wesm (Member) commented Jul 6, 2018

This behavior is not present in 0.23.2

wesm closed this as completed on Jul 6, 2018
datapythonista modified the milestones: Contributions Welcome, Someday on Jul 8, 2018