API: make CategoricalDtype.__eq__ stricter #37929

Closed
jbrockmendel opened this issue Nov 18, 2020 · 4 comments

@jbrockmendel
Member

We allow for partially-initialized CategoricalDtypes to compare as equal to fully-initialized ones, leading to some surprising behaviors. The one that I stumbled over was this:

>>> import pandas as pd
>>> cat = pd.Categorical(["a", "b", "c"], ordered=True)
>>> dtype = pd.CategoricalDtype()
>>> cat.dtype == dtype
True
>>> cat2 = cat.astype(dtype)
>>> cat2.dtype == cat.dtype
False

We should stop special-casing categories=None for this purpose.
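
To make the proposal concrete, here is a minimal pure-Python sketch of the two comparison rules. `ToyDtype`, `lenient_eq`, `strict_eq`, and `toy_astype` are hypothetical names used only for illustration; none of this is pandas code. The way `toy_astype` keeps the existing categories while taking `ordered` from the target is an assumption about why the example above ends in False.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyDtype:
    """Hypothetical stand-in for a categorical dtype (not pandas code)."""
    categories: Optional[Tuple[str, ...]] = None  # None = not yet initialized
    ordered: bool = False


def lenient_eq(a: ToyDtype, b: ToyDtype) -> bool:
    # Special-cases categories=None: an uninitialized dtype matches anything.
    if a.categories is None or b.categories is None:
        return True
    return (a.categories, a.ordered) == (b.categories, b.ordered)


def strict_eq(a: ToyDtype, b: ToyDtype) -> bool:
    # No special case: every field has to match, including categories=None.
    return (a.categories, a.ordered) == (b.categories, b.ordered)


def toy_astype(current: ToyDtype, target: ToyDtype) -> ToyDtype:
    # Assumption: missing categories are taken from the current dtype,
    # while `ordered` comes from the target.
    cats = target.categories if target.categories is not None else current.categories
    return ToyDtype(cats, target.ordered)


full = ToyDtype(("a", "b", "c"), ordered=True)
empty = ToyDtype()
converted = toy_astype(full, empty)

print(lenient_eq(full, empty))      # True: the cast looks like a no-op
print(lenient_eq(converted, full))  # False: the cast changed `ordered`
print(strict_eq(full, empty))       # False: the mismatch is visible up front
```

Under the lenient rule, "equal dtypes" does not imply "casting changes nothing", which is the surprise described above. Under the strict rule the two dtypes already compare as different before the cast.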

jbrockmendel added the Bug and Needs Triage labels Nov 18, 2020
@jreback
Contributor

jreback commented Nov 18, 2020

This was for backward compatibility.

IIRC there should be some issue references in the dtypes definition.

But agreed, we should fix this.

@jbrockmendel
Member Author

The other surprise I found when going to fix this was that pandas_dtype("category") has ordered=None while CategoricalDtype() has ordered=False.
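
A quick way to check this on a given pandas install is sketched below. It assumes the string alias meant here is "category" and that `pandas_dtype` is imported from `pandas.api.types`. The values in the comments are the ones reported in this thread and may differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import pandas_dtype

# Dtype built from the generic string alias (assumed to be "category").
from_string = pandas_dtype("category")

# Dtype built by calling the constructor with no arguments.
default = pd.CategoricalDtype()

print(from_string.ordered)  # reported in this thread as None
print(default.ordered)      # reported in this thread as False
```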

@jorisvandenbossche
Member

In general, I think this "uninitialized" CategoricalDtype object exists to support the generic "category" dtype. So if we want to make the dtype stricter, we will need to come up with an alternative way to support this.
(That's not to say there isn't something buggy in the specific example you show, but that case is about astyping an array that is already categorical.)
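
As a small illustration of the generic use case mentioned here, the sketch below builds a Series with either the "category" string or an uninitialized `CategoricalDtype()`; in both cases the categories are inferred from the data. The sample values are made up for the example.

```python
import pandas as pd

values = ["a", "b", "a"]  # made-up sample data

# Generic requests: no categories given, so they are inferred from the values.
by_string = pd.Series(values, dtype="category")
by_empty_dtype = pd.Series(values, dtype=pd.CategoricalDtype())

print(list(by_string.cat.categories))       # ['a', 'b']
print(list(by_empty_dtype.cat.categories))  # ['a', 'b']

# Comparing against the string alias is handled separately from dtype-to-dtype
# comparison, so it does not depend on the categories.
print(by_string.dtype == "category")        # True
```

Any stricter `__eq__` would need to keep this "infer the categories for me" path working, which is the concern raised in this comment.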

jbrockmendel added the API Design and Categorical labels and removed the Bug and Needs Triage labels Jun 6, 2021
@jbrockmendel
Member Author

Closed by #38516
