Aniketsy
Contributor

@Aniketsy Aniketsy commented Jul 24, 2025

Fixes #61935

  • Fixes a bug where assert_index_equal raises a TypeError instead of AssertionError when comparing two CategoricalIndex objects with check_categorical=True and exact=False.

  • Ensures consistency with expected testing behavior by properly raising an AssertionError in these cases.
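A minimal sketch of the underlying failure, assuming the bug is triggered by two categoricals whose categories differ (the description above does not spell out the reproducer). pandas refuses to compare such values elementwise and raises a TypeError, which is presumably what leaked out of assert_index_equal. The helper name `comparison_raises_type_error` is hypothetical and used only for illustration.

```python
import pandas as pd

# Two categoricals holding the same values but declaring different categories.
left = pd.Categorical(["a", "b"], categories=["a", "b"])
right = pd.Categorical(["a", "b"], categories=["a", "b", "c"])


def comparison_raises_type_error(a, b):
    """Hypothetical helper: report whether `a != b` raises TypeError."""
    try:
        a != b
    except TypeError:
        return True
    return False


raised = comparison_raises_type_error(left, right)
print(raised)
```

A testing helper that performs this comparison without guarding it would surface the TypeError to the caller instead of an AssertionError.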

Please let me know if my approach or fix needs any improvements. I’m open to feedback and happy to make changes based on suggestions.

@Aniketsy
Contributor Author

Hi @mroeschke
I've opened a pull request addressing
BUG: Fix TypeError in assert_index_equal when comparing CategoricalIndex with check_categorical=True and exact=False ([#61941])
The changes are ready for review.

I'd really appreciate it if you could take a look and provide feedback.
Please let me know if anything needs to be improved or clarified.

Thanks!

@mroeschke mroeschke added the Testing pandas testing functions or related to the test suite label Jul 25, 2025
Member

@mroeschke mroeschke left a comment

This also needs a unit test

@Aniketsy Aniketsy force-pushed the bugfix-assert-categoricalindex-typeerror branch from 333eb89 to 5edf8ce Compare July 25, 2025 18:06
@Aniketsy
Contributor Author

Hi @mroeschke,

Thank you for your review. I’ve updated the PR based on your feedback; please have a look when convenient.

Additionally, I noticed one check failure (pre-commit.ci-pr) and wanted to ask if you could help clarify the reason behind it. Apologies if this isn't the appropriate way to raise this; please let me know the correct approach if needed.

Thanks again!

@Aniketsy Aniketsy force-pushed the bugfix-assert-categoricalindex-typeerror branch from de5b16b to 08739f2 Compare July 26, 2025 10:38
@Aniketsy
Contributor Author

Checks fail

Hi @jorisvandenbossche, I ran pre-commit locally and all hooks passed. However, the GitHub checks are still showing a failure. Could you please advise if I’ve missed something?

@Aniketsy
Contributor Author

Hi @mroeschke
When you have a moment, could you please review this PR? I've been working on resolving the check failure, but haven't been able to pinpoint the issue yet. Any insights or suggestions you could provide would be greatly appreciated.

Thank you!

Comment on lines 324 to 325
ci1 = CategoricalIndex(["a", "b", "c"], categories=["a", "b", "c"], ordered=False)
ci2 = CategoricalIndex(["a", "x", "c"], categories=["a", "b", "c"], ordered=False)
Member

Can you test when 1 index is CategoricalIndex and the other is just an Index?

Member

Sorry, I meant an additional test alongside the one you already had.

Contributor Author

Please let me know if I need to add an additional test.

Member

@Aniketsy yes, please add it as an additional test, and keep the original one you added (because we want to cover the case of the bug report #61935, where both left and right are CategoricalIndex)

Contributor Author

@jorisvandenbossche Should I keep these as two separate tests for clarity, or merge them into one parametrized test?
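For illustration, the two cases discussed in this thread (both sides CategoricalIndex, and CategoricalIndex against a plain Index) could be exercised along these lines. This is a sketch, not the test that was added to the PR: the index values are made up, and a plain loop stands in for `pytest.mark.parametrize` so the snippet runs on its own.

```python
import pandas as pd
from pandas.testing import assert_index_equal

left = pd.CategoricalIndex(["a", "b", "c"], categories=["a", "b", "c"])

# Illustrative right-hand sides: same length as `left`, different values.
cases = {
    "both categorical": pd.CategoricalIndex(
        ["a", "c", "c"], categories=["a", "b", "c"]
    ),
    "categorical vs plain": pd.Index(["a", "c", "c"]),
}

outcomes = {}
for name, right in cases.items():
    try:
        assert_index_equal(left, right, check_categorical=True, exact=False)
    except AssertionError:
        # The expected outcome for unequal indexes.
        outcomes[name] = "AssertionError"
    else:
        outcomes[name] = "no error"

print(outcomes)
```

With a real test runner, each entry in `cases` would map naturally onto one parametrized case wrapped in `pytest.raises(AssertionError)`.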

Comment on lines 328 to 329
if hasattr(left, "_internal_get_values") and hasattr(
    right, "_internal_get_values"
Member

Can you use isinstance checks to call _internal_get_values if the object is a CategoricalIndex?

Contributor Author

Thank you for correcting me. I have updated that.

Contributor Author

One check is failing because CategoricalIndex does not have an _internal_get_values() method.
Should I use .value or something else? Please suggest.

@Aniketsy
Contributor Author

Hi @mroeschke
I just wanted to check in on this PR to see if there’s anything further you’d like me to update or improve.
Thank you!

@Aniketsy
Contributor Author

Hi @jorisvandenbossche, I’ve added the separate test as suggested. Please let me know if you’d prefer me to merge these into a single parametrized test instead.

@Aniketsy
Contributor Author

Hi @jorisvandenbossche, just a gentle reminder to review the changes whenever you get a chance. Thanks!

mismatch = left._values != right._values
try:
    mismatch = left._values != right._values
except TypeError:
Member

Please add a comment pointing back to the original issue; otherwise I'll be confused as to why this can happen.

Contributor Author

Sure, I will add a comment.

try:
    mismatch = left._values != right._values
except TypeError:
    if isinstance(left, CategoricalIndex) and isinstance(
Member

are there cases where this doesn't hold?

Contributor Author

I’m not sure, but I don’t think so.

if isinstance(left, CategoricalIndex) and isinstance(
    right, CategoricalIndex
):
    mismatch = left.codes != right.codes
Member

If we raised TypeError, it'll be because the dtypes don't match and aren't comparable. In that case, comparing codes doesn't make much sense.

Contributor Author

Got it, that makes sense.

@Aniketsy
Contributor Author

@jbrockmendel, just to confirm before I apply the suggested changes. The current code is:

except TypeError:
    if isinstance(left, CategoricalIndex) and isinstance(
        right, CategoricalIndex
    ):
        mismatch = left.codes != right.codes
    else:
        mismatch = left.values != right.values

Should I simply replace it with this?

except TypeError:
    mismatch = np.ones(len(left), dtype=bool)

Also, do I need to update the tests, since I have added tests for two cases?
Please correct me if I'm wrong.

@jbrockmendel
Member

In that context I don't think it makes sense to show the mismatch. Just say that the types aren't comparable.
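One way to read this suggestion, sketched under assumptions rather than taken from the merged change: when the elementwise comparison itself raises TypeError, skip computing a mismatch and raise an AssertionError stating that the values are not comparable. The helper name `assert_values_match` and the message wording are hypothetical.

```python
import pandas as pd


def assert_values_match(left, right):
    """Hypothetical helper, not the pandas implementation."""
    try:
        mismatch = left != right
    except TypeError as err:
        # The values cannot be compared at all (for example, categoricals
        # with different categories), so there is no mismatch to report.
        raise AssertionError(f"values are not comparable: {err}") from err
    if mismatch.any():
        raise AssertionError(f"{int(mismatch.sum())} value(s) differ")


a = pd.Categorical(["a", "b"], categories=["a", "b"])
b = pd.Categorical(["a", "b"], categories=["a", "b", "c"])

try:
    assert_values_match(a, b)
except AssertionError as exc:
    message = str(exc)

print(message)
```

Callers that catch AssertionError, such as test suites, then see an ordinary assertion failure in both the "different values" and the "not comparable" situations.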

@Aniketsy Aniketsy force-pushed the bugfix-assert-categoricalindex-typeerror branch from 0ef7755 to 5d4af24 Compare September 27, 2025 05:36
@Aniketsy
Contributor Author

@jbrockmendel I’ve updated the changes. Please review them when you have time. Thanks!

@Aniketsy
Contributor Author

Aniketsy commented Oct 1, 2025

It looks like the errors in the CI are from another file and not caused by the changes in this PR. As far as I can tell, this shouldn’t be related, but happy to be corrected if I’m wrong.

@jbrockmendel
Member

The code check failures should be fixed by merging main.

@Aniketsy
Contributor Author

Aniketsy commented Oct 2, 2025

Merged main, but it is still showing check failures.

@jbrockmendel
Member

Looks like those are caused by #62545

@Aniketsy
Contributor Author

Aniketsy commented Oct 3, 2025

@jbrockmendel The CI check failure is resolved.

@mroeschke mroeschke added this to the 3.0 milestone Oct 3, 2025
@mroeschke mroeschke merged commit 1028791 into pandas-dev:main Oct 3, 2025
42 checks passed
@mroeschke
Member

Thanks @Aniketsy

Development

Successfully merging this pull request may close these issues.

BUG: assert_index_equal(CategoricalIndex, CategoricalIndex, check_categorical=True, exact=False) raises TypeError instead of AssertionError
