Ensure comparisons with pyints and integer series always succeed #16532

seberg · 2024-08-12T18:27:40Z

When Python integers are compared to a series of integers, the result can always be correctly defined no matter the values of the Python integer.

This was always a very mild issue. But because NumPy 2 no longer upcasts the result type of a computation based on the value of the Python integer, even things like:

cudf.Series([1, 2, 3], dtype="int8") < 1000

would fail.
(Similar paths could be taken for other integer scalars, but these would mostly be nice for performance.)
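To illustrate why the result is always well defined, here is a minimal pure-Python sketch. It is not the cuDF implementation: `compare_with_pyint` is a hypothetical helper, the series is modeled as a plain list, and the int8 bounds are hard-coded purely for illustration. The idea is that a Python integer outside the dtype's range yields the same answer for every element, so it never needs to be cast to that dtype.

```python
import operator

# Bounds of int8, used here only as an illustrative dtype.
INT8_MIN, INT8_MAX = -128, 127


def compare_with_pyint(values, op, scalar, lo=INT8_MIN, hi=INT8_MAX):
    """Compare each element of `values` (all within [lo, hi]) to `scalar`.

    Hypothetical helper: if `scalar` does not fit the dtype, the answer is
    the same for every element and no cast of `scalar` is needed.
    """
    if scalar > hi:
        # Every element is strictly smaller than the scalar.
        constant = op in (operator.lt, operator.le, operator.ne)
    elif scalar < lo:
        # Every element is strictly larger than the scalar.
        constant = op in (operator.gt, operator.ge, operator.ne)
    else:
        # The scalar is representable, so an ordinary comparison is safe.
        return [op(v, scalar) for v in values]
    return [constant] * len(values)


below = compare_with_pyint([1, 2, 3], operator.lt, 1000)
equal = compare_with_pyint([1, 2, 3], operator.eq, 1000)
in_range = compare_with_pyint([1, 2, 3], operator.lt, 2)
```

With these inputs, `below` is all `True`, `equal` is all `False`, and `in_range` falls through to the ordinary element-wise comparison.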

N.B. NumPy/pandas also support exact comparisons when mixing e.g. uint64 and int64. This is another rare exception that cudf currently does not support.

Closes gh-16282

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

Just a note for reviewers: the later operator checks need `other` to be fully coerced, so unfortunately we can't deal with all operator special paths at the start.

mroeschke · 2024-08-12T18:59:07Z

/ok to test

mroeschke

Would there be a similar issue with comparing pyfloats and integer or float series?

seberg · 2024-08-13T06:00:12Z

> Would there be a similar issue with comparing pyfloats and integer or float series?

Yes and no: we (as in numpy/pandas) are ~~usually~~ content with normal promotion logic there (unlike Python, which corrects for overflows and rounding when comparing int and float scalars).
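As a small illustration of the difference mentioned above, the following pure-Python sketch contrasts Python's exact int/float comparison with what happens when the integer is first converted to a float, which is how ordinary promotion behaves. The variable names are only for this example.

```python
big = 2**53 + 1          # not exactly representable as a 64-bit float
as_float = float(2**53)  # exactly representable

# Python compares the mathematical values, so the two are unequal.
exact = big == as_float

# Converting first rounds 2**53 + 1 to 2**53, so the two compare equal.
promoted = float(big) == as_float
```

Here `exact` is `False` while `promoted` is `True`.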

mroeschke · 2024-08-13T20:59:43Z

/merge

jakirkham · 2024-08-13T21:19:17Z

Thanks Sebastian and Matt! 🙏

seberg requested a review from a team as a code owner August 12, 2024 18:27

seberg requested review from bdice and brandon-b-miller August 12, 2024 18:27

github-actions bot added the Python (Affects Python cuDF API.) label Aug 12, 2024

seberg force-pushed the comp-int branch from d422d9c to 029dcd4 Compare August 12, 2024 18:32

seberg force-pushed the comp-int branch from 029dcd4 to 6b575cf Compare August 12, 2024 18:58

mroeschke added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Aug 12, 2024

mroeschke approved these changes Aug 13, 2024

Merge branch 'branch-24.10' into comp-int

b3fa480

rapids-bot bot merged commit cf3fabf into rapidsai:branch-24.10 Aug 13, 2024
80 checks passed

seberg deleted the comp-int branch August 14, 2024 06:15
