BUG: isin casting to float64 for unsigned int and list #46693

phofl · 2022-04-08T11:24:12Z

Closes BUG: isin() give incorrect results for uint64 columns #46485
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

The list argument is cast to int64, which leads to an upcast to float64 later on when it is combined with unsigned comps. To avoid this, we can either use object dtype for the integer values or implement casting logic that casts the values to unsigned where possible.
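The second option (casting to unsigned where possible) could look roughly like the sketch below. It is only an illustration under stated assumptions: `cast_values_for_isin` is a hypothetical helper, not a pandas function, and the diff under review uses the object-dtype route instead.

```python
import numpy as np


def cast_values_for_isin(values, comps_dtype):
    """Hypothetical helper: choose a lossless dtype for list-like `values`.

    If `comps_dtype` is unsigned and every value is an integer that fits in
    it, return an array of that unsigned dtype. Otherwise fall back to object
    dtype, which keeps Python ints exact.
    """
    values = list(values)
    comps_dtype = np.dtype(comps_dtype)

    all_ints = all(
        isinstance(v, (int, np.integer)) and not isinstance(v, bool)
        for v in values
    )
    if comps_dtype.kind == "u" and values and all_ints:
        ints = [int(v) for v in values]
        if all(0 <= v <= np.iinfo(comps_dtype).max for v in ints):
            return np.array(ints, dtype=comps_dtype)

    # Fallback: an object array avoids any int64/uint64 -> float64 promotion.
    out = np.empty(len(values), dtype=object)
    out[:] = values
    return out


comps = np.array([2**60 + 1], dtype=np.uint64)
values = cast_values_for_isin([2**60], comps.dtype)
print(values.dtype)            # uint64
print(np.isin(comps, values))  # [False]
```

With both sides in uint64 the comparison stays in integer arithmetic, so neighbouring large values are not confused. Negative or non-integer inputs take the object fallback.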

jreback · 2022-04-08T23:27:30Z

looks fine. cc @jbrockmendel if any comments.

jbrockmendel · 2022-04-09T21:56:40Z

pandas/core/algorithms.py

@@ -446,7 +447,11 @@ def isin(comps: AnyArrayLike, values: AnyArrayLike) -> npt.NDArray[np.bool_]:
        )

    if not isinstance(values, (ABCIndex, ABCSeries, ABCExtensionArray, np.ndarray)):
-        values = _ensure_arraylike(list(values))
+        if not is_signed_integer_dtype(comps):
+            # GH#46485 Use object to avoid upcast to float64 later


IIUC the problem is that this comes back as int64, and np.find_common_type for int64 + uint64 is float64, which is lossy for big integers.

We face a similar problem elsewhere, and ideally I'd like to re-use some of the logic we use there. The place that comes to mind is Index._find_common_type_compat.
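The promotion described in this comment can be reproduced with NumPy alone. This is a minimal sketch: `np.promote_types` is used here in place of `np.find_common_type` (which newer NumPy releases deprecate), and the values `2**60` and `2**60 + 1` are chosen only for illustration.

```python
import numpy as np

# int64 and uint64 share no integer type that holds both ranges,
# so NumPy promotes the pair to float64.
common = np.promote_types(np.int64, np.uint64)
print(common)  # float64

# float64 has a 53-bit significand, so adjacent integers near 2**60
# map to the same float.
a, b = 2**60, 2**60 + 1
print(a == b)                          # False
print(np.float64(a) == np.float64(b))  # True

# After the upcast, a uint64 value "matches" a different int64 value.
comps = np.array([b], dtype=np.uint64)
values = np.array([a], dtype=np.int64)
print((comps.astype(common) == values.astype(common)).tolist())  # [True]
```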

phofl · 2022-04-22

So should we move the logic from _find_common_type_compat into a helper function that we can call in both places?

jbrockmendel · 2022-05-04

If it can be done gracefully, that'd be nice. Otherwise, I guess a TODO to look into sharing sooner or later.

phofl · 2022-06-15

Sharing this is not straightforward right now, so I would add a TODO and try to refactor later.

github-actions · 2022-06-04T00:05:03Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

jreback · 2022-06-24T18:14:40Z

thanks @phofl

BUG: isin casting to float64 for unsigned int and list

61c3a6f

phofl added the Bug, Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), and Regression (functionality that used to work in a prior pandas version) labels Apr 8, 2022

Merge remote-tracking branch 'upstream/main' into 46485

45f8a48

jreback added this to the 1.5 milestone Apr 8, 2022

jbrockmendel reviewed Apr 9, 2022

github-actions bot added the Stale label Jun 4, 2022

phofl added 2 commits June 15, 2022 10:04

Merge branch 'main' of https://github.com/pandas-dev/pandas into 46485

452f16f

Add todo

ac1e777

phofl removed the Stale label Jun 15, 2022

Merge remote-tracking branch 'upstream/main' into 46485

1afb98f

jreback approved these changes Jun 24, 2022

jreback merged commit e5c7543 into pandas-dev:main Jun 24, 2022

phofl deleted the 46485 branch June 25, 2022 14:24

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022

BUG: isin casting to float64 for unsigned int and list (pandas-dev#46693)

2baa7b0

adrian17 mentioned this pull request Oct 24, 2024

PERF: Slowdowns with .isin() on columns typed as np.uint64 #60098

Closed

3 tasks

pbrochart mentioned this pull request Apr 28, 2025

PERF: Restore old performances with .isin() on columns typed as np.ui… #61320

Merged

5 tasks
