-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Add np.intc
to _factorizers
in pd.merge
#52478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Did this fail on main as well? Couldn't reproduce on main yesterday |
Did not test on main. I don't have a Windows computer before next week. Did you test on Windows? Because I can see on Linux |
pandas/core/reshape/merge.py
Outdated
@@ -109,6 +109,7 @@ | |||
np.int64: libhashtable.Int64Factorizer, | |||
np.longlong: libhashtable.Int64Factorizer, | |||
np.int32: libhashtable.Int32Factorizer, | |||
np.intc: libhashtable.Int32Factorizer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this going to be platform-dependent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, an actual np.intc
is only available on Windows, and for Linux (and probably also Mac OS) it is np.int32
.
TBH, I did not know about np.intc
before some tests started failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a generic IntFactorizer? It is a bit pedantic since I think most distributions will have an int be 32 bits, but that definitely is not a guarantee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc @seberg mentioned hoping to get rid of np.intc in numpy 2.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not worried about intc
, bu its system dependend, it doesn't have to be 32bit, though. cnp.int_t
is what worries me a bit, because its long
, but if we want 64bit on windows it won't match up with np.array([1])
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
I accidentally force-pushed the main into this branch, which was why this PR was closed. I will open a new PR with the changes, though I will change |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.