PERF: fix performance regression from #62542 #62623
@@ -1907,7 +1907,9 @@ int64_t str_to_int64(const char *p_item, int *error, char tsep) {
   int64_t number = strtoll(p, &endptr, 10);

   if (errno == ERANGE) {
-    *error = ERROR_OVERFLOW;
+    // Python's integers can handle pure overflow errors,
+    // but for invalid characters, try using different conversion methods.
+    *error = *endptr ? ERROR_INVALID_CHARS : ERROR_OVERFLOW;

Are you sure that

It does; here is an example.

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        // (1 << 65) followed by "foo"
        const char *str = "36893488147419103232foo";
        char *endptr;
        long long int number = strtoll(str, &endptr, 10);
        printf("Original String: %s\nNumber: %lld\nEndPtr: %s\nError: %d\n", str,
               number, endptr, errno);
        return 0;
    }

Output:

ERRNO 34 is ERANGE.

I don't know if this is the official implementation of gcc, but it looks like it only ever sets errno to ERANGE. https://github.com/gcc-mirror/gcc/blob/master/libiberty/strtoll.c

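The behaviour discussed in this thread can be checked with a small helper. The following is a minimal sketch, not the code from this PR: the `parse_status` enum and the `classify_int64` name are hypothetical and stand in for the PR's `ERROR_OVERFLOW` and `ERROR_INVALID_CHARS` constants, whose values are not shown here. It relies only on the standard guarantee that `strtoll` stores a pointer to the first unconsumed character in `endptr`, including when the value is out of range.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch (not the pandas constants). */
enum parse_status {
    PARSE_OK,
    PARSE_OVERFLOW,      /* all digits, but too large for long long */
    PARSE_INVALID_CHARS, /* something other than digits follows the number */
    PARSE_NO_DIGITS      /* nothing could be converted at all */
};

/* Parse a base-10 signed integer and report why it failed, if it did. */
static enum parse_status classify_int64(const char *s, long long *out) {
    char *endptr;

    errno = 0; /* strtoll never clears errno, so reset it before the call */
    long long value = strtoll(s, &endptr, 10);

    if (errno == ERANGE) {
        errno = 0;
        /* endptr points past the last digit consumed, even on overflow,
         * so a non-NUL byte here means trailing invalid characters. */
        return *endptr ? PARSE_INVALID_CHARS : PARSE_OVERFLOW;
    }
    if (endptr == s) {
        return PARSE_NO_DIGITS;
    }
    if (*endptr) {
        return PARSE_INVALID_CHARS;
    }
    *out = value;
    return PARSE_OK;
}
```

With this helper, `"36893488147419103232"` is reported as a pure overflow, while `"36893488147419103232foo"` is reported as containing invalid characters, which is the distinction the changed line draws.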
     errno = 0;
     return 0;
   }
@@ -1967,7 +1969,9 @@ uint64_t str_to_uint64(uint_state *state, const char *p_item, int *error,
   uint64_t number = strtoull(p, &endptr, 10);

   if (errno == ERANGE) {
-    *error = ERROR_OVERFLOW;
+    // Python's integers can handle pure overflow errors,
+    // but for invalid characters, try using different conversion methods.
+    *error = *endptr ? ERROR_INVALID_CHARS : ERROR_OVERFLOW;
     errno = 0;
     return 0;
   }

What does this comment mean?

I recently added a change so that, on overflow, the parser tries to convert the value to a Python integer (PyLongObject), since Python supports big integers and pandas uses them to represent big integers:
pandas/pandas/_libs/parsers.pyx
Lines 1081 to 1084 in e95948f

The other part of the comment refers to the change in this PR, which sets the `maybe_int` flag to `False` in `pandas/_libs/parsers.pyx`.
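
To illustrate why the two error codes are worth separating, here is a rough sketch of how a caller might react to each case. It is an assumption-laden illustration, not the control flow in `pandas/_libs/parsers.pyx`: the `column_state` struct, the `needs_bigint` field, and the `scan_token` function are all hypothetical, and `maybe_int` only borrows the name of the flag mentioned above. The idea is that a pure overflow can still be handled by a big-integer fallback, whereas trailing invalid characters mean the integer path can be abandoned without attempting that more expensive conversion.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-column state for this sketch. */
struct column_state {
    bool maybe_int;    /* false once a token cannot be an integer at all */
    bool needs_bigint; /* true once an all-digit token overflows long long */
};

/* Inspect one token and update the column state.
 * Returns true while treating the column as integers is still worthwhile. */
static bool scan_token(struct column_state *col, const char *token) {
    char *endptr;

    errno = 0;
    (void)strtoll(token, &endptr, 10);
    bool overflowed = (errno == ERANGE);
    errno = 0;

    if (endptr == token || *endptr != '\0') {
        /* No digits, or trailing junk: no integer type can hold this,
         * so skip any big-integer fallback for the column. */
        col->maybe_int = false;
        return false;
    }
    if (overflowed) {
        /* Only digits, just too large: a big-integer fallback can help. */
        col->needs_bigint = true;
    }
    return true;
}
```

In this sketch, a token such as `"36893488147419103232"` keeps the column on the integer path and requests the fallback, while `"36893488147419103232foo"` clears `maybe_int` immediately.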