-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-12946: [C++] String swap case kernel #10855
ARROW-12946: [C++] String swap case kernel #10855
Conversation
LGTM. Need to add implementation for UTF8. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! Just a couple comments.
@lidavidm please enable CI |
For CI, it appears that Ubuntu 20.04 provides utf8proc 2.5, but the isupper/islower functions are not provided until 2.6: https://juliastrings.github.io/utf8proc/releases/ Meanwhile, RTools35 is using utf8proc 2.4; it builds utf8proc from source due to some other issue, but I believe it's using the system headers anyways looking at the compiler command since the system headers come before the self-built utf8proc ones. I think bumping the minimum utf8proc version and fiddling with the build flags so that the utf8proc headers take precedence will be needed. |
An alternative solution is to use helper functions already available, refer to https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_string.cc#L1391-L1408 Related notes in Arrow w.r.t. to lower/upper casing of UTF8 can be found in: |
Yeah the logs say
which doesn't make sense. |
As @edponce suggested, I replaced some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, thank you @Christian8491
Some tests invoke the incorrect kernel (ASCII test uses |
This PR adds `swapcase` compute kernel for string. It is similar to `Python str.swapcase()` Closes apache#10855 from Christian8491/ARROW-12946-String-swap-case-kernel Authored-by: christian <ccce91@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
This PR adds
swapcase
compute kernel for string. It is similar toPython str.swapcase()