Re-enables partial regex support for octal digits on the GPU #4735

Merged
merged 7 commits into from
Feb 10, 2022

Conversation

NVnavkumar
Collaborator

Fixes #4409.

Since a couple of underlying libcudf issues with octal digit support in regular expressions have now been resolved, this PR re-enables octal digit support by transpiling from Java Pattern syntax to the libcudf regular expression format. A couple of known limitations remain:

  • octal digits from \200 to \377 are not supported by libcudf, so patterns that use them will still fall back to the CPU
  • octal digits are not currently supported in character classes (e.g. [\022]) in libcudf, so these patterns will also fall back to the CPU

Documentation has been updated to reflect these limitations.
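
The two limitations above can be sketched as a small eligibility check. This is an illustration only, not the plugin's transpiler: the function name `octal_fallback_reason` is hypothetical, the sketch assumes Java Pattern's octal syntax (`\0n`, `\0nn`, `\0mnn` with `m` in 0-3), and it tracks character classes naively (no nested classes, no leading `]`).

```python
import re

# Hypothetical helper, NOT the plugin's transpiler: it only mirrors the two
# limitations listed in the description.
# Assumes Java Pattern octal escapes: \0n, \0nn, or \0mnn (m in 0-3).
_OCTAL = re.compile(r"\\0([0-3][0-7]{2}|[0-7]{1,2})")


def octal_fallback_reason(pattern):
    """Return None if the octal escapes look GPU-eligible, else a reason string."""
    in_class = False
    i = 0
    while i < len(pattern):
        m = _OCTAL.match(pattern, i)
        if m:
            if in_class:
                return "octal escape inside a character class"
            if int(m.group(1), 8) >= 0o200:
                return "octal value in the range 0200-0377"
            i = m.end()
        elif pattern[i] == "\\":
            i += 2  # skip any other escaped character
        else:
            # Naive character-class tracking (assumption: no nesting).
            if pattern[i] == "[":
                in_class = True
            elif pattern[i] == "]":
                in_class = False
            i += 1
    return None
```

For example, `octal_fallback_reason(r"a\0123b")` returns `None`, while `r"\0200"` and `r"[\022]"` each return a fallback reason.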

Signed-off-by: Navin Kumar <navink@nvidia.com>
…nspiler

@revans2
Collaborator

revans2 commented Feb 9, 2022

Just a small question. If we can parse them properly why not switch them over to something that CUDF does support? like hex digits?

@NVnavkumar
Collaborator Author

> Just a small question. If we can parse them properly why not switch them over to something that CUDF does support? like hex digits?

CUDF actually also doesn't support hex digits in the range that corresponds to 128-255, so rewriting those octal escapes as hex would not avoid the fallback.
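
As a quick arithmetic check of why the two ranges coincide, here is a minimal sketch; the helper name `octal_to_hex_escape` is hypothetical and is not part of the plugin or of CUDF.

```python
def octal_to_hex_escape(octal_digits):
    """Convert octal digits (e.g. "200") to an equivalent \\xHH escape and its value."""
    value = int(octal_digits, 8)
    return "\\x{:02X}".format(value), value


# Octal 200-377 is decimal 128-255, i.e. hex 80-FF: the same unsupported range.
low = octal_to_hex_escape("200")
high = octal_to_hex_escape("377")
```

Here `low` holds the escape `\x80` with value 128 and `high` holds `\xFF` with value 255, so the hex form lands in exactly the range described in the reply.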

@NVnavkumar
Collaborator Author

build

Contributor

@andygrove andygrove left a comment


LGTM. I am not sure why the build failed but it seems unrelated.

@andygrove
Contributor

build

@sameerz sameerz added the feature request New feature or request label Feb 9, 2022
@sameerz sameerz added this to the Jan 31 - Feb 11 milestone Feb 9, 2022
@andygrove andygrove merged commit f5ef7e9 into NVIDIA:branch-22.04 Feb 10, 2022
Successfully merging this pull request may close these issues.

[BUG] Possible race condition in regular expression support for octal digits