Re-enable empty repetition near end-of-line anchor for rlike, regexp_extract and regexp_replace #8081

Merged

Conversation

NVnavkumar
Collaborator

Fixes #5659.

This re-enables regular expressions with empty repetitions adjacent to end-of-line anchors in every regular expression SQL function except split(). This use case had worked before but was excluded because of other edge cases that were not functioning. The change modifies the checks to narrowly allow this particular case.
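
For readers unfamiliar with the term, the sketch below illustrates what such a pattern might look like. It assumes that "empty repetition adjacent to an end-of-line anchor" means a quantifier that can match zero characters placed directly before `$`, as in `a*$`. It uses Python's `re` module purely as a stand-in; the helper names mirror the SQL functions but are hypothetical, and this is not the plugin's code or its check logic.

```python
import re

# Assumed example pattern: `a*` may match zero characters and sits
# directly before the end-of-line anchor `$`.
PATTERN = r"a*$"


def rlike(value: str, pattern: str) -> bool:
    """Rough analogue of SQL rlike: does the pattern match anywhere?"""
    return re.search(pattern, value) is not None


def regexp_extract(value: str, pattern: str) -> str:
    """Rough analogue of regexp_extract(value, pattern, 0)."""
    match = re.search(pattern, value)
    return match.group(0) if match else ""


def regexp_replace(value: str, pattern: str, replacement: str) -> str:
    """Rough analogue of regexp_replace, built on re.sub."""
    return re.sub(pattern, replacement, value)


if __name__ == "__main__":
    for sample in ["bbaa", "bb"]:
        print(sample, rlike(sample, PATTERN), repr(regexp_extract(sample, PATTERN)))
    print(regexp_replace("bb", PATTERN, "X"))
```

Because the repetition can match the empty string at the end of the input, the pattern matches even when no `a` is present. Empty matches like this are the kind of edge case a regular expression engine has to handle carefully.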

Signed-off-by: Navin Kumar <navink@nvidia.com>
@NVnavkumar NVnavkumar self-assigned this Apr 12, 2023
@NVnavkumar
Collaborator Author

build

Signed-off-by: Navin Kumar <navink@nvidia.com>
@NVnavkumar
Collaborator Author

build

@NVnavkumar
Collaborator Author

build

@NVnavkumar NVnavkumar merged commit 90cc0a3 into NVIDIA:branch-23.06 Apr 13, 2023
abellina pushed a commit to abellina/spark-rapids that referenced this pull request Apr 14, 2023
…extract and regexp_replace (NVIDIA#8081)

* WIP: fix false positive repetition near line anchor bug

Signed-off-by: Navin Kumar <navink@nvidia.com>

* Enable repetition near end of line anchor edge case

Signed-off-by: Navin Kumar <navink@nvidia.com>

* Add regexp_replace test with empty repetition

Signed-off-by: Navin Kumar <navink@nvidia.com>

* Need to add unicode enabled check for these 2 unit tests

Signed-off-by: Navin Kumar <navink@nvidia.com>

---------

Signed-off-by: Navin Kumar <navink@nvidia.com>
@sameerz sameerz added the task Work required that improves the product but is not user facing label Apr 16, 2023
@mattahrens mattahrens added the feature request New feature or request label Apr 21, 2023
Labels
feature request New feature or request task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Minimize false positives when falling back to CPU for end of line/string anchors and newlines
4 participants