`hir::literal::Extractor` not producing optimal literals in some cases #1032
Yeah I think this sounds right to me!
plusvic added a commit to plusvic/regex that referenced this issue, Jul 10, 2023
When repetitions didn't have an explicit max value, like in `(ab){2,}` the literal extractor was producing sub-optimal literals, like `"ab"` instead of `"abab"`. Close rust-lang#1032
BurntSushi pushed a commit that referenced this issue, Jul 11, 2023
When repetitions didn't have an explicit max value, like in `(ab){2,}` the literal extractor was producing sub-optimal literals, like `"ab"` instead of `"abab"`. Close #1032
What version of regex are you using?
regex 1.9.1
regex-syntax 0.7.3
Describe the bug at a high level.
The literal extractor in the `regex_syntax` crate is not extracting the literals I expect from some repetition nodes. Specifically, when the repetition doesn't have a max value, it doesn't produce optimal literals.
For instance, the regexp `(ab){2,}` could produce the inexact literal `I("abab")`, because `ab` is guaranteed to appear at least twice; however, it produces `I("ab")`. If a max value is explicitly specified, as in `(ab){2,10}`, the result is `I("abab")` as expected.
The issue resides in this code snippet:
https://github.com/rust-lang/regex/blob/28e16fa5c34ab30a84b20de730cbdbe636e8a6df/regex-syntax/src/hir/literal.rs#L480C13-L497
The condition `if min < max` could be removed if we assume that when max is `None` it means that it should be >= min, which is the case in regular expressions. The snippet above could be replaced with:
If I'm not missing something obvious I can prepare a PR and send it for review.
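To illustrate the reasoning only, here is a minimal, self-contained toy model. It is not the snippet proposed in this issue and not the actual `regex_syntax` code: the `Lit` type and the `repeat_prefix` function are hypothetical names invented for this sketch, and it assumes the repeated sub-expression is a plain literal with `min >= 1` and a valid `max`.

```rust
/// Toy stand-in for an extracted prefix literal (hypothetical type,
/// not part of regex_syntax): the text plus an exactness flag.
#[derive(Debug, PartialEq)]
struct Lit {
    bytes: String,
    exact: bool,
}

/// Hypothetical helper: prefix literal for `(sub){min,max}`, where `sub`
/// is a plain literal. Assumes `min >= 1` and, if present, `max >= min`.
fn repeat_prefix(sub: &str, min: u32, max: Option<u32>) -> Lit {
    // Whatever the upper bound is, `sub` must appear at least `min` times,
    // so `min` copies are always a valid prefix. `None` (unbounded) is
    // treated like any bound >= min.
    let bytes = sub.repeat(min as usize);
    // The literal is exact only when the repetition can match nothing
    // other than exactly `min` copies.
    let exact = max == Some(min);
    Lit { bytes, exact }
}
```

Under these assumptions, `repeat_prefix("ab", 2, None)` and `repeat_prefix("ab", 2, Some(10))` both yield `abab` marked inexact, which matches the `I("abab")` result expected for `(ab){2,}`.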
What are the steps to reproduce the behavior?
What is the actual behavior?
The program above fails in the second `assert_eq` statement because the literal produced for `(ab){2,}` is `I("ab")` instead of `I("abab")`.
What is the expected behavior?
Both `assert_eq` statements should pass.