Use IndexOfAny{Except}InRange in RegexCompiler / source generator #76859
Depends on #76803.

This augments our existing use of IndexOf, IndexOfAny, and IndexOfAnyExcept to also support IndexOfAnyInRange and IndexOfAnyExceptInRange. That means, for example, we can now efficiently find the start of a pattern like `[0-9]{5}` via a vectorized search, whereas previously it would require iterating character by character in a scalar loop.

As part of this, I changed some tuples to instead be named structs. They were becoming unwieldy, and we expect we'll be adding even more here as additional IndexOf variants become available.
I assume the tests we have currently exercise this code already for both source generator and compiled engine.
Left a few nits, but this otherwise LGTM assuming CI will be green once the change in System.Memory goes in.
BTW, just curious, did you happen to run our perf tests to measure any gains that we might have gotten from this?
This augments our existing use of IndexOf, IndexOfAny, and IndexOfAnyExcept to also support IndexOfAnyInRange and IndexOfAnyExceptInRange. That means, for example, we can now efficiently find the start of a pattern like `[0-9]{5}`, via a vectorized search, whereas previously it'd require iterating character by character in a scalar loop. As part of this, I changed some tuples to instead be named structs. They were becoming unwieldy, and we expect we'll be adding even more here as additional IndexOf variants become available.
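To illustrate the difference described above, here is a minimal sketch that contrasts a scalar loop with the range-based span helpers when looking for the first character that could start `[0-9]{5}`. The class and method names are hypothetical and this is not the code the compiler or source generator emits; only `IndexOfAnyInRange` and `IndexOfAnyExceptInRange` (the `MemoryExtensions` methods, assumed available as in .NET 8) are real APIs.

```csharp
using System;

// Hypothetical illustration only; not the code emitted by RegexCompiler
// or the source generator.
public static class RangeSearchSketch
{
    // Scalar baseline: examine one character at a time.
    public static int IndexOfDigitScalar(ReadOnlySpan<char> input)
    {
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] >= '0' && input[i] <= '9')
            {
                return i;
            }
        }
        return -1;
    }

    // Range-based search for the set [0-9]; the span helper is free to
    // vectorize the scan internally.
    public static int IndexOfDigitInRange(ReadOnlySpan<char> input) =>
        input.IndexOfAnyInRange('0', '9');

    // The negated set [^0-9] maps to the "Except" variant.
    public static int IndexOfNonDigit(ReadOnlySpan<char> input) =>
        input.IndexOfAnyExceptInRange('0', '9');

    public static void Main()
    {
        ReadOnlySpan<char> text = "zip code: 98052";
        Console.WriteLine(IndexOfDigitScalar(text));       // 10
        Console.WriteLine(IndexOfDigitInRange(text));      // 10
        Console.WriteLine(IndexOfNonDigit("98052-1234"));  // 5
    }
}
```

Both digit searches return the same index; the range-based call simply hands the scan to the span helper instead of spelling out the loop. A full match of `[0-9]{5}` would still need to verify the remaining four characters from that starting position.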
And add a bit more test coverage
I didn't. I don't think any of our perf tests have patterns impacted by this. I'll pay attention to any improvements/regression reports, though, in case one slips through.
Yup. Though I added a few more tests to cover a few gaps.