-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NonBacktracking inner matching loop optimizations #70217
Conversation
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions Issue DetailsThis PR addresses the regressions related to the inner matching loop performance in #70022 introduced by #69839 and includes further optimizations. The changes in the PR are roughly:
One optimization that I evaluated, but that didn't decisively help was pulling out the decision to handle the "last end of line" special minterm outside the inner matching loop. I'm guessing the I compared performance after implementing each optimization to check that each was helping. Here's a total before and after comparison for NonBacktracking on all the relevant benchmarks in dotnet/performance.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those improvements look great. Do we know why the cited regressions regress?
if (nextState is not null || builder.TryCreateNewTransition(dfaMatchingState, mintermId, dfaOffset, checkThreshold: true, out nextState)) | ||
int dfaOffset = (state.DfaStateId << builder._mintermsLog) | mintermId; | ||
int nextStateId = builder._delta[dfaOffset]; | ||
if (nextStateId > 0 || builder.TryCreateNewTransition(state.DfaStateId, mintermId, dfaOffset, checkThreshold: true, out nextStateId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the only call site for the wrapper TryCreateNewTransition that was added. Any reason this can't instead just be:
if (nextStateId > 0 || builder.TryCreateNewTransition(state.DfaState!, mintermId, dfaOffset, checkThreshold: true, out DfaMatchingState<TSet> newState))
{
state.DfaStateId = newState.Id;
return true;
}
rather than having that extra wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, now I remember why I added it: having a wrapper for the state ID allowed a single if body here to handle both an existing transition and a new one. I'll still remove the wrapper, since expanding the code here makes the different cases a bit more apparent anyway.
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
@@ -367,7 +367,7 @@ public SymbolicMatch FindMatch(RegexRunnerMode mode, ReadOnlySpan<char> input, i | |||
Debug.Assert(matchEnd >= startat - 1); | |||
matchStart = matchEnd < startat ? | |||
startat : | |||
FindStartPosition(input, matchEnd, matchStartLowBoundary, perThreadData); | |||
SpecializedFindStartPosition(input, matchEnd, matchStartLowBoundary, perThreadData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "Specialized" mean? Specialized for or to do what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's essentially "curry with preferred generic type parameter and invoke", but I saw your other suggestion on getting rid of these functions and that does look cleaner.
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
int SpecializedFindEndPosition(ReadOnlySpan<char> input, int pos, long timeoutOccursAt, RegexRunnerMode mode, out int initialStatePos, out int matchLength, PerThreadData perThreadData) => | ||
_findOpts is null ? | ||
SpecializedFindEndPosition2<NoOptimizationsInitialStateHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData) : | ||
SpecializedFindEndPosition2<InitialStateFindOptimizationsHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData); | ||
|
||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
int SpecializedFindEndPosition2<TFindOptimizationsHandler>(ReadOnlySpan<char> input, int pos, long timeoutOccursAt, RegexRunnerMode mode, out int initialStatePos, out int matchLength, PerThreadData perThreadData) | ||
where TFindOptimizationsHandler : struct, IInitialStateHandler => | ||
_pattern._info.ContainsSomeAnchor ? | ||
FindEndPosition<TFindOptimizationsHandler, FullNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData) : | ||
FindEndPosition<TFindOptimizationsHandler, NoAnchorsNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's only a single call site for this, these are both aggressively inlined, and the naming with both the "Specialized" prefix and "2" suffix is... suspect :) Can the single call site just be changed to:
int matchEnd = (_findOpts is null, _pattern._info.ContainsSomeAnchor) switch
{
(false, false) => FindEndPosition<NoOptimizationsInitialStateHandler, NoAnchorsNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData),
(true, false) => FindEndPosition<InitialStateFindOptimizationsHandler, NoAnchorsNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData),
(false, true) => FindEndPosition<NoOptimizationsInitialStateHandler, FullNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData),
(true, true) => FindEndPosition<InitialStateFindOptimizationsHandler, FullNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData),
};
FindEndPosition<TFindOptimizationsHandler, NoAnchorsNullabilityHandler>(input, pos, timeoutOccursAt, mode, out initialStatePos, out matchLength, perThreadData); | ||
|
||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
int SpecializedFindStartPosition(ReadOnlySpan<char> input, int i, int matchStartBoundary, PerThreadData perThreadData) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here... can this just be manually inlined into the single call site?
matchStart =
matchEnd < startat ? startat :
_pattern._info.ContainsSomeAnchor ? FindStartPosition<FullNullabilityHandler>(input, i, matchStartBoundary, perThreadData) :
FindStartPosition<NoAnchorsNullabilityHandler>(input, i, matchStartBoundary, perThreadData);
// state processing logic and optimizations), or failed to transition (which should only happen if we were in DFA mode and | ||
// need to switch over to NFA mode). If we exited because we hit an initial state, find result will be 0, otherwise -1. | ||
if (findResult < 0) | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add the else block? Seems cleaner to just remove the else and unindent all of its contents.
try | ||
{ | ||
// Loop through each character in the input, transitioning from state to state for each. | ||
while (true) | ||
{ | ||
var (isInitial, isDeadend, isNullable, canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var (isInitial, isDeadend, isNullable, canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); | |
(bool isInitial, bool isDeadend, bool isNullable, bool canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); |
@@ -638,23 +649,29 @@ private bool FindStartPositionDeltas<TStateHandler>(SymbolicRegexBuilder<TSet> b | |||
// Loop backwards through each character in the input, transitioning from state to state for each. | |||
while (true) | |||
{ | |||
var (isInitial, isDeadend, isNullable, canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var (isInitial, isDeadend, isNullable, canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); | |
(bool isInitial, bool isDeadend, bool isNullable, bool canBeNullable) = TStateHandler.GetStateInfo(builder, ref state); |
I don't think the regressions I cited actually do regress, but rather my laptop isn't stable enough as a benchmarking machine. I'll benchmark again before merging to make sure. |
a23b597
to
3920d3b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @olsaarik. Looks good. Left a few style comments, but I'm going to merge this to unblock subsequent changes and to get the regressions addressed.
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexBuilder.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexBuilder.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
Show resolved
Hide resolved
{ | ||
return coreState.FixedLength(nextCharKind); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | |
} | |
Improvements on Arm64 Windows - dotnet/perf-autofiling-issues#6176 |
Improvements on x64: dotnet/perf-autofiling-issues#6037 and dotnet/perf-autofiling-issues#6130 |
This PR addresses the regressions related to the inner matching loop performance in #70022 introduced by #69839 and includes further optimizations. The changes in the PR are roughly:
int
assignment. I've also implemented support for fixed length markers in NFA mode.RegexFindOptimizations
out of the inner matching loop by using aIInitialStateHandler
interface and the "templating" technique with structs implementing the interface as generic type arguments that we already use for DFA and NFA mode.INullabilityHandler
interface.(bool IsInitial, bool IsDeadend, bool IsNullable, bool CanBeNullable)
in theSymbolicRegexBuilder
that can be accessed by state IDs. This reduces pointer chasing involved in looking these up from theDfaMatchingState
and its relatedSymbolicRegexNode
. The existingIsInitial
field fromDfaMatchingState
is removed for some memory saving.CurrentState
hold the ID (anint
) of the current DFA state instead of theDfaMatchingState
instance. This allows cached transitions to be taken in DFA mode without ever dereferencing the currentDfaMatchingState
. Coupled with the above optimization theDfaMatchingState
is rarely needed in typical patterns.One optimization that I evaluated, but that didn't decisively help was pulling out the decision to handle the "last end of line" special minterm outside the inner matching loop. I'm guessing the
char == '\n' && pos == input.Length - 1
check is already cheap enough that it isn't really visible.I compared performance after implementing each optimization to check that each was helping. Here's a total before and after comparison for NonBacktracking on all the relevant benchmarks in dotnet/performance. The baseline here is the situation after the regressions in #70022.
summary:
better: 58, geomean: 1.360
worse: 5, geomean: 1.057
total diff: 63