Regex evaluation bug - discrepancy between compiled and non-compiled regex #97455
Comments
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Issue Details
Description
Running the same code with a compiled and a non-compiled regex produces different outcomes.
Reproduction Steps
using System.Text.RegularExpressions;
using Xunit;

namespace BugReport;

public class RegexTests
{
    [Fact]
    public void CompiledRegexShouldProduceSameResultAsNonCompiled()
    {
        const string Pattern = @"(.*?)a(?!(a+)b\2c)\2(.*)";
        // Same pattern, once interpreted and once compiled.
        var nonCompiled = new Regex(Pattern, RegexOptions.None);
        var compiled = new Regex(Pattern, RegexOptions.Compiled);
        const string Input = "baaabaac";
        const int GroupNumber = 2;
        // Both engines should report the same value for group 2.
        Assert.Equal(nonCompiled.Match(Input).Groups[GroupNumber].Value, compiled.Match(Input).Groups[GroupNumber].Value);
    }
}
Expected behavior
Group capture is equal.
Actual behavior
Regression?
No response
Known Workarounds
No response
Configuration
Running latest .NET, Windows 11.
Other information
No response
|
Thanks for the helpful repro. Here's a slightly simpler one based on yours:
using System.Text.RegularExpressions;

const string Pattern = @"(?!(b)b)\1";
const string Input = "ba";
var nonCompiled = new Regex(Pattern, RegexOptions.None);
var compiled = new Regex(Pattern, RegexOptions.Compiled);
Console.WriteLine(nonCompiled.Match(Input).Success);
Console.WriteLine(compiled.Match(Input).Success);

The issue appears to be in how compiled (and source generated) regexes are handling capture groups inside of negative lookarounds. They're not uncapturing when exiting the construct, so whereas the backreference doesn't end up matching in the interpreter (because there's no capture to match), it does end up matching in the compiled regex because the capture is still there and matches. |
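For anyone who wants to check other patterns for the same kind of divergence, here is a minimal sketch that generalizes the two repros above. FindDiscrepancy is a hypothetical helper name introduced only for illustration, not part of .NET; it uses only the Regex, Match, and Group APIs already shown in this thread. It compares Success first and then each group, and returns null when the two engines agree.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a .NET API): runs one pattern through the interpreter
// and the compiled engine and describes the first difference, or returns null.
static string? FindDiscrepancy(string pattern, string input)
{
    Match interpreted = new Regex(pattern, RegexOptions.None).Match(input);
    Match compiled = new Regex(pattern, RegexOptions.Compiled).Match(input);

    // Compare the overall outcome first.
    if (interpreted.Success != compiled.Success)
    {
        return $"Success: interpreter={interpreted.Success}, compiled={compiled.Success}";
    }

    // Then compare every group, including group 0 (the whole match).
    for (int i = 0; i < interpreted.Groups.Count; i++)
    {
        Group a = interpreted.Groups[i];
        Group b = compiled.Groups[i];
        if (a.Success != b.Success || a.Value != b.Value)
        {
            return $"Group {i}: interpreter=\"{a.Value}\", compiled=\"{b.Value}\"";
        }
    }

    return null;
}

// The two patterns from this thread.
Console.WriteLine(FindDiscrepancy(@"(?!(b)b)\1", "ba") ?? "engines agree");
Console.WriteLine(FindDiscrepancy(@"(.*?)a(?!(a+)b\2c)\2(.*)", "baaabaac") ?? "engines agree");
```

What this prints depends on the runtime: going by the explanation above, an affected runtime would be expected to report a difference for both patterns, while a runtime without the problem should print "engines agree".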
@lahma, given that a backreference to a capture inside a negative lookahead from outside that lookahead will never match, can you speak to the pattern you were using that encountered this? Was it just a test, or was the pattern actually trying to do something useful with that backreference? |
Well, hard to say if we can call this a real-world scenario, as we are talking about my arch enemy after all. I encountered this when running the ECMAScript test suite and testing generated Regex instances in compiled mode while trying to optimize Jint. Here's the actual test case. |
Thanks. That's what I figured it was. |
Adding a note that when run under |
Yes, we rewrote the compiler in 7. |
Description
Running the same Regex as a compiled and a non-compiled version produces different outcomes.
Reproduction Steps
Expected behavior
Group capture is equal; in this case the value should be empty.
Actual behavior
Regression?
No response
Known Workarounds
No response
Configuration
Running latest .NET, Windows 11.
Other information
No response