Newline causes empty regex capture groups when NonBacktracking in dotnet 9 and above #120202

@pjc50

Description

Found while upgrading from .NET 8 to .NET 10, but it reproduces on .NET 9 as well. As far as I can tell, the documentation does not explain this behavior.

Both RegexOptions.NonBacktracking and a final character of \n, rather than any other kind of whitespace, appear to be necessary to trigger the bug. NonBacktracking is present because this case was cut down from a much larger regular expression while diagnosing the problem; the larger expression needs it for performance.

Reproduction Steps

`using System;
using System.Text.RegularExpressions;

string line = "A\n";
// Verbatim string so that \s reaches the regex engine as the whitespace class.
string expression = @"^(A)(\s)";
Match match = Regex.Match(line, expression, RegexOptions.IgnoreCase | RegexOptions.NonBacktracking);
if (!match.Success)
{
    // No match at all: exit with a non-zero code.
    System.Environment.Exit(1);
}

Console.WriteLine($"reg: '{match.Groups[1]}' '{match.Groups[2]}'");`

Expected behavior

reg: 'A' ' '

Each of the two input characters is matched and captured in its own group; the second group holds the newline.

Actual behavior

reg: '' ''

Both capture groups end up empty, even though the regex has matched.
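To narrow down which combination triggers the empty groups, the following is a minimal sketch that runs the same pattern over inputs differing only in their trailing whitespace character, with and without NonBacktracking. The extra inputs ("A " and "A\t") and the output layout are assumptions added for illustration and are not part of the original report. The output depends on the runtime version, so no expected output is shown.

```csharp
using System;
using System.Text.RegularExpressions;

// Inputs differ only in the trailing whitespace character (illustrative choice).
string[] inputs = { "A\n", "A ", "A\t" };

// Same options as the reproduction, with and without NonBacktracking.
RegexOptions[] optionSets =
{
    RegexOptions.IgnoreCase,
    RegexOptions.IgnoreCase | RegexOptions.NonBacktracking,
};

foreach (RegexOptions options in optionSets)
{
    foreach (string input in inputs)
    {
        Match match = Regex.Match(input, @"^(A)(\s)", options);

        // Regex.Escape renders whitespace visibly (for example "\n") in the output.
        Console.WriteLine(
            $"{options,-32} input={Regex.Escape(input),-5} success={match.Success} " +
            $"g1='{Regex.Escape(match.Groups[1].Value)}' g2='{Regex.Escape(match.Groups[2].Value)}'");
    }
}
```

Running this on .NET 8 and on .NET 9 or later and comparing the rows should show whether only the NonBacktracking row with the "\n" input differs, as described above.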

Regression?

Works as expected in dotnet 8.

Known Workarounds

Either apply Trim() to the string instead of relying on the regex to match its start and end, or add RegexOptions.Multiline.
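A minimal sketch of the RegexOptions.Multiline workaround applied to the reduced reproduction follows. That it avoids the empty groups is taken from this report rather than verified independently. Multiline also changes how ^ and $ behave (they match at line boundaries), so it may not be suitable for every expression.

```csharp
using System;
using System.Text.RegularExpressions;

string line = "A\n";

// Same pattern and options as the reproduction, plus RegexOptions.Multiline.
Match match = Regex.Match(
    line,
    @"^(A)(\s)",
    RegexOptions.IgnoreCase | RegexOptions.NonBacktracking | RegexOptions.Multiline);

if (match.Success)
{
    // Regex.Escape makes the captured newline visible instead of printing a line break.
    Console.WriteLine(
        $"reg: '{Regex.Escape(match.Groups[1].Value)}' '{Regex.Escape(match.Groups[2].Value)}'");
}
```

The Trim() variant is not shown because the reduced pattern requires the trailing whitespace in order to match; it applies to the larger expression, where the whitespace is only incidental.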

Configuration

Dotnet 10.0.0-rc.1.25451.107 on Windows x64 in VS Insiders.

As above - this code has been in place for many years and works on dotnet 8 and many previous versions.

Other information

No response
