Fix escaped pattern display in GeneratedRegex XML documentation #120098
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@copilot, builds are failing with errors like:
… pattern documentation Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Fixed the XML parsing errors in commit eb489a5. The issue was that my initial fix removed the
The builds should now pass without XML parsing errors while maintaining the correct single-escaped pattern display in documentation.
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@copilot there are multiple failures in the System.Text.RegularExpressions.Tests tests. Please run ALL those tests and get them all passing before returning.
Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
All System.Text.RegularExpressions.Tests are now passing! Fixed in commit c4141a2. The issue was that my XML escaping needed to handle C1 control characters (0x80-0x9F) and whitespace characters that were breaking XML comment structure. The enhanced
Final results: 29,287 tests passed, 0 failed.
Pull Request Overview
This PR fixes an issue where GeneratedRegex XML documentation was showing double-escaped backslashes in regex patterns (e.g., \\w instead of \w). The root cause was using the Literal() function which performs C# string escaping when generating XML documentation comments.
- Replaced `Literal(regexMethod.Pattern, quote: false)` with direct pattern usage in XML comments
- Enhanced the existing `EscapeXmlComment()` method to handle XML entities, control characters, and whitespace characters
- Added comprehensive test coverage for the escaping behavior
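
The double-escaping described in the overview can be illustrated with a minimal sketch. `EscapeAsCSharpLiteral` is a hypothetical stand-in that mimics only the backslash-doubling part of C# string-literal escaping; it is not the generator's `Literal()` helper, and the sample pattern is made up for illustration.

```csharp
using System;

// Hypothetical stand-in: doubles backslashes the way C# string-literal escaping does.
// The real generator helper is Literal(); this only imitates the part that matters here.
static string EscapeAsCSharpLiteral(string s) => s.Replace("\\", "\\\\");

// The pattern as a user might write it in [GeneratedRegex(...)] (illustrative value).
string pattern = @"\w+\d";

// Before the fix: documentation text went through literal escaping, doubling each backslash.
Console.WriteLine(EscapeAsCSharpLiteral(pattern)); // prints \\w+\\d

// After the fix: the pattern is used directly, so the backslashes stay single.
Console.WriteLine(pattern); // prints \w+\d
```

Literal escaping is the right tool when the pattern is emitted as C# source code, but documentation comments are read as text, so the same transformation there produces a pattern the user never wrote.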
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.Emitter.cs` | Enhanced `EscapeXmlComment()` method with comprehensive character escaping and fixed XML documentation generation to use raw pattern instead of C# string literal |
| `src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/RegexGeneratorOutputTests.cs` | Updated existing test expectations and added three new tests to verify proper escaping behavior |
/ba-g unrelated timeout in System.Runtime tests
case '>': sb.Append("&gt;"); break;

// Propagate all other valid XML characters as-is. Control chars are considered invalid.
case (>= 0x20 and <= 0x7F) or (>= 0xA0 and <= 0xD7FF) or (>= 0xE000 and <= 0xFFFD): sb.Append(c); break;
Use char.IsControl() ?
Not sure what you would want here with surrogate chars... Really depends on what you do with the regex. Let's hope that you escape these yourself with \ patterns if they are of interest.
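
To make the reviewer's question concrete, here is a small sketch comparing the explicit ranges from the quoted diff line with a `char.IsControl()` based check. The helper names `InAllowedRanges` and `NotControl` are hypothetical, and the sample characters were chosen for illustration only.

```csharp
using System;

// The explicit ranges from the quoted diff line, expressed as a predicate over the code unit.
static bool InAllowedRanges(char c)
{
    int code = c;
    return code is (>= 0x20 and <= 0x7F) or (>= 0xA0 and <= 0xD7FF) or (>= 0xE000 and <= 0xFFFD);
}

// The alternative suggested in the review comment.
static bool NotControl(char c) => !char.IsControl(c);

// Tab, a letter, DEL, a C1 control (NEL), a lone high surrogate, and a noncharacter.
foreach (char c in new[] { '\t', 'A', '\u007F', '\u0085', '\uD83D', '\uFFFE' })
{
    Console.WriteLine($"U+{(int)c:X4}: ranges={InAllowedRanges(c)}, notControl={NotControl(c)}");
}
```

The two checks agree on tab, `A`, and U+0085, but differ elsewhere: the ranges accept DEL (U+007F), which `char.IsControl()` reports as a control character, while `char.IsControl()` alone would let lone surrogates and U+FFFE through because they are not in the control category. Switching to `char.IsControl()` would therefore still need a separate decision about surrogates.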
- … `\\w` instead of `\w`)
- … `RegexGenerator.Emitter.cs` line 62: `Literal(regexMethod.Pattern, quote: false)`
- … `Literal` uses `SymbolDisplay.FormatLiteral(s, quote)` which escapes backslashes
- … `Literal(regexMethod.Pattern, quote: false)` with just `regexMethod.Pattern`
- … `EscapeForXmlComment` function to handle invalid XML characters
- … `EscapeXmlComment` method instead of creating duplicate functionality

Summary: The issue was that the `Literal()` function was double-escaping backslashes in regex patterns when generating XML documentation comments. The fix enhances the existing `EscapeXmlComment()` method to properly handle XML entities, invalid XML characters, whitespace control characters, and C1 control characters while preserving backslashes. This provides comprehensive XML-safe escaping for all generated comments and resolves the original double-escaping issue, XML parsing errors, and all test failures.
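
As a rough sketch of the kind of escaping the summary describes, the following shows one way such a method could look. It is not the generator's actual `EscapeXmlComment()`: the name `EscapeForXmlCommentSketch` is hypothetical, and rendering invalid characters as a visible `\uXXXX` placeholder is an assumption, since the conversation does not show how they are written out.

```csharp
using System;
using System.Text;

// Hypothetical sketch, not the generator's EscapeXmlComment().
static string EscapeForXmlCommentSketch(string text)
{
    var sb = new StringBuilder(text.Length);
    foreach (char c in text)
    {
        int code = c;
        switch (code)
        {
            // XML entities that would otherwise be parsed as markup.
            case '&': sb.Append("&amp;"); break;
            case '<': sb.Append("&lt;"); break;
            case '>': sb.Append("&gt;"); break;

            // Ranges as quoted in the review diff; backslashes fall in here and pass through untouched.
            case (>= 0x20 and <= 0x7F) or (>= 0xA0 and <= 0xD7FF) or (>= 0xE000 and <= 0xFFFD):
                sb.Append(c);
                break;

            // Assumption: everything else (C0/C1 controls, tabs, newlines, surrogates)
            // is shown as a textual \uXXXX placeholder so it cannot break the comment.
            default:
                sb.Append("\\u").Append(code.ToString("X4"));
                break;
        }
    }
    return sb.ToString();
}

Console.WriteLine(EscapeForXmlCommentSketch(@"\w+<a>&" + "\t\u0085"));
// prints: \w+&lt;a&gt;&amp;\u0009\u0085
```

The point of the sketch is that backslashes are left alone while only XML-significant or invalid characters are transformed, which is what keeps `\w` displayed as `\w` in the generated documentation.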