@MihaZupan (Member) commented Nov 1, 2025

Closes #31173

We have an UnescapeString helper that escapes/unescapes input based on various flags (copy input as-is, unescape only, escape reserved chars, or both).

This PR removes the use of unsafe code from this helper.

I also removed the "copy input as-is" and "unescape only" modes from this helper. For copy-only we just forward to CopyTo/Append directly, whereas "unescape only" gets a dedicated helper since it's a reasonably hot path (UnescapeDataString calls into this).
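
To illustrate what a dedicated "unescape only" path can look like without pointers, here is a minimal span-based sketch. It is not the code from this PR: `UnescapeSketch` and `UnescapeAscii` are hypothetical names, and the sketch only decodes ASCII `%XX` sequences, leaving malformed or non-ASCII escapes untouched, whereas the real helper also has to handle multi-byte UTF-8 sequences.

```csharp
using System;

// Hypothetical illustration; not the runtime's UriHelper.Unescape.
static class UnescapeSketch
{
    // Decodes ASCII-range %XX sequences only. Malformed escapes and
    // escapes >= 0x80 are copied through unchanged (a simplification).
    public static string UnescapeAscii(ReadOnlySpan<char> input)
    {
        int first = input.IndexOf('%');
        if (first < 0)
        {
            return input.ToString(); // nothing to unescape
        }

        // Unescaping never grows the text, so input.Length is enough.
        char[] buffer = new char[input.Length];
        Span<char> dest = buffer;

        // Bulk-copy the prefix that contains no '%'.
        input.Slice(0, first).CopyTo(dest);
        int written = first;

        for (int i = first; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '%' && i + 2 < input.Length)
            {
                int hi = HexValue(input[i + 1]);
                int lo = HexValue(input[i + 2]);
                if (hi >= 0 && lo >= 0 && hi < 8)
                {
                    c = (char)((hi << 4) | lo);
                    i += 2; // skip the two hex digits
                }
            }
            dest[written++] = c;
        }

        return new string(buffer, 0, written);
    }

    private static int HexValue(char c) => c switch
    {
        >= '0' and <= '9' => c - '0',
        >= 'a' and <= 'f' => c - 'a' + 10,
        >= 'A' and <= 'F' => c - 'A' + 10,
        _ => -1,
    };
}
```

All indexing goes through spans, so out-of-range accesses would throw instead of reading arbitrary memory. For real inputs, the public `Uri.UnescapeDataString` remains the API to call.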

Initial benchmarks look good (no meaningful difference): MihuBot/runtime-utils#1612

Alternating escaped/unescaped

| Method   | Toolchain | Mean     | Error   | Ratio |
|----------|-----------|----------|---------|-------|
| Unescape | main      | 214.6 ns | 4.02 ns | 1.00  |
| Unescape | pr        | 197.5 ns | 2.73 ns | 0.92  |

Only unescaped

| Method   | Toolchain | Mean     | Error   | Ratio |
|----------|-----------|----------|---------|-------|
| Unescape | main      | 270.0 ns | 0.79 ns | 1.00  |
| Unescape | pr        | 262.9 ns | 2.34 ns | 0.97  |

Escaped at start followed by unescaped

| Method   | Toolchain | Mean     | Error    | Ratio |
|----------|-----------|----------|----------|-------|
| Unescape | main      | 88.41 ns | 0.356 ns | 1.00  |
| Unescape | pr        | 31.44 ns | 0.228 ns | 0.36  |

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR refactors the URI unescaping logic by:

  • Introducing a simpler Unescape method that unconditionally unescapes percent-encoded sequences
  • Removing the UnescapeAll flag from UnescapeMode enum
  • Renaming CopyOnly to None in UnescapeMode enum
  • Removing unused unsafe code and the System.Runtime.InteropServices import
  • Consolidating various unescape code paths to use the new Unescape method where full unescaping is desired

Key changes:

  • New UriHelper.Unescape() method provides simple unconditional unescaping
  • Callsites that previously used UnescapeMode.Unescape | UnescapeMode.UnescapeAll now use the simpler Unescape() method
  • The more complex UnescapeString() method is retained for cases requiring conditional escaping/unescaping with reserved character handling

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| UriHelper.cs | Adds new Unescape() method, refactors UnescapeString() to use span-based operations instead of pointers, removes unsafe code |
| UriExt.cs | Updates UnescapeDataString and TryUnescapeDataString to use new Unescape() method |
| UriEnumTypes.cs | Renames CopyOnly to None, removes UnescapeAll flag |
| Uri.cs | Updates multiple callsites to use new Unescape() method or simplified logic, removes pointer-based operations |
| PercentEncodingHelper.cs | Adds scoped modifier to parameter, minor cleanup |

Comments suppressed due to low confidence (1)

src/libraries/System.Private.Uri/src/System/Uri.cs:1081

  • Potential index out of bounds exception for UNC paths. This code assumes index 1 contains a drive letter separator, but for UNC paths, result[1] would be the second backslash character. This check should be conditional on IsDosPath or include bounds checking.
                if (result[1] == '|')
                    result[1] = ':';
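
For reference, a guarded form of this kind of check could look like the sketch below. It is only an illustration: `DrivePathSketch` and `NormalizeDriveSeparator` are hypothetical names, not code from Uri.cs, and whether the real code path can ever see a buffer shorter than two characters is not established in this thread. For a UNC path the second character is a backslash, so the comparison with `'|'` is simply false.

```csharp
using System;

// Hypothetical helper for illustration; not the code in Uri.cs.
static class DrivePathSketch
{
    // Rewrites the legacy "C|" drive form to "C:" only when the buffer
    // really starts with an ASCII drive letter followed by '|'.
    public static void NormalizeDriveSeparator(Span<char> path)
    {
        if (path.Length >= 2 && IsAsciiLetter(path[0]) && path[1] == '|')
        {
            path[1] = ':';
        }
    }

    private static bool IsAsciiLetter(char c) =>
        c is (>= 'a' and <= 'z') or (>= 'A' and <= 'Z');
}
```

The length check makes the helper safe for empty or one-character buffers, and the drive-letter check leaves UNC paths such as `\\server\share` unchanged.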

@MihaZupan MihaZupan requested a review from a team November 1, 2025 18:35

@EgorBo (Member) left a comment

Nice!

@MihaZupan (Member, Author) commented

/ba-g android timing out

@MihaZupan MihaZupan merged commit 0e0759a into dotnet:main Nov 3, 2025
81 of 86 checks passed
Linked issue: [Uri] Rewrite UriHelper.UnescapeString to remove unsafe code and improve perf

4 participants