Use IndexOf in WebUtility (#70700)
The IndexOfHtmlDecodingChars method was iterating character by character looking for either a `&` or a surrogate, but the slow path taken when one of those is found doesn't special-case surrogates. So we can collapse this to a vectorized `IndexOf('&')`, which makes the fast path of detecting whether there's anything to decode much faster when there's any meaningful amount of input before the first `&`. (I experimented with also using `IndexOf('&')` in the main routine, but it made cases with lots of entities slower, so I'm not including that here.)
stephentoub authored Jun 14, 2022
1 parent 84f7cad commit a361f7f
Showing 1 changed file with 2 additions and 17 deletions.
19 changes: 2 additions & 17 deletions src/libraries/System.Private.CoreLib/src/System/Net/WebUtility.cs
@@ -185,7 +185,7 @@ private static void HtmlEncode(ReadOnlySpan<char> input, ref ValueStringBuilder

ReadOnlySpan<char> valueSpan = value.AsSpan();

- int index = IndexOfHtmlDecodingChars(valueSpan);
+ int index = valueSpan.IndexOf('&');
if (index < 0)
{
return value;
@@ -215,7 +215,7 @@ public static void HtmlDecode(string? value, TextWriter output)

ReadOnlySpan<char> valueSpan = value.AsSpan();

- int index = IndexOfHtmlDecodingChars(valueSpan);
+ int index = valueSpan.IndexOf('&');
if (index == -1)
{
output.Write(value);
@@ -701,21 +701,6 @@ private static bool ValidateUrlEncodingParameters(byte[]? bytes, int offset, int
return true;
}

- private static int IndexOfHtmlDecodingChars(ReadOnlySpan<char> input)
- {
-     // this string requires html decoding if it contains '&' or a surrogate character
-     for (int i = 0; i < input.Length; i++)
-     {
-         char c = input[i];
-         if (c == '&' || char.IsSurrogate(c))
-         {
-             return i;
-         }
-     }
-
-     return -1;
- }
-
#endregion

// Internal struct to facilitate URL decoding -- keeps char buffer and byte buffer, allows appending of either chars or bytes
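
For readers who want to see the behavioral difference described in the commit message, here is a minimal standalone sketch, not part of the commit. It copies the removed `IndexOfHtmlDecodingChars` helper from the diff and compares its result with `IndexOf('&')` on a few inputs. The class name `IndexOfSketch` and the sample strings are assumptions made for illustration only.

```csharp
using System;
using System.Net;

// Hypothetical demo class; only the helper body comes from the diff above.
public static class IndexOfSketch
{
    // The removed helper: stops at the first '&' or the first surrogate.
    public static int IndexOfHtmlDecodingChars(ReadOnlySpan<char> input)
    {
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '&' || char.IsSurrogate(c))
            {
                return i;
            }
        }

        return -1;
    }

    public static void Main()
    {
        // Illustrative inputs: no entity, one entity, and two strings that
        // start with a surrogate pair (U+1F600), with and without an entity.
        string[] samples =
        {
            "plain text",
            "a &amp; b",
            "\U0001F600 smile",
            "\U0001F600 &lt;",
        };

        foreach (string s in samples)
        {
            ReadOnlySpan<char> span = s.AsSpan();
            int oldIndex = IndexOfHtmlDecodingChars(span);
            int newIndex = span.IndexOf('&'); // the replacement used in the commit
            Console.WriteLine($"old={oldIndex} new={newIndex} decoded={WebUtility.HtmlDecode(s)}");
        }
    }
}
```

The two searches agree whenever the input has no surrogate before the first `&`. When a surrogate comes first, the old helper returns earlier, which only sends the caller into the slow path sooner; per the commit message, that path does not treat surrogates specially, so the decoded output should not depend on which index is used.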