Explore switching MultipartReaderStream to use IndexOf #49223

@stephentoub

MultipartReaderStream currently uses a Boyer-Moore substring search implementation as part of finding the next boundary:

    var matchBytesLengthMinusOne = matchBytes.Length - 1;
    var matchBytesLastByte = matchBytes[matchBytesLengthMinusOne];
    var segmentEndMinusMatchBytesLength = segment1.Offset + segment1.Count - matchBytes.Length;
    matchOffset = segment1.Offset;
    while (matchOffset < segmentEndMinusMatchBytesLength)
    {
        var lookaheadTailChar = segment1.Array![matchOffset + matchBytesLengthMinusOne];
        if (lookaheadTailChar == matchBytesLastByte &&
            CompareBuffers(segment1.Array, matchOffset, matchBytes, 0, matchBytesLengthMinusOne) == 0)
        {
            matchCount = matchBytes.Length;
            return true;
        }
        matchOffset += _boundary.GetSkipValue(lookaheadTailChar);
    }
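
The snippet above relies on `_boundary.GetSkipValue`, which is not shown. As a rough illustration of the idea only, here is a minimal sketch of a Horspool-style skip table keyed on the last byte of the current window. `HorspoolSkipTable` is a hypothetical name and this is not taken from the actual boundary type; the real implementation may compute its values differently.

```csharp
using System;

// Hypothetical stand-in for the object behind _boundary; illustrative only.
public sealed class HorspoolSkipTable
{
    // One entry per possible byte value.
    private readonly int[] _skip = new int[256];

    public HorspoolSkipTable(ReadOnlySpan<byte> pattern)
    {
        if (pattern.IsEmpty)
        {
            // A zero skip would make the search loop spin forever.
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));
        }

        // A byte that never occurs in the pattern allows a jump of the full pattern length.
        Array.Fill(_skip, pattern.Length);

        // For bytes that do occur (ignoring the final position), jump just far enough
        // to line up their rightmost occurrence with the byte that was examined.
        for (int i = 0; i < pattern.Length - 1; i++)
        {
            _skip[pattern[i]] = pattern.Length - 1 - i;
        }
    }

    // Returns how far the search window may advance after seeing this tail byte.
    public int GetSkipValue(byte lookaheadTailByte) => _skip[lookaheadTailByte];
}
```

Every skip value is at least 1, so the `while` loop always makes progress. The cost is one table lookup and one scalar comparison per step, which is the part a vectorized search might beat.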

We should check whether this is still faster than simply using the vectorized IndexOf:
https://github.com/dotnet/runtime/blob/7b91fd42a64732681472afc8d5a52c5bc5eb0c8a/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs#L1694

Regex also used to use Boyer-Moore, and in .NET 7 it dropped Boyer-Moore entirely in favor of IndexOf, with significant gains in most cases. See https://devblogs.microsoft.com/dotnet/regular-expression-improvements-in-dotnet-7/#leading-vectorization for details.
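
For comparison, a minimal sketch of what an IndexOf-based full-match search could look like. `BoundarySearch.FindBoundary` is a hypothetical helper, not existing MultipartReaderStream code. It only finds a boundary that lies entirely within the segment; a boundary split across the end of the segment would still need separate handling.

```csharp
using System;

public static class BoundarySearch
{
    // Hypothetical helper: returns the offset into segment.Array of the first full
    // occurrence of matchBytes, or -1 if matchBytes does not occur in the segment.
    public static int FindBoundary(ArraySegment<byte> segment, ReadOnlySpan<byte> matchBytes)
    {
        // AsSpan() covers exactly segment.Offset .. segment.Offset + segment.Count.
        ReadOnlySpan<byte> haystack = segment.AsSpan();

        // MemoryExtensions.IndexOf(ReadOnlySpan<T>, ReadOnlySpan<T>) does the substring search.
        int index = haystack.IndexOf(matchBytes);

        // Translate the span-relative index back into an array offset,
        // which is what matchOffset holds in the existing code.
        return index < 0 ? -1 : segment.Offset + index;
    }
}
```

Whether this beats the skip-table loop probably depends on boundary length and on how often the boundary's leading bytes appear in the payload, so it would need to be measured on realistic multipart bodies.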

Labels: Perf, area-networking (includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions)
