Description
MultipartReaderStream currently uses a Boyer-Moore substring search implementation as part of finding the next boundary:
aspnetcore/src/Http/WebUtilities/src/MultipartReaderStream.cs
Lines 279 to 294 in 53aad98
    var matchBytesLengthMinusOne = matchBytes.Length - 1;
    var matchBytesLastByte = matchBytes[matchBytesLengthMinusOne];
    var segmentEndMinusMatchBytesLength = segment1.Offset + segment1.Count - matchBytes.Length;
    matchOffset = segment1.Offset;
    while (matchOffset < segmentEndMinusMatchBytesLength)
    {
        var lookaheadTailChar = segment1.Array![matchOffset + matchBytesLengthMinusOne];
        if (lookaheadTailChar == matchBytesLastByte &&
            CompareBuffers(segment1.Array, matchOffset, matchBytes, 0, matchBytesLengthMinusOne) == 0)
        {
            matchCount = matchBytes.Length;
            return true;
        }
        matchOffset += _boundary.GetSkipValue(lookaheadTailChar);
    }
We should check whether this is still more beneficial than just using the vectorized IndexOf:
https://github.com/dotnet/runtime/blob/7b91fd42a64732681472afc8d5a52c5bc5eb0c8a/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs#L1694
Regex used to use Boyer-Moore as well, but in .NET 7 it deleted its use of Boyer-Moore entirely and now just uses IndexOf, for significant gains in most cases. See https://devblogs.microsoft.com/dotnet/regular-expression-improvements-in-dotnet-7/#leading-vectorization for details.
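For discussion, here is a rough sketch of what the full-match part of the search might look like with the span-based IndexOf. `BoundarySearchSketch` and `TryFindFullMatch` are hypothetical names, not existing members of MultipartReaderStream, and the sketch deliberately ignores any handling of a partial boundary at the end of the segment, which the real method would still need.

```csharp
using System;

static class BoundarySearchSketch
{
    // Hypothetical helper: looks for a complete occurrence of matchBytes
    // inside segment1 using the vectorized span IndexOf instead of the
    // hand-rolled skip-table loop.
    public static bool TryFindFullMatch(
        ArraySegment<byte> segment1,
        ReadOnlySpan<byte> matchBytes,
        out int matchOffset)
    {
        // AsSpan() covers exactly [Offset, Offset + Count) of the backing array.
        ReadOnlySpan<byte> haystack = segment1.AsSpan();

        // IndexOf returns an index relative to the start of the span, or -1.
        int index = haystack.IndexOf(matchBytes);
        if (index >= 0)
        {
            // Translate back to an offset into the backing array, which is
            // what the existing code reports through matchOffset.
            matchOffset = segment1.Offset + index;
            return true;
        }

        // Assumption: on a miss, report the segment start. The real method
        // would go on to check for a partial match at the end of the segment.
        matchOffset = segment1.Offset;
        return false;
    }
}
```

Whether this is actually faster for typical boundary lengths and buffer sizes would need to be measured against the existing loop before switching.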