Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Span.Slice by using IsKnownConstant for start = 0 #69193

Closed

Conversation

GrabYourPitchforks
Copy link
Member

There are numerous places in the runtime where we call Span<T>.Slice(0, ...). This currently incurs needless expense due to the length argument and _length field both being zero-extended to 64 bits before comparison.

We can use the IsKnownConstant helper and check for the condition start = 0 to avoid the extension to 64 bits. There's not a significant throughput improvement since zero-extensions are already fast, but it does provide the opportunity to be more conservative with register usage, and it eliminates wasteful codegen. These could give second-order performance and space saving benefits.

Crossgen doesn't seem to be inlining this method across assemblies, but here's the diff just within corelib.

Found 274 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 37308702
Total bytes of diff: 37307222
Total bytes of delta: -1480 (-0.00 % of base)
Total relative delta: NaN
    diff is an improvement.
    relative diff is a regression.


Top file improvements (bytes):
       -1480 : System.Private.CoreLib.dasm (-0.05% of base)

1 total files with Code Size differences (1 improved, 0 regressed), 271 unchanged.

Top method improvements (bytes):
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Byte],System.Span`1[System.Byte],int,System.Collections.Generic.IComparer`1[System.Byte])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Double],System.Span`1[System.Double],int,System.Collections.Generic.IComparer`1[System.Double])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int16],System.Span`1[System.Int16],int,System.Collections.Generic.IComparer`1[System.Int16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int32],System.Span`1[System.Int32],int,System.Collections.Generic.IComparer`1[System.Int32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int64],System.Span`1[System.Int64],int,System.Collections.Generic.IComparer`1[System.Int64])
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.SByte],System.Span`1[System.SByte],int,System.Collections.Generic.IComparer`1[System.SByte])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Single],System.Span`1[System.Single],int,System.Collections.Generic.IComparer`1[System.Single])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt16],System.Span`1[System.UInt16],int,System.Collections.Generic.IComparer`1[System.UInt16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt32],System.Span`1[System.UInt32],int,System.Collections.Generic.IComparer`1[System.UInt32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt64],System.Span`1[System.UInt64],int,System.Collections.Generic.IComparer`1[System.UInt64])
         -24 (-1.69% of base) : System.Private.CoreLib.dasm - System.IO.PathHelper:TryExpandShortFileName(byref,System.String):System.String
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Byte],int,System.Comparison`1[System.Byte])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Diagnostics.Tracing.EventProvider+SessionInfo],int,System.Comparison`1[System.Diagnostics.Tracing.EventProvider+SessionInfo])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Double],int,System.Comparison`1[System.Double])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int16],int,System.Comparison`1[System.Int16])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int32],int,System.Comparison`1[System.Int32])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int64],int,System.Comparison`1[System.Int64])
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.SByte],int,System.Comparison`1[System.SByte])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Single],int,System.Comparison`1[System.Single])
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Threading.Tasks.ConcurrentExclusiveSchedulerPair+ProcessingMode],int,System.Comparison`1[System.Threading.Tasks.ConcurrentExclusiveSchedulerPair+ProcessingMode])

Top method improvements (percentages):
          -7 (-14.89% of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:get_Text():System.ReadOnlySpan`1[System.Char]:this
          -7 (-14.89% of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AsSpan():System.ReadOnlySpan`1[System.Char]:this
          -6 (-9.23% of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:ToString():System.String:this
          -6 (-6.74% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:TrimEnd(System.ReadOnlySpan`1[System.Char],ushort):System.ReadOnlySpan`1[System.Char]
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.__Canon]:this
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.Byte]:this
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.Char]:this
          -7 (-6.14% of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactLiteral(byref,byref):bool
          -6 (-6.12% of base) : System.Private.CoreLib.dasm - Kernel32:GetAndTrimString(System.Span`1[System.Char]):System.String
         -17 (-5.94% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:TrimEnd(System.ReadOnlyMemory`1[System.Char]):System.ReadOnlyMemory`1[System.Char]
          -6 (-5.71% of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AsSpan(bool):System.ReadOnlySpan`1[System.Char]:this
         -10 (-5.49% of base) : System.Private.CoreLib.dasm - <>c:<ToLower>b__2_0(System.Span`1[System.Char],System.ValueTuple`2[System.String, System.Int32]):this
         -10 (-5.49% of base) : System.Private.CoreLib.dasm - <>c:<ToUpper>b__3_0(System.Span`1[System.Char],System.ValueTuple`2[System.String, System.Int32]):this
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Byte],System.Span`1[System.Byte],int,System.Collections.Generic.IComparer`1[System.Byte])
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.SByte],System.Span`1[System.SByte],int,System.Collections.Generic.IComparer`1[System.SByte])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int16],System.Span`1[System.Int16],int,System.Collections.Generic.IComparer`1[System.Int16])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt16],System.Span`1[System.UInt16],int,System.Collections.Generic.IComparer`1[System.UInt16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Double],System.Span`1[System.Double],int,System.Collections.Generic.IComparer`1[System.Double])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int32],System.Span`1[System.Int32],int,System.Collections.Generic.IComparer`1[System.Int32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int64],System.Span`1[System.Int64],int,System.Collections.Generic.IComparer`1[System.Int64])

142 total methods with Code Size differences (142 improved, 0 regressed), 254461 unchanged.

--------------------------------------------------------------------------------

@ghost
Copy link

ghost commented May 11, 2022

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

There are numerous places in the runtime where we call Span<T>.Slice(0, ...). This currently incurs needless expense due to the length argument and _length field both being zero-extended to 64 bits before comparison.

We can use the IsKnownConstant helper and check for the condition start = 0 to avoid the extension to 64 bits. There's not a significant throughput improvement since zero-extensions are already fast, but it does provide the opportunity to be more conservative with register usage, and it eliminates wasteful codegen. These could give second-order performance and space saving benefits.

Crossgen doesn't seem to be inlining this method across assemblies, but here's the diff just within corelib.

Found 274 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 37308702
Total bytes of diff: 37307222
Total bytes of delta: -1480 (-0.00 % of base)
Total relative delta: NaN
    diff is an improvement.
    relative diff is a regression.


Top file improvements (bytes):
       -1480 : System.Private.CoreLib.dasm (-0.05% of base)

1 total files with Code Size differences (1 improved, 0 regressed), 271 unchanged.

Top method improvements (bytes):
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Byte],System.Span`1[System.Byte],int,System.Collections.Generic.IComparer`1[System.Byte])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Double],System.Span`1[System.Double],int,System.Collections.Generic.IComparer`1[System.Double])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int16],System.Span`1[System.Int16],int,System.Collections.Generic.IComparer`1[System.Int16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int32],System.Span`1[System.Int32],int,System.Collections.Generic.IComparer`1[System.Int32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int64],System.Span`1[System.Int64],int,System.Collections.Generic.IComparer`1[System.Int64])
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.SByte],System.Span`1[System.SByte],int,System.Collections.Generic.IComparer`1[System.SByte])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Single],System.Span`1[System.Single],int,System.Collections.Generic.IComparer`1[System.Single])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt16],System.Span`1[System.UInt16],int,System.Collections.Generic.IComparer`1[System.UInt16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt32],System.Span`1[System.UInt32],int,System.Collections.Generic.IComparer`1[System.UInt32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt64],System.Span`1[System.UInt64],int,System.Collections.Generic.IComparer`1[System.UInt64])
         -24 (-1.69% of base) : System.Private.CoreLib.dasm - System.IO.PathHelper:TryExpandShortFileName(byref,System.String):System.String
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Byte],int,System.Comparison`1[System.Byte])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Diagnostics.Tracing.EventProvider+SessionInfo],int,System.Comparison`1[System.Diagnostics.Tracing.EventProvider+SessionInfo])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Double],int,System.Comparison`1[System.Double])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int16],int,System.Comparison`1[System.Int16])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int32],int,System.Comparison`1[System.Int32])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Int64],int,System.Comparison`1[System.Int64])
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.SByte],int,System.Comparison`1[System.SByte])
         -23 (-5.35% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Single],int,System.Comparison`1[System.Single])
         -23 (-5.36% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`1:IntroSort(System.Span`1[System.Threading.Tasks.ConcurrentExclusiveSchedulerPair+ProcessingMode],int,System.Comparison`1[System.Threading.Tasks.ConcurrentExclusiveSchedulerPair+ProcessingMode])

Top method improvements (percentages):
          -7 (-14.89% of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:get_Text():System.ReadOnlySpan`1[System.Char]:this
          -7 (-14.89% of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AsSpan():System.ReadOnlySpan`1[System.Char]:this
          -6 (-9.23% of base) : System.Private.CoreLib.dasm - System.Runtime.CompilerServices.DefaultInterpolatedStringHandler:ToString():System.String:this
          -6 (-6.74% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:TrimEnd(System.ReadOnlySpan`1[System.Char],ushort):System.ReadOnlySpan`1[System.Char]
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.__Canon]:this
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.Byte]:this
          -5 (-6.41% of base) : System.Private.CoreLib.dasm - System.ReadOnlyMemory`1:Slice(int,int):System.ReadOnlyMemory`1[System.Char]:this
          -7 (-6.14% of base) : System.Private.CoreLib.dasm - System.Globalization.TimeSpanParse:ParseExactLiteral(byref,byref):bool
          -6 (-6.12% of base) : System.Private.CoreLib.dasm - Kernel32:GetAndTrimString(System.Span`1[System.Char]):System.String
         -17 (-5.94% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:TrimEnd(System.ReadOnlyMemory`1[System.Char]):System.ReadOnlyMemory`1[System.Char]
          -6 (-5.71% of base) : System.Private.CoreLib.dasm - System.Text.ValueStringBuilder:AsSpan(bool):System.ReadOnlySpan`1[System.Char]:this
         -10 (-5.49% of base) : System.Private.CoreLib.dasm - <>c:<ToLower>b__2_0(System.Span`1[System.Char],System.ValueTuple`2[System.String, System.Int32]):this
         -10 (-5.49% of base) : System.Private.CoreLib.dasm - <>c:<ToUpper>b__3_0(System.Span`1[System.Char],System.ValueTuple`2[System.String, System.Int32]):this
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Byte],System.Span`1[System.Byte],int,System.Collections.Generic.IComparer`1[System.Byte])
         -35 (-5.45% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.SByte],System.Span`1[System.SByte],int,System.Collections.Generic.IComparer`1[System.SByte])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int16],System.Span`1[System.Int16],int,System.Collections.Generic.IComparer`1[System.Int16])
         -35 (-5.41% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.UInt16],System.Span`1[System.UInt16],int,System.Collections.Generic.IComparer`1[System.UInt16])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Double],System.Span`1[System.Double],int,System.Collections.Generic.IComparer`1[System.Double])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int32],System.Span`1[System.Int32],int,System.Collections.Generic.IComparer`1[System.Int32])
         -35 (-5.40% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.ArraySortHelper`2:IntroSort(System.Span`1[System.Int64],System.Span`1[System.Int64],int,System.Collections.Generic.IComparer`1[System.Int64])

142 total methods with Code Size differences (142 improved, 0 regressed), 254461 unchanged.

--------------------------------------------------------------------------------
Author: GrabYourPitchforks
Assignees: -
Labels:

area-System.Runtime, optimization

Milestone: -

@tannergooding
Copy link
Member

tannergooding commented May 11, 2022

@GrabYourPitchforks given this method has AggressiveInlining already, I think it may be more beneficial to determine why the JIT is emitting the movzx despite start being a known constant 0.

This is likely a broader issue and one that ideally we would optimize when we have a constant or known positive value (such as Array.Length)

@jakobbotsch
Copy link
Member

What do diffs look like for this when start is a non-zero constant?

@jakobbotsch
Copy link
Member

FWIW, there were some concerns with this pattern the last time we tried to use it, see #67285 (comment) and #64821.

It might not matter for methods that already have aggressive inlining, however.

@GrabYourPitchforks
Copy link
Member Author

GrabYourPitchforks commented May 11, 2022

I think it may be more beneficial to determine why the JIT is emitting the movzx despite start being a known constant 0.

When start is the constant 0, it turns into the comparison if ((ulong)(uint)length > (ulong)(uint)_length). Both length and _length are extended to 64-bits before the comparison. JIT seemingly does not know that zero-extending two values and performing a comparison would yield the same result as leaving the two values in their original widths and performing the comparison.

@jakobbotsch
Copy link
Member

JIT seemingly does not know that zero-extending two values and performing a comparison would yield the same result as leaving the two values in their original widths and performing the comparison.

I don't think this should be difficult to fix in the JIT (famous last words).

@GrabYourPitchforks
Copy link
Member Author

@jakobbotsch Opened #69200 to track. I'll move this PR to draft for now while that issue is being discussed.

@GrabYourPitchforks GrabYourPitchforks marked this pull request as draft May 11, 2022 18:12
@GrabYourPitchforks
Copy link
Member Author

What do diffs look like for this when start is a non-zero constant?

I didn't observe any crossgen diff changes for start != 0. I didn't crack open the tests in a debugger to observe runtime JIT behaviors.

@GrabYourPitchforks
Copy link
Member Author

Superseded by #69247.

@ghost ghost locked as resolved and limited conversation to collaborators Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants