-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT perf: Span.Slice(0, ...) emits unnecessary 64-bit extensions #69200
Comments
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsWhen calling public class C {
public static void M(ReadOnlySpan<byte> span, int offset, int length) {
DoIt(span.Slice(offset, length));
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static void DoIt(ReadOnlySpan<byte> span)
{
}
} C.M(System.ReadOnlySpan`1<Byte>, Int32, Int32)
L0000: sub rsp, 0x38
L0004: xor eax, eax
L0006: mov [rsp+0x28], rax
L000b: mov eax, edx ; rax := 64-bit zero-extension of offset
L000d: mov edx, r8d ; rdx := 64-bit zero-extension of length
L0010: add rdx, rax ; rdx := sum of offset and length
L0013: mov r9d, [rcx+8] ; r9 := 64-bit zero-extension of span.Length
L0017: cmp rdx, r9 ; bounds check
L001a: ja short L003c
L001c: mov rcx, [rcx]
L001f: add rcx, rax
L0022: mov [rsp+0x28], rcx
L0027: mov [rsp+0x30], r8d
L002c: lea rcx, [rsp+0x28]
L0031: call 0x00007ffbf1df0020
L0036: nop
L0037: add rsp, 0x38
L003b: ret
L003c: call 0x00007ffbe86e5778
L0041: int3 However, when start = 0 is a known constant, JIT still emits the 64-bit extension before performing the comparison. This is unnecessary because a comparison of two zero-extended values will yield the same result as a comparison of the original values. It also burns a register. public static void M(ReadOnlySpan<byte> span, int length) {
DoIt(span.Slice(0, length));
} C.M(System.ReadOnlySpan`1<Byte>, Int32)
L0000: sub rsp, 0x38
L0004: xor eax, eax
L0006: mov [rsp+0x28], rax
L000b: mov eax, edx ; rax := unnecessary 64-bit zero-extension of length
L000d: mov r8d, [rcx+8] ; r8 := unnecessary 64-bit zero-extension of span.Length
L0011: cmp rax, r8 ; bounds check
L0014: ja short L0032
L0016: mov rcx, [rcx]
L0019: mov [rsp+0x28], rcx
L001e: mov [rsp+0x30], edx
L0022: lea rcx, [rsp+0x28]
L0027: call 0x00007ffbf1e10020
L002c: nop
L002d: add rsp, 0x38
L0031: ret
L0032: call 0x00007ffbe86e5778
L0037: int3 In the above example, I would have expected the three annotated instructions to be collapsed into the single instruction
|
Yes, this looks quite easy. We end up with [000021] N---G+-N-U- └──▌ LE int
[000015] -----+---U- ├──▌ CAST long <- ulong <- uint
[000003] -----+----- │ └──▌ LCL_VAR int V01 arg1
[000020] ----G+---U- └──▌ CAST long <- ulong <- uint
[000092] n---G+----- └──▌ IND int
[000094] -----+----- └──▌ ADD byref
[000019] -----+----- ├──▌ LCL_VAR byref V00 arg0
[000093] -----+----- └──▌ CNS_INT long 8 field offset Fseq[_length] in morph. |
When calling
Span<T>.Slice(offset, length)
, the internal implementation 64-bit extends offset and length, then compares their sum against span.Length. This is an optimization that allows us to get away with a single comparison.However, when start = 0 is a known constant, JIT still emits the 64-bit extension before performing the comparison. This is unnecessary because a comparison of two zero-extended values will yield the same result as a comparison of the original values. It also burns a register.
In the above example, I would have expected the three annotated instructions to be collapsed into the single instruction
cmp edx, dword ptr [rcx + 8]
.The text was updated successfully, but these errors were encountered: