Strange Span costs for as Memory.Span -> parameter #32396
Updated with |
|
Main difference between Slow and Fast:

Slow

G_M41396_IG01:
       push     r14
       push     rdi
       push     rsi
       push     rbp
       push     rbx
       sub      rsp, 96
       mov      rsi, rcx
       lea      rdi, [rsp+20H]
       mov      ecx, 16
       xor      rax, rax
       rep stosd
       mov      rcx, rsi
       mov      rsi, rcx

Fast

G_M24512_IG01:
       push     r15
       push     r14
       push     rdi
       push     rsi
       push     rbp
       push     rbx
       sub      rsp, 72
       xor      rax, rax
       mov      qword ptr [rsp+38H], rax
       mov      qword ptr [rsp+40H], rax
       mov      qword ptr [rsp+28H], rax
       mov      qword ptr [rsp+30H], rax
       mov      rsi, rcx
We have an issue #8890 for improving heuristics for prolog zeroing. Ideally the jit would reverse copy-prop and construct the There are also some promoted spans in the mix, so the code is also moving span fields from stack to register pairs in places. The jit would be better off not promoting as the code never computes with the span fields, just passes them around, and they ultimately have to end up on the stack. But those are tricky heuristics to get right.
The ABI costs for spans in simple examples like these should be somewhat lower on SysV, however the jit does not yet take full advantage of this. Since there are GC refs in spans, using |
Ah, it hits the 16 byte limit so moves to |
Can costs for passing Spans as parameters be reduced? (e.g. by passing in xmm registers).

The current costs may make passing pointers or refs more attractive; which is undesirable as it discards the bounding safety provided by the Spans.

Noticed in #32371 (comment) where the cost of using Span parameters for the method is higher than the method's time taken to test whether two sets of 4096 bytes are equal (test on Windows)
Created gist benchmark https://gist.github.com/benaadams/56af11cf7f8e0e1da3fed47464414f8a to demonstrate:
All of the methods create Spans from the same Memory<byte>:
- PassSpansByParam - passes the created Span<byte> as parameters
- PassDeconstructedSpans - turns the Span<byte> into ref byte and int and passes those
- DeferSpansCreation - passes no parameters and creates the Span<byte> in the callee
- PassSpansByParamTwice - passes the created Span<byte> as parameters; then passes them through to a second method in different param positions
- PassReconstructedSpansParam - turns the Span<byte> into ref byte and int and passes those; then recreates the Spans from the params and passes those created Spans on to a second method.

PassSpansByParamTwice is there to see if it's purely span passing (i.e. is the cost directly additive); it doesn't add the same cost on again, so it seems to be more than purely parameter passing.

PassReconstructedSpansParam does pass the Spans as parameters, just not from the original method that gets them from the Memory<byte>, and it has a much lower cost, even though it now involves an extra non-inlined method call, which is even stranger?

category:cq
theme:optimization
skill-level:expert
cost:medium
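
For readers without the gist open, here is a rough sketch of the five call shapes described in the issue. It is not the gist's code: the helper names (EqualsSpan, EqualsRef, EqualsDeferred, PassThrough, Reconstruct), the field names and the 4096-byte zeroed buffers are all assumptions made for illustration, and there is no benchmark harness around it.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class SpanPassingSketch
{
    // Hypothetical backing buffers; the gist's actual setup may differ.
    private static readonly Memory<byte> s_first = new byte[4096];
    private static readonly Memory<byte> s_second = new byte[4096];

    // Spans are created in the caller and passed as parameters.
    public static bool PassSpansByParam() => EqualsSpan(s_first.Span, s_second.Span);

    // Spans are split into (ref byte, int) pairs and those are passed instead.
    public static bool PassDeconstructedSpans()
    {
        Span<byte> a = s_first.Span, b = s_second.Span;
        return EqualsRef(ref MemoryMarshal.GetReference(a), a.Length,
                         ref MemoryMarshal.GetReference(b), b.Length);
    }

    // Nothing is passed; the callee creates the spans itself.
    public static bool DeferSpansCreation() => EqualsDeferred();

    // Spans are passed once, then forwarded in swapped parameter positions.
    public static bool PassSpansByParamTwice() => PassThrough(s_first.Span, s_second.Span);

    // (ref byte, int) pairs are passed, then turned back into spans and forwarded.
    public static bool PassReconstructedSpansParam()
    {
        Span<byte> a = s_first.Span, b = s_second.Span;
        return Reconstruct(ref MemoryMarshal.GetReference(a), a.Length,
                           ref MemoryMarshal.GetReference(b), b.Length);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool EqualsSpan(Span<byte> a, Span<byte> b) => a.SequenceEqual(b);

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool EqualsRef(ref byte a, int aLength, ref byte b, int bLength)
    {
        if (aLength != bLength) return false;
        for (int i = 0; i < aLength; i++)
        {
            if (Unsafe.Add(ref a, i) != Unsafe.Add(ref b, i)) return false;
        }
        return true;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool EqualsDeferred() => s_first.Span.SequenceEqual(s_second.Span);

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool PassThrough(Span<byte> a, Span<byte> b) => EqualsSpan(b, a);

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static bool Reconstruct(ref byte a, int aLength, ref byte b, int bLength)
        => EqualsSpan(MemoryMarshal.CreateSpan(ref a, aLength),
                      MemoryMarshal.CreateSpan(ref b, bLength));
}
```

NoInlining is used on the callees so the spans (or their pieces) really do cross a call boundary; whether the gist does exactly this is an assumption. To reproduce the numbers, the five public methods would need to be wrapped in a benchmark harness.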