JIT should suppress zero-extending same-register moves in more scenarios #12402

Open
GrabYourPitchforks opened this issue Apr 2, 2019 · 3 comments
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), JitUntriaged (CLR JIT issues needing additional triage), optimization

@GrabYourPitchforks
Member

Related to dotnet/coreclr#22454. An optimization was previously introduced in coreclr which eliminates unnecessary mov instructions when zero-extending registers. However, that optimization only looks back one instruction to determine if the elimination is worthwhile.

Per dotnet/coreclr#23665, we have evidence that there's benefit to be realized from looking back more than one instruction when performing this optimization. We should be more aggressive about eliminating these mov instructions.
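To make the request concrete, here is a toy Python model of the lookback. It is a sketch, not the JIT's actual code: the `(opcode, dest, src)` tuple format, the `PARENT` and `NON_WRITING` tables, and `is_redundant_zero_extend` are all hypothetical names invented for illustration. It relies on the x64 rule that writing a 32-bit register zeroes the upper 32 bits of the containing 64-bit register, and it assumes straight-line code inside a single basic block.

```python
# Hypothetical table: 32-bit register name -> containing 64-bit register.
PARENT = {"eax": "rax", "ebx": "rbx", "ecx": "rcx", "edx": "rdx"}

# Opcodes that read their first operand but do not write it.
NON_WRITING = {"cmp", "test"}


def is_redundant_zero_extend(instrs, index, max_lookback):
    """Return True if instrs[index] is a `mov r32, r32` on the same register
    whose upper 32 bits are already known to be zero.

    instrs is a list of (opcode, dest, src) tuples for one basic block.
    Only the previous max_lookback instructions are examined.
    """
    op, dest, src = instrs[index]
    if op != "mov" or dest != src or dest not in PARENT:
        return False
    start = max(0, index - max_lookback)
    for prev_op, prev_dest, _ in reversed(instrs[start:index]):
        if prev_op in NON_WRITING:
            continue
        if prev_dest == dest:
            return True   # a 32-bit write already zeroed the upper half
        if prev_dest == PARENT[dest]:
            return False  # a 64-bit write leaves the upper half unknown
    return False          # nothing found in the window: stay conservative


block = [
    ("mov", "eax", "ecx"),  # 32-bit write: upper half of rax is now zero
    ("add", "ebx", "edx"),  # unrelated instruction in between
    ("mov", "eax", "eax"),  # zero-extending same-register move
]
one_back = is_redundant_zero_extend(block, 2, max_lookback=1)  # False
two_back = is_redundant_zero_extend(block, 2, max_lookback=2)  # True
```

With a window of one instruction the unrelated `add` hides the earlier write, so the `mov eax, eax` survives; widening the window to two finds it.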

category:cq
theme:basic-cq
skill-level:intermediate
cost:medium

@AndyAyersMS
Member

A general fix here would do the proper up-front analysis instead of generating unnecessary instructions and then removing them.
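One possible reading of "up-front analysis" is to track, while emitting a block, which registers are already known to have zero upper bits, so the redundant move is never generated. The Python sketch below is a toy under that assumption and not how the JIT is structured: `emit_block`, `PARENT`, `NON_WRITING`, and the tuple format are hypothetical, and only straight-line code in one basic block is modelled.

```python
# Hypothetical table: 32-bit register name -> containing 64-bit register.
PARENT = {"eax": "rax", "ebx": "rbx", "ecx": "rcx", "edx": "rdx"}

# Opcodes that read their first operand but do not write it.
NON_WRITING = {"cmp", "test"}


def emit_block(requests):
    """Emit (opcode, dest, src) tuples, skipping same-register 32-bit moves
    whose zero-extension is already guaranteed by earlier instructions."""
    upper_zero = set()  # 64-bit registers whose upper 32 bits are known zero
    emitted = []
    for op, dest, src in requests:
        if op in NON_WRITING:
            emitted.append((op, dest, src))
            continue
        if dest in PARENT:
            parent = PARENT[dest]
            if op == "mov" and dest == src and parent in upper_zero:
                continue            # already zero-extended: emit nothing
            upper_zero.add(parent)  # any 32-bit write zeroes the upper half
        else:
            upper_zero.discard(dest)  # 64-bit write: upper half unknown again
        emitted.append((op, dest, src))
    return emitted


requests = [
    ("mov", "eax", "ecx"),  # makes rax's upper half zero
    ("add", "rbx", "rdx"),  # unrelated 64-bit write
    ("mov", "eax", "eax"),  # skipped: rax is already zero-extended
    ("mov", "rax", "rsi"),  # 64-bit write invalidates that knowledge
    ("mov", "eax", "eax"),  # kept: the zero-extension is needed here
]
emitted = emit_block(requests)
```

Because the state is carried forward rather than rediscovered by scanning backward, there is no fixed lookback distance within the block.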

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@benaadams
Member

If the JIT looks back more than one instruction for movs, would memory loads count too? The example in #32442 (comment), from Kestrel's Http1Connection.TakeMessageHeaders, repeats the same load two and three instructions later.

Though this comes from the gist of the non-dependency version used for analysis, which starts:

       rep stosd 
       mov      rcx, rsi
       mov      qword ptr [rbp-E8H], rsp
       mov      gword ptr [rbp+10H], rcx
       mov      bword ptr [rbp+18H], rdx
       mov      bword ptr [rbp+28H], r9
       mov      esi, r8d

G_M25757_IG02:
       mov      rcx, bword ptr [rbp+18H]  ; load
       mov      rdx, gword ptr [rcx]
       mov      rcx, bword ptr [rbp+18H]  ; load same
       mov      rdi, gword ptr [rcx+8]
       mov      rcx, bword ptr [rbp+18H]  ; load same
       mov      ebx, dword ptr [rcx+16]
       and      ebx, 0xD1FFAB1E
       mov      rcx, bword ptr [rbp+18H]  ; load same
       mov      r14d, dword ptr [rcx+20]
       and      r14d, 0xD1FFAB1E
       cmp      rdx, rdi
       je       SHORT G_M25757_IG10

So could it be this instead?

G_M25757_IG02:
       mov      rcx, bword ptr [rbp+18H]  ; load
       mov      rdx, gword ptr [rcx]
       mov      rdi, gword ptr [rcx+8]
       mov      ebx, dword ptr [rcx+16]
       and      ebx, 0xD1FFAB1E
       mov      r14d, dword ptr [rcx+20]
       and      r14d, 0xD1FFAB1E
       cmp      rdx, rdi
       je       SHORT G_M25757_IG10
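The rewrite proposed above amounts to dropping a load when the same register still holds the value of the same memory operand. The Python sketch below models that on the listing; it is a toy, not JIT code. `drop_repeated_loads`, `FULL`, and the simplified operand strings (no `bword ptr` / `gword ptr` annotations, no final `je`) are assumptions made for illustration, and the model conservatively forgets everything at any store. It ignores volatile accesses, calls, and control flow.

```python
import re

# Hypothetical alias table: 32-bit register name -> containing 64-bit register.
# Registers outside this table are not modelled by the sketch.
FULL = {"eax": "rax", "ebx": "rbx", "ecx": "rcx", "edx": "rdx",
        "esi": "rsi", "edi": "rdi", "r14d": "r14"}

NON_WRITING = {"cmp", "test"}


def regs_in(mem):
    """Return the 64-bit registers used in an operand such as '[rcx+8]'."""
    return {FULL.get(r, r) for r in re.findall(r"\b[a-z]\w*", mem)}


def drop_repeated_loads(block):
    held = {}  # 64-bit register -> memory operand whose value it holds
    out = []
    for op, dest, src in block:
        if dest.startswith("["):
            held.clear()  # a store may alias any tracked location
            out.append((op, dest, src))
            continue
        reg = FULL.get(dest, dest)
        is_load = op == "mov" and src.startswith("[")
        if is_load and dest == reg and held.get(reg) == src:
            continue      # register already holds this value: drop the load
        out.append((op, dest, src))
        if op in NON_WRITING:
            continue
        # The register changed: forget its value and any address built on it.
        held.pop(reg, None)
        held = {r: m for r, m in held.items() if reg not in regs_in(m)}
        if is_load and dest == reg and reg not in regs_in(src):
            held[reg] = src
    return out


block = [
    ("mov", "rcx", "[rbp+18H]"),
    ("mov", "rdx", "[rcx]"),
    ("mov", "rcx", "[rbp+18H]"),   # repeated load
    ("mov", "rdi", "[rcx+8]"),
    ("mov", "rcx", "[rbp+18H]"),   # repeated load
    ("mov", "ebx", "[rcx+16]"),
    ("and", "ebx", "0xD1FFAB1E"),
    ("mov", "rcx", "[rbp+18H]"),   # repeated load
    ("mov", "r14d", "[rcx+20]"),
    ("and", "r14d", "0xD1FFAB1E"),
    ("cmp", "rdx", "rdi"),
]
reduced = drop_repeated_loads(block)  # 8 instructions remain
```

Whether the real JIT may do this depends on why the reloads are emitted in the first place, which the sketch does not model.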

@AndyAyersMS
Member

It would be interesting to see what the above looks like with EH Write-Thru (#543 has the changes, but they are off by default; setting COMPlus_EnableEHWriteThru=1 enables them).

cc @CarolEidt
