Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Don't give up on runtime lookup in inliner too early #81432

Closed
EgorBo opened this issue Jan 31, 2023 · 9 comments · Fixed by #99265
Closed

JIT: Don't give up on runtime lookup in inliner too early #81432

EgorBo opened this issue Jan 31, 2023 · 9 comments · Fixed by #99265
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Jan 31, 2023

static bool Test<T>() => Callee<T>();

static bool Callee<T>() => typeof(T) == typeof(int);

It is expected that Test<string> will be compiled to just return false. Instead, it's compiled down to:

; Assembly listing for method Program:<<Main>$>g__Test|0_0[System.__Canon]():bool
G_M24604_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+20H], rcx
G_M24604_IG02:
       mov      rdx, qword ptr [rcx+38H]
       mov      rdx, qword ptr [rdx+10H]
       test     rdx, rdx
       je       SHORT G_M24604_IG04
G_M24604_IG03:
       jmp      SHORT G_M24604_IG05
G_M24604_IG04:
       mov      rdx, 0xD1FFAB1E      ; global ptr
       call     CORINFO_HELP_RUNTIMEHANDLE_METHOD
       mov      rdx, rax
G_M24604_IG05:
       mov      rcx, rdx
       call     [Program:<<Main>$>g__Callee|0_1[System.__Canon]():bool]
       nop      
G_M24604_IG06:
       add      rsp, 40
       ret      
; Total bytes of code 57

It seems that JIT's inliner could be a bit smarter here when currently it gives up when it sees ldtoken T! as a potential runtime lookup. We either can teach it to recognize typeof() == typeof() idiom upfront (and fold) or we can do a sort of "I've just hit a runtime lookup, let me import up to 3 more opcodes to see if I will be able to fold it"
cc @AndyAyersMS @jkotas

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 31, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 31, 2023
@ghost
Copy link

ghost commented Jan 31, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details
using System.Runtime.CompilerServices;

Test<string>();

[MethodImpl(MethodImplOptions.NoInlining)]
static bool Test<T>() => Callee<T>();

static bool Callee<T>() => typeof(T) == typeof(int);

It is expected that Test<string> will be compiled to just return false. Instead it's compiled down to:

; Assembly listing for method Program:<<Main>$>g__Test|0_0[System.__Canon]():bool
G_M24604_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+20H], rcx
G_M24604_IG02:
       mov      rdx, qword ptr [rcx+38H]
       mov      rdx, qword ptr [rdx+10H]
       test     rdx, rdx
       je       SHORT G_M24604_IG04
G_M24604_IG03:
       jmp      SHORT G_M24604_IG05
G_M24604_IG04:
       mov      rdx, 0xD1FFAB1E      ; global ptr
       call     CORINFO_HELP_RUNTIMEHANDLE_METHOD
       mov      rdx, rax
G_M24604_IG05:
       mov      rcx, rdx
       call     [Program:<<Main>$>g__Callee|0_1[System.__Canon]():bool]
       nop      
G_M24604_IG06:
       add      rsp, 40
       ret      
; Total bytes of code 57

It seems that JIT's inliner could be a bit smarter here when currently it gives up when it sees ldtoken T! as a potential runtime lookup. We either can teach it to recognize typeof() == typeof() idiom earlier (and fold) or we can do sort of "I've just hit a runtime lookup, let me import up to 3 more opcodes to see if I will be able to fold it"
cc @AndyAyersMS @jkotas

Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@EgorBo EgorBo self-assigned this Jan 31, 2023
@EgorBo EgorBo added this to the Future milestone Jan 31, 2023
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Jan 31, 2023
@AndyAyersMS
Copy link
Member

Seems like it could be a bit clunky to engineer this?

A more natural thing might be to keep going until the next stack empty point or block boundary.

@jkotas
Copy link
Member

jkotas commented Jan 31, 2023

A more natural thing might be to keep going until the next stack empty point

Even more natural thing to do would be to fix the problem with inlining dictionary lookups. Even imperfect (not 100% efficient) solution may be better than what we have today.

@EgorBo
Copy link
Member Author

EgorBo commented Jan 31, 2023

A more natural thing might be to keep going until the next stack empty point

Even more natural thing to do would be to fix the problem with inlining dictionary lookups. Even imperfect (not 100% efficient) solution may be better than what we have today.

Is it doable for e.g cross-assembly inlining?

@jkotas
Copy link
Member

jkotas commented Feb 1, 2023

cross-assembly inlining is not a problem

When one shared generic method calls another shared generic method, the flow is roughly:

  1. Generic dictionary lookup of generic context in the caller
  2. The generic context is passed as hidden argument
  3. Generic dictionary lookup of the actual thing (e.g. type) in the callee

It should be doable to mimic this exact flow with inlining as a first iteration:

  • The inlined code would use chain of generic dictionary lookups to lookup the actual thing. It is not as efficient as it can be, but it is better than no inlining we get today. The VM should be able to fold the two generic lookups into a single generic lookup, but it requires inventing representation of new kind of types on the VM side that we were stuck on.
  • The JIT should be able to dead-code eliminate the generic lookups if they end up unused (solves the problem described by this issue).

@EgorBo
Copy link
Member Author

EgorBo commented Feb 1, 2023

It should be doable to mimic this exact flow with inlining as a first iteration:

Thanks, that sounds like a challenge/interesting task to do, will try this weekend, always wanted to know how this stuff works under the hood. My understanding that currently we give up somewhere here in VM

@EgorBo
Copy link
Member Author

EgorBo commented Feb 1, 2023

@jkotas @MichalStrehovsky is it easier to do for NativeAOT? It seems that the runtime lookup is much simpler there? e.g. for

[MethodImpl(MethodImplOptions.NoInlining)]
static Type Test<T>() => Callee<T>();

static Type Callee<T>() => typeof(T);

Here is the codegen for both methods (Test<string> and Callee<string>):

; Method Prog:Test[System.__Canon]():System.Type
G_M18196_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+20H], rcx
G_M18196_IG02:
       mov      rcx, qword ptr [rcx]
       call     Prog:Callee[System.__Canon]():System.Type
       nop      
G_M18196_IG03:
       add      rsp, 40
       ret      
; Total bytes of code: 23


; Method Prog:Callee[System.__Canon]():System.Type
G_M53536_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+20H], rcx
G_M53536_IG02:
       mov      rcx, qword ptr [rcx]
       call     CORINFO_HELP_TYPEHANDLE_TO_RUNTIMETYPE
       nop      
G_M53536_IG03:
       add      rsp, 40
       ret      
; Total bytes of code: 23

no nullchecks for generic context, helper calls, etc

@jkotas
Copy link
Member

jkotas commented Feb 1, 2023

The dictionaries are eagerly initialized in NativeAOT and that's why we can skip the null check and delay initialization there. We cannot eagerly initialize dictionaries for CoreCLR. It would be large startup hit and it would break for generic types with infinite instantiation cycles that some popular packages use.

I would focus on CoreCLR with this. I guess that the question-mark commas created by dictionary lookups are hard to deal with. Should we import the dictionary lookups as a new GT note and expand this node to the full dictionary lookup later once the high-level optimizations are done?

@jakobbotsch
Copy link
Member

Should we import the dictionary lookups as a new GT note and expand this node to the full dictionary lookup later once the high-level optimizations are done?

Related: #35551

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 4, 2024
@EgorBo EgorBo modified the milestones: Future, 9.0.0 Mar 9, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants