Generate calls to interface methods through resolve helper #112406

Open
wants to merge 25 commits into
base: main

MichalStrehovsky
Member

@MichalStrehovsky MichalStrehovsky commented Feb 11, 2025

This changes interface calls when CFG is enabled to use a resolve helper instead of the stub dispatcher. I've also extended the CFG testing a bit so that the smoke test covers more interface calls.

static int Call(IFoo f, int x, int y) => f.Call(x, y);

Before this change:

00007FF605971D50  push        rbx  
00007FF605971D51  sub         rsp,20h  
00007FF605971D55  mov         qword ptr [rsp+30h],rcx  
00007FF605971D5A  mov         r10d,edx  
00007FF605971D5D  mov         ebx,r8d  
00007FF605971D60  mov         rcx,qword ptr [__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]  
00007FF605971D67  call        qword ptr [__guard_check_icall_fptr (07FF6059CD558h)]  
00007FF605971D6D  mov         rax,rcx  
00007FF605971D70  mov         rcx,qword ptr [rsp+30h]  
00007FF605971D75  mov         r8d,ebx  
00007FF605971D78  mov         edx,r10d  
00007FF605971D7B  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF6059BEBD0h)]  
00007FF605971D82  call        rax  
00007FF605971D84  nop  
00007FF605971D85  add         rsp,20h  
00007FF605971D89  pop         rbx  
00007FF605971D8A  ret  

After this change:

00007FF704181CF0  sub         rsp,28h  
00007FF704181CF4  lea         r10,[__InterfaceDispatchCell_repro_Program_IFoo__Call_repro_Program__Call (07FF7041CEBD0h)]  
00007FF704181CFB  call        RhpResolveInterfaceMethodFast (07FF703FFF5A0h)  
00007FF704181D00  call        qword ptr [__guard_dispatch_icall_fptr (07FF7041DD560h)]  
00007FF704181D06  nop  
00007FF704181D07  add         rsp,28h  
00007FF704181D0B  ret  
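
The shape of the new sequence (resolve the target first, then make one validated indirect call) can be illustrated with a toy model. This is only a sketch under stated assumptions: `CfgModel`, `DispatchCell`, `Resolve`, and `GuardedDispatch` are hypothetical names invented for illustration, the single-entry cache is a simplification, and nothing here reflects how `RhpResolveInterfaceMethodFast` or the CFG dispatch thunk are actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: none of these types exist in the runtime.
static class CfgModel
{
    // Stand-in for the CFG table of valid indirect-call targets.
    public static readonly HashSet<Delegate> ValidTargets = new HashSet<Delegate>();

    // Stand-in for a guarded dispatch: validate the target, then call it.
    public static int GuardedDispatch(Func<object, int, int, int> target, object receiver, int x, int y)
    {
        if (!ValidTargets.Contains(target))
            throw new InvalidOperationException("CFG violation: unknown call target");
        return target(receiver, x, y);
    }
}

// Stand-in for an interface dispatch cell with a single-entry cache.
sealed class DispatchCell
{
    private readonly Dictionary<Type, Func<object, int, int, int>> _methodTable;
    private Type _cachedType;
    private Func<object, int, int, int> _cachedTarget;

    public int SlowPathCount { get; private set; }

    public DispatchCell(Dictionary<Type, Func<object, int, int, int>> methodTable)
    {
        _methodTable = methodTable;
    }

    // Resolve only: returns the target without calling it, so the caller
    // can route the actual call through the guarded dispatch.
    public Func<object, int, int, int> Resolve(object receiver)
    {
        Type type = receiver.GetType();
        if (type == _cachedType)
            return _cachedTarget;          // fast path: cache hit

        SlowPathCount++;                   // slow path: look up and cache
        _cachedTarget = _methodTable[type];
        _cachedType = type;
        return _cachedTarget;
    }
}

sealed class Adder { }

static class Program
{
    static void Main()
    {
        Func<object, int, int, int> addImpl = (self, x, y) => x + y;
        CfgModel.ValidTargets.Add(addImpl);

        var cell = new DispatchCell(new Dictionary<Type, Func<object, int, int, int>>
        {
            [typeof(Adder)] = addImpl
        });

        var receiver = new Adder();
        for (int i = 0; i < 3; i++)
        {
            var target = cell.Resolve(receiver);                                 // step 1: resolve
            Console.WriteLine(CfgModel.GuardedDispatch(target, receiver, i, 10)); // step 2: validated call
        }
        Console.WriteLine("slow path taken " + cell.SlowPathCount + " time(s)");
    }
}
```

In this model the caller never calls through the cell directly. It only asks the cell for a target, which mirrors why the "after" disassembly no longer needs to save and reload the argument registers around a separate validation call.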

Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@MichalStrehovsky
Member Author

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Member Author

For this not to be a throughput regression, we need to figure out a clean way to implement the hack in the last two commits.

I ran JSON serialization with/without this PR. Having to preserve argument registers before calling the resolve helper costs us about 1% in throughput.

PR-hack is this PR with the hack that lets RyuJIT assume argument registers are not clobbered by the helper. We had better not take a GC in the slow path.
PR is this PR without that hack.
Baseline is the vanilla CFG build.

Build      Attempt 1 (ms)  Attempt 2 (ms)  Attempt 3 (ms)  Attempt 4 (ms)  Attempt 5 (ms)
PR-hack    1103            1105            1096            1102            1103
PR         1125            1126            1139            1127            1134
Baseline   1112            1105            1111            1104            1106
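
To compare the three builds, the five timings per build can be averaged and expressed as a percentage change against the baseline. The following is a minimal sketch of that arithmetic using the numbers from the table. The `Throughput` class and its method names are hypothetical, and a plain mean over five runs is an assumption, not necessarily how the figures in this comment were derived.

```csharp
using System;
using System.Linq;

static class Throughput
{
    // Arithmetic mean of the run times in milliseconds.
    public static double Mean(int[] runsMs)
    {
        return runsMs.Average();
    }

    // Percentage change of 'value' relative to 'baseline'
    // (positive means slower, because these are elapsed times).
    public static double PercentChange(double baseline, double value)
    {
        return (value - baseline) / baseline * 100.0;
    }
}

static class Program
{
    static void Main()
    {
        // Timings copied from the table above.
        int[] prHack   = { 1103, 1105, 1096, 1102, 1103 };
        int[] pr       = { 1125, 1126, 1139, 1127, 1134 };
        int[] baseline = { 1112, 1105, 1111, 1104, 1106 };

        double baseMean = Throughput.Mean(baseline);
        double hackMean = Throughput.Mean(prHack);
        double prMean   = Throughput.Mean(pr);

        Console.WriteLine("Baseline mean (ms): " + baseMean);
        Console.WriteLine("PR-hack mean (ms):  " + hackMean);
        Console.WriteLine("PR mean (ms):       " + prMean);
        Console.WriteLine("PR-hack vs baseline (%): " + Throughput.PercentChange(baseMean, hackMean));
        Console.WriteLine("PR vs baseline (%):      " + Throughput.PercentChange(baseMean, prMean));
    }
}
```

With only five runs per build, the spread within each row is worth keeping in mind when reading the percentages.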

Do we have a way to do this hack cleanly that doesn't involve building a new universal transition?

@MichalStrehovsky
Member Author

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Member Author

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky MichalStrehovsky marked this pull request as ready for review February 18, 2025 07:17
@Copilot Copilot bot review requested due to automatic review settings February 18, 2025 07:17

Copilot reviewed 3 out of 22 changed files in this pull request and generated no comments.

Files not reviewed (19)
  • src/coreclr/inc/corinfo.h: Language not supported
  • src/coreclr/inc/jiteeversionguid.h: Language not supported
  • src/coreclr/inc/jithelpers.h: Language not supported
  • src/coreclr/jit/codegencommon.cpp: Language not supported
  • src/coreclr/jit/gentree.cpp: Language not supported
  • src/coreclr/jit/gentree.h: Language not supported
  • src/coreclr/jit/lower.cpp: Language not supported
  • src/coreclr/jit/morph.cpp: Language not supported
  • src/coreclr/jit/targetamd64.h: Language not supported
  • src/coreclr/jit/targetarm64.h: Language not supported
  • src/coreclr/nativeaot/Runtime/AsmOffsets.h: Language not supported
  • src/coreclr/nativeaot/Runtime/CMakeLists.txt: Language not supported
  • src/coreclr/nativeaot/Runtime/EHHelpers.cpp: Language not supported
  • src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp: Language not supported
  • src/coreclr/nativeaot/Runtime/amd64/DispatchResolve.asm: Language not supported
  • src/coreclr/nativeaot/Runtime/amd64/UniversalTransition.asm: Language not supported
  • src/coreclr/nativeaot/Runtime/arm64/DispatchResolve.asm: Language not supported
  • src/coreclr/nativeaot/Runtime/arm64/UniversalTransition.asm: Language not supported
  • src/coreclr/nativeaot/Runtime/inc/rhbinder.h: Language not supported
Comments suppressed due to low confidence (1)

src/coreclr/tools/Common/JitInterface/CorInfoHelpFunc.cs:267

  • Ensure that the new helper CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT is covered by tests to verify its behavior.
CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT,  // Resolve a non-generic interface method from this pointer and dispatch cell
@MichalStrehovsky
Member Author

@dotnet/ilc-contrib this is ready for review

@jakobbotsch who would be the best to review the JIT side?

@jakobbotsch
Member

I pushed a commit to fix the JIT assert and also a probable fix for the GC stress issue (only based on review, haven't tested that one). Will kick off the CFG job again.

@jakobbotsch
Member

/azp run jit-cfg


Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member

There are still GC stress failures. I'm guessing there are some more fundamental issues around trying to give this helper a non-standard calling convention where GC refs could be in the argument registers. I suspect they are not properly being reported and reflected back to the JIT on return.

@jkotas
Member

jkotas commented Mar 3, 2025

I suspect they are not properly being reported

The GC info produced by the JIT does not look right.

GC info without CFG - notice that tracking of rcx/rdx/r8/r9 is stopped after the interface call:

IN0003: 00000D mov      rcx, 0x1F972400BE8      ; data for Program:f
IN0004: 000017 mov      rcx, gword ptr [rcx]
                            ; gcrRegs +[rcx]
IN0005: 00001A mov      r9, 0x1F972400BF0      ; data for Program:o
IN0006: 000024 mov      r9, gword ptr [r9]
                            ; gcrRegs +[r9]
IN0007: 000027 mov      rdx, r9
                            ; gcrRegs +[rdx]
IN0008: 00002A mov      r8, r9
                            ; gcrRegs +[r8]
IN0009: 00002D mov      r11, 0x7FF91B770070      ; code for IFace:M(System.Object,System.Object,System.Object):this
IN000a: 000037 call     [r11]IFace:M(System.Object,System.Object,System.Object):this
                            ; gcrRegs -[rcx rdx r8-r9]
                            ; gcr arg pop 0

GC info with CFG - notice that tracking of rcx/rdx/r8/r9 is stopped before the interface call:

IN0003: 00000D mov      r11, 0x1F65D000BE8      ; data for Program:f
IN0004: 000017 mov      rcx, gword ptr [r11]
                            ; gcrRegs +[rcx]
IN0005: 00001A mov      r11, 0x1F65D000BF0      ; data for Program:o
IN0006: 000024 mov      r9, gword ptr [r11]
                            ; gcrRegs +[r9]
IN0007: 000027 mov      rdx, r9
                            ; gcrRegs +[rdx]
IN0008: 00002A mov      r8, r9
                            ; gcrRegs +[r8]
IN0009: 00002D mov      r11, 0x7FF8FB8B0070      ; code for IFace:M(System.Object,System.Object,System.Object):this
recordRelocation: 00007FF8FBBB3CE8 (rw: 000001F600575220) => 00007FF95B3DAB40, type 16 (IMAGE_REL_BASED_REL32), delta 0
IN000a: 000037 call     CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT
                            ; gcrRegs -[rcx rdx r8-r9] <--- this looks wrong, it should be after call [CORINFO_HELP_DISPATCH_INDIRECT_CALL]
                            ; gcr arg pop 0
recordRelocation: 00007FF8FBBB3CEE (rw: 000001F600575226) => 00007FF95B98DCF0, type 16 (IMAGE_REL_BASED_DISP32), delta 0
IN000b: 00003C call     [CORINFO_HELP_DISPATCH_INDIRECT_CALL]
                            ; gcr arg pop 0
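
The problem in the second dump can be restated as a small liveness check: the argument registers must still be GC-tracked after the resolve helper returns, because the real call happens afterwards. The sketch below replays the `gcrRegs` annotations from the dump above as "born" and "dead" sets. `GcLiveness` and its methods are hypothetical helpers for illustration and do not correspond to any JIT data structure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GcLiveness
{
    // Replays the annotations up to and including steps[index] and returns
    // the set of registers that are GC-tracked after that instruction.
    public static HashSet<string> LiveAfter((string Text, string[] Born, string[] Dead)[] steps, int index)
    {
        var live = new HashSet<string>();
        for (int i = 0; i <= index; i++)
        {
            live.UnionWith(steps[i].Born);
            live.ExceptWith(steps[i].Dead);
        }
        return live;
    }

    // True when every argument register is still tracked after the helper call.
    public static bool ArgsSurviveHelper((string Text, string[] Born, string[] Dead)[] steps, int helperIndex, string[] argRegs)
    {
        return LiveAfter(steps, helperIndex).IsSupersetOf(argRegs);
    }
}

static class Program
{
    static string[] R(params string[] regs) { return regs; }

    static void Main()
    {
        string[] args = R("rcx", "rdx", "r8", "r9");

        // As reported in the dump with CFG: tracking stops at the resolve helper.
        var reported = new (string Text, string[] Born, string[] Dead)[]
        {
            ("mov rcx, gword ptr [r11]", R("rcx"), R()),
            ("mov r9, gword ptr [r11]", R("r9"), R()),
            ("mov rdx, r9", R("rdx"), R()),
            ("mov r8, r9", R("r8"), R()),
            ("call CORINFO_HELP_INTERFACELOOKUP_FOR_SLOT", R(), args),
            ("call [CORINFO_HELP_DISPATCH_INDIRECT_CALL]", R(), R()),
        };

        // As the annotation in the dump says it should be: tracking stops at the dispatch call.
        var expected = reported.ToArray();
        expected[4] = (expected[4].Text, R(), R());
        expected[5] = (expected[5].Text, R(), args);

        Console.WriteLine(GcLiveness.ArgsSurviveHelper(reported, 4, args)); // False
        Console.WriteLine(GcLiveness.ArgsSurviveHelper(expected, 4, args)); // True
    }
}
```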

Test program source


class Program
{
    static IFace f = new C();
    static object o = new object();

    static void Main() 
    {
        f.M(o,o,o);
    }
}

interface IFace
{
    void M(object a, object b, object c);
}

class C : IFace
{
    public void M(object a, object b, object c) { }
}

@jakobbotsch
Member

I can look into that, but do you expect that this is going to make things work? My general wondering is if the runtime would know to report and relocate the GC refs in the argument registers of the transition frame if GC happens while inside the new helper.

@jakobbotsch
Member

/azp run jit-cfg


Azure Pipelines successfully started running 1 pipeline(s).

@jakobbotsch
Member

Some more JIT work is needed to properly report non-callee-saved registers after a call. I opened #113071 for it.
Still unsure if it will "just work" once the JIT reporting is right.

@jkotas
Member

jkotas commented Mar 3, 2025

My general wondering is if the runtime would know to report and relocate the GC refs in the argument registers of the transition frame if GC happens while inside the new helper.

The runtime knows how to do it for the existing VSD helper that tailcalls, so it should be able to do it for a helper that returns as well. The new helper uses the same GC reporting setup as the existing VSD helper that tailcalls.

In order for this to work, the JIT has to set up all arguments before calling the new helper. It does not seem to be happening in all cases. For example, I see the following codegen for a call from Dictionary.FindValue to System.Collections.Generic.IEqualityComparer<string>.Equals(string,string):

...
00007ff8`f8b2d475 4c8bd8          mov     r11,rax
00007ff8`f8b2d478 498b1424        mov     rdx,qword ptr [r12]
00007ff8`f8b2d47c 488bcf          mov     rcx,rdi
00007ff8`f8b2d47f e8bcd6825f      call    coreclr!JIT_InterfaceLookupForSlot (00007ff9`5835ab40)
00007ff8`f8b2d484 488bcf          mov     rcx,rdi <--- unnecessary duplication, but it does not hurt
00007ff8`f8b2d487 4c8bc6          mov     r8,rsi <--- this needs to be before JIT_InterfaceLookupForSlot 
00007ff8`f8b2d48a ff156008de5f    call    qword ptr [00007ff9`5890dcf0] // JIT_DispatchIndirectCall
00007ff8`f8b2d490 85c0            test    eax,eax
...

@jakobbotsch
Member

In order for this to work, the JIT has to set up all arguments before calling the new helper. It does not seem to be happening in all cases. For example, I see the following codegen for a call from Dictionary.FindValue to System.Collections.Generic.IEqualityComparer<string>.Equals(string,string):

That's expected with how this is implemented in the JIT. The JIT just considers this helper to not trash the argument registers. It is free to leave any GC ref in those registers across the call. It is not considering the "resolve + dispatch" sequence as one unit that takes the arguments of the final call. (I am not sure whether this would be feasible on arm64, which uses a "resolve + validate + call" sequence.)
What the JIT needs from the VM side here is that the VM side uses its reported GC information to update the argument registers of the transition frame (like it would already be doing for callee saves). Is it possible to make it do that?
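
The request here can be pictured with a toy model of a transition frame: the frame holds the saved argument registers, the JIT's GC info says which of them hold object references, and a relocating GC must write the new addresses back into the frame so that they are restored into the registers on return. Everything below is a hypothetical sketch. `TransitionFrameModel`, the register dictionary, and the address map are illustration devices, not the runtime's actual frame layout or reporting API.

```csharp
using System;
using System.Collections.Generic;

static class TransitionFrameModel
{
    // For each register that the GC info reports as holding an object reference,
    // replace the saved value with the object's new address if the GC moved it.
    // Registers that are not reported are left untouched, even if their raw
    // value happens to match a moved address.
    public static void ReportAndRelocate(
        Dictionary<string, long> savedRegisters,
        IEnumerable<string> gcTrackedRegisters,
        IReadOnlyDictionary<long, long> movedObjects)
    {
        foreach (string reg in gcTrackedRegisters)
        {
            long oldAddress;
            long newAddress;
            if (savedRegisters.TryGetValue(reg, out oldAddress) &&
                movedObjects.TryGetValue(oldAddress, out newAddress))
            {
                savedRegisters[reg] = newAddress;
            }
        }
    }
}

static class Program
{
    static void Main()
    {
        // Saved argument registers at the point the helper was called.
        var frame = new Dictionary<string, long>
        {
            ["rcx"] = 0x1000, // object reference ("this")
            ["rdx"] = 0x2000, // object reference
            ["r8"]  = 0x2000, // plain integer that happens to equal an address
        };

        // What the JIT's GC info reports as live references across the helper.
        string[] tracked = { "rcx", "rdx" };

        // A GC during the helper's slow path moved both objects.
        var moved = new Dictionary<long, long> { [0x1000] = 0x5000, [0x2000] = 0x6000 };

        TransitionFrameModel.ReportAndRelocate(frame, tracked, moved);

        foreach (var pair in frame)
            Console.WriteLine(pair.Key + " = 0x" + pair.Value.ToString("X"));
    }
}
```

The untracked `r8` entry shows why accurate GC info matters in both directions: a reference that is not reported goes stale, and a non-reference that is wrongly reported would be corrupted.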

@jkotas
Member

jkotas commented Mar 3, 2025

What the JIT needs from the VM side here is that the VM side uses its reported GC information to update the argument registers of the transition frame (like it would already be doing for callee saves). Is it possible to make it do that?

Yes, I have pushed a commit to do that (some of the changes in that commit are a quick hack that needs cleaning up if it works).

Track all integer registers for calls in `regPtrDsc`. This does not cost
any extra memory and it saves us from going back and forth between an
intermediate format. It also unblocks proper GC reporting for helper
calls that are GC reported with non-standard calling convention.
@dotnet dotnet deleted a comment Mar 3, 2025
@jakobbotsch jakobbotsch closed this Mar 3, 2025
@jakobbotsch jakobbotsch reopened this Mar 3, 2025
@jakobbotsch
Member

/azp run jit-cfg


Azure Pipelines successfully started running 1 pipeline(s).
