Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore StubSecretArg from stack #100428

Merged
merged 3 commits into from
Apr 10, 2024

Conversation

t-mustafin
Copy link
Contributor

Issue #100301

@jakobbotsch @dotnet/samsung
Part of #84834

Draft to check CI test results.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 28, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 28, 2024
Comment on lines 5613 to 5614
if (comp->lvaStubArgumentVar == BAD_VAR_NUM)
comp->lvaStubArgumentVar = comp->lvaInlinedPInvokeFrameVar;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this (comp->lvaStubArgumentVar == BAD_VAR_NUM) cases lead to assert:

Assert failure(PID 11606 [0x00002d56], Thread: 11606 [0x2d56]): Assertion failed 'lclNum < lvaCount' in 'System.Buffer:_Memmove(byref,byref,ulong)' during 'Lowering nodeinfo' (IL size 25; hash 0xd7d18401; FullOpts)

    File: /home/runtime/src/coreclr/jit/compiler.h:4008
    Image: /home/tmustafin/main_26mar_checkedO0_7f555e0_StubSecretArgRestored68d8aa/coreroot/corerun


Thread 1 "corerun" received signal SIGTRAP, Trace/breakpoint trap.
DBG_DebugBreak () at /home/runtime/src/coreclr/pal/src/arch/riscv64/debugbreak.S:6
6       /home/runtime/src/coreclr/pal/src/arch/riscv64/debugbreak.S: No such file or directory.
(gdb) bt
#0  DBG_DebugBreak () at /home/runtime/src/coreclr/pal/src/arch/riscv64/debugbreak.S:6
#1  0x0000003f74b129ae in DebugBreak () at /home/runtime/src/coreclr/pal/src/debug/debug.cpp:406
#2  0x0000003f7481c69a in assertAbort (why=0x3f74bc0f17 "lclNum < lvaCount", file=0x3f74bc17b0 "/home/runtime/src/coreclr/jit/compiler.h", line=4008) at /home/runtime/src/coreclr/jit/error.cpp:286
#3  0x0000003f7479cf10 in Compiler::lvaGetDesc (this=0x2aaac75278, lclNum=4294967295) at /home/runtime/src/coreclr/jit/compiler.h:4008
#4  0x0000003f7479cff2 in Compiler::lvaGetDesc (this=0x2aaac75278, lclVar=0x2aaac832d8) at /home/runtime/src/coreclr/jit/compiler.h:4020
#5  0x0000003f749e314c in Compiler::fgMorphLeafLocal (this=0x2aaac75278, lclNode=0x2aaac832d8) at /home/runtime/src/coreclr/jit/morph.cpp:4577
#6  0x0000003f749ef45a in Compiler::fgMorphLeaf (this=0x2aaac75278, tree=0x2aaac832d8) at /home/runtime/src/coreclr/jit/morph.cpp:8098
#7  0x0000003f749d9134 in Compiler::fgMorphTree (this=0x2aaac75278, tree=0x2aaac832d8, mac=0x0) at /home/runtime/src/coreclr/jit/morph.cpp:12721
#8  0x0000003f749da318 in Compiler::fgMorphArgs (this=0x2aaac75278, call=0x2aaac831b0) at /home/runtime/src/coreclr/jit/morph.cpp:3171
#9  0x0000003f749ebee0 in Compiler::fgMorphCall (this=0x2aaac75278, call=0x2aaac831b0) at /home/runtime/src/coreclr/jit/morph.cpp:7760
#10 0x0000003f749d9248 in Compiler::fgMorphTree (this=0x2aaac75278, tree=0x2aaac831b0, mac=0x0) at /home/runtime/src/coreclr/jit/morph.cpp:12746
#11 0x0000003f749e6052 in Compiler::fgMorphSmpOp (this=0x2aaac75278, tree=0x2aaac83398, mac=0x0, optAssertionPropDone=0x3fffff9f47) at /home/runtime/src/coreclr/jit/morph.cpp:8950
#12 0x0000003f749d9164 in Compiler::fgMorphTree (this=0x2aaac75278, tree=0x2aaac83398, mac=0x0) at /home/runtime/src/coreclr/jit/morph.cpp:12729
#13 0x0000003f749a51d8 in Lowering::InsertPInvokeMethodProlog (this=0x2aaac830c8) at /home/runtime/src/coreclr/jit/lower.cpp:5630

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you double check why lvaStubArgumentVar is BAD_VAR_NUM even if we are making use of this register? Does it mean we are adding garbage to this method call? It would be better to pass 0 as the argument in those cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC there are several different flavors of how the secret stub argument gets consumed:

  • Through System.StubHelpers.GetStubContext() on all architectures. The runtime passes CORJIT_FLAG_PUBLISH_SECRET_PARAM for functions that may use this intrinsic.
  • As an argument to CORINFO_HELP_INIT_PINVOKE_FRAME outside x86/arm32. The runtime seemingly doesn't always pass CORJIT_FLAG_PUBLISH_SECRET_PARAM in the cases where we need to call this JIT helper, which is confusing to me. Does it mean we can pass nullptr in these cases? Is the arg only needed for the pinvoke IL stubs, and not for normal functions with pinvokes in them?
  • Through known stack frame layout on x86/arm32:
    // Given a methodDesc representing an ILStub for a pinvoke call,
    // this method will return the MethodDesc for the actual interop
    // method if the current InlinedCallFrame is inactive.
    PTR_MethodDesc GetActualInteropMethodDesc()
    {
    #if defined(TARGET_X86) || defined(TARGET_ARM)
    // Important: This code relies on the way JIT lays out frames. Keep it in sync
    // with code:Compiler.lvaAssignFrameOffsets.
    //
    // | ... |
    // +--------------------+
    // | lvaStubArgumentVar | <= filled with EAX in prolog |
    // +--------------------+ |
    // | | |
    // | InlinedCallFrame | |
    // | | <= m_pCrawl.pFrame | to lower addresses
    // +--------------------+ V
    // | ... |
    //
    // Extract the actual MethodDesc to report from the InlinedCallFrame.
    TADDR addr = dac_cast<TADDR>(this) + sizeof(InlinedCallFrame);
    return PTR_MethodDesc(*PTR_TADDR(addr));
    #elif defined(HOST_64BIT)
    // On 64bit, the actual interop MethodDesc is saved off in a field off the InlinedCrawlFrame
    // which is populated by the JIT. Refer to JIT_InitPInvokeFrame for details.
    return PTR_MethodDesc(m_StubSecretArg);
    #else
    _ASSERTE(!"NYI - Interop method reporting for this architecture!");
    return NULL;
    #endif // defined(TARGET_X86) || defined(TARGET_ARM)
    }
    . Does anything stop us from unifying x86/arm32 with the rest of our targets and always passing it as an argument to the helper?

cc @jkotas, do you know the answer to some of these questions?

Copy link
Member

@jkotas jkotas Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the arg only needed for the pinvoke IL stubs, and not for normal functions with pinvokes in them?

Yes, the arg is only used for IL stubs.

Does it mean we can pass nullptr in these cases?

The value is unused. It does not matter what gets passed in.

Does anything stop us from unifying x86/arm32 with the rest of our targets and always passing it as an argument to the helper?

The JIT is aware of InlinedFrame layout and semantics. It may be better to switch 64-bit to the same scheme as x86/arm32: Get rid of the argument and store it in the InlinedFrame instead - on all platforms and only when it is needed. It will save a memory store for inlined PInvokes on 64-bit platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I opened #100662 about unifying this.

@t-mustafin can you please update the fix to be something like

    GenTree* argNode;
    if (comp->info.compPublishStubParam)
    {
        argNode = comp->gtNewLclvNode(comp->lvaStubArgumentVar, TYP_I_IMPL);
    }
    else
    {
        argNode = comp->gtNewIconNode(0, TYP_I_IMPL);
    }
    NewCallArg stubParamArg =
        NewCallArg::Primitive(argNode).WellKnown(WellKnownArg::SecretStubParam);
    call->gtArgs.PushBack(comp, stubParamArg);

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #100662 about unifying this.

@t-mustafin In case you would like to fix #100662 and clean this up in this PR, here is the set of places:

  • Delete code under the ifdef that you are fixing
  • Delete StubSecretArg here:
    Thread * __stdcall JIT_InitPInvokeFrame(InlinedCallFrame *pFrame, PTR_VOID StubSecretArg)
  • Delete
    #ifdef HOST_64BIT
    // IL stubs fill this field with the incoming secret argument when they erect
    // InlinedCallFrame so we know which interop method was invoked even if the frame
    // is not active at the moment.
    PTR_VOID m_StubSecretArg;
    #endif // HOST_64BIT
  • Enable this for all architectures
    // Important: This code relies on the way JIT lays out frames. Keep it in sync
    // with code:Compiler.lvaAssignFrameOffsets.
    //
    // | ... |
    // +--------------------+
    // | lvaStubArgumentVar | <= filled with EAX in prolog |
    // +--------------------+ |
    // | | |
    // | InlinedCallFrame | |
    // | | <= m_pCrawl.pFrame | to lower addresses
    // +--------------------+ V
    // | ... |
    //
    // Extract the actual MethodDesc to report from the InlinedCallFrame.
    TADDR addr = dac_cast<TADDR>(this) + sizeof(InlinedCallFrame);
    return PTR_MethodDesc(*PTR_TADDR(addr));
  • Rev the JIT-EE interface GUID: https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/jiteeversionguid.h

@jakobbotsch Does this sound about right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping we would be able to get rid of the "known stack layout" contract between the VM and JIT as part of this unification. It results in a number of special casing around this particular value in the JIT and has a side effect that we cannot keep the value enregistered.
My thought was:

  • Make InlinedCallFrame::m_StubSecretArg available on all platforms. Make it the JIT's responsibility to store this field in the frame on x64 (instead of doing it from JIT_InitPInvokeFrame). Store the value on x86 as well (it will change a spill to stack at the known stack frame location to a store to the InlinedCallFrame)
  • Remove all special handling in frame layout around lvaStubArgumentVar
  • Eventually enable full enregistration of lvaStubArgumentVar and remove all remaining special handling around it in the JIT. Coincidentally I am doing work in JIT: Rewrite register parameter homing #100572 that should make this simple.

Do you see any potential issues around doing it this way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make InlinedCallFrame::m_StubSecretArg available on all platforms

It makes InlinedCallFrame one extra word larger than necessary in most cases. It should not be a big deal, there are other similar inefficiencies in InlinedCallFrame. We should be able to do that to make things cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please update the fix to be something like

@jakobbotsch thanks, it works. Updated PR.

rv64 asm for IL_STUB_PInvoke(byref) [MinOpts, IL size=98, code size=920, hash=0x5d63dafb]
G_M9476_IG01:        ; offs=0x000000, size=0x0068, bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, nogc <-- Prolog IG
IN0086: 000000      addi           sp, sp, -576
IN0087: 000004      sd             s1, 0(sp)
IN0088: 000008      sd             s2, 8(sp)
IN0089: 00000C      sd             s3, 16(sp)
IN008a: 000010      sd             s4, 24(sp)
IN008b: 000014      sd             s5, 32(sp)
IN008c: 000018      sd             s6, 40(sp)
IN008d: 00001C      sd             s7, 48(sp)
IN008e: 000020      sd             s8, 56(sp)
IN008f: 000024      sd             s9, 64(sp)       
IN0090: 000028      sd             s10, 72(sp)
IN0091: 00002C      sd             s11, 80(sp)
IN0092: 000030      sd             ra, 88(sp)
IN0093: 000034      sd             fp, 96(sp)
IN0094: 000038      addi           fp, sp, 96
IN0095: 00003C      sd             t2, 80(fp)
IN0096: 000040      addi           s1, fp, 104
IN0097: 000044      addi           t0, zero, 22
IN0098: 000048      sd             zero, 8(s1)      
IN0099: 00004C      sd             zero, 0(s1)      
IN009a: 000050      addi           t0, t0, -1
IN009b: 000054      addi           s1, s1, 16
IN009c: 000058      bne            t0, zero, pc-16 (-4 instructions)
IN009d: 00005C      addi           s1, sp, 576
IN009e: 000060      sd             s1, 472(fp)
IN009f: 000064      sd             a0, 464(fp)
                        ;; size=104 bbWeight=1 PerfScore 0.00
G_M9476_IG02:        ; offs=0x000068, size=0x006C, bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, BB09 [0008], BB01 [0000], byref
IN0001: 000068      addi           a0, fp, 112
IN0002: 00006C      addi           a1, zero, -51
IN0003: 000070      addi           a2, zero, 296    
IN0004: 000074      lui            t2, 0
        000078      addi           t2, t2, 63
        00007C      slli           t2, t2, 11
        000080      addi           t2, t2, 1401
        000084      slli           t2, t2, 11
        000088      addi           t2, t2, 1785
        00008C      slli           t2, t2, 10
        000090      jalr           ra, 748(t2)      // CORINFO_HELP_NATIVE_MEMSET
IN0005: 000094      addi           a0, fp, 16
IN0006: 000098      ld             a1, 80(fp)
IN0007: 00009C      lui            a2, 521704
IN0008: 0000A0      addiw          a2, a2, -1570
IN0009: 0000A4      slli           a2, a2, 7 
IN000a: 0000A8      addi           a2, a2, 58
IN000b: 0000AC      jalr           ra, 0(a2)        // CORINFO_HELP_INIT_PINVOKE_FRAME

In case you would like to fix #100662 and clean this up in this PR, here is the set of places:

@jkotas Thanks for detailed instruction, unfortunately I don't have plan to fix #100662 for near future.

@t-mustafin t-mustafin force-pushed the StubSecretArg_restore_from_stack branch from ae76e4a to f37d25e Compare March 28, 2024 23:34
@t-mustafin t-mustafin force-pushed the StubSecretArg_restore_from_stack branch 2 times, most recently from 1c7c97f to 232ee6f Compare March 30, 2024 01:29
@t-mustafin t-mustafin marked this pull request as ready for review March 30, 2024 01:45
Issue dotnet#100301

Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
@t-mustafin t-mustafin force-pushed the StubSecretArg_restore_from_stack branch from 232ee6f to 321559a Compare April 9, 2024 18:17
Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@jakobbotsch jakobbotsch merged commit dc2fa66 into dotnet:main Apr 10, 2024
110 checks passed
@gbalykov
Copy link
Member

@jakobbotsch thanks! By the way, we first observed this on .net 8, does this need to be backported there?

@jakobbotsch
Copy link
Member

@jakobbotsch thanks! By the way, we first observed this on .net 8, does this need to be backported there?

I don't think so. The CORINFO_HELP_MEMSET call that trashes the register in #100301 is coming from poisoning that is inserted by the JIT for address taken locals in unoptimized codegen. This means that the demonstrated case requires an IL stub compiled without optimizations. However, I believe that IL stubs under normal operations will always be compiled with optimizations enabled, so the call will not be there.

In theory nothing stops the register from being trashed through other means during prolog codegen, but this theoretical issue exists in RyuJIT since forever and generally we wouldn't try to backport a fix for a theoretical issue like that.

@jkotas
Copy link
Member

jkotas commented Apr 15, 2024

I believe that IL stubs under normal operations will always be compiled with optimizations enabled

I do not think that it is the case. The IL stubs have the same optimization settings as the module that they are attached to.

I agree that this is a theoretical issue in the shipping runtime version and it does not meet the bar for backport without a repro.

matouskozak pushed a commit to matouskozak/runtime that referenced this pull request Apr 30, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants