optimization of indirect calls #44610

AndyAyersMS · 2020-11-12T20:55:46Z

With the advent of function pointers we will now see an increased number of indirect call cases that the jit can optimize to direct calls. This requires a transformation similar to the one we do for devirtualization. There may be additional wrinkles, say if someone takes the address of an intrinsic method we would want intrinsic recognition to kick in too. So in the importer perhaps this needs to sit just upstream from impImportCall.

Overall this should be structured in a similar way to devirtualization -- opportunistically transforming calls during importation to enable subsequent inlining, and then perhaps retry later after inlining to at least remove overhead. Would be nice to be able to change over in the optimizer too, but that may prove challenging as we start to bake in many details in morph.

In principle we could do something similar with locally created delegates but there are a number of missing pieces that prevent us from seeing through from delegate creation to invocation.

using System;

class X
{
    static int F() => 33;
   
    public unsafe static int Main()
    {
        delegate*<int> f = &X.F;

        return f() + 67;
    }
}

produces

 Assembly listing for method X:Main():int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 OutArgs      [V00    ] (  1,  1   )  lclBlk (32) [rsp+0x00]   "OutgoingArgSpace"
;* V01 tmp1         [V01    ] (  0,  0   )    long  ->  zero-ref    "impImportIndirectCall"
;
; Lcl frame size = 40

G_M46779_IG01:              ;; offset=0000H
       4883EC28             sub      rsp, 40
                                                ;; bbWeight=1    PerfScore 0.25
G_M46779_IG02:              ;; offset=0004H
       48B830F3FDD6FC7F0000 mov      rax, 0x7FFCD6FDF330
       FFD0                 call     rax
       83C043               add      eax, 67
                                                ;; bbWeight=1    PerfScore 3.50
G_M46779_IG03:              ;; offset=0013H
       4883C428             add      rsp, 40
       C3                   ret
                                                ;; bbWeight=1    PerfScore 1.25

Not sure of the priority of this just yet, I am looking at some apps that use calli fairly heavily and need to better understand how many of these can be optimized. So marking as future for now.

This optimization will also intersect with guarded devirtualization / PGO, provided we can profile indirect targets and see biased distributions.

category:cq
theme:devirtualization
skill-level:expert
cost:medium

The text was updated successfully, but these errors were encountered:

hez2010 · 2021-12-29T05:04:09Z

Since ASP.NET Core 6, delegates and lambdas are being heavily used in the new minimal-api design which uses delegates as endpoint handlers here and there. I believe if JIT can optimize delegates to direct calls, a huge performance improvement will be introduced.

Allow method handle histograms in .mibc files and in the PGO text format. Contributes to dotnet#44610.

Allow method handle histograms in .mibc files and in the PGO text format. Contributes to #44610.

MichalPetryka · 2023-04-08T11:34:13Z

I'm working on this currently but I have a bit of an issue regarding invalid code handling with static readonly pointers, I'm using NonVirtualEntry2MethodDesc to resolve those to the method, this works fine with valid pointers and returns null for null, random pointer and pointer to unmanaged method correctly, but when I pass in validPtr + 1/validPtr - 1, it returns (from here) an invalid MethodDesc* which then AVs on read. On main this does not crash when JITting the code, but it crashes with my change. What's the best way to deal with this?
@jkotas @davidwrighton could you help here?

jkotas · 2023-04-08T13:13:49Z

In general, runtime methods like NonVirtualEntry2MethodDesc does not expect bogus pointers.

You would need to create your own implementation of NonVirtualEntry2MethodDesc that does a precise validation of the function pointer. Also, once you get the pointer resolved, you would need to validate that the resolved method is a plausible target for the callsite. Consider cases like this:

static readonly void* pfn = ....;
static readonly PfnKind pfnKind = ....;

static void DoStuff()
{
    switch (pfnKind)
    {
    // pfn can be inlined only for the case that matches the actual pfn signature
    case MethodWithOneArgument: ((delegate* <int, void>)pfn)(arg1); break;
    case MethodWithTwoArguments: ((delegate* <int, void>)pfn)(arg1, arg2); break;
    case BogusMethod: return;
    }
}

MichalPetryka · 2023-04-08T13:28:20Z

You would need to create your own implementation of NonVirtualEntry2MethodDesc that does a precise validation of the function pointer.

The issue is that it's not fully clear to me what's the proper way to validate those, do you have any hints with that?

Also, once you get the pointer resolved, you would need to validate that the resolved method is a plausible target for the callsite.

Yeah I do realise that, it's much less of an issue though since I have the signature of the calli and target method already, just have to compare those.

jkotas · 2023-04-08T15:19:20Z

The issue is that it's not fully clear to me what's the proper way to validate those, do you have any hints with that?

Some examples:

(MethodDesc*)((FixupPrecode*)PCODEToPINSTR(entryPoint))->GetMethodDesc();: You would want to introduce new methods on class Precode that check whether entryPoint points to a valid spot withing the STUB_CODE_BLOCK_FIXUPPRECODE block.
if (pRS->_pjit->JitCodeToMethodInfo(pRS, entryPoint, &pMD, NULL)): Ask JitCodeToMethodInfo to compute EECodeInfo and use it to validate that the entryPoint actually points to start of the method and not into the middle of it.

jkotas · 2023-04-10T02:33:17Z

Also, you will need to check whether the loader allocator of the resolved function pointer is in the dependency closure of the method being compiled. It is necessary to avoid introducing a dependency of non-collectible code on collectible code.

MichalPetryka · 2023-04-10T22:25:22Z

(MethodDesc*)((FixupPrecode*)PCODEToPINSTR(entryPoint))->GetMethodDesc();: You would want to introduce new methods on class Precode that check whether entryPoint points to a valid spot withing the STUB_CODE_BLOCK_FIXUPPRECODE block.

I've tried to do so by checking whether the pRS->_pRangeList->_starts contains the address but the array is apparently empty (edit, looking at the code, it seems like addresses are only added then in collectible assemblies; edit 2: even if I always add them, they're addresses to ranges, not starts of methods in them)? How am I supposed to check if the spot is valid then?

jkotas · 2023-04-11T11:26:49Z

These data structures were not designed to allow validation of entry points. I think they would require redesign to allow it. It does not look easy.

MichalPetryka · 2023-04-11T12:43:13Z

These data structures were not designed to allow validation of entry points. I think they would require redesign to allow it. It does not look easy.

Is creating a global ftn ptr -> handle map for ldftns in cctors maybe a reasonable choice?

Or maybe we could just decide that storing invalid pointers to managed methods in static readonly fields is UB and can crash? Especially since the ECMA already says:

The ftn argument must be a method pointer to a method that can be legitimately called with the
arguments described by callsitedescr (a metadata token for a stand-alone signature). Such a
pointer can be created using the ldftn or ldvirtftn instructions, or could have been passed in from
native code.

and since having a pointer that points to JIT code range but isn't valid already requires some weird invalid code, since null or pointers to unmanaged that are casted to managed will be fine here.

jkotas · 2023-04-12T04:10:41Z

Is creating a global ftn ptr -> handle map for ldftns in cctors maybe a reasonable choice?

It does not sound very pay for play. This is niche optimization, vast majority of existing function pointer uses are not going to benefit from it, but they would still need to pay for it.

Or maybe we could just decide that storing invalid pointers to managed methods in static readonly fields is UB and can crash?

It would be a breaking change. Function pointers were invented first and foremost for managed C++. Does C++ define any of this as UB?

having a pointer that points to JIT code range but isn't valid already requires some weird invalid code

It is not that unusual as you may think. There is a defense-in-depth school of thought that encourages storing sensitive pointers (that includes code pointers) in statics in encrypted form. Here is a random code example that found after searching for 10 seconds on github: https://github.com/rituraj0/Old-Codes/blob/c2bc55bd94da84d404e6ed3be92b81ffcfbaa74f/Trace.cpp#L45

MichalPetryka · 2023-04-12T19:18:58Z

Does C++ define any of this as UB?

It's implementation defined:

6.7.5
Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

There's also:

Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.

There is a defense-in-depth school of thought that encourages storing sensitive pointers (that includes code pointers) in statics in encrypted form.

This wouldn't be an issue here since such obfuscation decodes such pointers on use, this issue only manifests when directly calling an invalid pointer, it just moves the AV from call time to JIT time. (This is an issue ONLY with code that'd AV if the call actually happend, now the call doesn't need to happen, just to be possible).

MichalPetryka · 2023-04-12T19:27:48Z

And I guess being able to check whether a MethodDesc* the function returns points to a valid MethodDesc would be a solution, but I don't see how that could be checked?

MichalPetryka · 2023-04-12T19:41:10Z

It's implementation defined

I've checked GCC, Clang and MSVC, they all seem to detect this, inline the valid case and leave the invalid one as is so that it AVs on call (same as dotnet right now).

MichalPetryka · 2023-04-12T20:26:05Z

It does not sound very pay for play. This is niche optimization, vast majority of existing function pointer uses are not going to benefit from it, but they would still need to pay for it.

Hmm, the only solution I could think of that'd have small runtime overhead would be introducing a hidden byte/ushort/uint field with a bitmask for all static readonly managed ftn ptrs in the type and store a handle instead of the actual value in the field then instead, set a bit in the mask and then check the mask when in the helper for static readonly fields and spawn FTN_ADDR instead of CNS_INT there. But I feel like this would be too much to handle for me (and reflection would need to be handled too).

jkotas · 2023-04-13T03:13:30Z

introducing a hidden byte/ushort/uint field with a bitmask for all static readonly managed ftn ptrs in the type

This would be between impossible and very complicated. #44610 (comment) should be a lot easier than this.

Imports indirect calls as direct ones when the target method is known. Only handles addresses from ldftn as the VM has no way to verify pointers from static readonly fields and crashes on invalid ones. The helpers currently contain a small dead path that I'll soon use when extending the code to also handle delegates. Opening as a draft so that the code can be reviewed while I finish the tests. Fixes dotnet#44610

AndyAyersMS added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 12, 2020

AndyAyersMS added this to the Future milestone Nov 12, 2020

Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Nov 12, 2020

AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Nov 12, 2020

EgorBo mentioned this issue Nov 20, 2020

Proposal: Unsafe APIs to support ref void* and ref T* #44968

Closed

AndyAyersMS mentioned this issue Apr 27, 2021

JIT: chained guarded devirtualization #51890

Merged

davidfowl mentioned this issue Aug 2, 2021

Delegates are not being inlined #56708

Closed

AndyAyersMS assigned EgorBo Dec 17, 2021

AndyAyersMS modified the milestones: Future, 7.0.0 Dec 17, 2021

AndyAyersMS mentioned this issue Feb 2, 2022

Dynamic PGO work planned for .NET 7 #64659

Closed

10 tasks

jakobbotsch mentioned this issue Apr 12, 2022

Add support for storing method handle histograms in profiles #67919

Merged

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Apr 12, 2022

Add support for storing method handle histograms in profiles

26ad0a8

Allow method handle histograms in .mibc files and in the PGO text format. Contributes to dotnet#44610.

jakobbotsch mentioned this issue Apr 29, 2022

Add support for delegate GDV and method-based vtable GDV #68703

Merged

14 tasks

jakobbotsch added a commit that referenced this issue May 3, 2022

Add support for storing method handle histograms in profiles (#67919)

6f563b3

Allow method handle histograms in .mibc files and in the PGO text format. Contributes to #44610.

jakobbotsch mentioned this issue May 19, 2022

libraries-jitstress failure in System.Diagnostics.Tests.StackFrameTests.Ctor_FNeedFileInfo #69154

Closed

EgorBo modified the milestones: 7.0.0, 8.0.0 Jul 6, 2022

jakobbotsch mentioned this issue Aug 31, 2022

PGO work planned for .NET 8 #74873

Closed

44 tasks

am11 mentioned this issue Nov 10, 2022

Inline for lambda #13679

Open

EgorBo modified the milestones: 8.0.0, Future Mar 26, 2023

MichalPetryka mentioned this issue Apr 22, 2023

Transform indirect calls to direct ones in the importer #85197

Closed

ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 22, 2023

ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Jun 4, 2023

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 5, 2023

JulieLeeMSFT added this to .NET Core CodeGen Jun 5, 2024

JulieLeeMSFT moved this to PGO in .NET Core CodeGen Jun 5, 2024

EgorBo mentioned this issue Jun 11, 2024

An edge case when devirtualization doesn't help at all #103280

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optimization of indirect calls #44610

optimization of indirect calls #44610

AndyAyersMS commented Nov 12, 2020

hez2010 commented Dec 29, 2021 •

edited

Loading

MichalPetryka commented Apr 8, 2023 •

edited

Loading

jkotas commented Apr 8, 2023 •

edited

Loading

MichalPetryka commented Apr 8, 2023

jkotas commented Apr 8, 2023

jkotas commented Apr 10, 2023

MichalPetryka commented Apr 10, 2023 •

edited

Loading

jkotas commented Apr 11, 2023

MichalPetryka commented Apr 11, 2023 •

edited

Loading

jkotas commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023 •

edited

Loading

MichalPetryka commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023

jkotas commented Apr 13, 2023

optimization of indirect calls #44610

optimization of indirect calls #44610

Comments

AndyAyersMS commented Nov 12, 2020

hez2010 commented Dec 29, 2021 • edited Loading

MichalPetryka commented Apr 8, 2023 • edited Loading

jkotas commented Apr 8, 2023 • edited Loading

MichalPetryka commented Apr 8, 2023

jkotas commented Apr 8, 2023

jkotas commented Apr 10, 2023

MichalPetryka commented Apr 10, 2023 • edited Loading

jkotas commented Apr 11, 2023

MichalPetryka commented Apr 11, 2023 • edited Loading

jkotas commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023 • edited Loading

MichalPetryka commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023

MichalPetryka commented Apr 12, 2023

jkotas commented Apr 13, 2023

hez2010 commented Dec 29, 2021 •

edited

Loading

MichalPetryka commented Apr 8, 2023 •

edited

Loading

jkotas commented Apr 8, 2023 •

edited

Loading

MichalPetryka commented Apr 10, 2023 •

edited

Loading

MichalPetryka commented Apr 11, 2023 •

edited

Loading

MichalPetryka commented Apr 12, 2023 •

edited

Loading