A simple method that performs unboxing is considered unprofitable inlinee, preventing box stack allocation #104479

Open
neon-sunset opened this issue Jul 5, 2024 · 4 comments · May be fixed by #110596
Labels: area-CodeGen-coreclr, in-pr, tenet-performance


@neon-sunset
Contributor

neon-sunset commented Jul 5, 2024

Description

Given the following code:

using System.Runtime.CompilerServices; // needed for [MethodImpl]

static int BoxTest()
{
    var boxed = (object)42;
    Unbox(boxed);
    return 42;
}

static void Unbox(object value)
{
    var number = (int)value;
    Consume(number.ToString());
}

[MethodImpl(MethodImplOptions.NoInlining)]
static void Consume<T>(T value) { }

It appears that the Unbox(object) method is considered an unprofitable inlinee, leading to the following codegen:

G_M15828_IG01:  ;; offset=0x0000
       sub      rsp, 40
						;; size=4 bbWeight=1 PerfScore 0.25
G_M15828_IG02:  ;; offset=0x0004
       mov      rcx, 0x7FF9616B5A58      ; System.Int32
       call     CORINFO_HELP_NEWSFAST
       mov      dword ptr [rax+0x08], 42
       mov      rcx, rax
       call     [Program:<<Main>$>g__Unbox|0_1(System.Object)]
       mov      eax, 42
						;; size=36 bbWeight=1 PerfScore 5.75
G_M15828_IG03:  ;; offset=0x0028
       add      rsp, 40
       ret

If we annotate Unbox with AggressiveInlining, it gets inlined and the allocation is elided, as expected:

G_M15828_IG01:  ;; offset=0x0000
       sub      rsp, 40
						;; size=4 bbWeight=1 PerfScore 0.25
G_M15828_IG02:  ;; offset=0x0004
       mov      ecx, 42
       call     [System.Number:Int32ToDecStr(int):System.String]
       mov      rdx, rax
       mov      rcx, 0x7FF9DCB36110      ; Program:<<Main>$>g__Consume|0_2[System.String](System.String)
       call     [Program:<<Main>$>g__Consume|0_2[System.__Canon](System.__Canon)]
       mov      eax, 42
						;; size=35 bbWeight=1 PerfScore 7.00
G_M15828_IG03:  ;; offset=0x0027
       add      rsp, 40
       ret
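
For reference, the annotated variant that produces the listing above is not shown in the post. A minimal sketch, assuming the only change is the attribute on Unbox and that the code runs as a top-level program (the Console.WriteLine call is added here just to make the snippet runnable):

```csharp
using System;
using System.Runtime.CompilerServices;

// Entry point added for illustration only; not part of the original repro.
Console.WriteLine(BoxTest());

static int BoxTest()
{
    var boxed = (object)42;
    Unbox(boxed);
    return 42;
}

// Same body as the original Unbox; only the attribute is new.
// AggressiveInlining asks the JIT to inline this method where possible,
// which is what bypassed the profitability check in this case.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static void Unbox(object value)
{
    var number = (int)value;
    Consume(number.ToString());
}

// Kept out of line so the call stays visible in the disassembly.
[MethodImpl(MethodImplOptions.NoInlining)]
static void Consume<T>(T value) { }
```

Once Unbox is inlined, the JIT can see both the box and the unbox in BoxTest, which is what lets it drop the CORINFO_HELP_NEWSFAST call shown in the first listing.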

Data

PHASE Morph - Inlining

*************** Starting PHASE Morph - Inlining
Expanding INLINE_CANDIDATE in statement STMT00002 in BB01:
STMT00002 ( ??? ... ??? )
               [000010] I-C-G------                         *  CALL      void   Program:<<Main>$>g__Unbox|0_1(System.Object) (exactContextHnd=0x00007FF9DA234E59)
               [000009] ----------- arg0                    \--*  BOX       ref   
               [000008] -----------                            \--*  LCL_VAR   ref    V01 tmp1         

Argument #0:
               [000009] -----------                         *  BOX       ref   
               [000008] -----------                         \--*  LCL_VAR   ref    V01 tmp1         

INLINER: inlineInfo.tokenLookupContextHandle for Program:<<Main>$>g__Unbox|0_1(System.Object) set to 0x00007FF9DA234E59:

Invoking compiler for the inlinee method Program:<<Main>$>g__Unbox|0_1(System.Object) :
IL to import:
IL_0000  02                ldarg.0     
IL_0001  a5 52 00 00 01    unbox.any    0x1000052
IL_0006  0a                stloc.0     
IL_0007  12 00             ldloca.s     0x0
IL_0009  28 48 00 00 0a    call         0xA000048
IL_000e  28 02 00 00 2b    call         0x2B000002
IL_0013  2a                ret         

INLINER impTokenLookupContextHandle for Program:<<Main>$>g__Unbox|0_1(System.Object) is 0x00007FF9DA234E59.
*************** In compInitDebuggingInfo() for Program:<<Main>$>g__Unbox|0_1(System.Object)
info.compStmtOffsetsCount    = 0
info.compStmtOffsetsImplicit = 0005h ( STACK_EMPTY CALL_SITE )
*************** In fgFindBasicBlocks() for Program:<<Main>$>g__Unbox|0_1(System.Object)
weight= 10 : state   3 [ ldarg.0 ]
weight=274 : state 142 [ unbox.any ]
weight=  6 : state  11 [ stloc.0 ]
weight= 35 : state 189 [ ldloca.s.normed ]
weight= 79 : state  40 [ call ]
weight= 79 : state  40 [ call ]
weight= 19 : state  42 [ ret ]

Inline candidate looks like a wrapper method.  Multiplier increased to 1.
Callsite is going to box 1 arguments.  Multiplier increased to 1.5.
Inline candidate callsite is boring.  Multiplier increased to 2.8.
calleeNativeSizeEstimate=502
callsiteNativeSizeEstimate=85
benefit multiplier=2.8
threshold=237
Native estimate for function size exceeds threshold for inlining 50.2 > 23.7 (multiplier = 2.8)


Inline expansion aborted, inline not profitable
INLINER: during 'fgInline' result 'failed this call site' reason 'unprofitable inline' for 'Program:<<Main>$>g__BoxTest|0_0():int' calling 'Program:<<Main>$>g__Unbox|0_1(System.Object)'
INLINER: during 'fgInline' result 'failed this call site' reason 'unprofitable inline'
**************** Inline Tree

Inlines into 06000003 [via ExtendedDefaultPolicy] Program:<<Main>$>g__BoxTest|0_0():int:
  [INL00 IL=0007 TR=000010 06000004] [FAILED: call site: unprofitable inline] Program:<<Main>$>g__Unbox|0_1(System.Object)
Budget: initialTime=105, finalTime=105, initialBudget=1050, currentBudget=1050
Budget: initialSize=473, finalSize=473

*************** Before renumbering the basic blocks

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight   [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1    [000..00F)                           (return)                     i newobj
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

***************  Exception Handling table is empty
=============== No blocks renumbered!

*************** Finishing PHASE Morph - Inlining

Configuration

Windows 11, x64, runtime commit ebf21a4

Regression?

No

@neon-sunset neon-sunset added the tenet-performance Performance related issue label Jul 5, 2024
@neon-sunset neon-sunset changed the title from "A simple method that performs unboxing is considered unprofitable, preventing box stack allocation" to "A simple method that performs unboxing is considered unprofitable inlinee, preventing box stack allocation" Jul 5, 2024
@vcsjones vcsjones added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 5, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jul 5, 2024
@JulieLeeMSFT
Member

cc @AndyAyersMS.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jul 5, 2024
@JulieLeeMSFT JulieLeeMSFT added this to the 9.0.0 milestone Jul 5, 2024
@AndyAyersMS
Member

AndyAyersMS commented Jul 8, 2024

#103361 had a more aggressive inlining boost for "box at callsite" at one point but it looked like it caused a lot of code bloat for little gain, so I backed it down.

Based on your log it looks like a substantial boost would be needed if we're going to do this purely as a caller-side observation. Seems like we'd need to get the callee side involved as well, something like: "if an argument is unboxed and not much else is done with it..."
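
To put a rough number on "substantial", the figures from the log can be plugged into a small calculation. This is only a sketch: it assumes the threshold scales linearly with the multiplier (the logged threshold of 237 is consistent with 85 × 2.8 truncated to an integer) and that a call site is rejected when the callee estimate exceeds the threshold, as the "50.2 > 23.7" line suggests. The variable names mirror the log fields; nothing here is the JIT's actual code.

```csharp
using System;

// Values copied from the inliner log in this issue.
const int calleeNativeSizeEstimate = 502;
const int callsiteNativeSizeEstimate = 85;
const int loggedThreshold = 237;
const double loggedMultiplier = 2.8;

// Assumption inferred from the log: the inline is rejected when the
// callee estimate exceeds the threshold.
bool profitable = calleeNativeSizeEstimate <= loggedThreshold;

// Assumption: threshold is roughly callsite estimate * multiplier, so the
// break-even multiplier is the ratio of the two size estimates.
double breakEven = (double)calleeNativeSizeEstimate / callsiteNativeSizeEstimate;

Console.WriteLine($"profitable at logged threshold: {profitable}");
Console.WriteLine($"break-even multiplier: {breakEven}");
Console.WriteLine($"boost needed over logged multiplier: {breakEven / loggedMultiplier}");
```

Under these assumptions the multiplier would have to rise from 2.8 to roughly 5.9, a bit more than double, before this call site passes on caller-side observations alone.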

@neon-sunset
Contributor Author

neon-sunset commented Jul 8, 2024

Yeah, I opened this issue to ask whether, besides setting the multiplier to some absurdly high number, something else is going on in the profitability calculation, since other factors also seem to make the bar for getting inlined high. I was hoping there is another way to lower that bar instead.

@AndyAyersMS
Member

I don't think we'll get to this in 9.0, so moving to future.

@AndyAyersMS AndyAyersMS modified the milestones: 9.0.0, Future Jul 15, 2024
AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Dec 10, 2024
Especially so if the caller is passing an exact type. This may lead
to the JIT being able to stack allocate the box and promote the underlying
payload.

Fixes dotnet#104479
@AndyAyersMS AndyAyersMS linked a pull request Dec 10, 2024 that will close this issue
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Dec 10, 2024