Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: better optimization for CORINFO_HELP_RUNTIMEHANDLE_METHOD #7590

Closed
JosephTremoulet opened this issue Mar 9, 2017 · 14 comments
Closed
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Milestone

Comments

@JosephTremoulet
Copy link
Contributor

This helper depends on the (exact) type of the object passed to it, not the contents/fields of that objects. Since an object's type is initialized anywhere the object pointer is available and is immutable for the lifetime of the object, we should therefore be able to give this helper call a value number that is a function of the pointer argument, rather than the memory the argument points to, and eliminate redundant calls such as we see in shared generic code.
category:cq
theme:helpers
skill-level:expert
cost:medium

@russellhadley
Copy link
Contributor

Keep in 2.0 for Span on Kestrel.

@JosephTremoulet
Copy link
Contributor Author

We already recognize this helper of pure, and our definition of "pure" already requires that the helper's behavior not depend on mutable heap state, so we can already CSE such calls.

The cases showing up in the SpanBench benchmark aren't getting CSE'd because each is conditionally executed (so the 1st occurrence isn't available at the 2nd one on the path that branched around it).

That said, perhaps we should be able to recognize the conditions guarding them as equivalent? The test does a two-level indirection off TypeCtx and then checks if the result is nonzero. Here's where we value-number the first two-level indirection:

***** BB03, stmt 5 (before)
N007 (  7,  7) [000046] ------------             /--*  indir     long  
N005 (  1,  1) [000044] ------------             |  |  /--*  const     long   16
N006 (  5,  5) [000045] -------N----             |  \--*  +         long  
N004 (  4,  4) [000043] #-----------             |     \--*  indir     long  
N002 (  1,  1) [000041] ------------             |        |  /--*  const     long   16
N003 (  2,  2) [000042] -------N----             |        \--*  +         long  
N001 (  1,  1) [000039] ------------             |           \--*  lclVar    long   V00 TypeCtx      u:2
N009 (  7,  7) [000048] -A------R---             *  =         long  
N008 (  1,  1) [000047] D------N----             \--*  lclVar    long   V09 tmp1         d:3

N001 [000039]   lclVar    V00 TypeCtx      u:2 => $80 {InitVal($40)}
N002 [000041]   const     16 => $141 {LngCns:  16}
N003 [000042]   +         => $200 {+($80, $141)}
    VNForMapSelect($2, $200):ref returns $240 {$VN.ReadOnlyHeap[$200]}
    VNForMapSelect($2, $200):ref returns $240 {$VN.ReadOnlyHeap[$200]}
N004 [000043]   indir     => $240 {$VN.ReadOnlyHeap[$200]}
N005 [000044]   const     16 => $141 {LngCns:  16}
N006 [000045]   +         => $204 {+($141, $240)}
N007 [000046]   indir     => <l:$282 {ByrefExposedLoad($45, $204, $480)}, c:$2c6 {2c6}>
N008 [000047]   lclVar    V09 tmp1         d:3 => <l:$282 {ByrefExposedLoad($45, $204, $480)}, c:$2c6 {2c6}>
N009 [000048]   =         => <l:$282 {ByrefExposedLoad($45, $204, $480)}, c:$2c6 {2c6}>

***** BB03, stmt 5 (after)
N007 (  7,  7) [000046] ------------             /--*  indir     long   <l:$282, c:$2c6>
N005 (  1,  1) [000044] ------------             |  |  /--*  const     long   16 $141
N006 (  5,  5) [000045] -------N----             |  \--*  +         long   $204
N004 (  4,  4) [000043] #-----------             |     \--*  indir     long   $240
N002 (  1,  1) [000041] ------------             |        |  /--*  const     long   16 $141
N003 (  2,  2) [000042] -------N----             |        \--*  +         long   $200
N001 (  1,  1) [000039] ------------             |           \--*  lclVar    long   V00 TypeCtx      u:2 $80
N009 (  7,  7) [000048] -A------R---             *  =         long   <l:$282, c:$2c6>
N008 (  1,  1) [000047] D------N----             \--*  lclVar    long   V09 tmp1         d:3 <l:$282, c:$2c6>

and here's where we value-number the second:

***** BB11, stmt 20 (before)
N007 (  7,  7) [000109] ------------             /--*  indir     long  
N005 (  1,  1) [000107] ------------             |  |  /--*  const     long   16
N006 (  5,  5) [000108] -------N----             |  \--*  +         long  
N004 (  4,  4) [000106] #-----------             |     \--*  indir     long  
N002 (  1,  1) [000104] ------------             |        |  /--*  const     long   16
N003 (  2,  2) [000105] -------N----             |        \--*  +         long  
N001 (  1,  1) [000102] ------------             |           \--*  lclVar    long   V00 TypeCtx      u:2
N009 (  7,  7) [000111] -A------R---             *  =         long  
N008 (  1,  1) [000110] D------N----             \--*  lclVar    long   V11 tmp3         d:3

N001 [000102]   lclVar    V00 TypeCtx      u:2 => $80 {InitVal($40)}
N002 [000104]   const     16 => $141 {LngCns:  16}
N003 [000105]   +         => $200 {+($80, $141)}
    VNForMapSelect($2, $200):ref returns $240 {$VN.ReadOnlyHeap[$200]}
    VNForMapSelect($2, $200):ref returns $240 {$VN.ReadOnlyHeap[$200]}
N004 [000106]   indir     => $240 {$VN.ReadOnlyHeap[$200]}
N005 [000107]   const     16 => $141 {LngCns:  16}
N006 [000108]   +         => $204 {+($141, $240)}
N007 [000109]   indir     => <l:$284 {ByrefExposedLoad($45, $204, $5c0)}, c:$600 {600}>
N007 [000046]   indir     => <l:$282 {ByrefExposedLoad($45, $204, $480)}, c:$2c6 {2c6}>
N008 [000110]   lclVar    V11 tmp3         d:3 => <l:$284 {ByrefExposedLoad($45, $204, $5c0)}, c:$600 {600}>
N009 [000111]   =         => <l:$284 {ByrefExposedLoad($45, $204, $5c0)}, c:$600 {600}>

***** BB11, stmt 20 (after)
N007 (  7,  7) [000109] ------------             /--*  indir     long   <l:$284, c:$600>
N005 (  1,  1) [000107] ------------             |  |  /--*  const     long   16 $141
N006 (  5,  5) [000108] -------N----             |  \--*  +         long   $204
N004 (  4,  4) [000106] #-----------             |     \--*  indir     long   $240
N002 (  1,  1) [000104] ------------             |        |  /--*  const     long   16 $141
N003 (  2,  2) [000105] -------N----             |        \--*  +         long   $200
N001 (  1,  1) [000102] ------------             |           \--*  lclVar    long   V00 TypeCtx      u:2 $80
N009 (  7,  7) [000111] -A------R---             *  =         long   <l:$284, c:$600>
N008 (  1,  1) [000110] D------N----             \--*  lclVar    long   V11 tmp3         d:3 <l:$284, c:$600>

We recognize that TypeCtx itself points to read-only memory, but assume the memory loaded by the 2nd-level dereference may be changing. @AndyAyersMS, should we know more about the aliasing properties of that 2nd-level dereference?

(even if we did recognize the tests as redundant, though, we currently don't track gating predicates when value-numbering phis, and so wouldn't recognize the second sequence as redundant without addressing that)

@ahsonkhan
Copy link
Member

@JosephTremoulet, any updates on this issue?

@JosephTremoulet
Copy link
Contributor Author

@JosephTremoulet, any updates on this issue?

Not yet; still need to track down what that context/dictionary lookup sequence is loading and whether we can assume it is immutable to know if this is possible. Assuming it is, doing a full-blown control dependence analysis for tracking phi gating predicates is likely out-of-scope for a while, but we might be able to get some mileage out of recognizing cases where the flow graph has a simple single-conditional-block "if" or two-conditional-block "if/else" pattern and number phis like questions in those join blocks.

@russellhadley
Copy link
Contributor

After discussion with @JosephTremoulet swapping this for the span bounds check work in #7589.

@JosephTremoulet
Copy link
Contributor Author

JosephTremoulet commented Apr 7, 2017

It turns out the indirection in question is one that we generate when the JIT calls embedGenericHandle, and that returns a CORINFO_GENERICHANDLE_RESULT whose CORINFO_LOOKUP is a runtime lookup whose CORINFO_RUNTIME_LOOKUP has its testForNull flag set.

The comment for testForNull at its definition is If set, test for null and branch to helper if null, and at its use in the jit is

      2b. pLookup->testForNull == true : Dereference the instantiation-specific handle.
          If it is non-NULL, it is the handle required. Else, call a helper
          to lookup the handle.

It's not clear to me from either of those if we can/should assume that the dereference we generate to implement the testForNull in a CORINFO_RUNTIME_LOOKUP is invariant (specifically, if we find two such loads that are loading from the same address, can we assume that they load the same value? Or, alternatively, can we assume that if they load different values then the first one loaded null but the thing got pack-patched with the result of the run-time lookup and so the second load would get the value that was returned by the helper that was called when the first one was null?)

I'm not familiar enough with the runtime side to know the answer. @jkotas?

@jkotas
Copy link
Member

jkotas commented Apr 7, 2017

can we assume that if they load different values then the first one loaded null but the thing got pack-patched with the result of the run-time lookup and so the second load would get the value that was returned by the helper that was called when the first one was null?

This is the correct assumption to make. The indirection is lazily initialized. If it is non-null, you have the good value. If it is null, the helper will initialize so that it is non-null next time.

@JosephTremoulet
Copy link
Contributor Author

So then the importer is correct to not flag the load as invariant. The whole load-test-init sequence is CSEable, but trying to mark all the individual pieces as such would lead to redundant calls to the initialization helper. It may be worth considering delaying the expansion of the lookup sequence so that we could value-number it as a whole before exposing the parts...

@JosephTremoulet JosephTremoulet removed their assignment Sep 14, 2017
@JosephTremoulet
Copy link
Contributor Author

JosephTremoulet commented Oct 2, 2017

Another approach would be to track assertions of the form "INDIR (INDIR $Ctx) is not null". Then the call to CORINFO_HELP_RUNTIMEHANDLE_METHOD could generate such an assertion. I believe we already use assertion prop to recognize redundant "if (p == null)" tests; presumably we could adjust that to recognize when "p" has the form "**Ctx" for one of these type init lookups, and then phrase the assertions in terms of Ctx, or even in terms of the value number of *Ctx (since we recognize that first load as invariant), and let our normal assertion merging handle the join.

@ahsonkhan
Copy link
Member

ahsonkhan commented Oct 18, 2017

@JosephTremoulet, any updates on this issue?
This looks like the last issue under "JIT - Optimizations" from https://github.com/dotnet/coreclr/issues/5851#issue-160849016

@JosephTremoulet
Copy link
Contributor Author

any updates on this issue?

I suppose @AndyAyersMS or @RussKeldorph would be in the best position to answer that.

@AndyAyersMS
Copy link
Member

We are not going to get to this in 2.1 so we should update #6168 to reflect this.

@ahsonkhan
Copy link
Member

We are not going to get to this in 2.1 so we should update #6168 to reflect this.

I updated the original post of https://github.com/dotnet/coreclr/issues/5851. Thanks

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@AndyAyersMS
Copy link
Member

@EgorBo I think #81635 may have covered this?

@EgorBo EgorBo closed this as completed May 6, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

8 participants