Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: some ideas on high-level representation of runtime operations in IR #9056

Open
AndyAyersMS opened this issue Oct 3, 2017 · 0 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Milestone

Comments

@AndyAyersMS
Copy link
Member

To better support high-level optimizations, it makes sense to try and defer or encapsulate some of the more complex runtime lowerings in the JIT IR. Here are some thoughts on the matter.

Motivations:

  • High-level optimizations would prefer to see logical operators (even if complex) rather than a complex tree or tree sequence
  • Many times these operators become dead and cleaning up after them can be complex if they’ve been expanded
  • Sometimes these operators can be abstractly simplified if they are partially dead. For instance a box used only to feed a type test can become a type lookup.
  • Properties of these operators are not always evident from their expansions, and the expansions can vary considerably, making “reparsing” within the jit to recover information lost during expansion problematic
  • Often of these operators have nice properties (invariant, nonfaulting) and would be good candidates for hoisting, but their complex shape makes this difficult/costly.
  • Often the equivalence of two such operators can be stated rather simply as equivalence of some abstract inputs, making CSE/value numbering simple.

Possible candidates for this kind of encapsulation include

  • Runtime lookup
  • Static field access
  • Box (already semi-encapsulated)
  • Unbox
  • Cast/Isint
  • Allocation (already encapsulated)
  • Class initialization

The downside to encapsulation is that the subsequent expansion is context dependent. The jit would have to ensure that it could retain all the necessary bits of context so it could query the runtime when it is time to actually expand the operation. This becomes complicated when these runtime operators are created during inlining, as sometimes inlining must be abandoned when the runtime operator expansions become complex. So it could be this approach becomes somewhat costly in space (given the amount of retained context per operator) or in time (since we likely must simulate enough of the expansion during inlining to see if problematic cases arise).

We’d also have more kinds of operations flowing around in the IR and would need to decide when to remove/expand them. This can be done organically, removing the operations just after the last point at which some optimization is able to reason about them. Initially perhaps they’d all vanish after inlining or we could repurpose the object allocation lowering to become a more general runtime lowering.

Instead of full encapsulation, we might consider relying initially on partial encapsulation like we do now for box: introduce a “thin” unary encapsulation wrapper over a fully expanded tree that identifies the tree as an instance of some particular runtime operation (and possibly, as in box, keeping tabs on related upstream statements) with enough information to identify the key properties. Expansion would be simple: the wrapper would disappear at a suitable downstream phase, simply replaced by its content. These thin wrappers would not need to capture all the context, but just add a small amount of additional state. Current logic for abandoning inlines in the face of complex expansion would apply, so no new logic would be needed.

As opportunities arise we can then gradually convert the thin wrappers to full encapsulations; most “upstream” logic should not care that much since presumably the expanded subtrees, once built, do not play any significant role in high-level optimization, so their creation could be deferred.

So I’m tempted to say that thin encapsulation gives us the right set of tradeoffs, and start building upon that.

The likely first target is the runtime lookups feeding type equality and eventually type cast operations. Then probably static field accesses feeding devirtualization opportunities.

If you’re curious what this would look like, here’s a prototype: master..AndyAyersMS:WrapRuntimeLookup

And here’s an example using the prototype. In this case the lookup tree is split off into an earlier statement, but at the point of use we can still see some information about what type the tree intends to look up. A new jit interface call (not present in the fork above) can use this to determine if the types are possibly equal or not equal, even with runtime lookups for one or both inputs.

fgMorphTree BB01, stmt 3 (before)
               [000037] --C-G-------             *  JTRUE     void  
               [000035] ------------             |  /--*  CNS_INT   int    0
               [000036] --C-G-------             \--*  EQ        int   
               [000032] --C-G-------                \--*  CALL      int    System.Type.op_Equality
               [000028] --C-G------- arg0              +--*  CALL help ref    HELPER.CORINFO_HELP_TYPEHANDLE_TO_RUNTIMETYPE
               [000026] ------------ arg0              |  \--*  RUNTIMELOOKUP long   0x7ffcc5769428 class
               [000025] ------------                   |     \--*  LCL_VAR   long   V03 loc1         
               [000031] --C-G------- arg1              \--*  CALL help ref    HELPER.CORINFO_HELP_TYPEHANDLE_TO_RUNTIMETYPE
               [000029] ------------ arg0                 \--*  CNS_INT(h) long   0x7ffcc5769530 class

By default the wrapper just evaporates in morph:

Optimizing call to Type:op_Equality to simple compare via EQ
Optimizing compare of types-from-handles to instead compare handles

fgMorphTree BB01, stmt 3 (after)
               [000037] ----G+------             *  JTRUE     void  
               [000029] -----+------             |  /--*  CNS_INT(h) long   0x7ffcc5769530 class
               [000170] J----+-N----             \--*  NE        int   
               [000025] -----+------                \--*  LCL_VAR   long   V03 loc1         

But in morph and upstream it can be used to trigger new optimizations.

category:implementation
theme:ir
skill-level:expert
cost:medium

AndyAyersMS referenced this issue in AndyAyersMS/coreclr Oct 5, 2017
Work based on the plan outlined in #14305.

Introduce a new unary node type GT_RUNTIMELOOKUP that wraps existing
runtime lookup trees created in `impLookupToTree` and holds onto the
handle that inspired the lookup.

Note there are other importer paths that create lookups directly that
we might also want to wrap, though wrapping is not a requirement for
correctness.

Keep this node type around through morph, then unwrap and just use
the wrapped node after that.
AndyAyersMS referenced this issue in dotnet/coreclr Oct 5, 2017
* JIT: Wrap some runtime lookups in new node type

Work based on the plan outlined in #14305.

Introduce a new unary node type GT_RUNTIMELOOKUP that wraps existing
runtime lookup trees created in `impLookupToTree` and holds onto the
handle that inspired the lookup.

Note there are other importer paths that create lookups directly that
we might also want to wrap, though wrapping is not a requirement for
correctness.

Keep this node type around through morph, then unwrap and just use
the wrapped node after that.

* JIT: More enhancements to type equality testing

The jit is now able to optimize some type equality checks in shared
method instances where one or both of the types require runtime lookup,
for instance comparing `T` to a value type, or comparing two different
value types `V1<T>` and `V2<T>`.

Add two new jit interface methods, one for testing for type equality and
the other for type casting.

These return Must/MustNot/Maybe results depending on whether or not
the equality or cast can be resolved at jit time.

Implement the equality check. Use this to enhance the type equality opts
in the jit for both direct comparison and the checking done by unbox.any.
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

3 participants