JIT: some ideas on high-level representation of runtime operations in IR #9056
Labels: area-CodeGen-coreclr, enhancement, JitUntriaged, optimization, tenet-performance
To better support high-level optimizations, it makes sense to try to defer or encapsulate some of the more complex runtime lowerings in the JIT IR. Here are some thoughts on the matter.
Motivations:
Possible candidates for this kind of encapsulation include
The downside to encapsulation is that the subsequent expansion is context-dependent. The JIT would have to ensure that it retains all the necessary bits of context so it can query the runtime when it is time to actually expand the operation. This becomes complicated when these runtime operators are created during inlining, since inlining sometimes must be abandoned when the runtime operator expansions turn out to be complex. So this approach could become somewhat costly in space (given the amount of retained context per operator) or in time (since we would likely have to simulate enough of the expansion during inlining to see whether problematic cases arise).
We’d also have more kinds of operations flowing around in the IR and would need to decide when to remove or expand them. This can be done organically, removing each operation just after the last point at which some optimization is able to reason about it. Initially perhaps they’d all vanish after inlining, or we could repurpose the object allocation lowering to become a more general runtime lowering.
Instead of full encapsulation, we might initially rely on partial encapsulation, as we do today for box: introduce a “thin” unary wrapper over a fully expanded tree that identifies the tree as an instance of some particular runtime operation (and possibly, as with box, keeps tabs on related upstream statements), with enough information to identify the operation’s key properties. Expansion would be simple: at a suitable downstream phase the wrapper would disappear, simply replaced by its contents. These thin wrappers would not need to capture all the context, just a small amount of additional state. The current logic for abandoning inlines in the face of complex expansions would still apply, so no new logic would be needed.
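As a toy illustration of the thin-wrapper shape (hypothetical names throughout; the real JIT IR uses `GenTree` nodes, and the existing box wrapper is `GT_BOX`), here is a minimal C++ sketch: a wrapper node annotates an already fully expanded subtree, carries one extra piece of state, and "expansion" just drops the wrapper and keeps its child.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy IR node: either a plain operation or a thin "runtime lookup" wrapper.
// Names are illustrative only, not the real JIT's node kinds.
struct Node {
    std::string op;               // e.g. "CALL helper" or "RUNTIME_LOOKUP"
    std::shared_ptr<Node> child;  // the fully expanded tree being wrapped
    const void* typeHandle = nullptr; // small extra state the wrapper carries

    bool isThinWrapper() const { return op == "RUNTIME_LOOKUP"; }
};

// Expanding a thin wrapper is trivial: discard the wrapper node and
// return the subtree it was annotating. Non-wrapper nodes pass through.
std::shared_ptr<Node> expandWrapper(const std::shared_ptr<Node>& n) {
    if (n && n->isThinWrapper())
        return n->child;
    return n;
}
```

Upstream phases that understand the wrapper can read `typeHandle` to reason about the operation; any phase that doesn't care sees an ordinary tree once `expandWrapper` has run.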
As opportunities arise we can then gradually convert the thin wrappers into full encapsulations; most “upstream” logic should not care much, since presumably the expanded subtrees, once built, play no significant role in high-level optimization, so their creation could be deferred.
So I’m tempted to say that thin encapsulation gives us the right set of tradeoffs, and to start building on that.
The likely first target is the runtime lookups feeding type equality tests and, eventually, type cast operations. Then probably static field accesses feeding devirtualization opportunities.
If you’re curious what this would look like, here’s a prototype: master..AndyAyersMS:WrapRuntimeLookup
And here’s an example using the prototype. In this case the lookup tree is split off into an earlier statement, but at the point of use we can still see some information about what type the tree intends to look up. A new JIT interface call (not present in the fork above) can use this to determine whether the types are possibly equal or not equal, even with runtime lookups for one or both inputs.
By default the wrapper just evaporates in morph:
But in morph and upstream it can be used to trigger new optimizations.
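To make the optimization concrete, here is a hedged C++ sketch of the kind of three-valued answer such a JIT-EE query might give (the enum name mirrors the `TypeCompareState` shape that later appeared in the JIT interface; the struct and function here are purely illustrative): if both wrapped lookups recorded exact type handles, equality is decidable at JIT time even though the actual lookups run at runtime.

```cpp
#include <cassert>

// Three-valued result for "are these two types equal?":
// May   - cannot tell at JIT time
// Must  - provably equal
// MustNot - provably unequal
enum class TypeCompareState { May, Must, MustNot };

// Toy summary of what a thin runtime-lookup wrapper might carry
// (hypothetical; not the real JIT's representation).
struct TypeInfo {
    const void* exactHandle; // class handle, if statically known
    bool isExact;            // true when exactHandle is meaningful
};

// If both sides are exact, compare handles; otherwise we can't say.
TypeCompareState compareTypes(const TypeInfo& a, const TypeInfo& b) {
    if (a.isExact && b.isExact)
        return (a.exactHandle == b.exactHandle) ? TypeCompareState::Must
                                                : TypeCompareState::MustNot;
    return TypeCompareState::May;
}
```

A `Must` or `MustNot` answer lets morph fold the type-equality test to a constant and delete the now-dead lookup trees; `May` leaves the expanded comparison in place.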
category:implementation
theme:ir
skill-level:expert
cost:medium