JIT: Propagate LCL_ADDRs into natural loops #102965

Closed
wants to merge 1 commit into from


jakobbotsch
Member

Just an experiment to see TP cost vs asm diffs... probably unlikely to be worth it.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 2, 2024
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Member Author

Diffs were tiny, doesn't seem to be worth the complexity.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 27, 2024
@AndyAyersMS
Member

@jakobbotsch I am likely going to want to revive this or something like it, so I can get stack allocated enumerator addresses propagated into loops.

But I will also need something more aggressive for handlers, since in my cases there are updates needed in finally/fault regions.

I have a prototype that builds a handler entry assertion set as follows:

  • start with the try region entry assertions set (currently recomputed, but can be simply saved since the try entry should appear before the handler entry in the RPO)
  • remove any assertions for locals defined in the try or any enclosed EH region
  • remove any assertions for locals defined in any enclosing try's filter
  • remove any assertions for locals defined in the handler, if the handler entry is a loop header (or clear all assertions, if handler entry has a non-loop backedge)
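The four steps above can be sketched as a small filter over the try entry set. This is only an illustration of the described rules, not the prototype itself: `Assertion`, `HandlerInfo`, and `ComputeHandlerEntryAssertions` are hypothetical names, each assertion is assumed to be about exactly one local, and the per-region definition sets are assumed to have been computed already.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical model: every assertion is a fact about exactly one local.
struct Assertion
{
    unsigned lclNum; // local the assertion talks about
};

using AssertionSet = std::set<unsigned>; // indices into the assertion table
using LocalSet     = std::set<unsigned>; // local numbers

// Hypothetical summary of the definitions that matter for one handler.
struct HandlerInfo
{
    LocalSet defsInTryAndEnclosedRegions; // locals defined in the try or any enclosed EH region
    LocalSet defsInEnclosingFilters;      // locals defined in any enclosing try's filter
    LocalSet defsInHandler;               // locals defined in the handler itself
    bool     entryIsLoopHeader;           // handler entry is a natural loop header
    bool     entryHasNonLoopBackedge;     // handler entry has a backedge that is not a loop backedge
};

AssertionSet ComputeHandlerEntryAssertions(const std::vector<Assertion>& table,
                                           const AssertionSet&           tryEntry,
                                           const HandlerInfo&            info)
{
    // A non-loop backedge into the handler entry: give up and keep nothing.
    if (info.entryHasNonLoopBackedge)
    {
        return {};
    }

    AssertionSet result;
    for (unsigned index : tryEntry)
    {
        assert(index < table.size());
        const unsigned lcl = table[index].lclNum;

        const bool killed = (info.defsInTryAndEnclosedRegions.count(lcl) != 0) ||
                            (info.defsInEnclosingFilters.count(lcl) != 0) ||
                            (info.entryIsLoopHeader && (info.defsInHandler.count(lcl) != 0));

        if (!killed)
        {
            result.insert(index); // the assertion survives into the handler entry
        }
    }
    return result;
}
```

In this sketch the handler's own definitions only matter when the entry is a loop header, which mirrors the last bullet; a real implementation would presumably use the JIT's bit vector sets rather than `std::set`.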

Diffs from this are still small, but I expect they will be bigger once I figure out how to handle the case where the enumerator is conditionally defined to be a local (as in GDV).

I'm less sure how to compute this at acceptable cost, especially since the loop and EH regions can nest and/or intersect, and we need the computation for loops as well. One thought is to revise the work in this PR to populate suitable per-block bit vectors, so that the per-block analysis (which is the costly part) doesn't need to be done more than once, and then just compute the per-block parts on demand if missing.
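A minimal sketch of the "compute per-block parts on demand" idea, assuming a region (loop or EH region) can be described as a list of block numbers. `BlockDefCache` and the `scanBlock` callback are hypothetical stand-ins for the real per-block walk; nothing here is taken from the JIT sources.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using LocalBits = std::vector<bool>; // one bit per local: "defined here"

// Hypothetical cache: the expensive per-block scan runs at most once per block,
// and region summaries are unions of the cached per-block vectors.
class BlockDefCache
{
public:
    using ScanFn = std::function<std::vector<unsigned>(unsigned)>;

    BlockDefCache(std::size_t numBlocks, std::size_t numLocals, ScanFn scanBlock)
        : m_numLocals(numLocals)
        , m_scanBlock(std::move(scanBlock))
        , m_computed(numBlocks, false)
        , m_bits(numBlocks)
    {
    }

    // Returns the locals defined in one block, scanning it only if missing.
    const LocalBits& DefsInBlock(unsigned blockNum)
    {
        assert(blockNum < m_bits.size());
        if (!m_computed[blockNum])
        {
            LocalBits bits(m_numLocals, false);
            for (unsigned lcl : m_scanBlock(blockNum))
            {
                assert(lcl < m_numLocals);
                bits[lcl] = true;
            }
            m_bits[blockNum]     = std::move(bits);
            m_computed[blockNum] = true;
            ++m_scanCount;
        }
        return m_bits[blockNum];
    }

    // Union of per-block definitions over an arbitrary set of blocks.
    LocalBits DefsInRegion(const std::vector<unsigned>& blocks)
    {
        LocalBits result(m_numLocals, false);
        for (unsigned blockNum : blocks)
        {
            const LocalBits& bits = DefsInBlock(blockNum);
            for (std::size_t i = 0; i < m_numLocals; ++i)
            {
                if (bits[i])
                {
                    result[i] = true;
                }
            }
        }
        return result;
    }

    std::size_t ScanCount() const
    {
        return m_scanCount;
    }

private:
    std::size_t            m_numLocals;
    ScanFn                 m_scanBlock;
    std::vector<bool>      m_computed;
    std::vector<LocalBits> m_bits;
    std::size_t            m_scanCount = 0;
};
```

With this shape, a loop and a try region that share blocks each get their own summary, but the shared blocks are scanned only once; whether the union step itself is cheap enough would still need TP measurements.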

@jakobbotsch
Member Author

@AndyAyersMS I don't think anything special needs to happen for handlers since they are modelled faithfully by the DFS traversal we are using. That is, I think we can just switch the PredBlocks use in StartBlock to BlockPredsWithEH and then remove the bbIsHandlerBeg restriction. I think the existing logic should be fine with those changes, unless you can see any problems with that?

As for LoopDefinitions, there is LoopLocalOccurrences in inductionvariableopts.cpp that I think we should factor and reuse for the purpose here as well. I don't expect the TP cost to be super high for it.

@jakobbotsch
Member Author

jakobbotsch commented Oct 23, 2024

Hmm, that won't be sufficient since we logically need to differentiate between "assertions true at the predecessor's out edges" and "assertions true over the entire predecessor". Perhaps we could just track both...
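One way to picture "tracking both" is a per-block summary that exposes two sets: what holds on the normal out edges, and what holds at every point in the block. The sketch below is an assumption-heavy illustration: `BlockFacts`, `PredEdge`, and `MergePreds` are hypothetical, assertions are modelled as bits in a 64-bit mask, and `gen` is assumed to contain only assertions still live at the end of the block.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using AssertionBits = std::uint64_t; // bit i set: assertion i holds

// Hypothetical per-block dataflow summary.
struct BlockFacts
{
    AssertionBits in;   // assertions true on entry to the block
    AssertionBits gen;  // assertions created in the block and still true at its end
    AssertionBits kill; // assertions invalidated somewhere in the block

    // True on the block's normal out edges.
    AssertionBits Out() const
    {
        return (in & ~kill) | gen;
    }

    // True at every point in the block: held on entry and never killed.
    // Assertions generated mid-block are excluded because they do not
    // hold before the point that creates them.
    AssertionBits Throughout() const
    {
        return in & ~kill;
    }
};

enum class EdgeKind
{
    Regular,    // normal control flow: the predecessor ran to completion
    Exceptional // EH edge: control may leave the predecessor at any point
};

struct PredEdge
{
    const BlockFacts* pred;
    EdgeKind          kind;
};

// Intersect the facts contributed by each predecessor, picking the set
// that is actually valid for that kind of edge.
AssertionBits MergePreds(const std::vector<PredEdge>& preds)
{
    if (preds.empty())
    {
        return 0; // no predecessors: assume nothing
    }

    AssertionBits result = ~AssertionBits(0);
    for (const PredEdge& edge : preds)
    {
        result &= (edge.kind == EdgeKind::Exceptional) ? edge.pred->Throughout() : edge.pred->Out();
    }
    return result;
}
```

Under these assumptions a handler entry reached through exceptional edges only inherits the more conservative set, while ordinary successors keep seeing the usual out set.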
