JIT: Adjust physical promotion heuristics #86660
Adjust the heuristics to take into account recent work on liveness and assignment decomposition. Stop phrasing things in terms of code size (multiplied by basic block weights, which does not make sense).
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
cc @dotnet/jit-contrib PTAL @AndyAyersMS.
This is a first round of adjustments; I might dial some of this in more later. In particular, assignments with overlapped structs currently have no costing, and I need to come up with a good way of costing them.
Diffs without old promotion. -6.6 MB on win-x64 over the old heuristics.
// TODO-CQ: We can make much better guesses on what will and won't be contained.
costWithout += access.CountWtd * 6.5;
// We cost any normal access (which is a struct load or store) without promotion at 3 cycles.
costWithout += access.CountWtd * 3;
Consider making all these weight factors symbolic so later on we can vary them more readily.
Makes sense, I was initially using the perfscore constants here but noticed some oddities (e.g. stack writes are cheaper on arm64 than stack reads but they are the same cost for xarch). I can add some names the next time around.
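As a rough illustration of the suggestion above, the weight factors could be pulled out into named constants along these lines. This is only a sketch: the `PromotionCosts` namespace, the constant name, the simplified `Access` struct, and the `CostWithoutPromotion` helper are all hypothetical and not taken from the JIT sources; only the value 3 and the `costWithout += access.CountWtd * 3` shape come from the snippet under review.

```cpp
#include <vector>

// Hypothetical named cost factors, in cycles per weighted access.
// The value 3 is the one used in the reviewed snippet; the name is illustrative.
namespace PromotionCosts
{
constexpr double StructAccessWithoutPromotion = 3.0;
}

// Simplified stand-in for the JIT's access record (assumption): only the
// block-weighted access count is modelled here.
struct Access
{
    double CountWtd;
};

// Sums the estimated cost of leaving the accesses unpromoted, using the named
// factor instead of a literal so that it can be varied in one place.
double CostWithoutPromotion(const std::vector<Access>& accesses)
{
    double costWithout = 0.0;
    for (const Access& access : accesses)
    {
        costWithout += access.CountWtd * PromotionCosts::StructAccessWithoutPromotion;
    }
    return costWithout;
}
```

Keeping each factor behind a name would also make it possible to specialize the values per target later, which is where the arm64 vs. xarch difference mentioned above would come in.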
// fields we are promoting together, evaluating all of them at once in
// comparison with the covering struct uses. This will also allow us to
// give a bonus to promoting remainders that may not have scalar uses
// but will allow fully decomposing assignments away.
Sounds ambitious, but I like the idea of costing things as sets.
Yeah, hopefully I can approximate this in a simple/cheap way. But I think I will approach this by writing down an integer linear programming form of the problem and seeing if it gives me any insights.
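For readers unfamiliar with the idea, one possible shape of such a formulation is sketched below. This is not the formulation from the PR; every symbol is an assumption made for illustration. Here $x_f \in \{0,1\}$ says whether candidate field $f$ is promoted, $b_f$ is the weighted benefit of its scalar accesses, $c_f$ is the weighted cost that promoting $f$ adds to covering struct uses, and $y_a \in \{0,1\}$ says whether struct assignment $a$ can be fully decomposed away, which earns a bonus $d_a$ only when every field in its set $F_a$ (including remainders) is promoted.

```latex
\begin{aligned}
\max_{x,\,y} \quad & \sum_{f} (b_f - c_f)\, x_f \;+\; \sum_{a} d_a\, y_a \\
\text{subject to} \quad & y_a \le x_f && \text{for all } a \text{ and all } f \in F_a, \\
& x_f \in \{0,1\}, \quad y_a \in \{0,1\}.
\end{aligned}
```

The constraint $y_a \le x_f$ is what ties the fields together as a set: a remainder with $b_f - c_f \le 0$ can still be worth promoting if doing so unlocks enough $d_a$ bonuses.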