-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Ensure edge likelihoods 1/N #97740
Conversation
…incorporation, stop checking very early
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue Detailsnull
|
@amanasifkhalid FYI -- starting to chip away here. |
So far I am following the rule that the caller of |
Very odd that I didn't see that build error locally. |
Am going to make |
Thanks for getting started on this! By the way, #97664 will probably conflict with your changes to |
No worries, this is still work in progress. I am going to keep pushing the checkpoint back and see how far I can get easily. |
Not surprisingly, x86 is special. Let me clean that up next. |
Diff results for #97740Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 2,507,291 contexts (1,007,092 MinOpts, 1,500,199 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 27 (0.00%) Overall (-888 bytes)
FullOpts (-888 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 2,517,881 contexts (991,070 MinOpts, 1,526,811 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 28 (0.00%) Overall (+584 bytes)
FullOpts (+584 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 2,270,844 contexts (932,669 MinOpts, 1,338,175 FullOpts). MISSED contexts: base: 2 (0.00%), diff: 26 (0.00%) Overall (-1,088 bytes)
FullOpts (-1,088 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 2,341,083 contexts (938,449 MinOpts, 1,402,634 FullOpts). MISSED contexts: base: 9 (0.00%), diff: 34 (0.00%) Overall (-928 bytes)
FullOpts (-928 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 2,512,183 contexts (997,391 MinOpts, 1,514,792 FullOpts). MISSED contexts: base: 3 (0.00%), diff: 29 (0.00%) Overall (+412 bytes)
FullOpts (+412 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 2,239,389 contexts (829,328 MinOpts, 1,410,061 FullOpts). MISSED contexts: base: 71,274 (3.08%), diff: 71,275 (3.08%) Overall (-254 bytes)
FullOpts (-254 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 2,293,420 contexts (839,658 MinOpts, 1,453,762 FullOpts). MISSED contexts: base: 45 (0.00%), diff: 47 (0.00%) Overall (+147 bytes)
FullOpts (+147 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (-0.01% to -0.00%)
FullOpts (-0.02% to -0.00%)
Throughput diffs for linux/x64 ran on windows/x64Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (-0.00% to +0.01%)
FullOpts (-0.00% to +0.01%)
Throughput diffs for osx/arm64 ran on windows/x64Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (-0.01% to -0.00%)
MinOpts (+0.00% to +0.01%)
FullOpts (-0.02% to -0.00%)
Throughput diffs for windows/arm64 ran on windows/x64Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (-0.01% to -0.00%)
MinOpts (+0.00% to +0.01%)
FullOpts (-0.02% to -0.00%)
Throughput diffs for windows/x64 ran on windows/x64Warning: Different compilers used for base and diff JITs. Results may be misleading. MinOpts (-0.00% to +0.01%)
FullOpts (-0.00% to +0.01%)
Details here Throughput diffs for linux/arm ran on windows/x86Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (-0.06% to +0.01%)
MinOpts (-0.16% to +0.00%)
FullOpts (+0.00% to +0.01%)
Throughput diffs for windows/x86 ran on windows/x86Warning: Different compilers used for base and diff JITs. Results may be misleading. Overall (+0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here |
about to push a bunch of fixes ... will also merge up. |
Moved the likelihood checking boundary farther back in the phase list. Next up are the EH opts. So far most of the fixes are stylized. A few will require revisiting. Still sticking with the plan of not adding any automagic fixups just yet. |
Diff results for #97740Assembly diffsAssembly diffs for osx/arm64 ran on linux/x64Diffs are based on 2,270,844 contexts (932,669 MinOpts, 1,338,175 FullOpts). MISSED contexts: base: 2 (0.00%), diff: 26 (0.00%) Overall (-1,196 bytes)
FullOpts (-1,196 bytes)
Assembly diffs for windows/arm64 ran on linux/x64Diffs are based on 2,341,083 contexts (938,449 MinOpts, 1,402,634 FullOpts). MISSED contexts: base: 9 (0.00%), diff: 34 (0.00%) Overall (-1,000 bytes)
FullOpts (-1,000 bytes)
Assembly diffs for windows/x64 ran on linux/x64Diffs are based on 2,512,183 contexts (997,391 MinOpts, 1,514,792 FullOpts). MISSED contexts: base: 3 (0.00%), diff: 29 (0.00%) Overall (+680 bytes)
FullOpts (+680 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.01%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.02%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.01%)
MinOpts (-0.00% to +0.02%)
FullOpts (+0.01%)
Throughput diffs for windows/arm64 ran on windows/x64Overall (+0.01%)
MinOpts (+0.00% to +0.01%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.02%)
FullOpts (+0.01% to +0.02%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (-0.06% to +0.01%)
MinOpts (-0.16% to +0.01%)
FullOpts (+0.00% to +0.01%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.02%)
FullOpts (+0.00% to +0.01%)
Details here Throughput diffs for linux/x64 ran on linux/x64Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here |
Diff results for #97740Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 2,507,291 contexts (1,007,092 MinOpts, 1,500,199 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 27 (0.00%) Overall (-988 bytes)
FullOpts (-988 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 2,517,881 contexts (991,070 MinOpts, 1,526,811 FullOpts). MISSED contexts: base: 1 (0.00%), diff: 28 (0.00%) Overall (+817 bytes)
FullOpts (+817 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 2,270,844 contexts (932,669 MinOpts, 1,338,175 FullOpts). MISSED contexts: base: 2 (0.00%), diff: 26 (0.00%) Overall (-1,196 bytes)
FullOpts (-1,196 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 2,341,083 contexts (938,449 MinOpts, 1,402,634 FullOpts). MISSED contexts: base: 9 (0.00%), diff: 34 (0.00%) Overall (-1,000 bytes)
FullOpts (-1,000 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 2,512,183 contexts (997,391 MinOpts, 1,514,792 FullOpts). MISSED contexts: base: 3 (0.00%), diff: 29 (0.00%) Overall (+680 bytes)
FullOpts (+680 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 2,239,389 contexts (829,328 MinOpts, 1,410,061 FullOpts). MISSED contexts: base: 71,274 (3.08%), diff: 71,275 (3.08%) Overall (-312 bytes)
FullOpts (-312 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 2,293,449 contexts (839,658 MinOpts, 1,453,791 FullOpts). MISSED contexts: base: 45 (0.00%), diff: 47 (0.00%) Overall (+171 bytes)
FullOpts (+171 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.01%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.02%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.01%)
MinOpts (-0.00% to +0.02%)
FullOpts (+0.01%)
Throughput diffs for windows/arm64 ran on windows/x64Overall (+0.01%)
MinOpts (+0.00% to +0.01%)
FullOpts (+0.01% to +0.02%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.01% to +0.02%)
MinOpts (+0.00% to +0.02%)
FullOpts (+0.01% to +0.02%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (-0.06% to +0.01%)
MinOpts (-0.16% to +0.01%)
FullOpts (+0.00% to +0.01%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.02%)
FullOpts (+0.00% to +0.01%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.00% to +0.01%)
MinOpts (-0.00% to +0.01%)
FullOpts (+0.00% to +0.01%)
Details here |
Think this is far enough for one PR. Will convert to non-draft. |
FYI @dotnet/jit-contrib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for getting this started!
Let me know if you'd like me to work on the next few phases. I imagine the bulk of the work will be a repeat of what you did here.
fgRemoveRefPred(oldTarget, block); | ||
fgAddRefPred(newTarget, block); | ||
|
||
if (block->TargetIs(oldTarget)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, did you run into any situations where we call fgReplaceJumpTarget
on a block that doesn't have oldTarget
as its current target? Or can we get rid of this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have been a merge error -- I was not trying to change behavior.
Will clean it up in the next phase of this work.
} | ||
|
||
// checkBlock | ||
// Todo: get likelihoods right |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not in scope for this PR, but I'm curious what your ideas are for coming up with better guesses for the likelihoods in cases like this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For that specific case the logic is in a base class so we will need the various derived classes to make the right info available.
Generally speaking if you are introducing new conditional flow you should have some idea of the relative frequencies of the different possibilities. If not you can always go off and try and do some custom measurements to see what happens.
One example where I may do that is in the "OSR step block" case -- here I do not know the right answer, my guess would be that it depends on whether the try that encompasses the OSR entry point is itself nested in a loop. But to be sure I'd have to go measure.
Start enforcing that all
FlowEdge
s have likelihoods set (ideally, to the right value).Update the profile checker to allow checking just for likelihoods. Enable this for all the initial phases up through and including
fgAddInternal
.Fix missing flow edge issues by generally having whatever bit of code calls
fgAddRef
also ensure likelihood is set. For many cases these are unconditional flows so the value is just 1.0. Sometimes we can pass the old edge in and havefgAddRef
itself do the work. And in a few cases (switches) things are a bit more subtle.The general plan is to leave things like this for now until we've fixed likelihoods everywhere and are checking all phases. Then we'll go back and try and refactor
fgAddRef
to do more of this automatically.Contributes to #93020