JIT: Factor loop duplication code #97506
Factor the loop duplication code out of loop cloning and loop unrolling in anticipation of also using it in loop peeling. No diffs expected.

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
    void FlowGraphNaturalLoop::Duplicate(BasicBlock** insertAfter,
                                         BlockToBlockMap* map,
                                         weight_t weightScale,
                                         bool bottomNeedsRedirection)
FWIW, we will be able to get rid of this bottomNeedsRedirection parameter once we don't have fallthrough anymore. We could've probably gotten rid of it here with some complications, but I think it is going to be much easier once we don't have fallthrough, so I didn't want to bother.
Thanks for pointing this out. This PR will probably get merged before #97488, so I'll plan on removing this over there.
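For readers unfamiliar with what a routine shaped like Duplicate does, here is a minimal sketch of the general technique: copy each block of a loop, record the old-to-new mapping, scale the weights, and redirect intra-loop edges to the copies. It is not the JIT's implementation. ToyBlock, ToyBlockMap, and DuplicateLoop are hypothetical names invented for illustration, and the sketch ignores insertion order and fallthrough entirely (so it has no counterpart to insertAfter or bottomNeedsRedirection).

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a flow graph block; NOT the JIT's BasicBlock.
struct ToyBlock
{
    double                 weight = 0.0;
    std::vector<ToyBlock*> succs;
};

// Hypothetical stand-in for a block-to-block map (original -> copy).
using ToyBlockMap = std::unordered_map<const ToyBlock*, ToyBlock*>;

// Copies every block in loopBlocks, records original -> copy in map, and
// returns the copies in the same order. Copies are owned by arena.
std::vector<ToyBlock*> DuplicateLoop(const std::vector<ToyBlock*>&           loopBlocks,
                                     ToyBlockMap&                            map,
                                     double                                  weightScale,
                                     std::vector<std::unique_ptr<ToyBlock>>& arena)
{
    assert(weightScale >= 0.0);

    std::vector<ToyBlock*> copies;

    // Pass 1: create one copy per loop block with a scaled weight.
    for (const ToyBlock* block : loopBlocks)
    {
        auto copy    = std::make_unique<ToyBlock>();
        copy->weight = block->weight * weightScale;
        map[block]   = copy.get();
        copies.push_back(copy.get());
        arena.push_back(std::move(copy));
    }

    // Pass 2: rebuild successor edges. Edges to blocks that have a copy are
    // redirected to that copy; edges leaving the loop keep their original target.
    for (const ToyBlock* block : loopBlocks)
    {
        ToyBlock* copy = map.at(block);
        for (ToyBlock* succ : block->succs)
        {
            auto it = map.find(succ);
            copy->succs.push_back(it != map.end() ? it->second : succ);
        }
    }

    return copies;
}
```

The two-pass structure is a design choice of the sketch: all copies must exist before any edge is redirected, because a back edge can target a block that appears earlier or later in the list.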
Diff results for #97506

Assembly diffs
Assembly diffs for linux/arm64 ran on windows/x64
Diffs are based on 2,498,771 contexts (1,011,240 MinOpts, 1,487,531 FullOpts).
MISSED contexts: 6,580 (0.26%)
Overall (+0 bytes)
FullOpts (+0 bytes)

Throughput diffs
Throughput diffs for linux/arm ran on windows/x86
Overall (-0.00% to +0.01%)
FullOpts (-0.00% to +0.01%)
Throughput diffs for windows/x86 ran on windows/x86
FullOpts (-0.00% to +0.01%)
cc @dotnet/jit-contrib PTAL @BruceForstall
No diffs.
Hmm, there is actually a single diff in linux-arm64. Going to check what that is, but I expect it to just be something around block weighting that is subtly different...
Ah, the diff is because the PR removes this quirk in unrolling: runtime/src/coreclr/jit/optimizer.cpp, lines 1671 to 1674 in c105e7d.
I didn't realize it was still there, but given that it causes only a single diff, removing it in a separate change doesn't seem necessary.