Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Factor loop duplication code #97506

Merged
merged 1 commit into from
Jan 26, 2024

Conversation

jakobbotsch
Copy link
Member

Factor the loop duplication code out of loop cloning and loop unrolling in anticipation of also using it in loop peeling.

No diffs expected.

Factor the loop duplication code out of loop cloning and loop unrolling
in anticipation of also using it in loop peeling.
@ghost ghost assigned jakobbotsch Jan 25, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 25, 2024
@ghost
Copy link

ghost commented Jan 25, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Factor the loop duplication code out of loop cloning and loop unrolling in anticipation of also using it in loop peeling.

No diffs expected.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@ryujit-bot
Copy link

Diff results for #97506

Throughput diffs

Throughput diffs for windows/x86 ran on linux/x86

FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.windows.x86.checked.mch +0.01%

Details here


void FlowGraphNaturalLoop::Duplicate(BasicBlock** insertAfter,
BlockToBlockMap* map,
weight_t weightScale,
bool bottomNeedsRedirection)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, we will be able to get rid of this bottomNeedsRedirection parameter once we don't have fallthrough anymore. We could've probably gotten rid of it here with some complications, but I think it is going to be much easier once we don't have fallthrough, so I didn't want to bother.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. This PR will probably get merged before #97488, so I'll plan on removing this over there.

@ryujit-bot
Copy link

Diff results for #97506

Assembly diffs

Assembly diffs for linux/arm64 ran on windows/x64

Diffs are based on 2,498,771 contexts (1,011,240 MinOpts, 1,487,531 FullOpts).

MISSED contexts: 6,580 (0.26%)

Overall (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 383,838,152 +0
FullOpts (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 168,416,476 +0

Details here


Throughput diffs

Throughput diffs for linux/arm ran on windows/x86

Overall (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%
FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%

Throughput diffs for windows/x86 ran on windows/x86

FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.windows.x86.checked.mch +0.01%

Details here


@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @BruceForstall

No diffs.

@jakobbotsch jakobbotsch marked this pull request as ready for review January 25, 2024 15:40
@ryujit-bot
Copy link

Diff results for #97506

Assembly diffs

Assembly diffs for linux/arm64 ran on windows/x64

Diffs are based on 2,498,771 contexts (1,011,240 MinOpts, 1,487,531 FullOpts).

MISSED contexts: 6,580 (0.26%)

Overall (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 383,838,152 +0
FullOpts (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 168,416,476 +0

Details here


Throughput diffs

Throughput diffs for linux/arm ran on windows/x86

Overall (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%
FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%

Throughput diffs for windows/x86 ran on windows/x86

FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.windows.x86.checked.mch +0.01%

Details here


@ryujit-bot
Copy link

Diff results for #97506

Assembly diffs

Assembly diffs for linux/arm64 ran on windows/x64

Diffs are based on 2,498,771 contexts (1,011,240 MinOpts, 1,487,531 FullOpts).

MISSED contexts: 6,580 (0.26%)

Overall (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 383,838,152 +0
FullOpts (+0 bytes)
Collection Base size (bytes) Diff size (bytes)
libraries_tests.run.linux.arm64.Release.mch 168,416,476 +0

Details here


Throughput diffs

Throughput diffs for linux/arm ran on windows/x86

Overall (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%
FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.linux.arm.checked.mch +0.01%
realworld.run.linux.arm.checked.mch +0.01%

Throughput diffs for windows/x86 ran on windows/x86

FullOpts (-0.00% to +0.01%)
Collection PDIFF
coreclr_tests.run.windows.x86.checked.mch +0.01%

Details here


@jakobbotsch
Copy link
Member Author

Hmm, there is actually a single diff in linux-arm64. Going to check what that is, but I expect it to just be something around block weighting that is subtly different...

@jakobbotsch
Copy link
Member Author

Ah, the diff is because the PR removes this quirk in unrolling:

// TODO-Quirk: Skip empty blocks and go directly to their destination.
BasicBlock* targetBlk = block->Next();
if (targetBlk->KindIs(BBJ_ALWAYS) && targetBlk->isEmpty())
targetBlk = targetBlk->GetTarget();

Didn't realize it was still there, but given the single diff that doesn't seem necessary to do separately.

@jakobbotsch jakobbotsch merged commit 1e8b750 into dotnet:main Jan 26, 2024
@jakobbotsch jakobbotsch deleted the factor-loop-duplication branch January 26, 2024 09:03
@github-actions github-actions bot locked and limited conversation to collaborators Feb 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants