[SimplifyCFG] Always allow hoisting if all instructions match. #97158

Merged: 2 commits, Dec 13, 2024

fhahn
Contributor

@fhahn fhahn commented Jun 29, 2024

Generalize hoistCommonCodeFromSuccessors's EqTermsOnly to
AllInstsEqOnly and always allow hoisting if all instructions match.

In that case, all instructions can be hoisted, the original branch is
replaced, and selects are added for the PHIs. This allows preserving
metadata in more cases using the existing hoisting logic, whereas
previously FoldTwoEntryPHINode would drop the metadata.

https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u

@llvmbot
Member

llvmbot commented Jun 29, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Update FoldTwoEntryPHINode to collect common TBAA metadata for instructions that match in all if-blocks and have the same TBAA metadata. If that is the case, they access the same type on all paths and the TBAA info can be preserved after hoisting.

I think we should be able to preserve most metadata if it is available on matching instructions in all blocks, i.e. preserve the intersection of the metadata on all matching instructions. I couldn't find any utility that already computes that intersection. At the moment, the matching instructions must appear in the same order in every block.
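
The intersection mentioned above can be illustrated with a toy model. This is only a sketch, not LLVM code: `ToyInst`, `MetadataMap`, and `intersectMetadata` are hypothetical names, and metadata is modeled as a map from a kind string to an opaque node id. Real metadata kinds may need kind-specific merging rather than plain equality.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an instruction's metadata: kind name -> opaque node id.
// Hypothetical model; this is not how LLVM stores metadata.
using MetadataMap = std::map<std::string, int>;

struct ToyInst {
  std::string Opcode;
  MetadataMap MD;
};

// Return the metadata entries that every instruction in Insts carries with
// the same node id. An empty input yields an empty map.
MetadataMap intersectMetadata(const std::vector<ToyInst> &Insts) {
  if (Insts.empty())
    return {};
  MetadataMap Common = Insts.front().MD;
  for (const ToyInst &I : Insts) {
    for (auto It = Common.begin(); It != Common.end();) {
      auto Found = I.MD.find(It->first);
      // Drop the entry if this instruction lacks it or disagrees on it.
      if (Found == I.MD.end() || Found->second != It->second)
        It = Common.erase(It);
      else
        ++It;
    }
  }
  return Common;
}
```

With two matching loads that share a `tbaa` entry but differ in their other metadata, only the `tbaa` entry survives the intersection.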


Full diff: https://github.com/llvm/llvm-project/pull/97158.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+27)
  • (modified) llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll (+2-4)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 6847bb7502429..c2774b8b74b4c 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -3624,6 +3624,29 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
                     << "  T: " << IfTrue->getName()
                     << "  F: " << IfFalse->getName() << "\n");
 
+  // Collect common TBAA metadata, for instructions that match in all if-blocks
+  // and have the same TBAA metadata. If that is the case, they access the same
+  // type on all paths and the TBAA info can be preserved after hoisting.
+  // TODO: preserve other common metadata.
+  LockstepReverseIterator LRI(IfBlocks);
+  DenseMap<Instruction *, MDNode *> CommonTBAA;
+  while (LRI.isValid()) {
+    auto Insts = *LRI;
+    Instruction *I0 = Insts.front();
+    MDNode *MD = I0->getMetadata(LLVMContext::MD_tbaa);
+    if (!MD || any_of(Insts, [I0, MD](Instruction *I) {
+          return !I->isSameOperationAs(I0) ||
+                 !equal(I->operands(), I0->operands()) ||
+                 I->getMetadata(LLVMContext::MD_tbaa) != MD;
+        })) {
+      --LRI;
+      continue;
+    }
+    for (Instruction *I : Insts)
+      CommonTBAA[I] = MD;
+    --LRI;
+  }
+
   // If we can still promote the PHI nodes after this gauntlet of tests,
   // do all of the PHI's now.
 
@@ -3632,6 +3655,10 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
   for (BasicBlock *IfBlock : IfBlocks)
       hoistAllInstructionsInto(DomBlock, DomBI, IfBlock);
 
+  for (Instruction &I : *DomBlock)
+    if (auto *MD = CommonTBAA.lookup(&I))
+      I.setMetadata(LLVMContext::MD_tbaa, MD);
+
   IRBuilder<NoFolder> Builder(DomBI);
   // Propagate fast-math-flags from phi nodes to replacement selects.
   IRBuilder<>::FastMathFlagGuard FMFGuard(Builder);
diff --git a/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll b/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
index 026002a4942af..4aea8634bafcb 100644
--- a/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
+++ b/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
@@ -8,10 +8,8 @@ define i64 @hoist_load_with_matching_pointers_and_tbaa(i1 %c) {
 ; CHECK-NEXT:  [[ENTRY:.*:]]
 ; CHECK-NEXT:    [[TMP:%.*]] = alloca i64, align 8
 ; CHECK-NEXT:    call void @init(ptr [[TMP]])
-; CHECK-NEXT:    [[TMP0:%.*]] = load i64, ptr [[TMP]], align 8
-; CHECK-NOT:       !tbaa
-; CHECK-NEXT:    [[TMP1:%.*]] = load i64, ptr [[TMP]], align 8
-; CHECK-NOT:       !tbaa
+; CHECK-NEXT:    [[TMP0:%.*]] = load i64, ptr [[TMP]], align 8, !tbaa [[M:!.+]]
+; CHECK-NEXT:    [[TMP1:%.*]] = load i64, ptr [[TMP]], align 8, !tbaa [[M]]
 ; CHECK-NEXT:    [[P:%.*]] = select i1 [[C]], i64 [[TMP0]], i64 [[TMP1]]
 ; CHECK-NEXT:    ret i64 [[P]]
 ;

@nikic
Contributor

nikic commented Jun 29, 2024

This would get automatically handled if it used the actual hoisting logic -- why doesn't it?

@fhahn
Contributor Author

fhahn commented Jun 29, 2024

This would get automatically handled if it used the actual hoisting logic -- why doesn't it?

Do you mean hoistCommonCodeFromSuccessors? The code in FoldTwoEntryPHINode hoists any code, not just common code. The current logic, however, only preserves TBAA metadata across common instructions.

@efriedma-quic
Collaborator

Does llvm::combineMetadataForCSE work here?

@nikic
Contributor

nikic commented Jun 30, 2024

This would get automatically handled if it used the actual hoisting logic -- why doesn't it?

Do you mean hoistCommonCodeFromSuccessors? The code in FoldTwoEntryPHINode hoists any code, not just common code. The current logic, however, only preserves TBAA metadata across common instructions.

Right. My point here is that if hoisting is possible, we should hoist instead of performing this transform. What you are proposing here is an odd middle ground where we keep the metadata as if we were hoisting, but still have two separate instructions for the two branches.

I assume the motivation here is that FoldTwoEntryPHINode is performed in early SimplifyCFG runs that do not enable hoisting -- I think the correct way to address this issue is to perform hoisting in the cases where it is possible and where FoldTwoEntryPHINode thinks it's profitable, as what hoisting does is strictly better.

The lazy way to do that would be to try calling hoistCommonCodeFromSuccessors() after FoldTwoEntryPHINode's profitability checks. The proper way to do it would be something like allowing early hoisting if it will hoist out all instructions including terminator (we already have an exception to allow early hoisting of just terminators).

@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from 910a857 to febeaa9 Compare September 9, 2024 20:14

github-actions bot commented Sep 9, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn
Contributor Author

fhahn commented Sep 9, 2024

This would get automatically handled if it used the actual hoisting logic -- why doesn't it?

Do you mean hoistCommonCodeFromSuccessors? The code in FoldTwoEntryPHINode hoists any code, not just common code. The current logic, however, only preserves TBAA metadata across common instructions.

Right. My point here is that if hoisting is possible, we should hoist instead of performing this transform. What you are proposing here is an odd middle ground where we keep the metadata as if we were hoisting, but still have two separate instructions for the two branches.

I assume the motivation here is that FoldTwoEntryPHINode is performed in early SimplifyCFG runs that do not enable hoisting -- I think the correct way to address this issue is to perform hoisting in the cases where it is possible and where FoldTwoEntryPHINode thinks it's profitable, as what hoisting does is strictly better.

The lazy way to do that would be to try calling hoistCommonCodeFromSuccessors() after FoldTwoEntryPHINode's profitability checks. The proper way to do it would be something like allowing early hoisting if it will hoist out all instructions including terminator (we already have an exception to allow early hoisting of just terminators).

Thanks, updated this PR to relax EqTermsOnly to allow hoisting for equivalent instructions. Could also limit to cases with 2 successors, similarly to FoldTwoEntryPHINode. Is that along the lines you had in mind?

Impact on simplifycfg.NumHoistCommonInstrs

Tests: 2074
Same hash: 2023 (filtered out)
Remaining: 51
Metric: simplifycfg.NumHoistCommonInstrs

Program                                       simplifycfg.NumHoistCommonInstrs
                                              lhs                              rhs      diff
SingleSource/Benchmarks/McGill/misr               0.00                             2.00  inf%
MultiSourc...Fhourstones-3.1/fhourstones3.1       0.00                             2.00  inf%
External/S...C/CINT2006/456.hmmer/456.hmmer     409.00                           465.00 13.7%
MultiSourc...ch/office-ispell/office-ispell      94.00                           106.00 12.8%
MultiSourc...rks/BitBench/uudecode/uudecode      18.00                            20.00 11.1%
MultiSource/Benchmarks/nbench/nbench            107.00                           118.00 10.3%
External/S...FP2017rate/544.nab_r/544.nab_r     299.00                           329.00 10.0%
MultiSource/Applications/siod/siod               98.00                           106.00  8.2%
MultiSourc...olangs-C/unix-smail/unix-smail      26.00                            28.00  7.7%
MultiSource/Applications/oggenc/oggenc          562.00                           596.00  6.0%
External/S...te/520.omnetpp_r/520.omnetpp_r    3354.00                          3547.00  5.8%
MultiSourc...e/Applications/minisat/minisat      36.00                            38.00  5.6%
External/S...FP2006/482.sphinx3/482.sphinx3     223.00                           235.00  5.4%
External/S.../CFP2006/447.dealII/447.dealII    4022.00                          4220.00  4.9%
MultiSource/Benchmarks/Ptrdist/bc/bc            124.00                           130.00  4.8%
MultiSourc...e/Applications/obsequi/Obsequi      84.00                            88.00  4.8%
External/S...te/538.imagick_r/538.imagick_r    4649.00                          4858.00  4.5%
MultiSourc...sumer-typeset/consumer-typeset     572.00                           594.00  3.8%
MultiSource/Applications/treecc/treecc          422.00                           438.00  3.8%
External/S...NT2017rate/502.gcc_r/502.gcc_r   13107.00                         13594.00  3.7%
External/SPEC/CFP2006/444.namd/444.namd         110.00                           114.00  3.6%
MultiSourc...e/Applications/sqlite3/sqlite3     581.00                           601.00  3.4%
External/S...17rate/541.leela_r/541.leela_r     299.00                           309.00  3.3%
External/SPEC/CFP2006/433.milc/433.milc         184.00                           190.00  3.3%
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     564.00                           582.00  3.2%
MultiSourc...ity-blowfish/security-blowfish      74.00                            76.00  2.7%
MultiSourc.../mediabench/jpeg/jpeg-6a/cjpeg     451.00                           463.00  2.7%
MultiSourc...Benchmarks/Ptrdist/yacr2/yacr2     152.00                           156.00  2.6%
External/S...NT2006/464.h264ref/464.h264ref    5319.00                          5457.00  2.6%
MultiSourc...e/Applications/ClamAV/clamscan    1235.00                          1267.00  2.6%
External/S...rate/511.povray_r/511.povray_r    1727.00                          1765.00  2.2%
External/S.../CFP2006/453.povray/453.povray    1757.00                          1795.00  2.2%
MultiSourc...ch/consumer-jpeg/consumer-jpeg     477.00                           487.00  2.1%
MultiSourc.../Applications/JM/lencod/lencod    5711.00                          5823.00  2.0%
External/S...06/400.perlbench/400.perlbench    2354.00                          2398.00  1.9%
External/S.../CFP2006/450.soplex/450.soplex    1301.00                          1325.00  1.8%
External/SPEC/CINT2006/403.gcc/403.gcc         6182.00                          6281.00  1.6%
External/S...06/483.xalancbmk/483.xalancbmk    4147.00                          4203.00  1.4%
External/S...23.xalancbmk_r/523.xalancbmk_r    5648.00                          5712.00  1.1%
External/S...00.perlbench_r/500.perlbench_r    5274.00                          5324.00  0.9%
MultiSourc.../MallocBench/espresso/espresso     244.00                           246.00  0.8%
MultiSource/Applications/d/make_dparser         528.00                           532.00  0.8%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR     811.00                           817.00  0.7%
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000    1504.00                          1510.00  0.4%
MultiSourc...Benchmarks/7zip/7zip-benchmark    2623.00                          2633.00  0.4%
External/S...te/526.blender_r/526.blender_r   27768.00                         27835.00  0.2%
MultiSource/Applications/kimwitu++/kc          1146.00                          1148.00  0.2%
MultiSourc...enchmarks/VersaBench/dbms/dbms      28.00                            28.00  0.0%
MultiSource/Applications/SPASS/SPASS           1213.00                          1209.00 -0.3%
External/S...C/CINT2006/445.gobmk/445.gobmk    1396.00                          1384.00 -0.9%
SingleSour.../execute/GCC-C-execute-pr90949       0.00                             0.00

@fhahn
Contributor Author

fhahn commented Sep 24, 2024

ping :)

@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from febeaa9 to e4ad278 Compare October 2, 2024 12:00
@fhahn
Contributor Author

fhahn commented Oct 2, 2024

ping :)

@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from e4ad278 to 62beb89 Compare October 17, 2024 16:35
@fhahn
Contributor Author

fhahn commented Oct 17, 2024

ping :)

@fhahn
Contributor Author

fhahn commented Nov 13, 2024

ping :)

@antoniofrighetto
Contributor

An orthogonal side note: I think this LockstepReverseIterator and its companion in GVNSink could be unified (the logic seems basically the same, except that the other one uses an extra ActiveBlocks?), or at least be polished up a bit?

@nikic
Contributor

nikic commented Nov 14, 2024

Thanks, updated this PR to relax EqTermsOnly to allow hoisting for equivalent instructions. Could also limit to cases with 2 successors, similarly to FoldTwoEntryPHINode. Is that along the lines you had in mind?

Not really. What I had in mind is that we only allow early hoisting if it actually ends up deduplicating the blocks entirely, i.e. we either hoist everything including the terminators, or nothing. That seems in the spirit of what the EqTermsOnly condition was doing previously.

If you only allow hoisting identical instructions, then I think that's not really materially different from just allowing all hoisting. I think the only thing that would disable is the more sophisticated heuristics for finding those identical instructions?
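
The all-or-nothing policy described in this comment can be sketched on a toy model. This is not the SimplifyCFG implementation: `ToyInst`, `ToyBlock`, and `hoistAllOrNothing` are hypothetical names, and instructions are reduced to an opcode plus operand names. The sketch assumes C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy instruction: an opcode and the names of its operands (hypothetical).
struct ToyInst {
  std::string Opcode;
  std::vector<std::string> Operands;
  bool operator==(const ToyInst &O) const {
    return Opcode == O.Opcode && Operands == O.Operands;
  }
};
using ToyBlock = std::vector<ToyInst>; // last element is the terminator

// All-or-nothing policy: return the instructions to hoist only if every
// successor block is instruction-for-instruction identical, terminator
// included. Otherwise return std::nullopt and leave everything in place.
std::optional<ToyBlock> hoistAllOrNothing(const std::vector<ToyBlock> &Succs) {
  if (Succs.empty())
    return std::nullopt;
  for (const ToyBlock &B : Succs)
    if (!(B == Succs.front()))
      return std::nullopt;
  return Succs.front();
}
```

When the result is non-empty, the successor blocks are fully deduplicated, so no instructions are added to the function.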

@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from 62beb89 to fe2dd4b Compare November 25, 2024 11:40
@fhahn
Contributor Author

fhahn commented Nov 25, 2024

Thanks, updated this PR to relax EqTermsOnly to allow hoisting for equivalent instructions. Could also limit to cases with 2 successors, similarly to FoldTwoEntryPHINode. Is that along the lines you had in mind?

Not really. What I had in mind is that we only allow early hoisting if it actually ends up deduplicating the blocks entirely, i.e. we either hoist everything including the terminators, or nothing. That seems in the spirit of what the EqTermsOnly condition was doing previously.

If you only allow hoisting identical instructions, then I think that's not really materially different from just allowing all hoisting. I think the only thing that would disable is the more sophisticated heuristics for finding those identical instructions?

Right, I updated the code to check that the terminators in the successors match and that the number of instructions is the same, so that we only hoist if all instructions match.
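
As a rough illustration of the check described here, the following toy sketch compares the terminators and the instruction counts of the successor blocks. The names `ToyInst`, `ToyBlock`, and `terminatorsAndSizesMatch` are hypothetical and not part of LLVM. The check is only a cheap precondition; the remaining instructions still have to be compared pairwise.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy instruction: an opcode and the names of its operands (hypothetical).
struct ToyInst {
  std::string Opcode;
  std::vector<std::string> Operands;
};
using ToyBlock = std::vector<ToyInst>; // back() is the terminator

// Cheap precheck: every successor must end in the same terminator (same
// opcode and operands) and contain as many instructions as the first one.
bool terminatorsAndSizesMatch(const std::vector<ToyBlock> &Succs) {
  if (Succs.empty() || Succs.front().empty())
    return false;
  const ToyBlock &First = Succs.front();
  return std::none_of(Succs.begin(), Succs.end(), [&](const ToyBlock &B) {
    if (B.empty())
      return true; // no terminator to compare against
    const ToyInst &T = B.back();
    const ToyInst &T0 = First.back();
    if (T.Opcode != T0.Opcode || T.Operands != T0.Operands)
      return true; // terminators differ
    return B.size() != First.size();
  });
}
```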

@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from fe2dd4b to e335c78 Compare November 25, 2024 11:45
@fhahn
Contributor Author

fhahn commented Dec 2, 2024

ping :)

@fhahn
Contributor Author

fhahn commented Dec 12, 2024

ping :)

Contributor

@nikic nikic left a comment

LGTM if no compile-time impact.

/// In that case, only the original BI will be replaced and selects for PHIs are
/// added.
/// function guarantees that BB dominates all successors. If AllInstsEqOnly is
/// given, only perform hoisting in case all successors blocks contain matchin
Contributor

Suggested change
- /// given, only perform hoisting in case all successors blocks contain matchin
+ /// given, only perform hoisting in case all successors blocks contain matching

Contributor Author

Fixed, thanks

/// added.
/// function guarantees that BB dominates all successors. If AllInstsEqOnly is
/// given, only perform hoisting in case all successors blocks contain matchin
/// instructions only In that case, all instructions can be hoisted and the
Contributor

Suggested change
- /// instructions only In that case, all instructions can be hoisted and the
+ /// instructions only. In that case, all instructions can be hoisted and the

Contributor Author

Fixed, thanks

if (!Term->isSameOperationAs(Term0) ||
!equal(Term->operands(), Term0->operands()))
return true;
return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug();
Contributor

Suggested change
- return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug();
+ return Succs[0]->size() != Succ->size();

Now that we're using debug records, let's avoid the linear scan.

Contributor Author

Done, thanks!

// terminator. Let the loop below handle those 2 cases.
if (!AllSame)
return false;
// Now we know that all instructions in all successors can be hoisted Let
Contributor

Suggested change
- // Now we know that all instructions in all successors can be hoisted Let
+ // Now we know that all instructions in all successors can be hoisted. Let

return true;
return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug();
})) {
AllSame = false;
Contributor

Could directly init AllSame to the !any_of.

Contributor Author

Done (using none_of), thanks

})) {
AllSame = false;
break;
}
Contributor

The downside of this implementation is that it doesn't cover the case where one instruction uses the result of another (hoistable) instruction. But it's not needed for your use case...
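
One plausible reading of this limitation, sketched on a toy model: when operands are compared for exact identity, an instruction that uses a value defined earlier in its own block never matches its counterpart, because each block defines its own copy of that value. The names `ToyInst`, `strictlyMatches`, and `allStrictlyMatch` are hypothetical, and SSA values are modeled as plain name strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction (hypothetical): SSA values are modeled by their names.
struct ToyInst {
  std::string Result;                // name defined by this instruction
  std::string Opcode;
  std::vector<std::string> Operands; // names or constants used
};
using ToyBlock = std::vector<ToyInst>;

// Strict matching: same opcode and literally the same operands. Result
// names are ignored, since each block always defines its own values.
bool strictlyMatches(const ToyInst &A, const ToyInst &B) {
  return A.Opcode == B.Opcode && A.Operands == B.Operands;
}

// True only if both blocks have the same length and every pair of
// instructions at the same position matches strictly.
bool allStrictlyMatch(const ToyBlock &A, const ToyBlock &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (!strictlyMatches(A[I], B[I]))
      return false;
  return true;
}
```

For example, if one block computes `%a1 = add %x, 1` followed by `%b1 = mul %a1, 2`, and the other computes `%a2 = add %x, 1` followed by `%b2 = mul %a2, 2`, the two `add` instructions match but the two `mul` instructions do not, so nothing is hoisted under the all-or-nothing rule.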

Generalize EqTermsOnly to allow hoisting if all instructions in the
successor blocks match. This allows hoisting all instructions and removing
the blocks we are hoisting from, so it does not add any new instructions.
@fhahn fhahn force-pushed the simplifycfg-preserve-tbaa-hoisting branch from e335c78 to e749e12 Compare December 13, 2024 12:29
@fhahn
Contributor Author

fhahn commented Dec 13, 2024

LGTM if no compile-time impact.

Change looks mostly within the noise

stage1-O3 (+0.03%)
stage1-ReleaseThinLTO (+0.03%)
stage1-ReleaseLTO-g (+0.04%)
stage1-O0-g (-0.02%)
stage2-O3 (+0.01%)
stage2-O0-g (-0.02%)
stage2-clang (+0.01%)

@fhahn fhahn changed the title from "[SimplifyCFG] Preserve common TBAA metadata when hoisting instructions." to "[SimplifyCFG] Allows allow hoisting if all instructions match." Dec 13, 2024
@fhahn fhahn changed the title from "[SimplifyCFG] Allows allow hoisting if all instructions match." to "[SimplifyCFG] Allow hoisting if all instructions match." Dec 13, 2024
@fhahn fhahn changed the title from "[SimplifyCFG] Allow hoisting if all instructions match." to "[SimplifyCFG] Always allow hoisting if all instructions match." Dec 13, 2024
@fhahn fhahn merged commit c4a78b6 into llvm:main Dec 13, 2024
8 checks passed
@fhahn fhahn deleted the simplifycfg-preserve-tbaa-hoisting branch December 13, 2024 21:26