-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SimplifyCFG] Always allow hoisting if all instructions match. #97158
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesUpdate FoldTwoEntryPHINode to collect common TBAA metadata for instructions that match in all if-blocks and have the same TBAA metadata. If that is the case, they access the same type on all paths and the TBAA info can be preserved after hoisting. I think we should be able to preserve most metadata, if it is available on matching instructions in all blocks, i.e. preserve the intersection of metadata on all matching instructions. I couldn't find any utility that already computes that intersection. At the moment, the order of of matching instructions must be the same. Full diff: https://github.com/llvm/llvm-project/pull/97158.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 6847bb7502429..c2774b8b74b4c 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -3624,6 +3624,29 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
<< " T: " << IfTrue->getName()
<< " F: " << IfFalse->getName() << "\n");
+ // Collect common TBAA metadata, for instructions that match in all if-blocks
+ // and have the same TBAA metadata. If that is the case, they access the same
+ // type on all paths and the TBAA info can be preserved after hoisting.
+ // TODO: preserve other common metadata.
+ LockstepReverseIterator LRI(IfBlocks);
+ DenseMap<Instruction *, MDNode *> CommonTBAA;
+ while (LRI.isValid()) {
+ auto Insts = *LRI;
+ Instruction *I0 = Insts.front();
+ MDNode *MD = I0->getMetadata(LLVMContext::MD_tbaa);
+ if (!MD || any_of(Insts, [I0, MD](Instruction *I) {
+ return !I->isSameOperationAs(I0) ||
+ !equal(I->operands(), I0->operands()) ||
+ I->getMetadata(LLVMContext::MD_tbaa) != MD;
+ })) {
+ --LRI;
+ continue;
+ }
+ for (Instruction *I : Insts)
+ CommonTBAA[I] = MD;
+ --LRI;
+ }
+
// If we can still promote the PHI nodes after this gauntlet of tests,
// do all of the PHI's now.
@@ -3632,6 +3655,10 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
for (BasicBlock *IfBlock : IfBlocks)
hoistAllInstructionsInto(DomBlock, DomBI, IfBlock);
+ for (Instruction &I : *DomBlock)
+ if (auto *MD = CommonTBAA.lookup(&I))
+ I.setMetadata(LLVMContext::MD_tbaa, MD);
+
IRBuilder<NoFolder> Builder(DomBI);
// Propagate fast-math-flags from phi nodes to replacement selects.
IRBuilder<>::FastMathFlagGuard FMFGuard(Builder);
diff --git a/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll b/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
index 026002a4942af..4aea8634bafcb 100644
--- a/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
+++ b/llvm/test/Transforms/SimplifyCFG/hoisting-metadata.ll
@@ -8,10 +8,8 @@ define i64 @hoist_load_with_matching_pointers_and_tbaa(i1 %c) {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: [[TMP:%.*]] = alloca i64, align 8
; CHECK-NEXT: call void @init(ptr [[TMP]])
-; CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr [[TMP]], align 8
-; CHECK-NOT: !tbaa
-; CHECK-NEXT: [[TMP1:%.*]] = load i64, ptr [[TMP]], align 8
-; CHECK-NOT: !tbaa
+; CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr [[TMP]], align 8, !tbaa [[M:!.+]]
+; CHECK-NEXT: [[TMP1:%.*]] = load i64, ptr [[TMP]], align 8, !tbaa [[M]]
; CHECK-NEXT: [[P:%.*]] = select i1 [[C]], i64 [[TMP0]], i64 [[TMP1]]
; CHECK-NEXT: ret i64 [[P]]
;
|
This would get automatically handled if it used the actual hoisting logic -- why doesn't it? |
Do you mean |
Does llvm::combineMetadataForCSE work here? |
Right. My point here is that if hoisting is possible, we should hoist instead of performing this transform. What you are proposing here is an odd middle ground where we keep the metadata as if we were hoisting, but still have two separate instructions for the two branches. I assume the motivation here is that FoldTwoEntryPHINode is performed in early SimplifyCFG runs that do not enable hoisting -- I think the correct way to address this issue is to perform hoisting in the cases where it is possible and where FoldTwoEntryPHINode thinks its profitable, as what hoisting does is strictly better. The lazy way to do that would be to try calling hoistCommonCodeFromSuccessors() after FoldTwoEntryPHINode's profitability checks. The proper way to do it would be something like allowing early hoisting if it will hoist out all instructions including terminator (we already have an exception to allow early hoisting of just terminators). |
910a857
to
febeaa9
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
Thanks, updated this PR to relax Impact on
|
ping :) |
febeaa9
to
e4ad278
Compare
ping :) |
e4ad278
to
62beb89
Compare
ping :) |
1 similar comment
ping :) |
Just an orthogonal note a latere, I think this LockstepReverseIterator and its companion in GVNSink could be unified (logic seems basically the same, modulo the other uses an extra ActiveBlocks?), or at the least be polished up a bit? |
Not really. What I had in mind is that we only allow early hoisting if it actually ends up deduplicating the blocks entirely, i.e. we either hoist everything including the terminators, or nothing. That seems in the spirit of what the EqTermsOnly condition was doing previously. If you only allow hoisting identical instructions, then I think that's not really materially different from just allowing all hoisting. I think the only thing that would disable is the more sophisticated heuristics for finding those identical instructions? |
62beb89
to
fe2dd4b
Compare
Right, I update the code to check that the terminators in the successors match and the # of instructions is the same to ensure to only hoist if all instructions match |
fe2dd4b
to
e335c78
Compare
ping :) |
ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if no compile-time impact.
/// In that case, only the original BI will be replaced and selects for PHIs are | ||
/// added. | ||
/// function guarantees that BB dominates all successors. If AllInstsEqOnly is | ||
/// given, only perform hoisting in case all successors blocks contain matchin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// given, only perform hoisting in case all successors blocks contain matchin | |
/// given, only perform hoisting in case all successors blocks contain matching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks
/// added. | ||
/// function guarantees that BB dominates all successors. If AllInstsEqOnly is | ||
/// given, only perform hoisting in case all successors blocks contain matchin | ||
/// instructions only In that case, all instructions can be hoisted and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// instructions only In that case, all instructions can be hoisted and the | |
/// instructions only. In that case, all instructions can be hoisted and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed thanks
if (!Term->isSameOperationAs(Term0) || | ||
!equal(Term->operands(), Term0->operands())) | ||
return true; | ||
return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug(); | |
return Succs[0]->size() != Succ->size(); |
Now that we're using debug records, let's avoid the linear scan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
// terminator. Let the loop below handle those 2 cases. | ||
if (!AllSame) | ||
return false; | ||
// Now we know that all instructions in all successors can be hoisted Let |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Now we know that all instructions in all successors can be hoisted Let | |
// Now we know that all instructions in all successors can be hoisted. Let |
return true; | ||
return Succs[0]->sizeWithoutDebug() != Succ->sizeWithoutDebug(); | ||
})) { | ||
AllSame = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could directly init AllSame to the !any_of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (using none_of
), thanks
})) { | ||
AllSame = false; | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of this implementation is that it doesn't cover the case where one instruction uses the result of another (hoistable) instruction. But it's not needed for your use case...
Generalize EqTermsOnly to allow hoisting if all all instructions in the successor blocks match. This allows hoisting all instructions and removing the blocks we are hoisting from, so does not add any new instructions.
e335c78
to
e749e12
Compare
Change looks mostly within the noise stage1-O3 (+0.03%) |
Generalize hoistCommonCodeFromSuccessors's
EqTermsOnly
toAllInstsEqOnly
and always allow hoisting if all instructions match.In that case, all instructions can be hoisted and the
original branch will be replaced and selects for PHIs are added. This
allows preserving metadata in more cases, using the existing hoisting
logic, whereas previously FoldTwoEntryPHINode would drop the metadata.
https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u