[RISCV] Add late optimization pass for riscv #133256
Conversation
optimizeCondBranch isn't allowed to modify the CFG, but it can rewrite the branch condition freely. So while we can't fold a conditional branch into an unconditional one here, when the condition is statically known we can rewrite the branch into a canonical conditional form instead. Looking at the diffs, the only cases this catches in the in-tree tests are cases where we could have constant folded during lowering from IR, but didn't. This is inspired by trying to salvage code from llvm#131684 which might be useful. Given the test impact, it's of questionable merit on its own. The main advantage over only the late cleanup pass is that it kills off the LIs for the constants early, which can help e.g. register allocation.
This patch is an alternative to PRs llvm#117060, llvm#131684, llvm#131728. The patch adds a late optimization pass that replaces conditional branches that can be statically evaluated with an unconditional branch. Adding Michael as a co-author, as most of the code that evaluates the condition comes from llvm#131684.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v

Author: Mikhail R. Gadelha (mikhailramalho)

Changes

This patch is an alternative to PRs #117060, #131684, #131728, and builds on top of #132988. The patch adds a late optimization pass that replaces conditional branches that can be statically evaluated with an unconditional branch.

Once this PR and #132988 land, I plan to send a follow-up patch that reduces code size by adding a branch folding pass after the newly added late optimization pass.

Adding Michael as a co-author, as most of the code that evaluates the condition comes from #131684.

Co-authored-by: Michael Maitland michaeltmaitland@gmail.com

Patch is 38.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133256.diff

15 Files Affected:
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index e8d00f4df7c86..c9609d224414d 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -35,6 +35,7 @@ add_llvm_target(RISCVCodeGen
RISCVConstantPoolValue.cpp
RISCVDeadRegisterDefinitions.cpp
RISCVMakeCompressible.cpp
+ RISCVLateOpt.cpp
RISCVExpandAtomicPseudoInsts.cpp
RISCVExpandPseudoInsts.cpp
RISCVFoldMemOffset.cpp
diff --git a/llvm/lib/Target/RISCV/RISCV.h b/llvm/lib/Target/RISCV/RISCV.h
index 641e2eb4094f9..1f1d7e1fa21df 100644
--- a/llvm/lib/Target/RISCV/RISCV.h
+++ b/llvm/lib/Target/RISCV/RISCV.h
@@ -40,6 +40,9 @@ void initializeRISCVLandingPadSetupPass(PassRegistry &);
FunctionPass *createRISCVISelDag(RISCVTargetMachine &TM,
CodeGenOptLevel OptLevel);
+FunctionPass *createRISCVLateOptPass();
+void initializeRISCVLateOptPass(PassRegistry &);
+
FunctionPass *createRISCVMakeCompressibleOptPass();
void initializeRISCVMakeCompressibleOptPass(PassRegistry &);
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 62f978d64fbb9..af9975aca206e 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -998,6 +998,25 @@ static RISCVCC::CondCode getCondFromBranchOpc(unsigned Opc) {
}
}
+static bool evaluateCondBranch(unsigned CC, int64_t C0, int64_t C1) {
+ switch (CC) {
+ default:
+ llvm_unreachable("Unexpected CC");
+ case RISCVCC::COND_EQ:
+ return C0 == C1;
+ case RISCVCC::COND_NE:
+ return C0 != C1;
+ case RISCVCC::COND_LT:
+ return C0 < C1;
+ case RISCVCC::COND_GE:
+ return C0 >= C1;
+ case RISCVCC::COND_LTU:
+ return (uint64_t)C0 < (uint64_t)C1;
+ case RISCVCC::COND_GEU:
+ return (uint64_t)C0 >= (uint64_t)C1;
+ }
+}
+
// The contents of values added to Cond are not examined outside of
// RISCVInstrInfo, giving us flexibility in what to push to it. For RISCV, we
// push BranchOpcode, Reg1, Reg2.
@@ -1295,6 +1314,49 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
RISCVCC::CondCode CC = static_cast<RISCVCC::CondCode>(Cond[0].getImm());
assert(CC != RISCVCC::COND_INVALID);
+ auto modifyBranch = [&]() {
+ // Build the new branch and remove the old one.
+ BuildMI(*MBB, MI, MI.getDebugLoc(),
+ getBrCond(static_cast<RISCVCC::CondCode>(Cond[0].getImm())))
+ .add(Cond[1])
+ .add(Cond[2])
+ .addMBB(TBB);
+ MI.eraseFromParent();
+ };
+
+ // Right now we only care about LI (i.e. ADDI x0, imm)
+ auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
+ if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
+ MI->getOperand(1).getReg() == RISCV::X0) {
+ Imm = MI->getOperand(2).getImm();
+ return true;
+ }
+ return false;
+ };
+ // Either a load from immediate instruction or X0.
+ auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
+ if (!Op.isReg())
+ return false;
+ Register Reg = Op.getReg();
+ if (Reg == RISCV::X0) {
+ Imm = 0;
+ return true;
+ }
+ return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
+ };
+
+ // Canonicalize conditional branches which can be constant folded into
+ // beqz or bnez. We can't modify the CFG here.
+ int64_t C0, C1;
+ if (isFromLoadImm(Cond[1], C0) && isFromLoadImm(Cond[2], C1)) {
+ unsigned NewCC =
+ evaluateCondBranch(CC, C0, C1) ? RISCVCC::COND_EQ : RISCVCC::COND_NE;
+ Cond[0] = MachineOperand::CreateImm(NewCC);
+ Cond[1] = Cond[2] = MachineOperand::CreateReg(RISCV::X0, /*isDef=*/false);
+ modifyBranch();
+ return true;
+ }
+
if (CC == RISCVCC::COND_EQ || CC == RISCVCC::COND_NE)
return false;
@@ -1315,24 +1377,6 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
//
// To make sure this optimization is really beneficial, we only
// optimize for cases where Y had only one use (i.e. only used by the branch).
-
- // Right now we only care about LI (i.e. ADDI x0, imm)
- auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
- if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
- MI->getOperand(1).getReg() == RISCV::X0) {
- Imm = MI->getOperand(2).getImm();
- return true;
- }
- return false;
- };
- // Either a load from immediate instruction or X0.
- auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
- if (!Op.isReg())
- return false;
- Register Reg = Op.getReg();
- return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
- };
-
MachineOperand &LHS = MI.getOperand(0);
MachineOperand &RHS = MI.getOperand(1);
// Try to find the register for constant Z; return
@@ -1350,8 +1394,6 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
return Register();
};
- bool Modify = false;
- int64_t C0;
if (isFromLoadImm(LHS, C0) && MRI.hasOneUse(LHS.getReg())) {
// Might be case 1.
// Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
@@ -1364,7 +1406,8 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
// We might extend the live range of Z, clear its kill flag to
// account for this.
MRI.clearKillFlags(RegZ);
- Modify = true;
+ modifyBranch();
+ return true;
}
} else if (isFromLoadImm(RHS, C0) && MRI.hasOneUse(RHS.getReg())) {
// Might be case 2.
@@ -1378,22 +1421,12 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
// We might extend the live range of Z, clear its kill flag to
// account for this.
MRI.clearKillFlags(RegZ);
- Modify = true;
+ modifyBranch();
+ return true;
}
}
- if (!Modify)
- return false;
-
- // Build the new branch and remove the old one.
- BuildMI(*MBB, MI, MI.getDebugLoc(),
- getBrCond(static_cast<RISCVCC::CondCode>(Cond[0].getImm())))
- .add(Cond[1])
- .add(Cond[2])
- .addMBB(TBB);
- MI.eraseFromParent();
-
- return true;
+ return false;
}
MachineBasicBlock *
diff --git a/llvm/lib/Target/RISCV/RISCVLateOpt.cpp b/llvm/lib/Target/RISCV/RISCVLateOpt.cpp
new file mode 100644
index 0000000000000..7fa04f0cbba9b
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVLateOpt.cpp
@@ -0,0 +1,190 @@
+//===-- RISCVLateOpt.cpp - Late stage optimization ------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// This file provides late-stage RISC-V optimizations, currently replacing
+/// conditional branches that can be statically evaluated with unconditional
+/// branches.
+///
+//===----------------------------------------------------------------------===//
+
+#include "MCTargetDesc/RISCVMCTargetDesc.h"
+#include "RISCV.h"
+#include "RISCVInstrInfo.h"
+#include "RISCVSubtarget.h"
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
+#include "llvm/CodeGen/MachineDominators.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/CodeGen/RegisterScavenging.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/Debug.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "riscv-late-opt"
+#define RISCV_LATE_OPT_NAME "RISC-V Late Stage Optimizations"
+
+namespace {
+
+struct RISCVLateOpt : public MachineFunctionPass {
+ static char ID;
+
+ RISCVLateOpt() : MachineFunctionPass(ID) {}
+
+ StringRef getPassName() const override { return RISCV_LATE_OPT_NAME; }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ MachineFunctionPass::getAnalysisUsage(AU);
+ }
+
+ bool runOnMachineFunction(MachineFunction &Fn) override;
+
+private:
+ bool trySimplifyCondBr(MachineInstr *MI, MachineBasicBlock *TBB,
+ MachineBasicBlock *FBB,
+ SmallVectorImpl<MachineOperand> &Cond) const;
+
+ const RISCVInstrInfo *RII = nullptr;
+};
+} // namespace
+
+char RISCVLateOpt::ID = 0;
+INITIALIZE_PASS(RISCVLateOpt, "riscv-late-opt", RISCV_LATE_OPT_NAME, false,
+ false)
+
+bool RISCVLateOpt::trySimplifyCondBr(
+ MachineInstr *MI, MachineBasicBlock *TBB, MachineBasicBlock *FBB,
+ SmallVectorImpl<MachineOperand> &Cond) const {
+
+ RISCVCC::CondCode CC = static_cast<RISCVCC::CondCode>(Cond[0].getImm());
+ assert(CC != RISCVCC::COND_INVALID);
+
+ // Right now we only care about LI (i.e. ADDI x0, imm)
+ auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
+ if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
+ MI->getOperand(1).getReg() == RISCV::X0) {
+ Imm = MI->getOperand(2).getImm();
+ return true;
+ }
+ return false;
+ };
+
+ MachineBasicBlock *MBB = MI->getParent();
+ MachineRegisterInfo &MRI = MBB->getParent()->getRegInfo();
+ // Either a load from immediate instruction or X0.
+ auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
+ if (!Op.isReg())
+ return false;
+ Register Reg = Op.getReg();
+ if (Reg == RISCV::X0) {
+ Imm = 0;
+ return true;
+ }
+ return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
+ };
+
+ // Try and convert a conditional branch that can be evaluated statically
+ // into an unconditional branch.
+ MachineBasicBlock *Folded = nullptr;
+ int64_t C0, C1;
+ if (isFromLoadImm(Cond[1], C0) && isFromLoadImm(Cond[2], C1)) {
+ switch (CC) {
+ case RISCVCC::COND_INVALID:
+ llvm_unreachable("Unexpected CC");
+ case RISCVCC::COND_EQ: {
+ Folded = (C0 == C1) ? TBB : FBB;
+ break;
+ }
+ case RISCVCC::COND_NE: {
+ Folded = (C0 != C1) ? TBB : FBB;
+ break;
+ }
+ case RISCVCC::COND_LT: {
+ Folded = (C0 < C1) ? TBB : FBB;
+ break;
+ }
+ case RISCVCC::COND_GE: {
+ Folded = (C0 >= C1) ? TBB : FBB;
+ break;
+ }
+ case RISCVCC::COND_LTU: {
+ Folded = ((uint64_t)C0 < (uint64_t)C1) ? TBB : FBB;
+ break;
+ }
+ case RISCVCC::COND_GEU: {
+ Folded = ((uint64_t)C0 >= (uint64_t)C1) ? TBB : FBB;
+ break;
+ }
+ }
+
+ // Do the conversion
+ // Build the new unconditional branch
+ DebugLoc DL = MBB->findBranchDebugLoc();
+ if (Folded) {
+ BuildMI(*MBB, MI, DL, RII->get(RISCV::PseudoBR)).addMBB(Folded);
+ } else {
+ MachineFunction::iterator Fallthrough = ++MBB->getIterator();
+ if (Fallthrough == MBB->getParent()->end())
+ return false;
+ BuildMI(*MBB, MI, DL, RII->get(RISCV::PseudoBR)).addMBB(&*Fallthrough);
+ }
+
+ // Update the successors of MBB.
+ if (Folded == TBB) {
+ // If we're taking TBB, then the succ to delete is the fallthrough (if
+ // it was a succ in the first place), or it's the MBB from the
+ // unconditional branch.
+ if (!FBB) {
+ MachineFunction::iterator Fallthrough = ++MBB->getIterator();
+ if (Fallthrough != MBB->getParent()->end() &&
+ MBB->isSuccessor(&*Fallthrough))
+ MBB->removeSuccessor(&*Fallthrough, true);
+ } else {
+ MBB->removeSuccessor(FBB, true);
+ }
+ } else if (Folded == FBB) {
+ // If we're taking the fallthrough or unconditional branch, then the
+ // succ to remove is the one from the conditional branch.
+ MBB->removeSuccessor(TBB, true);
+ }
+
+ MI->eraseFromParent();
+ return true;
+ }
+ return false;
+}
+
+bool RISCVLateOpt::runOnMachineFunction(MachineFunction &Fn) {
+ if (skipFunction(Fn.getFunction()))
+ return false;
+
+ auto &ST = Fn.getSubtarget<RISCVSubtarget>();
+ RII = ST.getInstrInfo();
+
+ bool Changed = false;
+
+ for (MachineBasicBlock &MBB : Fn) {
+ for (MachineBasicBlock::iterator MII = MBB.begin(), MIE = MBB.end();
+ MII != MIE;) {
+ MachineInstr *MI = &*MII;
+ // We may be erasing MI below, increment MII now.
+ ++MII;
+ if (!MI->isConditionalBranch())
+ continue;
+
+ MachineBasicBlock *TBB, *FBB;
+ SmallVector<MachineOperand, 4> Cond;
+ if (!RII->analyzeBranch(MBB, TBB, FBB, Cond, /*AllowModify=*/false))
+ Changed |= trySimplifyCondBr(MI, TBB, FBB, Cond);
+ }
+ }
+
+ return Changed;
+}
+
+/// Returns an instance of the RISC-V Late Optimization pass.
+FunctionPass *llvm::createRISCVLateOptPass() { return new RISCVLateOpt(); }
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index f78e5f8147d98..40c1aead7991b 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -127,6 +127,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeRISCVTarget() {
initializeRISCVPostLegalizerCombinerPass(*PR);
initializeKCFIPass(*PR);
initializeRISCVDeadRegisterDefinitionsPass(*PR);
+ initializeRISCVLateOptPass(*PR);
initializeRISCVMakeCompressibleOptPass(*PR);
initializeRISCVGatherScatterLoweringPass(*PR);
initializeRISCVCodeGenPreparePass(*PR);
@@ -565,6 +566,7 @@ void RISCVPassConfig::addPreEmitPass() {
if (TM->getOptLevel() >= CodeGenOptLevel::Default &&
EnableRISCVCopyPropagation)
addPass(createMachineCopyPropagationPass(true));
+ addPass(createRISCVLateOptPass());
addPass(&BranchRelaxationPassID);
addPass(createRISCVMakeCompressibleOptPass());
}
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
index 338925059862c..95af7861d4798 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
@@ -357,7 +357,7 @@ define i64 @ctpop_i64(i64 %a) nounwind {
define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
; RV32I-LABEL: ctpop_i64_ugt_two:
; RV32I: # %bb.0:
-; RV32I-NEXT: beqz zero, .LBB6_2
+; RV32I-NEXT: j .LBB6_2
; RV32I-NEXT: # %bb.1:
; RV32I-NEXT: sltiu a0, zero, 0
; RV32I-NEXT: ret
@@ -404,7 +404,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
;
; RV32ZBB-LABEL: ctpop_i64_ugt_two:
; RV32ZBB: # %bb.0:
-; RV32ZBB-NEXT: beqz zero, .LBB6_2
+; RV32ZBB-NEXT: j .LBB6_2
; RV32ZBB-NEXT: # %bb.1:
; RV32ZBB-NEXT: sltiu a0, zero, 0
; RV32ZBB-NEXT: ret
@@ -422,7 +422,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
; RV32I-LABEL: ctpop_i64_ugt_one:
; RV32I: # %bb.0:
-; RV32I-NEXT: beqz zero, .LBB7_2
+; RV32I-NEXT: j .LBB7_2
; RV32I-NEXT: # %bb.1:
; RV32I-NEXT: snez a0, zero
; RV32I-NEXT: ret
@@ -470,7 +470,7 @@ define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
;
; RV32ZBB-LABEL: ctpop_i64_ugt_one:
; RV32ZBB: # %bb.0:
-; RV32ZBB-NEXT: beqz zero, .LBB7_2
+; RV32ZBB-NEXT: j .LBB7_2
; RV32ZBB-NEXT: # %bb.1:
; RV32ZBB-NEXT: snez a0, zero
; RV32ZBB-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 694662eab1681..0b02c8a5e66cb 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -62,6 +62,7 @@
; CHECK-NEXT: Insert fentry calls
; CHECK-NEXT: Insert XRay ops
; CHECK-NEXT: Implement the 'patchable-function' attribute
+; CHECK-NEXT: RISC-V Late Stage Optimizations
; CHECK-NEXT: Branch relaxation pass
; CHECK-NEXT: RISC-V Make Compressible
; CHECK-NEXT: Contiguously Lay Out Funclets
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index beef7a574dc4f..1ca9ecaac6342 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -194,6 +194,7 @@
; CHECK-NEXT: Insert XRay ops
; CHECK-NEXT: Implement the 'patchable-function' attribute
; CHECK-NEXT: Machine Copy Propagation Pass
+; CHECK-NEXT: RISC-V Late Stage Optimizations
; CHECK-NEXT: Branch relaxation pass
; CHECK-NEXT: RISC-V Make Compressible
; CHECK-NEXT: Contiguously Lay Out Funclets
diff --git a/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll b/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
index 51ea8873d8c03..b2558cde29832 100644
--- a/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
+++ b/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
@@ -11,8 +11,7 @@ declare bfloat @dummy(bfloat)
define void @br_fcmp_false(bfloat %a, bfloat %b) nounwind {
; RV32IZFBFMIN-LABEL: br_fcmp_false:
; RV32IZFBFMIN: # %bb.0:
-; RV32IZFBFMIN-NEXT: li a0, 1
-; RV32IZFBFMIN-NEXT: bnez a0, .LBB0_2
+; RV32IZFBFMIN-NEXT: j .LBB0_2
; RV32IZFBFMIN-NEXT: # %bb.1: # %if.then
; RV32IZFBFMIN-NEXT: ret
; RV32IZFBFMIN-NEXT: .LBB0_2: # %if.else
@@ -22,8 +21,7 @@ define void @br_fcmp_false(bfloat %a, bfloat %b) nounwind {
;
; RV64IZFBFMIN-LABEL: br_fcmp_false:
; RV64IZFBFMIN: # %bb.0:
-; RV64IZFBFMIN-NEXT: li a0, 1
-; RV64IZFBFMIN-NEXT: bnez a0, .LBB0_2
+; RV64IZFBFMIN-NEXT: j .LBB0_2
; RV64IZFBFMIN-NEXT: # %bb.1: # %if.then
; RV64IZFBFMIN-NEXT: ret
; RV64IZFBFMIN-NEXT: .LBB0_2: # %if.else
@@ -583,8 +581,7 @@ if.then:
define void @br_fcmp_true(bfloat %a, bfloat %b) nounwind {
; RV32IZFBFMIN-LABEL: br_fcmp_true:
; RV32IZFBFMIN: # %bb.0:
-; RV32IZFBFMIN-NEXT: li a0, 1
-; RV32IZFBFMIN-NEXT: bnez a0, .LBB16_2
+; RV32IZFBFMIN-NEXT: j .LBB16_2
; RV32IZFBFMIN-NEXT: # %bb.1: # %if.else
; RV32IZFBFMIN-NEXT: ret
; RV32IZFBFMIN-NEXT: .LBB16_2: # %if.then
@@ -594,8 +591,7 @@ define void @br_fcmp_true(bfloat %a, bfloat %b) nounwind {
;
; RV64IZFBFMIN-LABEL: br_fcmp_true:
; RV64IZFBFMIN: # %bb.0:
-; RV64IZFBFMIN-NEXT: li a0, 1
-; RV64IZFBFMIN-NEXT: bnez a0, .LBB16_2
+; RV64IZFBFMIN-NEXT: j .LBB16_2
; RV64IZFBFMIN-NEXT: # %bb.1: # %if.else
; RV64IZFBFMIN-NEXT: ret
; RV64IZFBFMIN-NEXT: .LBB16_2: # %if.then
diff --git a/llvm/test/CodeGen/RISCV/branch_zero.ll b/llvm/test/CodeGen/RISCV/branch_zero.ll
index fd0979977ba3b..0554f8c168c80 100644
--- a/llvm/test/CodeGen/RISCV/branch_zero.ll
+++ b/llvm/test/CodeGen/RISCV/branch_zero.ll
@@ -5,15 +5,13 @@
define void @foo(i16 %finder_idx) {
; CHECK-LABEL: foo:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: .LBB0_1: # %for.body
-; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: # %bb.1: # %for.body
; CHECK-NEXT: slli a0, a0, 48
; CHECK-NEXT: bltz a0, .LBB0_4
; CHECK-NEXT: # %bb.2: # %while.cond.preheader.i
-; CHECK-NEXT: # in Loop: Header=BB0_1 Depth=1
; CHECK-NEXT: li a0, 0
-; CHECK-NEXT: bnez zero, .LBB0_1
-; CHECK-NEXT: # %bb.3: # %while.body
+; CHECK-NEXT: j .LBB0_3
+; CHECK-NEXT: .LBB0_3: # %while.body
; CHECK-NEXT: .LBB0_4: # %while.cond1.preheader.i
entry:
br label %for.body
@@ -46,15 +44,13 @@ if.then:
define void @bar(i16 %finder_idx) {
; CHECK-LABEL: bar:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: .LBB1_1: # %for.body
-; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: # %bb.1: # %for.body
; CHECK-NEXT: slli a0, a0, 48
; CHECK-NEXT: bgez a0, .LBB1_4
; CHECK-NEXT: # %bb.2: # %while.cond.preheader.i
-; CHECK-NEXT: # in Loop: Header=BB1_1 Depth=1
; CHECK-NEXT: li a0, 0
-; CHECK-NEXT: bnez zero, .LBB1_1
-; CHECK-NEXT: # %bb.3: # %while.body
+; CHECK-NEXT: j .LBB1_3
+; CHECK-NEXT: .LBB1_3: # %while.body
; CHECK-NEXT: .LBB1_4: # %while.cond1.preheader.i
entry:
br label %for.body
diff --git a/llvm/test/CodeGen/RISCV/double-br-fcmp.ll b/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
index 035228e73c707..b2c882878f8bc 100644
--- a/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
+++ b/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
@@ -14,8 +14,7 @@ declare void @exit(i32)
define void @br_fcmp_false(double %a, double %b) nounwind {
; RV32IFD-LABEL: br_fcmp_false:
; RV32IFD: # %bb.0:
-; RV32IFD-NEXT: li a0, 1
-; RV32IFD-NEXT: bnez a0, .LBB0_2
+; RV32IFD-NEXT: j .LBB0_2
; RV32IFD-NEXT: # %bb.1: # %if.then
; RV32IFD-NEXT: ret
; RV32IFD-NEXT: .LBB0_2: # %if.else
@@ -25,8 +24,7 @@ define void @br_fcmp_false(double %a, double %b) nounwind {
;
; RV64IFD-LABEL: br_fcmp_false:
; RV64IFD: # %bb.0:
-; RV64IFD-NEXT: li a0, 1
-; RV64IFD-NEXT: bnez a0, .LBB0_2
+; RV64IFD-NEXT: j .LBB0_2
; RV64IFD-NEXT: # %bb.1: # %if.then
; RV64IFD-NEX...
[truncated]
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
…unction of RISCVInstrInfo Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Can we add a test that shows the issue from real code?
@mikhailramalho feel free to use this test reduced from perlbench
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
I see that #131728 has a different version with an extra function; should I use that instead?
I forgot about that one. That's a good idea. The reason we got an extra test there is because the first one got cleaned up in BranchFolding with a call to the modified analyzeBranch. The second one didn't, though, and needed the late optimizer. Without changes to analyzeBranch (the current approach we're pursuing), we will optimize both here, I think.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
LGTM, but obviously we need to wait for @preames change to go in first.
He landed it this morning. I'll update the patch description.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Can we call this something like "RISC-V Late Peephole Optimisation Pass" or "RISC-V Late Branch Optimisation Pass", rather than just "Late Optimisation"? I don't want that to prevent us from adding more things to it in the future, but I do want a slightly more descriptive name.
I can rename it to "RISC-V Late Branch Optimisation Pass". I removed "Peephole" from the previous version because we are changing the CFG.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
I'm happy with the new name, thanks!
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/7720
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/17/builds/6824
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/143/builds/6502
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2595
This is a follow-up patch to PR llvm#133256. This patch adds the branch folding pass after the newly added late optimization pass for riscv, which reduces code size in all SPEC benchmarks (except libm). The improvements are: 500.perlbench_r (-3.37%), 544.nab_r (-3.06%), 557.xz_r (-2.82%), 523.xalancbmk_r (-2.64%), 520.omnetpp_r (-2.34%), 531.deepsjeng_r (-2.27%), 502.gcc_r (-2.19%), 526.blender_r (-2.11%), 538.imagick_r (-2.03%), 505.mcf_r (-1.82%), 541.leela_r (-1.74%), 511.povray_r (-1.62%), 510.parest_r (-1.62%), 508.namd_r (-1.57%), 525.x264_r (-1.47%). Geo mean is -2.07%. Some caveats: * On llvm#131728 I mentioned a 7% improvement on execution time of xz, but that's no longer the case. I went back and also tried to reproduce the result with the code from llvm#131728 and couldn't. Now the results from that PR and this one are the same: an overall code size reduction but no exec time improvements. * The root cause of the large code-size improvement is not yet clear to me. I'm still investigating it.
This patch is an alternative to PRs #117060, #131684, #131728.
The patch adds a late optimization pass that replaces conditional branches that can be statically evaluated with an unconditional branch.
Once this PR lands, I plan to send a follow-up patch that reduces code size by adding a branch folding pass after the newly added late optimization pass.
Adding Michael as a co-author, as most of the code that evaluates the condition comes from #131684.
Co-authored-by: Michael Maitland michaeltmaitland@gmail.com