[RISCV] Add late optimization pass for riscv #133256

Merged
merged 28 commits into from
Mar 27, 2025

mikhailramalho
Member

@mikhailramalho mikhailramalho commented Mar 27, 2025

This patch is an alternative to PRs #117060, #131684, #131728.

The patch adds a late optimization pass that replaces conditional branches that can be statically evaluated with an unconditional branch.
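As a rough illustration of what "statically evaluated" means here, the standalone sketch below mirrors the shape of the `evaluateCondBranch` helper from the diff. The `CondCode` enum, `evaluate`, and `checkExamples` are stand-ins invented for this example and are not the LLVM definitions; the point is only that the unsigned conditions must compare the operands as unsigned values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for RISCVCC::CondCode; not the LLVM enum.
enum class CondCode { EQ, NE, LT, GE, LTU, GEU };

// Returns true if a branch with condition CC would be taken when both
// operands are known constants C0 and C1.
bool evaluate(CondCode CC, int64_t C0, int64_t C1) {
  switch (CC) {
  case CondCode::EQ:
    return C0 == C1;
  case CondCode::NE:
    return C0 != C1;
  case CondCode::LT:
    return C0 < C1;
  case CondCode::GE:
    return C0 >= C1;
  case CondCode::LTU:
    return static_cast<uint64_t>(C0) < static_cast<uint64_t>(C1);
  case CondCode::GEU:
    return static_cast<uint64_t>(C0) >= static_cast<uint64_t>(C1);
  }
  return false; // Not reached for valid condition codes.
}

// Illustrative checks only.
void checkExamples() {
  // `li a0, 1` followed by `bnez a0, L` compares 1 against x0 (0): taken.
  assert(evaluate(CondCode::NE, 1, 0));
  // -1 is less than 1 as a signed value, but not as an unsigned value.
  assert(evaluate(CondCode::LT, -1, 1));
  assert(!evaluate(CondCode::LTU, -1, 1));
}
```

When the result is true the branch can become a jump to the taken target; when it is false, control simply continues to the other successor.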

Once this PR lands, I plan to send a follow-up patch that reduces code size by adding a branch folding pass after the newly added late optimization pass.

Adding Michael as a co-author as most of the code that evaluates the condition comes from #131684.

Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>

preames and others added 4 commits March 25, 2025 12:52
optimizeCondBranch isn't allowed to modify the CFG, but it can rewrite
the branch condition freely.  However, if we could fold a conditional
branch to an unconditional one (aside from that restriction), we can
also rewrite it into some canonical conditional branch instead.

Looking at the diffs, the only cases this catches in tree tests are
cases where we could have constant folded during lowering from IR,
but didn't.  This is inspired by trying to salvage code from llvm#131684
which might be useful.  Given the test impact, it's of questionable merit.
The main advantage over only the late cleanup pass is that it kills off the
LIs for the constants early - which can help e.g. register allocation.
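
The canonicalization described in this commit message can be sketched in isolation. The `CondBranch` struct and the `canonicalize` and `canonicalizeExample` functions below are hypothetical names used only for illustration, not LLVM types; the sketch assumes the outcome of the original condition has already been computed, and shows that the rewrite keeps a conditional branch (so both CFG edges remain) while dropping the dependence on the constant-loading registers.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of a conditional branch; not an LLVM type.
struct CondBranch {
  std::string Opcode; // e.g. "blt", "bgeu"
  std::string LHS;    // register name
  std::string RHS;    // register name
  std::string Target; // label of the taken successor
};

// Rewrite a branch whose outcome is known into a canonical form:
//   always taken -> beq x0, x0, Target  (printed as `beqz zero, Target`)
//   never taken  -> bne x0, x0, Target  (printed as `bnez zero, Target`)
// The instruction stays a conditional branch, so the CFG is untouched.
CondBranch canonicalize(const CondBranch &Br, bool KnownTaken) {
  return {KnownTaken ? "beq" : "bne", "x0", "x0", Br.Target};
}

// Illustrative check only.
void canonicalizeExample() {
  CondBranch Br{"blt", "a0", "a1", ".LBB0_2"};
  CondBranch Taken = canonicalize(Br, /*KnownTaken=*/true);
  assert(Taken.Opcode == "beq" && Taken.LHS == "x0" && Taken.RHS == "x0");
  assert(Taken.Target == ".LBB0_2");
}
```

Because the rewritten branch no longer reads the registers that held the constants, their defining `li` instructions lose a use and can be removed early if nothing else uses them.
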
This patch is an alternative to PRs llvm#117060, llvm#131684, llvm#131728.

The patch adds a late optimization pass that replaces conditional
branches that can be statically evaluated with an unconditional branch.

Adding Michael as a co-author as most of the code that evaluates the
condition comes from llvm#131684.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@llvmbot
Member

llvmbot commented Mar 27, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: Mikhail R. Gadelha (mikhailramalho)

Changes

This patch is an alternative to PRs #117060, #131684, #131728, and builds on top of #132988.

The patch adds a late optimization pass that replaces conditional branches that can be statically evaluated with an unconditional branch.

Once this PR and #132988 land, I plan to send a follow-up patch that reduces code size by adding a branch folding pass after the newly added late optimization pass.

Adding Michael as a co-author as most of the code that evaluates the condition comes from #131684.

Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>


Patch is 38.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133256.diff

15 Files Affected:

  • (modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1)
  • (modified) llvm/lib/Target/RISCV/RISCV.h (+3)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+67-34)
  • (added) llvm/lib/Target/RISCV/RISCVLateOpt.cpp (+190)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+2)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll (+4-8)
  • (modified) llvm/test/CodeGen/RISCV/branch_zero.ll (+6-10)
  • (modified) llvm/test/CodeGen/RISCV/double-br-fcmp.ll (+8-16)
  • (modified) llvm/test/CodeGen/RISCV/float-br-fcmp.ll (+8-16)
  • (modified) llvm/test/CodeGen/RISCV/half-br-fcmp.ll (+16-32)
  • (modified) llvm/test/CodeGen/RISCV/machine-sink-load-immediate.ll (+55-65)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll (+6-4)
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index e8d00f4df7c86..c9609d224414d 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -35,6 +35,7 @@ add_llvm_target(RISCVCodeGen
   RISCVConstantPoolValue.cpp
   RISCVDeadRegisterDefinitions.cpp
   RISCVMakeCompressible.cpp
+  RISCVLateOpt.cpp
   RISCVExpandAtomicPseudoInsts.cpp
   RISCVExpandPseudoInsts.cpp
   RISCVFoldMemOffset.cpp
diff --git a/llvm/lib/Target/RISCV/RISCV.h b/llvm/lib/Target/RISCV/RISCV.h
index 641e2eb4094f9..1f1d7e1fa21df 100644
--- a/llvm/lib/Target/RISCV/RISCV.h
+++ b/llvm/lib/Target/RISCV/RISCV.h
@@ -40,6 +40,9 @@ void initializeRISCVLandingPadSetupPass(PassRegistry &);
 FunctionPass *createRISCVISelDag(RISCVTargetMachine &TM,
                                  CodeGenOptLevel OptLevel);
 
+FunctionPass *createRISCVLateOptPass();
+void initializeRISCVLateOptPass(PassRegistry &);
+
 FunctionPass *createRISCVMakeCompressibleOptPass();
 void initializeRISCVMakeCompressibleOptPass(PassRegistry &);
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 62f978d64fbb9..af9975aca206e 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -998,6 +998,25 @@ static RISCVCC::CondCode getCondFromBranchOpc(unsigned Opc) {
   }
 }
 
+static bool evaluateCondBranch(unsigned CC, int64_t C0, int64_t C1) {
+  switch (CC) {
+  default:
+    llvm_unreachable("Unexpected CC");
+  case RISCVCC::COND_EQ:
+    return C0 == C1;
+  case RISCVCC::COND_NE:
+    return C0 != C1;
+  case RISCVCC::COND_LT:
+    return C0 < C1;
+  case RISCVCC::COND_GE:
+    return C0 >= C1;
+  case RISCVCC::COND_LTU:
+    return (uint64_t)C0 < (uint64_t)C1;
+  case RISCVCC::COND_GEU:
+    return (uint64_t)C0 >= (uint64_t)C1;
+  }
+}
+
 // The contents of values added to Cond are not examined outside of
 // RISCVInstrInfo, giving us flexibility in what to push to it. For RISCV, we
 // push BranchOpcode, Reg1, Reg2.
@@ -1295,6 +1314,49 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
   RISCVCC::CondCode CC = static_cast<RISCVCC::CondCode>(Cond[0].getImm());
   assert(CC != RISCVCC::COND_INVALID);
 
+  auto modifyBranch = [&]() {
+    // Build the new branch and remove the old one.
+    BuildMI(*MBB, MI, MI.getDebugLoc(),
+            getBrCond(static_cast<RISCVCC::CondCode>(Cond[0].getImm())))
+        .add(Cond[1])
+        .add(Cond[2])
+        .addMBB(TBB);
+    MI.eraseFromParent();
+  };
+
+  // Right now we only care about LI (i.e. ADDI x0, imm)
+  auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
+    if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
+        MI->getOperand(1).getReg() == RISCV::X0) {
+      Imm = MI->getOperand(2).getImm();
+      return true;
+    }
+    return false;
+  };
+  // Either a load from immediate instruction or X0.
+  auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
+    if (!Op.isReg())
+      return false;
+    Register Reg = Op.getReg();
+    if (Reg == RISCV::X0) {
+      Imm = 0;
+      return true;
+    }
+    return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
+  };
+
+  // Canonicalize conditional branches which can be constant folded into
+  // beqz or bnez.  We can't modify the CFG here.
+  int64_t C0, C1;
+  if (isFromLoadImm(Cond[1], C0) && isFromLoadImm(Cond[2], C1)) {
+    unsigned NewCC =
+        evaluateCondBranch(CC, C0, C1) ? RISCVCC::COND_EQ : RISCVCC::COND_NE;
+    Cond[0] = MachineOperand::CreateImm(NewCC);
+    Cond[1] = Cond[2] = MachineOperand::CreateReg(RISCV::X0, /*isDef=*/false);
+    modifyBranch();
+    return true;
+  }
+
   if (CC == RISCVCC::COND_EQ || CC == RISCVCC::COND_NE)
     return false;
 
@@ -1315,24 +1377,6 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
   //
   // To make sure this optimization is really beneficial, we only
   // optimize for cases where Y had only one use (i.e. only used by the branch).
-
-  // Right now we only care about LI (i.e. ADDI x0, imm)
-  auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
-    if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
-        MI->getOperand(1).getReg() == RISCV::X0) {
-      Imm = MI->getOperand(2).getImm();
-      return true;
-    }
-    return false;
-  };
-  // Either a load from immediate instruction or X0.
-  auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
-    if (!Op.isReg())
-      return false;
-    Register Reg = Op.getReg();
-    return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
-  };
-
   MachineOperand &LHS = MI.getOperand(0);
   MachineOperand &RHS = MI.getOperand(1);
   // Try to find the register for constant Z; return
@@ -1350,8 +1394,6 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
     return Register();
   };
 
-  bool Modify = false;
-  int64_t C0;
   if (isFromLoadImm(LHS, C0) && MRI.hasOneUse(LHS.getReg())) {
     // Might be case 1.
     // Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
@@ -1364,7 +1406,8 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
         // We might extend the live range of Z, clear its kill flag to
         // account for this.
         MRI.clearKillFlags(RegZ);
-        Modify = true;
+        modifyBranch();
+        return true;
       }
   } else if (isFromLoadImm(RHS, C0) && MRI.hasOneUse(RHS.getReg())) {
     // Might be case 2.
@@ -1378,22 +1421,12 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
         // We might extend the live range of Z, clear its kill flag to
         // account for this.
         MRI.clearKillFlags(RegZ);
-        Modify = true;
+        modifyBranch();
+        return true;
       }
   }
 
-  if (!Modify)
-    return false;
-
-  // Build the new branch and remove the old one.
-  BuildMI(*MBB, MI, MI.getDebugLoc(),
-          getBrCond(static_cast<RISCVCC::CondCode>(Cond[0].getImm())))
-      .add(Cond[1])
-      .add(Cond[2])
-      .addMBB(TBB);
-  MI.eraseFromParent();
-
-  return true;
+  return false;
 }
 
 MachineBasicBlock *
diff --git a/llvm/lib/Target/RISCV/RISCVLateOpt.cpp b/llvm/lib/Target/RISCV/RISCVLateOpt.cpp
new file mode 100644
index 0000000000000..7fa04f0cbba9b
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVLateOpt.cpp
@@ -0,0 +1,190 @@
+//===-- RISCVLateOpt.cpp - Late stage optimization ------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// Late pass that turns statically evaluable conditional branches into jumps.
+///
+//===----------------------------------------------------------------------===//
+
+#include "MCTargetDesc/RISCVMCTargetDesc.h"
+#include "RISCV.h"
+#include "RISCVInstrInfo.h"
+#include "RISCVSubtarget.h"
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
+#include "llvm/CodeGen/MachineDominators.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/Passes.h"
+#include "llvm/CodeGen/RegisterScavenging.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/Debug.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "riscv-late-opt"
+#define RISCV_LATE_OPT_NAME "RISC-V Late Stage Optimizations"
+
+namespace {
+
+struct RISCVLateOpt : public MachineFunctionPass {
+  static char ID;
+
+  RISCVLateOpt() : MachineFunctionPass(ID) {}
+
+  StringRef getPassName() const override { return RISCV_LATE_OPT_NAME; }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    MachineFunctionPass::getAnalysisUsage(AU);
+  }
+
+  bool runOnMachineFunction(MachineFunction &Fn) override;
+
+private:
+  bool trySimplifyCondBr(MachineInstr *MI, MachineBasicBlock *TBB,
+                         MachineBasicBlock *FBB,
+                         SmallVectorImpl<MachineOperand> &Cond) const;
+
+  const RISCVInstrInfo *RII = nullptr;
+};
+} // namespace
+
+char RISCVLateOpt::ID = 0;
+INITIALIZE_PASS(RISCVLateOpt, "riscv-late-opt", RISCV_LATE_OPT_NAME, false,
+                false)
+
+bool RISCVLateOpt::trySimplifyCondBr(
+    MachineInstr *MI, MachineBasicBlock *TBB, MachineBasicBlock *FBB,
+    SmallVectorImpl<MachineOperand> &Cond) const {
+
+  RISCVCC::CondCode CC = static_cast<RISCVCC::CondCode>(Cond[0].getImm());
+  assert(CC != RISCVCC::COND_INVALID);
+
+  // Right now we only care about LI (i.e. ADDI x0, imm)
+  auto isLoadImm = [](const MachineInstr *MI, int64_t &Imm) -> bool {
+    if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
+        MI->getOperand(1).getReg() == RISCV::X0) {
+      Imm = MI->getOperand(2).getImm();
+      return true;
+    }
+    return false;
+  };
+
+  MachineBasicBlock *MBB = MI->getParent();
+  MachineRegisterInfo &MRI = MBB->getParent()->getRegInfo();
+  // Either a load from immediate instruction or X0.
+  auto isFromLoadImm = [&](const MachineOperand &Op, int64_t &Imm) -> bool {
+    if (!Op.isReg())
+      return false;
+    Register Reg = Op.getReg();
+    if (Reg == RISCV::X0) {
+      Imm = 0;
+      return true;
+    }
+    return Reg.isVirtual() && isLoadImm(MRI.getVRegDef(Reg), Imm);
+  };
+
+  // Try and convert a conditional branch that can be evaluated statically
+  // into an unconditional branch.
+  MachineBasicBlock *Folded = nullptr;
+  int64_t C0, C1;
+  if (isFromLoadImm(Cond[1], C0) && isFromLoadImm(Cond[2], C1)) {
+    switch (CC) {
+    case RISCVCC::COND_INVALID:
+      llvm_unreachable("Unexpected CC");
+    case RISCVCC::COND_EQ: {
+      Folded = (C0 == C1) ? TBB : FBB;
+      break;
+    }
+    case RISCVCC::COND_NE: {
+      Folded = (C0 != C1) ? TBB : FBB;
+      break;
+    }
+    case RISCVCC::COND_LT: {
+      Folded = (C0 < C1) ? TBB : FBB;
+      break;
+    }
+    case RISCVCC::COND_GE: {
+      Folded = (C0 >= C1) ? TBB : FBB;
+      break;
+    }
+    case RISCVCC::COND_LTU: {
+      Folded = ((uint64_t)C0 < (uint64_t)C1) ? TBB : FBB;
+      break;
+    }
+    case RISCVCC::COND_GEU: {
+      Folded = ((uint64_t)C0 >= (uint64_t)C1) ? TBB : FBB;
+      break;
+    }
+    }
+
+    // Do the conversion
+    // Build the new unconditional branch
+    DebugLoc DL = MBB->findBranchDebugLoc();
+    if (Folded) {
+      BuildMI(*MBB, MI, DL, RII->get(RISCV::PseudoBR)).addMBB(Folded);
+    } else {
+      MachineFunction::iterator Fallthrough = ++MBB->getIterator();
+      if (Fallthrough == MBB->getParent()->end())
+        return false;
+      BuildMI(*MBB, MI, DL, RII->get(RISCV::PseudoBR)).addMBB(&*Fallthrough);
+    }
+
+    // Update successors of MBB.
+    if (Folded == TBB) {
+      // If we're taking TBB, then the succ to delete is the fallthrough (if
+      // it was a succ in the first place), or it's the MBB from the
+      // unconditional branch.
+      if (!FBB) {
+        MachineFunction::iterator Fallthrough = ++MBB->getIterator();
+        if (Fallthrough != MBB->getParent()->end() &&
+            MBB->isSuccessor(&*Fallthrough))
+          MBB->removeSuccessor(&*Fallthrough, true);
+      } else {
+        MBB->removeSuccessor(FBB, true);
+      }
+    } else if (Folded == FBB) {
+      // If we're taking the fallthrough or unconditional branch, then the
+      // succ to remove is the one from the conditional branch.
+      MBB->removeSuccessor(TBB, true);
+    }
+
+    MI->eraseFromParent();
+    return true;
+  }
+  return false;
+}
+
+bool RISCVLateOpt::runOnMachineFunction(MachineFunction &Fn) {
+  if (skipFunction(Fn.getFunction()))
+    return false;
+
+  auto &ST = Fn.getSubtarget<RISCVSubtarget>();
+  RII = ST.getInstrInfo();
+
+  bool Changed = false;
+
+  for (MachineBasicBlock &MBB : Fn) {
+    for (MachineBasicBlock::iterator MII = MBB.begin(), MIE = MBB.end();
+         MII != MIE;) {
+      MachineInstr *MI = &*MII;
+      // We may be erasing MI below, increment MII now.
+      ++MII;
+      if (!MI->isConditionalBranch())
+        continue;
+
+      MachineBasicBlock *TBB, *FBB;
+      SmallVector<MachineOperand, 4> Cond;
+      if (!RII->analyzeBranch(MBB, TBB, FBB, Cond, /*AllowModify=*/false))
+        Changed |= trySimplifyCondBr(MI, TBB, FBB, Cond);
+    }
+  }
+
+  return Changed;
+}
+
+/// Returns an instance of the RISC-V Late Stage Optimizations pass.
+FunctionPass *llvm::createRISCVLateOptPass() { return new RISCVLateOpt(); }
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index f78e5f8147d98..40c1aead7991b 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -127,6 +127,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeRISCVTarget() {
   initializeRISCVPostLegalizerCombinerPass(*PR);
   initializeKCFIPass(*PR);
   initializeRISCVDeadRegisterDefinitionsPass(*PR);
+  initializeRISCVLateOptPass(*PR);
   initializeRISCVMakeCompressibleOptPass(*PR);
   initializeRISCVGatherScatterLoweringPass(*PR);
   initializeRISCVCodeGenPreparePass(*PR);
@@ -565,6 +566,7 @@ void RISCVPassConfig::addPreEmitPass() {
   if (TM->getOptLevel() >= CodeGenOptLevel::Default &&
       EnableRISCVCopyPropagation)
     addPass(createMachineCopyPropagationPass(true));
+  addPass(createRISCVLateOptPass());
   addPass(&BranchRelaxationPassID);
   addPass(createRISCVMakeCompressibleOptPass());
 }
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
index 338925059862c..95af7861d4798 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv32zbb.ll
@@ -357,7 +357,7 @@ define i64 @ctpop_i64(i64 %a) nounwind {
 define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
 ; RV32I-LABEL: ctpop_i64_ugt_two:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    beqz zero, .LBB6_2
+; RV32I-NEXT:    j .LBB6_2
 ; RV32I-NEXT:  # %bb.1:
 ; RV32I-NEXT:    sltiu a0, zero, 0
 ; RV32I-NEXT:    ret
@@ -404,7 +404,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
 ;
 ; RV32ZBB-LABEL: ctpop_i64_ugt_two:
 ; RV32ZBB:       # %bb.0:
-; RV32ZBB-NEXT:    beqz zero, .LBB6_2
+; RV32ZBB-NEXT:    j .LBB6_2
 ; RV32ZBB-NEXT:  # %bb.1:
 ; RV32ZBB-NEXT:    sltiu a0, zero, 0
 ; RV32ZBB-NEXT:    ret
@@ -422,7 +422,7 @@ define i1 @ctpop_i64_ugt_two(i64 %a) nounwind {
 define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
 ; RV32I-LABEL: ctpop_i64_ugt_one:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    beqz zero, .LBB7_2
+; RV32I-NEXT:    j .LBB7_2
 ; RV32I-NEXT:  # %bb.1:
 ; RV32I-NEXT:    snez a0, zero
 ; RV32I-NEXT:    ret
@@ -470,7 +470,7 @@ define i1 @ctpop_i64_ugt_one(i64 %a) nounwind {
 ;
 ; RV32ZBB-LABEL: ctpop_i64_ugt_one:
 ; RV32ZBB:       # %bb.0:
-; RV32ZBB-NEXT:    beqz zero, .LBB7_2
+; RV32ZBB-NEXT:    j .LBB7_2
 ; RV32ZBB-NEXT:  # %bb.1:
 ; RV32ZBB-NEXT:    snez a0, zero
 ; RV32ZBB-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 694662eab1681..0b02c8a5e66cb 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -62,6 +62,7 @@
 ; CHECK-NEXT:       Insert fentry calls
 ; CHECK-NEXT:       Insert XRay ops
 ; CHECK-NEXT:       Implement the 'patchable-function' attribute
+; CHECK-NEXT:       RISC-V Late Stage Optimizations
 ; CHECK-NEXT:       Branch relaxation pass
 ; CHECK-NEXT:       RISC-V Make Compressible
 ; CHECK-NEXT:       Contiguously Lay Out Funclets
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index beef7a574dc4f..1ca9ecaac6342 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -194,6 +194,7 @@
 ; CHECK-NEXT:       Insert XRay ops
 ; CHECK-NEXT:       Implement the 'patchable-function' attribute
 ; CHECK-NEXT:       Machine Copy Propagation Pass
+; CHECK-NEXT:       RISC-V Late Stage Optimizations
 ; CHECK-NEXT:       Branch relaxation pass
 ; CHECK-NEXT:       RISC-V Make Compressible
 ; CHECK-NEXT:       Contiguously Lay Out Funclets
diff --git a/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll b/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
index 51ea8873d8c03..b2558cde29832 100644
--- a/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
+++ b/llvm/test/CodeGen/RISCV/bfloat-br-fcmp.ll
@@ -11,8 +11,7 @@ declare bfloat @dummy(bfloat)
 define void @br_fcmp_false(bfloat %a, bfloat %b) nounwind {
 ; RV32IZFBFMIN-LABEL: br_fcmp_false:
 ; RV32IZFBFMIN:       # %bb.0:
-; RV32IZFBFMIN-NEXT:    li a0, 1
-; RV32IZFBFMIN-NEXT:    bnez a0, .LBB0_2
+; RV32IZFBFMIN-NEXT:    j .LBB0_2
 ; RV32IZFBFMIN-NEXT:  # %bb.1: # %if.then
 ; RV32IZFBFMIN-NEXT:    ret
 ; RV32IZFBFMIN-NEXT:  .LBB0_2: # %if.else
@@ -22,8 +21,7 @@ define void @br_fcmp_false(bfloat %a, bfloat %b) nounwind {
 ;
 ; RV64IZFBFMIN-LABEL: br_fcmp_false:
 ; RV64IZFBFMIN:       # %bb.0:
-; RV64IZFBFMIN-NEXT:    li a0, 1
-; RV64IZFBFMIN-NEXT:    bnez a0, .LBB0_2
+; RV64IZFBFMIN-NEXT:    j .LBB0_2
 ; RV64IZFBFMIN-NEXT:  # %bb.1: # %if.then
 ; RV64IZFBFMIN-NEXT:    ret
 ; RV64IZFBFMIN-NEXT:  .LBB0_2: # %if.else
@@ -583,8 +581,7 @@ if.then:
 define void @br_fcmp_true(bfloat %a, bfloat %b) nounwind {
 ; RV32IZFBFMIN-LABEL: br_fcmp_true:
 ; RV32IZFBFMIN:       # %bb.0:
-; RV32IZFBFMIN-NEXT:    li a0, 1
-; RV32IZFBFMIN-NEXT:    bnez a0, .LBB16_2
+; RV32IZFBFMIN-NEXT:    j .LBB16_2
 ; RV32IZFBFMIN-NEXT:  # %bb.1: # %if.else
 ; RV32IZFBFMIN-NEXT:    ret
 ; RV32IZFBFMIN-NEXT:  .LBB16_2: # %if.then
@@ -594,8 +591,7 @@ define void @br_fcmp_true(bfloat %a, bfloat %b) nounwind {
 ;
 ; RV64IZFBFMIN-LABEL: br_fcmp_true:
 ; RV64IZFBFMIN:       # %bb.0:
-; RV64IZFBFMIN-NEXT:    li a0, 1
-; RV64IZFBFMIN-NEXT:    bnez a0, .LBB16_2
+; RV64IZFBFMIN-NEXT:    j .LBB16_2
 ; RV64IZFBFMIN-NEXT:  # %bb.1: # %if.else
 ; RV64IZFBFMIN-NEXT:    ret
 ; RV64IZFBFMIN-NEXT:  .LBB16_2: # %if.then
diff --git a/llvm/test/CodeGen/RISCV/branch_zero.ll b/llvm/test/CodeGen/RISCV/branch_zero.ll
index fd0979977ba3b..0554f8c168c80 100644
--- a/llvm/test/CodeGen/RISCV/branch_zero.ll
+++ b/llvm/test/CodeGen/RISCV/branch_zero.ll
@@ -5,15 +5,13 @@
 define void @foo(i16 %finder_idx) {
 ; CHECK-LABEL: foo:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:  .LBB0_1: # %for.body
-; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:  # %bb.1: # %for.body
 ; CHECK-NEXT:    slli a0, a0, 48
 ; CHECK-NEXT:    bltz a0, .LBB0_4
 ; CHECK-NEXT:  # %bb.2: # %while.cond.preheader.i
-; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
 ; CHECK-NEXT:    li a0, 0
-; CHECK-NEXT:    bnez zero, .LBB0_1
-; CHECK-NEXT:  # %bb.3: # %while.body
+; CHECK-NEXT:    j .LBB0_3
+; CHECK-NEXT:  .LBB0_3: # %while.body
 ; CHECK-NEXT:  .LBB0_4: # %while.cond1.preheader.i
 entry:
   br label %for.body
@@ -46,15 +44,13 @@ if.then:
 define void @bar(i16 %finder_idx) {
 ; CHECK-LABEL: bar:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:  .LBB1_1: # %for.body
-; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:  # %bb.1: # %for.body
 ; CHECK-NEXT:    slli a0, a0, 48
 ; CHECK-NEXT:    bgez a0, .LBB1_4
 ; CHECK-NEXT:  # %bb.2: # %while.cond.preheader.i
-; CHECK-NEXT:    # in Loop: Header=BB1_1 Depth=1
 ; CHECK-NEXT:    li a0, 0
-; CHECK-NEXT:    bnez zero, .LBB1_1
-; CHECK-NEXT:  # %bb.3: # %while.body
+; CHECK-NEXT:    j .LBB1_3
+; CHECK-NEXT:  .LBB1_3: # %while.body
 ; CHECK-NEXT:  .LBB1_4: # %while.cond1.preheader.i
 entry:
   br label %for.body
diff --git a/llvm/test/CodeGen/RISCV/double-br-fcmp.ll b/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
index 035228e73c707..b2c882878f8bc 100644
--- a/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
+++ b/llvm/test/CodeGen/RISCV/double-br-fcmp.ll
@@ -14,8 +14,7 @@ declare void @exit(i32)
 define void @br_fcmp_false(double %a, double %b) nounwind {
 ; RV32IFD-LABEL: br_fcmp_false:
 ; RV32IFD:       # %bb.0:
-; RV32IFD-NEXT:    li a0, 1
-; RV32IFD-NEXT:    bnez a0, .LBB0_2
+; RV32IFD-NEXT:    j .LBB0_2
 ; RV32IFD-NEXT:  # %bb.1: # %if.then
 ; RV32IFD-NEXT:    ret
 ; RV32IFD-NEXT:  .LBB0_2: # %if.else
@@ -25,8 +24,7 @@ define void @br_fcmp_false(double %a, double %b) nounwind {
 ;
 ; RV64IFD-LABEL: br_fcmp_false:
 ; RV64IFD:       # %bb.0:
-; RV64IFD-NEXT:    li a0, 1
-; RV64IFD-NEXT:    bnez a0, .LBB0_2
+; RV64IFD-NEXT:    j .LBB0_2
 ; RV64IFD-NEXT:  # %bb.1: # %if.then
 ; RV64IFD-NEX...
[truncated]

Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
…unction of RISCVInstrInfo

Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@topperc
Collaborator

topperc commented Mar 27, 2025

Can we add a test that shows the issue from real code?

@michaelmaitland
Contributor

Can we add a test that shows the issue from real code?

@mikhailramalho feel free to use this test reduced from perlbench

@mikhailramalho
Member Author

Can we add a test that shows the issue from real code?

@mikhailramalho feel free to use this test reduced from perlbench

I see that #131728 has a different version with an extra function; should I use that instead?

@michaelmaitland
Contributor

Can we add a test that shows the issue from real code?

@mikhailramalho feel free to use this test reduced from perlbench

I see that #131728 has a different version with an extra function; should I use that instead?

I forgot about that one. That's a good idea. The reason we got an extra test there is because the first one got cleaned up in BranchFolding with a call to the modified analyzeBranch. The second one didn't though, and needed the late optimizer. Without changes to analyzeBranch (the current approach we're pursuing), we will optimize both here I think.

Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Contributor

@michaelmaitland michaelmaitland left a comment

LGTM, but obviously we need to wait for @preames change to go in first.

@mikhailramalho
Member Author

LGTM, but obviously we need to wait for @preames change to go in first.

He landed it this morning. I'll update the patch description.

Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@lenary
Member

lenary commented Mar 27, 2025

Can we call this something like "RISC-V Late Peephole Optimisation Pass" or "RISC-V Late Branch Optimisation Pass", rather than just "Late Optimisation"? I don't want that to prevent us adding more things to it in the future, but I do want to have a slightly more descriptive name.

Collaborator

@preames preames left a comment

LGTM w/prior style and naming comments from @topperc and @lenary addressed.

@mikhailramalho
Member Author

Can we call this something like "RISC-V Late Peephole Optimisation Pass" or "RISC-V Late Branch Optimisation Pass", rather than just "Late Optimisation". I don't want that to prevent us adding more things to it in the future, but I do want to have a slightly more descriptive name

I can rename it to "RISC-V Late Branch Optimisation Pass". I removed "Peephole" from the name used in the previous version because we are changing the CFG.
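
To illustrate why folding a branch counts as a CFG change rather than a pure peephole rewrite, here is a toy model. `Block`, `foldBranch`, and `foldExample` are made-up names for this sketch and do not correspond to `MachineBasicBlock` or any LLVM API; the sketch only shows that replacing the terminator also removes a successor edge.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of a basic block; not MachineBasicBlock.
struct Block {
  std::string Terminator;         // textual terminator, e.g. "bnez a0, B2"
  std::vector<std::string> Succs; // names of successor blocks
};

// Replace a conditional branch whose outcome is known with an unconditional
// jump to Kept, and drop the now-dead CFG edge to Dropped.
void foldBranch(Block &B, const std::string &Kept, const std::string &Dropped) {
  B.Terminator = "j " + Kept;
  if (Kept != Dropped)
    B.Succs.erase(std::remove(B.Succs.begin(), B.Succs.end(), Dropped),
                  B.Succs.end());
}

// Illustrative check only.
void foldExample() {
  Block B{"bnez a0, B2", {"B1", "B2"}};
  foldBranch(B, /*Kept=*/"B2", /*Dropped=*/"B1");
  assert(B.Terminator == "j B2");
  assert(B.Succs.size() == 1 && B.Succs[0] == "B2");
}
```

A peephole rewrite would leave `Succs` unchanged; here the successor list shrinks, which is the kind of edit the comment above refers to.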

Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@lenary
Member

lenary commented Mar 27, 2025

I'm happy with the new name, thanks!

@mikhailramalho mikhailramalho merged commit d8e44a9 into llvm:main Mar 27, 2025
11 checks passed
@mikhailramalho mikhailramalho deleted the riscv-late-opt1 branch March 27, 2025 22:31
@llvm-ci
Collaborator

llvm-ci commented Mar 27, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-android running on sanitizer-buildbot-android while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/7720

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[       OK ] AddressSanitizer.AtoiAndFriendsOOBTest (2151 ms)
[ RUN      ] AddressSanitizer.HasFeatureAddressSanitizerTest
[       OK ] AddressSanitizer.HasFeatureAddressSanitizerTest (0 ms)
[ RUN      ] AddressSanitizer.CallocReturnsZeroMem
[       OK ] AddressSanitizer.CallocReturnsZeroMem (11 ms)
[ DISABLED ] AddressSanitizer.DISABLED_TSDTest
[ RUN      ] AddressSanitizer.IgnoreTest
[       OK ] AddressSanitizer.IgnoreTest (0 ms)
[ RUN      ] AddressSanitizer.SignalTest
[       OK ] AddressSanitizer.SignalTest (186 ms)
[ RUN      ] AddressSanitizer.ReallocTest
[       OK ] AddressSanitizer.ReallocTest (32 ms)
[ RUN      ] AddressSanitizer.WrongFreeTest
[       OK ] AddressSanitizer.WrongFreeTest (121 ms)
[ RUN      ] AddressSanitizer.LongJmpTest
[       OK ] AddressSanitizer.LongJmpTest (0 ms)
[ RUN      ] AddressSanitizer.ThreadStackReuseTest
[       OK ] AddressSanitizer.ThreadStackReuseTest (1 ms)
[ DISABLED ] AddressSanitizer.DISABLED_MemIntrinsicUnalignedAccessTest
[ DISABLED ] AddressSanitizer.DISABLED_LargeFunctionSymbolizeTest
[ DISABLED ] AddressSanitizer.DISABLED_MallocFreeUnwindAndSymbolizeTest
[ RUN      ] AddressSanitizer.UseThenFreeThenUseTest
[       OK ] AddressSanitizer.UseThenFreeThenUseTest (120 ms)
[ RUN      ] AddressSanitizer.FileNameInGlobalReportTest
[       OK ] AddressSanitizer.FileNameInGlobalReportTest (145 ms)
[ DISABLED ] AddressSanitizer.DISABLED_StressStackReuseAndExceptionsTest
[ RUN      ] AddressSanitizer.MlockTest
[       OK ] AddressSanitizer.MlockTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadedTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowIn
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowLeft
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowRight
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFHigh
[ DISABLED ] AddressSanitizer.DISABLED_DemoOOM
[ DISABLED ] AddressSanitizer.DISABLED_DemoDoubleFreeTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoNullDerefTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoFunctionStaticTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoTooMuchMemoryTest
[ RUN      ] AddressSanitizer.LongDoubleNegativeTest
[       OK ] AddressSanitizer.LongDoubleNegativeTest (0 ms)
[----------] 19 tests from AddressSanitizer (27795 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 2 test suites ran. (27798 ms total)
[  PASSED  ] 22 tests.

  YOU HAVE 1 DISABLED TEST

Step 34 (run instrumented asan tests [aarch64/bluejay-userdebug/TQ3A.230805.001]) failure: run instrumented asan tests [aarch64/bluejay-userdebug/TQ3A.230805.001] (failure)
...
[ RUN      ] AddressSanitizer.HasFeatureAddressSanitizerTest
[       OK ] AddressSanitizer.HasFeatureAddressSanitizerTest (0 ms)
[ RUN      ] AddressSanitizer.CallocReturnsZeroMem
[       OK ] AddressSanitizer.CallocReturnsZeroMem (11 ms)
[ DISABLED ] AddressSanitizer.DISABLED_TSDTest
[ RUN      ] AddressSanitizer.IgnoreTest
[       OK ] AddressSanitizer.IgnoreTest (0 ms)
[ RUN      ] AddressSanitizer.SignalTest
[       OK ] AddressSanitizer.SignalTest (186 ms)
[ RUN      ] AddressSanitizer.ReallocTest
[       OK ] AddressSanitizer.ReallocTest (32 ms)
[ RUN      ] AddressSanitizer.WrongFreeTest
[       OK ] AddressSanitizer.WrongFreeTest (121 ms)
[ RUN      ] AddressSanitizer.LongJmpTest
[       OK ] AddressSanitizer.LongJmpTest (0 ms)
[ RUN      ] AddressSanitizer.ThreadStackReuseTest
[       OK ] AddressSanitizer.ThreadStackReuseTest (1 ms)
[ DISABLED ] AddressSanitizer.DISABLED_MemIntrinsicUnalignedAccessTest
[ DISABLED ] AddressSanitizer.DISABLED_LargeFunctionSymbolizeTest
[ DISABLED ] AddressSanitizer.DISABLED_MallocFreeUnwindAndSymbolizeTest
[ RUN      ] AddressSanitizer.UseThenFreeThenUseTest
[       OK ] AddressSanitizer.UseThenFreeThenUseTest (120 ms)
[ RUN      ] AddressSanitizer.FileNameInGlobalReportTest
[       OK ] AddressSanitizer.FileNameInGlobalReportTest (145 ms)
[ DISABLED ] AddressSanitizer.DISABLED_StressStackReuseAndExceptionsTest
[ RUN      ] AddressSanitizer.MlockTest
[       OK ] AddressSanitizer.MlockTest (0 ms)
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadedTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoThreadStackTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowIn
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowLeft
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFLowRight
[ DISABLED ] AddressSanitizer.DISABLED_DemoUAFHigh
[ DISABLED ] AddressSanitizer.DISABLED_DemoOOM
[ DISABLED ] AddressSanitizer.DISABLED_DemoDoubleFreeTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoNullDerefTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoFunctionStaticTest
[ DISABLED ] AddressSanitizer.DISABLED_DemoTooMuchMemoryTest
[ RUN      ] AddressSanitizer.LongDoubleNegativeTest
[       OK ] AddressSanitizer.LongDoubleNegativeTest (0 ms)
[----------] 19 tests from AddressSanitizer (27795 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 2 test suites ran. (27798 ms total)
[  PASSED  ] 22 tests.

  YOU HAVE 1 DISABLED TEST
program finished with exit code 0
elapsedTime=2382.610335

@llvm-ci
Collaborator

llvm-ci commented Mar 28, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve-vla running on linaro-g3-04 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/17/builds/6824

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
...
PASS: Flang :: Driver/include-header.f90 (25155 of 97423)
PASS: Flang :: Driver/print-effective-triple.f90 (25156 of 97423)
PASS: DataFlowSanitizer-aarch64 :: origin_ldst.c (25157 of 97423)
PASS: Flang :: Driver/predefined-macros-compiler-version.F90 (25158 of 97423)
PASS: Flang :: Driver/print-resource-dir.F90 (25159 of 97423)
PASS: Flang :: Driver/mlink-builtin-bc.f90 (25160 of 97423)
PASS: Flang :: Driver/fd-lines-as.f90 (25161 of 97423)
PASS: Flang :: Driver/phases.f90 (25162 of 97423)
PASS: Flang :: Driver/parse-error.ll (25163 of 97423)
UNRESOLVED: Flang :: Driver/slp-vectorize.ll (25164 of 97423)
******************** TEST 'Flang :: Driver/slp-vectorize.ll' FAILED ********************
Test has no 'RUN:' line
********************
PASS: Flang :: Driver/macro-def-undef.F90 (25165 of 97423)
PASS: Flang :: Driver/linker-flags.f90 (25166 of 97423)
PASS: Flang :: Driver/mlir-pass-pipeline.f90 (25167 of 97423)
PASS: Flang :: Driver/parse-fir-error.ll (25168 of 97423)
PASS: Flang :: Driver/print-pipeline-passes.f90 (25169 of 97423)
PASS: Flang :: Driver/print-target-triple.f90 (25170 of 97423)
PASS: Clangd Unit Tests :: ./ClangdTests/80/81 (25171 of 97423)
PASS: Flang :: Driver/pthread.f90 (25172 of 97423)
PASS: Flang :: Driver/parse-ir-error.f95 (25173 of 97423)
PASS: Flang :: Driver/scanning-error.f95 (25174 of 97423)
PASS: Flang :: Driver/bbc-openmp-version-macro.f90 (25175 of 97423)
PASS: Flang :: Driver/pass-plugin-not-found.f90 (25176 of 97423)
PASS: Flang :: Driver/std2018-wrong.f90 (25177 of 97423)
PASS: Flang :: Driver/config-file.f90 (25178 of 97423)
PASS: Flang :: Driver/supported-suffices/f08-suffix.f08 (25179 of 97423)
PASS: Flang :: Driver/tco-code-gen-llvm.fir (25180 of 97423)
PASS: Flang :: Driver/target-gpu-features.f90 (25181 of 97423)
PASS: Flang :: Driver/missing-arg.f90 (25182 of 97423)
PASS: Flang :: Driver/supported-suffices/f03-suffix.f03 (25183 of 97423)
PASS: Flang :: Driver/target.f90 (25184 of 97423)
PASS: Flang :: Driver/pp-fixed-form.f90 (25185 of 97423)
PASS: Clangd Unit Tests :: ./ClangdTests/71/81 (25186 of 97423)
PASS: Flang :: Driver/lto-bc.f90 (25187 of 97423)
PASS: Flang :: Driver/unsupported-vscale-max-min.f90 (25188 of 97423)
PASS: Flang :: Driver/q-unused-arguments.f90 (25189 of 97423)
PASS: Flang :: Driver/mllvm.f90 (25190 of 97423)
PASS: Flang :: Driver/multiple-input-files.f90 (25191 of 97423)
PASS: Flang :: Driver/unparse-with-modules.f90 (25192 of 97423)
PASS: Flang :: Driver/input-from-stdin/input-from-stdin.f90 (25193 of 97423)
PASS: Flang :: Driver/target-machine-error.f90 (25194 of 97423)
PASS: Flang :: Driver/fixed-line-length.f90 (25195 of 97423)
PASS: Flang :: Driver/no-duplicate-main.f90 (25196 of 97423)
PASS: Flang :: Driver/unparse-use-analyzed.f95 (25197 of 97423)
PASS: Flang :: Driver/lto-flags.f90 (25198 of 97423)
PASS: DataFlowSanitizer-aarch64 :: pair.cpp (25199 of 97423)
PASS: Flang :: Driver/std2018.f90 (25200 of 97423)

@llvm-ci
Collaborator

llvm-ci commented Mar 28, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve-vls running on linaro-g3-03 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/143/builds/6502

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
...
PASS: Flang :: Driver/print-effective-triple.f90 (25153 of 97423)
PASS: DataFlowSanitizer-aarch64 :: origin_ldst.c (25154 of 97423)
PASS: Flang :: Driver/fdefault.f90 (25155 of 97423)
PASS: Flang :: Driver/bbc-openmp-version-macro.f90 (25156 of 97423)
PASS: Flang :: Driver/print-resource-dir.F90 (25157 of 97423)
PASS: Clangd Unit Tests :: ./ClangdTests/80/81 (25158 of 97423)
PASS: Flang :: Driver/predefined-macros-compiler-version.F90 (25159 of 97423)
PASS: Flang :: Driver/override-triple.ll (25160 of 97423)
PASS: Flang :: Driver/phases.f90 (25161 of 97423)
UNRESOLVED: Flang :: Driver/slp-vectorize.ll (25162 of 97423)
******************** TEST 'Flang :: Driver/slp-vectorize.ll' FAILED ********************
Test has no 'RUN:' line
********************
PASS: Flang :: Driver/parse-error.ll (25163 of 97423)
PASS: Flang :: Driver/macro-def-undef.F90 (25164 of 97423)
PASS: Flang :: Driver/print-pipeline-passes.f90 (25165 of 97423)
PASS: Flang :: Driver/parse-fir-error.ll (25166 of 97423)
PASS: Flang :: Driver/include-header.f90 (25167 of 97423)
PASS: Flang :: Driver/print-target-triple.f90 (25168 of 97423)
PASS: Flang :: Driver/missing-arg.f90 (25169 of 97423)
PASS: Flang :: Driver/fd-lines-as.f90 (25170 of 97423)
PASS: Flang :: Driver/scanning-error.f95 (25171 of 97423)
PASS: Flang :: Driver/mlir-pass-pipeline.f90 (25172 of 97423)
PASS: Flang :: Driver/parse-ir-error.f95 (25173 of 97423)
PASS: Flang :: Driver/mlink-builtin-bc.f90 (25174 of 97423)
PASS: Flang :: Driver/linker-flags.f90 (25175 of 97423)
PASS: Flang :: Driver/std2018-wrong.f90 (25176 of 97423)
PASS: Flang :: Driver/pthread.f90 (25177 of 97423)
PASS: Flang :: Driver/supported-suffices/f08-suffix.f08 (25178 of 97423)
PASS: Flang :: Driver/tco-code-gen-llvm.fir (25179 of 97423)
PASS: Flang :: Driver/pass-plugin-not-found.f90 (25180 of 97423)
PASS: Flang :: Driver/target-gpu-features.f90 (25181 of 97423)
PASS: Flang :: Driver/target.f90 (25182 of 97423)
PASS: Flang :: Driver/supported-suffices/f03-suffix.f03 (25183 of 97423)
PASS: Flang :: Driver/config-file.f90 (25184 of 97423)
PASS: Flang :: Driver/lto-bc.f90 (25185 of 97423)
PASS: Flang :: Driver/pp-fixed-form.f90 (25186 of 97423)
PASS: Flang :: Driver/unsupported-vscale-max-min.f90 (25187 of 97423)
PASS: Flang :: Driver/q-unused-arguments.f90 (25188 of 97423)
PASS: Flang :: Driver/multiple-input-files.f90 (25189 of 97423)
PASS: Flang :: Driver/unparse-with-modules.f90 (25190 of 97423)
PASS: Flang :: Driver/mllvm.f90 (25191 of 97423)
PASS: Flang :: Driver/target-machine-error.f90 (25192 of 97423)
PASS: Clangd Unit Tests :: ./ClangdTests/71/81 (25193 of 97423)
PASS: Flang :: Driver/no-duplicate-main.f90 (25194 of 97423)
PASS: Flang :: Driver/prescanner-diag.f90 (25195 of 97423)
PASS: Flang :: Driver/fixed-line-length.f90 (25196 of 97423)
PASS: Flang :: Driver/input-from-stdin/input-from-stdin.f90 (25197 of 97423)
PASS: Flang :: Driver/std2018.f90 (25198 of 97423)

@llvm-ci
Collaborator

llvm-ci commented Mar 28, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2595

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/82/95' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-14508-82-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=82 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


mikhailramalho added a commit to mikhailramalho/llvm-project that referenced this pull request Apr 7, 2025
This is a follow-up patch to PR llvm#133256.

This patch adds the branch folding pass after the newly added late
optimization pass for RISC-V, which reduces code size in all SPEC
benchmarks (except libm).

The improvements are: 500.perlbench_r (-3.37%), 544.nab_r (-3.06%),
557.xz_r (-2.82%), 523.xalancbmk_r (-2.64%), 520.omnetpp_r (-2.34%),
531.deepsjeng_r (-2.27%), 502.gcc_r (-2.19%), 526.blender_r (-2.11%),
538.imagick_r (-2.03%), 505.mcf_r (-1.82%), 541.leela_r (-1.74%),
511.povray_r (-1.62%), 510.parest_r (-1.62%), 508.namd_r (-1.57%),
525.x264_r (-1.47%).

Geo mean is -2.07%.

Some caveats:
* In llvm#131728 I mentioned a 7% improvement in the execution time of xz, but
  that's no longer the case. I went back and tried to reproduce the
  result with the code from llvm#131728 and couldn't. The results from
  that PR and this one are now the same: an overall code size reduction but
  no execution time improvement.
* The root cause of the large number is not yet clear to me. I'm still
  investigating it.
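
For readers who want to sanity-check the aggregate, here is a minimal sketch of one common way to compute a geometric-mean size change: take the geometric mean of the per-benchmark size ratios and convert it back to a percentage. The commit message does not say how its figure was computed, so this method is an assumption. The sketch also covers only the 15 benchmarks listed in the message, so it need not reproduce the quoted -2.07%, which may include benchmarks that are not listed. The names `size_change_pct` and `geomean_change_pct` are illustrative only.

```python
import math

# Code-size changes in percent, copied from the commit message above.
size_change_pct = {
    "500.perlbench_r": -3.37,
    "544.nab_r": -3.06,
    "557.xz_r": -2.82,
    "523.xalancbmk_r": -2.64,
    "520.omnetpp_r": -2.34,
    "531.deepsjeng_r": -2.27,
    "502.gcc_r": -2.19,
    "526.blender_r": -2.11,
    "538.imagick_r": -2.03,
    "505.mcf_r": -1.82,
    "541.leela_r": -1.74,
    "511.povray_r": -1.62,
    "510.parest_r": -1.62,
    "508.namd_r": -1.57,
    "525.x264_r": -1.47,
}


def geomean_change_pct(changes_pct):
    """Geometric mean of the size ratios, expressed as a percent change."""
    # Turn each percent change into a ratio (new size / old size).
    ratios = [1.0 + pct / 100.0 for pct in changes_pct]
    # Average in log space, which is the geometric mean of the ratios.
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0


listed = geomean_change_pct(size_change_pct.values())
print(f"geomean over {len(size_change_pct)} listed benchmarks: {listed:.2f}%")
```

Averaging ratios in log space keeps the aggregate independent of which benchmark has the largest binary, which is why it is often preferred over an arithmetic mean of percentages for this kind of summary.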