Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm][ConstraintElimination]Insert ConditionFact into loop header in case of monotonic induction variables #112080

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

grigorypas
Copy link

@grigorypas grigorypas commented Oct 12, 2024

The goal of this change is to be able to remove dead loops through range analysis in the code like below:

long foo() {
    long x = 0;
    while (x < 10) {
        // Dead while loop
        while ((x > 20) && (x % 5 == 0)) {
            x -= 5;
        }
        if (x % 2 == 0) {
            x += 2;
        } else {
            x += 1;
        }
    }
    return x;
}

See related discussion in #111716

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Oct 12, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Grigory Pastukhov (grigorypas)

Changes

The goal of this change is to be able to remove dead loops through range analysis in the code like below:

long foo() {
    long x = 0;
    while (x &lt; 10) {
        // Dead while loop
        while ((x &gt; 20) &amp;&amp; (x % 5 == 0)) {
            x -= 5;
        }
        if (x % 2 == 0) {
            x += 2;
        } else {
            x += 1;
        }
    }
    return x;
}

See related discussion in #111716

cc: @dcci, @dtcxzyw, @nikic, @fhahn


Patch is 27.56 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112080.diff

13 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/ConstraintElimination.cpp (+55)
  • (modified) llvm/test/Transforms/ConstraintElimination/and-implied-by-operands.ll (+1-2)
  • (added) llvm/test/Transforms/ConstraintElimination/loop-removal.ll (+47)
  • (modified) llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-base.ll (+4-8)
  • (modified) llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-pointer-cmps.ll (-3)
  • (modified) llvm/test/Transforms/ConstraintElimination/loops-header-tested-base.ll (+9-18)
  • (modified) llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-cmps.ll (-2)
  • (modified) llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-iv.ll (+1-2)
  • (modified) llvm/test/Transforms/ConstraintElimination/monotonic-int-phis-wrapping.ll (+3-6)
  • (modified) llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-chain-of-exits.ll (+4-8)
  • (modified) llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-early-exits.ll (+7-14)
  • (modified) llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis.ll (+1-3)
  • (modified) llvm/test/Transforms/ConstraintElimination/transfer-signed-facts-to-unsigned-is-known-non-negative.ll (+1-2)
diff --git a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
index 7c06e0c757e1cc..56cc26093cee87 100644
--- a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
@@ -197,6 +197,8 @@ struct State {
   /// controlling the loop header.
   void addInfoForInductions(BasicBlock &BB);
 
+  void addConditionFactsIntoLoopHeader(BasicBlock &BB);
+  
   /// Returns true if we can add a known condition from BB to its successor
   /// block Succ.
   bool canAddSuccessor(BasicBlock &BB, BasicBlock *Succ) const {
@@ -900,7 +902,60 @@ static void dumpConstraint(ArrayRef<int64_t> C,
 }
 #endif
 
+// For monotonically decreasing/increasing variables in the loop,
+// this will insert ConditionFact PN >= StartingValue or PN <= StartingValue
+// associated with the loop header, where PN is the corresponding PHi node.
+void State::addConditionFactsIntoLoopHeader(BasicBlock &BB) {
+  auto *L = LI.getLoopFor(&BB);
+  if (!L || L->getHeader() != &BB)
+    return;
+  DomTreeNode *DTN = DT.getNode(&BB);
+  for(PHINode &PN :L->getHeader()->phis()){
+    if(PN.getNumIncomingValues() != 2 || PN.getParent() != &BB || !SE.isSCEVable(PN.getType()))
+      continue;
+    auto *AR = dyn_cast_or_null<SCEVAddRecExpr>(SE.getSCEV(&PN));
+    BasicBlock *LoopPred = L->getLoopPredecessor();
+    if (!AR || AR->getLoop() != L || !LoopPred)
+      return;
+    const SCEV *StartSCEV = AR->getStart();
+    Value *StartValue = nullptr;
+    if (auto *C = dyn_cast<SCEVConstant>(StartSCEV)) {
+      StartValue = C->getValue();
+    } else {
+      StartValue = PN.getIncomingValueForBlock(LoopPred);
+      assert(SE.getSCEV(StartValue) == StartSCEV && "inconsistent start value");
+    }
+    auto IncUnsigned = SE.getMonotonicPredicateType(AR, CmpInst::ICMP_UGT);
+    auto IncSigned = SE.getMonotonicPredicateType(AR, CmpInst::ICMP_SGT);
+    
+    // Monotonically Increasing 
+    bool MonotonicallyIncreasingUnsigned =
+      IncUnsigned && *IncUnsigned == ScalarEvolution::MonotonicallyIncreasing;
+    bool MonotonicallyIncreasingSigned =
+        IncSigned && *IncSigned == ScalarEvolution::MonotonicallyIncreasing;
+    if (MonotonicallyIncreasingUnsigned)
+      WorkList.push_back(
+          FactOrCheck::getConditionFact(DTN, CmpInst::ICMP_UGE, &PN, StartValue));
+    if (MonotonicallyIncreasingSigned)
+      WorkList.push_back(
+          FactOrCheck::getConditionFact(DTN, CmpInst::ICMP_SGE, &PN, StartValue));
+
+    // Monotonically Decreasing
+    bool MonotonicallyDecreasingUnsigned =
+      IncUnsigned && *IncUnsigned == ScalarEvolution::MonotonicallyDecreasing;
+    bool MonotonicallyDecreasingSigned =
+        IncSigned && *IncSigned == ScalarEvolution::MonotonicallyDecreasing;
+    if(MonotonicallyDecreasingUnsigned)
+      WorkList.push_back(
+          FactOrCheck::getConditionFact(DTN, CmpInst::ICMP_ULE, &PN, StartValue));
+    if(MonotonicallyDecreasingSigned)
+      WorkList.push_back(
+          FactOrCheck::getConditionFact(DTN, CmpInst::ICMP_SLE, &PN, StartValue));
+  }        
+}
+
 void State::addInfoForInductions(BasicBlock &BB) {
+  addConditionFactsIntoLoopHeader(BB);
   auto *L = LI.getLoopFor(&BB);
   if (!L || L->getHeader() != &BB)
     return;
diff --git a/llvm/test/Transforms/ConstraintElimination/and-implied-by-operands.ll b/llvm/test/Transforms/ConstraintElimination/and-implied-by-operands.ll
index 6bbc73c9c996c8..ab1733caf10d14 100644
--- a/llvm/test/Transforms/ConstraintElimination/and-implied-by-operands.ll
+++ b/llvm/test/Transforms/ConstraintElimination/and-implied-by-operands.ll
@@ -326,9 +326,8 @@ define void @test_monotonic_ptr_iv_inc_1_eq_to_uge(ptr %start, i16 %len) {
 ; CHECK-NEXT:    [[AND_0:%.*]] = and i1 [[LEN_NEG]], [[C]]
 ; CHECK-NEXT:    br i1 [[AND_0]], label [[FOR_BODY:%.*]], label [[EXIT:%.*]]
 ; CHECK:       for.body:
-; CHECK-NEXT:    [[T_1:%.*]] = icmp uge ptr [[PTR_IV]], [[START]]
 ; CHECK-NEXT:    [[T_2:%.*]] = icmp ult ptr [[PTR_IV]], [[UPPER]]
-; CHECK-NEXT:    [[AND:%.*]] = and i1 [[T_1]], [[T_2]]
+; CHECK-NEXT:    [[AND:%.*]] = and i1 true, [[T_2]]
 ; CHECK-NEXT:    br i1 [[AND]], label [[LOOP_LATCH]], label [[EXIT]]
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    call void @use(ptr [[PTR_IV]])
diff --git a/llvm/test/Transforms/ConstraintElimination/loop-removal.ll b/llvm/test/Transforms/ConstraintElimination/loop-removal.ll
new file mode 100644
index 00000000000000..9f963b1ff7aaf0
--- /dev/null
+++ b/llvm/test/Transforms/ConstraintElimination/loop-removal.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=constraint-elimination -S | FileCheck %s
+
+define i32 @foo() {
+; CHECK-LABEL: define i32 @foo() {
+; CHECK-NEXT:  init:
+; CHECK-NEXT:    br label %[[OUTER_LOOP_CONTROL:.*]]
+; CHECK:       outer.loop.control:
+; CHECK-NEXT:    [[X_0:%.*]] = phi i32 [ 0, [[INIT:%.*]] ], [ [[X_OUTER:%.*]], %[[OUTER_LOOP_INC:.*]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = icmp slt i32 [[X_0]], 10
+; CHECK-NEXT:    br i1 [[TMP0]], label %[[INNER_LOOP_CONTROL:.*]], label %[[EXIT:.*]]
+; CHECK:       inner.loop.control:
+; CHECK-NEXT:    [[X_1:%.*]] = phi i32 [ [[X_0]], %[[OUTER_LOOP_CONTROL]] ], [ [[X_INNER:%.*]], %[[INNER_LOOP_BODY:.*]] ]
+; CHECK-NEXT:    br i1 false, label %[[INNER_LOOP_BODY]], label %[[OUTER_LOOP_INC]]
+; CHECK:       inner.loop.body:
+; CHECK-NEXT:    [[X_INNER]] = add nsw i32 [[X_1]], -1
+; CHECK-NEXT:    br label %[[INNER_LOOP_CONTROL]]
+; CHECK:       outer.loop.inc:
+; CHECK-NEXT:    [[X_OUTER]] = add nsw i32 [[X_1]], 2
+; CHECK-NEXT:    br label %[[OUTER_LOOP_CONTROL]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i32 [[X_0]]
+;
+init:
+  br label %outer.loop.control
+
+outer.loop.control:                                                ; preds = %init, %outer.loop.inc
+  %x.0 = phi i32 [ 0, %init ], [ %x.outer, %outer.loop.inc ]
+  %0 = icmp slt i32 %x.0, 10
+  br i1 %0, label %inner.loop.control, label %exit
+
+inner.loop.control:                                                ; preds = %outer.loop.control, %inner.loop.body
+  %x.1 = phi i32 [ %x.0, %outer.loop.control ], [ %x.inner, %inner.loop.body ]
+  %1 = icmp sgt i32 %x.1, 20
+  br i1 %1, label %inner.loop.body, label %outer.loop.inc
+
+inner.loop.body:                                                ; preds = %inner.loop.control
+  %x.inner = add nsw i32 %x.1, -1
+  br label %inner.loop.control
+
+outer.loop.inc:                                                ; preds = %inner.loop.control
+  %x.outer = add nsw i32 %x.1, 2
+  br label %outer.loop.control
+
+exit:                                               ; preds = %1
+  ret i32 %x.0
+}
diff --git a/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-base.ll b/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-base.ll
index 3dbea9496da8d7..6510fbb9d260f5 100644
--- a/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-base.ll
+++ b/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-base.ll
@@ -11,10 +11,8 @@ define void @loop_iv_cond_variable_bound(i32 %n) {
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[T_1:%.*]] = icmp ule i32 [[IV]], [[N:%.*]]
 ; CHECK-NEXT:    call void @use(i1 [[T_1]])
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[IV]], 0
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
-; CHECK-NEXT:    [[T_3:%.*]] = icmp sge i32 [[IV]], -1
-; CHECK-NEXT:    call void @use(i1 [[T_3]])
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ult i32 [[IV]], [[N]]
 ; CHECK-NEXT:    call void @use(i1 [[C_1]])
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp ugt i32 [[IV]], 1
@@ -58,10 +56,8 @@ define void @loop_iv_cond_constant_bound() {
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[T_1:%.*]] = icmp ule i32 [[IV]], 2
 ; CHECK-NEXT:    call void @use(i1 [[T_1]])
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[IV]], 0
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
-; CHECK-NEXT:    [[T_3:%.*]] = icmp sge i32 [[IV]], -1
-; CHECK-NEXT:    call void @use(i1 [[T_3]])
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ult i32 [[IV]], 2
 ; CHECK-NEXT:    call void @use(i1 [[C_1]])
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp ugt i32 [[IV]], 1
diff --git a/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-pointer-cmps.ll b/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-pointer-cmps.ll
index 279238bea1842e..b003553ff51b23 100644
--- a/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-pointer-cmps.ll
+++ b/llvm/test/Transforms/ConstraintElimination/loops-bottom-tested-pointer-cmps.ll
@@ -21,7 +21,6 @@ define void @checks_in_loops_removable(ptr %ptr, ptr %lower, ptr %upper, i8 %n)
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i16 [ 0, [[PRE_2]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[PTR_IV:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i16 [[IV]]
-; CHECK-NEXT:    [[CMP_PTR_IV_LOWER:%.*]] = icmp ugt ptr [[LOWER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[CMP_PTR_IV_UPPER:%.*]] = icmp ule ptr [[UPPER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[OR:%.*]] = or i1 false, [[CMP_PTR_IV_UPPER]]
 ; CHECK-NEXT:    br i1 [[OR]], label [[TRAP]], label [[LOOP_LATCH]]
@@ -86,7 +85,6 @@ define void @some_checks_in_loops_removable(ptr %ptr, ptr %lower, ptr %upper, i8
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i16 [ 0, [[PRE_2]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[PTR_IV:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i16 [[IV]]
-; CHECK-NEXT:    [[CMP_PTR_IV_LOWER:%.*]] = icmp ugt ptr [[LOWER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[CMP_PTR_IV_UPPER:%.*]] = icmp ule ptr [[UPPER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[OR:%.*]] = or i1 false, [[CMP_PTR_IV_UPPER]]
 ; CHECK-NEXT:    br i1 [[OR]], label [[TRAP]], label [[LOOP_BODY:%.*]]
@@ -163,7 +161,6 @@ define void @no_checks_in_loops_removable(ptr %ptr, ptr %lower, ptr %upper, i8 %
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i16 [ 0, [[PRE_1]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[PTR_IV:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i16 [[IV]]
-; CHECK-NEXT:    [[CMP_PTR_IV_LOWER:%.*]] = icmp ugt ptr [[LOWER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[CMP_PTR_IV_UPPER:%.*]] = icmp ule ptr [[UPPER]], [[PTR_IV]]
 ; CHECK-NEXT:    [[OR:%.*]] = or i1 false, [[CMP_PTR_IV_UPPER]]
 ; CHECK-NEXT:    br i1 [[OR]], label [[TRAP]], label [[LOOP_BODY:%.*]]
diff --git a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-base.ll b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-base.ll
index 7b928a030614b5..5ff87987300773 100644
--- a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-base.ll
+++ b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-base.ll
@@ -14,15 +14,11 @@ define void @loop_phi_pos_start_value(i32 %y, i1 %c, i32 %n) {
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    call void @use(i1 false)
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[X]], 10
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
-; CHECK-NEXT:    [[C_2:%.*]] = icmp sle i32 [[X]], 9
-; CHECK-NEXT:    call void @use(i1 [[C_2]])
-; CHECK-NEXT:    [[C_3:%.*]] = icmp sgt i32 [[X]], 9
-; CHECK-NEXT:    call void @use(i1 [[C_3]])
 ; CHECK-NEXT:    call void @use(i1 true)
-; CHECK-NEXT:    [[C_5:%.*]] = icmp sge i32 [[X]], 9
-; CHECK-NEXT:    call void @use(i1 [[C_5]])
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[X_NEXT]] = add nsw i32 [[X]], 1
 ; CHECK-NEXT:    br label [[LOOP_HEADER]]
 ; CHECK:       exit:
@@ -75,8 +71,7 @@ define void @loop_phi_neg_start_value(i32 %y, i1 %c, i32 %n) {
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    call void @use(i1 false)
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[X]], -10
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp sle i32 [[X]], 9
 ; CHECK-NEXT:    call void @use(i1 [[C_2]])
 ; CHECK-NEXT:    [[C_3:%.*]] = icmp sgt i32 [[X]], 9
@@ -299,10 +294,8 @@ define void @loop_iv_cond_variable_bound(i32 %n) {
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[T_1:%.*]] = icmp ule i32 [[IV]], [[N:%.*]]
 ; CHECK-NEXT:    call void @use(i1 [[T_1]])
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[IV]], 0
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
-; CHECK-NEXT:    [[T_3:%.*]] = icmp sge i32 [[IV]], -1
-; CHECK-NEXT:    call void @use(i1 [[T_3]])
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ult i32 [[IV]], [[N]]
 ; CHECK-NEXT:    call void @use(i1 [[C_1]])
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp ugt i32 [[IV]], 1
@@ -364,10 +357,8 @@ define void @loop_iv_cond_constant_bound() {
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[T_1:%.*]] = icmp ule i32 [[IV]], 2
 ; CHECK-NEXT:    call void @use(i1 [[T_1]])
-; CHECK-NEXT:    [[T_2:%.*]] = icmp sge i32 [[IV]], 0
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
-; CHECK-NEXT:    [[T_3:%.*]] = icmp sge i32 [[IV]], -1
-; CHECK-NEXT:    call void @use(i1 [[T_3]])
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ult i32 [[IV]], 2
 ; CHECK-NEXT:    call void @use(i1 [[C_1]])
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp ugt i32 [[IV]], 1
diff --git a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-cmps.ll b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-cmps.ll
index 1842ca2d82545c..57191a8ebc0870 100644
--- a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-cmps.ll
+++ b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-cmps.ll
@@ -210,7 +210,6 @@ define void @test2_with_ne(ptr %src, ptr %lower, ptr %upper, i8 %N) {
 ; CHECK-NEXT:    br i1 [[EC]], label [[EXIT:%.*]], label [[LOOP_BODY:%.*]]
 ; CHECK:       loop.body:
 ; CHECK-NEXT:    [[SRC_IV:%.*]] = getelementptr inbounds i8, ptr [[SRC]], i8 [[IV]]
-; CHECK-NEXT:    [[CMP_IV_START:%.*]] = icmp ult ptr [[SRC_IV]], [[LOWER]]
 ; CHECK-NEXT:    [[CMP_IV_END:%.*]] = icmp uge ptr [[SRC_IV]], [[UPPER]]
 ; CHECK-NEXT:    [[OR_1:%.*]] = or i1 false, [[CMP_IV_END]]
 ; CHECK-NEXT:    br i1 [[OR_1]], label [[TRAP_BB]], label [[LOOP_BODY_1:%.*]]
@@ -304,7 +303,6 @@ define void @test3(ptr %src, ptr %lower, ptr %upper, i8 %N) {
 ; CHECK-NEXT:    br i1 [[EC]], label [[LOOP_BODY:%.*]], label [[EXIT:%.*]]
 ; CHECK:       loop.body:
 ; CHECK-NEXT:    [[SRC_IV:%.*]] = getelementptr inbounds i8, ptr [[SRC]], i8 [[IV]]
-; CHECK-NEXT:    [[CMP_IV_START:%.*]] = icmp ult ptr [[SRC_IV]], [[LOWER]]
 ; CHECK-NEXT:    [[CMP_IV_END:%.*]] = icmp uge ptr [[SRC_IV]], [[UPPER]]
 ; CHECK-NEXT:    [[OR_1:%.*]] = or i1 false, [[CMP_IV_END]]
 ; CHECK-NEXT:    br i1 [[OR_1]], label [[TRAP_BB]], label [[LOOP_BODY_1:%.*]]
diff --git a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-iv.ll b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-iv.ll
index eb252d1a5444ea..8863907ecfd50a 100644
--- a/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-iv.ll
+++ b/llvm/test/Transforms/ConstraintElimination/loops-header-tested-pointer-iv.ll
@@ -12,8 +12,7 @@ define void @loop_pointer_iv(ptr %start, ptr %end, ptr %upper) {
 ; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[START:%.*]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
 ; CHECK-NEXT:    [[C_2:%.*]] = icmp ule ptr [[IV]], [[END]]
 ; CHECK-NEXT:    call void @use(i1 [[C_2]])
-; CHECK-NEXT:    [[T_2:%.*]] = icmp uge ptr [[IV]], [[START]]
-; CHECK-NEXT:    call void @use(i1 [[T_2]])
+; CHECK-NEXT:    call void @use(i1 true)
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ule ptr [[IV]], [[END]]
 ; CHECK-NEXT:    br i1 [[C_1]], label [[LOOP_LATCH]], label [[EXIT]]
 ; CHECK:       loop.latch:
diff --git a/llvm/test/Transforms/ConstraintElimination/monotonic-int-phis-wrapping.ll b/llvm/test/Transforms/ConstraintElimination/monotonic-int-phis-wrapping.ll
index 17aa4d54abfbab..2a58220b78f000 100644
--- a/llvm/test/Transforms/ConstraintElimination/monotonic-int-phis-wrapping.ll
+++ b/llvm/test/Transforms/ConstraintElimination/monotonic-int-phis-wrapping.ll
@@ -49,8 +49,7 @@ define void @test_iv_nuw_nsw_1_uge_start(i8 %len.n, i8 %a) {
 ; CHECK-NEXT:    br label [[LOOP_HEADER:%.*]]
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i8 [ -1, [[LOOP_PH:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
-; CHECK-NEXT:    [[C:%.*]] = icmp eq i8 [[IV]], 1
-; CHECK-NEXT:    br i1 [[C]], label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
+; CHECK-NEXT:    br i1 false, label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_LATCH]], label [[EXIT]]
@@ -90,8 +89,7 @@ define void @test_iv_nuw_nsw_2_uge_start(i8 %len.n, i8 %a) {
 ; CHECK-NEXT:    br label [[LOOP_HEADER:%.*]]
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i8 [ [[START]], [[LOOP_PH:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
-; CHECK-NEXT:    [[C:%.*]] = icmp eq i8 [[IV]], 1
-; CHECK-NEXT:    br i1 [[C]], label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
+; CHECK-NEXT:    br i1 false, label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_LATCH]], label [[EXIT]]
@@ -131,8 +129,7 @@ define void @test_iv_nsw_nuw_1_ult_end(i8 %len.n, i8 %a) {
 ; CHECK-NEXT:    br label [[LOOP_HEADER:%.*]]
 ; CHECK:       loop.header:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i8 [ -2, [[LOOP_PH:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
-; CHECK-NEXT:    [[C:%.*]] = icmp eq i8 [[IV]], 1
-; CHECK-NEXT:    br i1 [[C]], label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
+; CHECK-NEXT:    br i1 false, label [[EXIT:%.*]], label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_LATCH]], label [[EXIT]]
diff --git a/llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-chain-of-exits.ll b/llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-chain-of-exits.ll
index 6b128d9e525ca3..16c8beff2780c3 100644
--- a/llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-chain-of-exits.ll
+++ b/llvm/test/Transforms/ConstraintElimination/monotonic-pointer-phis-chain-of-exits.ll
@@ -25,9 +25,8 @@ define void @test_monotonic_ptr_iv_inc_1_different_element_types_1_chain_of_2_ex
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_NEXT:%.*]], label [[EXIT]]
 ; CHECK:       loop.next:
-; CHECK-NEXT:    [[T_1:%.*]] = icmp uge ptr [[PTR_IV]], [[START]]
 ; CHECK-NEXT:    [[T_2:%.*]] = icmp ult ptr [[PTR_IV]], [[UPPER]]
-; CHECK-NEXT:    [[AND:%.*]] = and i1 [[T_1]], [[T_2]]
+; CHECK-NEXT:    [[AND:%.*]] = and i1 true, [[T_2]]
 ; CHECK-NEXT:    br i1 [[AND]], label [[LOOP_LATCH]], label [[EXIT]]
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    call void @use(ptr [[PTR_IV]])
@@ -94,9 +93,8 @@ define void @test_monotonic_ptr_iv_inc_1_different_element_types_1_chain_of_3_ex
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_NEXT:%.*]], label [[EXIT]]
 ; CHECK:       loop.next:
-; CHECK-NEXT:    [[T_1:%.*]] = icmp uge ptr [[PTR_IV]], [[START]]
 ; CHECK-NEXT:    [[T_2:%.*]] = icmp ult ptr [[PTR_IV]], [[UPPER]]
-; CHECK-NEXT:    [[AND:%.*]] = and i1 [[T_1]], [[T_2]]
+; CHECK-NEXT:    [[AND:%.*]] = and i1 true, [[T_2]]
 ; CHECK-NEXT:    br i1 [[AND]], label [[LOOP_LATCH]], label [[EXIT]]
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    call void @use(ptr [[PTR_IV]])
@@ -163,9 +161,8 @@ define void @test_monotonic_ptr_iv_inc_1_different_element_types_1_chain_of_2_ex
 ; CHECK-NEXT:    [[C_2:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C_2]], label [[LOOP_NEXT:%.*]], l...
[truncated]

Copy link

github-actions bot commented Oct 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@grigorypas
Copy link
Author

@dtcxzyw, can you please re-run the performance analysis? I restricted the logic to be applied to the functions with smaller number of conditional branches. Hopefully, it will avoid hitting MaxRaws threshold.

@grigorypas
Copy link
Author

@dtcxzyw, can you please re-run the performance analysis? I restricted the logic to be applied to the functions with smaller number of conditional branches. Hopefully, it will avoid hitting MaxRaws threshold.

@dtcxzyw, can you please help me interpret the results? I see that the compilation time "by files" is improved, but "by projects" is worsened. Is it acceptable? If not, what should be the threshold?

@dtcxzyw
Copy link
Member

dtcxzyw commented Oct 22, 2024

I see that the compilation time "by files" is improved, but "by projects" is worsened.

It is possible since I calculate them with geomean.

Is it acceptable? If not, what should be the threshold?

It is acceptable for me. Overall compile time regression < 0.3% should be OK since this patch eliminates many branches.

The IR diff looks OK. Wait for @fhahn and @nikic to see if there are correctness issues :)
I am sorry I have no bandwidth to take a deep look.

@grigorypas
Copy link
Author

The IR diff looks OK. Wait for @fhahn and @nikic to see if there are correctness issues :) I am sorry I have no bandwidth to take a deep look.

Thanks for your feedback. I appreciate your insights!

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • While this already appears to seems to improve existing tests (which is a good sign), a change like this should probably add new tests specifically geared towards this change as well.
  • I think the requires some sort check to rule out overflows to be correct? (add a test showing that no transformation happens in cases with potential overflow).

@MatzeB
Copy link
Contributor

MatzeB commented Oct 25, 2024

I am somewhat worried that this sort of constraint is not already added to the system because of compile time considerations. It would be good if you could outline the expected complexity effects here as program inputs grow. And not being too familiar with this part of the compiler myself I have to defer to @nikic and @fhahn for whether we need to be careful with the complexity here or if this is fine?

@MatzeB
Copy link
Contributor

MatzeB commented Oct 25, 2024

  • While this already appears to seems to improve existing tests (which is a good sign), a change like this should probably add new tests specifically geared towards this change as well.

Just notice that there is already a test added, so please ignore this.

@grigorypas
Copy link
Author

grigorypas commented Oct 26, 2024

  • I think the requires some sort check to rule out overflows to be correct? (add a test showing that no transformation happens in cases with potential overflow).

SE.getMonotonicPredicateType should return empty in the case of overflow. I added a test to cover that.

@grigorypas
Copy link
Author

I am somewhat worried that this sort of constraint is not already added to the system because of compile time considerations. It would be good if you could outline the expected complexity effects here as program inputs grow. And not being too familiar with this part of the compiler myself I have to defer to @nikic and @fhahn for whether we need to be careful with the complexity here or if this is fine?

There is a limit on the number of constraints (MaxRaws parameter) that can be added to the same system of constraints (all additional onces are dropped). The default for this limit is 500. I also restricted these new constraints to be applied only for functions with less than MaxRaws / 5 number of conditional branches to avoid potential regression due to hitting MaxRaws limit.

@grigorypas grigorypas requested a review from MatzeB October 27, 2024 00:03
@grigorypas
Copy link
Author

Ping

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me and I'm ready to accept this!

Leaving this open to give other folks more time to react (as I am not the most familiar in this part of the code), but will approve in a week or so then...

@@ -900,10 +905,88 @@ static void dumpConstraint(ArrayRef<int64_t> C,
}
#endif

static Value *getStartValueFromAddRec(PHINode &PN, Loop &L,
ScalarEvolution &SE) {
auto *AR = dyn_cast_or_null<SCEVAddRecExpr>(SE.getSCEV(&PN));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is only called in contexts where this cast is guaranteed to succeed. So I'd probably just pass the existing AR as a parameter (or at least use an unchecked cast<> instead of a dyn_cast_or_null<>).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

State S(DT, LI, SE);
unsigned int NumCondBranches = getNumConditionalBranches(F);
State S(DT, LI, SE,
/* AddInductionInfoIntoHeader= */ NumCondBranches < MaxRows / 5);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware of the existing MaxRows mechanism during my earlier review, so that seems to be already enough to avoid compile-time explosion.

Doing a whole walk over the function to count conditional branches feels somewhat heavy and we potentially get an amount of facts in the number of Phi instructions (which isn't necessarily correlated to number of condjumps). Anyway just trying to make a case that maybe we should just do the simpler thing and rely on MaxRows limiting things without adding extra conditions here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added this condition is to avoid regression in some edge cases (see analysis by @dtcxzyw). If we add extra constrains in loop headers and reach MaxRows limit, some constraints are dropped that are not otherwise. I am happy to remove it if we are OK with those regressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just feels like a lot of trouble (or rather a lot of memory read) to go through the whole function just to count the number of conditional branches... So I am wondering if there is ways to limit this without having an extra pass over the whole function...

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be careful with use of SCEV in CE because it is expensive. I just checked, and at least as-is, this causes a major compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=95a2eb70cf850597a5e871380807911e55f341a7&to=5bb2e9f2f1462c637ce2eec82b8cd2626fcb21d2&stat=instructions:u

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants