[LVI] Infer non-zero from equality icmp #112838

Merged
merged 3 commits into from
Oct 18, 2024

Conversation

@dtcxzyw dtcxzyw commented Oct 18, 2024

The following pattern is common in loop headers:

  %101 = sub nuw i64 %78, %98
  %103 = icmp eq i64 %78, %98
  br i1 %103, label %.thread.i.i, label %.preheader.preheader.i.i

.preheader.preheader.i.i:
  %invariant.umin.i.i = call i64 @llvm.umin.i64(i64 %101, i64 9)
  %umax.i = call i64 @llvm.umax.i64(i64 %invariant.umin.i.i, i64 1)
  br label %.preheader.i.i

.preheader.i.i:
  ...
  %116 = add nuw nsw i64 %.011.i.i, 1
  %exitcond.not.i = icmp eq i64 %116, %umax.i
  br i1 %exitcond.not.i, label %.critedge.i.i, label %.preheader.i.i

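Because %78 is not equal to %98 in basic block .preheader.preheader.i.i, we can prove that %101 (their difference) is non-zero there. The umin with 9 is then at least 1, so the umax with 1 is redundant and the loop exit condition can be simplified.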
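
The reasoning above can be checked with a small standalone sketch. It is not LLVM code: `clampedBound` and `umaxIsRedundantWhenOperandsDiffer` are hypothetical helpers, and 8-bit integers stand in for `i64` so the check can be exhaustive. It also models a plain wrapping `sub`, which is a weaker assumption than the `sub nuw` in the snippet.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of the IR pattern: umax(umin(a - b, 9), 1).
// The subtraction wraps, like a plain `sub` without nuw/nsw flags.
std::uint8_t clampedBound(std::uint8_t a, std::uint8_t b) {
  const std::uint8_t diff = static_cast<std::uint8_t>(a - b);
  const std::uint8_t clamped = std::min<std::uint8_t>(diff, 9); // llvm.umin
  return std::max<std::uint8_t>(clamped, 1);                    // llvm.umax
}

// Exhaustively checks every 8-bit pair with a != b:
//   1. the wrapping difference is non-zero, and
//   2. the umax with 1 does not change the umin result.
bool umaxIsRedundantWhenOperandsDiffer() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      if (a == b)
        continue; // only the "not equal" edge is of interest
      const std::uint8_t x = static_cast<std::uint8_t>(a);
      const std::uint8_t y = static_cast<std::uint8_t>(b);
      const std::uint8_t diff = static_cast<std::uint8_t>(x - y);
      const std::uint8_t clamped = std::min<std::uint8_t>(diff, 9);
      if (diff == 0 || clampedBound(x, y) != clamped)
        return false;
    }
  }
  return true;
}
```

On the equal edge the difference is zero, so the `umax` is what produces the value 1; only on the not-equal edge does it become a no-op.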

Addresses regression introduced by #112742.


llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/112838.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/LazyValueInfo.cpp (+14)
  • (added) llvm/test/Transforms/CorrelatedValuePropagation/umax.ll (+58)
diff --git a/llvm/lib/Analysis/LazyValueInfo.cpp b/llvm/lib/Analysis/LazyValueInfo.cpp
index 10ad4708596cb3..f29777a584772d 100644
--- a/llvm/lib/Analysis/LazyValueInfo.cpp
+++ b/llvm/lib/Analysis/LazyValueInfo.cpp
@@ -1127,6 +1127,20 @@ std::optional<ValueLatticeElement> LazyValueInfoImpl::getValueFromICmpCondition(
   if (!Ty->isIntegerTy())
     return ValueLatticeElement::getOverdefined();
 
+  // a - b or ptrtoint(a) - ptrtoint(b) ==/!= 0 if a ==/!= b
+  Value *X, *Y;
+  if (ICI->isEquality() && match(Val, m_Sub(m_Value(X), m_Value(Y)))) {
+    // Peek through ptrtoints
+    match(X, m_PtrToIntSameSize(DL, m_Value(X)));
+    match(Y, m_PtrToIntSameSize(DL, m_Value(Y)));
+    if ((X == LHS && Y == RHS) || (X == RHS && Y == LHS)) {
+      Constant *NullVal = Constant::getNullValue(Val->getType());
+      if (EdgePred == ICmpInst::ICMP_EQ)
+        return ValueLatticeElement::get(NullVal);
+      return ValueLatticeElement::getNot(NullVal);
+    }
+  }
+
   unsigned BitWidth = Ty->getScalarSizeInBits();
   APInt Offset(BitWidth, 0);
   if (matchICmpOperand(Offset, LHS, Val, EdgePred))
diff --git a/llvm/test/Transforms/CorrelatedValuePropagation/umax.ll b/llvm/test/Transforms/CorrelatedValuePropagation/umax.ll
new file mode 100644
index 00000000000000..5cd615e948adbe
--- /dev/null
+++ b/llvm/test/Transforms/CorrelatedValuePropagation/umax.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=correlated-propagation -S | FileCheck %s
+
+target datalayout = "p:32:32"
+
+define i32 @infer_range_from_dom_equality(i32 %x, i32 %y) {
+; CHECK-LABEL: define range(i32 1, 0) i32 @infer_range_from_dom_equality(
+; CHECK-SAME: i32 [[X:%.*]], i32 [[Y:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[X]], [[Y]]
+; CHECK-NEXT:    [[SUB:%.*]] = sub i32 [[X]], [[Y]]
+; CHECK-NEXT:    br i1 [[COND]], label %[[IF_THEN:.*]], label %[[IF_ELSE:.*]]
+; CHECK:       [[IF_THEN]]:
+; CHECK-NEXT:    ret i32 1
+; CHECK:       [[IF_ELSE]]:
+; CHECK-NEXT:    ret i32 [[SUB]]
+;
+entry:
+  %cond = icmp eq i32 %x, %y
+  %sub = sub i32 %x, %y
+  br i1 %cond, label %if.then, label %if.else
+
+if.then:
+  %max1 = call i32 @llvm.umax.i32(i32 %sub, i32 1)
+  ret i32 %max1
+
+if.else:
+  %max2 = call i32 @llvm.umax.i32(i32 %sub, i32 1)
+  ret i32 %max2
+}
+
+define i32 @infer_range_from_dom_equality_ptrdiff(ptr %x, ptr %y) {
+; CHECK-LABEL: define range(i32 1, 0) i32 @infer_range_from_dom_equality_ptrdiff(
+; CHECK-SAME: ptr [[X:%.*]], ptr [[Y:%.*]]) {
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq ptr [[X]], [[Y]]
+; CHECK-NEXT:    [[XI:%.*]] = ptrtoint ptr [[X]] to i32
+; CHECK-NEXT:    [[YI:%.*]] = ptrtoint ptr [[Y]] to i32
+; CHECK-NEXT:    [[SUB:%.*]] = sub i32 [[XI]], [[YI]]
+; CHECK-NEXT:    br i1 [[COND]], label %[[IF_THEN:.*]], label %[[IF_ELSE:.*]]
+; CHECK:       [[IF_THEN]]:
+; CHECK-NEXT:    ret i32 1
+; CHECK:       [[IF_ELSE]]:
+; CHECK-NEXT:    ret i32 [[SUB]]
+;
+  %cond = icmp eq ptr %x, %y
+  %xi = ptrtoint ptr %x to i32
+  %yi = ptrtoint ptr %y to i32
+  %sub = sub i32 %xi, %yi
+  br i1 %cond, label %if.then, label %if.else
+
+if.then:
+  %max1 = call i32 @llvm.umax.i32(i32 %sub, i32 1)
+  ret i32 %max1
+
+if.else:
+  %max2 = call i32 @llvm.umax.i32(i32 %sub, i32 1)
+  ret i32 %max2
+}
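
As a rough illustration of what the hunk above infers, here is a standalone sketch that does not use LLVM's classes. `Pred`, `SubFact`, `inferSubFromEquality`, and `inWrappedRange` are hypothetical names, operands are identified by small integer ids, and the edge predicate is assumed to be the one known to hold on the edge being queried. The second helper shows why the `range(i32 1, 0)` return attribute in the test expectations reads as "any value except 0".

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; these are not LLVM types.
enum class Pred { EQ, NE };
enum class SubFact { Unknown, Zero, NonZero };

// The queried value is `subLhs - subRhs`; on the edge, `cmpLhs <pred> cmpRhs`
// is known to hold. Operands are compared by id, in either order.
SubFact inferSubFromEquality(Pred edgePred, int subLhs, int subRhs,
                             int cmpLhs, int cmpRhs) {
  const bool sameOrder = subLhs == cmpLhs && subRhs == cmpRhs;
  const bool swapped = subLhs == cmpRhs && subRhs == cmpLhs;
  if (!sameOrder && !swapped)
    return SubFact::Unknown; // the compare says nothing about this sub
  return edgePred == Pred::EQ ? SubFact::Zero : SubFact::NonZero;
}

// Membership in a half-open range [lower, upper) that may wrap around.
// This sketch assumes lower != upper; equal bounds are not modeled.
bool inWrappedRange(std::uint32_t v, std::uint32_t lower, std::uint32_t upper) {
  if (lower < upper)
    return lower <= v && v < upper;
  return v >= lower || v < upper; // wrapped: everything outside [upper, lower)
}
```

For example, `inferSubFromEquality(Pred::NE, 1, 2, 2, 1)` yields `SubFact::NonZero` because the operands match in swapped order, and `inWrappedRange(v, 1, 0)` is false only for `v == 0`.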


llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-llvm-analysis



@nikic nikic left a comment


LGTM


dtcxzyw commented Oct 18, 2024

Failed Tests (3):
LLVM :: MC/ELF/warn-newline-in-escaped-string.s
LLVM :: TableGen/x86-fold-tables.td
LLVM :: tools/llvm-rc/tag-html.test

Looks unrelated.

@dtcxzyw dtcxzyw merged commit c89d731 into llvm:main Oct 18, 2024
6 of 8 checks passed
@dtcxzyw dtcxzyw deleted the perf/lvi-eq branch October 18, 2024 13:19
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024