[LVI] Learn value ranges from ctpop results #121945

Merged
merged 1 commit into from
Jan 15, 2025

zsrkmyn (Member) commented Jan 7, 2025

Fixes #115751

llvmbot (Member) commented Jan 7, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Stephen Senran Zhang (zsrkmyn)

Changes

Fixes #115751


Full diff: https://github.com/llvm/llvm-project/pull/121945.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/LazyValueInfo.cpp (+71)
  • (added) llvm/test/Transforms/CorrelatedValuePropagation/ctpop-cttz-ctlz-range.ll (+309)
diff --git a/llvm/lib/Analysis/LazyValueInfo.cpp b/llvm/lib/Analysis/LazyValueInfo.cpp
index 349a0a1a2d3c42..d3cd5e41bcf1cf 100644
--- a/llvm/lib/Analysis/LazyValueInfo.cpp
+++ b/llvm/lib/Analysis/LazyValueInfo.cpp
@@ -390,6 +390,11 @@ class LazyValueInfoImpl {
   // push additional values to the worklist and return nullopt. If
   // UseBlockValue is false, it will never return nullopt.
 
+  std::optional<ValueLatticeElement>
+  getValueFromICmpBitIntrinsic(ICmpInst::Predicate Pred, unsigned ValBitWidth,
+                               Intrinsic::ID IID, Value *RHS, Instruction *CtxI,
+                               bool UseBlockValue);
+
   std::optional<ValueLatticeElement>
   getValueFromSimpleICmpCondition(CmpInst::Predicate Pred, Value *RHS,
                                   const APInt &Offset, Instruction *CxtI,
@@ -1159,6 +1164,65 @@ getRangeViaSLT(CmpInst::Predicate Pred, APInt RHS,
   return std::nullopt;
 }
 
+static bool matchBitIntrinsic(Value *LHS, Value *Val, Intrinsic::ID &IID) {
+  auto *II = dyn_cast<IntrinsicInst>(LHS);
+  if (!II)
+    return false;
+  auto ID = II->getIntrinsicID();
+  switch (ID) {
+  case Intrinsic::ctpop:
+  case Intrinsic::ctlz:
+  case Intrinsic::cttz:
+    break;
+  default:
+    return false;
+  }
+  if (II->getArgOperand(0) != Val)
+    return false;
+  IID = ID;
+  return true;
+}
+
+/// Get value range for a "intrinsic(Val) Pred RHS" condition, where intrinsic
+/// can be one of ctpop, ctlz, and cttz.
+std::optional<ValueLatticeElement>
+LazyValueInfoImpl::getValueFromICmpBitIntrinsic(ICmpInst::Predicate Pred,
+                                                unsigned ValBitWidth,
+                                                Intrinsic::ID IID, Value *RHS,
+                                                Instruction *CtxI,
+                                                bool UseBlockValue) {
+  unsigned BitWidth = ValBitWidth;
+  auto Offset = APInt::getZero(BitWidth);
+
+  auto ResValLattice =
+      getValueFromSimpleICmpCondition(Pred, RHS, Offset, CtxI, UseBlockValue);
+  if (!ResValLattice)
+    return std::nullopt;
+  auto &ResValRange = ResValLattice->getConstantRange();
+
+  unsigned ResMin = ResValRange.getUnsignedMin().getLimitedValue(BitWidth);
+  unsigned ResMax = ResValRange.getUnsignedMax().getLimitedValue(BitWidth);
+
+  APInt ValMin, ValMax;
+  APInt AllOnes = APInt::getAllOnes(BitWidth);
+  switch (IID) {
+  case Intrinsic::ctpop:
+    ValMin = AllOnes.lshr(BitWidth - ResMin);
+    ValMax = AllOnes.shl(BitWidth - ResMax);
+    break;
+  case Intrinsic::ctlz:
+    ValMin = ResMax == BitWidth ? APInt(BitWidth, 0)
+                                : APInt(BitWidth, 1).shl(BitWidth - ResMax - 1);
+    ValMax = AllOnes.lshr(ResMin);
+    break;
+  case Intrinsic::cttz:
+    ValMin = APInt(BitWidth, 1).shl(ResMin);
+    ValMax = AllOnes.shl(ResMin);
+    break;
+  }
+  return ValueLatticeElement::getRange(ConstantRange{ValMin, ValMax + 1});
+}
+
 std::optional<ValueLatticeElement> LazyValueInfoImpl::getValueFromICmpCondition(
     Value *Val, ICmpInst *ICI, bool isTrueDest, bool UseBlockValue) {
   Value *LHS = ICI->getOperand(0);
@@ -1191,6 +1255,13 @@ std::optional<ValueLatticeElement> LazyValueInfoImpl::getValueFromICmpCondition(
   if (matchICmpOperand(Offset, RHS, Val, SwappedPred))
     return getValueFromSimpleICmpCondition(SwappedPred, LHS, Offset, ICI,
                                            UseBlockValue);
+  Intrinsic::ID IID;
+  if (matchBitIntrinsic(LHS, Val, IID))
+    return getValueFromICmpBitIntrinsic(EdgePred, BitWidth, IID, RHS, ICI,
+                                        UseBlockValue);
+  if (matchBitIntrinsic(RHS, Val, IID))
+    return getValueFromICmpBitIntrinsic(SwappedPred, BitWidth, IID, LHS, ICI,
+                                        UseBlockValue);
 
   const APInt *Mask, *C;
   if (match(LHS, m_And(m_Specific(Val), m_APInt(Mask))) &&
diff --git a/llvm/test/Transforms/CorrelatedValuePropagation/ctpop-cttz-ctlz-range.ll b/llvm/test/Transforms/CorrelatedValuePropagation/ctpop-cttz-ctlz-range.ll
new file mode 100644
index 00000000000000..8fc835e0b056b5
--- /dev/null
+++ b/llvm/test/Transforms/CorrelatedValuePropagation/ctpop-cttz-ctlz-range.ll
@@ -0,0 +1,309 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=correlated-propagation %s | FileCheck %s
+
+declare void @use(i1)
+
+define void @ctpop(i8 %v) {
+; CHECK-LABEL: define void @ctpop(
+; CHECK-SAME: i8 [[V:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[RES:%.*]] = call range(i8 0, 9) i8 @llvm.ctpop.i8(i8 [[V]])
+; CHECK-NEXT:    [[C0_0:%.*]] = icmp samesign uge i8 [[RES]], 3
+; CHECK-NEXT:    [[C0_1:%.*]] = icmp samesign ule i8 [[RES]], 7
+; CHECK-NEXT:    [[C0:%.*]] = and i1 [[C0_0]], [[C0_1]]
+; CHECK-NEXT:    br i1 [[C0]], label %[[RANGE_3_8:.*]], label %[[TEST2:.*]]
+; CHECK:       [[RANGE_3_8]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST2]]:
+; CHECK-NEXT:    [[C1:%.*]] = icmp samesign uge i8 [[RES]], 8
+; CHECK-NEXT:    br i1 [[C1]], label %[[RANGE_8_9:.*]], label %[[TEST3:.*]]
+; CHECK:       [[RANGE_8_9]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST3]]:
+; CHECK-NEXT:    [[C2:%.*]] = icmp eq i8 [[RES]], 0
+; CHECK-NEXT:    br i1 [[C2]], label %[[RANGE_0_1:.*]], label %[[TEST4:.*]]
+; CHECK:       [[RANGE_0_1]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST4]]:
+; CHECK-NEXT:    [[C3_1:%.*]] = icmp samesign ule i8 [[RES]], 4
+; CHECK-NEXT:    [[C3:%.*]] = and i1 true, [[C3_1]]
+; CHECK-NEXT:    br i1 [[C3]], label %[[RANGE_1_5:.*]], label %[[ED:.*]]
+; CHECK:       [[RANGE_1_5]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[ED]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %res = call range(i8 0, 9) i8 @llvm.ctpop.i8(i8 %v)
+  %c0.0 = icmp uge i8 %res, 3
+  %c0.1 = icmp ule i8 %res, 7
+  %c0 = and i1 %c0.0, %c0.1
+  br i1 %c0, label %range.3.8, label %test2
+
+range.3.8:
+  %cmp0 = icmp uge i8 %v, 7
+  call void @use(i1 %cmp0) ; true
+  %cmp1 = icmp ult i8 %v, 7
+  call void @use(i1 %cmp1) ; false
+  %cmp2 = icmp ule i8 %v, 254
+  call void @use(i1 %cmp2) ; true
+  %cmp3 = icmp ugt i8 %v, 254
+  call void @use(i1 %cmp3) ; false
+  ret void
+
+test2:
+  %c1 = icmp uge i8 %res, 8
+  br i1 %c1, label %range.8.9, label %test3
+
+range.8.9:
+  %cmp4 = icmp eq i8 %v, -1
+  call void @use(i1 %cmp4) ; true
+  ret void
+
+test3:
+  %c2 = icmp eq i8 %res, 0
+  br i1 %c2, label %range.0.1, label %test4
+
+range.0.1:
+  %cmp5 = icmp eq i8 %v, 0
+  call void @use(i1 %cmp5) ; true
+  ret void
+
+test4:
+  %c3.0 = icmp uge i8 %res, 1
+  %c3.1 = icmp ule i8 %res, 4
+  %c3 = and i1 %c3.0, %c3.1
+  br i1 %c3, label %range.1.5, label %ed
+
+range.1.5:
+  %cmp8 = icmp uge i8 %v, 1
+  call void @use(i1 %cmp8) ; true
+  %cmp9 = icmp ult i8 %v, 1
+  call void @use(i1 %cmp9) ; false
+  %cmp10 = icmp ule i8 %v, 240
+  call void @use(i1 %cmp10) ; true
+  %cmp11 = icmp ugt i8 %v, 240
+  call void @use(i1 %cmp11) ; false
+  ret void
+
+ed:
+  ret void
+}
+
+define void @ctlz(i8 %v) {
+; CHECK-LABEL: define void @ctlz(
+; CHECK-SAME: i8 [[V:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[RES:%.*]] = call range(i8 0, 9) i8 @llvm.ctlz.i8(i8 [[V]], i1 false)
+; CHECK-NEXT:    [[C0_0:%.*]] = icmp samesign uge i8 [[RES]], 3
+; CHECK-NEXT:    [[C0_1:%.*]] = icmp samesign ule i8 [[RES]], 7
+; CHECK-NEXT:    [[C0:%.*]] = and i1 [[C0_0]], [[C0_1]]
+; CHECK-NEXT:    br i1 [[C0]], label %[[RANGE_3_8:.*]], label %[[TEST2:.*]]
+; CHECK:       [[RANGE_3_8]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST2]]:
+; CHECK-NEXT:    [[C1:%.*]] = icmp samesign uge i8 [[RES]], 8
+; CHECK-NEXT:    br i1 [[C1]], label %[[RANGE_8_9:.*]], label %[[TEST3:.*]]
+; CHECK:       [[RANGE_8_9]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST3]]:
+; CHECK-NEXT:    [[C2:%.*]] = icmp eq i8 [[RES]], 0
+; CHECK-NEXT:    br i1 [[C2]], label %[[RANGE_0_1:.*]], label %[[TEST4:.*]]
+; CHECK:       [[RANGE_0_1]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    [[CMP7:%.*]] = icmp samesign ult i8 [[V]], -1
+; CHECK-NEXT:    call void @use(i1 [[CMP7]])
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST4]]:
+; CHECK-NEXT:    [[C3_1:%.*]] = icmp samesign ule i8 [[RES]], 4
+; CHECK-NEXT:    [[C3:%.*]] = and i1 true, [[C3_1]]
+; CHECK-NEXT:    br i1 [[C3]], label %[[RANGE_1_5:.*]], label %[[ED:.*]]
+; CHECK:       [[RANGE_1_5]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[ED]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %res = call range(i8 0, 9) i8 @llvm.ctlz.i8(i8 %v, i1 false)
+  %c0.0 = icmp uge i8 %res, 3
+  %c0.1 = icmp ule i8 %res, 7
+  %c0 = and i1 %c0.0, %c0.1
+  br i1 %c0, label %range.3.8, label %test2
+
+range.3.8:
+  %cmp0 = icmp uge i8 %v, 1
+  call void @use(i1 %cmp0) ; true
+  %cmp1 = icmp ult i8 %v, 1
+  call void @use(i1 %cmp1) ; false
+  %cmp2 = icmp ule i8 %v, 31
+  call void @use(i1 %cmp2) ; true
+  %cmp3 = icmp ugt i8 %v, 31
+  call void @use(i1 %cmp3) ; false
+  ret void
+
+test2:
+  %c1 = icmp uge i8 %res, 8
+  br i1 %c1, label %range.8.9, label %test3
+
+range.8.9:
+  %cmp4 = icmp eq i8 %v, 0
+  call void @use(i1 %cmp4) ; true
+  ret void
+
+test3:
+  %c2 = icmp eq i8 %res, 0
+  br i1 %c2, label %range.0.1, label %test4
+
+range.0.1:
+  %cmp5 = icmp uge i8 %v, 128
+  call void @use(i1 %cmp5) ; true
+  %cmp6 = icmp ult i8 %v, 128
+  call void @use(i1 %cmp6) ; false
+  %cmp7 = icmp ult i8 %v, 255
+  call void @use(i1 %cmp7) ; unknown
+  ret void
+
+test4:
+  %c3.0 = icmp uge i8 %res, 1
+  %c3.1 = icmp ule i8 %res, 4
+  %c3 = and i1 %c3.0, %c3.1
+  br i1 %c3, label %range.1.5, label %ed
+
+range.1.5:
+  %cmp8 = icmp uge i8 %v, 8
+  call void @use(i1 %cmp8) ; true
+  %cmp9 = icmp ult i8 %v, 8
+  call void @use(i1 %cmp9) ; false
+  %cmp10 = icmp ule i8 %v, 127
+  call void @use(i1 %cmp10) ; true
+  %cmp11 = icmp ugt i8 %v, 127
+  call void @use(i1 %cmp11) ; false
+  ret void
+
+ed:
+  ret void
+}
+
+define void @cttz(i8 %v) {
+; CHECK-LABEL: define void @cttz(
+; CHECK-SAME: i8 [[V:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[RES:%.*]] = call range(i8 0, 9) i8 @llvm.cttz.i8(i8 [[V]], i1 false)
+; CHECK-NEXT:    [[C0_0:%.*]] = icmp samesign uge i8 [[RES]], 3
+; CHECK-NEXT:    [[C0_1:%.*]] = icmp samesign ule i8 [[RES]], 7
+; CHECK-NEXT:    [[C0:%.*]] = and i1 [[C0_0]], [[C0_1]]
+; CHECK-NEXT:    br i1 [[C0]], label %[[RANGE_3_8:.*]], label %[[TEST2:.*]]
+; CHECK:       [[RANGE_3_8]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST2]]:
+; CHECK-NEXT:    [[C1:%.*]] = icmp samesign uge i8 [[RES]], 8
+; CHECK-NEXT:    br i1 [[C1]], label %[[RANGE_8_9:.*]], label %[[TEST3:.*]]
+; CHECK:       [[RANGE_8_9]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST3]]:
+; CHECK-NEXT:    [[C2:%.*]] = icmp eq i8 [[RES]], 0
+; CHECK-NEXT:    br i1 [[C2]], label %[[RANGE_0_1:.*]], label %[[TEST4:.*]]
+; CHECK:       [[RANGE_0_1]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    [[CMP7:%.*]] = icmp ult i8 [[V]], -1
+; CHECK-NEXT:    call void @use(i1 [[CMP7]])
+; CHECK-NEXT:    ret void
+; CHECK:       [[TEST4]]:
+; CHECK-NEXT:    [[C3_1:%.*]] = icmp samesign ule i8 [[RES]], 4
+; CHECK-NEXT:    [[C3:%.*]] = and i1 true, [[C3_1]]
+; CHECK-NEXT:    br i1 [[C3]], label %[[RANGE_1_5:.*]], label %[[ED:.*]]
+; CHECK:       [[RANGE_1_5]]:
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    call void @use(i1 true)
+; CHECK-NEXT:    call void @use(i1 false)
+; CHECK-NEXT:    ret void
+; CHECK:       [[ED]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %res = call range(i8 0, 9) i8 @llvm.cttz.i8(i8 %v, i1 false)
+  %c0.0 = icmp uge i8 %res, 3
+  %c0.1 = icmp ule i8 %res, 7
+  %c0 = and i1 %c0.0, %c0.1
+  br i1 %c0, label %range.3.8, label %test2
+
+range.3.8:
+  %cmp0 = icmp uge i8 %v, 4
+  call void @use(i1 %cmp0) ; true
+  %cmp1 = icmp ult i8 %v, 4
+  call void @use(i1 %cmp1) ; false
+  %cmp2 = icmp ule i8 %v, 248
+  call void @use(i1 %cmp2) ; true
+  %cmp3 = icmp ugt i8 %v, 248
+  call void @use(i1 %cmp3) ; false
+  ret void
+
+test2:
+  %c1 = icmp uge i8 %res, 8
+  br i1 %c1, label %range.8.9, label %test3
+
+range.8.9:
+  %cmp4 = icmp eq i8 %v, 0
+  call void @use(i1 %cmp4) ; true
+  ret void
+
+test3:
+  %c2 = icmp eq i8 %res, 0
+  br i1 %c2, label %range.0.1, label %test4
+
+range.0.1:
+  %cmp5 = icmp uge i8 %v, 1
+  call void @use(i1 %cmp5) ; true
+  %cmp6 = icmp ult i8 %v, 1
+  call void @use(i1 %cmp6) ; false
+  %cmp7 = icmp ult i8 %v, 255
+  call void @use(i1 %cmp7) ; unknown
+  ret void
+
+test4:
+  %c3.0 = icmp uge i8 %res, 1
+  %c3.1 = icmp ule i8 %res, 4
+  %c3 = and i1 %c3.0, %c3.1
+  br i1 %c3, label %range.1.5, label %ed
+
+range.1.5:
+  %cmp8 = icmp uge i8 %v, 2
+  call void @use(i1 %cmp8) ; true
+  %cmp9 = icmp ult i8 %v, 2
+  call void @use(i1 %cmp9) ; false
+  %cmp10 = icmp ule i8 %v, 254
+  call void @use(i1 %cmp10) ; true
+  %cmp11 = icmp ugt i8 %v, 254
+  call void @use(i1 %cmp11) ; false
+  ret void
+
+ed:
+  ret void
+}
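
The ctpop case in the diff above maps a range of popcount results back to a range of inputs: the smallest input with at least `lo` set bits has the low `lo` bits set, and the largest input with at most `hi` set bits has the high `hi` bits set. The following is a minimal standalone sketch of that idea for 8-bit values, checked by brute force. It does not use LLVM's `APInt` or `ConstantRange`; the names `Range8`, `popcount8`, `ctpopInputRange`, `bruteForceRange`, and `formulaMatchesBruteForce` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Inclusive [min, max] range of 8-bit inputs (hypothetical helper type).
struct Range8 {
  unsigned min;
  unsigned max;
};

// Portable population count restricted to the low 8 bits.
unsigned popcount8(unsigned v) {
  unsigned n = 0;
  for (v &= 0xFFu; v != 0; v &= v - 1) // clear the lowest set bit each step
    ++n;
  return n;
}

// Closed-form bounds on v given lo <= ctpop(v) <= hi, mirroring the shape of
// the ctpop case in the patch (low `lo` bits set, high `hi` bits set).
Range8 ctpopInputRange(unsigned lo, unsigned hi) {
  assert(lo <= hi && hi <= 8);
  unsigned min = (1u << lo) - 1u;                   // e.g. lo = 3 gives 0b00000111
  unsigned max = 0xFFu & ~((1u << (8u - hi)) - 1u); // e.g. hi = 7 gives 0b11111110
  return {min, max};
}

// Reference answer: scan every 8-bit value.
Range8 bruteForceRange(unsigned lo, unsigned hi) {
  Range8 r{255u, 0u};
  for (unsigned v = 0; v < 256; ++v) {
    unsigned c = popcount8(v);
    if (c < lo || c > hi)
      continue;
    if (v < r.min) r.min = v;
    if (v > r.max) r.max = v;
  }
  return r;
}

// True if the closed form agrees with brute force for every lo <= hi in 0..8.
bool formulaMatchesBruteForce() {
  for (unsigned lo = 0; lo <= 8; ++lo)
    for (unsigned hi = lo; hi <= 8; ++hi) {
      Range8 a = ctpopInputRange(lo, hi);
      Range8 b = bruteForceRange(lo, hi);
      if (a.min != b.min || a.max != b.max)
        return false;
    }
  return true;
}
```

For the `[3, 7]` popcount range used in the test file, this gives inputs in `[7, 254]`, which matches the constants compared against in the `range.3.8` block. Note that the bounds are only an enclosing interval: many values inside it (for example 8) do not have a popcount in the requested range.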

llvmbot (Member) commented Jan 7, 2025

@llvm/pr-subscribers-llvm-analysis


zsrkmyn (Member, Author) commented Jan 7, 2025

Oops, it seems there are crashes. I'll take a look tomorrow.

zsrkmyn (Member, Author) commented Jan 7, 2025

Fixed.

@dtcxzyw could you help restart the benchmark? Many thanks!

nikic (Contributor) left a comment

What is the rationale for supporting ctlz and cttz here? Comparisons involving these will already get canonicalized: https://llvm.godbolt.org/z/dfW4vK8jx

zsrkmyn (Member, Author) commented Jan 9, 2025

> What is the rationale for supporting ctlz and cttz here? Comparisons involving these will already get canonicalized: https://llvm.godbolt.org/z/dfW4vK8jx

Oh, that's because if the RHS is not a constant but only has a known constant range, the comparison won't get canonicalized.

But if I remove the use of getValueFromSimpleICmpCondition, such cases no longer apply to ctlz & cttz, and support for them can be removed.
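
To make the canonicalization point concrete: for a constant threshold, a comparison on ctlz or cttz can be restated directly as a condition on the input. The sketch below checks two such equivalences for 8-bit values by brute force. It is an assumption that the linked example shows rewrites of this kind; the code only demonstrates the arithmetic, and the helper names `ctlz8`, `cttz8`, and `allThresholdsAgree` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of an 8-bit value; returns 8 for v == 0.
unsigned ctlz8(unsigned v) {
  unsigned n = 0;
  for (unsigned bit = 0x80u; bit != 0 && !(v & bit); bit >>= 1)
    ++n;
  return n;
}

// Count trailing zeros of an 8-bit value; returns 8 for v == 0.
unsigned cttz8(unsigned v) {
  unsigned n = 0;
  for (unsigned bit = 1u; bit <= 0x80u && !(v & bit); bit <<= 1)
    ++n;
  return n;
}

// Checks, for every threshold k in 0..8 and every 8-bit v:
//   ctlz(v) >= k  <=>  v < 2^(8 - k)
//   cttz(v) >= k  <=>  (v & (2^k - 1)) == 0
bool allThresholdsAgree() {
  for (unsigned k = 0; k <= 8; ++k)
    for (unsigned v = 0; v < 256; ++v) {
      bool ctlzCmp = ctlz8(v) >= k;
      bool rangeCmp = v < (1u << (8u - k));
      bool cttzCmp = cttz8(v) >= k;
      bool maskCmp = (v & ((1u << k) - 1u)) == 0;
      if (ctlzCmp != rangeCmp || cttzCmp != maskCmp)
        return false;
    }
  return true;
}
```

Because a constant threshold reduces to a plain range or mask test on the input, separate handling is only interesting when the other operand is not a constant, which is the case discussed in the reply above.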

dtcxzyw (Member) left a comment

LG

dtcxzyw changed the title from "[LVI] Learn value ranges from ctpop/ctlz/cttz results" to "[LVI] Learn value ranges from ctpop results" Jan 13, 2025
zsrkmyn force-pushed the ctpop branch 2 times, most recently from c390bf6 to 472e37a, January 13, 2025 06:26
nikic (Contributor) left a comment

Looks reasonable to me.

zsrkmyn force-pushed the ctpop branch 2 times, most recently from 7f886ef to d86c6c9, January 14, 2025 02:15
zsrkmyn (Member, Author) commented Jan 14, 2025

@nikic Thanks! All comments resolved.

Could you also help merge it when it's ready? Many thanks!

Comment on lines 34 to 35
%cmp1 = icmp ult i8 %v, 7
call void @use(i1 %cmp1) ; false
Contributor

Suggested change
- %cmp1 = icmp ult i8 %v, 7
- call void @use(i1 %cmp1) ; false
+ %cmp1 = icmp uge i8 %v, 8
+ call void @use(i1 %cmp1)

For these sorts of tests, it's best to check two adjacent values where one folds and one doesn't. This allows you to test where exactly the boundary is. Otherwise it's not clear that the range is actually correct.
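
A small standalone sketch of the boundary idea, under the assumption that the branch guarantees `3 <= ctpop(v) <= 7` for an 8-bit `v`: it classifies `v uge c` as always true, always false, or undecided over all inputs satisfying the constraint. The names `Fold`, `popcount8`, and `foldUge` are hypothetical and unrelated to LLVM's API.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of evaluating a comparison over every input allowed by the constraint.
enum class Fold { AlwaysTrue, AlwaysFalse, Unknown };

// Portable population count restricted to the low 8 bits.
unsigned popcount8(unsigned v) {
  unsigned n = 0;
  for (v &= 0xFFu; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Classify `v >= c` (unsigned) over all 8-bit v with lo <= ctpop(v) <= hi.
Fold foldUge(unsigned lo, unsigned hi, unsigned c) {
  assert(lo <= hi && hi <= 8);
  bool sawTrue = false, sawFalse = false;
  for (unsigned v = 0; v < 256; ++v) {
    unsigned bits = popcount8(v);
    if (bits < lo || bits > hi)
      continue; // v is excluded by the branch condition
    if (v >= c)
      sawTrue = true;
    else
      sawFalse = true;
  }
  if (sawTrue && !sawFalse) return Fold::AlwaysTrue;
  if (sawFalse && !sawTrue) return Fold::AlwaysFalse;
  return Fold::Unknown;
}
```

With this constraint, `foldUge(3, 7, 7)` is `Fold::AlwaysTrue` while `foldUge(3, 7, 8)` is `Fold::Unknown`, so the pair of adjacent constants 7 and 8 pins the lower bound at exactly 7. Testing `uge 7` together with `ult 7` would pass even if the inferred lower bound were larger than it should be.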

Member Author

Ah you're right. I meant to test its lower bound but made the condition wrong. Fixed. Many thanks!

; CHECK-NEXT: ret void
;
entry:
%res = call range(i8 0, 9) i8 @llvm.ctpop.i8(i8 %v)
Contributor

Suggested change
- %res = call range(i8 0, 9) i8 @llvm.ctpop.i8(i8 %v)
+ %res = call i8 @llvm.ctpop.i8(i8 %v)

I don't think these range annotations are really relevant to the test?

Member Author

Removed.

nikic (Contributor) left a comment

LGTM

@nikic nikic merged commit af656a8 into llvm:main Jan 15, 2025
8 checks passed
paulhuggett pushed a commit to paulhuggett/llvm-project that referenced this pull request Jan 16, 2025
DKLoehr pushed a commit to DKLoehr/llvm-project that referenced this pull request Jan 17, 2025

Successfully merging this pull request may close these issues.

Failure to infer that ctpop(y) == 1 implies y != 0
4 participants