[InstCombine] Fold sext(trunc nsw) and zext(trunc nuw) #88609
Conversation
@llvm/pr-subscribers-llvm-transforms

Author: Monad (YanWQ-monad)

Changes: Fold `sext (trunc nsw X to Y) to Z` to `cast (nsw) X to Z`, and `zext (trunc nuw X to Y) to Z` to `cast (nuw) X to Z`. Alive2 proofs: sext: https://alive2.llvm.org/ce/z/cqsk5t, zext: https://alive2.llvm.org/ce/z/kdtEWb

Full diff: https://github.com/llvm/llvm-project/pull/88609.diff

3 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index 437e9b92c7032f..91c149305bb76c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -1188,9 +1188,20 @@ Instruction *InstCombinerImpl::visitZExt(ZExtInst &Zext) {
if (auto *CSrc = dyn_cast<TruncInst>(Src)) { // A->B->C cast
// TODO: Subsume this into EvaluateInDifferentType.
+ Value *A = CSrc->getOperand(0);
+ // If TRUNC has nuw flag, then convert directly to final type.
+ if (CSrc->hasNoUnsignedWrap()) {
+ CastInst *I =
+ CastInst::CreateIntegerCast(A, DestTy, /* isSigned */ false);
+ if (auto *ZExt = dyn_cast<ZExtInst>(I))
+ ZExt->setNonNeg();
+ if (auto *Trunc = dyn_cast<TruncInst>(I))
+ Trunc->setHasNoUnsignedWrap(true);
+ return I;
+ }
+
// Get the sizes of the types involved. We know that the intermediate type
// will be smaller than A or C, but don't know the relation between A and C.
- Value *A = CSrc->getOperand(0);
unsigned SrcSize = A->getType()->getScalarSizeInBits();
unsigned MidSize = CSrc->getType()->getScalarSizeInBits();
unsigned DstSize = DestTy->getScalarSizeInBits();
@@ -1467,6 +1478,15 @@ Instruction *InstCombinerImpl::visitSExt(SExtInst &Sext) {
if (ComputeNumSignBits(X, 0, &Sext) > XBitSize - SrcBitSize)
return CastInst::CreateIntegerCast(X, DestTy, /* isSigned */ true);
+ // If trunc has nsw flag, then convert directly to final type.
+ auto *CSrc = static_cast<TruncInst *>(Src);
+ if (CSrc->hasNoSignedWrap()) {
+ CastInst *I = CastInst::CreateIntegerCast(X, DestTy, /* isSigned */ true);
+ if (auto *Trunc = dyn_cast<TruncInst>(I))
+ Trunc->setHasNoSignedWrap(true);
+ return I;
+ }
+
// If input is a trunc from the destination type, then convert into shifts.
if (Src->hasOneUse() && X->getType() == DestTy) {
// sext (trunc X) --> ashr (shl X, C), C
diff --git a/llvm/test/Transforms/InstCombine/sext.ll b/llvm/test/Transforms/InstCombine/sext.ll
index e3b6058ce7f806..9eae03470a4693 100644
--- a/llvm/test/Transforms/InstCombine/sext.ll
+++ b/llvm/test/Transforms/InstCombine/sext.ll
@@ -423,3 +423,44 @@ define i64 @smear_set_bit_different_dest_type_wider_dst(i32 %x) {
%s = sext i8 %a to i64
ret i64 %s
}
+
+define i32 @sext_trunc_nsw(i16 %x) {
+; CHECK-LABEL: @sext_trunc_nsw(
+; CHECK-NEXT: [[E:%.*]] = sext i16 [[X:%.*]] to i32
+; CHECK-NEXT: ret i32 [[E]]
+;
+ %c = trunc nsw i16 %x to i8
+ %e = sext i8 %c to i32
+ ret i32 %e
+}
+
+define i16 @sext_trunc_nsw_2(i32 %x) {
+; CHECK-LABEL: @sext_trunc_nsw_2(
+; CHECK-NEXT: [[E:%.*]] = trunc nsw i32 [[X:%.*]] to i16
+; CHECK-NEXT: ret i16 [[E]]
+;
+ %c = trunc nsw i32 %x to i8
+ %e = sext i8 %c to i16
+ ret i16 %e
+}
+
+define <2 x i32> @sext_trunc_nsw_vec(<2 x i16> %x) {
+; CHECK-LABEL: @sext_trunc_nsw_vec(
+; CHECK-NEXT: [[E:%.*]] = sext <2 x i16> [[X:%.*]] to <2 x i32>
+; CHECK-NEXT: ret <2 x i32> [[E]]
+;
+ %c = trunc nsw <2 x i16> %x to <2 x i8>
+ %e = sext <2 x i8> %c to <2 x i32>
+ ret <2 x i32> %e
+}
+
+define i32 @sext_trunc(i16 %x) {
+; CHECK-LABEL: @sext_trunc(
+; CHECK-NEXT: [[C:%.*]] = trunc i16 [[X:%.*]] to i8
+; CHECK-NEXT: [[E:%.*]] = sext i8 [[C]] to i32
+; CHECK-NEXT: ret i32 [[E]]
+;
+ %c = trunc i16 %x to i8
+ %e = sext i8 %c to i32
+ ret i32 %e
+}
diff --git a/llvm/test/Transforms/InstCombine/zext.ll b/llvm/test/Transforms/InstCombine/zext.ll
index 88cd9c70af40d8..16e7ef143cef9e 100644
--- a/llvm/test/Transforms/InstCombine/zext.ll
+++ b/llvm/test/Transforms/InstCombine/zext.ll
@@ -867,3 +867,44 @@ entry:
%res = zext nneg i2 %x to i32
ret i32 %res
}
+
+define i32 @zext_trunc_nuw(i16 %x) {
+; CHECK-LABEL: @zext_trunc_nuw(
+; CHECK-NEXT: [[E1:%.*]] = zext nneg i16 [[X:%.*]] to i32
+; CHECK-NEXT: ret i32 [[E1]]
+;
+ %c = trunc nuw i16 %x to i8
+ %e = zext i8 %c to i32
+ ret i32 %e
+}
+
+define i16 @zext_trunc_nuw_2(i32 %x) {
+; CHECK-LABEL: @zext_trunc_nuw_2(
+; CHECK-NEXT: [[E:%.*]] = trunc nuw i32 [[X:%.*]] to i16
+; CHECK-NEXT: ret i16 [[E]]
+;
+ %c = trunc nuw i32 %x to i8
+ %e = zext i8 %c to i16
+ ret i16 %e
+}
+
+define <2 x i32> @zext_trunc_nuw_vec(<2 x i16> %x) {
+; CHECK-LABEL: @zext_trunc_nuw_vec(
+; CHECK-NEXT: [[E1:%.*]] = zext nneg <2 x i16> [[X:%.*]] to <2 x i32>
+; CHECK-NEXT: ret <2 x i32> [[E1]]
+;
+ %c = trunc nuw <2 x i16> %x to <2 x i8>
+ %e = zext <2 x i8> %c to <2 x i32>
+ ret <2 x i32> %e
+}
+
+define i32 @zext_trunc(i16 %x) {
+; CHECK-LABEL: @zext_trunc(
+; CHECK-NEXT: [[E:%.*]] = and i16 [[X:%.*]], 255
+; CHECK-NEXT: [[E1:%.*]] = zext nneg i16 [[E]] to i32
+; CHECK-NEXT: ret i32 [[E1]]
+;
+ %c = trunc i16 %x to i8
+ %e = zext i8 %c to i32
+ ret i32 %e
+}
This fold unfortunately causes information loss, which may hinder other folds. For that reason, I am not sure this fold is worthwhile.
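The original example is not preserved in this thread; as a hypothetical sketch (not the commenter's actual case) of the kind of information loss meant here:

```llvm
; Hypothetical sketch: before the fold, the nuw trunc records that the high
; 8 bits of %x are zero, so the mask below is a no-op and can be removed.
define i32 @info_loss(i16 %x) {
  %t = trunc nuw i16 %x to i8
  %z = zext i8 %t to i32
  %m = and i32 %z, 255        ; known to equal %z while the trunc is visible
  ret i32 %m
}
; After folding %z to "zext nneg i16 %x to i32", nothing records that %x fits
; in 8 bits, so a later pass can no longer drop the and.
```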
Maybe the right place is DAGCombiner then.
Doing this fold is the whole purpose of the flags, so if we can't do it, we may as well drop them again :)

Looking at the diffs, I think it looks ok on average? Note that we already essentially do this fold just via computeKnownBits/ComputeNumSignBits, which is also why there is such a small number of diffs overall. I think the extra changes you see are due to interactions of IPSCCP and InstCombine or something like that. I think we would get some more interesting cases after #88686.

A general issue I see in the diffs is that we're not very good at narrowing i8 "booleans" down to i1 when loop phis are involved, as computeKnownBits() can't look through them. I think this issue accounts for most of the regressions, but it's also tricky to solve...
@dtcxzyw Could you please rerun tests for this patch?
Done.
I noticed that the small number of changes in llvm-opt-benchmark is because most cases are already handled by the existing code in llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp, lines 1203 to 1236 (at 2c3d7d5), and https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L1494-L1513, so this fold is not executed that much. The only cases I can see that will be handled by this fold, when it is placed after the linked code, are:
If the "target datalayout" in the test files is updated with "n8:16:32:64", most of the tests will no longer pass. I tried moving the fold before the linked code and got 3k files updated in llvm-opt-benchmark.
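For reference, the directive being referred to sits at the top of a test file; a sketch of what that would look like (the exact width list is whatever the test intends to model):

```llvm
; With native integer widths declared, InstCombine's existing cast-narrowing
; logic tends to handle these zext/sext-of-trunc patterns before the new
; flag-based fold runs, as the comment above describes.
target datalayout = "n8:16:32:64"
```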
Your proofs seem overly complex. I think the following suffices: https://alive2.llvm.org/ce/z/WbDanN
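The linked proof isn't reproduced in this page; as a rough idea of the shape such a pair takes (an assumed sketch of the sext case, not necessarily the contents of the link):

```llvm
; src: the nsw on the trunc means the i8 value sign-extends back to %x,
; so the round-trip through i8 is value-preserving.
define i16 @src(i32 %x) {
  %t = trunc nsw i32 %x to i8
  %r = sext i8 %t to i16
  ret i16 %r
}

; tgt: a single trunc to the final type, keeping the nsw flag.
define i16 @tgt(i32 %x) {
  %r = trunc nsw i32 %x to i16
  ret i16 %r
}
```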
@@ -1188,9 +1188,19 @@ Instruction *InstCombinerImpl::visitZExt(ZExtInst &Zext) {
if (auto *CSrc = dyn_cast<TruncInst>(Src)) { // A->B->C cast
// TODO: Subsume this into EvaluateInDifferentType.
+ Value *A = CSrc->getOperand(0);
+ // If trunc has nuw flag, then convert directly to final type.
+ if (CSrc->hasNoUnsignedWrap()) {
You can also handle nsw if the zext has nneg.
That's because Alive2 didn't support those flags at the time.
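A sketch of the additional case suggested above (hypothetical, not part of this patch): when the zext already carries nneg, an nsw trunc also round-trips losslessly, so the pair can collapse the same way:

```llvm
define i32 @zext_nneg_trunc_nsw(i16 %x) {
  %t = trunc nsw i16 %x to i8      ; %x sign-extends back from i8
  %r = zext nneg i8 %t to i32      ; nneg: result is poison if %t is negative
  ret i32 %r
}
; Presumably foldable to:
;   %r = zext nneg i16 %x to i32
```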
Fold `sext (trunc nsw X to Y) to Z` to `cast (nsw) X to Z`, and `zext (trunc nuw X to Y) to Z` to `cast (nuw) X to Z`.

Alive2 proofs:
sext: https://alive2.llvm.org/ce/z/cqsk5t
zext: https://alive2.llvm.org/ce/z/kdtEWb

Closes #98017.
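For a concrete picture of the transformation (mirroring the tests added in the diff above; function names are illustrative):

```llvm
; The nsw trunc sign-extends back to %x, so both casts collapse into one sext.
define i32 @fold_sext(i16 %x) {
  %t = trunc nsw i16 %x to i8
  %r = sext i8 %t to i32           ; becomes: sext i16 %x to i32
  ret i32 %r
}

; With nuw on the trunc, the dropped high bits of %x must be zero, so the
; zext can be taken directly from %x and additionally marked nneg.
define i32 @fold_zext(i16 %x) {
  %t = trunc nuw i16 %x to i8
  %r = zext i8 %t to i32           ; becomes: zext nneg i16 %x to i32
  ret i32 %r
}
```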