[ValueTracking] Compute KnownFP state from recursive select/phi. #113686
@llvm/pr-subscribers-llvm-transforms
Author: David Green (davemgreen)
Changes
Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p
The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand.
This helps to address a regression from #83200.
Full diff: https://github.com/llvm/llvm-project/pull/113686.diff
2 Files Affected:
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e9ed8b3c862b55..708922e21ee0bb 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5999,6 +5999,13 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
       if (IncValue == P)
         continue;
+      // If the Use is a select of this phi, use the fp class of the other
+      // operand to break the recursion.
+      Value *V;
+      if (match(IncValue, m_Select(m_Value(), m_Specific(P), m_Value(V))) ||
+          match(IncValue, m_Select(m_Value(), m_Value(V), m_Specific(P))))
+        IncValue = V;
+
       KnownFPClass KnownSrc;
       // Recurse, but cap the recursion to two levels, because we don't want
       // to waste time spinning around in loops. We need at least depth 2 to
diff --git a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
index fc56754166d609..b54d4a97d07be7 100644
--- a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
+++ b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
@@ -45,8 +45,8 @@ define float @test(ptr nocapture readonly %pA, ptr nocapture readonly %pB, i32 %
; CHECK-NEXT: [[TMP7:%.*]] = fsub fast <4 x float> [[WIDE_LOAD]], [[WIDE_LOAD7]]
; CHECK-NEXT: [[TMP8:%.*]] = call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP7]])
; CHECK-NEXT: [[TMP9:%.*]] = fdiv fast <4 x float> [[TMP8]], [[TMP6]]
-; CHECK-NEXT: [[TMP10:%.*]] = fadd fast <4 x float> [[TMP9]], [[VEC_PHI]]
-; CHECK-NEXT: [[PREDPHI]] = select <4 x i1> [[DOTNOT9]], <4 x float> [[VEC_PHI]], <4 x float> [[TMP10]]
+; CHECK-NEXT: [[TMP10:%.*]] = select <4 x i1> [[DOTNOT9]], <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, <4 x float> [[TMP9]]
+; CHECK-NEXT: [[PREDPHI]] = fadd fast <4 x float> [[VEC_PHI]], [[TMP10]]
; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
; CHECK-NEXT: br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
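The updated CHECK lines replace a select between the accumulator and `acc + x` with an unconditional add whose operand is a select between -0.0 and `x`. The sketch below is a scalar model of why the two forms agree; it relies on -0.0 being the exact additive identity for IEEE addition under the default rounding mode. The function names are made up for illustration and do not come from the patch.

```cpp
#include <cassert>
#include <cmath>

// Old form: keep the accumulator when the lane is skipped, otherwise add.
float selectOfAdd(bool skip, float acc, float x) { return skip ? acc : acc + x; }

// New form: always add, but feed -0.0 into the add for skipped lanes.
float addOfSelect(bool skip, float acc, float x) { return acc + (skip ? -0.0f : x); }

// Equality that also distinguishes +0.0 from -0.0.
bool sameValueAndSign(float a, float b) {
  return a == b && std::signbit(a) == std::signbit(b);
}

// True if adding `id` to `acc` leaves `acc` unchanged, including its sign.
bool isIdentityFor(float id, float acc) { return sameValueAndSign(acc + id, acc); }
```

Here `isIdentityFor(-0.0f, acc)` holds for both signed zeros, while `isIdentityFor(0.0f, -0.0f)` does not, because `-0.0 + 0.0` yields `+0.0`. This assumes the code is compiled without fast-math style options.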
Tested locally, this indeed restores the lost transformation.
Force-pushed: 411c89e to 4d32839
I think this approach is also applicable to
Nice, thanks! I had seen that known bits already had some handling of recurrences, but it didn't necessarily include selects. I rebased over 577c7dd, which added a phase-ordering test.
As far as the regression on #83200 is concerned, I think it needs a local flag handling fix
Needs a more targeted, dedicated test
Force-pushed: 4d32839 to b17b64b
Yeah, there are probably a few ways to fix it; selects are often missing FMFs because they come from control flow. This seemed generally useful.
I've tried to add some Attributor tests, which I see have been used to test similar patches in the past. Let me know if there is a better way.
As a follow-on to llvm#113686, this breaks the recursion between phi nodes that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated from the classes of p1 and p2.
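As a rough illustration of the mutual-phi case, the sketch below assumes the intent is that a pair `p1 = phi(x, p2)`, `p2 = phi(y, p1)` can only ever hold a value coming from `x` or `y`. It is a toy model on a made-up `Value` struct, using a visited set to walk through phi-only cycles; it is not the follow-on patch's implementation, and the class bits are stand-ins for LLVM's FPClassTest.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Toy class bits; stand-ins for LLVM's FPClassTest.
enum : uint32_t { kNaN = 1, kInf = 2, kZero = 4, kNormal = 8, kAll = 15 };

// Hypothetical mini-IR value: either a phi or a leaf with a known class mask.
struct Value {
  bool isPhi = false;
  uint32_t mask = kAll;                 // Leaf only: classes the value may take.
  std::vector<const Value *> incoming;  // Phi only: incoming values.
};

// Union of the class masks of every non-phi value reachable through phis.
uint32_t classesThroughPhis(const Value *root) {
  std::set<const Value *> visited;
  std::vector<const Value *> worklist{root};
  uint32_t mask = 0;
  while (!worklist.empty()) {
    const Value *v = worklist.back();
    worklist.pop_back();
    if (!visited.insert(v).second)
      continue;  // Already seen: this is what breaks the p1 <-> p2 cycle.
    if (!v->isPhi) {
      mask |= v->mask;
      continue;
    }
    for (const Value *in : v->incoming)
      worklist.push_back(in);
  }
  return mask;
}

// Builds p1 = phi(x, p2), p2 = phi(y, p1) with x known zero and y known normal.
uint32_t classesForMutualPhis() {
  Value x, y, p1, p2;
  x.mask = kZero;
  y.mask = kNormal;
  p1.isPhi = true;
  p2.isPhi = true;
  p1.incoming = {&x, &p2};
  p2.incoming = {&y, &p1};
  return classesThroughPhis(&p1);
}
```

In this model `classesForMutualPhis()` yields `kZero | kNormal`: neither phi contributes anything beyond what `x` and `y` can be.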
Force-pushed: b17b64b to 7cbb47e
// operand to break the recursion.
Value *V;
if (match(IncValue, m_Select(m_Value(), m_Specific(P), m_Value(V))) ||
    match(IncValue, m_Select(m_Value(), m_Value(V), m_Specific(P))))
Just a nit, but maybe this means we should add an m_c_Select matcher.
It would be hard to represent that with the negated condition I think
Agreed, think it's probably not practical
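To make the concern about the negated condition concrete: `select c, a, b` is the same value as `select !c, b, a`, so a matcher that accepts the specific operand on either arm loses the meaning of the condition unless it also reports which arm matched. The sketch below uses a made-up `Select` struct and a hypothetical `matchSelectWith` helper; neither is an LLVM PatternMatch API.

```cpp
#include <cassert>

// Hypothetical mini-IR: values are plain integer ids. Not an LLVM type.
struct Select {
  int cond;
  int trueVal;
  int falseVal;
};

// Hypothetical "commutative" select match: succeeds if `specific` is either
// arm, returns the other arm, and reports which arm `specific` was found on.
// A caller that cares about the condition must treat the false-arm case as
// if the condition were negated.
bool matchSelectWith(const Select &s, int specific, int &other,
                     bool &specificOnTrueArm) {
  if (s.trueVal == specific) {
    other = s.falseVal;
    specificOnTrueArm = true;
    return true;
  }
  if (s.falseVal == specific) {
    other = s.trueVal;
    specificOnTrueArm = false;
    return true;
  }
  return false;
}
```

The patch ignores the condition entirely (`m_Value()`), which is why two plain `m_Select` matches are enough there; the extra flag only matters for callers that also bind the condition.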
@@ -2,7 +2,7 @@
; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal -S < %s | FileCheck %s

define float @phi_select(i1 %c, float nofpclass(inf) %base, float nofpclass(inf) %arg) {
Needs another case in the other select operand?
Force-pushed: 7cbb47e to 54efb00
This patch is inspired by #113686. I found that it removes a lot of unnecessary "and X, 1" in some applications that represent boolean values with int.
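A small sketch of why the same idea can remove an `and X, 1`: if a phi/select recurrence can only hold its initial value or the other select operand, its known-zero bits are the intersection of the known-zero bits of those two values. The helper names and the 32-bit width are assumptions for illustration; this is not LLVM's KnownBits API.

```cpp
#include <cassert>
#include <cstdint>

// Known-zero bits of a recurrence that only ever holds `init` or `other`:
// a bit is known zero only if it is known zero in both.
uint32_t knownZeroOfRecurrence(uint32_t knownZeroInit, uint32_t knownZeroOther) {
  return knownZeroInit & knownZeroOther;
}

// `and X, andMask` is a no-op if every bit the mask would clear is already
// known to be zero in X.
bool andIsRedundant(uint32_t knownZero, uint32_t andMask) {
  uint32_t possiblySet = ~knownZero;
  return (possiblySet & ~andMask) == 0;
}
```

For a boolean stored in an int, the initial value 0 has all bits known zero (`~0u`) and a zero-extended i1 has all bits but bit 0 known zero (`~1u`), so the recurrence has `~1u` known zero and `andIsRedundant(~1u, 1)` holds.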
Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p
The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand.
This helps to address a regression from #83200.
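The effect of the short-cut can be sketched on a toy model. The code below uses a made-up `Node` type and a four-bit class mask as stand-ins for LLVM's IR and FPClassTest, and a fixed depth cap of 2 chosen only for the illustration; it is a minimal sketch of the idea, not the code in computeKnownFPClass. With the short-cut, an incoming select that has the phi as one arm is replaced by its other arm before recursing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for LLVM's FPClassTest: one bit per class a value may take.
enum : uint32_t { kNaN = 1, kInf = 2, kZero = 4, kNormal = 8, kAll = 15 };

// Hypothetical mini-IR node; not an LLVM type. Select conditions are omitted.
struct Node {
  enum Kind { Leaf, Phi, Select };
  Kind kind = Leaf;
  uint32_t leafMask = kAll;       // Leaf only: classes the value may take.
  std::vector<const Node *> ops;  // Phi: incoming values. Select: {true, false}.
};

// If `in` is a select with `phi` as one arm, return the other arm.
const Node *skipSelectOfPhi(const Node *in, const Node *phi) {
  if (in->kind == Node::Select) {
    if (in->ops[0] == phi) return in->ops[1];
    if (in->ops[1] == phi) return in->ops[0];
  }
  return in;
}

// Union of the classes `n` may take; gives up (returns kAll) past depth 2.
uint32_t possibleClasses(const Node *n, bool useShortcut, int depth = 0) {
  if (n->kind == Node::Leaf) return n->leafMask;
  if (depth > 2) return kAll;
  uint32_t mask = 0;
  for (const Node *in : n->ops) {
    if (n->kind == Node::Phi) {
      if (in == n) continue;  // A phi feeding itself adds nothing new.
      if (useShortcut) in = skipSelectOfPhi(in, n);
    }
    mask |= possibleClasses(in, useShortcut, depth + 1);
  }
  return mask;
}

// Builds %p = phi [0, entry], [%sel, loop]; %sel = select %c, %other, %p.
uint32_t classesForExampleLoop(bool useShortcut) {
  Node zero, other, phi, sel;
  zero.leafMask = kZero;
  other.leafMask = kZero | kNormal;  // Assumed: %other is never NaN or inf.
  phi.kind = Node::Phi;
  sel.kind = Node::Select;
  sel.ops = {&other, &phi};
  phi.ops = {&zero, &sel};
  return possibleClasses(&phi, useShortcut);
}
```

In this model `classesForExampleLoop(true)` returns `kZero | kNormal`, while `classesForExampleLoop(false)` keeps going around the phi/select cycle until the depth cap is hit and returns `kAll`.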