Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ValueTracking] Compute KnownFP state from recursive select/phi. #113686

Merged
merged 2 commits into from
Oct 31, 2024

Conversation

davemgreen
Copy link
Collaborator

Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand.

This helps to address a regression from #83200.

@llvmbot
Copy link
Member

llvmbot commented Oct 25, 2024

@llvm/pr-subscribers-llvm-transforms

Author: David Green (davemgreen)

Changes

Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand.

This helps to address a regression from #83200.


Full diff: https://github.com/llvm/llvm-project/pull/113686.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+7)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll (+2-2)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e9ed8b3c862b55..708922e21ee0bb 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5999,6 +5999,13 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
         if (IncValue == P)
           continue;
 
+        // If the Use is a select of this phi, use the fp class of the other
+        // operand to break the recursion.
+        Value *V;
+        if (match(IncValue, m_Select(m_Value(), m_Specific(P), m_Value(V))) ||
+            match(IncValue, m_Select(m_Value(), m_Value(V), m_Specific(P))))
+          IncValue = V;
+
         KnownFPClass KnownSrc;
         // Recurse, but cap the recursion to two levels, because we don't want
         // to waste time spinning around in loops. We need at least depth 2 to
diff --git a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
index fc56754166d609..b54d4a97d07be7 100644
--- a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
+++ b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
@@ -45,8 +45,8 @@ define float @test(ptr nocapture readonly %pA, ptr nocapture readonly %pB, i32 %
 ; CHECK-NEXT:    [[TMP7:%.*]] = fsub fast <4 x float> [[WIDE_LOAD]], [[WIDE_LOAD7]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP7]])
 ; CHECK-NEXT:    [[TMP9:%.*]] = fdiv fast <4 x float> [[TMP8]], [[TMP6]]
-; CHECK-NEXT:    [[TMP10:%.*]] = fadd fast <4 x float> [[TMP9]], [[VEC_PHI]]
-; CHECK-NEXT:    [[PREDPHI]] = select <4 x i1> [[DOTNOT9]], <4 x float> [[VEC_PHI]], <4 x float> [[TMP10]]
+; CHECK-NEXT:    [[TMP10:%.*]] = select <4 x i1> [[DOTNOT9]], <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, <4 x float> [[TMP9]]
+; CHECK-NEXT:    [[PREDPHI]] = fadd fast <4 x float> [[VEC_PHI]], [[TMP10]]
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
 ; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
 ; CHECK-NEXT:    br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]

@llvmbot
Copy link
Member

llvmbot commented Oct 25, 2024

@llvm/pr-subscribers-llvm-analysis

Author: David Green (davemgreen)

Changes

Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand.

This helps to address a regression from #83200.


Full diff: https://github.com/llvm/llvm-project/pull/113686.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+7)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll (+2-2)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e9ed8b3c862b55..708922e21ee0bb 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5999,6 +5999,13 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
         if (IncValue == P)
           continue;
 
+        // If the Use is a select of this phi, use the fp class of the other
+        // operand to break the recursion.
+        Value *V;
+        if (match(IncValue, m_Select(m_Value(), m_Specific(P), m_Value(V))) ||
+            match(IncValue, m_Select(m_Value(), m_Value(V), m_Specific(P))))
+          IncValue = V;
+
         KnownFPClass KnownSrc;
         // Recurse, but cap the recursion to two levels, because we don't want
         // to waste time spinning around in loops. We need at least depth 2 to
diff --git a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
index fc56754166d609..b54d4a97d07be7 100644
--- a/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
+++ b/llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll
@@ -45,8 +45,8 @@ define float @test(ptr nocapture readonly %pA, ptr nocapture readonly %pB, i32 %
 ; CHECK-NEXT:    [[TMP7:%.*]] = fsub fast <4 x float> [[WIDE_LOAD]], [[WIDE_LOAD7]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP7]])
 ; CHECK-NEXT:    [[TMP9:%.*]] = fdiv fast <4 x float> [[TMP8]], [[TMP6]]
-; CHECK-NEXT:    [[TMP10:%.*]] = fadd fast <4 x float> [[TMP9]], [[VEC_PHI]]
-; CHECK-NEXT:    [[PREDPHI]] = select <4 x i1> [[DOTNOT9]], <4 x float> [[VEC_PHI]], <4 x float> [[TMP10]]
+; CHECK-NEXT:    [[TMP10:%.*]] = select <4 x i1> [[DOTNOT9]], <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, <4 x float> [[TMP9]]
+; CHECK-NEXT:    [[PREDPHI]] = fadd fast <4 x float> [[VEC_PHI]], [[TMP10]]
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
 ; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
 ; CHECK-NEXT:    br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]

Copy link
Contributor

@pawosm-arm pawosm-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, this indeed restores the lost transformation.

@davemgreen davemgreen force-pushed the gh-vt-knownfpselectphi branch from 411c89e to 4d32839 Compare October 25, 2024 14:21
@dtcxzyw
Copy link
Member

dtcxzyw commented Oct 25, 2024

I think this approach is also applicable to computeKnownBits. I'll post a patch later :)

@davemgreen
Copy link
Collaborator Author

I think this approach is also applicable to computeKnownBits. I'll post a patch later :)

Nice, thanks! I had seen known bits had some handling of recurrences already, they didn't necessarily include selects.
I think we can expand this to phi+phi for the fpclass case too relatively easily, which will help in more complex cases.

I rebased over 577c7dd, which added a phase-ordering test.

@arsenm
Copy link
Contributor

arsenm commented Oct 25, 2024

As far as the regression on #83200 is concerned, I think it needs a local flag handling fix

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a more targeted, dedicated test

@davemgreen davemgreen force-pushed the gh-vt-knownfpselectphi branch from 4d32839 to b17b64b Compare October 27, 2024 19:56
@davemgreen
Copy link
Collaborator Author

As far as the regression on #83200 is concerned, I think it needs a local flag handling fix

Yeah there are probably a few ways to fix it, selects are often missing fmfs due to them coming from control flow. This seemed generally useful.

Needs a more targeted, dedicated test

I've tried to add some Attributor tests, which I see have been used to test similar patches in the past. Let me know if there is a better way.

davemgreen added a commit to davemgreen/llvm-project that referenced this pull request Oct 29, 2024
As a follow-on to llvm#113686, this breaks the recursion between phi nodes that
have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated
from the classes of p1 and p2.
@davemgreen davemgreen force-pushed the gh-vt-knownfpselectphi branch from b17b64b to 7cbb47e Compare October 30, 2024 08:12
davemgreen added a commit to davemgreen/llvm-project that referenced this pull request Oct 30, 2024
As a follow-on to llvm#113686, this breaks the recursion between phi nodes that
have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated
from the classes of p1 and p2.
// operand to break the recursion.
Value *V;
if (match(IncValue, m_Select(m_Value(), m_Specific(P), m_Value(V))) ||
match(IncValue, m_Select(m_Value(), m_Value(V), m_Specific(P))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit, but maybe this means we should add m_c_Select matcher.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be hard to represent that with the negated condition I think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, think it's probably not practical

@@ -2,7 +2,7 @@
; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal -S < %s | FileCheck %s

define float @phi_select(i1 %c, float nofpclass(inf) %base, float nofpclass(inf) %arg) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs another case in the other select operand?

Given a recursive phi with select:
 %p = phi [ 0, entry ], [ %sel, loop]
 %sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi pair can
only be the initial state (0 here) or from %other. This adds a short-cut into
computeKnownFPClass for PHI to detect that the select is recursive back to the
phi, and if so use the state from the other operand.

This helps to address a regression from llvm#83200.
@davemgreen davemgreen force-pushed the gh-vt-knownfpselectphi branch from 7cbb47e to 54efb00 Compare October 31, 2024 00:07
@davemgreen davemgreen merged commit 9735c05 into llvm:main Oct 31, 2024
8 checks passed
@davemgreen davemgreen deleted the gh-vt-knownfpselectphi branch October 31, 2024 07:50
davemgreen added a commit to davemgreen/llvm-project that referenced this pull request Nov 1, 2024
As a follow-on to llvm#113686, this breaks the recursion between phi nodes that
have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated
from the classes of p1 and p2.
dtcxzyw added a commit that referenced this pull request Nov 2, 2024
This patch is inspired by
#113686. I found that it
removes a lot of unnecessary "and X, 1" in some applications that
represent boolean values with int.
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
…m#113686)

Given a recursive phi with select:
 %p = phi [ 0, entry ], [ %sel, loop]
 %sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi
pair can only be the initial state (0 here) or from %other. This adds a
short-cut into computeKnownFPClass for PHI to detect that the select is
recursive back to the phi, and if so use the state from the other
operand.

This helps to address a regression from llvm#83200.
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
…3707)

This patch is inspired by
llvm#113686. I found that it
removes a lot of unnecessary "and X, 1" in some applications that
represent boolean values with int.
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
…3707)

This patch is inspired by
llvm#113686. I found that it
removes a lot of unnecessary "and X, 1" in some applications that
represent boolean values with int.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…m#113686)

Given a recursive phi with select:
 %p = phi [ 0, entry ], [ %sel, loop]
 %sel = select %c, %other, %p

The fp state can be calculated using the knowledge that the select/phi
pair can only be the initial state (0 here) or from %other. This adds a
short-cut into computeKnownFPClass for PHI to detect that the select is
recursive back to the phi, and if so use the state from the other
operand.

This helps to address a regression from llvm#83200.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…3707)

This patch is inspired by
llvm#113686. I found that it
removes a lot of unnecessary "and X, 1" in some applications that
represent boolean values with int.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants