Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AMDGPU: Use "countMaxActiveBits() <= 5" to define uint5Bits #115543

Merged
merged 1 commit into from
Nov 8, 2024

Conversation

changpeng
Copy link
Contributor

countMaxTrailingOnes() is not correct. This patch follows the suggestion from #115372.

  countMaxTrailingOnes() is not correct. This patch follows the suggestion
from llvm#115372.
@llvmbot
Copy link

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Changpeng Fang (changpeng)

Changes

countMaxTrailingOnes() is not correct. This patch follows the suggestion from #115372.


Full diff: https://github.com/llvm/llvm-project/pull/115543.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/extract-lowbits.ll (+25)
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 0658e030ffa5d6..755cbb7fb65492 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -3554,7 +3554,7 @@ def : AMDGPUPat <
 >;
 
 def uint5Bits : PatLeaf<(i32 VGPR_32:$width), [{
-  return CurDAG->computeKnownBits(SDValue(N, 0)).countMaxTrailingOnes() <= 5;
+  return CurDAG->computeKnownBits(SDValue(N, 0)).countMaxActiveBits() <= 5;
 }]>;
 
 // x << (bitwidth - y) >> (bitwidth - y)
diff --git a/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll b/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
index 3de8db2c6a448e..0e5a68773a6ba8 100644
--- a/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
+++ b/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
@@ -163,6 +163,31 @@ define i32 @bzhi32_d0(i32 %val, i32 %numlowbits) nounwind {
   ret i32 %masked
 }
 
+define i32 @bzhi32_d0_even(i32 %val, i32 %numlowbits) nounwind {
+; SI-LABEL: bzhi32_d0_even:
+; SI:       ; %bb.0:
+; SI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; SI-NEXT:    v_lshlrev_b32_e32 v1, 1, v1
+; SI-NEXT:    v_sub_i32_e32 v1, vcc, 32, v1
+; SI-NEXT:    v_lshlrev_b32_e32 v0, v1, v0
+; SI-NEXT:    v_lshrrev_b32_e32 v0, v1, v0
+; SI-NEXT:    s_setpc_b64 s[30:31]
+;
+; VI-LABEL: bzhi32_d0_even:
+; VI:       ; %bb.0:
+; VI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; VI-NEXT:    v_lshlrev_b32_e32 v1, 1, v1
+; VI-NEXT:    v_sub_u32_e32 v1, vcc, 32, v1
+; VI-NEXT:    v_lshlrev_b32_e32 v0, v1, v0
+; VI-NEXT:    v_lshrrev_b32_e32 v0, v1, v0
+; VI-NEXT:    s_setpc_b64 s[30:31]
+  %times2 = shl i32 %numlowbits, 1
+  %numhighbits = sub i32 32, %times2
+  %highbitscleared = shl i32 %val, %numhighbits
+  %masked = lshr i32 %highbitscleared, %numhighbits
+  ret i32 %masked
+}
+
 define i32 @bzhi32_d1_indexzext(i32 %val, i8 %numlowbits) nounwind {
 ; SI-LABEL: bzhi32_d1_indexzext:
 ; SI:       ; %bb.0:

@changpeng changpeng merged commit db6f476 into llvm:main Nov 8, 2024
7 of 9 checks passed
@changpeng changpeng deleted the bfe branch November 8, 2024 21:03
@arsenm
Copy link
Contributor

arsenm commented Nov 8, 2024

I think the whole original patch should be reverted

@changpeng
Copy link
Contributor Author

I think the whole original patch should be reverted

What is the original patch? WE should not do pattern match for x << (32 -y) >> (32 - y) to bfe x, 0, y?

@arsenm
Copy link
Contributor

arsenm commented Nov 13, 2024

I think the whole original patch should be reverted

What is the original patch? WE should not do pattern match for x << (32 -y) >> (32 - y) to bfe x, 0, y?

Yes

changpeng added a commit that referenced this pull request Nov 13, 2024
…" (#116091)

Based on the suggestion from
#115543, we should not do the
pattern matching from x << (32-y) >> (32-y) to "bfe x, 0, y" at all.
This reverts commits a2bacf8 and
bdf8e30.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants