-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "AMDGPU: Don't avoid clamp of bit shift in BFE pattern (#115372)" #116091
Conversation
This reverts commit a2bacf8.
@llvm/pr-subscribers-backend-amdgpu Author: Changpeng Fang (changpeng) ChangesThis reverts commit a2bacf8. Full diff: https://github.com/llvm/llvm-project/pull/116091.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 25df5dabdc6aa1..2fdf69e068e5c1 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -3557,7 +3557,7 @@ def : AMDGPUPat <
>;
def uint5Bits : PatLeaf<(i32 VGPR_32:$width), [{
- return CurDAG->computeKnownBits(SDValue(N, 0)).countMaxActiveBits() <= 5;
+ return CurDAG->computeKnownBits(SDValue(N, 0)).countMaxTrailingOnes() <= 5;
}]>;
// x << (bitwidth - y) >> (bitwidth - y)
diff --git a/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll b/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
index 0e5a68773a6ba8..3de8db2c6a448e 100644
--- a/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
+++ b/llvm/test/CodeGen/AMDGPU/extract-lowbits.ll
@@ -163,31 +163,6 @@ define i32 @bzhi32_d0(i32 %val, i32 %numlowbits) nounwind {
ret i32 %masked
}
-define i32 @bzhi32_d0_even(i32 %val, i32 %numlowbits) nounwind {
-; SI-LABEL: bzhi32_d0_even:
-; SI: ; %bb.0:
-; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; SI-NEXT: v_lshlrev_b32_e32 v1, 1, v1
-; SI-NEXT: v_sub_i32_e32 v1, vcc, 32, v1
-; SI-NEXT: v_lshlrev_b32_e32 v0, v1, v0
-; SI-NEXT: v_lshrrev_b32_e32 v0, v1, v0
-; SI-NEXT: s_setpc_b64 s[30:31]
-;
-; VI-LABEL: bzhi32_d0_even:
-; VI: ; %bb.0:
-; VI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; VI-NEXT: v_lshlrev_b32_e32 v1, 1, v1
-; VI-NEXT: v_sub_u32_e32 v1, vcc, 32, v1
-; VI-NEXT: v_lshlrev_b32_e32 v0, v1, v0
-; VI-NEXT: v_lshrrev_b32_e32 v0, v1, v0
-; VI-NEXT: s_setpc_b64 s[30:31]
- %times2 = shl i32 %numlowbits, 1
- %numhighbits = sub i32 32, %times2
- %highbitscleared = shl i32 %val, %numhighbits
- %masked = lshr i32 %highbitscleared, %numhighbits
- ret i32 %masked
-}
-
define i32 @bzhi32_d1_indexzext(i32 %val, i8 %numlowbits) nounwind {
; SI-LABEL: bzhi32_d1_indexzext:
; SI: ; %bb.0:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please say why in the description.
We are going to revert the original patch for bfe patterns. We believe the optimization is just the corner cases and does not deserve the cost of compilation time. |
This comment is not part of the commit description |
I know. Already updated the commit description (in different words though) |
Based on the suggestion from #115543, we should not do the pattern matching from x << (32-y) >> (32-y) to "bfe x, 0, y" at all.
This reverts commits a2bacf8 and bdf8e30.