Collaborator

@RKSimon RKSimon commented Dec 10, 2025

Currently we only permit AVX512 lowering of i256 CTTZ/CTLZ when the source is loadable, as the GPR->FPU transition costs would otherwise outweigh the vectorization benefit.

This patch checks for other cases where the source can avoid the GPR: a mayFoldToVector helper checks for a bitcast that originally came from a vector type, as well as for constant values and the original mayFoldLoad check.

There will be other cases for the mayFoldToVector helper, but I've just used it for CTTZ/CTLZ initially.
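
As a rough illustration of the gating described above, here is a toy standalone model. `SourceKind`, `mayFoldToVectorModel` and `useVectorLowering` are hypothetical names made up for this sketch and are not LLVM APIs; the real helper inspects `SDValue` nodes rather than an enum.

```cpp
#include <cassert>

// Hypothetical stand-in for the kinds of i256 source the patch distinguishes.
// These names are illustrative only and are not LLVM types.
enum class SourceKind {
  BitcastFromVector, // value already lives in a vector register
  Constant,          // constant value, no GPR needed to produce it
  FoldableLoad,      // can be loaded straight into a vector register
  ScalarGPR          // value built up in general-purpose registers
};

// Models the helper's decision: true when no GPR->vector transfer is needed.
bool mayFoldToVectorModel(SourceKind Kind) {
  switch (Kind) {
  case SourceKind::BitcastFromVector:
  case SourceKind::Constant:
  case SourceKind::FoldableLoad:
    return true;
  case SourceKind::ScalarGPR:
    return false;
  }
  return false;
}

// Models the gate in the diff: only the i256 case is conditional on the
// helper, i512 always proceeds to the vector lowering.
bool useVectorLowering(unsigned Bits, SourceKind Kind) {
  return Bits == 512 || (Bits == 256 && mayFoldToVectorModel(Kind));
}
```

Under this model, an i256 value assembled in GPRs keeps the scalar lowering, while the same value bitcast from a vector argument takes the vector path.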

…ed from the vector unit

Member

llvmbot commented Dec 10, 2025

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/171589.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+10-1)
  • (modified) llvm/test/CodeGen/X86/bitcnt-big-integer.ll (+104-228)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index fbd875a93fd4a..b4ad7465d612e 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -2846,6 +2846,15 @@ bool X86::mayFoldIntoZeroExtend(SDValue Op) {
   return false;
 }
 
+// Return true if its cheap to bitcast this to a vector type.
+static bool mayFoldToVector(SDValue Op, const X86Subtarget &Subtarget) {
+  if (peekThroughBitcasts(Op).getValueType().isVector())
+    return true;
+  if (isa<ConstantSDNode>(Op) || isa<ConstantFPSDNode>(Op))
+    return true;
+  return X86::mayFoldLoad(Op, Subtarget);
+}
+
 static bool isLogicOp(unsigned Opcode) {
   // TODO: Add support for X86ISD::FAND/FOR/FXOR/FANDN with test coverage.
   return ISD::isBitwiseLogicOp(Opcode) || X86ISD::ANDNP == Opcode;
@@ -33958,7 +33967,7 @@ void X86TargetLowering::ReplaceNodeResults(SDNode *N,
     EVT VT = N->getValueType(0);
     assert(Subtarget.hasCDI() && "AVX512CD required");
     assert((VT == MVT::i256 || VT == MVT::i512) && "Unexpected VT!");
-    if (VT == MVT::i256 && !X86::mayFoldLoad(N0, Subtarget))
+    if (VT == MVT::i256 && !mayFoldToVector(N0, Subtarget))
       return;
 
     unsigned SizeInBits = VT.getSizeInBits();
diff --git a/llvm/test/CodeGen/X86/bitcnt-big-integer.ll b/llvm/test/CodeGen/X86/bitcnt-big-integer.ll
index 749b3ddc96d0d..06ccbf4daa1e8 100644
--- a/llvm/test/CodeGen/X86/bitcnt-big-integer.ll
+++ b/llvm/test/CodeGen/X86/bitcnt-big-integer.ll
@@ -1567,72 +1567,38 @@ define i32 @vector_ctlz_i256(<8 x i32> %v0) nounwind {
 ;
 ; AVX512F-LABEL: vector_ctlz_i256:
 ; AVX512F:       # %bb.0:
-; AVX512F-NEXT:    vmovq %xmm0, %rax
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512F-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512F-NEXT:    vmovq %xmm0, %rdx
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512F-NEXT:    lzcntq %rsi, %rdi
-; AVX512F-NEXT:    lzcntq %rdx, %r8
-; AVX512F-NEXT:    addl $64, %r8d
-; AVX512F-NEXT:    testq %rsi, %rsi
-; AVX512F-NEXT:    cmovnel %edi, %r8d
-; AVX512F-NEXT:    lzcntq %rcx, %rdi
-; AVX512F-NEXT:    lzcntq %rax, %rax
-; AVX512F-NEXT:    addl $64, %eax
-; AVX512F-NEXT:    testq %rcx, %rcx
-; AVX512F-NEXT:    cmovnel %edi, %eax
-; AVX512F-NEXT:    subl $-128, %eax
-; AVX512F-NEXT:    orq %rsi, %rdx
-; AVX512F-NEXT:    cmovnel %r8d, %eax
-; AVX512F-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512F-NEXT:    vpbroadcastq {{.*#+}} ymm1 = [256,256,256,256]
+; AVX512F-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512F-NEXT:    vplzcntq %zmm0, %zmm2
+; AVX512F-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm2, %ymm2
+; AVX512F-NEXT:    vptestmq %zmm0, %zmm0, %k0
+; AVX512F-NEXT:    kshiftlw $12, %k0, %k0
+; AVX512F-NEXT:    kshiftrw $12, %k0, %k1
+; AVX512F-NEXT:    vpcompressq %zmm2, %zmm1 {%k1}
+; AVX512F-NEXT:    vmovd %xmm1, %eax
 ; AVX512F-NEXT:    retq
 ;
 ; AVX512VL-LABEL: vector_ctlz_i256:
 ; AVX512VL:       # %bb.0:
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512VL-NEXT:    vmovq %xmm0, %rax
-; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512VL-NEXT:    vmovq %xmm0, %rdx
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512VL-NEXT:    lzcntq %rsi, %rdi
-; AVX512VL-NEXT:    lzcntq %rdx, %r8
-; AVX512VL-NEXT:    addl $64, %r8d
-; AVX512VL-NEXT:    testq %rsi, %rsi
-; AVX512VL-NEXT:    cmovnel %edi, %r8d
-; AVX512VL-NEXT:    lzcntq %rcx, %rdi
-; AVX512VL-NEXT:    lzcntq %rax, %rax
-; AVX512VL-NEXT:    addl $64, %eax
-; AVX512VL-NEXT:    testq %rcx, %rcx
-; AVX512VL-NEXT:    cmovnel %edi, %eax
-; AVX512VL-NEXT:    subl $-128, %eax
-; AVX512VL-NEXT:    orq %rsi, %rdx
-; AVX512VL-NEXT:    cmovnel %r8d, %eax
-; AVX512VL-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512VL-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512VL-NEXT:    vplzcntq %ymm0, %ymm1
+; AVX512VL-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm1, %ymm1
+; AVX512VL-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512VL-NEXT:    vpbroadcastq {{.*#+}} ymm0 = [256,256,256,256]
+; AVX512VL-NEXT:    vpcompressq %ymm1, %ymm0 {%k1}
+; AVX512VL-NEXT:    vmovd %xmm0, %eax
 ; AVX512VL-NEXT:    vzeroupper
 ; AVX512VL-NEXT:    retq
 ;
 ; AVX512POPCNT-LABEL: vector_ctlz_i256:
 ; AVX512POPCNT:       # %bb.0:
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rax
-; AVX512POPCNT-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rdx
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512POPCNT-NEXT:    lzcntq %rsi, %rdi
-; AVX512POPCNT-NEXT:    lzcntq %rdx, %r8
-; AVX512POPCNT-NEXT:    addl $64, %r8d
-; AVX512POPCNT-NEXT:    testq %rsi, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %edi, %r8d
-; AVX512POPCNT-NEXT:    lzcntq %rcx, %rdi
-; AVX512POPCNT-NEXT:    lzcntq %rax, %rax
-; AVX512POPCNT-NEXT:    addl $64, %eax
-; AVX512POPCNT-NEXT:    testq %rcx, %rcx
-; AVX512POPCNT-NEXT:    cmovnel %edi, %eax
-; AVX512POPCNT-NEXT:    subl $-128, %eax
-; AVX512POPCNT-NEXT:    orq %rsi, %rdx
-; AVX512POPCNT-NEXT:    cmovnel %r8d, %eax
-; AVX512POPCNT-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512POPCNT-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512POPCNT-NEXT:    vplzcntq %ymm0, %ymm1
+; AVX512POPCNT-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512POPCNT-NEXT:    vpbroadcastq {{.*#+}} ymm0 = [256,256,256,256]
+; AVX512POPCNT-NEXT:    vpcompressq %ymm1, %ymm0 {%k1}
+; AVX512POPCNT-NEXT:    vmovd %xmm0, %eax
 ; AVX512POPCNT-NEXT:    vzeroupper
 ; AVX512POPCNT-NEXT:    retq
   %a0 = bitcast <8 x i32> %v0 to i256
@@ -3246,72 +3212,35 @@ define i32 @vector_ctlz_undef_i256(<8 x i32> %v0) nounwind {
 ;
 ; AVX512F-LABEL: vector_ctlz_undef_i256:
 ; AVX512F:       # %bb.0:
-; AVX512F-NEXT:    vmovq %xmm0, %rax
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512F-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512F-NEXT:    vmovq %xmm0, %rdx
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512F-NEXT:    lzcntq %rsi, %rdi
-; AVX512F-NEXT:    lzcntq %rdx, %r8
-; AVX512F-NEXT:    addl $64, %r8d
-; AVX512F-NEXT:    testq %rsi, %rsi
-; AVX512F-NEXT:    cmovnel %edi, %r8d
-; AVX512F-NEXT:    lzcntq %rcx, %rdi
-; AVX512F-NEXT:    lzcntq %rax, %rax
-; AVX512F-NEXT:    addl $64, %eax
-; AVX512F-NEXT:    testq %rcx, %rcx
-; AVX512F-NEXT:    cmovnel %edi, %eax
-; AVX512F-NEXT:    subl $-128, %eax
-; AVX512F-NEXT:    orq %rsi, %rdx
-; AVX512F-NEXT:    cmovnel %r8d, %eax
-; AVX512F-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512F-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512F-NEXT:    vplzcntq %zmm0, %zmm1
+; AVX512F-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm1, %ymm1
+; AVX512F-NEXT:    vptestmq %zmm0, %zmm0, %k0
+; AVX512F-NEXT:    kshiftlw $12, %k0, %k0
+; AVX512F-NEXT:    kshiftrw $12, %k0, %k1
+; AVX512F-NEXT:    vpcompressq %zmm1, %zmm0 {%k1} {z}
+; AVX512F-NEXT:    vmovd %xmm0, %eax
 ; AVX512F-NEXT:    retq
 ;
 ; AVX512VL-LABEL: vector_ctlz_undef_i256:
 ; AVX512VL:       # %bb.0:
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512VL-NEXT:    vmovq %xmm0, %rax
-; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512VL-NEXT:    vmovq %xmm0, %rdx
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512VL-NEXT:    lzcntq %rsi, %rdi
-; AVX512VL-NEXT:    lzcntq %rdx, %r8
-; AVX512VL-NEXT:    addl $64, %r8d
-; AVX512VL-NEXT:    testq %rsi, %rsi
-; AVX512VL-NEXT:    cmovnel %edi, %r8d
-; AVX512VL-NEXT:    lzcntq %rcx, %rdi
-; AVX512VL-NEXT:    lzcntq %rax, %rax
-; AVX512VL-NEXT:    addl $64, %eax
-; AVX512VL-NEXT:    testq %rcx, %rcx
-; AVX512VL-NEXT:    cmovnel %edi, %eax
-; AVX512VL-NEXT:    subl $-128, %eax
-; AVX512VL-NEXT:    orq %rsi, %rdx
-; AVX512VL-NEXT:    cmovnel %r8d, %eax
-; AVX512VL-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512VL-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512VL-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512VL-NEXT:    vplzcntq %ymm0, %ymm0
+; AVX512VL-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
+; AVX512VL-NEXT:    vpcompressq %ymm0, %ymm0 {%k1} {z}
+; AVX512VL-NEXT:    vmovd %xmm0, %eax
 ; AVX512VL-NEXT:    vzeroupper
 ; AVX512VL-NEXT:    retq
 ;
 ; AVX512POPCNT-LABEL: vector_ctlz_undef_i256:
 ; AVX512POPCNT:       # %bb.0:
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rcx
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rax
-; AVX512POPCNT-NEXT:    vextracti128 $1, %ymm0, %xmm0
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rdx
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rsi
-; AVX512POPCNT-NEXT:    lzcntq %rsi, %rdi
-; AVX512POPCNT-NEXT:    lzcntq %rdx, %r8
-; AVX512POPCNT-NEXT:    addl $64, %r8d
-; AVX512POPCNT-NEXT:    testq %rsi, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %edi, %r8d
-; AVX512POPCNT-NEXT:    lzcntq %rcx, %rdi
-; AVX512POPCNT-NEXT:    lzcntq %rax, %rax
-; AVX512POPCNT-NEXT:    addl $64, %eax
-; AVX512POPCNT-NEXT:    testq %rcx, %rcx
-; AVX512POPCNT-NEXT:    cmovnel %edi, %eax
-; AVX512POPCNT-NEXT:    subl $-128, %eax
-; AVX512POPCNT-NEXT:    orq %rsi, %rdx
-; AVX512POPCNT-NEXT:    cmovnel %r8d, %eax
-; AVX512POPCNT-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512POPCNT-NEXT:    vpermq {{.*#+}} ymm0 = ymm0[3,2,1,0]
+; AVX512POPCNT-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512POPCNT-NEXT:    vplzcntq %ymm0, %ymm0
+; AVX512POPCNT-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
+; AVX512POPCNT-NEXT:    vpcompressq %ymm0, %ymm0 {%k1} {z}
+; AVX512POPCNT-NEXT:    vmovd %xmm0, %eax
 ; AVX512POPCNT-NEXT:    vzeroupper
 ; AVX512POPCNT-NEXT:    retq
   %a0 = bitcast <8 x i32> %v0 to i256
@@ -4887,72 +4816,47 @@ define i32 @vector_cttz_i256(<8 x i32> %v0) nounwind {
 ;
 ; AVX512F-LABEL: vector_cttz_i256:
 ; AVX512F:       # %bb.0:
-; AVX512F-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512F-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512F-NEXT:    vmovq %xmm1, %rcx
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512F-NEXT:    vmovq %xmm0, %rsi
-; AVX512F-NEXT:    tzcntq %rsi, %rdi
-; AVX512F-NEXT:    tzcntq %rdx, %r8
-; AVX512F-NEXT:    addl $64, %r8d
-; AVX512F-NEXT:    testq %rsi, %rsi
-; AVX512F-NEXT:    cmovnel %edi, %r8d
-; AVX512F-NEXT:    tzcntq %rcx, %rdi
-; AVX512F-NEXT:    tzcntq %rax, %rax
-; AVX512F-NEXT:    addl $64, %eax
-; AVX512F-NEXT:    testq %rcx, %rcx
-; AVX512F-NEXT:    cmovnel %edi, %eax
-; AVX512F-NEXT:    subl $-128, %eax
-; AVX512F-NEXT:    orq %rdx, %rsi
-; AVX512F-NEXT:    cmovnel %r8d, %eax
-; AVX512F-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512F-NEXT:    # kill: def $ymm0 killed $ymm0 def $zmm0
+; AVX512F-NEXT:    vpbroadcastq {{.*#+}} ymm1 = [256,256,256,256]
+; AVX512F-NEXT:    vpcmpeqd %ymm2, %ymm2, %ymm2
+; AVX512F-NEXT:    vpaddq %ymm2, %ymm0, %ymm2
+; AVX512F-NEXT:    vpandn %ymm2, %ymm0, %ymm2
+; AVX512F-NEXT:    vplzcntq %zmm2, %zmm2
+; AVX512F-NEXT:    vmovdqa {{.*#+}} ymm3 = [64,128,192,256]
+; AVX512F-NEXT:    vpsubq %ymm2, %ymm3, %ymm2
+; AVX512F-NEXT:    vptestmq %zmm0, %zmm0, %k0
+; AVX512F-NEXT:    kshiftlw $12, %k0, %k0
+; AVX512F-NEXT:    kshiftrw $12, %k0, %k1
+; AVX512F-NEXT:    vpcompressq %zmm2, %zmm1 {%k1}
+; AVX512F-NEXT:    vmovd %xmm1, %eax
 ; AVX512F-NEXT:    retq
 ;
 ; AVX512VL-LABEL: vector_cttz_i256:
 ; AVX512VL:       # %bb.0:
-; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512VL-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512VL-NEXT:    vmovq %xmm1, %rcx
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512VL-NEXT:    vmovq %xmm0, %rsi
-; AVX512VL-NEXT:    tzcntq %rsi, %rdi
-; AVX512VL-NEXT:    tzcntq %rdx, %r8
-; AVX512VL-NEXT:    addl $64, %r8d
-; AVX512VL-NEXT:    testq %rsi, %rsi
-; AVX512VL-NEXT:    cmovnel %edi, %r8d
-; AVX512VL-NEXT:    tzcntq %rcx, %rdi
-; AVX512VL-NEXT:    tzcntq %rax, %rax
-; AVX512VL-NEXT:    addl $64, %eax
-; AVX512VL-NEXT:    testq %rcx, %rcx
-; AVX512VL-NEXT:    cmovnel %edi, %eax
-; AVX512VL-NEXT:    subl $-128, %eax
-; AVX512VL-NEXT:    orq %rdx, %rsi
-; AVX512VL-NEXT:    cmovnel %r8d, %eax
-; AVX512VL-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512VL-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
+; AVX512VL-NEXT:    vpaddq %ymm1, %ymm0, %ymm1
+; AVX512VL-NEXT:    vpandn %ymm1, %ymm0, %ymm1
+; AVX512VL-NEXT:    vplzcntq %ymm1, %ymm1
+; AVX512VL-NEXT:    vmovdqa {{.*#+}} ymm2 = [64,128,192,256]
+; AVX512VL-NEXT:    vpsubq %ymm1, %ymm2, %ymm1
+; AVX512VL-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512VL-NEXT:    vpbroadcastq {{.*#+}} ymm0 = [256,256,256,256]
+; AVX512VL-NEXT:    vpcompressq %ymm1, %ymm0 {%k1}
+; AVX512VL-NEXT:    vmovd %xmm0, %eax
 ; AVX512VL-NEXT:    vzeroupper
 ; AVX512VL-NEXT:    retq
 ;
 ; AVX512POPCNT-LABEL: vector_cttz_i256:
 ; AVX512POPCNT:       # %bb.0:
-; AVX512POPCNT-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512POPCNT-NEXT:    vmovq %xmm1, %rcx
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rsi
-; AVX512POPCNT-NEXT:    tzcntq %rsi, %rdi
-; AVX512POPCNT-NEXT:    tzcntq %rdx, %r8
-; AVX512POPCNT-NEXT:    addl $64, %r8d
-; AVX512POPCNT-NEXT:    testq %rsi, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %edi, %r8d
-; AVX512POPCNT-NEXT:    tzcntq %rcx, %rdi
-; AVX512POPCNT-NEXT:    tzcntq %rax, %rax
-; AVX512POPCNT-NEXT:    addl $64, %eax
-; AVX512POPCNT-NEXT:    testq %rcx, %rcx
-; AVX512POPCNT-NEXT:    cmovnel %edi, %eax
-; AVX512POPCNT-NEXT:    subl $-128, %eax
-; AVX512POPCNT-NEXT:    orq %rdx, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %r8d, %eax
-; AVX512POPCNT-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512POPCNT-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vpaddq %ymm1, %ymm0, %ymm1
+; AVX512POPCNT-NEXT:    vpandn %ymm1, %ymm0, %ymm1
+; AVX512POPCNT-NEXT:    vpopcntq %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512POPCNT-NEXT:    vpbroadcastq {{.*#+}} ymm0 = [256,256,256,256]
+; AVX512POPCNT-NEXT:    vpcompressq %ymm1, %ymm0 {%k1}
+; AVX512POPCNT-NEXT:    vmovd %xmm0, %eax
 ; AVX512POPCNT-NEXT:    vzeroupper
 ; AVX512POPCNT-NEXT:    retq
   %a0 = bitcast <8 x i32> %v0 to i256
@@ -6484,72 +6388,44 @@ define i32 @vector_cttz_undef_i256(<8 x i32> %v0) nounwind {
 ;
 ; AVX512F-LABEL: vector_cttz_undef_i256:
 ; AVX512F:       # %bb.0:
-; AVX512F-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512F-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512F-NEXT:    vmovq %xmm1, %rcx
-; AVX512F-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512F-NEXT:    vmovq %xmm0, %rsi
-; AVX512F-NEXT:    tzcntq %rsi, %rdi
-; AVX512F-NEXT:    tzcntq %rdx, %r8
-; AVX512F-NEXT:    addl $64, %r8d
-; AVX512F-NEXT:    testq %rsi, %rsi
-; AVX512F-NEXT:    cmovnel %edi, %r8d
-; AVX512F-NEXT:    tzcntq %rcx, %rdi
-; AVX512F-NEXT:    tzcntq %rax, %rax
-; AVX512F-NEXT:    addl $64, %eax
-; AVX512F-NEXT:    testq %rcx, %rcx
-; AVX512F-NEXT:    cmovnel %edi, %eax
-; AVX512F-NEXT:    subl $-128, %eax
-; AVX512F-NEXT:    orq %rdx, %rsi
-; AVX512F-NEXT:    cmovnel %r8d, %eax
-; AVX512F-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512F-NEXT:    # kill: def $ymm0 killed $ymm0 def $zmm0
+; AVX512F-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
+; AVX512F-NEXT:    vpaddq %ymm1, %ymm0, %ymm1
+; AVX512F-NEXT:    vpandn %ymm1, %ymm0, %ymm1
+; AVX512F-NEXT:    vplzcntq %zmm1, %zmm1
+; AVX512F-NEXT:    vmovdqa {{.*#+}} ymm2 = [64,128,192,256]
+; AVX512F-NEXT:    vpsubq %ymm1, %ymm2, %ymm1
+; AVX512F-NEXT:    vptestmq %zmm0, %zmm0, %k0
+; AVX512F-NEXT:    kshiftlw $12, %k0, %k0
+; AVX512F-NEXT:    kshiftrw $12, %k0, %k1
+; AVX512F-NEXT:    vpcompressq %zmm1, %zmm0 {%k1} {z}
+; AVX512F-NEXT:    vmovd %xmm0, %eax
 ; AVX512F-NEXT:    retq
 ;
 ; AVX512VL-LABEL: vector_cttz_undef_i256:
 ; AVX512VL:       # %bb.0:
-; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512VL-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512VL-NEXT:    vmovq %xmm1, %rcx
-; AVX512VL-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512VL-NEXT:    vmovq %xmm0, %rsi
-; AVX512VL-NEXT:    tzcntq %rsi, %rdi
-; AVX512VL-NEXT:    tzcntq %rdx, %r8
-; AVX512VL-NEXT:    addl $64, %r8d
-; AVX512VL-NEXT:    testq %rsi, %rsi
-; AVX512VL-NEXT:    cmovnel %edi, %r8d
-; AVX512VL-NEXT:    tzcntq %rcx, %rdi
-; AVX512VL-NEXT:    tzcntq %rax, %rax
-; AVX512VL-NEXT:    addl $64, %eax
-; AVX512VL-NEXT:    testq %rcx, %rcx
-; AVX512VL-NEXT:    cmovnel %edi, %eax
-; AVX512VL-NEXT:    subl $-128, %eax
-; AVX512VL-NEXT:    orq %rdx, %rsi
-; AVX512VL-NEXT:    cmovnel %r8d, %eax
-; AVX512VL-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512VL-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
+; AVX512VL-NEXT:    vpaddq %ymm1, %ymm0, %ymm1
+; AVX512VL-NEXT:    vpandn %ymm1, %ymm0, %ymm1
+; AVX512VL-NEXT:    vmovdqa {{.*#+}} ymm2 = [64,128,192,256]
+; AVX512VL-NEXT:    vplzcntq %ymm1, %ymm1
+; AVX512VL-NEXT:    vpsubq %ymm1, %ymm2, %ymm1
+; AVX512VL-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512VL-NEXT:    vpcompressq %ymm1, %ymm0 {%k1} {z}
+; AVX512VL-NEXT:    vmovd %xmm0, %eax
 ; AVX512VL-NEXT:    vzeroupper
 ; AVX512VL-NEXT:    retq
 ;
 ; AVX512POPCNT-LABEL: vector_cttz_undef_i256:
 ; AVX512POPCNT:       # %bb.0:
-; AVX512POPCNT-NEXT:    vextracti128 $1, %ymm0, %xmm1
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm1, %rax
-; AVX512POPCNT-NEXT:    vmovq %xmm1, %rcx
-; AVX512POPCNT-NEXT:    vpextrq $1, %xmm0, %rdx
-; AVX512POPCNT-NEXT:    vmovq %xmm0, %rsi
-; AVX512POPCNT-NEXT:    tzcntq %rsi, %rdi
-; AVX512POPCNT-NEXT:    tzcntq %rdx, %r8
-; AVX512POPCNT-NEXT:    addl $64, %r8d
-; AVX512POPCNT-NEXT:    testq %rsi, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %edi, %r8d
-; AVX512POPCNT-NEXT:    tzcntq %rcx, %rdi
-; AVX512POPCNT-NEXT:    tzcntq %rax, %rax
-; AVX512POPCNT-NEXT:    addl $64, %eax
-; AVX512POPCNT-NEXT:    testq %rcx, %rcx
-; AVX512POPCNT-NEXT:    cmovnel %edi, %eax
-; AVX512POPCNT-NEXT:    subl $-128, %eax
-; AVX512POPCNT-NEXT:    orq %rdx, %rsi
-; AVX512POPCNT-NEXT:    cmovnel %r8d, %eax
-; AVX512POPCNT-NEXT:    # kill: def $eax killed $eax killed $rax
+; AVX512POPCNT-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vpaddq %ymm1, %ymm0, %ymm1
+; AVX512POPCNT-NEXT:    vpandn %ymm1, %ymm0, %ymm1
+; AVX512POPCNT-NEXT:    vpopcntq %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vpaddq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm1, %ymm1
+; AVX512POPCNT-NEXT:    vptestmq %ymm0, %ymm0, %k1
+; AVX512POPCNT-NEXT:    vpcompressq %ymm1, %ymm0 {%k1} {z}
+; AVX512POPCNT-NEXT:    vmovd %xmm0, %eax
 ; AVX512POPCNT-NEXT:    vzeroupper
 ; AVX512POPCNT-NEXT:    retq
   %a0 = bitcast <8 x i32> %v0 to i256
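
For readers following the updated CHECK lines, the per-lane strategy can be sketched in scalar C++. This is only a model of what the AVX512VL sequences appear to compute, not code from the patch: `U256`, `lzcnt64`, `ctlz256` and `cttz256` are hypothetical names, and the lane offsets `[0,64,128,192]` for the ctlz case are an assumption, since that constant is loaded from a constant pool entry that the checks do not spell out.

```cpp
#include <array>
#include <cstdint>

// A 256-bit integer as four 64-bit limbs, index 0 is the least significant.
using U256 = std::array<std::uint64_t, 4>;

// Per-lane leading zero count; returns 64 for zero, like vplzcntq.
unsigned lzcnt64(std::uint64_t V) {
  unsigned N = 0;
  for (std::uint64_t Bit = 1ULL << 63; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// ctlz: visit limbs from most to least significant (the vpermq [3,2,1,0]
// reversal), add the assumed lane offset, and take the first nonzero lane,
// which is what element 0 of the masked vpcompressq result holds.
unsigned ctlz256(const U256 &X) {
  for (unsigned Lane = 0; Lane < 4; ++Lane) {
    std::uint64_t Limb = X[3 - Lane];
    if (Limb != 0) // lane selected by the vptestmq mask
      return lzcnt64(Limb) + 64 * Lane;
  }
  return 256; // no lane selected: the [256,256,256,256] pass-through survives
}

// cttz: ~x & (x - 1) isolates the trailing zero bits of a limb (vpaddq with
// all-ones, then vpandn); subtracting its lzcnt from [64,128,192,256] gives
// the trailing zero count plus 64 * lane.
unsigned cttz256(const U256 &X) {
  for (unsigned Lane = 0; Lane < 4; ++Lane) {
    std::uint64_t Limb = X[Lane];
    if (Limb != 0) {
      std::uint64_t TrailingMask = ~Limb & (Limb - 1);
      return 64 * (Lane + 1) - lzcnt64(TrailingMask);
    }
  }
  return 256;
}
```

The `_undef` variants differ only in the all-zero case: they use a zeroing compress (`{z}`) instead of merging into the broadcast of 256, which is acceptable there because a zero input has no defined result.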

}

// Return true if its cheap to bitcast this to a vector type.
static bool mayFoldToVector(SDValue Op, const X86Subtarget &Subtarget) {
Contributor

mayFoldIntoVector to match above?

@RKSimon RKSimon requested a review from phoebewang December 10, 2025 10:48
Contributor

@phoebewang phoebewang left a comment

LGTM.

@RKSimon RKSimon enabled auto-merge (squash) December 10, 2025 11:50
@RKSimon RKSimon merged commit 11e457c into llvm:main Dec 10, 2025
9 of 10 checks passed
@RKSimon RKSimon deleted the x86-cntbits-from-vector branch December 10, 2025 12:26
Collaborator

llvm-ci commented Dec 10, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot4 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/17882

Here is the relevant piece of the build log for reference:
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:564: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:74: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 94458 tests, 64 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.
FAIL: LLVM :: CodeGen/X86/basic-block-sections-clusters-bb-hash.ll (70398 of 94458)
******************** TEST 'LLVM :: CodeGen/X86/basic-block-sections-clusters-bb-hash.ll' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 11
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll -O0 -mtriple=x86_64-pc-linux -function-sections -filetype=obj -basic-block-address-map -emit-bb-hash -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp.o
# executed command: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll -O0 -mtriple=x86_64-pc-linux -function-sections -filetype=obj -basic-block-address-map -emit-bb-hash -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp.o
# note: command had no output on stdout or stderr
# RUN: at line 16
echo 'v1' > /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1
# executed command: echo v1
# note: command had no output on stdout or stderr
# RUN: at line 17
echo 'f foo' >> /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1
# executed command: echo 'f foo'
# note: command had no output on stdout or stderr
# RUN: at line 18
echo 'g 0:100,1:100,2:0 1:100,3:100 2:0,3:0 3:100' >> /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1
# executed command: echo 'g 0:100,1:100,2:0 1:100,3:100 2:0,3:0 3:100'
# note: command had no output on stdout or stderr
# RUN: at line 22
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-readobj /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp.o --bb-addr-map |  awk 'BEGIN {printf "h"}      /ID: [0-9]+/ {id=$2}      /Hash: 0x[0-9A-Fa-f]+/ {gsub(/^0x/, "", $2); hash=$2; printf " %s:%s", id, hash}      END {print ""}'  >> /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1
# executed command: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-readobj /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp.o --bb-addr-map
# note: command had no output on stdout or stderr
# executed command: awk 'BEGIN {printf "h"}      /ID: [0-9]+/ {id=$2}      /Hash: 0x[0-9A-Fa-f]+/ {gsub(/^0x/, "", $2); hash=$2; printf " %s:%s", id, hash}      END {print ""}'
# note: command had no output on stdout or stderr
# RUN: at line 29
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc < /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll -O0 -mtriple=x86_64-pc-linux -function-sections -basic-block-sections=/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1 -basic-block-section-match-infer |  /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll -check-prefixes=CHECK,LINUX-SECTIONS1
# executed command: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -mtriple=x86_64-pc-linux -function-sections -basic-block-sections=/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/CodeGen/X86/Output/basic-block-sections-clusters-bb-hash.ll.tmp1 -basic-block-section-match-infer
# note: command had no output on stdout or stderr
# executed command: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll -check-prefixes=CHECK,LINUX-SECTIONS1
# .---command stderr------------
# | /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/X86/basic-block-sections-clusters-bb-hash.ll:80:26: error: LINUX-SECTIONS1-LABEL: expected string not found in input
# | ; LINUX-SECTIONS1-LABEL: # %bb.1:
# |                          ^
# | <stdin>:6:5: note: scanning from here
# | foo: # @foo
Step 14 (stage2/msan check) failure: stage2/msan check (failure)

RKSimon added a commit that referenced this pull request Dec 11, 2025
…171616)

If the scalar integer sources are freely transferable to the FPU, then perform the bitlogic op as an SSE/AVX operation.

Uses the mayFoldIntoVector helper added in #171589