[RISCV] Disable performCombineVMergeAndVOps for PseudoVIOTA_M. #71483

yetingk · 2023-11-07T04:09:18Z

This transformation might be illegal for PseudoVIOTA_M. The value of viota.m vd, vs2 is the prefix sum of vs2, and adding a mask to it may produce a wrong prefix sum.
For example, the result of the following sequence is {5, 5, 5, 3}:

; v4 = {1, 1, 1, 1}
viota.m v1, v4
; v0 = {0, 0, 0, 1}, v1 = {0, 1, 2, 3}, v8 = {5, 5, 5, 5}
vmerge.vvm v8, v8, v1, v0.t
; v8 = {5, 5, 5, 3}

but if we merge them into viota.m v8, v4, v0.t, then the result is {5, 5, 5, 0}.
We still perform performCombineVMergeAndVOps for viota.m when the mask of vmerge.vvm is a true (all-ones) mask.
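
The example above can be checked with a small scalar model. This is only a sketch: `viota`, `vmerge`, `unfolded`, and `folded` are hypothetical helper names that are not part of LLVM, vectors are modelled as `std::vector<int>`, and masked-off elements are assumed to keep their old value (mask-undisturbed).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Simplified model of viota.m: each active destination element receives the
// number of set bits of vs2 at lower, active indices. Inactive elements keep
// the value already in vd.
Vec viota(Vec vd, const Vec &vs2, const Vec &mask) {
  assert(vd.size() == vs2.size() && vd.size() == mask.size());
  int count = 0;
  for (std::size_t i = 0; i < vd.size(); ++i) {
    if (!mask[i])
      continue; // inactive: neither written nor counted
    vd[i] = count;
    count += (vs2[i] != 0);
  }
  return vd;
}

// Simplified model of vmerge.vvm: take onTrue where the mask bit is set.
Vec vmerge(const Vec &onFalse, const Vec &onTrue, const Vec &mask) {
  assert(onFalse.size() == onTrue.size() && onFalse.size() == mask.size());
  Vec out(onFalse);
  for (std::size_t i = 0; i < out.size(); ++i)
    if (mask[i])
      out[i] = onTrue[i];
  return out;
}

// Unmasked viota.m followed by vmerge.vvm, as in the description.
Vec unfolded(const Vec &passthru, const Vec &vs2, const Vec &mask) {
  Vec allOnes(vs2.size(), 1);
  Vec iota = viota(Vec(vs2.size(), 0), vs2, allOnes);
  return vmerge(passthru, iota, mask);
}

// The single masked viota.m that the peephole would produce.
Vec folded(const Vec &passthru, const Vec &vs2, const Vec &mask) {
  return viota(passthru, vs2, mask);
}
```

With passthru {5, 5, 5, 5}, vs2 {1, 1, 1, 1} and mask {0, 0, 0, 1}, `unfolded` gives {5, 5, 5, 3} while `folded` gives {5, 5, 5, 0}. With an all-ones mask the two agree, which is why the fold is still allowed in that case.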


llvmbot · 2023-11-07T04:09:56Z

@llvm/pr-subscribers-backend-risc-v

Author: Yeting Kuo (yetingk)

Changes

This transformation is illegal for PseudoVIOTA_M. The value of viota.m vd, vs2 is the prefix sum of vs2, and adding a mask to it may produce a wrong prefix sum.

For example, the result of the following sequence is {5, 5, 5, 3}:

; v4 = {1, 1, 1, 1}
viota.m v1, v4
; v0 = {0, 0, 0, 1}, v1 = {0, 1, 2, 3}, v8 = {5, 5, 5, 5}
vmerge.vvm v8, v8, v1, v0.t
; v8 = {5, 5, 5, 3}

but if we merge them into viota.m v8, v4, v0.t, then the result is {5, 5, 5, 0}.

We still perform the transformation when the mask of vmerge.vvm is a true (all-ones) mask.

Full diff: https://github.com/llvm/llvm-project/pull/71483.diff

3 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+13)
(modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-masked-vops.ll (+16)
(modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+23-5)

diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 51a235bf2ca1861..f103d323648d16a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3501,6 +3501,19 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
   if (!True.isMachineOpcode())
     return false;
 
+  // This transformation is illegal for viota.m when Mask is not a true mask.
+  switch (True->getMachineOpcode()) {
+  case RISCV::PseudoVIOTA_M_MF8:
+  case RISCV::PseudoVIOTA_M_MF4:
+  case RISCV::PseudoVIOTA_M_MF2:
+  case RISCV::PseudoVIOTA_M_M1:
+  case RISCV::PseudoVIOTA_M_M2:
+  case RISCV::PseudoVIOTA_M_M4:
+  case RISCV::PseudoVIOTA_M_M8:
+    if (Mask && !usesAllOnesMask(Mask, Glue))
+      return false;
+  }
+
   unsigned TrueOpc = True.getMachineOpcode();
   const MCInstrDesc &TrueMCID = TII->get(TrueOpc);
   uint64_t TrueTSFlags = TrueMCID.TSFlags;
diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-masked-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-masked-vops.ll
index 7e137d6a6196921..f6d9d1e711e7169 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-masked-vops.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-masked-vops.ll
@@ -258,3 +258,19 @@ entry:
   %res = call <vscale x 2 x i32> @llvm.vp.merge.nxv2i32(<vscale x 2 x i1> %m, <vscale x 2 x i32> %i, <vscale x 2 x i32> %passthru, i32 %evl)
   ret <vscale x 2 x i32> %res
 }
+
+; Test VIOTA_M
+declare <vscale x 2 x i32> @llvm.riscv.viota.mask.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i1>,  <vscale x 2 x i1>, i64, i64)
+define <vscale x 2 x i32> @vpmerge_viota(<vscale x 2 x i32> %passthru, <vscale x 2 x i1> %m, <vscale x 2 x i1> %vm, i32 zeroext %vl) {
+; CHECK-LABEL: vpmerge_viota:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, mu
+; CHECK-NEXT:    viota.m v8, v9, v0.t
+; CHECK-NEXT:    ret
+  %1 = zext i32 %vl to i64
+  %a = call <vscale x 2 x i32> @llvm.riscv.viota.mask.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> %vm, <vscale x 2 x i1> %m, i64 %1, i64 0)
+  %splat = insertelement <vscale x 2 x i1> poison, i1 -1, i32 0
+  %mask = shufflevector <vscale x 2 x i1> %splat, <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer
+  %b = call <vscale x 2 x i32> @llvm.riscv.vmerge.nxv2i32.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a, <vscale x 2 x i1> %mask, i64 %1)
+  ret <vscale x 2 x i32> %b
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
index 1ea2b7ef57cf081..df119435611d167 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
@@ -279,13 +279,15 @@ define <vscale x 2 x i32> @vpmerge_vid(<vscale x 2 x i32> %passthru, <vscale x 2
   ret <vscale x 2 x i32> %b
 }
 
-; Test riscv.viota
+; Test not combine VIOTA_M and VMERGE_VVM without true mask.
 declare <vscale x 2 x i32> @llvm.riscv.viota.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i1>, i64)
 define <vscale x 2 x i32> @vpmerge_viota(<vscale x 2 x i32> %passthru, <vscale x 2 x i1> %m, <vscale x 2 x i1> %vm, i32 zeroext %vl) {
 ; CHECK-LABEL: vpmerge_viota:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, mu
-; CHECK-NEXT:    viota.m v8, v9, v0.t
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    viota.m v10, v9
+; CHECK-NEXT:    vsetvli zero, zero, e32, m1, tu, ma
+; CHECK-NEXT:    vmerge.vvm v8, v8, v10, v0
 ; CHECK-NEXT:    ret
   %1 = zext i32 %vl to i64
   %a = call <vscale x 2 x i32> @llvm.riscv.viota.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> %vm, i64 %1)
@@ -293,6 +295,21 @@ define <vscale x 2 x i32> @vpmerge_viota(<vscale x 2 x i32> %passthru, <vscale x
   ret <vscale x 2 x i32> %b
 }
 
+; Test combine VIOTA_M and VMERGE_VVM with true mask.
+define <vscale x 2 x i32> @vpmerge_viota2(<vscale x 2 x i32> %passthru, <vscale x 2 x i1> %vm, i32 zeroext %vl) {
+; CHECK-LABEL: vpmerge_viota2:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
+; CHECK-NEXT:    viota.m v8, v0
+; CHECK-NEXT:    ret
+  %1 = zext i32 %vl to i64
+  %a = call <vscale x 2 x i32> @llvm.riscv.viota.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> %vm, i64 %1)
+  %splat = insertelement <vscale x 2 x i1> poison, i1 -1, i32 0
+  %true = shufflevector <vscale x 2 x i1> %splat, <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer
+  %b = call <vscale x 2 x i32> @llvm.vp.merge.nxv2i32(<vscale x 2 x i1> %true, <vscale x 2 x i32> %a, <vscale x 2 x i32> %passthru, i32 %vl)
+  ret <vscale x 2 x i32> %b
+}
+
 ; Test riscv.vfclass
 declare <vscale x 2 x i32> @llvm.riscv.vfclass.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x float>, i64)
 define <vscale x 2 x i32> @vpmerge_vflcass(<vscale x 2 x i32> %passthru, <vscale x 2 x float> %vf, <vscale x 2 x i1> %m, i32 zeroext %vl) {
@@ -730,8 +747,9 @@ define <vscale x 2 x i32> @vpselect_vid(<vscale x 2 x i32> %passthru, <vscale x
 define <vscale x 2 x i32> @vpselect_viota(<vscale x 2 x i32> %passthru, <vscale x 2 x i1> %m, <vscale x 2 x i1> %vm, i32 zeroext %vl) {
 ; CHECK-LABEL: vpselect_viota:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, mu
-; CHECK-NEXT:    viota.m v8, v9, v0.t
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    viota.m v10, v9
+; CHECK-NEXT:    vmerge.vvm v8, v8, v10, v0
 ; CHECK-NEXT:    ret
   %1 = zext i32 %vl to i64
   %a = call <vscale x 2 x i32> @llvm.riscv.viota.nxv2i32(<vscale x 2 x i32> undef, <vscale x 2 x i1> %vm, i64 %1)

lukel97 · 2023-11-07T04:15:40Z

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

+  switch (True->getMachineOpcode()) {
+  case RISCV::PseudoVIOTA_M_MF8:
+  case RISCV::PseudoVIOTA_M_MF4:
+  case RISCV::PseudoVIOTA_M_MF2:
+  case RISCV::PseudoVIOTA_M_M1:
+  case RISCV::PseudoVIOTA_M_M2:
+  case RISCV::PseudoVIOTA_M_M4:
+  case RISCV::PseudoVIOTA_M_M8:
+    if (Mask && !usesAllOnesMask(Mask, Glue))
+      return false;
+  }


Could we use that new getRVVMCOpcode helper?

Suggested change

-  switch (True->getMachineOpcode()) {
-  case RISCV::PseudoVIOTA_M_MF8:
-  case RISCV::PseudoVIOTA_M_MF4:
-  case RISCV::PseudoVIOTA_M_MF2:
-  case RISCV::PseudoVIOTA_M_M1:
-  case RISCV::PseudoVIOTA_M_M2:
-  case RISCV::PseudoVIOTA_M_M4:
-  case RISCV::PseudoVIOTA_M_M8:
-    if (Mask && !usesAllOnesMask(Mask, Glue))
-      return false;
-  }
+  if (getRVVMCOpcode(True->getMachineOpcode()) == RISCV::VIOTA_M && !usesAllOnesMask(Mask, Glue))
+    return false;

yetingk

Thank you. I made a similar refinement in 540ce31.

lukel97

LGTM. I wonder if we should add a bit to RISCVMaskedPseudo so that we can mark pseudos where changing the enabled elements affects semantics? I think the reduction instructions are in the same boat, except we don't mark them with RISCVMaskedPseudo at all, which presumably means we can't convert from an all-ones mask -> unmasked.

yetingk · 2023-11-07T04:37:03Z

> Which presumably means we can't convert from an all ones mask -> unmasked.

I think we can still do this transformation for viota.m, since an all-ones mask does not influence the prefix sum calculation.

lukel97 · 2023-11-07T04:40:53Z

> I think we can still do this transformation for viota.m, since an all-ones mask does not influence the prefix sum calculation.

Sorry, yes. I meant that it's correct and we should do this transform for viota.m, but we should also do it for vredsum.vs, since we don't do it currently.
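
To illustrate why reductions are in the same boat, here is a minimal scalar sketch of vredsum.vs. The function names are hypothetical, only element 0 of the result is modelled, and vs1[0] is passed as a plain integer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of a masked vredsum.vs: the result is vs1[0] plus the sum
// of the active elements of vs2.
int vredsumMasked(const std::vector<int> &vs2, int vs1_0,
                  const std::vector<int> &mask) {
  assert(vs2.size() == mask.size());
  int sum = vs1_0;
  for (std::size_t i = 0; i < vs2.size(); ++i)
    if (mask[i])
      sum += vs2[i];
  return sum;
}

// Unmasked form: every element of vs2 contributes.
int vredsumUnmasked(const std::vector<int> &vs2, int vs1_0) {
  int sum = vs1_0;
  for (int v : vs2)
    sum += v;
  return sum;
}
```

An all-ones mask gives the same value as the unmasked form, so converting masked to unmasked is safe there. Any other mask changes which elements contribute, so folding an arbitrary vmerge mask into the reduction would change the result.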

yetingk · 2023-11-07T04:44:31Z

Sorry, I misunderstood it and your comment was actually very clear.

> I wonder if we should add a bit to RISCVMaskedPseudo so that we can mark pseudos where changing the enabled elements affects semantics?

I think your idea is great.

yetingk · 2023-11-07T05:16:30Z

Addressed @lukel97's idea by adding a bit to RISCVMaskedPseudoInfo that indicates whether the result of the operation depends on the mask.

yetingk · 2023-11-07T05:17:54Z

llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td

  Pseudo MaskedPseudo = !cast<Pseudo>(NAME);
  Pseudo UnmaskedPseudo = !cast<Pseudo>(!subst("_MASK", "", NAME));
  bits<4> MaskOpIdx = MaskIdx;
+  bit IsAccumulatedOp = IsAcc;


Is there a better, more general name than IsAccumulatedOp?

lukel97

Maybe something more generic like MaskAffectsResult?

yetingk

Thank you for your opinion.

lukel97

LGTM x2
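
The decision that the new bit encodes can be sketched as follows. This is a toy model rather than the code in RISCVISelDAGToDAG.cpp: `MaskedPseudoInfoSketch` and `canFoldVMergeMask` are hypothetical names, and only the MaskAffectsResult bit from this thread is modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for one RISCVMaskedPseudoInfo table entry.
struct MaskedPseudoInfoSketch {
  bool MaskAffectsResult; // true for viota.m-like operations
};

// Returns true if a vmerge with the given mask may be folded into the pseudo.
// When the mask affects the computed values, only an all-ones mask is safe.
bool canFoldVMergeMask(const MaskedPseudoInfoSketch &info, bool maskIsAllOnes) {
  return !info.MaskAffectsResult || maskIsAllOnes;
}
```

For an entry with MaskAffectsResult set, the fold is rejected unless the vmerge mask is known to be all ones. Entries without the bit are unaffected by this check.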

After llvm#71483 we now have a way of marking masked pseudos as having an unmasked equivalent, but their mask shouldn't be folded unless it's all ones since it would affect the result. This patch uses it to mark the pseudos for vredsum and friends, which in turn allows us to remove the unmasked patterns and remove vmerges entirely if it's known to have an all ones mask.

After #71483 we now have a way of marking masked pseudos as having an unmasked equivalent, but their mask shouldn't be folded unless it's all ones since it would affect the result. This patch uses it to mark the pseudos for vredsum and friends, which in turn allows us to remove the unmasked patterns, and catch some other forms of vmerge.

yetingk requested review from asb, preames, frasercrmck, lukel97 and topperc November 7, 2023 04:09

llvmbot added the backend:RISC-V label Nov 7, 2023

lukel97 reviewed Nov 7, 2023

Use getRVVMCOpcode to avoid listing all pseudo instructions.

540ce31

lukel97 approved these changes Nov 7, 2023

Add isAccumulated bit for RISCVMaskedPseudoInfo.

a5bfcd0

yetingk commented Nov 7, 2023

Rename IsAccumulatedOp to MaskAffectsResult.

aed70ca

lukel97 approved these changes Nov 7, 2023

yetingk merged commit a5c1eca into llvm:main Nov 7, 2023

lukel97 mentioned this pull request Nov 7, 2023

[RISCV] Use masked pseudo peephole for reduction pseudos #71508

Merged

This was referenced Nov 8, 2023

fix empty bb maksfb/llvm-project#2

Closed

fix empty bb maksfb/llvm-project#3

Closed
