[RISCV] Ensure the valid vtype during copyPhysReg #118252
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Piyou Chen (BeMg)

Changes

Address: #114518

This patch inserts a VSETIVLI instruction before each whole-RVVReg move during the copyPhysReg hook. It ensures that every COPY pseudo-instruction expanded for a whole RVVReg has a valid vtype, so that an invalid vtype cannot cause an illegal-instruction trap.

There may be a performance regression due to redundant VSETVLI insertions, because the VSETVLInfo from data-flow analysis is not available here. There are two ways to reduce the redundant instructions:

This patch cannot handle some RVVReg moves, since they don't originate from a COPY pseudo-instruction but are instead generated directly from an intrinsic function (__riscv_vmv_v) or from pseudo-instructions like PseudoVMV_V_V_M4.

Patch is 1.29 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118252.diff

174 Files Affected:
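Side note (mine, not part of the PR description; it assumes the usual RISCVInstrInfo.cpp includes): the vtype operand the patch hands to PseudoVSETIVLI is encoded as shown below, and together with rd = zero and the immediate AVL 0 it is what the new CHECK lines in the tests print as "vsetivli zero, 0, e32, m1, tu, mu".

#include "MCTargetDesc/RISCVBaseInfo.h" // RISCVVType::encodeVTYPE, RISCVII::VLMUL

// SEW=32, LMUL=1, tail-undisturbed (tu), mask-undisturbed (mu).
unsigned VType = RISCVVType::encodeVTYPE(RISCVII::VLMUL::LMUL_1, /*SEW=*/32,
                                         /*TailAgnostic=*/false,
                                         /*MaskAgnostic=*/false);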
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 47273d6bc06d65..5a93a13f3ae480 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -421,6 +421,36 @@ void RISCVInstrInfo::copyPhysRegVector(
auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), ActualDstReg);
bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opc) == RISCV::VMV_V_I;
bool UseVMV = UseVMV_V_I || RISCV::getRVVMCOpcode(Opc) == RISCV::VMV_V_V;
+
+ // Address https://github.com/llvm/llvm-project/issues/114518
+ // Make sure each whole RVVReg move has valid vtype.
+ unsigned Opcode = MIB->getOpcode();
+ if (UseVMV || Opcode == RISCV::VMV1R_V || Opcode == RISCV::VMV2R_V ||
+ Opcode == RISCV::VMV4R_V || Opcode == RISCV::VMV8R_V) {
+
+ // TODO: Data-flow analysis for vtype status could help avoid the
+ // redundant one.
+ bool NeedVSETIVLI = true;
+
+ for (auto &CurrMI : MBB) {
+ unsigned CurrMIOpcode = CurrMI.getOpcode();
+ if (CurrMIOpcode == RISCV::PseudoVSETIVLI ||
+ CurrMIOpcode == RISCV::PseudoVSETVLI ||
+ CurrMIOpcode == RISCV::PseudoVSETVLIX0)
+ NeedVSETIVLI = false;
+ else if (CurrMI.isInlineAsm())
+ NeedVSETIVLI = true;
+ else if (NeedVSETIVLI && &CurrMI == &*MIB) {
+ BuildMI(MBB, &*MIB, MIB->getDebugLoc(), get(RISCV::PseudoVSETIVLI))
+ .addReg(RISCV::X0, RegState::Define | RegState::Dead)
+ .addImm(0)
+ .addImm(RISCVVType::encodeVTYPE(RISCVII::VLMUL::LMUL_1, 32, false,
+ false));
+ break;
+ }
+ }
+ }
+
if (UseVMV)
MIB.addReg(ActualDstReg, RegState::Undef);
if (UseVMV_V_I)
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll b/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
index c04e4fea7b2c29..77ffdd9ae934a6 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
@@ -45,6 +45,7 @@ define <vscale x 1 x i8> @constraint_vd(<vscale x 1 x i8> %0, <vscale x 1 x i8>
define <vscale x 1 x i1> @constraint_vm(<vscale x 1 x i1> %0, <vscale x 1 x i1> %1) nounwind {
; RV32I-LABEL: constraint_vm:
; RV32I: # %bb.0:
+; RV32I-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV32I-NEXT: vmv1r.v v9, v0
; RV32I-NEXT: vmv1r.v v0, v8
; RV32I-NEXT: #APP
@@ -54,6 +55,7 @@ define <vscale x 1 x i1> @constraint_vm(<vscale x 1 x i1> %0, <vscale x 1 x i1>
;
; RV64I-LABEL: constraint_vm:
; RV64I: # %bb.0:
+; RV64I-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV64I-NEXT: vmv1r.v v9, v0
; RV64I-NEXT: vmv1r.v v0, v8
; RV64I-NEXT: #APP
diff --git a/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll b/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
index 163d9145bc3623..46b76af7fffdad 100644
--- a/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
@@ -567,6 +567,7 @@ define <vscale x 16 x i64> @vp_abs_nxv16i64(<vscale x 16 x i64> %va, <vscale x 1
; CHECK-NEXT: slli a1, a1, 4
; CHECK-NEXT: sub sp, sp, a1
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v24, v0
; CHECK-NEXT: csrr a1, vlenb
; CHECK-NEXT: slli a1, a1, 3
@@ -590,6 +591,7 @@ define <vscale x 16 x i64> @vp_abs_nxv16i64(<vscale x 16 x i64> %va, <vscale x 1
; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: mv a0, a1
; CHECK-NEXT: .LBB46_2:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: slli a1, a1, 3
; CHECK-NEXT: add a1, sp, a1
diff --git a/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll b/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
index 66a1178cddb66c..b2a08b1e077932 100644
--- a/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
@@ -3075,6 +3075,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
; CHECK-NEXT: slli a1, a1, 4
; CHECK-NEXT: sub sp, sp, a1
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v24, v0
; CHECK-NEXT: csrr a1, vlenb
; CHECK-NEXT: slli a1, a1, 3
@@ -3121,6 +3122,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: mv a0, a3
; CHECK-NEXT: .LBB46_2:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: csrr a3, vlenb
; CHECK-NEXT: slli a3, a3, 3
@@ -3158,6 +3160,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
;
; CHECK-ZVBB-LABEL: vp_bitreverse_nxv64i16:
; CHECK-ZVBB: # %bb.0:
+; CHECK-ZVBB-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-ZVBB-NEXT: vmv1r.v v24, v0
; CHECK-ZVBB-NEXT: csrr a1, vlenb
; CHECK-ZVBB-NEXT: srli a2, a1, 1
@@ -3174,6 +3177,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
; CHECK-ZVBB-NEXT: # %bb.1:
; CHECK-ZVBB-NEXT: mv a0, a1
; CHECK-ZVBB-NEXT: .LBB46_2:
+; CHECK-ZVBB-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-ZVBB-NEXT: vmv1r.v v0, v24
; CHECK-ZVBB-NEXT: vsetvli zero, a0, e16, m8, ta, ma
; CHECK-ZVBB-NEXT: vbrev.v v8, v8, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll b/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
index 1c95ec8fafd4f1..04d3c00463dd74 100644
--- a/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
@@ -1584,6 +1584,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
; CHECK-NEXT: slli a1, a1, 4
; CHECK-NEXT: sub sp, sp, a1
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v24, v0
; CHECK-NEXT: csrr a1, vlenb
; CHECK-NEXT: slli a1, a1, 3
@@ -1609,6 +1610,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: mv a0, a1
; CHECK-NEXT: .LBB32_2:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v0, v24
; CHECK-NEXT: csrr a1, vlenb
; CHECK-NEXT: slli a1, a1, 3
@@ -1631,6 +1633,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
;
; CHECK-ZVKB-LABEL: vp_bswap_nxv64i16:
; CHECK-ZVKB: # %bb.0:
+; CHECK-ZVKB-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-ZVKB-NEXT: vmv1r.v v24, v0
; CHECK-ZVKB-NEXT: csrr a1, vlenb
; CHECK-ZVKB-NEXT: srli a2, a1, 1
@@ -1647,6 +1650,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
; CHECK-ZVKB-NEXT: # %bb.1:
; CHECK-ZVKB-NEXT: mv a0, a1
; CHECK-ZVKB-NEXT: .LBB32_2:
+; CHECK-ZVKB-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-ZVKB-NEXT: vmv1r.v v0, v24
; CHECK-ZVKB-NEXT: vsetvli zero, a0, e16, m8, ta, ma
; CHECK-ZVKB-NEXT: vrev8.v v8, v8, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
index a4e5ab661c5285..5c533f042b4e53 100644
--- a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
@@ -336,6 +336,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_i32(<vsca
; RV32-NEXT: add a1, a3, a1
; RV32-NEXT: li a3, 2
; RV32-NEXT: vs8r.v v16, (a1)
+; RV32-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV32-NEXT: vmv8r.v v8, v0
; RV32-NEXT: vmv8r.v v16, v24
; RV32-NEXT: call ext2
@@ -374,6 +375,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_i32(<vsca
; RV64-NEXT: add a1, a3, a1
; RV64-NEXT: li a3, 2
; RV64-NEXT: vs8r.v v16, (a1)
+; RV64-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV64-NEXT: vmv8r.v v8, v0
; RV64-NEXT: vmv8r.v v16, v24
; RV64-NEXT: call ext2
@@ -451,6 +453,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_nxv32i32_
; RV32-NEXT: add a1, sp, a1
; RV32-NEXT: addi a1, a1, 128
; RV32-NEXT: vl8r.v v8, (a1) # Unknown-size Folded Reload
+; RV32-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV32-NEXT: vmv8r.v v16, v0
; RV32-NEXT: call ext3
; RV32-NEXT: addi sp, s0, -144
@@ -523,6 +526,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_nxv32i32_
; RV64-NEXT: add a1, sp, a1
; RV64-NEXT: addi a1, a1, 128
; RV64-NEXT: vl8r.v v8, (a1) # Unknown-size Folded Reload
+; RV64-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV64-NEXT: vmv8r.v v16, v0
; RV64-NEXT: call ext3
; RV64-NEXT: addi sp, s0, -144
diff --git a/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll b/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
index 9b27116fef7cae..068fdad8a4ab3e 100644
--- a/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
@@ -103,6 +103,7 @@ define target("riscv.vector.tuple", <vscale x 16 x i8>, 2) @caller_tuple_return(
; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
; RV32-NEXT: .cfi_offset ra, -4
; RV32-NEXT: call callee_tuple_return
+; RV32-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV32-NEXT: vmv2r.v v6, v8
; RV32-NEXT: vmv2r.v v8, v10
; RV32-NEXT: vmv2r.v v10, v6
@@ -119,6 +120,7 @@ define target("riscv.vector.tuple", <vscale x 16 x i8>, 2) @caller_tuple_return(
; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
; RV64-NEXT: .cfi_offset ra, -8
; RV64-NEXT: call callee_tuple_return
+; RV64-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV64-NEXT: vmv2r.v v6, v8
; RV64-NEXT: vmv2r.v v8, v10
; RV64-NEXT: vmv2r.v v10, v6
@@ -144,6 +146,7 @@ define void @caller_tuple_argument(target("riscv.vector.tuple", <vscale x 16 x i
; RV32-NEXT: .cfi_def_cfa_offset 16
; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
; RV32-NEXT: .cfi_offset ra, -4
+; RV32-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV32-NEXT: vmv2r.v v6, v8
; RV32-NEXT: vmv2r.v v8, v10
; RV32-NEXT: vmv2r.v v10, v6
@@ -160,6 +163,7 @@ define void @caller_tuple_argument(target("riscv.vector.tuple", <vscale x 16 x i
; RV64-NEXT: .cfi_def_cfa_offset 16
; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
; RV64-NEXT: .cfi_offset ra, -8
+; RV64-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; RV64-NEXT: vmv2r.v v6, v8
; RV64-NEXT: vmv2r.v v8, v10
; RV64-NEXT: vmv2r.v v10, v6
diff --git a/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll b/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
index 7d0b0118a72725..f422fe42e7a733 100644
--- a/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
@@ -117,6 +117,7 @@ declare <vscale x 4 x bfloat> @llvm.vp.ceil.nxv4bf16(<vscale x 4 x bfloat>, <vsc
define <vscale x 4 x bfloat> @vp_ceil_vv_nxv4bf16(<vscale x 4 x bfloat> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv4bf16:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v9, v0
; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v10, v8
@@ -169,6 +170,7 @@ declare <vscale x 8 x bfloat> @llvm.vp.ceil.nxv8bf16(<vscale x 8 x bfloat>, <vsc
define <vscale x 8 x bfloat> @vp_ceil_vv_nxv8bf16(<vscale x 8 x bfloat> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv8bf16:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v10, v0
; CHECK-NEXT: vsetvli a1, zero, e16, m2, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v12, v8
@@ -221,6 +223,7 @@ declare <vscale x 16 x bfloat> @llvm.vp.ceil.nxv16bf16(<vscale x 16 x bfloat>, <
define <vscale x 16 x bfloat> @vp_ceil_vv_nxv16bf16(<vscale x 16 x bfloat> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv16bf16:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v12, v0
; CHECK-NEXT: vsetvli a1, zero, e16, m4, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v16, v8
@@ -279,6 +282,7 @@ define <vscale x 32 x bfloat> @vp_ceil_vv_nxv32bf16(<vscale x 32 x bfloat> %va,
; CHECK-NEXT: slli a1, a1, 3
; CHECK-NEXT: sub sp, sp, a1
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v7, v0
; CHECK-NEXT: csrr a2, vlenb
; CHECK-NEXT: vsetvli a1, zero, e16, m4, ta, ma
@@ -317,6 +321,7 @@ define <vscale x 32 x bfloat> @vp_ceil_vv_nxv32bf16(<vscale x 32 x bfloat> %va,
; CHECK-NEXT: mv a0, a1
; CHECK-NEXT: .LBB10_2:
; CHECK-NEXT: vfwcvtbf16.f.f.v v24, v8
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v0, v7
; CHECK-NEXT: vsetvli zero, a0, e32, m8, ta, ma
; CHECK-NEXT: vfabs.v v16, v24, v0.t
@@ -582,6 +587,7 @@ define <vscale x 4 x half> @vp_ceil_vv_nxv4f16(<vscale x 4 x half> %va, <vscale
;
; ZVFHMIN-LABEL: vp_ceil_vv_nxv4f16:
; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFHMIN-NEXT: vmv1r.v v9, v0
; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m1, ta, ma
; ZVFHMIN-NEXT: vfwcvt.f.f.v v10, v8
@@ -649,6 +655,7 @@ declare <vscale x 8 x half> @llvm.vp.ceil.nxv8f16(<vscale x 8 x half>, <vscale x
define <vscale x 8 x half> @vp_ceil_vv_nxv8f16(<vscale x 8 x half> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
; ZVFH-LABEL: vp_ceil_vv_nxv8f16:
; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFH-NEXT: vmv1r.v v10, v0
; ZVFH-NEXT: lui a1, %hi(.LCPI18_0)
; ZVFH-NEXT: flh fa5, %lo(.LCPI18_0)(a1)
@@ -668,6 +675,7 @@ define <vscale x 8 x half> @vp_ceil_vv_nxv8f16(<vscale x 8 x half> %va, <vscale
;
; ZVFHMIN-LABEL: vp_ceil_vv_nxv8f16:
; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFHMIN-NEXT: vmv1r.v v10, v0
; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m2, ta, ma
; ZVFHMIN-NEXT: vfwcvt.f.f.v v12, v8
@@ -735,6 +743,7 @@ declare <vscale x 16 x half> @llvm.vp.ceil.nxv16f16(<vscale x 16 x half>, <vscal
define <vscale x 16 x half> @vp_ceil_vv_nxv16f16(<vscale x 16 x half> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
; ZVFH-LABEL: vp_ceil_vv_nxv16f16:
; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFH-NEXT: vmv1r.v v12, v0
; ZVFH-NEXT: lui a1, %hi(.LCPI20_0)
; ZVFH-NEXT: flh fa5, %lo(.LCPI20_0)(a1)
@@ -754,6 +763,7 @@ define <vscale x 16 x half> @vp_ceil_vv_nxv16f16(<vscale x 16 x half> %va, <vsca
;
; ZVFHMIN-LABEL: vp_ceil_vv_nxv16f16:
; ZVFHMIN: # %bb.0:
+; ZVFHMIN-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFHMIN-NEXT: vmv1r.v v12, v0
; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m4, ta, ma
; ZVFHMIN-NEXT: vfwcvt.f.f.v v16, v8
@@ -821,6 +831,7 @@ declare <vscale x 32 x half> @llvm.vp.ceil.nxv32f16(<vscale x 32 x half>, <vscal
define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vscale x 32 x i1> %m, i32 zeroext %evl) {
; ZVFH-LABEL: vp_ceil_vv_nxv32f16:
; ZVFH: # %bb.0:
+; ZVFH-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFH-NEXT: vmv1r.v v16, v0
; ZVFH-NEXT: lui a1, %hi(.LCPI22_0)
; ZVFH-NEXT: flh fa5, %lo(.LCPI22_0)(a1)
@@ -846,6 +857,7 @@ define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vsca
; ZVFHMIN-NEXT: slli a1, a1, 3
; ZVFHMIN-NEXT: sub sp, sp, a1
; ZVFHMIN-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; ZVFHMIN-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFHMIN-NEXT: vmv1r.v v7, v0
; ZVFHMIN-NEXT: csrr a2, vlenb
; ZVFHMIN-NEXT: vsetvli a1, zero, e16, m4, ta, ma
@@ -884,6 +896,7 @@ define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vsca
; ZVFHMIN-NEXT: mv a0, a1
; ZVFHMIN-NEXT: .LBB22_2:
; ZVFHMIN-NEXT: vfwcvt.f.f.v v24, v8
+; ZVFHMIN-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ZVFHMIN-NEXT: vmv1r.v v0, v7
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, m8, ta, ma
; ZVFHMIN-NEXT: vfabs.v v16, v24, v0.t
@@ -1068,6 +1081,7 @@ declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32(<vscale x 4 x float>, <vscale
define <vscale x 4 x float> @vp_ceil_vv_nxv4f32(<vscale x 4 x float> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv4f32:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v10, v0
; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, ma
; CHECK-NEXT: vfabs.v v12, v8, v0.t
@@ -1112,6 +1126,7 @@ declare <vscale x 8 x float> @llvm.vp.ceil.nxv8f32(<vscale x 8 x float>, <vscale
define <vscale x 8 x float> @vp_ceil_vv_nxv8f32(<vscale x 8 x float> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv8f32:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v12, v0
; CHECK-NEXT: vsetvli zero, a0, e32, m4, ta, ma
; CHECK-NEXT: vfabs.v v16, v8, v0.t
@@ -1156,6 +1171,7 @@ declare <vscale x 16 x float> @llvm.vp.ceil.nxv16f32(<vscale x 16 x float>, <vsc
define <vscale x 16 x float> @vp_ceil_vv_nxv16f32(<vscale x 16 x float> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv16f32:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v16, v0
; CHECK-NEXT: vsetvli zero, a0, e32, m8, ta, ma
; CHECK-NEXT: vfabs.v v24, v8, v0.t
@@ -1242,6 +1258,7 @@ declare <vscale x 2 x double> @llvm.vp.ceil.nxv2f64(<vscale x 2 x double>, <vsca
define <vscale x 2 x double> @vp_ceil_vv_nxv2f64(<vscale x 2 x double> %va, <vscale x 2 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv2f64:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v10, v0
; CHECK-NEXT: lui a1, %hi(.LCPI36_0)
; CHECK-NEXT: fld fa5, %lo(.LCPI36_0)(a1)
@@ -1286,6 +1303,7 @@ declare <vscale x 4 x double> @llvm.vp.ceil.nxv4f64(<vscale x 4 x double>, <vsca
define <vscale x 4 x double> @vp_ceil_vv_nxv4f64(<vscale x 4 x double> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv4f64:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v12, v0
; CHECK-NEXT: lui a1, %hi(.LCPI38_0)
; CHECK-NEXT: fld fa5, %lo(.LCPI38_0)(a1)
@@ -1330,6 +1348,7 @@ declare <vscale x 7 x double> @llvm.vp.ceil.nxv7f64(<vscale x 7 x double>, <vsca
define <vscale x 7 x double> @vp_ceil_vv_nxv7f64(<vscale x 7 x double> %va, <vscale x 7 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv7f64:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v16, v0
; CHECK-NEXT: lui a1, %hi(.LCPI40_0)
; CHECK-NEXT: fld fa5, %lo(.LCPI40_0)(a1)
@@ -1374,6 +1393,7 @@ declare <vscale x 8 x double> @llvm.vp.ceil.nxv8f64(<vscale x 8 x double>, <vsca
define <vscale x 8 x double> @vp_ceil_vv_nxv8f64(<vscale x 8 x double> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
; CHECK-LABEL: vp_ceil_vv_nxv8f64:
; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v16, v0
; CHECK-NEXT: lui a1, %hi(.LCPI42_0)
; CHECK-NEXT: fld fa5, %lo(.LCPI42_0)(a1)
@@ -1425,6 +1445,7 @@ define <vscale x 16 x double> @vp_ceil_vv_nxv16f64(<vscale x 16 x double> %va, <
; CHECK-NEXT: slli a1, a1, 3
; CHECK-NEXT: sub sp, sp, a1
; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; CHECK-NEXT: vmv1r.v v7, v0
; CHECK-NEXT: csrr a1, vlenb
; CHECK-NEXT: lui a2, %hi(.LCPI44_0)
@@ -1458,6 +1479,7 @@ define <vscale x 16 x double> @vp_ceil_vv_nxv16f64(<vscale x 16 x double> %va, <
; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: mv a0, a1
; CHECK-NEXT: .LBB44_2:
+; CHECK-NEXT: vsetivli zero, 0, e32, m1, tu, mu
; ...
[truncated]
else if (CurrMI.isInlineAsm())
  NeedVSETIVLI = true;
else if (NeedVSETIVLI && &CurrMI == &*MIB) {
BuildMI(MBB, &*MIB, MIB->getDebugLoc(), get(RISCV::PseudoVSETIVLI)) |
What if there's a vsetvli in an earlier basic block and a later instruction in this basic block that is using the vtype from that vsetvli? Won't this new vsetvli invalidate that?
Oh, yes. In this case it will invalidate the vtype coming from the predecessor basic block. Maybe we could check the RVV instructions that follow in the same basic block and have no explicit vsetvli before them, to detect that a vtype from another basic block is still in use.
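A minimal sketch of that idea (mine, not code from the PR; it is meant to sit next to copyPhysRegVector in RISCVInstrInfo.cpp and assumes RVV instructions model VTYPE as an implicit operand): scan forward from the COPY and report whether some later instruction in the block still reads VTYPE before anything redefines it, in which case the incoming vtype must not be clobbered by a new vsetivli.

// Sketch only: true if the vtype live into CopyMI is still needed afterwards.
static bool incomingVTypeUsedAfter(const MachineBasicBlock &MBB,
                                   const MachineInstr &CopyMI,
                                   const TargetRegisterInfo *TRI) {
  bool AfterCopy = false;
  for (const MachineInstr &MI : MBB) {
    if (&MI == &CopyMI) {
      AfterCopy = true;
      continue;
    }
    if (!AfterCopy)
      continue;
    // Anything writing VTYPE (vset{i}vl{i}, ...) starts a new vtype region,
    // so the incoming value is dead from here on.
    if (MI.modifiesRegister(RISCV::VTYPE, TRI))
      return false;
    // A reader of VTYPE before any redefinition means the incoming value,
    // possibly set in a predecessor block, is live across the COPY.
    if (MI.readsRegister(RISCV::VTYPE, TRI))
      return true;
  }
  // No local use or def: conservatively assume a successor may still need it.
  return !MBB.succ_empty();
}

copyPhysRegVector could then skip the insertion (or preserve the previous vtype) whenever this returns true.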
    CurrMIOpcode == RISCV::PseudoVSETVLI ||
    CurrMIOpcode == RISCV::PseudoVSETVLIX0)
  NeedVSETIVLI = false;
else if (CurrMI.isInlineAsm()) |
Why do we only check for inline assembly? Don't we need to handle calls?
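One possible adjustment (a sketch, not something the PR currently does): since vl and vtype are not preserved across calls in the standard calling convention, a call after a vset{i}vl{i} leaves the vtype state unknown, so calls could be treated like inline assembly. The snippet reuses MBB and MIB from the hunk above and only adds the CurrMI.isCall() check.

bool NeedVSETIVLI = true;
for (const MachineInstr &CurrMI : MBB) {
  unsigned CurrMIOpcode = CurrMI.getOpcode();
  if (CurrMIOpcode == RISCV::PseudoVSETIVLI ||
      CurrMIOpcode == RISCV::PseudoVSETVLI ||
      CurrMIOpcode == RISCV::PseudoVSETVLIX0)
    NeedVSETIVLI = false;
  else if (CurrMI.isInlineAsm() || CurrMI.isCall())
    // Calls, like inline asm, may leave vl/vtype in an arbitrary state.
    NeedVSETIVLI = true;
  else if (NeedVSETIVLI && &CurrMI == &*MIB) {
    // Insert the PseudoVSETIVLI in front of MIB exactly as in the patch.
    break;
  }
}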
This patch works incorrectly when inserting the vsetivli instruction: the VSETVLInfo is necessary. Without that information, the previous vtype status will be broken. I'm unsure whether we should reuse the analysis from the insert-vsetvli pass here or create a standalone pass to emit a valid vtype for whole vector register moves.
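For reference, the "standalone pass" alternative could look roughly like the skeleton below. This is purely illustrative: the pass name, its placement after RISCVInsertVSETVLI, and the per-block tracking are my assumptions, and a real implementation would want the VSETVLInfo-style dataflow mentioned above instead of resetting the state at every block boundary. Pass registration and scheduling are omitted.

#include "RISCV.h"
#include "RISCVInstrInfo.h"
#include "RISCVSubtarget.h"
#include "MCTargetDesc/RISCVBaseInfo.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"

using namespace llvm;

namespace {
// Skeleton: give every whole-register vector move a known-valid vtype.
class RISCVEnsureWholeRegMoveVType : public MachineFunctionPass {
public:
  static char ID;
  RISCVEnsureWholeRegMoveVType() : MachineFunctionPass(ID) {}

  StringRef getPassName() const override {
    return "RISC-V Ensure Valid VType For Whole Register Moves";
  }

  bool runOnMachineFunction(MachineFunction &MF) override {
    const RISCVInstrInfo *TII = MF.getSubtarget<RISCVSubtarget>().getInstrInfo();
    bool Changed = false;
    for (MachineBasicBlock &MBB : MF) {
      // Overly conservative: nothing is assumed about predecessors. A real
      // implementation would seed this from a VSETVLInfo-style dataflow.
      bool VTypeKnownValid = false;
      for (MachineInstr &MI : MBB) {
        unsigned Opc = MI.getOpcode();
        if (MI.isCall() || MI.isInlineAsm()) {
          VTypeKnownValid = false; // callee/asm may leave vtype invalid
          continue;
        }
        if (Opc == RISCV::PseudoVSETVLI || Opc == RISCV::PseudoVSETIVLI ||
            Opc == RISCV::PseudoVSETVLIX0) {
          VTypeKnownValid = true;
          continue;
        }
        bool IsWholeRegMove = Opc == RISCV::VMV1R_V || Opc == RISCV::VMV2R_V ||
                              Opc == RISCV::VMV4R_V || Opc == RISCV::VMV8R_V;
        if (IsWholeRegMove && !VTypeKnownValid) {
          BuildMI(MBB, MI, MI.getDebugLoc(), TII->get(RISCV::PseudoVSETIVLI))
              .addReg(RISCV::X0, RegState::Define | RegState::Dead)
              .addImm(0)
              .addImm(RISCVVType::encodeVTYPE(RISCVII::VLMUL::LMUL_1, 32,
                                              false, false));
          VTypeKnownValid = true;
          Changed = true;
        }
      }
    }
    return Changed;
  }
};
} // end anonymous namespace

char RISCVEnsureWholeRegMoveVType::ID = 0;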
I have a branch: BeMg@eca5123. My idea is to wait until we actually encounter an RVVReg COPY pseudo being added after the insert-vsetvli pass, and then discuss how to resolve that case.
Now that some time has passed, has anyone run into a COPY inserted after RISCVInsertVSETVLI yet? |