
[RISCV] Ensure the valid vtype during copyPhysReg #118252

Closed
wants to merge 2 commits

Conversation

BeMg
Contributor

@BeMg BeMg commented Dec 2, 2024

Address: #114518

This patch inserts a VSETIVLI instruction before each whole-RVVReg move during the copyPhysReg hook. It ensures that every COPY pseudo-instruction expanded for a whole RVVReg executes under a valid vtype. The goal is to avoid illegal-instruction traps caused by an invalid vtype.

There may be a performance regression due to redundant VSETVLI insertions, since the VSETVLInfo from data-flow analysis is not available at this point.

There are two ways to reduce the redundant instructions:

  1. Perform the same task inside the insert-vsetvli pass and leverage the VSETVLInfo already computed per basic block. The current patch would then only handle new COPY instructions that appear between the insert-vsetvli pass and post-RA pseudo expansion (like [RISCV] enable VTYPE before whole RVVReg move #117866).
  2. Move the first vsetvli instruction to the beginning of the basic block instead of placing it right before the RVV instruction.

This patch cannot handle some RVVReg moves, since they don't originate from a COPY pseudo-instruction but are instead generated directly from an intrinsic (__riscv_vmv_v) or from pseudo-instructions such as PseudoVMV_V_V_M4.
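
A minimal sketch of the intended effect, matching the test updates in the diff below (the copied register and the chosen vtype operands are just an example):

  # before this patch: the whole-register move may execute under an invalid vtype
  vmv1r.v v9, v0

  # after this patch: copyPhysReg emits a vsetivli first, so the vtype is known valid
  vsetivli zero, 0, e32, m1, tu, mu
  vmv1r.v v9, v0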


graphite-app bot commented Dec 2, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “FP Bundles” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@llvmbot
Member

llvmbot commented Dec 2, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Piyou Chen (BeMg)



Patch is 1.29 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118252.diff

174 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+30)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/abs-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/calling-conv.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/compressstore.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/constant-folding-crash.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/copyprop.mir (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/cttz-vp.ll (+10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/expandload.ll (+514)
  • (modified) llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll (+19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vector-i8-index-cornercase.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitreverse-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv-fastcc.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ceil-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctpop-vp.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-floor-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll (+18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fminimum-vp.ll (+18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-interleave.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fptrunc-vp.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fshr-fshl-vp.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-interleave.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+991-1066)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-load-int.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-nearbyint-vp.ll (+9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-mask-vp.ll (+31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-rint-vp.ll (+9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-round-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundeven-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundtozero-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-concat.ll (+9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-reverse.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-vslide1up.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-vpload.ll (+3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp.ll (+9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-unaligned.ll (+28-40)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpgather.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpload.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpmerge.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/floor-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fmaximum-sdnode.ll (+3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fmaximum-vp.ll (+26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fminimum-sdnode.ll (+3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fminimum-vp.ll (+26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fold-scalar-load-crash.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fshr-fshl-vp.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/inline-asm.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/llrint-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/lrint-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/masked-tama.ll (+3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll (+16)
  • (modified) llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/named-vector-shuffle-reverse.ll (+13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/pr88576.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv32-spill-vector-csr.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rvv-args-by-mem.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-int-vp.ll (+12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpload.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpstore.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vadd-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vcpop.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave-store.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-reassociations.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-splice.ll (+12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfabs-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfadd-vp.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfdiv-vp.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfirst.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-constrained-sdnode.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-sdnode.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmax-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmin-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmul-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmuladd-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfneg-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfnmadd-constrained-sdnode.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfnmsub-constrained-sdnode.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfpext-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptosi-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptoui-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptrunc-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfsqrt-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfsub-vp.ll (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vl-opt.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vlsegff-rv32-dead.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vlsegff-rv32.ll (+165)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vlsegff-rv64-dead.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vlsegff-rv64.ll (+165)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmax-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmaxu-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfeq.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfge.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfgt.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfle.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmflt.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfne.ll (+24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmin-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vminu-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsbf.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmseq.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsge.ll (+55)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgeu.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgt.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgtu.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsif.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsle.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsleu.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmslt.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsltu.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsne.ll (+54)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsof.ll (+7)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-cttz-elts.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-select.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-splice-mask-fixed-vectors.ll (+12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-splice-mask-vectors.ll (+21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vpgather-sdnode.ll (+18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vpload.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vpmerge-sdnode.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vpstore.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vreductions-mask-vp.ll (+38)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vrgatherei16-subreg-liveness.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsadd-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsaddu-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vselect-bf16.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vselect-fp.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vselect-int.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vselect-vp.ll (+11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-O0.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsext-vp.ll (+2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsitofp-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vssub-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vssubu-vp.ll (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vtrunc-vp.ll (+10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vuitofp-vp.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vzext-vp.ll (+2)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 47273d6bc06d65..5a93a13f3ae480 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -421,6 +421,36 @@ void RISCVInstrInfo::copyPhysRegVector(
     auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), ActualDstReg);
     bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opc) == RISCV::VMV_V_I;
     bool UseVMV = UseVMV_V_I || RISCV::getRVVMCOpcode(Opc) == RISCV::VMV_V_V;
+
+    // Address https://github.com/llvm/llvm-project/issues/114518
+    // Make sure each whole RVVReg move has valid vtype.
+    unsigned Opcode = MIB->getOpcode();
+    if (UseVMV || Opcode == RISCV::VMV1R_V || Opcode == RISCV::VMV2R_V ||
+        Opcode == RISCV::VMV4R_V || Opcode == RISCV::VMV8R_V) {
+
+      // TODO: Data-flow analysis for vtype status could help avoid the
+      // redundant one.
+      bool NeedVSETIVLI = true;
+
+      for (auto &CurrMI : MBB) {
+        unsigned CurrMIOpcode = CurrMI.getOpcode();
+        if (CurrMIOpcode == RISCV::PseudoVSETIVLI ||
+            CurrMIOpcode == RISCV::PseudoVSETVLI ||
+            CurrMIOpcode == RISCV::PseudoVSETVLIX0)
+          NeedVSETIVLI = false;
+        else if (CurrMI.isInlineAsm())
+          NeedVSETIVLI = true;
+        else if (NeedVSETIVLI && &CurrMI == &*MIB) {
+          BuildMI(MBB, &*MIB, MIB->getDebugLoc(), get(RISCV::PseudoVSETIVLI))
+              .addReg(RISCV::X0, RegState::Define | RegState::Dead)
+              .addImm(0)
+              .addImm(RISCVVType::encodeVTYPE(RISCVII::VLMUL::LMUL_1, 32, false,
+                                              false));
+          break;
+        }
+      }
+    }
+
     if (UseVMV)
       MIB.addReg(ActualDstReg, RegState::Undef);
     if (UseVMV_V_I)
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll b/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
index c04e4fea7b2c29..77ffdd9ae934a6 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-v-constraint.ll
@@ -45,6 +45,7 @@ define <vscale x 1 x i8> @constraint_vd(<vscale x 1 x i8> %0, <vscale x 1 x i8>
 define <vscale x 1 x i1> @constraint_vm(<vscale x 1 x i1> %0, <vscale x 1 x i1> %1) nounwind {
 ; RV32I-LABEL: constraint_vm:
 ; RV32I:       # %bb.0:
+; RV32I-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV32I-NEXT:    vmv1r.v v9, v0
 ; RV32I-NEXT:    vmv1r.v v0, v8
 ; RV32I-NEXT:    #APP
@@ -54,6 +55,7 @@ define <vscale x 1 x i1> @constraint_vm(<vscale x 1 x i1> %0, <vscale x 1 x i1>
 ;
 ; RV64I-LABEL: constraint_vm:
 ; RV64I:       # %bb.0:
+; RV64I-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV64I-NEXT:    vmv1r.v v9, v0
 ; RV64I-NEXT:    vmv1r.v v0, v8
 ; RV64I-NEXT:    #APP
diff --git a/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll b/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
index 163d9145bc3623..46b76af7fffdad 100644
--- a/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/abs-vp.ll
@@ -567,6 +567,7 @@ define <vscale x 16 x i64> @vp_abs_nxv16i64(<vscale x 16 x i64> %va, <vscale x 1
 ; CHECK-NEXT:    slli a1, a1, 4
 ; CHECK-NEXT:    sub sp, sp, a1
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v24, v0
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 3
@@ -590,6 +591,7 @@ define <vscale x 16 x i64> @vp_abs_nxv16i64(<vscale x 16 x i64> %va, <vscale x 1
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    mv a0, a1
 ; CHECK-NEXT:  .LBB46_2:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    slli a1, a1, 3
 ; CHECK-NEXT:    add a1, sp, a1
diff --git a/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll b/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
index 66a1178cddb66c..b2a08b1e077932 100644
--- a/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll
@@ -3075,6 +3075,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
 ; CHECK-NEXT:    slli a1, a1, 4
 ; CHECK-NEXT:    sub sp, sp, a1
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v24, v0
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 3
@@ -3121,6 +3122,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    mv a0, a3
 ; CHECK-NEXT:  .LBB46_2:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    csrr a3, vlenb
 ; CHECK-NEXT:    slli a3, a3, 3
@@ -3158,6 +3160,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
 ;
 ; CHECK-ZVBB-LABEL: vp_bitreverse_nxv64i16:
 ; CHECK-ZVBB:       # %bb.0:
+; CHECK-ZVBB-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-ZVBB-NEXT:    vmv1r.v v24, v0
 ; CHECK-ZVBB-NEXT:    csrr a1, vlenb
 ; CHECK-ZVBB-NEXT:    srli a2, a1, 1
@@ -3174,6 +3177,7 @@ define <vscale x 64 x i16> @vp_bitreverse_nxv64i16(<vscale x 64 x i16> %va, <vsc
 ; CHECK-ZVBB-NEXT:  # %bb.1:
 ; CHECK-ZVBB-NEXT:    mv a0, a1
 ; CHECK-ZVBB-NEXT:  .LBB46_2:
+; CHECK-ZVBB-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-ZVBB-NEXT:    vmv1r.v v0, v24
 ; CHECK-ZVBB-NEXT:    vsetvli zero, a0, e16, m8, ta, ma
 ; CHECK-ZVBB-NEXT:    vbrev.v v8, v8, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll b/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
index 1c95ec8fafd4f1..04d3c00463dd74 100644
--- a/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll
@@ -1584,6 +1584,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
 ; CHECK-NEXT:    slli a1, a1, 4
 ; CHECK-NEXT:    sub sp, sp, a1
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x10, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 16 * vlenb
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v24, v0
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 3
@@ -1609,6 +1610,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    mv a0, a1
 ; CHECK-NEXT:  .LBB32_2:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 3
@@ -1631,6 +1633,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
 ;
 ; CHECK-ZVKB-LABEL: vp_bswap_nxv64i16:
 ; CHECK-ZVKB:       # %bb.0:
+; CHECK-ZVKB-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-ZVKB-NEXT:    vmv1r.v v24, v0
 ; CHECK-ZVKB-NEXT:    csrr a1, vlenb
 ; CHECK-ZVKB-NEXT:    srli a2, a1, 1
@@ -1647,6 +1650,7 @@ define <vscale x 64 x i16> @vp_bswap_nxv64i16(<vscale x 64 x i16> %va, <vscale x
 ; CHECK-ZVKB-NEXT:  # %bb.1:
 ; CHECK-ZVKB-NEXT:    mv a0, a1
 ; CHECK-ZVKB-NEXT:  .LBB32_2:
+; CHECK-ZVKB-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-ZVKB-NEXT:    vmv1r.v v0, v24
 ; CHECK-ZVKB-NEXT:    vsetvli zero, a0, e16, m8, ta, ma
 ; CHECK-ZVKB-NEXT:    vrev8.v v8, v8, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
index a4e5ab661c5285..5c533f042b4e53 100644
--- a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
@@ -336,6 +336,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_i32(<vsca
 ; RV32-NEXT:    add a1, a3, a1
 ; RV32-NEXT:    li a3, 2
 ; RV32-NEXT:    vs8r.v v16, (a1)
+; RV32-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV32-NEXT:    vmv8r.v v8, v0
 ; RV32-NEXT:    vmv8r.v v16, v24
 ; RV32-NEXT:    call ext2
@@ -374,6 +375,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_i32(<vsca
 ; RV64-NEXT:    add a1, a3, a1
 ; RV64-NEXT:    li a3, 2
 ; RV64-NEXT:    vs8r.v v16, (a1)
+; RV64-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV64-NEXT:    vmv8r.v v8, v0
 ; RV64-NEXT:    vmv8r.v v16, v24
 ; RV64-NEXT:    call ext2
@@ -451,6 +453,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_nxv32i32_
 ; RV32-NEXT:    add a1, sp, a1
 ; RV32-NEXT:    addi a1, a1, 128
 ; RV32-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
+; RV32-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV32-NEXT:    vmv8r.v v16, v0
 ; RV32-NEXT:    call ext3
 ; RV32-NEXT:    addi sp, s0, -144
@@ -523,6 +526,7 @@ define fastcc <vscale x 32 x i32> @ret_nxv32i32_call_nxv32i32_nxv32i32_nxv32i32_
 ; RV64-NEXT:    add a1, sp, a1
 ; RV64-NEXT:    addi a1, a1, 128
 ; RV64-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
+; RV64-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV64-NEXT:    vmv8r.v v16, v0
 ; RV64-NEXT:    call ext3
 ; RV64-NEXT:    addi sp, s0, -144
diff --git a/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll b/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
index 9b27116fef7cae..068fdad8a4ab3e 100644
--- a/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll
@@ -103,6 +103,7 @@ define target("riscv.vector.tuple", <vscale x 16 x i8>, 2) @caller_tuple_return(
 ; RV32-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
 ; RV32-NEXT:    .cfi_offset ra, -4
 ; RV32-NEXT:    call callee_tuple_return
+; RV32-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV32-NEXT:    vmv2r.v v6, v8
 ; RV32-NEXT:    vmv2r.v v8, v10
 ; RV32-NEXT:    vmv2r.v v10, v6
@@ -119,6 +120,7 @@ define target("riscv.vector.tuple", <vscale x 16 x i8>, 2) @caller_tuple_return(
 ; RV64-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
 ; RV64-NEXT:    .cfi_offset ra, -8
 ; RV64-NEXT:    call callee_tuple_return
+; RV64-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV64-NEXT:    vmv2r.v v6, v8
 ; RV64-NEXT:    vmv2r.v v8, v10
 ; RV64-NEXT:    vmv2r.v v10, v6
@@ -144,6 +146,7 @@ define void @caller_tuple_argument(target("riscv.vector.tuple", <vscale x 16 x i
 ; RV32-NEXT:    .cfi_def_cfa_offset 16
 ; RV32-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
 ; RV32-NEXT:    .cfi_offset ra, -4
+; RV32-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV32-NEXT:    vmv2r.v v6, v8
 ; RV32-NEXT:    vmv2r.v v8, v10
 ; RV32-NEXT:    vmv2r.v v10, v6
@@ -160,6 +163,7 @@ define void @caller_tuple_argument(target("riscv.vector.tuple", <vscale x 16 x i
 ; RV64-NEXT:    .cfi_def_cfa_offset 16
 ; RV64-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
 ; RV64-NEXT:    .cfi_offset ra, -8
+; RV64-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; RV64-NEXT:    vmv2r.v v6, v8
 ; RV64-NEXT:    vmv2r.v v8, v10
 ; RV64-NEXT:    vmv2r.v v10, v6
diff --git a/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll b/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
index 7d0b0118a72725..f422fe42e7a733 100644
--- a/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll
@@ -117,6 +117,7 @@ declare <vscale x 4 x bfloat> @llvm.vp.ceil.nxv4bf16(<vscale x 4 x bfloat>, <vsc
 define <vscale x 4 x bfloat> @vp_ceil_vv_nxv4bf16(<vscale x 4 x bfloat> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv4bf16:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v9, v0
 ; CHECK-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
 ; CHECK-NEXT:    vfwcvtbf16.f.f.v v10, v8
@@ -169,6 +170,7 @@ declare <vscale x 8 x bfloat> @llvm.vp.ceil.nxv8bf16(<vscale x 8 x bfloat>, <vsc
 define <vscale x 8 x bfloat> @vp_ceil_vv_nxv8bf16(<vscale x 8 x bfloat> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv8bf16:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v10, v0
 ; CHECK-NEXT:    vsetvli a1, zero, e16, m2, ta, ma
 ; CHECK-NEXT:    vfwcvtbf16.f.f.v v12, v8
@@ -221,6 +223,7 @@ declare <vscale x 16 x bfloat> @llvm.vp.ceil.nxv16bf16(<vscale x 16 x bfloat>, <
 define <vscale x 16 x bfloat> @vp_ceil_vv_nxv16bf16(<vscale x 16 x bfloat> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv16bf16:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v12, v0
 ; CHECK-NEXT:    vsetvli a1, zero, e16, m4, ta, ma
 ; CHECK-NEXT:    vfwcvtbf16.f.f.v v16, v8
@@ -279,6 +282,7 @@ define <vscale x 32 x bfloat> @vp_ceil_vv_nxv32bf16(<vscale x 32 x bfloat> %va,
 ; CHECK-NEXT:    slli a1, a1, 3
 ; CHECK-NEXT:    sub sp, sp, a1
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v7, v0
 ; CHECK-NEXT:    csrr a2, vlenb
 ; CHECK-NEXT:    vsetvli a1, zero, e16, m4, ta, ma
@@ -317,6 +321,7 @@ define <vscale x 32 x bfloat> @vp_ceil_vv_nxv32bf16(<vscale x 32 x bfloat> %va,
 ; CHECK-NEXT:    mv a0, a1
 ; CHECK-NEXT:  .LBB10_2:
 ; CHECK-NEXT:    vfwcvtbf16.f.f.v v24, v8
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v0, v7
 ; CHECK-NEXT:    vsetvli zero, a0, e32, m8, ta, ma
 ; CHECK-NEXT:    vfabs.v v16, v24, v0.t
@@ -582,6 +587,7 @@ define <vscale x 4 x half> @vp_ceil_vv_nxv4f16(<vscale x 4 x half> %va, <vscale
 ;
 ; ZVFHMIN-LABEL: vp_ceil_vv_nxv4f16:
 ; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFHMIN-NEXT:    vmv1r.v v9, v0
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v10, v8
@@ -649,6 +655,7 @@ declare <vscale x 8 x half> @llvm.vp.ceil.nxv8f16(<vscale x 8 x half>, <vscale x
 define <vscale x 8 x half> @vp_ceil_vv_nxv8f16(<vscale x 8 x half> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
 ; ZVFH-LABEL: vp_ceil_vv_nxv8f16:
 ; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFH-NEXT:    vmv1r.v v10, v0
 ; ZVFH-NEXT:    lui a1, %hi(.LCPI18_0)
 ; ZVFH-NEXT:    flh fa5, %lo(.LCPI18_0)(a1)
@@ -668,6 +675,7 @@ define <vscale x 8 x half> @vp_ceil_vv_nxv8f16(<vscale x 8 x half> %va, <vscale
 ;
 ; ZVFHMIN-LABEL: vp_ceil_vv_nxv8f16:
 ; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFHMIN-NEXT:    vmv1r.v v10, v0
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m2, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v12, v8
@@ -735,6 +743,7 @@ declare <vscale x 16 x half> @llvm.vp.ceil.nxv16f16(<vscale x 16 x half>, <vscal
 define <vscale x 16 x half> @vp_ceil_vv_nxv16f16(<vscale x 16 x half> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
 ; ZVFH-LABEL: vp_ceil_vv_nxv16f16:
 ; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFH-NEXT:    vmv1r.v v12, v0
 ; ZVFH-NEXT:    lui a1, %hi(.LCPI20_0)
 ; ZVFH-NEXT:    flh fa5, %lo(.LCPI20_0)(a1)
@@ -754,6 +763,7 @@ define <vscale x 16 x half> @vp_ceil_vv_nxv16f16(<vscale x 16 x half> %va, <vsca
 ;
 ; ZVFHMIN-LABEL: vp_ceil_vv_nxv16f16:
 ; ZVFHMIN:       # %bb.0:
+; ZVFHMIN-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFHMIN-NEXT:    vmv1r.v v12, v0
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m4, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v16, v8
@@ -821,6 +831,7 @@ declare <vscale x 32 x half> @llvm.vp.ceil.nxv32f16(<vscale x 32 x half>, <vscal
 define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vscale x 32 x i1> %m, i32 zeroext %evl) {
 ; ZVFH-LABEL: vp_ceil_vv_nxv32f16:
 ; ZVFH:       # %bb.0:
+; ZVFH-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFH-NEXT:    vmv1r.v v16, v0
 ; ZVFH-NEXT:    lui a1, %hi(.LCPI22_0)
 ; ZVFH-NEXT:    flh fa5, %lo(.LCPI22_0)(a1)
@@ -846,6 +857,7 @@ define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vsca
 ; ZVFHMIN-NEXT:    slli a1, a1, 3
 ; ZVFHMIN-NEXT:    sub sp, sp, a1
 ; ZVFHMIN-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; ZVFHMIN-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFHMIN-NEXT:    vmv1r.v v7, v0
 ; ZVFHMIN-NEXT:    csrr a2, vlenb
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, m4, ta, ma
@@ -884,6 +896,7 @@ define <vscale x 32 x half> @vp_ceil_vv_nxv32f16(<vscale x 32 x half> %va, <vsca
 ; ZVFHMIN-NEXT:    mv a0, a1
 ; ZVFHMIN-NEXT:  .LBB22_2:
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v24, v8
+; ZVFHMIN-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ZVFHMIN-NEXT:    vmv1r.v v0, v7
 ; ZVFHMIN-NEXT:    vsetvli zero, a0, e32, m8, ta, ma
 ; ZVFHMIN-NEXT:    vfabs.v v16, v24, v0.t
@@ -1068,6 +1081,7 @@ declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32(<vscale x 4 x float>, <vscale
 define <vscale x 4 x float> @vp_ceil_vv_nxv4f32(<vscale x 4 x float> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv4f32:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v10, v0
 ; CHECK-NEXT:    vsetvli zero, a0, e32, m2, ta, ma
 ; CHECK-NEXT:    vfabs.v v12, v8, v0.t
@@ -1112,6 +1126,7 @@ declare <vscale x 8 x float> @llvm.vp.ceil.nxv8f32(<vscale x 8 x float>, <vscale
 define <vscale x 8 x float> @vp_ceil_vv_nxv8f32(<vscale x 8 x float> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv8f32:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v12, v0
 ; CHECK-NEXT:    vsetvli zero, a0, e32, m4, ta, ma
 ; CHECK-NEXT:    vfabs.v v16, v8, v0.t
@@ -1156,6 +1171,7 @@ declare <vscale x 16 x float> @llvm.vp.ceil.nxv16f32(<vscale x 16 x float>, <vsc
 define <vscale x 16 x float> @vp_ceil_vv_nxv16f32(<vscale x 16 x float> %va, <vscale x 16 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv16f32:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v16, v0
 ; CHECK-NEXT:    vsetvli zero, a0, e32, m8, ta, ma
 ; CHECK-NEXT:    vfabs.v v24, v8, v0.t
@@ -1242,6 +1258,7 @@ declare <vscale x 2 x double> @llvm.vp.ceil.nxv2f64(<vscale x 2 x double>, <vsca
 define <vscale x 2 x double> @vp_ceil_vv_nxv2f64(<vscale x 2 x double> %va, <vscale x 2 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv2f64:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v10, v0
 ; CHECK-NEXT:    lui a1, %hi(.LCPI36_0)
 ; CHECK-NEXT:    fld fa5, %lo(.LCPI36_0)(a1)
@@ -1286,6 +1303,7 @@ declare <vscale x 4 x double> @llvm.vp.ceil.nxv4f64(<vscale x 4 x double>, <vsca
 define <vscale x 4 x double> @vp_ceil_vv_nxv4f64(<vscale x 4 x double> %va, <vscale x 4 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv4f64:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v12, v0
 ; CHECK-NEXT:    lui a1, %hi(.LCPI38_0)
 ; CHECK-NEXT:    fld fa5, %lo(.LCPI38_0)(a1)
@@ -1330,6 +1348,7 @@ declare <vscale x 7 x double> @llvm.vp.ceil.nxv7f64(<vscale x 7 x double>, <vsca
 define <vscale x 7 x double> @vp_ceil_vv_nxv7f64(<vscale x 7 x double> %va, <vscale x 7 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv7f64:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v16, v0
 ; CHECK-NEXT:    lui a1, %hi(.LCPI40_0)
 ; CHECK-NEXT:    fld fa5, %lo(.LCPI40_0)(a1)
@@ -1374,6 +1393,7 @@ declare <vscale x 8 x double> @llvm.vp.ceil.nxv8f64(<vscale x 8 x double>, <vsca
 define <vscale x 8 x double> @vp_ceil_vv_nxv8f64(<vscale x 8 x double> %va, <vscale x 8 x i1> %m, i32 zeroext %evl) {
 ; CHECK-LABEL: vp_ceil_vv_nxv8f64:
 ; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v16, v0
 ; CHECK-NEXT:    lui a1, %hi(.LCPI42_0)
 ; CHECK-NEXT:    fld fa5, %lo(.LCPI42_0)(a1)
@@ -1425,6 +1445,7 @@ define <vscale x 16 x double> @vp_ceil_vv_nxv16f64(<vscale x 16 x double> %va, <
 ; CHECK-NEXT:    slli a1, a1, 3
 ; CHECK-NEXT:    sub sp, sp, a1
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; CHECK-NEXT:    vmv1r.v v7, v0
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    lui a2, %hi(.LCPI44_0)
@@ -1458,6 +1479,7 @@ define <vscale x 16 x double> @vp_ceil_vv_nxv16f64(<vscale x 16 x double> %va, <
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    mv a0, a1
 ; CHECK-NEXT:  .LBB44_2:
+; CHECK-NEXT:    vsetivli zero, 0, e32, m1, tu, mu
 ; ...
[truncated]

        else if (CurrMI.isInlineAsm())
          NeedVSETIVLI = true;
        else if (NeedVSETIVLI && &CurrMI == &*MIB) {
          BuildMI(MBB, &*MIB, MIB->getDebugLoc(), get(RISCV::PseudoVSETIVLI))
Collaborator


What if there's a vsetvli in an earlier basic block and a later instruction in this basic block that is using the vtype from that vsetvli? Won't this new vsetvli invalidate that?
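
A hypothetical sequence showing that concern (block labels, registers, and vtype settings are invented for illustration):

  bb.0:
    vsetvli zero, a0, e64, m8, ta, ma    # establishes the vtype that later code relies on
    ...
  bb.1:
    vsetivli zero, 0, e32, m1, tu, mu    # newly inserted before the copy by this patch
    vmv1r.v v0, v8                       # whole-register move expanded by copyPhysReg
    vadd.vv v8, v16, v24, v0.t           # was emitted assuming e64/m8 from bb.0,
                                         # but now executes under e32/m1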

Contributor Author


Oh, yes. In this case it will invalidate the vtype coming from the predecessor basic block. Maybe we could check whether any following RVV instruction in the same basic block uses the vtype without an explicit vsetvli in between; that would tell us the vtype is live-in from another basic block and must not be clobbered.

            CurrMIOpcode == RISCV::PseudoVSETVLI ||
            CurrMIOpcode == RISCV::PseudoVSETVLIX0)
          NeedVSETIVLI = false;
        else if (CurrMI.isInlineAsm())
Collaborator


Why do we only check for inline assembly? Don't we need to handle calls?

@BeMg
Contributor Author

BeMg commented Dec 4, 2024

This patch inserts the vsetivli instruction incorrectly: the VSETVLInfo is necessary, and without that information the previous vtype state gets broken. I'm unsure whether we should reuse the analysis from the insert-vsetvli pass here or create a standalone pass to emit a valid vtype for whole vector register moves.

@BeMg BeMg closed this Dec 11, 2024
@lukel97
Contributor

lukel97 commented Dec 11, 2024

I think we still need to take into account whole vector register moves that might be emitted after vsetvli insertion, i.e. #118283 might not be enough.

After #118283 do we know how frequent these are or where they might come from?

@BeMg
Contributor Author

BeMg commented Dec 11, 2024

I think we still need to take into account whole vector register moves that might be emitted after vsetvli insertion, i.e. #118283 might not be enough.

After #118283 do we know how frequent these are or where they might come from?

I have a branch, BeMg@eca5123, that creates a standalone pass (borrowing part of the insert-vsetvli pass for the VSETVLInfo analysis) to handle COPYs added after the insert-vsetvli pass, but it hasn't found any examples that need a new vsetvli to be emitted so far.

My idea is to wait until we actually encounter an RVVReg COPY pseudo being added after the insert-vsetvli pass, and then discuss how to resolve that case.

cc @kito-cheng @topperc

@lukel97
Contributor

lukel97 commented Jan 6, 2025

My idea is to wait until we actually encounter an RVVReg COPY pseudo being added after the insert-vsetvli pass, and then discuss how to resolve that case.

Now that some time has passed, has anyone run into a COPY inserted after RISCVInsertVSETVLI yet?
