Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mi-sched] Suppress register pressure with i64. #88256

Merged
merged 2 commits into from
Apr 15, 2024

Conversation

lcvon007
Copy link
Contributor

Machine scheduler will suppress register pressure when the scheduling
window is too small, but now it doesn't consider i64 register type,
and this MR extends it into i64 register type, so architecture like
RISCV64 that only supports i64 interger register will have the same
behavior like RISCV32.

@llvmbot
Copy link
Member

llvmbot commented Apr 10, 2024

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-backend-loongarch

Author: laichunfeng@tencent.com (lcvon007)

Changes

Machine scheduler will suppress register pressure when the scheduling
window is too small, but now it doesn't consider i64 register type,
and this MR extends it into i64 register type, so architecture like
RISCV64 that only supports i64 interger register will have the same
behavior like RISCV32.


Patch is 46.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88256.diff

14 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineScheduler.cpp (+1-1)
  • (modified) llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll (+8-8)
  • (modified) llvm/test/CodeGen/LoongArch/vector-fp-imm.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-convert.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/double-convert.ll (+14-14)
  • (modified) llvm/test/CodeGen/RISCV/float-convert.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/half-convert.ll (+66-66)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll (+17-29)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-fp-vp.ll (+54-103)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll (+51-103)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vpscatter-sdnode.ll (+12-12)
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 203bdffc49ae18..90a6262cff2bce 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -3283,7 +3283,7 @@ void GenericScheduler::initPolicy(MachineBasicBlock::iterator Begin,
   // compile time. As a rough heuristic, only track pressure when the number of
   // schedulable instructions exceeds half the integer register file.
   RegionPolicy.ShouldTrackPressure = true;
-  for (unsigned VT = MVT::i32; VT > (unsigned)MVT::i1; --VT) {
+  for (unsigned VT = MVT::i64; VT > (unsigned)MVT::i1; --VT) {
     MVT::SimpleValueType LegalIntVT = (MVT::SimpleValueType)VT;
     if (TLI->isTypeLegal(LegalIntVT)) {
       unsigned NIntRegs = Context->RegClassInfo->getNumAllocatableRegs(
diff --git a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
index bf48c0df3e4961..7cde034726e0b5 100644
--- a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
+++ b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
@@ -4,12 +4,12 @@
 define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
 ; LA64-LABEL: atomicrmw_uinc_wrap_i8:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    slli.d $a2, $a0, 3
+; LA64-NEXT:    slli.d $a4, $a0, 3
 ; LA64-NEXT:    bstrins.d $a0, $zero, 1, 0
-; LA64-NEXT:    ori $a3, $zero, 255
-; LA64-NEXT:    sll.w $a4, $a3, $a2
+; LA64-NEXT:    andi $a2, $a4, 24
+; LA64-NEXT:    ori $a5, $zero, 255
 ; LA64-NEXT:    ld.w $a3, $a0, 0
-; LA64-NEXT:    andi $a2, $a2, 24
+; LA64-NEXT:    sll.w $a4, $a5, $a4
 ; LA64-NEXT:    nor $a4, $a4, $zero
 ; LA64-NEXT:    andi $a1, $a1, 255
 ; LA64-NEXT:    .p2align 4, , 16
@@ -54,13 +54,13 @@ define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
 define i16 @atomicrmw_uinc_wrap_i16(ptr %ptr, i16 %val) {
 ; LA64-LABEL: atomicrmw_uinc_wrap_i16:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    slli.d $a2, $a0, 3
+; LA64-NEXT:    slli.d $a4, $a0, 3
 ; LA64-NEXT:    bstrins.d $a0, $zero, 1, 0
+; LA64-NEXT:    andi $a2, $a4, 24
 ; LA64-NEXT:    lu12i.w $a3, 15
-; LA64-NEXT:    ori $a3, $a3, 4095
-; LA64-NEXT:    sll.w $a4, $a3, $a2
+; LA64-NEXT:    ori $a5, $a3, 4095
 ; LA64-NEXT:    ld.w $a3, $a0, 0
-; LA64-NEXT:    andi $a2, $a2, 24
+; LA64-NEXT:    sll.w $a4, $a5, $a4
 ; LA64-NEXT:    nor $a4, $a4, $zero
 ; LA64-NEXT:    bstrpick.d $a1, $a1, 15, 0
 ; LA64-NEXT:    .p2align 4, , 16
diff --git a/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll b/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
index d03af114bceefe..18d17751a77196 100644
--- a/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
+++ b/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
@@ -124,10 +124,10 @@ define void @test_f2(ptr %P, ptr %S) nounwind {
 ; LA64F:       # %bb.0:
 ; LA64F-NEXT:    fld.s $fa0, $a0, 4
 ; LA64F-NEXT:    fld.s $fa1, $a0, 0
-; LA64F-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI1_0)
-; LA64F-NEXT:    addi.d $a0, $a0, %pc_lo12(.LCPI1_0)
-; LA64F-NEXT:    fld.s $fa2, $a0, 0
 ; LA64F-NEXT:    addi.w $a0, $zero, 1
+; LA64F-NEXT:    pcalau12i $a2, %pc_hi20(.LCPI1_0)
+; LA64F-NEXT:    addi.d $a2, $a2, %pc_lo12(.LCPI1_0)
+; LA64F-NEXT:    fld.s $fa2, $a2, 0
 ; LA64F-NEXT:    movgr2fr.w $fa3, $a0
 ; LA64F-NEXT:    ffint.s.w $fa3, $fa3
 ; LA64F-NEXT:    fadd.s $fa1, $fa1, $fa3
@@ -140,10 +140,10 @@ define void @test_f2(ptr %P, ptr %S) nounwind {
 ; LA64D:       # %bb.0:
 ; LA64D-NEXT:    fld.s $fa0, $a0, 4
 ; LA64D-NEXT:    fld.s $fa1, $a0, 0
-; LA64D-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI1_0)
-; LA64D-NEXT:    addi.d $a0, $a0, %pc_lo12(.LCPI1_0)
-; LA64D-NEXT:    fld.s $fa2, $a0, 0
 ; LA64D-NEXT:    addi.w $a0, $zero, 1
+; LA64D-NEXT:    pcalau12i $a2, %pc_hi20(.LCPI1_0)
+; LA64D-NEXT:    addi.d $a2, $a2, %pc_lo12(.LCPI1_0)
+; LA64D-NEXT:    fld.s $fa2, $a2, 0
 ; LA64D-NEXT:    movgr2fr.w $fa3, $a0
 ; LA64D-NEXT:    ffint.s.w $fa3, $fa3
 ; LA64D-NEXT:    fadd.s $fa1, $fa1, $fa3
@@ -527,10 +527,10 @@ define void @test_d2(ptr %P, ptr %S) nounwind {
 ; LA64D:       # %bb.0:
 ; LA64D-NEXT:    fld.d $fa0, $a0, 8
 ; LA64D-NEXT:    fld.d $fa1, $a0, 0
-; LA64D-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI4_0)
-; LA64D-NEXT:    addi.d $a0, $a0, %pc_lo12(.LCPI4_0)
-; LA64D-NEXT:    fld.d $fa2, $a0, 0
 ; LA64D-NEXT:    addi.d $a0, $zero, 1
+; LA64D-NEXT:    pcalau12i $a2, %pc_hi20(.LCPI4_0)
+; LA64D-NEXT:    addi.d $a2, $a2, %pc_lo12(.LCPI4_0)
+; LA64D-NEXT:    fld.d $fa2, $a2, 0
 ; LA64D-NEXT:    movgr2fr.d $fa3, $a0
 ; LA64D-NEXT:    ffint.d.l $fa3, $fa3
 ; LA64D-NEXT:    fadd.d $fa1, $fa1, $fa3
diff --git a/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll b/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
index 5914e45a153302..f96e1bad2e3895 100644
--- a/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
+++ b/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
@@ -127,11 +127,11 @@ define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
 ; RV64IA-LABEL: atomicrmw_uinc_wrap_i8:
 ; RV64IA:       # %bb.0:
 ; RV64IA-NEXT:    andi a2, a0, -4
-; RV64IA-NEXT:    slli a0, a0, 3
-; RV64IA-NEXT:    li a3, 255
-; RV64IA-NEXT:    sllw a4, a3, a0
+; RV64IA-NEXT:    slli a4, a0, 3
+; RV64IA-NEXT:    andi a0, a4, 24
+; RV64IA-NEXT:    li a5, 255
 ; RV64IA-NEXT:    lw a3, 0(a2)
-; RV64IA-NEXT:    andi a0, a0, 24
+; RV64IA-NEXT:    sllw a4, a5, a4
 ; RV64IA-NEXT:    not a4, a4
 ; RV64IA-NEXT:    andi a1, a1, 255
 ; RV64IA-NEXT:  .LBB0_1: # %atomicrmw.start
diff --git a/llvm/test/CodeGen/RISCV/bfloat-convert.ll b/llvm/test/CodeGen/RISCV/bfloat-convert.ll
index d5041c2a7ca78b..9e2b0b5c3cbb41 100644
--- a/llvm/test/CodeGen/RISCV/bfloat-convert.ll
+++ b/llvm/test/CodeGen/RISCV/bfloat-convert.ll
@@ -84,12 +84,12 @@ define i16 @fcvt_si_bf16_sat(bfloat %a) nounwind {
 ; CHECK64ZFBFMIN:       # %bb.0: # %start
 ; CHECK64ZFBFMIN-NEXT:    fcvt.s.bf16 fa5, fa0
 ; CHECK64ZFBFMIN-NEXT:    feq.s a0, fa5, fa5
+; CHECK64ZFBFMIN-NEXT:    neg a0, a0
 ; CHECK64ZFBFMIN-NEXT:    lui a1, %hi(.LCPI1_0)
 ; CHECK64ZFBFMIN-NEXT:    flw fa4, %lo(.LCPI1_0)(a1)
 ; CHECK64ZFBFMIN-NEXT:    lui a1, 815104
 ; CHECK64ZFBFMIN-NEXT:    fmv.w.x fa3, a1
 ; CHECK64ZFBFMIN-NEXT:    fmax.s fa5, fa5, fa3
-; CHECK64ZFBFMIN-NEXT:    neg a0, a0
 ; CHECK64ZFBFMIN-NEXT:    fmin.s fa5, fa5, fa4
 ; CHECK64ZFBFMIN-NEXT:    fcvt.l.s a1, fa5, rtz
 ; CHECK64ZFBFMIN-NEXT:    and a0, a0, a1
@@ -187,10 +187,10 @@ define i16 @fcvt_ui_bf16_sat(bfloat %a) nounwind {
 ;
 ; RV64ID-LABEL: fcvt_ui_bf16_sat:
 ; RV64ID:       # %bb.0: # %start
-; RV64ID-NEXT:    lui a0, %hi(.LCPI3_0)
-; RV64ID-NEXT:    flw fa5, %lo(.LCPI3_0)(a0)
 ; RV64ID-NEXT:    fmv.x.w a0, fa0
 ; RV64ID-NEXT:    slli a0, a0, 16
+; RV64ID-NEXT:    lui a1, %hi(.LCPI3_0)
+; RV64ID-NEXT:    flw fa5, %lo(.LCPI3_0)(a1)
 ; RV64ID-NEXT:    fmv.w.x fa4, a0
 ; RV64ID-NEXT:    fmv.w.x fa3, zero
 ; RV64ID-NEXT:    fmax.s fa4, fa4, fa3
diff --git a/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll b/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
index d8471129433027..67123466354c41 100644
--- a/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
+++ b/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
@@ -140,11 +140,11 @@ define i64 @caller_large_scalars() nounwind {
 ; RV64I-NEXT:    sd a0, 0(sp)
 ; RV64I-NEXT:    sd zero, 56(sp)
 ; RV64I-NEXT:    sd zero, 48(sp)
-; RV64I-NEXT:    li a0, 1
-; RV64I-NEXT:    sd a0, 32(sp)
+; RV64I-NEXT:    sd zero, 40(sp)
+; RV64I-NEXT:    li a2, 1
 ; RV64I-NEXT:    addi a0, sp, 32
 ; RV64I-NEXT:    mv a1, sp
-; RV64I-NEXT:    sd zero, 40(sp)
+; RV64I-NEXT:    sd a2, 32(sp)
 ; RV64I-NEXT:    call callee_large_scalars
 ; RV64I-NEXT:    ld ra, 72(sp) # 8-byte Folded Reload
 ; RV64I-NEXT:    addi sp, sp, 80
diff --git a/llvm/test/CodeGen/RISCV/double-convert.ll b/llvm/test/CodeGen/RISCV/double-convert.ll
index da882cafd99715..c147d6ec6d9b15 100644
--- a/llvm/test/CodeGen/RISCV/double-convert.ll
+++ b/llvm/test/CodeGen/RISCV/double-convert.ll
@@ -1651,8 +1651,8 @@ define signext i16 @fcvt_w_s_sat_i16(double %a) nounwind {
 ; RV64IFD-NEXT:    lui a0, %hi(.LCPI26_1)
 ; RV64IFD-NEXT:    fld fa4, %lo(.LCPI26_1)(a0)
 ; RV64IFD-NEXT:    feq.d a0, fa0, fa0
-; RV64IFD-NEXT:    fmax.d fa5, fa0, fa5
 ; RV64IFD-NEXT:    neg a0, a0
+; RV64IFD-NEXT:    fmax.d fa5, fa0, fa5
 ; RV64IFD-NEXT:    fmin.d fa5, fa5, fa4
 ; RV64IFD-NEXT:    fcvt.l.d a1, fa5, rtz
 ; RV64IFD-NEXT:    and a0, a0, a1
@@ -1680,12 +1680,12 @@ define signext i16 @fcvt_w_s_sat_i16(double %a) nounwind {
 ; RV64IZFINXZDINX-NEXT:    ld a1, %lo(.LCPI26_0)(a1)
 ; RV64IZFINXZDINX-NEXT:    lui a2, %hi(.LCPI26_1)
 ; RV64IZFINXZDINX-NEXT:    ld a2, %lo(.LCPI26_1)(a2)
-; RV64IZFINXZDINX-NEXT:    fmax.d a1, a0, a1
-; RV64IZFINXZDINX-NEXT:    feq.d a0, a0, a0
-; RV64IZFINXZDINX-NEXT:    neg a0, a0
-; RV64IZFINXZDINX-NEXT:    fmin.d a1, a1, a2
-; RV64IZFINXZDINX-NEXT:    fcvt.l.d a1, a1, rtz
-; RV64IZFINXZDINX-NEXT:    and a0, a0, a1
+; RV64IZFINXZDINX-NEXT:    feq.d a3, a0, a0
+; RV64IZFINXZDINX-NEXT:    neg a3, a3
+; RV64IZFINXZDINX-NEXT:    fmax.d a0, a0, a1
+; RV64IZFINXZDINX-NEXT:    fmin.d a0, a0, a2
+; RV64IZFINXZDINX-NEXT:    fcvt.l.d a0, a0, rtz
+; RV64IZFINXZDINX-NEXT:    and a0, a3, a0
 ; RV64IZFINXZDINX-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_w_s_sat_i16:
@@ -2026,8 +2026,8 @@ define signext i8 @fcvt_w_s_sat_i8(double %a) nounwind {
 ; RV64IFD-NEXT:    lui a0, %hi(.LCPI30_1)
 ; RV64IFD-NEXT:    fld fa4, %lo(.LCPI30_1)(a0)
 ; RV64IFD-NEXT:    feq.d a0, fa0, fa0
-; RV64IFD-NEXT:    fmax.d fa5, fa0, fa5
 ; RV64IFD-NEXT:    neg a0, a0
+; RV64IFD-NEXT:    fmax.d fa5, fa0, fa5
 ; RV64IFD-NEXT:    fmin.d fa5, fa5, fa4
 ; RV64IFD-NEXT:    fcvt.l.d a1, fa5, rtz
 ; RV64IFD-NEXT:    and a0, a0, a1
@@ -2055,12 +2055,12 @@ define signext i8 @fcvt_w_s_sat_i8(double %a) nounwind {
 ; RV64IZFINXZDINX-NEXT:    ld a1, %lo(.LCPI30_0)(a1)
 ; RV64IZFINXZDINX-NEXT:    lui a2, %hi(.LCPI30_1)
 ; RV64IZFINXZDINX-NEXT:    ld a2, %lo(.LCPI30_1)(a2)
-; RV64IZFINXZDINX-NEXT:    fmax.d a1, a0, a1
-; RV64IZFINXZDINX-NEXT:    feq.d a0, a0, a0
-; RV64IZFINXZDINX-NEXT:    neg a0, a0
-; RV64IZFINXZDINX-NEXT:    fmin.d a1, a1, a2
-; RV64IZFINXZDINX-NEXT:    fcvt.l.d a1, a1, rtz
-; RV64IZFINXZDINX-NEXT:    and a0, a0, a1
+; RV64IZFINXZDINX-NEXT:    feq.d a3, a0, a0
+; RV64IZFINXZDINX-NEXT:    neg a3, a3
+; RV64IZFINXZDINX-NEXT:    fmax.d a0, a0, a1
+; RV64IZFINXZDINX-NEXT:    fmin.d a0, a0, a2
+; RV64IZFINXZDINX-NEXT:    fcvt.l.d a0, a0, rtz
+; RV64IZFINXZDINX-NEXT:    and a0, a3, a0
 ; RV64IZFINXZDINX-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_w_s_sat_i8:
diff --git a/llvm/test/CodeGen/RISCV/float-convert.ll b/llvm/test/CodeGen/RISCV/float-convert.ll
index 2c7315fbe59f6f..653b64ec730496 100644
--- a/llvm/test/CodeGen/RISCV/float-convert.ll
+++ b/llvm/test/CodeGen/RISCV/float-convert.ll
@@ -1424,12 +1424,12 @@ define signext i16 @fcvt_w_s_sat_i16(float %a) nounwind {
 ; RV64IF-LABEL: fcvt_w_s_sat_i16:
 ; RV64IF:       # %bb.0: # %start
 ; RV64IF-NEXT:    feq.s a0, fa0, fa0
+; RV64IF-NEXT:    neg a0, a0
 ; RV64IF-NEXT:    lui a1, %hi(.LCPI24_0)
 ; RV64IF-NEXT:    flw fa5, %lo(.LCPI24_0)(a1)
 ; RV64IF-NEXT:    lui a1, 815104
 ; RV64IF-NEXT:    fmv.w.x fa4, a1
 ; RV64IF-NEXT:    fmax.s fa4, fa0, fa4
-; RV64IF-NEXT:    neg a0, a0
 ; RV64IF-NEXT:    fmin.s fa5, fa4, fa5
 ; RV64IF-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64IF-NEXT:    and a0, a0, a1
@@ -1450,15 +1450,15 @@ define signext i16 @fcvt_w_s_sat_i16(float %a) nounwind {
 ;
 ; RV64IZFINX-LABEL: fcvt_w_s_sat_i16:
 ; RV64IZFINX:       # %bb.0: # %start
-; RV64IZFINX-NEXT:    lui a1, 815104
+; RV64IZFINX-NEXT:    feq.s a1, a0, a0
 ; RV64IZFINX-NEXT:    lui a2, %hi(.LCPI24_0)
 ; RV64IZFINX-NEXT:    lw a2, %lo(.LCPI24_0)(a2)
-; RV64IZFINX-NEXT:    fmax.s a1, a0, a1
-; RV64IZFINX-NEXT:    feq.s a0, a0, a0
-; RV64IZFINX-NEXT:    neg a0, a0
-; RV64IZFINX-NEXT:    fmin.s a1, a1, a2
-; RV64IZFINX-NEXT:    fcvt.l.s a1, a1, rtz
-; RV64IZFINX-NEXT:    and a0, a0, a1
+; RV64IZFINX-NEXT:    neg a1, a1
+; RV64IZFINX-NEXT:    lui a3, 815104
+; RV64IZFINX-NEXT:    fmax.s a0, a0, a3
+; RV64IZFINX-NEXT:    fmin.s a0, a0, a2
+; RV64IZFINX-NEXT:    fcvt.l.s a0, a0, rtz
+; RV64IZFINX-NEXT:    and a0, a1, a0
 ; RV64IZFINX-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_w_s_sat_i16:
diff --git a/llvm/test/CodeGen/RISCV/half-convert.ll b/llvm/test/CodeGen/RISCV/half-convert.ll
index 16c096290720d3..277749c75bbbf1 100644
--- a/llvm/test/CodeGen/RISCV/half-convert.ll
+++ b/llvm/test/CodeGen/RISCV/half-convert.ll
@@ -210,12 +210,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64IZFH:       # %bb.0: # %start
 ; RV64IZFH-NEXT:    fcvt.s.h fa5, fa0
 ; RV64IZFH-NEXT:    feq.s a0, fa5, fa5
+; RV64IZFH-NEXT:    neg a0, a0
 ; RV64IZFH-NEXT:    lui a1, %hi(.LCPI1_0)
 ; RV64IZFH-NEXT:    flw fa4, %lo(.LCPI1_0)(a1)
 ; RV64IZFH-NEXT:    lui a1, 815104
 ; RV64IZFH-NEXT:    fmv.w.x fa3, a1
 ; RV64IZFH-NEXT:    fmax.s fa5, fa5, fa3
-; RV64IZFH-NEXT:    neg a0, a0
 ; RV64IZFH-NEXT:    fmin.s fa5, fa5, fa4
 ; RV64IZFH-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64IZFH-NEXT:    and a0, a0, a1
@@ -240,12 +240,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64IDZFH:       # %bb.0: # %start
 ; RV64IDZFH-NEXT:    fcvt.s.h fa5, fa0
 ; RV64IDZFH-NEXT:    feq.s a0, fa5, fa5
+; RV64IDZFH-NEXT:    neg a0, a0
 ; RV64IDZFH-NEXT:    lui a1, %hi(.LCPI1_0)
 ; RV64IDZFH-NEXT:    flw fa4, %lo(.LCPI1_0)(a1)
 ; RV64IDZFH-NEXT:    lui a1, 815104
 ; RV64IDZFH-NEXT:    fmv.w.x fa3, a1
 ; RV64IDZFH-NEXT:    fmax.s fa5, fa5, fa3
-; RV64IDZFH-NEXT:    neg a0, a0
 ; RV64IDZFH-NEXT:    fmin.s fa5, fa5, fa4
 ; RV64IDZFH-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64IDZFH-NEXT:    and a0, a0, a1
@@ -268,15 +268,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64IZHINX-LABEL: fcvt_si_h_sat:
 ; RV64IZHINX:       # %bb.0: # %start
 ; RV64IZHINX-NEXT:    fcvt.s.h a0, a0
-; RV64IZHINX-NEXT:    lui a1, 815104
+; RV64IZHINX-NEXT:    feq.s a1, a0, a0
 ; RV64IZHINX-NEXT:    lui a2, %hi(.LCPI1_0)
 ; RV64IZHINX-NEXT:    lw a2, %lo(.LCPI1_0)(a2)
-; RV64IZHINX-NEXT:    fmax.s a1, a0, a1
-; RV64IZHINX-NEXT:    feq.s a0, a0, a0
-; RV64IZHINX-NEXT:    neg a0, a0
-; RV64IZHINX-NEXT:    fmin.s a1, a1, a2
-; RV64IZHINX-NEXT:    fcvt.l.s a1, a1, rtz
-; RV64IZHINX-NEXT:    and a0, a0, a1
+; RV64IZHINX-NEXT:    neg a1, a1
+; RV64IZHINX-NEXT:    lui a3, 815104
+; RV64IZHINX-NEXT:    fmax.s a0, a0, a3
+; RV64IZHINX-NEXT:    fmin.s a0, a0, a2
+; RV64IZHINX-NEXT:    fcvt.l.s a0, a0, rtz
+; RV64IZHINX-NEXT:    and a0, a1, a0
 ; RV64IZHINX-NEXT:    ret
 ;
 ; RV32IZDINXZHINX-LABEL: fcvt_si_h_sat:
@@ -296,15 +296,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64IZDINXZHINX-LABEL: fcvt_si_h_sat:
 ; RV64IZDINXZHINX:       # %bb.0: # %start
 ; RV64IZDINXZHINX-NEXT:    fcvt.s.h a0, a0
-; RV64IZDINXZHINX-NEXT:    lui a1, 815104
+; RV64IZDINXZHINX-NEXT:    feq.s a1, a0, a0
 ; RV64IZDINXZHINX-NEXT:    lui a2, %hi(.LCPI1_0)
 ; RV64IZDINXZHINX-NEXT:    lw a2, %lo(.LCPI1_0)(a2)
-; RV64IZDINXZHINX-NEXT:    fmax.s a1, a0, a1
-; RV64IZDINXZHINX-NEXT:    feq.s a0, a0, a0
-; RV64IZDINXZHINX-NEXT:    neg a0, a0
-; RV64IZDINXZHINX-NEXT:    fmin.s a1, a1, a2
-; RV64IZDINXZHINX-NEXT:    fcvt.l.s a1, a1, rtz
-; RV64IZDINXZHINX-NEXT:    and a0, a0, a1
+; RV64IZDINXZHINX-NEXT:    neg a1, a1
+; RV64IZDINXZHINX-NEXT:    lui a3, 815104
+; RV64IZDINXZHINX-NEXT:    fmax.s a0, a0, a3
+; RV64IZDINXZHINX-NEXT:    fmin.s a0, a0, a2
+; RV64IZDINXZHINX-NEXT:    fcvt.l.s a0, a0, rtz
+; RV64IZDINXZHINX-NEXT:    and a0, a1, a0
 ; RV64IZDINXZHINX-NEXT:    ret
 ;
 ; RV32I-LABEL: fcvt_si_h_sat:
@@ -420,12 +420,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64ID-LP64-NEXT:    call __extendhfsf2
 ; RV64ID-LP64-NEXT:    fmv.w.x fa5, a0
 ; RV64ID-LP64-NEXT:    feq.s a0, fa5, fa5
+; RV64ID-LP64-NEXT:    neg a0, a0
 ; RV64ID-LP64-NEXT:    lui a1, %hi(.LCPI1_0)
 ; RV64ID-LP64-NEXT:    flw fa4, %lo(.LCPI1_0)(a1)
 ; RV64ID-LP64-NEXT:    lui a1, 815104
 ; RV64ID-LP64-NEXT:    fmv.w.x fa3, a1
 ; RV64ID-LP64-NEXT:    fmax.s fa5, fa5, fa3
-; RV64ID-LP64-NEXT:    neg a0, a0
 ; RV64ID-LP64-NEXT:    fmin.s fa5, fa5, fa4
 ; RV64ID-LP64-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64ID-LP64-NEXT:    and a0, a0, a1
@@ -458,12 +458,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; RV64ID-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
 ; RV64ID-NEXT:    call __extendhfsf2
 ; RV64ID-NEXT:    feq.s a0, fa0, fa0
+; RV64ID-NEXT:    neg a0, a0
 ; RV64ID-NEXT:    lui a1, %hi(.LCPI1_0)
 ; RV64ID-NEXT:    flw fa5, %lo(.LCPI1_0)(a1)
 ; RV64ID-NEXT:    lui a1, 815104
 ; RV64ID-NEXT:    fmv.w.x fa4, a1
 ; RV64ID-NEXT:    fmax.s fa4, fa0, fa4
-; RV64ID-NEXT:    neg a0, a0
 ; RV64ID-NEXT:    fmin.s fa5, fa4, fa5
 ; RV64ID-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64ID-NEXT:    and a0, a0, a1
@@ -490,12 +490,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; CHECK64-IZFHMIN:       # %bb.0: # %start
 ; CHECK64-IZFHMIN-NEXT:    fcvt.s.h fa5, fa0
 ; CHECK64-IZFHMIN-NEXT:    feq.s a0, fa5, fa5
+; CHECK64-IZFHMIN-NEXT:    neg a0, a0
 ; CHECK64-IZFHMIN-NEXT:    lui a1, %hi(.LCPI1_0)
 ; CHECK64-IZFHMIN-NEXT:    flw fa4, %lo(.LCPI1_0)(a1)
 ; CHECK64-IZFHMIN-NEXT:    lui a1, 815104
 ; CHECK64-IZFHMIN-NEXT:    fmv.w.x fa3, a1
 ; CHECK64-IZFHMIN-NEXT:    fmax.s fa5, fa5, fa3
-; CHECK64-IZFHMIN-NEXT:    neg a0, a0
 ; CHECK64-IZFHMIN-NEXT:    fmin.s fa5, fa5, fa4
 ; CHECK64-IZFHMIN-NEXT:    fcvt.l.s a1, fa5, rtz
 ; CHECK64-IZFHMIN-NEXT:    and a0, a0, a1
@@ -518,15 +518,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; CHECK64-IZHINXMIN-LABEL: fcvt_si_h_sat:
 ; CHECK64-IZHINXMIN:       # %bb.0: # %start
 ; CHECK64-IZHINXMIN-NEXT:    fcvt.s.h a0, a0
-; CHECK64-IZHINXMIN-NEXT:    lui a1, 815104
+; CHECK64-IZHINXMIN-NEXT:    feq.s a1, a0, a0
 ; CHECK64-IZHINXMIN-NEXT:    lui a2, %hi(.LCPI1_0)
 ; CHECK64-IZHINXMIN-NEXT:    lw a2, %lo(.LCPI1_0)(a2)
-; CHECK64-IZHINXMIN-NEXT:    fmax.s a1, a0, a1
-; CHECK64-IZHINXMIN-NEXT:    feq.s a0, a0, a0
-; CHECK64-IZHINXMIN-NEXT:    neg a0, a0
-; CHECK64-IZHINXMIN-NEXT:    fmin.s a1, a1, a2
-; CHECK64-IZHINXMIN-NEXT:    fcvt.l.s a1, a1, rtz
-; CHECK64-IZHINXMIN-NEXT:    and a0, a0, a1
+; CHECK64-IZHINXMIN-NEXT:    neg a1, a1
+; CHECK64-IZHINXMIN-NEXT:    lui a3, 815104
+; CHECK64-IZHINXMIN-NEXT:    fmax.s a0, a0, a3
+; CHECK64-IZHINXMIN-NEXT:    fmin.s a0, a0, a2
+; CHECK64-IZHINXMIN-NEXT:    fcvt.l.s a0, a0, rtz
+; CHECK64-IZHINXMIN-NEXT:    and a0, a1, a0
 ; CHECK64-IZHINXMIN-NEXT:    ret
 ;
 ; CHECK32-IZDINXZHINXMIN-LABEL: fcvt_si_h_sat:
@@ -546,15 +546,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
 ; CHECK64-IZDINXZHINXMIN-LABEL: fcvt_si_h_sat:
 ; CHECK64-IZDINXZHINXMIN:       # %bb.0: # %start
 ; CHECK64-IZDINXZHINXMIN-NEXT:    fcvt.s.h a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT:    lui a1, 815104
+; CHECK64-IZDINXZHINXMIN-NEXT:    feq.s a1, a0, a0
 ; CHECK64-IZDINXZHINXMIN-NEXT:    lui a2, %hi(.LCPI1_0)
 ; CHECK64-IZDINXZHINXMIN-NEXT:    lw a2, %lo(.LCPI1_0)(a2)
-; CHECK64-IZDINXZHINXMIN-NEXT:    fmax.s a1, a0, a1
-; CHECK64-IZDINXZHINXMIN-NEXT:    feq.s a0, a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT:    neg a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT:    fmin.s a1, a1, a2
-; CHECK64-IZDINXZHINXMIN-NEXT:    fcvt.l.s a1, a1, rtz
-; CHECK64-IZDINXZHINXMIN-NEXT:    and a0, a0, a1
+; CHECK64-IZDINXZHINXMIN-NEXT:    neg a1, a1
+; CHECK64-IZDINXZHINXMIN-NEXT:    lui a3, 815104
+; CHECK64-IZDINXZHINXMIN-NEXT:    fmax.s a0, a0, a3
+; CHECK64-IZDINXZHINXMIN-NEXT:    fmin.s a0, a0, a2
+; CHECK64-IZDINXZHINXMIN-NEXT:    fcvt.l.s a0, a0, rtz
+; CHECK64-IZDINXZHINXMIN-NEXT:    and a0, a1, a0
 ; CHECK64-IZDINXZHINXMIN-NEXT:    ret
 start:
   %0 = tail call i16 @llvm.fptosi.sat.i16.f16(half %a)
@@ -6377,12 +6377,12 @@ define signext i16 @fcvt_w_s_sat_i16(half %a) nounwind {
 ; RV64IZFH:       # %bb.0: # %start
 ; RV64IZFH-NEXT:    fcvt.s.h fa5, fa0
 ; RV64IZFH-NEXT:    feq.s a0, fa5, fa5
+; RV64IZFH-NEXT:    neg a0, a0
 ; RV64IZFH-NEXT:    lui a1, %hi(.LCPI32_0)
 ; RV64IZFH-NEXT:    flw fa4, %lo(.LCPI32_0)(a1)
 ; RV64IZFH-NEXT:    lui a1, 815104
 ; RV64IZFH-NEXT:    fmv.w.x fa3, a1
 ; RV64IZFH-NEXT:    fmax.s fa5, fa5, fa3
-; RV64IZFH-NEXT:    neg a0, a0
 ; RV64IZFH-NEXT:    fmin.s fa5, fa5, fa4
 ; RV64IZFH-NEXT:    fcvt.l.s a1, fa5, rtz
 ; RV64IZFH-NEXT:    and a0, a0, a1
@@ -6407,12 +6407,12 @@ define signext i16 @fcvt_w_s_sat_i16(half %a) nounwind {
 ; RV64IDZFH:       # %bb.0: # %start
 ; RV64IDZFH-NEXT:    fcvt.s.h fa5, fa0
 ; RV64IDZFH-NEXT:    feq.s a0, fa5, fa5
+; RV64IDZFH-NEXT:    neg a0, a0
 ; RV64IDZFH-NEXT:    lui a1, %hi(.LCPI32_0)
 ; RV64IDZFH-NEXT:    flw fa4, %lo(.LCPI32_0)(a1)
 ; RV64IDZFH-NEXT:    lui a1, 815104
 ; RV64IDZFH-NEXT:    fmv.w.x fa3, ...
[truncated]

@lcvon007 lcvon007 added backend:RISC-V mi-sched machine instruction scheduler labels Apr 11, 2024
Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though I think this existed traversal of all int VTs is kind of silly, but I don't figure out a better way now.

@lcvon007
Copy link
Contributor Author

LGTM though I think this existed traversal of all int VTs is kind of silly, but I don't figure out a better way now.

This heuristic is first introduced in 66c3dfb by Andrew, and it only check MVT::i32, and do a refactor like current codes to fix llc crash in 350ff2c.

Like X86, i64, i32, i16, i8 are all legal register type with same allocatable register number and it has no break in loop, so it will use the register number of i8 to do final check, and we can see it's a rough checking rule like Andrew says.

I try to find interface to get the XLen to simplify this logic, but it doesn't have yet.

I have updated codes adding one break in the loop so it only checks the first legal register type(it has the same behaviour like before), please help review again @wangpc-pp

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well.

Machine scheduler will suppress register pressure when the scheduling
window is too small, but now it doesn't consider i64 register type,
and this MR extends it into i64 register type, so architecture like
RISCV64 that only supports i64 interger register will have the same
behavior like RISCV32.
@lcvon007 lcvon007 force-pushed the feature_reg_pressure branch from 49329a2 to 72e7829 Compare April 15, 2024 02:42
@lcvon007 lcvon007 merged commit 6b80e2f into llvm:main Apr 15, 2024
4 checks passed
bazuzi pushed a commit to bazuzi/llvm-project that referenced this pull request Apr 15, 2024
Machine scheduler will suppress register pressure when the scheduling
window is too small, but now it doesn't consider i64 register type,
and this MR extends it into i64 register type, so architecture like
RISCV64 that only supports i64 interger register will have the same
behavior like RISCV32.
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 15, 2024
Machine scheduler will suppress register pressure when the scheduling
window is too small, but now it doesn't consider i64 register type,
and this MR extends it into i64 register type, so architecture like
RISCV64 that only supports i64 interger register will have the same
behavior like RISCV32.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants