[mi-sched] Suppress register pressure with i64. #88256
Conversation
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-loongarch

Author: laichunfeng@tencent.com (lcvon007)

Changes: The machine scheduler will suppress register pressure tracking when the scheduling window is too small, but it currently does not consider the i64 register type. This patch extends the check to i64, so an architecture like RISCV64 that only supports i64 integer registers behaves the same as RISCV32.

Patch is 46.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88256.diff

14 Files Affected:
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 203bdffc49ae18..90a6262cff2bce 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -3283,7 +3283,7 @@ void GenericScheduler::initPolicy(MachineBasicBlock::iterator Begin,
// compile time. As a rough heuristic, only track pressure when the number of
// schedulable instructions exceeds half the integer register file.
RegionPolicy.ShouldTrackPressure = true;
- for (unsigned VT = MVT::i32; VT > (unsigned)MVT::i1; --VT) {
+ for (unsigned VT = MVT::i64; VT > (unsigned)MVT::i1; --VT) {
MVT::SimpleValueType LegalIntVT = (MVT::SimpleValueType)VT;
if (TLI->isTypeLegal(LegalIntVT)) {
unsigned NIntRegs = Context->RegClassInfo->getNumAllocatableRegs(
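For readers outside the scheduler, here is a minimal standalone sketch of the heuristic this hunk changes. The names `shouldTrackPressure`, `LegalIntVTs`, and `numAllocRegs` are illustrative stand-ins for the LLVM internals (`TLI->isTypeLegal`, `RegClassInfo->getNumAllocatableRegs`), not real API:

```cpp
#include <functional>
#include <set>

// Integer value types in the same relative order as MVT's integer types;
// only the ordering matters for the widest-to-narrowest walk below.
enum SimpleVT : unsigned { i1 = 1, i8, i16, i32, i64 };

// Sketch of GenericScheduler::initPolicy's rough heuristic: default to
// tracking pressure, then for each legal integer type (now starting at i64)
// re-derive the decision from that type's allocatable register count.
// Without a break, the narrowest legal type visited last wins.
bool shouldTrackPressure(unsigned NumRegionInstrs,
                         const std::set<SimpleVT> &LegalIntVTs,
                         const std::function<unsigned(SimpleVT)> &numAllocRegs) {
  bool Track = true;
  for (unsigned VT = i64; VT > (unsigned)i1; --VT) {
    SimpleVT LegalVT = (SimpleVT)VT;
    if (LegalIntVTs.count(LegalVT)) {
      unsigned NIntRegs = numAllocRegs(LegalVT);
      // Only track pressure once the region exceeds half the register file.
      Track = NumRegionInstrs > (NIntRegs / 2);
    }
  }
  return Track;
}
```

Before the patch the walk began at MVT::i32, so an RV64-like target whose only legal integer type is i64 never reached the body of the `if` and unconditionally tracked pressure.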
diff --git a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
index bf48c0df3e4961..7cde034726e0b5 100644
--- a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
+++ b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
@@ -4,12 +4,12 @@
define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
; LA64-LABEL: atomicrmw_uinc_wrap_i8:
; LA64: # %bb.0:
-; LA64-NEXT: slli.d $a2, $a0, 3
+; LA64-NEXT: slli.d $a4, $a0, 3
; LA64-NEXT: bstrins.d $a0, $zero, 1, 0
-; LA64-NEXT: ori $a3, $zero, 255
-; LA64-NEXT: sll.w $a4, $a3, $a2
+; LA64-NEXT: andi $a2, $a4, 24
+; LA64-NEXT: ori $a5, $zero, 255
; LA64-NEXT: ld.w $a3, $a0, 0
-; LA64-NEXT: andi $a2, $a2, 24
+; LA64-NEXT: sll.w $a4, $a5, $a4
; LA64-NEXT: nor $a4, $a4, $zero
; LA64-NEXT: andi $a1, $a1, 255
; LA64-NEXT: .p2align 4, , 16
@@ -54,13 +54,13 @@ define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
define i16 @atomicrmw_uinc_wrap_i16(ptr %ptr, i16 %val) {
; LA64-LABEL: atomicrmw_uinc_wrap_i16:
; LA64: # %bb.0:
-; LA64-NEXT: slli.d $a2, $a0, 3
+; LA64-NEXT: slli.d $a4, $a0, 3
; LA64-NEXT: bstrins.d $a0, $zero, 1, 0
+; LA64-NEXT: andi $a2, $a4, 24
; LA64-NEXT: lu12i.w $a3, 15
-; LA64-NEXT: ori $a3, $a3, 4095
-; LA64-NEXT: sll.w $a4, $a3, $a2
+; LA64-NEXT: ori $a5, $a3, 4095
; LA64-NEXT: ld.w $a3, $a0, 0
-; LA64-NEXT: andi $a2, $a2, 24
+; LA64-NEXT: sll.w $a4, $a5, $a4
; LA64-NEXT: nor $a4, $a4, $zero
; LA64-NEXT: bstrpick.d $a1, $a1, 15, 0
; LA64-NEXT: .p2align 4, , 16
diff --git a/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll b/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
index d03af114bceefe..18d17751a77196 100644
--- a/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
+++ b/llvm/test/CodeGen/LoongArch/vector-fp-imm.ll
@@ -124,10 +124,10 @@ define void @test_f2(ptr %P, ptr %S) nounwind {
; LA64F: # %bb.0:
; LA64F-NEXT: fld.s $fa0, $a0, 4
; LA64F-NEXT: fld.s $fa1, $a0, 0
-; LA64F-NEXT: pcalau12i $a0, %pc_hi20(.LCPI1_0)
-; LA64F-NEXT: addi.d $a0, $a0, %pc_lo12(.LCPI1_0)
-; LA64F-NEXT: fld.s $fa2, $a0, 0
; LA64F-NEXT: addi.w $a0, $zero, 1
+; LA64F-NEXT: pcalau12i $a2, %pc_hi20(.LCPI1_0)
+; LA64F-NEXT: addi.d $a2, $a2, %pc_lo12(.LCPI1_0)
+; LA64F-NEXT: fld.s $fa2, $a2, 0
; LA64F-NEXT: movgr2fr.w $fa3, $a0
; LA64F-NEXT: ffint.s.w $fa3, $fa3
; LA64F-NEXT: fadd.s $fa1, $fa1, $fa3
@@ -140,10 +140,10 @@ define void @test_f2(ptr %P, ptr %S) nounwind {
; LA64D: # %bb.0:
; LA64D-NEXT: fld.s $fa0, $a0, 4
; LA64D-NEXT: fld.s $fa1, $a0, 0
-; LA64D-NEXT: pcalau12i $a0, %pc_hi20(.LCPI1_0)
-; LA64D-NEXT: addi.d $a0, $a0, %pc_lo12(.LCPI1_0)
-; LA64D-NEXT: fld.s $fa2, $a0, 0
; LA64D-NEXT: addi.w $a0, $zero, 1
+; LA64D-NEXT: pcalau12i $a2, %pc_hi20(.LCPI1_0)
+; LA64D-NEXT: addi.d $a2, $a2, %pc_lo12(.LCPI1_0)
+; LA64D-NEXT: fld.s $fa2, $a2, 0
; LA64D-NEXT: movgr2fr.w $fa3, $a0
; LA64D-NEXT: ffint.s.w $fa3, $fa3
; LA64D-NEXT: fadd.s $fa1, $fa1, $fa3
@@ -527,10 +527,10 @@ define void @test_d2(ptr %P, ptr %S) nounwind {
; LA64D: # %bb.0:
; LA64D-NEXT: fld.d $fa0, $a0, 8
; LA64D-NEXT: fld.d $fa1, $a0, 0
-; LA64D-NEXT: pcalau12i $a0, %pc_hi20(.LCPI4_0)
-; LA64D-NEXT: addi.d $a0, $a0, %pc_lo12(.LCPI4_0)
-; LA64D-NEXT: fld.d $fa2, $a0, 0
; LA64D-NEXT: addi.d $a0, $zero, 1
+; LA64D-NEXT: pcalau12i $a2, %pc_hi20(.LCPI4_0)
+; LA64D-NEXT: addi.d $a2, $a2, %pc_lo12(.LCPI4_0)
+; LA64D-NEXT: fld.d $fa2, $a2, 0
; LA64D-NEXT: movgr2fr.d $fa3, $a0
; LA64D-NEXT: ffint.d.l $fa3, $fa3
; LA64D-NEXT: fadd.d $fa1, $fa1, $fa3
diff --git a/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll b/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
index 5914e45a153302..f96e1bad2e3895 100644
--- a/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
+++ b/llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll
@@ -127,11 +127,11 @@ define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
; RV64IA-LABEL: atomicrmw_uinc_wrap_i8:
; RV64IA: # %bb.0:
; RV64IA-NEXT: andi a2, a0, -4
-; RV64IA-NEXT: slli a0, a0, 3
-; RV64IA-NEXT: li a3, 255
-; RV64IA-NEXT: sllw a4, a3, a0
+; RV64IA-NEXT: slli a4, a0, 3
+; RV64IA-NEXT: andi a0, a4, 24
+; RV64IA-NEXT: li a5, 255
; RV64IA-NEXT: lw a3, 0(a2)
-; RV64IA-NEXT: andi a0, a0, 24
+; RV64IA-NEXT: sllw a4, a5, a4
; RV64IA-NEXT: not a4, a4
; RV64IA-NEXT: andi a1, a1, 255
; RV64IA-NEXT: .LBB0_1: # %atomicrmw.start
diff --git a/llvm/test/CodeGen/RISCV/bfloat-convert.ll b/llvm/test/CodeGen/RISCV/bfloat-convert.ll
index d5041c2a7ca78b..9e2b0b5c3cbb41 100644
--- a/llvm/test/CodeGen/RISCV/bfloat-convert.ll
+++ b/llvm/test/CodeGen/RISCV/bfloat-convert.ll
@@ -84,12 +84,12 @@ define i16 @fcvt_si_bf16_sat(bfloat %a) nounwind {
; CHECK64ZFBFMIN: # %bb.0: # %start
; CHECK64ZFBFMIN-NEXT: fcvt.s.bf16 fa5, fa0
; CHECK64ZFBFMIN-NEXT: feq.s a0, fa5, fa5
+; CHECK64ZFBFMIN-NEXT: neg a0, a0
; CHECK64ZFBFMIN-NEXT: lui a1, %hi(.LCPI1_0)
; CHECK64ZFBFMIN-NEXT: flw fa4, %lo(.LCPI1_0)(a1)
; CHECK64ZFBFMIN-NEXT: lui a1, 815104
; CHECK64ZFBFMIN-NEXT: fmv.w.x fa3, a1
; CHECK64ZFBFMIN-NEXT: fmax.s fa5, fa5, fa3
-; CHECK64ZFBFMIN-NEXT: neg a0, a0
; CHECK64ZFBFMIN-NEXT: fmin.s fa5, fa5, fa4
; CHECK64ZFBFMIN-NEXT: fcvt.l.s a1, fa5, rtz
; CHECK64ZFBFMIN-NEXT: and a0, a0, a1
@@ -187,10 +187,10 @@ define i16 @fcvt_ui_bf16_sat(bfloat %a) nounwind {
;
; RV64ID-LABEL: fcvt_ui_bf16_sat:
; RV64ID: # %bb.0: # %start
-; RV64ID-NEXT: lui a0, %hi(.LCPI3_0)
-; RV64ID-NEXT: flw fa5, %lo(.LCPI3_0)(a0)
; RV64ID-NEXT: fmv.x.w a0, fa0
; RV64ID-NEXT: slli a0, a0, 16
+; RV64ID-NEXT: lui a1, %hi(.LCPI3_0)
+; RV64ID-NEXT: flw fa5, %lo(.LCPI3_0)(a1)
; RV64ID-NEXT: fmv.w.x fa4, a0
; RV64ID-NEXT: fmv.w.x fa3, zero
; RV64ID-NEXT: fmax.s fa4, fa4, fa3
diff --git a/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll b/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
index d8471129433027..67123466354c41 100644
--- a/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
+++ b/llvm/test/CodeGen/RISCV/calling-conv-lp64-lp64f-lp64d-common.ll
@@ -140,11 +140,11 @@ define i64 @caller_large_scalars() nounwind {
; RV64I-NEXT: sd a0, 0(sp)
; RV64I-NEXT: sd zero, 56(sp)
; RV64I-NEXT: sd zero, 48(sp)
-; RV64I-NEXT: li a0, 1
-; RV64I-NEXT: sd a0, 32(sp)
+; RV64I-NEXT: sd zero, 40(sp)
+; RV64I-NEXT: li a2, 1
; RV64I-NEXT: addi a0, sp, 32
; RV64I-NEXT: mv a1, sp
-; RV64I-NEXT: sd zero, 40(sp)
+; RV64I-NEXT: sd a2, 32(sp)
; RV64I-NEXT: call callee_large_scalars
; RV64I-NEXT: ld ra, 72(sp) # 8-byte Folded Reload
; RV64I-NEXT: addi sp, sp, 80
diff --git a/llvm/test/CodeGen/RISCV/double-convert.ll b/llvm/test/CodeGen/RISCV/double-convert.ll
index da882cafd99715..c147d6ec6d9b15 100644
--- a/llvm/test/CodeGen/RISCV/double-convert.ll
+++ b/llvm/test/CodeGen/RISCV/double-convert.ll
@@ -1651,8 +1651,8 @@ define signext i16 @fcvt_w_s_sat_i16(double %a) nounwind {
; RV64IFD-NEXT: lui a0, %hi(.LCPI26_1)
; RV64IFD-NEXT: fld fa4, %lo(.LCPI26_1)(a0)
; RV64IFD-NEXT: feq.d a0, fa0, fa0
-; RV64IFD-NEXT: fmax.d fa5, fa0, fa5
; RV64IFD-NEXT: neg a0, a0
+; RV64IFD-NEXT: fmax.d fa5, fa0, fa5
; RV64IFD-NEXT: fmin.d fa5, fa5, fa4
; RV64IFD-NEXT: fcvt.l.d a1, fa5, rtz
; RV64IFD-NEXT: and a0, a0, a1
@@ -1680,12 +1680,12 @@ define signext i16 @fcvt_w_s_sat_i16(double %a) nounwind {
; RV64IZFINXZDINX-NEXT: ld a1, %lo(.LCPI26_0)(a1)
; RV64IZFINXZDINX-NEXT: lui a2, %hi(.LCPI26_1)
; RV64IZFINXZDINX-NEXT: ld a2, %lo(.LCPI26_1)(a2)
-; RV64IZFINXZDINX-NEXT: fmax.d a1, a0, a1
-; RV64IZFINXZDINX-NEXT: feq.d a0, a0, a0
-; RV64IZFINXZDINX-NEXT: neg a0, a0
-; RV64IZFINXZDINX-NEXT: fmin.d a1, a1, a2
-; RV64IZFINXZDINX-NEXT: fcvt.l.d a1, a1, rtz
-; RV64IZFINXZDINX-NEXT: and a0, a0, a1
+; RV64IZFINXZDINX-NEXT: feq.d a3, a0, a0
+; RV64IZFINXZDINX-NEXT: neg a3, a3
+; RV64IZFINXZDINX-NEXT: fmax.d a0, a0, a1
+; RV64IZFINXZDINX-NEXT: fmin.d a0, a0, a2
+; RV64IZFINXZDINX-NEXT: fcvt.l.d a0, a0, rtz
+; RV64IZFINXZDINX-NEXT: and a0, a3, a0
; RV64IZFINXZDINX-NEXT: ret
;
; RV32I-LABEL: fcvt_w_s_sat_i16:
@@ -2026,8 +2026,8 @@ define signext i8 @fcvt_w_s_sat_i8(double %a) nounwind {
; RV64IFD-NEXT: lui a0, %hi(.LCPI30_1)
; RV64IFD-NEXT: fld fa4, %lo(.LCPI30_1)(a0)
; RV64IFD-NEXT: feq.d a0, fa0, fa0
-; RV64IFD-NEXT: fmax.d fa5, fa0, fa5
; RV64IFD-NEXT: neg a0, a0
+; RV64IFD-NEXT: fmax.d fa5, fa0, fa5
; RV64IFD-NEXT: fmin.d fa5, fa5, fa4
; RV64IFD-NEXT: fcvt.l.d a1, fa5, rtz
; RV64IFD-NEXT: and a0, a0, a1
@@ -2055,12 +2055,12 @@ define signext i8 @fcvt_w_s_sat_i8(double %a) nounwind {
; RV64IZFINXZDINX-NEXT: ld a1, %lo(.LCPI30_0)(a1)
; RV64IZFINXZDINX-NEXT: lui a2, %hi(.LCPI30_1)
; RV64IZFINXZDINX-NEXT: ld a2, %lo(.LCPI30_1)(a2)
-; RV64IZFINXZDINX-NEXT: fmax.d a1, a0, a1
-; RV64IZFINXZDINX-NEXT: feq.d a0, a0, a0
-; RV64IZFINXZDINX-NEXT: neg a0, a0
-; RV64IZFINXZDINX-NEXT: fmin.d a1, a1, a2
-; RV64IZFINXZDINX-NEXT: fcvt.l.d a1, a1, rtz
-; RV64IZFINXZDINX-NEXT: and a0, a0, a1
+; RV64IZFINXZDINX-NEXT: feq.d a3, a0, a0
+; RV64IZFINXZDINX-NEXT: neg a3, a3
+; RV64IZFINXZDINX-NEXT: fmax.d a0, a0, a1
+; RV64IZFINXZDINX-NEXT: fmin.d a0, a0, a2
+; RV64IZFINXZDINX-NEXT: fcvt.l.d a0, a0, rtz
+; RV64IZFINXZDINX-NEXT: and a0, a3, a0
; RV64IZFINXZDINX-NEXT: ret
;
; RV32I-LABEL: fcvt_w_s_sat_i8:
diff --git a/llvm/test/CodeGen/RISCV/float-convert.ll b/llvm/test/CodeGen/RISCV/float-convert.ll
index 2c7315fbe59f6f..653b64ec730496 100644
--- a/llvm/test/CodeGen/RISCV/float-convert.ll
+++ b/llvm/test/CodeGen/RISCV/float-convert.ll
@@ -1424,12 +1424,12 @@ define signext i16 @fcvt_w_s_sat_i16(float %a) nounwind {
; RV64IF-LABEL: fcvt_w_s_sat_i16:
; RV64IF: # %bb.0: # %start
; RV64IF-NEXT: feq.s a0, fa0, fa0
+; RV64IF-NEXT: neg a0, a0
; RV64IF-NEXT: lui a1, %hi(.LCPI24_0)
; RV64IF-NEXT: flw fa5, %lo(.LCPI24_0)(a1)
; RV64IF-NEXT: lui a1, 815104
; RV64IF-NEXT: fmv.w.x fa4, a1
; RV64IF-NEXT: fmax.s fa4, fa0, fa4
-; RV64IF-NEXT: neg a0, a0
; RV64IF-NEXT: fmin.s fa5, fa4, fa5
; RV64IF-NEXT: fcvt.l.s a1, fa5, rtz
; RV64IF-NEXT: and a0, a0, a1
@@ -1450,15 +1450,15 @@ define signext i16 @fcvt_w_s_sat_i16(float %a) nounwind {
;
; RV64IZFINX-LABEL: fcvt_w_s_sat_i16:
; RV64IZFINX: # %bb.0: # %start
-; RV64IZFINX-NEXT: lui a1, 815104
+; RV64IZFINX-NEXT: feq.s a1, a0, a0
; RV64IZFINX-NEXT: lui a2, %hi(.LCPI24_0)
; RV64IZFINX-NEXT: lw a2, %lo(.LCPI24_0)(a2)
-; RV64IZFINX-NEXT: fmax.s a1, a0, a1
-; RV64IZFINX-NEXT: feq.s a0, a0, a0
-; RV64IZFINX-NEXT: neg a0, a0
-; RV64IZFINX-NEXT: fmin.s a1, a1, a2
-; RV64IZFINX-NEXT: fcvt.l.s a1, a1, rtz
-; RV64IZFINX-NEXT: and a0, a0, a1
+; RV64IZFINX-NEXT: neg a1, a1
+; RV64IZFINX-NEXT: lui a3, 815104
+; RV64IZFINX-NEXT: fmax.s a0, a0, a3
+; RV64IZFINX-NEXT: fmin.s a0, a0, a2
+; RV64IZFINX-NEXT: fcvt.l.s a0, a0, rtz
+; RV64IZFINX-NEXT: and a0, a1, a0
; RV64IZFINX-NEXT: ret
;
; RV32I-LABEL: fcvt_w_s_sat_i16:
diff --git a/llvm/test/CodeGen/RISCV/half-convert.ll b/llvm/test/CodeGen/RISCV/half-convert.ll
index 16c096290720d3..277749c75bbbf1 100644
--- a/llvm/test/CodeGen/RISCV/half-convert.ll
+++ b/llvm/test/CodeGen/RISCV/half-convert.ll
@@ -210,12 +210,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64IZFH: # %bb.0: # %start
; RV64IZFH-NEXT: fcvt.s.h fa5, fa0
; RV64IZFH-NEXT: feq.s a0, fa5, fa5
+; RV64IZFH-NEXT: neg a0, a0
; RV64IZFH-NEXT: lui a1, %hi(.LCPI1_0)
; RV64IZFH-NEXT: flw fa4, %lo(.LCPI1_0)(a1)
; RV64IZFH-NEXT: lui a1, 815104
; RV64IZFH-NEXT: fmv.w.x fa3, a1
; RV64IZFH-NEXT: fmax.s fa5, fa5, fa3
-; RV64IZFH-NEXT: neg a0, a0
; RV64IZFH-NEXT: fmin.s fa5, fa5, fa4
; RV64IZFH-NEXT: fcvt.l.s a1, fa5, rtz
; RV64IZFH-NEXT: and a0, a0, a1
@@ -240,12 +240,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64IDZFH: # %bb.0: # %start
; RV64IDZFH-NEXT: fcvt.s.h fa5, fa0
; RV64IDZFH-NEXT: feq.s a0, fa5, fa5
+; RV64IDZFH-NEXT: neg a0, a0
; RV64IDZFH-NEXT: lui a1, %hi(.LCPI1_0)
; RV64IDZFH-NEXT: flw fa4, %lo(.LCPI1_0)(a1)
; RV64IDZFH-NEXT: lui a1, 815104
; RV64IDZFH-NEXT: fmv.w.x fa3, a1
; RV64IDZFH-NEXT: fmax.s fa5, fa5, fa3
-; RV64IDZFH-NEXT: neg a0, a0
; RV64IDZFH-NEXT: fmin.s fa5, fa5, fa4
; RV64IDZFH-NEXT: fcvt.l.s a1, fa5, rtz
; RV64IDZFH-NEXT: and a0, a0, a1
@@ -268,15 +268,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64IZHINX-LABEL: fcvt_si_h_sat:
; RV64IZHINX: # %bb.0: # %start
; RV64IZHINX-NEXT: fcvt.s.h a0, a0
-; RV64IZHINX-NEXT: lui a1, 815104
+; RV64IZHINX-NEXT: feq.s a1, a0, a0
; RV64IZHINX-NEXT: lui a2, %hi(.LCPI1_0)
; RV64IZHINX-NEXT: lw a2, %lo(.LCPI1_0)(a2)
-; RV64IZHINX-NEXT: fmax.s a1, a0, a1
-; RV64IZHINX-NEXT: feq.s a0, a0, a0
-; RV64IZHINX-NEXT: neg a0, a0
-; RV64IZHINX-NEXT: fmin.s a1, a1, a2
-; RV64IZHINX-NEXT: fcvt.l.s a1, a1, rtz
-; RV64IZHINX-NEXT: and a0, a0, a1
+; RV64IZHINX-NEXT: neg a1, a1
+; RV64IZHINX-NEXT: lui a3, 815104
+; RV64IZHINX-NEXT: fmax.s a0, a0, a3
+; RV64IZHINX-NEXT: fmin.s a0, a0, a2
+; RV64IZHINX-NEXT: fcvt.l.s a0, a0, rtz
+; RV64IZHINX-NEXT: and a0, a1, a0
; RV64IZHINX-NEXT: ret
;
; RV32IZDINXZHINX-LABEL: fcvt_si_h_sat:
@@ -296,15 +296,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64IZDINXZHINX-LABEL: fcvt_si_h_sat:
; RV64IZDINXZHINX: # %bb.0: # %start
; RV64IZDINXZHINX-NEXT: fcvt.s.h a0, a0
-; RV64IZDINXZHINX-NEXT: lui a1, 815104
+; RV64IZDINXZHINX-NEXT: feq.s a1, a0, a0
; RV64IZDINXZHINX-NEXT: lui a2, %hi(.LCPI1_0)
; RV64IZDINXZHINX-NEXT: lw a2, %lo(.LCPI1_0)(a2)
-; RV64IZDINXZHINX-NEXT: fmax.s a1, a0, a1
-; RV64IZDINXZHINX-NEXT: feq.s a0, a0, a0
-; RV64IZDINXZHINX-NEXT: neg a0, a0
-; RV64IZDINXZHINX-NEXT: fmin.s a1, a1, a2
-; RV64IZDINXZHINX-NEXT: fcvt.l.s a1, a1, rtz
-; RV64IZDINXZHINX-NEXT: and a0, a0, a1
+; RV64IZDINXZHINX-NEXT: neg a1, a1
+; RV64IZDINXZHINX-NEXT: lui a3, 815104
+; RV64IZDINXZHINX-NEXT: fmax.s a0, a0, a3
+; RV64IZDINXZHINX-NEXT: fmin.s a0, a0, a2
+; RV64IZDINXZHINX-NEXT: fcvt.l.s a0, a0, rtz
+; RV64IZDINXZHINX-NEXT: and a0, a1, a0
; RV64IZDINXZHINX-NEXT: ret
;
; RV32I-LABEL: fcvt_si_h_sat:
@@ -420,12 +420,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64ID-LP64-NEXT: call __extendhfsf2
; RV64ID-LP64-NEXT: fmv.w.x fa5, a0
; RV64ID-LP64-NEXT: feq.s a0, fa5, fa5
+; RV64ID-LP64-NEXT: neg a0, a0
; RV64ID-LP64-NEXT: lui a1, %hi(.LCPI1_0)
; RV64ID-LP64-NEXT: flw fa4, %lo(.LCPI1_0)(a1)
; RV64ID-LP64-NEXT: lui a1, 815104
; RV64ID-LP64-NEXT: fmv.w.x fa3, a1
; RV64ID-LP64-NEXT: fmax.s fa5, fa5, fa3
-; RV64ID-LP64-NEXT: neg a0, a0
; RV64ID-LP64-NEXT: fmin.s fa5, fa5, fa4
; RV64ID-LP64-NEXT: fcvt.l.s a1, fa5, rtz
; RV64ID-LP64-NEXT: and a0, a0, a1
@@ -458,12 +458,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; RV64ID-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
; RV64ID-NEXT: call __extendhfsf2
; RV64ID-NEXT: feq.s a0, fa0, fa0
+; RV64ID-NEXT: neg a0, a0
; RV64ID-NEXT: lui a1, %hi(.LCPI1_0)
; RV64ID-NEXT: flw fa5, %lo(.LCPI1_0)(a1)
; RV64ID-NEXT: lui a1, 815104
; RV64ID-NEXT: fmv.w.x fa4, a1
; RV64ID-NEXT: fmax.s fa4, fa0, fa4
-; RV64ID-NEXT: neg a0, a0
; RV64ID-NEXT: fmin.s fa5, fa4, fa5
; RV64ID-NEXT: fcvt.l.s a1, fa5, rtz
; RV64ID-NEXT: and a0, a0, a1
@@ -490,12 +490,12 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; CHECK64-IZFHMIN: # %bb.0: # %start
; CHECK64-IZFHMIN-NEXT: fcvt.s.h fa5, fa0
; CHECK64-IZFHMIN-NEXT: feq.s a0, fa5, fa5
+; CHECK64-IZFHMIN-NEXT: neg a0, a0
; CHECK64-IZFHMIN-NEXT: lui a1, %hi(.LCPI1_0)
; CHECK64-IZFHMIN-NEXT: flw fa4, %lo(.LCPI1_0)(a1)
; CHECK64-IZFHMIN-NEXT: lui a1, 815104
; CHECK64-IZFHMIN-NEXT: fmv.w.x fa3, a1
; CHECK64-IZFHMIN-NEXT: fmax.s fa5, fa5, fa3
-; CHECK64-IZFHMIN-NEXT: neg a0, a0
; CHECK64-IZFHMIN-NEXT: fmin.s fa5, fa5, fa4
; CHECK64-IZFHMIN-NEXT: fcvt.l.s a1, fa5, rtz
; CHECK64-IZFHMIN-NEXT: and a0, a0, a1
@@ -518,15 +518,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; CHECK64-IZHINXMIN-LABEL: fcvt_si_h_sat:
; CHECK64-IZHINXMIN: # %bb.0: # %start
; CHECK64-IZHINXMIN-NEXT: fcvt.s.h a0, a0
-; CHECK64-IZHINXMIN-NEXT: lui a1, 815104
+; CHECK64-IZHINXMIN-NEXT: feq.s a1, a0, a0
; CHECK64-IZHINXMIN-NEXT: lui a2, %hi(.LCPI1_0)
; CHECK64-IZHINXMIN-NEXT: lw a2, %lo(.LCPI1_0)(a2)
-; CHECK64-IZHINXMIN-NEXT: fmax.s a1, a0, a1
-; CHECK64-IZHINXMIN-NEXT: feq.s a0, a0, a0
-; CHECK64-IZHINXMIN-NEXT: neg a0, a0
-; CHECK64-IZHINXMIN-NEXT: fmin.s a1, a1, a2
-; CHECK64-IZHINXMIN-NEXT: fcvt.l.s a1, a1, rtz
-; CHECK64-IZHINXMIN-NEXT: and a0, a0, a1
+; CHECK64-IZHINXMIN-NEXT: neg a1, a1
+; CHECK64-IZHINXMIN-NEXT: lui a3, 815104
+; CHECK64-IZHINXMIN-NEXT: fmax.s a0, a0, a3
+; CHECK64-IZHINXMIN-NEXT: fmin.s a0, a0, a2
+; CHECK64-IZHINXMIN-NEXT: fcvt.l.s a0, a0, rtz
+; CHECK64-IZHINXMIN-NEXT: and a0, a1, a0
; CHECK64-IZHINXMIN-NEXT: ret
;
; CHECK32-IZDINXZHINXMIN-LABEL: fcvt_si_h_sat:
@@ -546,15 +546,15 @@ define i16 @fcvt_si_h_sat(half %a) nounwind {
; CHECK64-IZDINXZHINXMIN-LABEL: fcvt_si_h_sat:
; CHECK64-IZDINXZHINXMIN: # %bb.0: # %start
; CHECK64-IZDINXZHINXMIN-NEXT: fcvt.s.h a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT: lui a1, 815104
+; CHECK64-IZDINXZHINXMIN-NEXT: feq.s a1, a0, a0
; CHECK64-IZDINXZHINXMIN-NEXT: lui a2, %hi(.LCPI1_0)
; CHECK64-IZDINXZHINXMIN-NEXT: lw a2, %lo(.LCPI1_0)(a2)
-; CHECK64-IZDINXZHINXMIN-NEXT: fmax.s a1, a0, a1
-; CHECK64-IZDINXZHINXMIN-NEXT: feq.s a0, a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT: neg a0, a0
-; CHECK64-IZDINXZHINXMIN-NEXT: fmin.s a1, a1, a2
-; CHECK64-IZDINXZHINXMIN-NEXT: fcvt.l.s a1, a1, rtz
-; CHECK64-IZDINXZHINXMIN-NEXT: and a0, a0, a1
+; CHECK64-IZDINXZHINXMIN-NEXT: neg a1, a1
+; CHECK64-IZDINXZHINXMIN-NEXT: lui a3, 815104
+; CHECK64-IZDINXZHINXMIN-NEXT: fmax.s a0, a0, a3
+; CHECK64-IZDINXZHINXMIN-NEXT: fmin.s a0, a0, a2
+; CHECK64-IZDINXZHINXMIN-NEXT: fcvt.l.s a0, a0, rtz
+; CHECK64-IZDINXZHINXMIN-NEXT: and a0, a1, a0
; CHECK64-IZDINXZHINXMIN-NEXT: ret
start:
%0 = tail call i16 @llvm.fptosi.sat.i16.f16(half %a)
@@ -6377,12 +6377,12 @@ define signext i16 @fcvt_w_s_sat_i16(half %a) nounwind {
; RV64IZFH: # %bb.0: # %start
; RV64IZFH-NEXT: fcvt.s.h fa5, fa0
; RV64IZFH-NEXT: feq.s a0, fa5, fa5
+; RV64IZFH-NEXT: neg a0, a0
; RV64IZFH-NEXT: lui a1, %hi(.LCPI32_0)
; RV64IZFH-NEXT: flw fa4, %lo(.LCPI32_0)(a1)
; RV64IZFH-NEXT: lui a1, 815104
; RV64IZFH-NEXT: fmv.w.x fa3, a1
; RV64IZFH-NEXT: fmax.s fa5, fa5, fa3
-; RV64IZFH-NEXT: neg a0, a0
; RV64IZFH-NEXT: fmin.s fa5, fa5, fa4
; RV64IZFH-NEXT: fcvt.l.s a1, fa5, rtz
; RV64IZFH-NEXT: and a0, a0, a1
@@ -6407,12 +6407,12 @@ define signext i16 @fcvt_w_s_sat_i16(half %a) nounwind {
; RV64IDZFH: # %bb.0: # %start
; RV64IDZFH-NEXT: fcvt.s.h fa5, fa0
; RV64IDZFH-NEXT: feq.s a0, fa5, fa5
+; RV64IDZFH-NEXT: neg a0, a0
; RV64IDZFH-NEXT: lui a1, %hi(.LCPI32_0)
; RV64IDZFH-NEXT: flw fa4, %lo(.LCPI32_0)(a1)
; RV64IDZFH-NEXT: lui a1, 815104
; RV64IDZFH-NEXT: fmv.w.x fa3, ...
[truncated]
LGTM, though I think this existing traversal of all int VTs is kind of silly; I can't figure out a better way right now.
This heuristic was first introduced in 66c3dfb by Andrew, and it only checked MVT::i32; it was later refactored into the current shape to fix an llc crash in 350ff2c. On a target like X86, i64, i32, i16, and i8 are all legal register types with the same number of allocatable registers, and since the loop has no break it ends up using the register count of i8 for the final check — a rough checking rule, as Andrew says. I tried to find an interface to get the XLen to simplify this logic, but none exists yet. I have updated the code to add a break in the loop so it only checks the first legal register type (which preserves the previous behaviour). Please help review again @wangpc-pp
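Concretely, the updated loop described here would look roughly like this (a sketch of the shape, not the verbatim committed code):

```cpp
for (unsigned VT = MVT::i64; VT > (unsigned)MVT::i1; --VT) {
  MVT::SimpleValueType LegalIntVT = (MVT::SimpleValueType)VT;
  if (TLI->isTypeLegal(LegalIntVT)) {
    unsigned NIntRegs = Context->RegClassInfo->getNumAllocatableRegs(
        TLI->getRegClassFor(LegalIntVT));
    RegionPolicy.ShouldTrackPressure = NumRegionInstrs > (NIntRegs / 2);
    // Only the widest legal integer type decides; on targets like X86 the
    // narrower types would just recompute the same rough threshold.
    break;
  }
}
```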
LGTM as well.
The machine scheduler will suppress register pressure tracking when the scheduling window is too small, but it currently does not consider the i64 register type. This MR extends the check to i64, so an architecture like RISCV64 that only supports i64 integer registers will have the same behavior as RISCV32.
Force-pushed from 49329a2 to 72e7829.
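As a rough, self-contained illustration of the behavioral change the commit message describes (the count of 28 allocatable registers is an assumption for illustration, not a figure from the patch):

```cpp
#include <cstdio>

int main() {
  const unsigned NIntRegs = 28;        // assumed allocatable GPRs on RV64
  const unsigned NumRegionInstrs = 10; // a small scheduling region

  // Before: the walk started at i32; RV64's only legal integer type (i64)
  // was never visited, so the default "track pressure" always stood.
  bool TrackBefore = true;

  // After: i64 is visited, so small regions suppress pressure tracking,
  // matching what RV32 already did via its legal i32 type.
  bool TrackAfter = NumRegionInstrs > (NIntRegs / 2);

  std::printf("RV64 ShouldTrackPressure before: %d, after: %d\n",
              TrackBefore, TrackAfter);
}
```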