[RISCV] Strip W suffix from ADDIW #68425
The original motivation of this change was simply to reduce test duplication. As can be seen in the (massive) test delta, we have many tests whose output differs only due to the use of addi on rv32 vs addiw on rv64 when the high bits are don't care. However, after reading the ISA specification, I believe this to also be a compressibility optimization. There don't seem to be compressed versions of these instructions (or of slliw, despite what the previous comment says), so using the non-W variant should allow the formation of more compressed instructions. As an aside, we don't need to worry about the non-zero immediate restriction on the compressed variants because we're not directly forming the compressed variants. If we happen to get a zero immediate for e.g. the ADDI, then either a later optimization will strip the useless instruction or the encoder is responsible for not compressing the instruction.
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v
Patch is 1.49 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/68425.diff
113 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index 439a1bb6e1e69d2..3a756db977904b6 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -12,11 +12,12 @@
// extended bits aren't consumed or because the input was already sign extended
// by an earlier instruction.
//
-// Then it removes the -w suffix from addw, slliw and mulw instructions
-// whenever all users are dependent only on the lower word of the result of the
-// instruction. We do this only for addw, slliw, and mulw because the -w forms
-// are less compressible: c.add and c.slli have a larger register encoding than
-// their w counterparts, and there's no compressible version of mulw.
+// Then it removes the -w suffix from opw instructions whenever all users are
+// dependent only on the lower word of the result of the instruction. This is
+// profitable for addw because c.add has a larger register encoding than c.addw.
+// For the remaining opw instructions, there is no compressed w variant. This
+// tramsform also has the side effect of making RV32 and RV64 codegen for 32
+// bit constants match which helps reduce check duplication in LIT tests.
//
//===---------------------------------------------------------------------===//
@@ -661,8 +662,11 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
default:
continue;
case RISCV::ADDW: Opc = RISCV::ADD; break;
+ case RISCV::ADDIW: Opc = RISCV::ADDI; break;
case RISCV::MULW: Opc = RISCV::MUL; break;
case RISCV::SLLIW: Opc = RISCV::SLLI; break;
+ case RISCV::SRLIW: Opc = RISCV::SRLI; break;
+ case RISCV::SRAIW: Opc = RISCV::SRAI; break;
}
if (hasAllWUsers(MI, ST, MRI)) {
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
index c503d6541b0a577..16a81b79b2f3fc2 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/alu-roundtrip.ll
@@ -34,9 +34,9 @@ define i32 @add_i8_signext_i32(i8 %a, i8 %b) {
; RV64IM-LABEL: add_i8_signext_i32:
; RV64IM: # %bb.0: # %entry
; RV64IM-NEXT: slli a0, a0, 24
-; RV64IM-NEXT: sraiw a0, a0, 24
+; RV64IM-NEXT: srai a0, a0, 24
; RV64IM-NEXT: slli a1, a1, 24
-; RV64IM-NEXT: sraiw a1, a1, 24
+; RV64IM-NEXT: srai a1, a1, 24
; RV64IM-NEXT: addw a0, a0, a1
; RV64IM-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/RISCV/add-before-shl.ll b/llvm/test/CodeGen/RISCV/add-before-shl.ll
index a41664fde38581c..274f1cef49aa955 100644
--- a/llvm/test/CodeGen/RISCV/add-before-shl.ll
+++ b/llvm/test/CodeGen/RISCV/add-before-shl.ll
@@ -25,7 +25,7 @@ define signext i32 @add_small_const(i32 signext %a) nounwind {
;
; RV64I-LABEL: add_small_const:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 1
+; RV64I-NEXT: addi a0, a0, 1
; RV64I-NEXT: slli a0, a0, 56
; RV64I-NEXT: srai a0, a0, 56
; RV64I-NEXT: jalr zero, 0(ra)
@@ -39,7 +39,7 @@ define signext i32 @add_small_const(i32 signext %a) nounwind {
;
; RV64C-LABEL: add_small_const:
; RV64C: # %bb.0:
-; RV64C-NEXT: c.addiw a0, 1
+; RV64C-NEXT: c.addi a0, 1
; RV64C-NEXT: c.slli a0, 56
; RV64C-NEXT: c.srai a0, 56
; RV64C-NEXT: c.jr ra
@@ -78,7 +78,7 @@ define signext i32 @add_large_const(i32 signext %a) nounwind {
; RV64C-LABEL: add_large_const:
; RV64C: # %bb.0:
; RV64C-NEXT: c.lui a1, 1
-; RV64C-NEXT: c.addiw a1, -1
+; RV64C-NEXT: c.addi a1, -1
; RV64C-NEXT: c.add a0, a1
; RV64C-NEXT: c.slli a0, 48
; RV64C-NEXT: c.srai a0, 48
@@ -118,7 +118,7 @@ define signext i32 @add_huge_const(i32 signext %a) nounwind {
; RV64C-LABEL: add_huge_const:
; RV64C: # %bb.0:
; RV64C-NEXT: c.lui a1, 8
-; RV64C-NEXT: c.addiw a1, -1
+; RV64C-NEXT: c.addi a1, -1
; RV64C-NEXT: c.add a0, a1
; RV64C-NEXT: c.slli a0, 48
; RV64C-NEXT: c.srai a0, 48
@@ -139,7 +139,7 @@ define signext i24 @add_non_machine_type(i24 signext %a) nounwind {
;
; RV64I-LABEL: add_non_machine_type:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 256
+; RV64I-NEXT: addi a0, a0, 256
; RV64I-NEXT: slli a0, a0, 52
; RV64I-NEXT: srai a0, a0, 40
; RV64I-NEXT: jalr zero, 0(ra)
@@ -153,7 +153,7 @@ define signext i24 @add_non_machine_type(i24 signext %a) nounwind {
;
; RV64C-LABEL: add_non_machine_type:
; RV64C: # %bb.0:
-; RV64C-NEXT: addiw a0, a0, 256
+; RV64C-NEXT: addi a0, a0, 256
; RV64C-NEXT: c.slli a0, 52
; RV64C-NEXT: c.srai a0, 40
; RV64C-NEXT: c.jr ra
diff --git a/llvm/test/CodeGen/RISCV/add-imm.ll b/llvm/test/CodeGen/RISCV/add-imm.ll
index 700fec0192d3e74..52751f1c224211f 100644
--- a/llvm/test/CodeGen/RISCV/add-imm.ll
+++ b/llvm/test/CodeGen/RISCV/add-imm.ll
@@ -29,7 +29,7 @@ define i32 @add_positive_low_bound_accept(i32 %a) nounwind {
;
; RV64I-LABEL: add_positive_low_bound_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 2047
+; RV64I-NEXT: addi a0, a0, 2047
; RV64I-NEXT: addiw a0, a0, 1
; RV64I-NEXT: ret
%1 = add i32 %a, 2048
@@ -45,7 +45,7 @@ define i32 @add_positive_high_bound_accept(i32 %a) nounwind {
;
; RV64I-LABEL: add_positive_high_bound_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 2047
+; RV64I-NEXT: addi a0, a0, 2047
; RV64I-NEXT: addiw a0, a0, 2047
; RV64I-NEXT: ret
%1 = add i32 %a, 4094
@@ -63,7 +63,7 @@ define i32 @add_positive_high_bound_reject(i32 %a) nounwind {
; RV64I-LABEL: add_positive_high_bound_reject:
; RV64I: # %bb.0:
; RV64I-NEXT: lui a1, 1
-; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: addi a1, a1, -1
; RV64I-NEXT: addw a0, a0, a1
; RV64I-NEXT: ret
%1 = add i32 %a, 4095
@@ -93,7 +93,7 @@ define i32 @add_negative_high_bound_accept(i32 %a) nounwind {
;
; RV64I-LABEL: add_negative_high_bound_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, -2048
+; RV64I-NEXT: addi a0, a0, -2048
; RV64I-NEXT: addiw a0, a0, -1
; RV64I-NEXT: ret
%1 = add i32 %a, -2049
@@ -109,7 +109,7 @@ define i32 @add_negative_low_bound_accept(i32 %a) nounwind {
;
; RV64I-LABEL: add_negative_low_bound_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, -2048
+; RV64I-NEXT: addi a0, a0, -2048
; RV64I-NEXT: addiw a0, a0, -2048
; RV64I-NEXT: ret
%1 = add i32 %a, -4096
@@ -127,7 +127,7 @@ define i32 @add_negative_low_bound_reject(i32 %a) nounwind {
; RV64I-LABEL: add_negative_low_bound_reject:
; RV64I: # %bb.0:
; RV64I-NEXT: lui a1, 1048575
-; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: addi a1, a1, -1
; RV64I-NEXT: addw a0, a0, a1
; RV64I-NEXT: ret
%1 = add i32 %a, -4097
@@ -143,7 +143,7 @@ define i32 @add32_accept(i32 %a) nounwind {
;
; RV64I-LABEL: add32_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 2047
+; RV64I-NEXT: addi a0, a0, 2047
; RV64I-NEXT: addiw a0, a0, 952
; RV64I-NEXT: ret
%1 = add i32 %a, 2999
@@ -159,7 +159,7 @@ define signext i32 @add32_sext_accept(i32 signext %a) nounwind {
;
; RV64I-LABEL: add32_sext_accept:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 2047
+; RV64I-NEXT: addi a0, a0, 2047
; RV64I-NEXT: addiw a0, a0, 952
; RV64I-NEXT: ret
%1 = add i32 %a, 2999
@@ -178,7 +178,7 @@ define signext i32 @add32_sext_reject_on_rv64(i32 signext %a) nounwind {
;
; RV64I-LABEL: add32_sext_reject_on_rv64:
; RV64I: # %bb.0:
-; RV64I-NEXT: addiw a0, a0, 2047
+; RV64I-NEXT: addi a0, a0, 2047
; RV64I-NEXT: addiw a0, a0, 953
; RV64I-NEXT: lui a1, %hi(gv0)
; RV64I-NEXT: sw a0, %lo(gv0)(a1)
@@ -231,7 +231,7 @@ define void @add32_reject() nounwind {
; RV64I-NEXT: lui a2, %hi(gb)
; RV64I-NEXT: lw a3, %lo(gb)(a2)
; RV64I-NEXT: lui a4, 1
-; RV64I-NEXT: addiw a4, a4, -1096
+; RV64I-NEXT: addi a4, a4, -1096
; RV64I-NEXT: add a1, a1, a4
; RV64I-NEXT: add a3, a3, a4
; RV64I-NEXT: sw a1, %lo(ga)(a0)
diff --git a/llvm/test/CodeGen/RISCV/addimm-mulimm.ll b/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
index d1bc480455dd35f..48fa69e10456563 100644
--- a/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
+++ b/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
@@ -84,7 +84,7 @@ define i32 @add_mul_combine_accept_b1(i32 %x) {
; RV64IMB-NEXT: li a1, 23
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: lui a1, 50
-; RV64IMB-NEXT: addiw a1, a1, 1119
+; RV64IMB-NEXT: addi a1, a1, 1119
; RV64IMB-NEXT: addw a0, a0, a1
; RV64IMB-NEXT: ret
%tmp0 = add i32 %x, 8953
@@ -107,7 +107,7 @@ define signext i32 @add_mul_combine_accept_b2(i32 signext %x) {
; RV64IMB-NEXT: li a1, 23
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: lui a1, 50
-; RV64IMB-NEXT: addiw a1, a1, 1119
+; RV64IMB-NEXT: addi a1, a1, 1119
; RV64IMB-NEXT: addw a0, a0, a1
; RV64IMB-NEXT: ret
%tmp0 = add i32 %x, 8953
@@ -153,7 +153,7 @@ define i32 @add_mul_combine_reject_a1(i32 %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_a1:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1971
+; RV64IMB-NEXT: addi a0, a0, 1971
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mulw a0, a0, a1
; RV64IMB-NEXT: ret
@@ -172,7 +172,7 @@ define signext i32 @add_mul_combine_reject_a2(i32 signext %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_a2:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1971
+; RV64IMB-NEXT: addi a0, a0, 1971
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mulw a0, a0, a1
; RV64IMB-NEXT: ret
@@ -217,7 +217,7 @@ define i32 @add_mul_combine_reject_c1(i32 %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_c1:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1000
+; RV64IMB-NEXT: addi a0, a0, 1000
; RV64IMB-NEXT: sh3add a1, a0, a0
; RV64IMB-NEXT: sh3add a0, a1, a0
; RV64IMB-NEXT: sext.w a0, a0
@@ -237,7 +237,7 @@ define signext i32 @add_mul_combine_reject_c2(i32 signext %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_c2:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1000
+; RV64IMB-NEXT: addi a0, a0, 1000
; RV64IMB-NEXT: sh3add a1, a0, a0
; RV64IMB-NEXT: sh3add a0, a1, a0
; RV64IMB-NEXT: sext.w a0, a0
@@ -349,7 +349,7 @@ define i32 @add_mul_combine_reject_e1(i32 %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_e1:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1971
+; RV64IMB-NEXT: addi a0, a0, 1971
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mulw a0, a0, a1
; RV64IMB-NEXT: ret
@@ -368,7 +368,7 @@ define signext i32 @add_mul_combine_reject_e2(i32 signext %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_e2:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1971
+; RV64IMB-NEXT: addi a0, a0, 1971
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mulw a0, a0, a1
; RV64IMB-NEXT: ret
@@ -414,7 +414,7 @@ define i32 @add_mul_combine_reject_f1(i32 %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_f1:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1972
+; RV64IMB-NEXT: addi a0, a0, 1972
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 11
@@ -435,7 +435,7 @@ define signext i32 @add_mul_combine_reject_f2(i32 signext %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_f2:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 1972
+; RV64IMB-NEXT: addi a0, a0, 1972
; RV64IMB-NEXT: li a1, 29
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 11
@@ -483,7 +483,7 @@ define i32 @add_mul_combine_reject_g1(i32 %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_g1:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 100
+; RV64IMB-NEXT: addi a0, a0, 100
; RV64IMB-NEXT: sh3add a1, a0, a0
; RV64IMB-NEXT: sh3add a0, a1, a0
; RV64IMB-NEXT: addiw a0, a0, 10
@@ -504,7 +504,7 @@ define signext i32 @add_mul_combine_reject_g2(i32 signext %x) {
;
; RV64IMB-LABEL: add_mul_combine_reject_g2:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 100
+; RV64IMB-NEXT: addi a0, a0, 100
; RV64IMB-NEXT: sh3add a1, a0, a0
; RV64IMB-NEXT: sh3add a0, a1, a0
; RV64IMB-NEXT: addiw a0, a0, 10
@@ -581,9 +581,9 @@ define i32 @mul3000_add8990_a(i32 %x) {
;
; RV64IMB-LABEL: mul3000_add8990_a:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 3
+; RV64IMB-NEXT: addi a0, a0, 3
; RV64IMB-NEXT: lui a1, 1
-; RV64IMB-NEXT: addiw a1, a1, -1096
+; RV64IMB-NEXT: addi a1, a1, -1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, -10
; RV64IMB-NEXT: ret
@@ -604,9 +604,9 @@ define signext i32 @mul3000_add8990_b(i32 signext %x) {
;
; RV64IMB-LABEL: mul3000_add8990_b:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 3
+; RV64IMB-NEXT: addi a0, a0, 3
; RV64IMB-NEXT: lui a1, 1
-; RV64IMB-NEXT: addiw a1, a1, -1096
+; RV64IMB-NEXT: addi a1, a1, -1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, -10
; RV64IMB-NEXT: ret
@@ -656,9 +656,9 @@ define i32 @mul3000_sub8990_a(i32 %x) {
;
; RV64IMB-LABEL: mul3000_sub8990_a:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, -3
+; RV64IMB-NEXT: addi a0, a0, -3
; RV64IMB-NEXT: lui a1, 1
-; RV64IMB-NEXT: addiw a1, a1, -1096
+; RV64IMB-NEXT: addi a1, a1, -1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 10
; RV64IMB-NEXT: ret
@@ -679,9 +679,9 @@ define signext i32 @mul3000_sub8990_b(i32 signext %x) {
;
; RV64IMB-LABEL: mul3000_sub8990_b:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, -3
+; RV64IMB-NEXT: addi a0, a0, -3
; RV64IMB-NEXT: lui a1, 1
-; RV64IMB-NEXT: addiw a1, a1, -1096
+; RV64IMB-NEXT: addi a1, a1, -1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 10
; RV64IMB-NEXT: ret
@@ -732,9 +732,9 @@ define i32 @mulneg3000_add8990_a(i32 %x) {
;
; RV64IMB-LABEL: mulneg3000_add8990_a:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, -3
+; RV64IMB-NEXT: addi a0, a0, -3
; RV64IMB-NEXT: lui a1, 1048575
-; RV64IMB-NEXT: addiw a1, a1, 1096
+; RV64IMB-NEXT: addi a1, a1, 1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, -10
; RV64IMB-NEXT: ret
@@ -755,9 +755,9 @@ define signext i32 @mulneg3000_add8990_b(i32 signext %x) {
;
; RV64IMB-LABEL: mulneg3000_add8990_b:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, -3
+; RV64IMB-NEXT: addi a0, a0, -3
; RV64IMB-NEXT: lui a1, 1048575
-; RV64IMB-NEXT: addiw a1, a1, 1096
+; RV64IMB-NEXT: addi a1, a1, 1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, -10
; RV64IMB-NEXT: ret
@@ -808,9 +808,9 @@ define i32 @mulneg3000_sub8990_a(i32 %x) {
;
; RV64IMB-LABEL: mulneg3000_sub8990_a:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 3
+; RV64IMB-NEXT: addi a0, a0, 3
; RV64IMB-NEXT: lui a1, 1048575
-; RV64IMB-NEXT: addiw a1, a1, 1096
+; RV64IMB-NEXT: addi a1, a1, 1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 10
; RV64IMB-NEXT: ret
@@ -831,9 +831,9 @@ define signext i32 @mulneg3000_sub8990_b(i32 signext %x) {
;
; RV64IMB-LABEL: mulneg3000_sub8990_b:
; RV64IMB: # %bb.0:
-; RV64IMB-NEXT: addiw a0, a0, 3
+; RV64IMB-NEXT: addi a0, a0, 3
; RV64IMB-NEXT: lui a1, 1048575
-; RV64IMB-NEXT: addiw a1, a1, 1096
+; RV64IMB-NEXT: addi a1, a1, 1096
; RV64IMB-NEXT: mul a0, a0, a1
; RV64IMB-NEXT: addiw a0, a0, 10
; RV64IMB-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/and.ll b/llvm/test/CodeGen/RISCV/and.ll
index 5eff422013da6a8..79e3b954c50d8d8 100644
--- a/llvm/test/CodeGen/RISCV/and.ll
+++ b/llvm/test/CodeGen/RISCV/and.ll
@@ -195,7 +195,7 @@ define i64 @and64_0x7fffffff00000000(i64 %x) {
; RV64I-LABEL: and64_0x7fffffff00000000:
; RV64I: # %bb.0:
; RV64I-NEXT: lui a1, 524288
-; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: addi a1, a1, -1
; RV64I-NEXT: slli a1, a1, 32
; RV64I-NEXT: and a0, a0, a1
; RV64I-NEXT: ret
diff --git a/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll b/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
index f900b5161f75128..eea4cb72938af23 100644
--- a/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
+++ b/llvm/test/CodeGen/RISCV/atomic-cmpxchg.ll
@@ -1104,7 +1104,7 @@ define void @cmpxchg_i16_monotonic_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounw
; RV64IA-NEXT: andi a3, a0, -4
; RV64IA-NEXT: slli a0, a0, 3
; RV64IA-NEXT: lui a4, 16
-; RV64IA-NEXT: addiw a4, a4, -1
+; RV64IA-NEXT: addi a4, a4, -1
; RV64IA-NEXT: sllw a5, a4, a0
; RV64IA-NEXT: and a1, a1, a4
; RV64IA-NEXT: sllw a1, a1, a0
@@ -1206,7 +1206,7 @@ define void @cmpxchg_i16_acquire_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
; RV64IA-WMO-NEXT: andi a3, a0, -4
; RV64IA-WMO-NEXT: slli a0, a0, 3
; RV64IA-WMO-NEXT: lui a4, 16
-; RV64IA-WMO-NEXT: addiw a4, a4, -1
+; RV64IA-WMO-NEXT: addi a4, a4, -1
; RV64IA-WMO-NEXT: sllw a5, a4, a0
; RV64IA-WMO-NEXT: and a1, a1, a4
; RV64IA-WMO-NEXT: sllw a1, a1, a0
@@ -1230,7 +1230,7 @@ define void @cmpxchg_i16_acquire_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
; RV64IA-TSO-NEXT: andi a3, a0, -4
; RV64IA-TSO-NEXT: slli a0, a0, 3
; RV64IA-TSO-NEXT: lui a4, 16
-; RV64IA-TSO-NEXT: addiw a4, a4, -1
+; RV64IA-TSO-NEXT: addi a4, a4, -1
; RV64IA-TSO-NEXT: sllw a5, a4, a0
; RV64IA-TSO-NEXT: and a1, a1, a4
; RV64IA-TSO-NEXT: sllw a1, a1, a0
@@ -1332,7 +1332,7 @@ define void @cmpxchg_i16_acquire_acquire(ptr %ptr, i16 %cmp, i16 %val) nounwind
; RV64IA-WMO-NEXT: andi a3, a0, -4
; RV64IA-WMO-NEXT: slli a0, a0, 3
; RV64IA-WMO-NEXT: lui a4, 16
-; RV64IA-WMO-NEXT: addiw a4, a4, -1
+; RV64IA-WMO-NEXT: addi a4, a4, -1
; RV64IA-WMO-NEXT: sllw a5, a4, a0
; RV64IA-WMO-NEXT: and a1, a1, a4
; RV64IA-WMO-NEXT: sllw a1, a1, a0
@@ -1356,7 +1356,7 @@ define void @cmpxchg_i16_acquire_acquire(ptr %ptr, i16 %cmp, i16 %val) nounwind
; RV64IA-TSO-NEXT: andi a3, a0, -4
; RV64IA-TSO-NEXT: slli a0, a0, 3
; RV64IA-TSO-NEXT: lui a4, 16
-; RV64IA-TSO-NEXT: addiw a4, a4, -1
+; RV64IA-TSO-NEXT: addi a4, a4, -1
; RV64IA-TSO-NEXT: sllw a5, a4, a0
; RV64IA-TSO-NEXT: and a1, a1, a4
; RV64IA-TSO-NEXT: sllw a1, a1, a0
@@ -1458,7 +1458,7 @@ define void @cmpxchg_i16_release_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
; RV64IA-WMO-NEXT: andi a3, a0, -4
; RV64IA-WMO-NEXT: slli a0, a0, 3
; RV64IA-WMO-NEXT: lui a4, 16
-; RV64IA-WMO-NEXT: addiw a4, a4, -1
+; RV64IA-WMO-NEXT: addi a4, a4, -1
; RV64IA-WMO-NEXT: sllw a5, a4, a0
; RV64IA-WMO-NEXT: and a1, a1, a4
; RV64IA-WMO-NEXT: sllw a1, a1, a0
@@ -1482,7 +1482,7 @@ define void @cmpxchg_i16_release_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
; RV64IA-TSO-NEXT: andi a3, a0, -4
; RV64IA-TSO-NEXT: slli a0, a0, 3
; RV64IA-TSO-NEXT: lui a4, 16
-; RV64IA-TSO-NEXT: addiw a4, a4, -1
+; RV64IA-TSO-NEXT: addi a4, a4, -1
; RV64IA-TSO-NEXT: sllw a5, a4, a0
; RV64IA-TSO-NEXT: and a1, a1, a4
; RV64IA-TSO-NEXT: sllw a1, a1, a0
@@ -1584,7 +1584,7 @@ define void @cmpxchg_i16_release_acquire(ptr %ptr, i16 %cmp, i16 %val) nounwind
; RV64IA-WMO-NEXT: andi a3, a0, -4
; RV64IA-WMO-NEXT: slli a0, a0, 3
; RV64IA-WMO-NEXT: lui a4, 16
-; RV64IA-WMO-NEXT: addiw a4, a4, -1
+; RV64IA-WMO-NEXT: addi a4, a4, -1
; RV64IA-WMO-NEXT: sllw a5, a4, a0
; RV64IA-WMO-NEXT: and a1, a1, a4
; RV64IA-WMO-NEXT: sllw a1, a1, a0
@@ -1608,7 +1608,7 @@ define void @cmpxchg_i16_release_acquire(ptr %ptr, i16 %cmp, i16 %val) nounwind
; RV64IA-TSO-NEXT: andi a3, a0, -4
; RV64IA-TSO-NEXT: slli a0, a0, 3
; RV64IA-TSO-NEXT: lui a4, 16
-; RV64IA-TSO-NEXT: addiw a4, a4, -1
+; RV64IA-TSO-NEXT: addi a4, a4, -1
; RV64IA-TSO-NEXT: sllw a5, a4, a0
; RV64IA-TSO-NEXT: and a1, a1, a4
; RV64IA-TSO-NEXT: sllw a1, a1, a0
@@ -1710,7 +1710,7 @@ define void @cmpxchg_i16_acq_rel_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
; RV64IA-WMO-NEXT: andi a3, a0, -4
; RV64IA-WMO-NEXT: slli a0, a0, 3
; RV64IA-WMO-NEXT: lui a4, 16
-; RV64IA-WMO-NEXT: addiw a4, a4, -1
+; RV64IA-WMO-NEXT: addi a4, a4, -1
; RV64IA-WMO-NEXT: sllw a5, a4, a0
; RV64IA-WMO-NEXT: and a1, a1, a4
; RV64IA-WMO-NEXT: sllw a1, a1, a0
@@ -1734,7 +1734,7 @@ define void @cmpxchg_i16_acq_rel_monotonic(ptr %ptr, i16 %cmp, ...
[truncated]
You can test this locally with the following command:
git-clang-format --diff ffdae1a11786b482104371defffe2e5772fbf0b1 fd4f5cdce9683a10508159afc234e85f8b459baf -- llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index a33ad8a194af..85b899503b81 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -664,7 +664,9 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
default:
continue;
case RISCV::ADDW: Opc = RISCV::ADD; break;
- case RISCV::ADDIW: Opc = RISCV::ADDI; break;
+ case RISCV::ADDIW:
+ Opc = RISCV::ADDI;
+ break;
case RISCV::MULW: Opc = RISCV::MUL; break;
case RISCV::SLLIW: Opc = RISCV::SLLI; break;
}
I don't think it's safe to do this for SRAIW and SRLIW. SRAIW copies bit 31 into all the vacated upper bits, while SRAI copies bit 63. SRLIW inserts zeros starting at bit 31, while SRLI starts at bit 63.
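To make the objection concrete, here is a minimal sketch that models the instructions on 64-bit values. The helper names (`sext32`, `low32`, `checkExamples`, and the per-instruction functions) are hypothetical and written only for illustration; they are not LLVM APIs. Shift amounts are assumed to be in the range 0..31. It suggests why ADDIW can be replaced by ADDI when only the low word is consumed, while SRLIW and SRAIW cannot be replaced by SRLI and SRAI if the upper 32 input bits are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative reference models of a few RV64I instructions (not LLVM code).
// Registers are modelled as uint64_t; shift amounts are assumed to be 0..31.

// Sign-extend the low 32 bits of x to 64 bits.
constexpr uint64_t sext32(uint64_t x) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(x))));
}

// Low word of a register: the only part a "W user" depends on.
constexpr uint32_t low32(uint64_t x) { return static_cast<uint32_t>(x); }

constexpr uint64_t addi(uint64_t rs1, int64_t imm) {
  return rs1 + static_cast<uint64_t>(imm);
}
constexpr uint64_t addiw(uint64_t rs1, int64_t imm) {
  return sext32(rs1 + static_cast<uint64_t>(imm));
}
constexpr uint64_t srli(uint64_t rs1, unsigned sh) { return rs1 >> sh; }
constexpr uint64_t srliw(uint64_t rs1, unsigned sh) {
  return sext32(low32(rs1) >> sh);
}
// Right shift of a negative signed value is arithmetic (guaranteed since
// C++20, and the behaviour of mainstream compilers before that).
constexpr uint64_t srai(uint64_t rs1, unsigned sh) {
  return static_cast<uint64_t>(static_cast<int64_t>(rs1) >> sh);
}
constexpr uint64_t sraiw(uint64_t rs1, unsigned sh) {
  return sext32(static_cast<uint32_t>(static_cast<int32_t>(low32(rs1)) >> sh));
}

inline void checkExamples() {
  // ADDI and ADDIW agree on the low word even when the upper bits are junk.
  assert(low32(addi(0x17FFFFFFFULL, 1)) == low32(addiw(0x17FFFFFFFULL, 1)));
  // SRLI shifts bit 32 down into the low word; SRLIW never sees it.
  assert(low32(srli(0x100000010ULL, 4)) != low32(srliw(0x100000010ULL, 4)));
  // SRAI replicates bit 63 (0 here); SRAIW replicates bit 31 (1 here).
  assert(low32(srai(0x80000000ULL, 4)) != low32(sraiw(0x80000000ULL, 4)));
}
```

Calling `checkExamples()` exercises one input per case. The shift counterexamples both rely on an input whose upper 32 bits are not the sign extension of bit 31.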
// Then it removes the -w suffix from opw instructions whenever all users are
// dependent only on the lower word of the result of the instruction. This is
// profitable for addw because c.add has a larger register encoding than c.addw.
// For the remaining opw instructions, there is no compressed w variant. This
There is a c.addiw.
C.ADDIW is an RV64C/RV128C-only instruction that performs the same computation but produces a 32-bit result, then sign-extends result to 64 bits. C.ADDIW expands into addiw rd, rd, imm. The immediate can be zero for C.ADDIW, where this corresponds to sext.w rd. C.ADDIW is only valid when rd ≠ x0; the code points with rd = x0 are reserved.
Agreed, need to fix comment.
// whenever all users are dependent only on the lower word of the result of the
// instruction. We do this only for addw, slliw, and mulw because the -w forms
// are less compressible: c.add and c.slli have a larger register encoding than
// their w counterparts, and there's no compressible version of mulw.
This comment was incorrect. c.mul was added with Zcb.
Are you sure about this semantic? The wording from the specification simply says "SLLIW, SRLIW, and SRAIW are RV64I-only instructions that are analogously defined but operate on 32-bit values and sign-extend their 32-bit results to 64 bits. SLLIW, SRLIW, and SRAIW encodings with imm[5] ≠ 0 are reserved." I wouldn't get your stated semantic from this without some very tortured reading of "operate on 32-bit values". Assuming you're correct about the semantic, I definitely agree these are unsound and need to be removed.
I'm pretty sure. These are our isel patterns for them, which we get from type legalization of i32 lshr/ashr.
def : Pat<(i64 (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt)),
(Changes planned)
LGTM
The motivation of this change is simply to reduce test duplication. As can be seen in the (massive) test delta, we have many tests whose output differs only due to the use of addi on rv32 vs addiw on rv64 when the high bits are don't care.
As an aside, we don't need to worry about the non-zero immediate restriction on the compressed variants because we're not directly forming the compressed variants. If we happen to get a zero immediate for the ADDI, then either a later optimization will strip the useless instruction or the encoder is responsible for not compressing the instruction.
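The zero-immediate aside can be illustrated with a simplified predicate for when `addi rd, rs1, imm` fits the C.ADDI encoding. This is a hedged sketch, not LLVM's compression logic: the function name `fitsCAddi` is hypothetical, registers are passed as plain numbers (a0 is x10, a1 is x11), and other compressed forms an ADDI may map to (for example C.LI, C.MV, or C.ADDI16SP) are deliberately ignored.

```cpp
#include <cstdint>

// Simplified, illustrative model of the C.ADDI encoding constraints.
// Not LLVM code; other compressed forms of ADDI are ignored here.
constexpr bool fitsCAddi(unsigned rd, unsigned rs1, int64_t imm) {
  return rd == rs1                    // two-operand form: rd is also the source
         && rd != 0                   // x0 is not a valid destination
         && imm != 0                  // the immediate must be non-zero
         && imm >= -32 && imm <= 31;  // 6-bit signed immediate
}

// Cases mirroring the RV64C test output in the diff.
static_assert(fitsCAddi(10, 10, 1), "addi a0, a0, 1 can become c.addi a0, 1");
static_assert(fitsCAddi(11, 11, -1), "addi a1, a1, -1 can become c.addi a1, -1");
static_assert(!fitsCAddi(10, 10, 256), "immediate 256 is out of range");
static_assert(!fitsCAddi(10, 10, 0), "a zero immediate is not encodable");
```

Under these assumptions the pass only needs to pick ADDI; whether a particular ADDI ends up compressed is decided later by checks of this kind, which is why a zero immediate is harmless at this stage.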