[AArch64] Define high bits of FPR and GPR registers. #114263

sdesmalen-arm · 2024-10-30T16:53:31Z

This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as discussed in #109797.
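To make the coverage requirement concrete, here is a minimal sketch in plain C++. It is a toy model, not LLVM's data structures: `SubRegRange` and `isFullyCovered` are hypothetical names, and each range only mirrors the size and offset arguments of a `SubRegIndex<size, offset>` definition. Under that assumption, a 64-bit X register with only `sub_32` leaves bits 32..63 undescribed, while adding `sub_32_hi` covers the whole register.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (not LLVM's API): a subregister index described by its size and
// offset in bits, in the spirit of SubRegIndex<size, offset>.
struct SubRegRange {
  unsigned SizeInBits;
  unsigned OffsetInBits;
};

// Returns true if the ranges together cover every bit of a register that is
// RegSizeInBits wide. This sketch only handles registers of up to 64 bits.
bool isFullyCovered(unsigned RegSizeInBits,
                    const std::vector<SubRegRange> &SubRegs) {
  assert(RegSizeInBits >= 1 && RegSizeInBits <= 64);
  uint64_t Covered = 0;
  for (const SubRegRange &R : SubRegs) {
    // Mark each bit of the range, ignoring bits beyond the register width.
    for (unsigned Bit = R.OffsetInBits;
         Bit < R.OffsetInBits + R.SizeInBits && Bit < RegSizeInBits; ++Bit)
      Covered |= uint64_t(1) << Bit;
  }
  const uint64_t Full = RegSizeInBits == 64
                            ? ~uint64_t(0)
                            : (uint64_t(1) << RegSizeInBits) - 1;
  return Covered == Full;
}
```

With this model, `isFullyCovered(64, {{32, 0}})` describes X0 before the patch (only `sub_32`), and `isFullyCovered(64, {{32, 0}, {32, 32}})` describes it with `sub_32` and `sub_32_hi`.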

There are several changes in this patch:

AArch64RegisterInfo.td and tests: Define the high bits as artificial registers such as B0_HI, H0_HI, S0_HI, D0_HI and Q0_HI (and W0_HI for the GPRs). Because these registers must belong to some register class, the patch adds a register class, which in turn meant updating the 'magic numbers' in several tests.

The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is checked by an explicit unit test.

LoadStoreOptimizer: Previously 'HasDisjunctSubRegs' was only true for register tuples, but now that the high bits are described, a register like 'D0' also has 'HasDisjunctSubRegs' set to true (because it is fully covered by S0 and S0_HI). The fix is to explicitly test whether the register class is one of the known D/Q/Z tuples.

TableGen: The handling of the isArtificial flag was entirely broken. Bailing out of some of the loops too early led to an incorrect internal representation of the (sub)register(index) hierarchy, and thus to incorrect TableGen info.
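The LoadStoreOptimizer change can be sketched in isolation. The following is a simplified stand-in, not the code in the patch: `ToyRegClass` and its boolean fields are hypothetical, and the three `Has*Sub0` flags stand in for the `getSubRegisterClass(RegClass, dsub0/qsub0/zsub0)` queries in the diff. The example class descriptions are assumptions made for illustration.

```cpp
#include <cassert>

// Toy stand-in for the properties the patch inspects on a register class.
struct ToyRegClass {
  bool HasDisjunctSubRegs;
  bool CoveredBySubRegs;
  bool HasDSub0; // stands in for getSubRegisterClass(RC, dsub0) != nullptr
  bool HasQSub0; // stands in for getSubRegisterClass(RC, qsub0) != nullptr
  bool HasZSub0; // stands in for getSubRegisterClass(RC, zsub0) != nullptr
};

// Old condition: any class with disjunct subregisters blocks renaming.
bool oldBlocksRenaming(const ToyRegClass &RC) {
  return RC.HasDisjunctSubRegs;
}

// New condition: only classes that look like D/Q/Z tuples block renaming.
bool newBlocksRenaming(const ToyRegClass &RC) {
  return RC.HasDisjunctSubRegs && RC.CoveredBySubRegs &&
         (RC.HasDSub0 || RC.HasQSub0 || RC.HasZSub0);
}

// Example queries; both class descriptions are illustrative assumptions.
void exampleQueries() {
  // A scalar D register class once S0_HI exists: disjunct subregisters,
  // fully covered, but no tuple subregister indices.
  ToyRegClass ScalarD{true, true, false, false, false};
  // A tuple of D registers, which has a dsub0 subregister index.
  ToyRegClass TupleDD{true, true, true, false, false};

  assert(oldBlocksRenaming(ScalarD));  // the old check would reject D0
  assert(!newBlocksRenaming(ScalarD)); // the new check still allows it
  assert(newBlocksRenaming(TupleDD));  // tuples remain blocked
}
```

The point of the sketch is that the extra tuple test keeps the renaming behaviour for scalar registers unchanged even though their `HasDisjunctSubRegs` flag flips.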

llvmbot · 2024-10-30T16:54:06Z

@llvm/pr-subscribers-tablegen

@llvm/pr-subscribers-llvm-globalisel

Author: Sander de Smalen (sdesmalen-arm)

Patch is 137.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114263.diff

23 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp (+4-1)
(modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp (+57-2)
(modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.td (+263-234)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir (+4-4)
(modified) llvm/test/CodeGen/AArch64/aarch64-sve-asm.ll (+11-11)
(modified) llvm/test/CodeGen/AArch64/blr-bti-preserves-operands.mir (+1-1)
(modified) llvm/test/CodeGen/AArch64/emit_fneg_with_non_register_operand.mir (+4-4)
(modified) llvm/test/CodeGen/AArch64/expand-blr-rvmarker-pseudo.mir (+6-6)
(modified) llvm/test/CodeGen/AArch64/ldrpre-ldr-merge.mir (+30-30)
(modified) llvm/test/CodeGen/AArch64/machine-outliner-calls.mir (+1-1)
(modified) llvm/test/CodeGen/AArch64/misched-bundle.mir (+23-3)
(modified) llvm/test/CodeGen/AArch64/misched-detail-resource-booking-01.mir (+12)
(modified) llvm/test/CodeGen/AArch64/misched-detail-resource-booking-02.mir (+4-1)
(modified) llvm/test/CodeGen/AArch64/peephole-insvigpr.mir (+2-2)
(modified) llvm/test/CodeGen/AArch64/preserve.ll (+3-4)
(modified) llvm/test/CodeGen/AArch64/strpre-str-merge.mir (+14-14)
(modified) llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith-merging.mir (+1-1)
(modified) llvm/test/CodeGen/AArch64/sve-intrinsics-int-binaryComm-merging.mir (+1-1)
(modified) llvm/test/CodeGen/AArch64/sve-intrinsics-int-binaryCommWithRev-merging.mir (+1-1)
(added) llvm/unittests/Target/AArch64/AArch64RegisterInfoTest.cpp (+148)
(modified) llvm/unittests/Target/AArch64/CMakeLists.txt (+1)
(modified) llvm/utils/TableGen/Common/CodeGenRegisters.cpp (+8-8)
(modified) llvm/utils/TableGen/RegisterInfoEmitter.cpp (+1)

diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index 1a9e5899892a1b..b051122609883f 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -1541,7 +1541,10 @@ static bool canRenameMOP(const MachineOperand &MOP,
     // Note that this relies on the structure of the AArch64 register file. In
     // particular, a subregister cannot be written without overwriting the
     // whole register.
-    if (RegClass->HasDisjunctSubRegs) {
+    if (RegClass->HasDisjunctSubRegs && RegClass->CoveredBySubRegs &&
+        (TRI->getSubRegisterClass(RegClass, AArch64::dsub0) ||
+         TRI->getSubRegisterClass(RegClass, AArch64::qsub0) ||
+         TRI->getSubRegisterClass(RegClass, AArch64::zsub0))) {
       LLVM_DEBUG(
           dbgs()
           << "  Cannot rename operands with multiple disjunct subregisters ("
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index 18290dd5f32df9..1ef75c6e02e8c7 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -424,6 +424,58 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
   return {};
 }
 
+static SmallVector<MCPhysReg> ReservedHi = {
+    AArch64::B0_HI,  AArch64::B1_HI,  AArch64::B2_HI,  AArch64::B3_HI,
+    AArch64::B4_HI,  AArch64::B5_HI,  AArch64::B6_HI,  AArch64::B7_HI,
+    AArch64::B8_HI,  AArch64::B9_HI,  AArch64::B10_HI, AArch64::B11_HI,
+    AArch64::B12_HI, AArch64::B13_HI, AArch64::B14_HI, AArch64::B15_HI,
+    AArch64::B16_HI, AArch64::B17_HI, AArch64::B18_HI, AArch64::B19_HI,
+    AArch64::B20_HI, AArch64::B21_HI, AArch64::B22_HI, AArch64::B23_HI,
+    AArch64::B24_HI, AArch64::B25_HI, AArch64::B26_HI, AArch64::B27_HI,
+    AArch64::B28_HI, AArch64::B29_HI, AArch64::B30_HI, AArch64::B31_HI,
+    AArch64::H0_HI,  AArch64::H1_HI,  AArch64::H2_HI,  AArch64::H3_HI,
+    AArch64::H4_HI,  AArch64::H5_HI,  AArch64::H6_HI,  AArch64::H7_HI,
+    AArch64::H8_HI,  AArch64::H9_HI,  AArch64::H10_HI, AArch64::H11_HI,
+    AArch64::H12_HI, AArch64::H13_HI, AArch64::H14_HI, AArch64::H15_HI,
+    AArch64::H16_HI, AArch64::H17_HI, AArch64::H18_HI, AArch64::H19_HI,
+    AArch64::H20_HI, AArch64::H21_HI, AArch64::H22_HI, AArch64::H23_HI,
+    AArch64::H24_HI, AArch64::H25_HI, AArch64::H26_HI, AArch64::H27_HI,
+    AArch64::H28_HI, AArch64::H29_HI, AArch64::H30_HI, AArch64::H31_HI,
+    AArch64::S0_HI,  AArch64::S1_HI,  AArch64::S2_HI,  AArch64::S3_HI,
+    AArch64::S4_HI,  AArch64::S5_HI,  AArch64::S6_HI,  AArch64::S7_HI,
+    AArch64::S8_HI,  AArch64::S9_HI,  AArch64::S10_HI, AArch64::S11_HI,
+    AArch64::S12_HI, AArch64::S13_HI, AArch64::S14_HI, AArch64::S15_HI,
+    AArch64::S16_HI, AArch64::S17_HI, AArch64::S18_HI, AArch64::S19_HI,
+    AArch64::S20_HI, AArch64::S21_HI, AArch64::S22_HI, AArch64::S23_HI,
+    AArch64::S24_HI, AArch64::S25_HI, AArch64::S26_HI, AArch64::S27_HI,
+    AArch64::S28_HI, AArch64::S29_HI, AArch64::S30_HI, AArch64::S31_HI,
+    AArch64::D0_HI,  AArch64::D1_HI,  AArch64::D2_HI,  AArch64::D3_HI,
+    AArch64::D4_HI,  AArch64::D5_HI,  AArch64::D6_HI,  AArch64::D7_HI,
+    AArch64::D8_HI,  AArch64::D9_HI,  AArch64::D10_HI, AArch64::D11_HI,
+    AArch64::D12_HI, AArch64::D13_HI, AArch64::D14_HI, AArch64::D15_HI,
+    AArch64::D16_HI, AArch64::D17_HI, AArch64::D18_HI, AArch64::D19_HI,
+    AArch64::D20_HI, AArch64::D21_HI, AArch64::D22_HI, AArch64::D23_HI,
+    AArch64::D24_HI, AArch64::D25_HI, AArch64::D26_HI, AArch64::D27_HI,
+    AArch64::D28_HI, AArch64::D29_HI, AArch64::D30_HI, AArch64::D31_HI,
+    AArch64::Q0_HI,  AArch64::Q1_HI,  AArch64::Q2_HI,  AArch64::Q3_HI,
+    AArch64::Q4_HI,  AArch64::Q5_HI,  AArch64::Q6_HI,  AArch64::Q7_HI,
+    AArch64::Q8_HI,  AArch64::Q9_HI,  AArch64::Q10_HI, AArch64::Q11_HI,
+    AArch64::Q12_HI, AArch64::Q13_HI, AArch64::Q14_HI, AArch64::Q15_HI,
+    AArch64::Q16_HI, AArch64::Q17_HI, AArch64::Q18_HI, AArch64::Q19_HI,
+    AArch64::Q20_HI, AArch64::Q21_HI, AArch64::Q22_HI, AArch64::Q23_HI,
+    AArch64::Q24_HI, AArch64::Q25_HI, AArch64::Q26_HI, AArch64::Q27_HI,
+    AArch64::Q28_HI, AArch64::Q29_HI, AArch64::Q30_HI, AArch64::Q31_HI,
+    AArch64::W0_HI,  AArch64::W1_HI,  AArch64::W2_HI,  AArch64::W3_HI,
+    AArch64::W4_HI,  AArch64::W5_HI,  AArch64::W6_HI,  AArch64::W7_HI,
+    AArch64::W8_HI,  AArch64::W9_HI,  AArch64::W10_HI, AArch64::W11_HI,
+    AArch64::W12_HI, AArch64::W13_HI, AArch64::W14_HI, AArch64::W15_HI,
+    AArch64::W16_HI, AArch64::W17_HI, AArch64::W18_HI, AArch64::W19_HI,
+    AArch64::W20_HI, AArch64::W21_HI, AArch64::W22_HI, AArch64::W23_HI,
+    AArch64::W24_HI, AArch64::W25_HI, AArch64::W26_HI, AArch64::W27_HI,
+    AArch64::W28_HI, AArch64::W29_HI, AArch64::W30_HI, AArch64::WSP_HI,
+    AArch64::WZR_HI
+    };
+
 BitVector
 AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
   const AArch64FrameLowering *TFI = getFrameLowering(MF);
@@ -490,7 +542,10 @@ AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
     markSuperRegs(Reserved, AArch64::W28);
   }
 
-  assert(checkAllSuperRegsMarked(Reserved));
+  for (Register R : ReservedHi)
+    Reserved.set(R);
+
+  assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
   return Reserved;
 }
 
@@ -514,7 +569,7 @@ AArch64RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
       markSuperRegs(Reserved, AArch64::LR);
   }
 
-  assert(checkAllSuperRegsMarked(Reserved));
+  assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
   return Reserved;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index 4117d74d10c1e7..5f7f6aa7a7bf99 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -20,33 +20,39 @@ class AArch64Reg<bits<16> enc, string n, list<Register> subregs = [],
 
 let Namespace = "AArch64" in {
   // SubRegIndexes for GPR registers
-  def sub_32 : SubRegIndex<32>;
-  def sube64 : SubRegIndex<64>;
-  def subo64 : SubRegIndex<64>;
-  def sube32 : SubRegIndex<32>;
-  def subo32 : SubRegIndex<32>;
+  def sub_32   : SubRegIndex<32,  0>;
+  def sub_32_hi: SubRegIndex<32, 32>;
+  def sube64   : SubRegIndex<64>;
+  def subo64   : SubRegIndex<64>;
+  def sube32   : SubRegIndex<32>;
+  def subo32   : SubRegIndex<32>;
 
   // SubRegIndexes for FPR/Vector registers
-  def bsub : SubRegIndex<8>;
-  def hsub : SubRegIndex<16>;
-  def ssub : SubRegIndex<32>;
-  def dsub : SubRegIndex<64>;
-  def zsub : SubRegIndex<128>;
+  def bsub    : SubRegIndex<8, 0>;
+  def bsub_hi : SubRegIndex<8, 8>;
+  def hsub    : SubRegIndex<16, 0>;
+  def hsub_hi : SubRegIndex<16, 16>;
+  def ssub    : SubRegIndex<32, 0>;
+  def ssub_hi : SubRegIndex<32, 32>;
+  def dsub    : SubRegIndex<64, 0>;
+  def dsub_hi : SubRegIndex<64, 64>;
+  def zsub    : SubRegIndex<128, 0>;
+  def zsub_hi : SubRegIndex<-1, 128>;
   // Note: Code depends on these having consecutive numbers
-  def zsub0 : SubRegIndex<128, -1>;
-  def zsub1 : SubRegIndex<128, -1>;
-  def zsub2 : SubRegIndex<128, -1>;
-  def zsub3 : SubRegIndex<128, -1>;
-  // Note: Code depends on these having consecutive numbers
-  def dsub0 : SubRegIndex<64>;
-  def dsub1 : SubRegIndex<64>;
-  def dsub2 : SubRegIndex<64>;
-  def dsub3 : SubRegIndex<64>;
+  def zsub0 : SubRegIndex<-1>;
+  def zsub1 : SubRegIndex<-1>;
+  def zsub2 : SubRegIndex<-1>;
+  def zsub3 : SubRegIndex<-1>;
   // Note: Code depends on these having consecutive numbers
   def qsub0 : SubRegIndex<128>;
-  def qsub1 : SubRegIndex<128>;
-  def qsub2 : SubRegIndex<128>;
-  def qsub3 : SubRegIndex<128>;
+  def qsub1 : ComposedSubRegIndex<zsub1, zsub>;
+  def qsub2 : ComposedSubRegIndex<zsub2, zsub>;
+  def qsub3 : ComposedSubRegIndex<zsub3, zsub>;
+  // Note: Code depends on these having consecutive numbers
+  def dsub0 : SubRegIndex<64>;
+  def dsub1 : ComposedSubRegIndex<qsub1, dsub>;
+  def dsub2 : ComposedSubRegIndex<qsub2, dsub>;
+  def dsub3 : ComposedSubRegIndex<qsub3, dsub>;
 
   // SubRegIndexes for SME Matrix tiles
   def zasubb  : SubRegIndex<2048>; // (16 x 16)/1 bytes  = 2048 bits
@@ -60,10 +66,10 @@ let Namespace = "AArch64" in {
   def zasubq1 : SubRegIndex<128>;  // (16 x 16)/16 bytes = 128 bits
 
   // SubRegIndexes for SVE Predicates
-  def psub  : SubRegIndex<16>;
+  def psub  : SubRegIndex<-1>;
   // Note: Code depends on these having consecutive numbers
-  def psub0 : SubRegIndex<16, -1>;
-  def psub1 : SubRegIndex<16, -1>;
+  def psub0 : SubRegIndex<-1>;
+  def psub1 : SubRegIndex<-1>;
 }
 
 let Namespace = "AArch64" in {
@@ -74,6 +80,14 @@ let Namespace = "AArch64" in {
 //===----------------------------------------------------------------------===//
 // Registers
 //===----------------------------------------------------------------------===//
+
+foreach i = 0-30 in {
+  // Define W0_HI, W1_HI, .. W30_HI
+  def W#i#_HI : AArch64Reg<-1,  "w"#i#"_hi"> { let isArtificial = 1; }
+}
+def WSP_HI : AArch64Reg<-1,  "wsp_hi"> { let isArtificial = 1; }
+def WZR_HI : AArch64Reg<-1,  "wzr_hi"> { let isArtificial = 1; }
+
 def W0    : AArch64Reg<0,   "w0" >, DwarfRegNum<[0]>;
 def W1    : AArch64Reg<1,   "w1" >, DwarfRegNum<[1]>;
 def W2    : AArch64Reg<2,   "w2" >, DwarfRegNum<[2]>;
@@ -106,44 +120,42 @@ def W28   : AArch64Reg<28, "w28">, DwarfRegNum<[28]>;
 def W29   : AArch64Reg<29, "w29">, DwarfRegNum<[29]>;
 def W30   : AArch64Reg<30, "w30">, DwarfRegNum<[30]>;
 def WSP   : AArch64Reg<31, "wsp">, DwarfRegNum<[31]>;
-let isConstant = true in
-def WZR   : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP>;
-
-let SubRegIndices = [sub_32] in {
-def X0    : AArch64Reg<0,   "x0",  [W0]>, DwarfRegAlias<W0>;
-def X1    : AArch64Reg<1,   "x1",  [W1]>, DwarfRegAlias<W1>;
-def X2    : AArch64Reg<2,   "x2",  [W2]>, DwarfRegAlias<W2>;
-def X3    : AArch64Reg<3,   "x3",  [W3]>, DwarfRegAlias<W3>;
-def X4    : AArch64Reg<4,   "x4",  [W4]>, DwarfRegAlias<W4>;
-def X5    : AArch64Reg<5,   "x5",  [W5]>, DwarfRegAlias<W5>;
-def X6    : AArch64Reg<6,   "x6",  [W6]>, DwarfRegAlias<W6>;
-def X7    : AArch64Reg<7,   "x7",  [W7]>, DwarfRegAlias<W7>;
-def X8    : AArch64Reg<8,   "x8",  [W8]>, DwarfRegAlias<W8>;
-def X9    : AArch64Reg<9,   "x9",  [W9]>, DwarfRegAlias<W9>;
-def X10   : AArch64Reg<10, "x10", [W10]>, DwarfRegAlias<W10>;
-def X11   : AArch64Reg<11, "x11", [W11]>, DwarfRegAlias<W11>;
-def X12   : AArch64Reg<12, "x12", [W12]>, DwarfRegAlias<W12>;
-def X13   : AArch64Reg<13, "x13", [W13]>, DwarfRegAlias<W13>;
-def X14   : AArch64Reg<14, "x14", [W14]>, DwarfRegAlias<W14>;
-def X15   : AArch64Reg<15, "x15", [W15]>, DwarfRegAlias<W15>;
-def X16   : AArch64Reg<16, "x16", [W16]>, DwarfRegAlias<W16>;
-def X17   : AArch64Reg<17, "x17", [W17]>, DwarfRegAlias<W17>;
-def X18   : AArch64Reg<18, "x18", [W18]>, DwarfRegAlias<W18>;
-def X19   : AArch64Reg<19, "x19", [W19]>, DwarfRegAlias<W19>;
-def X20   : AArch64Reg<20, "x20", [W20]>, DwarfRegAlias<W20>;
-def X21   : AArch64Reg<21, "x21", [W21]>, DwarfRegAlias<W21>;
-def X22   : AArch64Reg<22, "x22", [W22]>, DwarfRegAlias<W22>;
-def X23   : AArch64Reg<23, "x23", [W23]>, DwarfRegAlias<W23>;
-def X24   : AArch64Reg<24, "x24", [W24]>, DwarfRegAlias<W24>;
-def X25   : AArch64Reg<25, "x25", [W25]>, DwarfRegAlias<W25>;
-def X26   : AArch64Reg<26, "x26", [W26]>, DwarfRegAlias<W26>;
-def X27   : AArch64Reg<27, "x27", [W27]>, DwarfRegAlias<W27>;
-def X28   : AArch64Reg<28, "x28", [W28]>, DwarfRegAlias<W28>;
-def FP    : AArch64Reg<29, "x29", [W29]>, DwarfRegAlias<W29>;
-def LR    : AArch64Reg<30, "x30", [W30]>, DwarfRegAlias<W30>;
-def SP    : AArch64Reg<31, "sp",  [WSP]>, DwarfRegAlias<WSP>;
-let isConstant = true in
-def XZR   : AArch64Reg<31, "xzr", [WZR]>, DwarfRegAlias<WSP>;
+def WZR   : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP> { let isConstant = true; }
+
+let SubRegIndices = [sub_32, sub_32_hi], CoveredBySubRegs = 1 in {
+def X0    : AArch64Reg<0,   "x0",  [W0,  W0_HI]>, DwarfRegAlias<W0>;
+def X1    : AArch64Reg<1,   "x1",  [W1,  W1_HI]>, DwarfRegAlias<W1>;
+def X2    : AArch64Reg<2,   "x2",  [W2,  W2_HI]>, DwarfRegAlias<W2>;
+def X3    : AArch64Reg<3,   "x3",  [W3,  W3_HI]>, DwarfRegAlias<W3>;
+def X4    : AArch64Reg<4,   "x4",  [W4,  W4_HI]>, DwarfRegAlias<W4>;
+def X5    : AArch64Reg<5,   "x5",  [W5,  W5_HI]>, DwarfRegAlias<W5>;
+def X6    : AArch64Reg<6,   "x6",  [W6,  W6_HI]>, DwarfRegAlias<W6>;
+def X7    : AArch64Reg<7,   "x7",  [W7,  W7_HI]>, DwarfRegAlias<W7>;
+def X8    : AArch64Reg<8,   "x8",  [W8,  W8_HI]>, DwarfRegAlias<W8>;
+def X9    : AArch64Reg<9,   "x9",  [W9,  W9_HI]>, DwarfRegAlias<W9>;
+def X10   : AArch64Reg<10, "x10", [W10, W10_HI]>, DwarfRegAlias<W10>;
+def X11   : AArch64Reg<11, "x11", [W11, W11_HI]>, DwarfRegAlias<W11>;
+def X12   : AArch64Reg<12, "x12", [W12, W12_HI]>, DwarfRegAlias<W12>;
+def X13   : AArch64Reg<13, "x13", [W13, W13_HI]>, DwarfRegAlias<W13>;
+def X14   : AArch64Reg<14, "x14", [W14, W14_HI]>, DwarfRegAlias<W14>;
+def X15   : AArch64Reg<15, "x15", [W15, W15_HI]>, DwarfRegAlias<W15>;
+def X16   : AArch64Reg<16, "x16", [W16, W16_HI]>, DwarfRegAlias<W16>;
+def X17   : AArch64Reg<17, "x17", [W17, W17_HI]>, DwarfRegAlias<W17>;
+def X18   : AArch64Reg<18, "x18", [W18, W18_HI]>, DwarfRegAlias<W18>;
+def X19   : AArch64Reg<19, "x19", [W19, W19_HI]>, DwarfRegAlias<W19>;
+def X20   : AArch64Reg<20, "x20", [W20, W20_HI]>, DwarfRegAlias<W20>;
+def X21   : AArch64Reg<21, "x21", [W21, W21_HI]>, DwarfRegAlias<W21>;
+def X22   : AArch64Reg<22, "x22", [W22, W22_HI]>, DwarfRegAlias<W22>;
+def X23   : AArch64Reg<23, "x23", [W23, W23_HI]>, DwarfRegAlias<W23>;
+def X24   : AArch64Reg<24, "x24", [W24, W24_HI]>, DwarfRegAlias<W24>;
+def X25   : AArch64Reg<25, "x25", [W25, W25_HI]>, DwarfRegAlias<W25>;
+def X26   : AArch64Reg<26, "x26", [W26, W26_HI]>, DwarfRegAlias<W26>;
+def X27   : AArch64Reg<27, "x27", [W27, W27_HI]>, DwarfRegAlias<W27>;
+def X28   : AArch64Reg<28, "x28", [W28, W28_HI]>, DwarfRegAlias<W28>;
+def FP    : AArch64Reg<29, "x29", [W29, W29_HI]>, DwarfRegAlias<W29>;
+def LR    : AArch64Reg<30, "x30", [W30, W30_HI]>, DwarfRegAlias<W30>;
+def SP    : AArch64Reg<31, "sp",  [WSP, WSP_HI]>, DwarfRegAlias<WSP>;
+def XZR   : AArch64Reg<31, "xzr", [WZR, WZR_HI]>, DwarfRegAlias<WSP> { let isConstant = true; }
 }
 
 // Condition code register.
@@ -290,6 +302,14 @@ def CCR : RegisterClass<"AArch64", [i32], 32, (add NZCV)> {
 // Floating Point Scalar Registers
 //===----------------------------------------------------------------------===//
 
+foreach i = 0-31 in {
+  def B#i#_HI : AArch64Reg<-1,  "b"#i#"_hi"> { let isArtificial = 1; }
+  def H#i#_HI : AArch64Reg<-1,  "h"#i#"_hi"> { let isArtificial = 1; }
+  def S#i#_HI : AArch64Reg<-1,  "s"#i#"_hi"> { let isArtificial = 1; }
+  def D#i#_HI : AArch64Reg<-1,  "d"#i#"_hi"> { let isArtificial = 1; }
+  def Q#i#_HI : AArch64Reg<-1,  "q"#i#"_hi"> { let isArtificial = 1; }
+}
+
 def B0    : AArch64Reg<0,   "b0">, DwarfRegNum<[64]>;
 def B1    : AArch64Reg<1,   "b1">, DwarfRegNum<[65]>;
 def B2    : AArch64Reg<2,   "b2">, DwarfRegNum<[66]>;
@@ -323,144 +343,144 @@ def B29   : AArch64Reg<29, "b29">, DwarfRegNum<[93]>;
 def B30   : AArch64Reg<30, "b30">, DwarfRegNum<[94]>;
 def B31   : AArch64Reg<31, "b31">, DwarfRegNum<[95]>;
 
-let SubRegIndices = [bsub] in {
-def H0    : AArch64Reg<0,   "h0", [B0]>, DwarfRegAlias<B0>;
-def H1    : AArch64Reg<1,   "h1", [B1]>, DwarfRegAlias<B1>;
-def H2    : AArch64Reg<2,   "h2", [B2]>, DwarfRegAlias<B2>;
-def H3    : AArch64Reg<3,   "h3", [B3]>, DwarfRegAlias<B3>;
-def H4    : AArch64Reg<4,   "h4", [B4]>, DwarfRegAlias<B4>;
-def H5    : AArch64Reg<5,   "h5", [B5]>, DwarfRegAlias<B5>;
-def H6    : AArch64Reg<6,   "h6", [B6]>, DwarfRegAlias<B6>;
-def H7    : AArch64Reg<7,   "h7", [B7]>, DwarfRegAlias<B7>;
-def H8    : AArch64Reg<8,   "h8", [B8]>, DwarfRegAlias<B8>;
-def H9    : AArch64Reg<9,   "h9", [B9]>, DwarfRegAlias<B9>;
-def H10   : AArch64Reg<10, "h10", [B10]>, DwarfRegAlias<B10>;
-def H11   : AArch64Reg<11, "h11", [B11]>, DwarfRegAlias<B11>;
-def H12   : AArch64Reg<12, "h12", [B12]>, DwarfRegAlias<B12>;
-def H13   : AArch64Reg<13, "h13", [B13]>, DwarfRegAlias<B13>;
-def H14   : AArch64Reg<14, "h14", [B14]>, DwarfRegAlias<B14>;
-def H15   : AArch64Reg<15, "h15", [B15]>, DwarfRegAlias<B15>;
-def H16   : AArch64Reg<16, "h16", [B16]>, DwarfRegAlias<B16>;
-def H17   : AArch64Reg<17, "h17", [B17]>, DwarfRegAlias<B17>;
-def H18   : AArch64Reg<18, "h18", [B18]>, DwarfRegAlias<B18>;
-def H19   : AArch64Reg<19, "h19", [B19]>, DwarfRegAlias<B19>;
-def H20   : AArch64Reg<20, "h20", [B20]>, DwarfRegAlias<B20>;
-def H21   : AArch64Reg<21, "h21", [B21]>, DwarfRegAlias<B21>;
-def H22   : AArch64Reg<22, "h22", [B22]>, DwarfRegAlias<B22>;
-def H23   : AArch64Reg<23, "h23", [B23]>, DwarfRegAlias<B23>;
-def H24   : AArch64Reg<24, "h24", [B24]>, DwarfRegAlias<B24>;
-def H25   : AArch64Reg<25, "h25", [B25]>, DwarfRegAlias<B25>;
-def H26   : AArch64Reg<26, "h26", [B26]>, DwarfRegAlias<B26>;
-def H27   : AArch64Reg<27, "h27", [B27]>, DwarfRegAlias<B27>;
-def H28   : AArch64Reg<28, "h28", [B28]>, DwarfRegAlias<B28>;
-def H29   : AArch64Reg<29, "h29", [B29]>, DwarfRegAlias<B29>;
-def H30   : AArch64Reg<30, "h30", [B30]>, DwarfRegAlias<B30>;
-def H31   : AArch64Reg<31, "h31", [B31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [hsub] in {
-def S0    : AArch64Reg<0,   "s0", [H0]>, DwarfRegAlias<B0>;
-def S1    : AArch64Reg<1,   "s1", [H1]>, DwarfRegAlias<B1>;
-def S2    : AArch64Reg<2,   "s2", [H2]>, DwarfRegAlias<B2>;
-def S3    : AArch64Reg<3,   "s3", [H3]>, DwarfRegAlias<B3>;
-def S4    : AArch64Reg<4,   "s4", [H4]>, DwarfRegAlias<B4>;
-def S5    : AArch64Reg<5,   "s5", [H5]>, DwarfRegAlias<B5>;
-def S6    : AArch64Reg<6,   "s6", [H6]>, DwarfRegAlias<B6>;
-def S7    : AArch64Reg<7,   "s7", [H7]>, DwarfRegAlias<B7>;
-def S8    : AArch64Reg<8,   "s8", [H8]>, DwarfRegAlias<B8>;
-def S9    : AArch64Reg<9,   "s9", [H9]>, DwarfRegAlias<B9>;
-def S10   : AArch64Reg<10, "s10", [H10]>, DwarfRegAlias<B10>;
-def S11   : AArch64Reg<11, "s11", [H11]>, DwarfRegAlias<B11>;
-def S12   : AArch64Reg<12, "s12", [H12]>, DwarfRegAlias<B12>;
-def S13   : AArch64Reg<13, "s13", [H13]>, DwarfRegAlias<B13>;
-def S14   : AArch64Reg<14, "s14", [H14]>, DwarfRegAlias<B14>;
-def S15   : AArch64Reg<15, "s15", [H15]>, DwarfRegAlias<B15>;
-def S16   : AArch64Reg<16, "s16", [H16]>, DwarfRegAlias<B16>;
-def S17   : AArch64Reg<17, "s17", [H17]>, DwarfRegAlias<B17>;
-def S18   : AArch64Reg<18, "s18", [H18]>, DwarfRegAlias<B18>;
-def S19   : AArch64Reg<19, "s19", [H19]>, DwarfRegAlias<B19>;
-def S20   : AArch64Reg<20, "s20", [H20]>, DwarfRegAlias<B20>;
-def S21   : AArch64Reg<21, "s21", [H21]>, DwarfRegAlias<B21>;
-def S22   : AArch64Reg<22, "s22", [H22]>, DwarfRegAlias<B22>;
-def S23   : AArch64Reg<23, "s23", [H23]>, DwarfRegAlias<B23>;
-def S24   : AArch64Reg<24, "s24", [H24]>, DwarfRegAlias<B24>;
-def S25   : AArch64Reg<25, "s25", [H25]>, DwarfRegAlias<B25>;
-def S26   : AArch64Reg<26, "s26", [H26]>, DwarfRegAlias<B26>;
-def S27   : AArch64Reg<27, "s27", [H27]>, DwarfRegAlias<B27>;
-def S28   : AArch64Reg<28, "s28", [H28]>, DwarfRegAlias<B28>;
-def S29   : AArch64Reg<29, "s29", [H29]>, DwarfRegAlias<B29>;
-def S30   : AArch64Reg<30, "s30", [H30]>, DwarfRegAlias<B30>;
-def S31   : AArch64Reg<31, "s31", [H31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [ssub], RegAltNameIndices = [vreg, vlist1] in {
-def D0    : AArch64Reg<0,   "d0", [S0], ["v0", ""]>, DwarfRegAlias<B0>;
-def D1    : AArch64Reg<1,   "d1", [S1], ["v1", ""]>, DwarfRegAlias<B1>;
-def D2    : AArch64Reg<2,   "d2", [S2], ["v2", ""]>, DwarfRegAlias<B2>;
-def D3    : AArch64Reg<3,   "d3", [S3], ["v3", ""]>, DwarfRegAlias<B3>;
-def D4    : AArch64Reg<4,   "d4", [S4], ["v4", ""]>, DwarfRegAlias<B4>;
-def D5    : AArch64Reg<5,   "d5", [S5], ["v5", ""]>, DwarfRegAlias<B5>;
-def D6    : AArch64Reg<6,   "d6", [S6], ["v6", ""]>, DwarfRegAlias<B6>;
-def D7    : AArch64Reg<7,   "d7", [S7], ["v7", ""]>, DwarfRegAlias<B7>;
-def D8    : AArch64Reg<8,   "d8", [S8], ["v8", ""]>, DwarfRegAlias<B8>;
-def D9    : AArch64Reg<9,   "d9", [S9], ["v9", ""]>, DwarfRegAlias<B9>;
-def D10   : AArch64Reg<10, "d10", [S10], ["v10", ""]>, DwarfRegAlias<B10>;
-def D11   : AArch64Reg<11, "d11", [S11], ["v11", ""]>, DwarfRegAlias<B11>;
-def D12   : AArch64Reg<12, "d12", [S12], ["v12", ""]>, DwarfRegAlias<B12>;
-def D13   : AArch64Reg<13, "d13", [S13], ["v13", ""]>,...
[truncated]

llvmbot · 2024-10-30T16:54:06Z

@llvm/pr-subscribers-backend-aarch64

Author: Sander de Smalen (sdesmalen-arm)

 
 let Namespace = "AArch64" in {
   // SubRegIndexes for GPR registers
-  def sub_32 : SubRegIndex<32>;
-  def sube64 : SubRegIndex<64>;
-  def subo64 : SubRegIndex<64>;
-  def sube32 : SubRegIndex<32>;
-  def subo32 : SubRegIndex<32>;
+  def sub_32   : SubRegIndex<32,  0>;
+  def sub_32_hi: SubRegIndex<32, 32>;
+  def sube64   : SubRegIndex<64>;
+  def subo64   : SubRegIndex<64>;
+  def sube32   : SubRegIndex<32>;
+  def subo32   : SubRegIndex<32>;
 
   // SubRegIndexes for FPR/Vector registers
-  def bsub : SubRegIndex<8>;
-  def hsub : SubRegIndex<16>;
-  def ssub : SubRegIndex<32>;
-  def dsub : SubRegIndex<64>;
-  def zsub : SubRegIndex<128>;
+  def bsub    : SubRegIndex<8, 0>;
+  def bsub_hi : SubRegIndex<8, 8>;
+  def hsub    : SubRegIndex<16, 0>;
+  def hsub_hi : SubRegIndex<16, 16>;
+  def ssub    : SubRegIndex<32, 0>;
+  def ssub_hi : SubRegIndex<32, 32>;
+  def dsub    : SubRegIndex<64, 0>;
+  def dsub_hi : SubRegIndex<64, 64>;
+  def zsub    : SubRegIndex<128, 0>;
+  def zsub_hi : SubRegIndex<-1, 128>;
   // Note: Code depends on these having consecutive numbers
-  def zsub0 : SubRegIndex<128, -1>;
-  def zsub1 : SubRegIndex<128, -1>;
-  def zsub2 : SubRegIndex<128, -1>;
-  def zsub3 : SubRegIndex<128, -1>;
-  // Note: Code depends on these having consecutive numbers
-  def dsub0 : SubRegIndex<64>;
-  def dsub1 : SubRegIndex<64>;
-  def dsub2 : SubRegIndex<64>;
-  def dsub3 : SubRegIndex<64>;
+  def zsub0 : SubRegIndex<-1>;
+  def zsub1 : SubRegIndex<-1>;
+  def zsub2 : SubRegIndex<-1>;
+  def zsub3 : SubRegIndex<-1>;
   // Note: Code depends on these having consecutive numbers
   def qsub0 : SubRegIndex<128>;
-  def qsub1 : SubRegIndex<128>;
-  def qsub2 : SubRegIndex<128>;
-  def qsub3 : SubRegIndex<128>;
+  def qsub1 : ComposedSubRegIndex<zsub1, zsub>;
+  def qsub2 : ComposedSubRegIndex<zsub2, zsub>;
+  def qsub3 : ComposedSubRegIndex<zsub3, zsub>;
+  // Note: Code depends on these having consecutive numbers
+  def dsub0 : SubRegIndex<64>;
+  def dsub1 : ComposedSubRegIndex<qsub1, dsub>;
+  def dsub2 : ComposedSubRegIndex<qsub2, dsub>;
+  def dsub3 : ComposedSubRegIndex<qsub3, dsub>;
 
   // SubRegIndexes for SME Matrix tiles
   def zasubb  : SubRegIndex<2048>; // (16 x 16)/1 bytes  = 2048 bits
@@ -60,10 +66,10 @@ let Namespace = "AArch64" in {
   def zasubq1 : SubRegIndex<128>;  // (16 x 16)/16 bytes = 128 bits
 
   // SubRegIndexes for SVE Predicates
-  def psub  : SubRegIndex<16>;
+  def psub  : SubRegIndex<-1>;
   // Note: Code depends on these having consecutive numbers
-  def psub0 : SubRegIndex<16, -1>;
-  def psub1 : SubRegIndex<16, -1>;
+  def psub0 : SubRegIndex<-1>;
+  def psub1 : SubRegIndex<-1>;
 }
 
 let Namespace = "AArch64" in {
@@ -74,6 +80,14 @@ let Namespace = "AArch64" in {
 //===----------------------------------------------------------------------===//
 // Registers
 //===----------------------------------------------------------------------===//
+
+foreach i = 0-30 in {
+  // Define W0_HI, W1_HI, .. W30_HI
+  def W#i#_HI : AArch64Reg<-1,  "w"#i#"_hi"> { let isArtificial = 1; }
+}
+def WSP_HI : AArch64Reg<-1,  "wsp_hi"> { let isArtificial = 1; }
+def WZR_HI : AArch64Reg<-1,  "wzr_hi"> { let isArtificial = 1; }
+
 def W0    : AArch64Reg<0,   "w0" >, DwarfRegNum<[0]>;
 def W1    : AArch64Reg<1,   "w1" >, DwarfRegNum<[1]>;
 def W2    : AArch64Reg<2,   "w2" >, DwarfRegNum<[2]>;
@@ -106,44 +120,42 @@ def W28   : AArch64Reg<28, "w28">, DwarfRegNum<[28]>;
 def W29   : AArch64Reg<29, "w29">, DwarfRegNum<[29]>;
 def W30   : AArch64Reg<30, "w30">, DwarfRegNum<[30]>;
 def WSP   : AArch64Reg<31, "wsp">, DwarfRegNum<[31]>;
-let isConstant = true in
-def WZR   : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP>;
-
-let SubRegIndices = [sub_32] in {
-def X0    : AArch64Reg<0,   "x0",  [W0]>, DwarfRegAlias<W0>;
-def X1    : AArch64Reg<1,   "x1",  [W1]>, DwarfRegAlias<W1>;
-def X2    : AArch64Reg<2,   "x2",  [W2]>, DwarfRegAlias<W2>;
-def X3    : AArch64Reg<3,   "x3",  [W3]>, DwarfRegAlias<W3>;
-def X4    : AArch64Reg<4,   "x4",  [W4]>, DwarfRegAlias<W4>;
-def X5    : AArch64Reg<5,   "x5",  [W5]>, DwarfRegAlias<W5>;
-def X6    : AArch64Reg<6,   "x6",  [W6]>, DwarfRegAlias<W6>;
-def X7    : AArch64Reg<7,   "x7",  [W7]>, DwarfRegAlias<W7>;
-def X8    : AArch64Reg<8,   "x8",  [W8]>, DwarfRegAlias<W8>;
-def X9    : AArch64Reg<9,   "x9",  [W9]>, DwarfRegAlias<W9>;
-def X10   : AArch64Reg<10, "x10", [W10]>, DwarfRegAlias<W10>;
-def X11   : AArch64Reg<11, "x11", [W11]>, DwarfRegAlias<W11>;
-def X12   : AArch64Reg<12, "x12", [W12]>, DwarfRegAlias<W12>;
-def X13   : AArch64Reg<13, "x13", [W13]>, DwarfRegAlias<W13>;
-def X14   : AArch64Reg<14, "x14", [W14]>, DwarfRegAlias<W14>;
-def X15   : AArch64Reg<15, "x15", [W15]>, DwarfRegAlias<W15>;
-def X16   : AArch64Reg<16, "x16", [W16]>, DwarfRegAlias<W16>;
-def X17   : AArch64Reg<17, "x17", [W17]>, DwarfRegAlias<W17>;
-def X18   : AArch64Reg<18, "x18", [W18]>, DwarfRegAlias<W18>;
-def X19   : AArch64Reg<19, "x19", [W19]>, DwarfRegAlias<W19>;
-def X20   : AArch64Reg<20, "x20", [W20]>, DwarfRegAlias<W20>;
-def X21   : AArch64Reg<21, "x21", [W21]>, DwarfRegAlias<W21>;
-def X22   : AArch64Reg<22, "x22", [W22]>, DwarfRegAlias<W22>;
-def X23   : AArch64Reg<23, "x23", [W23]>, DwarfRegAlias<W23>;
-def X24   : AArch64Reg<24, "x24", [W24]>, DwarfRegAlias<W24>;
-def X25   : AArch64Reg<25, "x25", [W25]>, DwarfRegAlias<W25>;
-def X26   : AArch64Reg<26, "x26", [W26]>, DwarfRegAlias<W26>;
-def X27   : AArch64Reg<27, "x27", [W27]>, DwarfRegAlias<W27>;
-def X28   : AArch64Reg<28, "x28", [W28]>, DwarfRegAlias<W28>;
-def FP    : AArch64Reg<29, "x29", [W29]>, DwarfRegAlias<W29>;
-def LR    : AArch64Reg<30, "x30", [W30]>, DwarfRegAlias<W30>;
-def SP    : AArch64Reg<31, "sp",  [WSP]>, DwarfRegAlias<WSP>;
-let isConstant = true in
-def XZR   : AArch64Reg<31, "xzr", [WZR]>, DwarfRegAlias<WSP>;
+def WZR   : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP> { let isConstant = true; }
+
+let SubRegIndices = [sub_32, sub_32_hi], CoveredBySubRegs = 1 in {
+def X0    : AArch64Reg<0,   "x0",  [W0,  W0_HI]>, DwarfRegAlias<W0>;
+def X1    : AArch64Reg<1,   "x1",  [W1,  W1_HI]>, DwarfRegAlias<W1>;
+def X2    : AArch64Reg<2,   "x2",  [W2,  W2_HI]>, DwarfRegAlias<W2>;
+def X3    : AArch64Reg<3,   "x3",  [W3,  W3_HI]>, DwarfRegAlias<W3>;
+def X4    : AArch64Reg<4,   "x4",  [W4,  W4_HI]>, DwarfRegAlias<W4>;
+def X5    : AArch64Reg<5,   "x5",  [W5,  W5_HI]>, DwarfRegAlias<W5>;
+def X6    : AArch64Reg<6,   "x6",  [W6,  W6_HI]>, DwarfRegAlias<W6>;
+def X7    : AArch64Reg<7,   "x7",  [W7,  W7_HI]>, DwarfRegAlias<W7>;
+def X8    : AArch64Reg<8,   "x8",  [W8,  W8_HI]>, DwarfRegAlias<W8>;
+def X9    : AArch64Reg<9,   "x9",  [W9,  W9_HI]>, DwarfRegAlias<W9>;
+def X10   : AArch64Reg<10, "x10", [W10, W10_HI]>, DwarfRegAlias<W10>;
+def X11   : AArch64Reg<11, "x11", [W11, W11_HI]>, DwarfRegAlias<W11>;
+def X12   : AArch64Reg<12, "x12", [W12, W12_HI]>, DwarfRegAlias<W12>;
+def X13   : AArch64Reg<13, "x13", [W13, W13_HI]>, DwarfRegAlias<W13>;
+def X14   : AArch64Reg<14, "x14", [W14, W14_HI]>, DwarfRegAlias<W14>;
+def X15   : AArch64Reg<15, "x15", [W15, W15_HI]>, DwarfRegAlias<W15>;
+def X16   : AArch64Reg<16, "x16", [W16, W16_HI]>, DwarfRegAlias<W16>;
+def X17   : AArch64Reg<17, "x17", [W17, W17_HI]>, DwarfRegAlias<W17>;
+def X18   : AArch64Reg<18, "x18", [W18, W18_HI]>, DwarfRegAlias<W18>;
+def X19   : AArch64Reg<19, "x19", [W19, W19_HI]>, DwarfRegAlias<W19>;
+def X20   : AArch64Reg<20, "x20", [W20, W20_HI]>, DwarfRegAlias<W20>;
+def X21   : AArch64Reg<21, "x21", [W21, W21_HI]>, DwarfRegAlias<W21>;
+def X22   : AArch64Reg<22, "x22", [W22, W22_HI]>, DwarfRegAlias<W22>;
+def X23   : AArch64Reg<23, "x23", [W23, W23_HI]>, DwarfRegAlias<W23>;
+def X24   : AArch64Reg<24, "x24", [W24, W24_HI]>, DwarfRegAlias<W24>;
+def X25   : AArch64Reg<25, "x25", [W25, W25_HI]>, DwarfRegAlias<W25>;
+def X26   : AArch64Reg<26, "x26", [W26, W26_HI]>, DwarfRegAlias<W26>;
+def X27   : AArch64Reg<27, "x27", [W27, W27_HI]>, DwarfRegAlias<W27>;
+def X28   : AArch64Reg<28, "x28", [W28, W28_HI]>, DwarfRegAlias<W28>;
+def FP    : AArch64Reg<29, "x29", [W29, W29_HI]>, DwarfRegAlias<W29>;
+def LR    : AArch64Reg<30, "x30", [W30, W30_HI]>, DwarfRegAlias<W30>;
+def SP    : AArch64Reg<31, "sp",  [WSP, WSP_HI]>, DwarfRegAlias<WSP>;
+def XZR   : AArch64Reg<31, "xzr", [WZR, WZR_HI]>, DwarfRegAlias<WSP> { let isConstant = true; }
 }
 
 // Condition code register.
@@ -290,6 +302,14 @@ def CCR : RegisterClass<"AArch64", [i32], 32, (add NZCV)> {
 // Floating Point Scalar Registers
 //===----------------------------------------------------------------------===//
 
+foreach i = 0-31 in {
+  def B#i#_HI : AArch64Reg<-1,  "b"#i#"_hi"> { let isArtificial = 1; }
+  def H#i#_HI : AArch64Reg<-1,  "h"#i#"_hi"> { let isArtificial = 1; }
+  def S#i#_HI : AArch64Reg<-1,  "s"#i#"_hi"> { let isArtificial = 1; }
+  def D#i#_HI : AArch64Reg<-1,  "d"#i#"_hi"> { let isArtificial = 1; }
+  def Q#i#_HI : AArch64Reg<-1,  "q"#i#"_hi"> { let isArtificial = 1; }
+}
+
 def B0    : AArch64Reg<0,   "b0">, DwarfRegNum<[64]>;
 def B1    : AArch64Reg<1,   "b1">, DwarfRegNum<[65]>;
 def B2    : AArch64Reg<2,   "b2">, DwarfRegNum<[66]>;
@@ -323,144 +343,144 @@ def B29   : AArch64Reg<29, "b29">, DwarfRegNum<[93]>;
 def B30   : AArch64Reg<30, "b30">, DwarfRegNum<[94]>;
 def B31   : AArch64Reg<31, "b31">, DwarfRegNum<[95]>;
 
-let SubRegIndices = [bsub] in {
-def H0    : AArch64Reg<0,   "h0", [B0]>, DwarfRegAlias<B0>;
-def H1    : AArch64Reg<1,   "h1", [B1]>, DwarfRegAlias<B1>;
-def H2    : AArch64Reg<2,   "h2", [B2]>, DwarfRegAlias<B2>;
-def H3    : AArch64Reg<3,   "h3", [B3]>, DwarfRegAlias<B3>;
-def H4    : AArch64Reg<4,   "h4", [B4]>, DwarfRegAlias<B4>;
-def H5    : AArch64Reg<5,   "h5", [B5]>, DwarfRegAlias<B5>;
-def H6    : AArch64Reg<6,   "h6", [B6]>, DwarfRegAlias<B6>;
-def H7    : AArch64Reg<7,   "h7", [B7]>, DwarfRegAlias<B7>;
-def H8    : AArch64Reg<8,   "h8", [B8]>, DwarfRegAlias<B8>;
-def H9    : AArch64Reg<9,   "h9", [B9]>, DwarfRegAlias<B9>;
-def H10   : AArch64Reg<10, "h10", [B10]>, DwarfRegAlias<B10>;
-def H11   : AArch64Reg<11, "h11", [B11]>, DwarfRegAlias<B11>;
-def H12   : AArch64Reg<12, "h12", [B12]>, DwarfRegAlias<B12>;
-def H13   : AArch64Reg<13, "h13", [B13]>, DwarfRegAlias<B13>;
-def H14   : AArch64Reg<14, "h14", [B14]>, DwarfRegAlias<B14>;
-def H15   : AArch64Reg<15, "h15", [B15]>, DwarfRegAlias<B15>;
-def H16   : AArch64Reg<16, "h16", [B16]>, DwarfRegAlias<B16>;
-def H17   : AArch64Reg<17, "h17", [B17]>, DwarfRegAlias<B17>;
-def H18   : AArch64Reg<18, "h18", [B18]>, DwarfRegAlias<B18>;
-def H19   : AArch64Reg<19, "h19", [B19]>, DwarfRegAlias<B19>;
-def H20   : AArch64Reg<20, "h20", [B20]>, DwarfRegAlias<B20>;
-def H21   : AArch64Reg<21, "h21", [B21]>, DwarfRegAlias<B21>;
-def H22   : AArch64Reg<22, "h22", [B22]>, DwarfRegAlias<B22>;
-def H23   : AArch64Reg<23, "h23", [B23]>, DwarfRegAlias<B23>;
-def H24   : AArch64Reg<24, "h24", [B24]>, DwarfRegAlias<B24>;
-def H25   : AArch64Reg<25, "h25", [B25]>, DwarfRegAlias<B25>;
-def H26   : AArch64Reg<26, "h26", [B26]>, DwarfRegAlias<B26>;
-def H27   : AArch64Reg<27, "h27", [B27]>, DwarfRegAlias<B27>;
-def H28   : AArch64Reg<28, "h28", [B28]>, DwarfRegAlias<B28>;
-def H29   : AArch64Reg<29, "h29", [B29]>, DwarfRegAlias<B29>;
-def H30   : AArch64Reg<30, "h30", [B30]>, DwarfRegAlias<B30>;
-def H31   : AArch64Reg<31, "h31", [B31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [hsub] in {
-def S0    : AArch64Reg<0,   "s0", [H0]>, DwarfRegAlias<B0>;
-def S1    : AArch64Reg<1,   "s1", [H1]>, DwarfRegAlias<B1>;
-def S2    : AArch64Reg<2,   "s2", [H2]>, DwarfRegAlias<B2>;
-def S3    : AArch64Reg<3,   "s3", [H3]>, DwarfRegAlias<B3>;
-def S4    : AArch64Reg<4,   "s4", [H4]>, DwarfRegAlias<B4>;
-def S5    : AArch64Reg<5,   "s5", [H5]>, DwarfRegAlias<B5>;
-def S6    : AArch64Reg<6,   "s6", [H6]>, DwarfRegAlias<B6>;
-def S7    : AArch64Reg<7,   "s7", [H7]>, DwarfRegAlias<B7>;
-def S8    : AArch64Reg<8,   "s8", [H8]>, DwarfRegAlias<B8>;
-def S9    : AArch64Reg<9,   "s9", [H9]>, DwarfRegAlias<B9>;
-def S10   : AArch64Reg<10, "s10", [H10]>, DwarfRegAlias<B10>;
-def S11   : AArch64Reg<11, "s11", [H11]>, DwarfRegAlias<B11>;
-def S12   : AArch64Reg<12, "s12", [H12]>, DwarfRegAlias<B12>;
-def S13   : AArch64Reg<13, "s13", [H13]>, DwarfRegAlias<B13>;
-def S14   : AArch64Reg<14, "s14", [H14]>, DwarfRegAlias<B14>;
-def S15   : AArch64Reg<15, "s15", [H15]>, DwarfRegAlias<B15>;
-def S16   : AArch64Reg<16, "s16", [H16]>, DwarfRegAlias<B16>;
-def S17   : AArch64Reg<17, "s17", [H17]>, DwarfRegAlias<B17>;
-def S18   : AArch64Reg<18, "s18", [H18]>, DwarfRegAlias<B18>;
-def S19   : AArch64Reg<19, "s19", [H19]>, DwarfRegAlias<B19>;
-def S20   : AArch64Reg<20, "s20", [H20]>, DwarfRegAlias<B20>;
-def S21   : AArch64Reg<21, "s21", [H21]>, DwarfRegAlias<B21>;
-def S22   : AArch64Reg<22, "s22", [H22]>, DwarfRegAlias<B22>;
-def S23   : AArch64Reg<23, "s23", [H23]>, DwarfRegAlias<B23>;
-def S24   : AArch64Reg<24, "s24", [H24]>, DwarfRegAlias<B24>;
-def S25   : AArch64Reg<25, "s25", [H25]>, DwarfRegAlias<B25>;
-def S26   : AArch64Reg<26, "s26", [H26]>, DwarfRegAlias<B26>;
-def S27   : AArch64Reg<27, "s27", [H27]>, DwarfRegAlias<B27>;
-def S28   : AArch64Reg<28, "s28", [H28]>, DwarfRegAlias<B28>;
-def S29   : AArch64Reg<29, "s29", [H29]>, DwarfRegAlias<B29>;
-def S30   : AArch64Reg<30, "s30", [H30]>, DwarfRegAlias<B30>;
-def S31   : AArch64Reg<31, "s31", [H31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [ssub], RegAltNameIndices = [vreg, vlist1] in {
-def D0    : AArch64Reg<0,   "d0", [S0], ["v0", ""]>, DwarfRegAlias<B0>;
-def D1    : AArch64Reg<1,   "d1", [S1], ["v1", ""]>, DwarfRegAlias<B1>;
-def D2    : AArch64Reg<2,   "d2", [S2], ["v2", ""]>, DwarfRegAlias<B2>;
-def D3    : AArch64Reg<3,   "d3", [S3], ["v3", ""]>, DwarfRegAlias<B3>;
-def D4    : AArch64Reg<4,   "d4", [S4], ["v4", ""]>, DwarfRegAlias<B4>;
-def D5    : AArch64Reg<5,   "d5", [S5], ["v5", ""]>, DwarfRegAlias<B5>;
-def D6    : AArch64Reg<6,   "d6", [S6], ["v6", ""]>, DwarfRegAlias<B6>;
-def D7    : AArch64Reg<7,   "d7", [S7], ["v7", ""]>, DwarfRegAlias<B7>;
-def D8    : AArch64Reg<8,   "d8", [S8], ["v8", ""]>, DwarfRegAlias<B8>;
-def D9    : AArch64Reg<9,   "d9", [S9], ["v9", ""]>, DwarfRegAlias<B9>;
-def D10   : AArch64Reg<10, "d10", [S10], ["v10", ""]>, DwarfRegAlias<B10>;
-def D11   : AArch64Reg<11, "d11", [S11], ["v11", ""]>, DwarfRegAlias<B11>;
-def D12   : AArch64Reg<12, "d12", [S12], ["v12", ""]>, DwarfRegAlias<B12>;
-def D13   : AArch64Reg<13, "d13", [S13], ["v13", ""]>,...
[truncated]

github-actions · 2024-10-30T16:57:11Z

✅ With the latest revision this PR passed the C/C++ code formatter.

arsenm

Is it possible to split out the tablegen changes

arsenm · 2024-10-30T16:59:24Z

llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp

@@ -424,6 +424,58 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
  return {};
 }

+static SmallVector<MCPhysReg> ReservedHi = {


Make this std::array or a regular C array. Alternatively could introduce a new register class for the synthetic cases.

It might also be possible to get away without explicitly marking these as reserved

Alternatively could introduce a new register class for the synthetic cases.

That's actually something I had to do (see here), as the compiler otherwise runs into some assertion requiring the (artificial) registers to have a corresponding regclass.

It might also be possible to get away without explicitly marking these as reserved

When I remove these from the list of reserved registers, a lot of the tests fail.

What kind of failures? AMDGPU also has synthetic 16-bit high sub registers and they are not explicitly reserved. Are you adding these to an allocatable class?

Failures in LiveRangeCalc. This happens because AArch64RegisterInfo::getStrictlyReservedRegs marks e.g. the 32-bit WSP and all of its super-registers (in this case the 64-bit SP) as reserved. WSP_HI is a sibling register of WSP, so markSuperRegs does not reach it, but it should also be marked as reserved.

But what are the actual failures, messages, location? If the high half of register isn't allocatable / addressable in the first place, it shouldn't just appear to cause issues

Without marking the registers as reserved, then for the example below:

---
name: sv2i64
tracksRegLiveness: true
body: |
  bb.0.entry:
    liveins: $q0, $q1
    %0:fpr128 = COPY $q0
    %1:fpr128 = COPY $q1
    %35:gpr64 = COPY %0.dsub
    %36:gpr64 = COPY %1.dsub
    %9:gpr64 = SDIVXr %35, %36
    %37:gpr64 = UMOVvi64 %0, 1
    %38:gpr64 = UMOVvi64 %1, 1
    %10:gpr64 = SDIVXr %37, %38
    %19:fpr128 = INSvi64gpr undef %19, 0, %9
    %19:fpr128 = INSvi64gpr %19, 1, %10
    %39:gpr64 = COPY %19.dsub
    %24:gpr64 = MADDXrrr %39, %36, $xzr
    %41:gpr64 = UMOVvi64 %19, 1
    %25:gpr64 = MADDXrrr %41, %38, $xzr
    %34:fpr128 = INSvi64gpr undef %34, 0, %24
    %34:fpr128 = INSvi64gpr %34, 1, %25
    %2:fpr128 = SUBv2i64 %0, %34
    $q0 = COPY %2
    RET_ReallyLR implicit $q0
...

When I run this with:

llc -global-isel -verify-machineinstrs -run-pass=machine-scheduler

It fails with:

Use of $xzr does not have a corresponding definition on every path:
216r %10:gpr64 = MADDXrrr %9:gpr64, %3:gpr64, $xzr
LLVM ERROR: Use not jointly dominated by defs.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ./bin/llc -global-isel -verify-machineinstrs -run-pass=machine-scheduler /tmp/t.mir -o -
1. Running pass 'Function Pass Manager' on module '/tmp/t.mir'.
2. Running pass 'Machine Instruction Scheduler' on function '@sv2i64'
...
#8 0x0000ffff80062b7c llvm::LiveRangeCalc::findReachingDefs(llvm::LiveRange&, llvm::MachineBasicBlock&, llvm::SlotIndex, unsigned int, llvm::ArrayRef<llvm::SlotIndex>)
#9 0x0000ffff80063e94 llvm::LiveRangeCalc::extend(llvm::LiveRange&, llvm::SlotIndex, unsigned int, llvm::ArrayRef<llvm::SlotIndex>)
#10 0x0000ffff80064a18 llvm::LiveIntervalCalc::extendToUses(llvm::LiveRange&, llvm::Register, llvm::LaneBitmask, llvm::LiveInterval*)
#11 0x0000ffff8003e82c llvm::LiveIntervals::computeRegUnitRange(llvm::LiveRange&, unsigned int)
#12 0x0000ffff80044cdc llvm::LiveIntervals::HMEditor::updateAllRanges(llvm::MachineInstr*)
#13 0x0000ffff8004848c llvm::LiveIntervals::handleMove(llvm::MachineInstr&, bool)
#14 0x0000ffff801f44ec llvm::ScheduleDAGMI::moveInstruction(llvm::MachineInstr*, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>)
#15 0x0000ffff801fdb58 llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit*, bool)
#16 0x0000ffff8020b214 llvm::ScheduleDAGMILive::schedule()
#17 0x0000ffff801f0934 (anonymous namespace)::MachineSchedulerBase::scheduleRegions(llvm::ScheduleDAGInstrs&, bool) (.isra.0) MachineScheduler.cpp:0:0

This smells like an unrelated bug, this is not the kind of error I expected

I don't think there is a bug; the code for moving an instruction goes through the list of operands to update the register's liverange. For each physreg it then goes through the regunits to calculate/update the liverange for that regunit, but only if the regunit is not reserved.

The code that determines if the register is reserved says:

// A register unit is considered reserved if all its roots and all their
// super registers are reserved.

Without this change to AArch64RegisterInfo.cpp, WZR and XZR are marked as reserved, but WZR_HI isn't (because WZR_HI is a sibling of WZR, and markSuperRegs marks only XZR as reserved), and so IsReserved is false for the WZR_HI regunit.

Why this doesn't fail for AMDGPU I don't know, perhaps these registers are always virtual and they never go down this path.
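To make the argument above concrete, here is a minimal sketch of the "reserved regunit" rule using a toy model. The `ToyRegInfo` type, its members, and `makeZeroRegModel` are hypothetical names invented for illustration; they are not LLVM's data structures, and the real check works on register units and their roots rather than on strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register model. Register names mirror the PR, but the data structures
// are illustrative assumptions, not LLVM's.
struct ToyRegInfo {
  // Register -> its direct super-registers.
  std::map<std::string, std::vector<std::string>> SuperRegs;

  // Mark Reg and, transitively, all of its super-registers as reserved.
  // Sibling sub-registers of the same super-register are NOT touched.
  void markSuperRegs(std::set<std::string> &Reserved,
                     const std::string &Reg) const {
    Reserved.insert(Reg);
    auto It = SuperRegs.find(Reg);
    if (It == SuperRegs.end())
      return;
    for (const std::string &Super : It->second)
      markSuperRegs(Reserved, Super);
  }

  // A unit rooted at Root counts as reserved only if Root and all of its
  // super-registers are reserved.
  bool isUnitReserved(const std::set<std::string> &Reserved,
                      const std::string &Root) const {
    if (!Reserved.count(Root))
      return false;
    auto It = SuperRegs.find(Root);
    if (It == SuperRegs.end())
      return true;
    for (const std::string &Super : It->second)
      if (!isUnitReserved(Reserved, Super))
        return false;
    return true;
  }
};

// XZR has two sub-registers after this patch: WZR and the artificial WZR_HI.
ToyRegInfo makeZeroRegModel() {
  ToyRegInfo TRI;
  TRI.SuperRegs["WZR"] = {"XZR"};
  TRI.SuperRegs["WZR_HI"] = {"XZR"};
  return TRI;
}
```

In this model, calling `markSuperRegs(Reserved, "WZR")` reserves WZR and XZR, yet `isUnitReserved(Reserved, "WZR_HI")` is still false until WZR_HI is added to the set explicitly, which is what the loop over `ReservedHi` does in the patch.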

arsenm · 2024-10-30T17:01:38Z

llvm/unittests/Target/AArch64/AArch64RegisterInfoTest.cpp

+
+  // Test that there is no overlap between different (sub)registers
+  // in a tuple.
+  ASSERT_EQ(TRI.getSubRegIndexLaneMask(AArch64::dsub0) &


EXPECT_EQ throughout

When CoveredBySubRegs is true and a sub-register consists of two parts, a regular subreg and an artificial subreg, TableGen should treat both as part of the concatenation of subregs. This happens, for example, when the 64-bit register 'D0' consists of the 32-bit 'S0_HI' (artificial) and 'S0', and 'S0' in turn consists of the 16-bit 'H0_HI' (artificial) and 'H0'. The concatenation should then be: S0_HI, H0_HI, H0.
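The D0 example can be sketched as a recursive expansion of a sub-register tree into its leaves. This is only an illustration of the expected result, not TableGen's implementation: `SubRegMap`, `flattenSubRegs`, and `concatenationOfD0` are hypothetical names, the tree stops at H0 as in the commit message, and the high-then-low ordering is chosen solely to match the order the message lists.

```cpp
#include <map>
#include <string>
#include <vector>

// Register -> ordered list of its direct sub-registers (toy representation).
using SubRegMap = std::map<std::string, std::vector<std::string>>;

// Expand Reg into the leaves of its sub-register tree. Artificial
// sub-registers (the *_HI ones) are kept, not skipped.
void flattenSubRegs(const SubRegMap &SubRegs, const std::string &Reg,
                    std::vector<std::string> &Leaves) {
  auto It = SubRegs.find(Reg);
  if (It == SubRegs.end() || It->second.empty()) {
    Leaves.push_back(Reg); // no further sub-registers: this is a leaf
    return;
  }
  for (const std::string &Sub : It->second)
    flattenSubRegs(SubRegs, Sub, Leaves);
}

// The example from the commit message: D0 = {S0_HI, S0}, S0 = {H0_HI, H0}.
std::vector<std::string> concatenationOfD0() {
  SubRegMap SubRegs = {{"D0", {"S0_HI", "S0"}}, {"S0", {"H0_HI", "H0"}}};
  std::vector<std::string> Leaves;
  flattenSubRegs(SubRegs, "D0", Leaves);
  return Leaves;
}
```

If the expansion skipped artificial registers, the result for D0 would be just H0, which no longer covers the full register.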

TableGen builds up a map of "SubRegIdx -> Subclass", where Subclass is the largest class in which all registers have SubRegIdx as a sub-register. When SubRegIdx (and hence the sub-register) is artificial, TableGen should still include it in the map. The map is used in various places, including the calculation of a register class's lane mask, which is otherwise computed incorrectly.

This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here llvm#109797. There are several changes in this patch:
* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class, which meant that we had to update 'magic numbers' in several tests. The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by an explicit unit test.
* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fully covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples.
* TableGen: The handling of the isArtificial flag was entirely broken. Skipping out too early from some of the loops led to incorrect internal representation of the (sub)register(index) hierarchy, and thus resulted in incorrect TableGen info.
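As a rough illustration of what "fully covered by their subregisters" means, the sketch below checks whether a set of (size, offset) pieces tiles a register exactly. The (size, offset) pairs correspond to the two arguments of `SubRegIndex<size, offset>` in the .td diff (for instance `sub_32` is `<32, 0>` and `sub_32_hi` is `<32, 32>`); `ToySubRegIndex` and `isFullyCovered` are hypothetical helpers, limited to registers of at most 64 bits, and do not reflect how TableGen computes coverage or lane masks.

```cpp
#include <cstdint>
#include <vector>

// A sub-register index described by its size and offset in bits.
struct ToySubRegIndex {
  unsigned Size;
  unsigned Offset;
};

// Returns true if the pieces tile bits [0, RegBits) exactly, with no gaps
// and no overlaps. Only registers of 1 to 64 bits are supported here.
bool isFullyCovered(unsigned RegBits,
                    const std::vector<ToySubRegIndex> &Pieces) {
  if (RegBits == 0 || RegBits > 64)
    return false;
  uint64_t Seen = 0;
  for (const ToySubRegIndex &P : Pieces) {
    if (P.Size == 0 || P.Size > RegBits || P.Offset > RegBits - P.Size)
      return false; // piece is empty or sticks out of the register
    uint64_t Low = P.Size == 64 ? ~0ULL : ((1ULL << P.Size) - 1);
    uint64_t Mask = Low << P.Offset;
    if (Seen & Mask)
      return false; // two pieces overlap
    Seen |= Mask;
  }
  uint64_t Full = RegBits == 64 ? ~0ULL : ((1ULL << RegBits) - 1);
  return Seen == Full;
}
```

With only `sub_32` (`{32, 0}`) a 64-bit X register is not covered; adding `sub_32_hi` (`{32, 32}`) makes the check pass, which mirrors why the patch introduces the `_HI` registers and sets `CoveredBySubRegs = 1`.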

sdesmalen-arm · 2024-10-31T11:46:26Z

Is it possible to split out the tablegen changes

Sure, I've created #114391 and #114392!

arsenm · 2024-10-31T15:02:29Z

llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp

@@ -424,6 +424,57 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
  return {};
 }

+static MCPhysReg ReservedHi[] = {


missing const

arsenm · 2024-10-31T15:04:56Z

llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp

@@ -424,6 +424,58 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
  return {};
 }

+static SmallVector<MCPhysReg> ReservedHi = {


What kind of failures? AMDGPU also has synthetic 16-bit high sub registers and they are not explicitly reserved. Are you adding these to an allocatable class?

arsenm · 2024-11-04T16:18:07Z

Was this reopened as a new PR?

sdesmalen-arm · 2024-11-04T16:21:28Z

It wasn't, but I also didn't realise that I closed it. Could GitHub have done this automatically after the branch it was based off was deleted? (I was about to push the rebased branch of this PR after merging #114391 and #114392.)

sdesmalen-arm · 2024-11-04T16:23:15Z

Trying to reopen..

sdesmalen-arm · 2024-11-04T16:26:20Z

I think I need to create a new PR for this, as GitHub doesn't allow me to reopen and choose a different branch to merge into.

sdesmalen-arm requested review from jayfoad, arsenm, davemgreen and kmclaughlin-arm October 30, 2024 16:53

llvmbot added backend:AArch64 tablegen llvm:globalisel labels Oct 30, 2024

arsenm reviewed Oct 30, 2024

sdesmalen-arm added 4 commits October 31, 2024 10:02

Baseline Tablegen patch, with test, no tablegen changes

8cb11b0

Fixups

61b3666

sdesmalen-arm force-pushed the srlt-define-high-bits branch from f7e1173 to 61b3666 Compare October 31, 2024 11:50

sdesmalen-arm changed the base branch from main to users/sdesmalen-arm/srlt-fix-tablegen-artificial-subreg-map October 31, 2024 11:51

arsenm reviewed Oct 31, 2024

Add const to ReservedHi

68105cf

sdesmalen-arm force-pushed the users/sdesmalen-arm/srlt-fix-tablegen-artificial-subreg-map branch 2 times, most recently from 303e1c8 to 371287e Compare November 4, 2024 15:54

sdesmalen-arm deleted the branch llvm:users/sdesmalen-arm/srlt-fix-tablegen-artificial-subreg-map November 4, 2024 16:10

sdesmalen-arm closed this Nov 4, 2024

sdesmalen-arm mentioned this pull request Nov 4, 2024

[AArch64] Define high bits of FPR and GPR registers (take 2) #114827

Merged
