-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Define high bits of FPR and GPR registers (take 2) #114827
Conversation
This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here llvm#109797. There are several changes in this patch: * AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class which meant that we had to update 'magic numbers' in several tests. The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by an explicit unit tests. * LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fullly covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples. * TableGen: The handling of the isArtificial flag was entirely broken. Skipping out too early from some of the loops led to incorrect internal representation of the (sub)register(index) hierarchy, and thus resulted in incorrect TableGen info.
@llvm/pr-subscribers-tablegen @llvm/pr-subscribers-llvm-globalisel Author: Sander de Smalen (sdesmalen-arm) ChangesThis is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here #109797. There are several changes in this patch:
(This follows on from #114263 which was mysteriously closed.) Patch is 134.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114827.diff 21 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index 1a9e5899892a1b..b051122609883f 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -1541,7 +1541,10 @@ static bool canRenameMOP(const MachineOperand &MOP,
// Note that this relies on the structure of the AArch64 register file. In
// particular, a subregister cannot be written without overwriting the
// whole register.
- if (RegClass->HasDisjunctSubRegs) {
+ if (RegClass->HasDisjunctSubRegs && RegClass->CoveredBySubRegs &&
+ (TRI->getSubRegisterClass(RegClass, AArch64::dsub0) ||
+ TRI->getSubRegisterClass(RegClass, AArch64::qsub0) ||
+ TRI->getSubRegisterClass(RegClass, AArch64::zsub0))) {
LLVM_DEBUG(
dbgs()
<< " Cannot rename operands with multiple disjunct subregisters ("
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index 18290dd5f32df9..a885ce218f6eb2 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -424,6 +424,57 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
return {};
}
+static const MCPhysReg ReservedHi[] = {
+ AArch64::B0_HI, AArch64::B1_HI, AArch64::B2_HI, AArch64::B3_HI,
+ AArch64::B4_HI, AArch64::B5_HI, AArch64::B6_HI, AArch64::B7_HI,
+ AArch64::B8_HI, AArch64::B9_HI, AArch64::B10_HI, AArch64::B11_HI,
+ AArch64::B12_HI, AArch64::B13_HI, AArch64::B14_HI, AArch64::B15_HI,
+ AArch64::B16_HI, AArch64::B17_HI, AArch64::B18_HI, AArch64::B19_HI,
+ AArch64::B20_HI, AArch64::B21_HI, AArch64::B22_HI, AArch64::B23_HI,
+ AArch64::B24_HI, AArch64::B25_HI, AArch64::B26_HI, AArch64::B27_HI,
+ AArch64::B28_HI, AArch64::B29_HI, AArch64::B30_HI, AArch64::B31_HI,
+ AArch64::H0_HI, AArch64::H1_HI, AArch64::H2_HI, AArch64::H3_HI,
+ AArch64::H4_HI, AArch64::H5_HI, AArch64::H6_HI, AArch64::H7_HI,
+ AArch64::H8_HI, AArch64::H9_HI, AArch64::H10_HI, AArch64::H11_HI,
+ AArch64::H12_HI, AArch64::H13_HI, AArch64::H14_HI, AArch64::H15_HI,
+ AArch64::H16_HI, AArch64::H17_HI, AArch64::H18_HI, AArch64::H19_HI,
+ AArch64::H20_HI, AArch64::H21_HI, AArch64::H22_HI, AArch64::H23_HI,
+ AArch64::H24_HI, AArch64::H25_HI, AArch64::H26_HI, AArch64::H27_HI,
+ AArch64::H28_HI, AArch64::H29_HI, AArch64::H30_HI, AArch64::H31_HI,
+ AArch64::S0_HI, AArch64::S1_HI, AArch64::S2_HI, AArch64::S3_HI,
+ AArch64::S4_HI, AArch64::S5_HI, AArch64::S6_HI, AArch64::S7_HI,
+ AArch64::S8_HI, AArch64::S9_HI, AArch64::S10_HI, AArch64::S11_HI,
+ AArch64::S12_HI, AArch64::S13_HI, AArch64::S14_HI, AArch64::S15_HI,
+ AArch64::S16_HI, AArch64::S17_HI, AArch64::S18_HI, AArch64::S19_HI,
+ AArch64::S20_HI, AArch64::S21_HI, AArch64::S22_HI, AArch64::S23_HI,
+ AArch64::S24_HI, AArch64::S25_HI, AArch64::S26_HI, AArch64::S27_HI,
+ AArch64::S28_HI, AArch64::S29_HI, AArch64::S30_HI, AArch64::S31_HI,
+ AArch64::D0_HI, AArch64::D1_HI, AArch64::D2_HI, AArch64::D3_HI,
+ AArch64::D4_HI, AArch64::D5_HI, AArch64::D6_HI, AArch64::D7_HI,
+ AArch64::D8_HI, AArch64::D9_HI, AArch64::D10_HI, AArch64::D11_HI,
+ AArch64::D12_HI, AArch64::D13_HI, AArch64::D14_HI, AArch64::D15_HI,
+ AArch64::D16_HI, AArch64::D17_HI, AArch64::D18_HI, AArch64::D19_HI,
+ AArch64::D20_HI, AArch64::D21_HI, AArch64::D22_HI, AArch64::D23_HI,
+ AArch64::D24_HI, AArch64::D25_HI, AArch64::D26_HI, AArch64::D27_HI,
+ AArch64::D28_HI, AArch64::D29_HI, AArch64::D30_HI, AArch64::D31_HI,
+ AArch64::Q0_HI, AArch64::Q1_HI, AArch64::Q2_HI, AArch64::Q3_HI,
+ AArch64::Q4_HI, AArch64::Q5_HI, AArch64::Q6_HI, AArch64::Q7_HI,
+ AArch64::Q8_HI, AArch64::Q9_HI, AArch64::Q10_HI, AArch64::Q11_HI,
+ AArch64::Q12_HI, AArch64::Q13_HI, AArch64::Q14_HI, AArch64::Q15_HI,
+ AArch64::Q16_HI, AArch64::Q17_HI, AArch64::Q18_HI, AArch64::Q19_HI,
+ AArch64::Q20_HI, AArch64::Q21_HI, AArch64::Q22_HI, AArch64::Q23_HI,
+ AArch64::Q24_HI, AArch64::Q25_HI, AArch64::Q26_HI, AArch64::Q27_HI,
+ AArch64::Q28_HI, AArch64::Q29_HI, AArch64::Q30_HI, AArch64::Q31_HI,
+ AArch64::W0_HI, AArch64::W1_HI, AArch64::W2_HI, AArch64::W3_HI,
+ AArch64::W4_HI, AArch64::W5_HI, AArch64::W6_HI, AArch64::W7_HI,
+ AArch64::W8_HI, AArch64::W9_HI, AArch64::W10_HI, AArch64::W11_HI,
+ AArch64::W12_HI, AArch64::W13_HI, AArch64::W14_HI, AArch64::W15_HI,
+ AArch64::W16_HI, AArch64::W17_HI, AArch64::W18_HI, AArch64::W19_HI,
+ AArch64::W20_HI, AArch64::W21_HI, AArch64::W22_HI, AArch64::W23_HI,
+ AArch64::W24_HI, AArch64::W25_HI, AArch64::W26_HI, AArch64::W27_HI,
+ AArch64::W28_HI, AArch64::W29_HI, AArch64::W30_HI, AArch64::WSP_HI,
+ AArch64::WZR_HI};
+
BitVector
AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
const AArch64FrameLowering *TFI = getFrameLowering(MF);
@@ -490,7 +541,10 @@ AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, AArch64::W28);
}
- assert(checkAllSuperRegsMarked(Reserved));
+ for (Register R : ReservedHi)
+ Reserved.set(R);
+
+ assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
return Reserved;
}
@@ -514,7 +568,7 @@ AArch64RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, AArch64::LR);
}
- assert(checkAllSuperRegsMarked(Reserved));
+ assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
return Reserved;
}
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index 4117d74d10c1e7..5f7f6aa7a7bf99 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -20,33 +20,39 @@ class AArch64Reg<bits<16> enc, string n, list<Register> subregs = [],
let Namespace = "AArch64" in {
// SubRegIndexes for GPR registers
- def sub_32 : SubRegIndex<32>;
- def sube64 : SubRegIndex<64>;
- def subo64 : SubRegIndex<64>;
- def sube32 : SubRegIndex<32>;
- def subo32 : SubRegIndex<32>;
+ def sub_32 : SubRegIndex<32, 0>;
+ def sub_32_hi: SubRegIndex<32, 32>;
+ def sube64 : SubRegIndex<64>;
+ def subo64 : SubRegIndex<64>;
+ def sube32 : SubRegIndex<32>;
+ def subo32 : SubRegIndex<32>;
// SubRegIndexes for FPR/Vector registers
- def bsub : SubRegIndex<8>;
- def hsub : SubRegIndex<16>;
- def ssub : SubRegIndex<32>;
- def dsub : SubRegIndex<64>;
- def zsub : SubRegIndex<128>;
+ def bsub : SubRegIndex<8, 0>;
+ def bsub_hi : SubRegIndex<8, 8>;
+ def hsub : SubRegIndex<16, 0>;
+ def hsub_hi : SubRegIndex<16, 16>;
+ def ssub : SubRegIndex<32, 0>;
+ def ssub_hi : SubRegIndex<32, 32>;
+ def dsub : SubRegIndex<64, 0>;
+ def dsub_hi : SubRegIndex<64, 64>;
+ def zsub : SubRegIndex<128, 0>;
+ def zsub_hi : SubRegIndex<-1, 128>;
// Note: Code depends on these having consecutive numbers
- def zsub0 : SubRegIndex<128, -1>;
- def zsub1 : SubRegIndex<128, -1>;
- def zsub2 : SubRegIndex<128, -1>;
- def zsub3 : SubRegIndex<128, -1>;
- // Note: Code depends on these having consecutive numbers
- def dsub0 : SubRegIndex<64>;
- def dsub1 : SubRegIndex<64>;
- def dsub2 : SubRegIndex<64>;
- def dsub3 : SubRegIndex<64>;
+ def zsub0 : SubRegIndex<-1>;
+ def zsub1 : SubRegIndex<-1>;
+ def zsub2 : SubRegIndex<-1>;
+ def zsub3 : SubRegIndex<-1>;
// Note: Code depends on these having consecutive numbers
def qsub0 : SubRegIndex<128>;
- def qsub1 : SubRegIndex<128>;
- def qsub2 : SubRegIndex<128>;
- def qsub3 : SubRegIndex<128>;
+ def qsub1 : ComposedSubRegIndex<zsub1, zsub>;
+ def qsub2 : ComposedSubRegIndex<zsub2, zsub>;
+ def qsub3 : ComposedSubRegIndex<zsub3, zsub>;
+ // Note: Code depends on these having consecutive numbers
+ def dsub0 : SubRegIndex<64>;
+ def dsub1 : ComposedSubRegIndex<qsub1, dsub>;
+ def dsub2 : ComposedSubRegIndex<qsub2, dsub>;
+ def dsub3 : ComposedSubRegIndex<qsub3, dsub>;
// SubRegIndexes for SME Matrix tiles
def zasubb : SubRegIndex<2048>; // (16 x 16)/1 bytes = 2048 bits
@@ -60,10 +66,10 @@ let Namespace = "AArch64" in {
def zasubq1 : SubRegIndex<128>; // (16 x 16)/16 bytes = 128 bits
// SubRegIndexes for SVE Predicates
- def psub : SubRegIndex<16>;
+ def psub : SubRegIndex<-1>;
// Note: Code depends on these having consecutive numbers
- def psub0 : SubRegIndex<16, -1>;
- def psub1 : SubRegIndex<16, -1>;
+ def psub0 : SubRegIndex<-1>;
+ def psub1 : SubRegIndex<-1>;
}
let Namespace = "AArch64" in {
@@ -74,6 +80,14 @@ let Namespace = "AArch64" in {
//===----------------------------------------------------------------------===//
// Registers
//===----------------------------------------------------------------------===//
+
+foreach i = 0-30 in {
+ // Define W0_HI, W1_HI, .. W30_HI
+ def W#i#_HI : AArch64Reg<-1, "w"#i#"_hi"> { let isArtificial = 1; }
+}
+def WSP_HI : AArch64Reg<-1, "wsp_hi"> { let isArtificial = 1; }
+def WZR_HI : AArch64Reg<-1, "wzr_hi"> { let isArtificial = 1; }
+
def W0 : AArch64Reg<0, "w0" >, DwarfRegNum<[0]>;
def W1 : AArch64Reg<1, "w1" >, DwarfRegNum<[1]>;
def W2 : AArch64Reg<2, "w2" >, DwarfRegNum<[2]>;
@@ -106,44 +120,42 @@ def W28 : AArch64Reg<28, "w28">, DwarfRegNum<[28]>;
def W29 : AArch64Reg<29, "w29">, DwarfRegNum<[29]>;
def W30 : AArch64Reg<30, "w30">, DwarfRegNum<[30]>;
def WSP : AArch64Reg<31, "wsp">, DwarfRegNum<[31]>;
-let isConstant = true in
-def WZR : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP>;
-
-let SubRegIndices = [sub_32] in {
-def X0 : AArch64Reg<0, "x0", [W0]>, DwarfRegAlias<W0>;
-def X1 : AArch64Reg<1, "x1", [W1]>, DwarfRegAlias<W1>;
-def X2 : AArch64Reg<2, "x2", [W2]>, DwarfRegAlias<W2>;
-def X3 : AArch64Reg<3, "x3", [W3]>, DwarfRegAlias<W3>;
-def X4 : AArch64Reg<4, "x4", [W4]>, DwarfRegAlias<W4>;
-def X5 : AArch64Reg<5, "x5", [W5]>, DwarfRegAlias<W5>;
-def X6 : AArch64Reg<6, "x6", [W6]>, DwarfRegAlias<W6>;
-def X7 : AArch64Reg<7, "x7", [W7]>, DwarfRegAlias<W7>;
-def X8 : AArch64Reg<8, "x8", [W8]>, DwarfRegAlias<W8>;
-def X9 : AArch64Reg<9, "x9", [W9]>, DwarfRegAlias<W9>;
-def X10 : AArch64Reg<10, "x10", [W10]>, DwarfRegAlias<W10>;
-def X11 : AArch64Reg<11, "x11", [W11]>, DwarfRegAlias<W11>;
-def X12 : AArch64Reg<12, "x12", [W12]>, DwarfRegAlias<W12>;
-def X13 : AArch64Reg<13, "x13", [W13]>, DwarfRegAlias<W13>;
-def X14 : AArch64Reg<14, "x14", [W14]>, DwarfRegAlias<W14>;
-def X15 : AArch64Reg<15, "x15", [W15]>, DwarfRegAlias<W15>;
-def X16 : AArch64Reg<16, "x16", [W16]>, DwarfRegAlias<W16>;
-def X17 : AArch64Reg<17, "x17", [W17]>, DwarfRegAlias<W17>;
-def X18 : AArch64Reg<18, "x18", [W18]>, DwarfRegAlias<W18>;
-def X19 : AArch64Reg<19, "x19", [W19]>, DwarfRegAlias<W19>;
-def X20 : AArch64Reg<20, "x20", [W20]>, DwarfRegAlias<W20>;
-def X21 : AArch64Reg<21, "x21", [W21]>, DwarfRegAlias<W21>;
-def X22 : AArch64Reg<22, "x22", [W22]>, DwarfRegAlias<W22>;
-def X23 : AArch64Reg<23, "x23", [W23]>, DwarfRegAlias<W23>;
-def X24 : AArch64Reg<24, "x24", [W24]>, DwarfRegAlias<W24>;
-def X25 : AArch64Reg<25, "x25", [W25]>, DwarfRegAlias<W25>;
-def X26 : AArch64Reg<26, "x26", [W26]>, DwarfRegAlias<W26>;
-def X27 : AArch64Reg<27, "x27", [W27]>, DwarfRegAlias<W27>;
-def X28 : AArch64Reg<28, "x28", [W28]>, DwarfRegAlias<W28>;
-def FP : AArch64Reg<29, "x29", [W29]>, DwarfRegAlias<W29>;
-def LR : AArch64Reg<30, "x30", [W30]>, DwarfRegAlias<W30>;
-def SP : AArch64Reg<31, "sp", [WSP]>, DwarfRegAlias<WSP>;
-let isConstant = true in
-def XZR : AArch64Reg<31, "xzr", [WZR]>, DwarfRegAlias<WSP>;
+def WZR : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP> { let isConstant = true; }
+
+let SubRegIndices = [sub_32, sub_32_hi], CoveredBySubRegs = 1 in {
+def X0 : AArch64Reg<0, "x0", [W0, W0_HI]>, DwarfRegAlias<W0>;
+def X1 : AArch64Reg<1, "x1", [W1, W1_HI]>, DwarfRegAlias<W1>;
+def X2 : AArch64Reg<2, "x2", [W2, W2_HI]>, DwarfRegAlias<W2>;
+def X3 : AArch64Reg<3, "x3", [W3, W3_HI]>, DwarfRegAlias<W3>;
+def X4 : AArch64Reg<4, "x4", [W4, W4_HI]>, DwarfRegAlias<W4>;
+def X5 : AArch64Reg<5, "x5", [W5, W5_HI]>, DwarfRegAlias<W5>;
+def X6 : AArch64Reg<6, "x6", [W6, W6_HI]>, DwarfRegAlias<W6>;
+def X7 : AArch64Reg<7, "x7", [W7, W7_HI]>, DwarfRegAlias<W7>;
+def X8 : AArch64Reg<8, "x8", [W8, W8_HI]>, DwarfRegAlias<W8>;
+def X9 : AArch64Reg<9, "x9", [W9, W9_HI]>, DwarfRegAlias<W9>;
+def X10 : AArch64Reg<10, "x10", [W10, W10_HI]>, DwarfRegAlias<W10>;
+def X11 : AArch64Reg<11, "x11", [W11, W11_HI]>, DwarfRegAlias<W11>;
+def X12 : AArch64Reg<12, "x12", [W12, W12_HI]>, DwarfRegAlias<W12>;
+def X13 : AArch64Reg<13, "x13", [W13, W13_HI]>, DwarfRegAlias<W13>;
+def X14 : AArch64Reg<14, "x14", [W14, W14_HI]>, DwarfRegAlias<W14>;
+def X15 : AArch64Reg<15, "x15", [W15, W15_HI]>, DwarfRegAlias<W15>;
+def X16 : AArch64Reg<16, "x16", [W16, W16_HI]>, DwarfRegAlias<W16>;
+def X17 : AArch64Reg<17, "x17", [W17, W17_HI]>, DwarfRegAlias<W17>;
+def X18 : AArch64Reg<18, "x18", [W18, W18_HI]>, DwarfRegAlias<W18>;
+def X19 : AArch64Reg<19, "x19", [W19, W19_HI]>, DwarfRegAlias<W19>;
+def X20 : AArch64Reg<20, "x20", [W20, W20_HI]>, DwarfRegAlias<W20>;
+def X21 : AArch64Reg<21, "x21", [W21, W21_HI]>, DwarfRegAlias<W21>;
+def X22 : AArch64Reg<22, "x22", [W22, W22_HI]>, DwarfRegAlias<W22>;
+def X23 : AArch64Reg<23, "x23", [W23, W23_HI]>, DwarfRegAlias<W23>;
+def X24 : AArch64Reg<24, "x24", [W24, W24_HI]>, DwarfRegAlias<W24>;
+def X25 : AArch64Reg<25, "x25", [W25, W25_HI]>, DwarfRegAlias<W25>;
+def X26 : AArch64Reg<26, "x26", [W26, W26_HI]>, DwarfRegAlias<W26>;
+def X27 : AArch64Reg<27, "x27", [W27, W27_HI]>, DwarfRegAlias<W27>;
+def X28 : AArch64Reg<28, "x28", [W28, W28_HI]>, DwarfRegAlias<W28>;
+def FP : AArch64Reg<29, "x29", [W29, W29_HI]>, DwarfRegAlias<W29>;
+def LR : AArch64Reg<30, "x30", [W30, W30_HI]>, DwarfRegAlias<W30>;
+def SP : AArch64Reg<31, "sp", [WSP, WSP_HI]>, DwarfRegAlias<WSP>;
+def XZR : AArch64Reg<31, "xzr", [WZR, WZR_HI]>, DwarfRegAlias<WSP> { let isConstant = true; }
}
// Condition code register.
@@ -290,6 +302,14 @@ def CCR : RegisterClass<"AArch64", [i32], 32, (add NZCV)> {
// Floating Point Scalar Registers
//===----------------------------------------------------------------------===//
+foreach i = 0-31 in {
+ def B#i#_HI : AArch64Reg<-1, "b"#i#"_hi"> { let isArtificial = 1; }
+ def H#i#_HI : AArch64Reg<-1, "h"#i#"_hi"> { let isArtificial = 1; }
+ def S#i#_HI : AArch64Reg<-1, "s"#i#"_hi"> { let isArtificial = 1; }
+ def D#i#_HI : AArch64Reg<-1, "d"#i#"_hi"> { let isArtificial = 1; }
+ def Q#i#_HI : AArch64Reg<-1, "q"#i#"_hi"> { let isArtificial = 1; }
+}
+
def B0 : AArch64Reg<0, "b0">, DwarfRegNum<[64]>;
def B1 : AArch64Reg<1, "b1">, DwarfRegNum<[65]>;
def B2 : AArch64Reg<2, "b2">, DwarfRegNum<[66]>;
@@ -323,144 +343,144 @@ def B29 : AArch64Reg<29, "b29">, DwarfRegNum<[93]>;
def B30 : AArch64Reg<30, "b30">, DwarfRegNum<[94]>;
def B31 : AArch64Reg<31, "b31">, DwarfRegNum<[95]>;
-let SubRegIndices = [bsub] in {
-def H0 : AArch64Reg<0, "h0", [B0]>, DwarfRegAlias<B0>;
-def H1 : AArch64Reg<1, "h1", [B1]>, DwarfRegAlias<B1>;
-def H2 : AArch64Reg<2, "h2", [B2]>, DwarfRegAlias<B2>;
-def H3 : AArch64Reg<3, "h3", [B3]>, DwarfRegAlias<B3>;
-def H4 : AArch64Reg<4, "h4", [B4]>, DwarfRegAlias<B4>;
-def H5 : AArch64Reg<5, "h5", [B5]>, DwarfRegAlias<B5>;
-def H6 : AArch64Reg<6, "h6", [B6]>, DwarfRegAlias<B6>;
-def H7 : AArch64Reg<7, "h7", [B7]>, DwarfRegAlias<B7>;
-def H8 : AArch64Reg<8, "h8", [B8]>, DwarfRegAlias<B8>;
-def H9 : AArch64Reg<9, "h9", [B9]>, DwarfRegAlias<B9>;
-def H10 : AArch64Reg<10, "h10", [B10]>, DwarfRegAlias<B10>;
-def H11 : AArch64Reg<11, "h11", [B11]>, DwarfRegAlias<B11>;
-def H12 : AArch64Reg<12, "h12", [B12]>, DwarfRegAlias<B12>;
-def H13 : AArch64Reg<13, "h13", [B13]>, DwarfRegAlias<B13>;
-def H14 : AArch64Reg<14, "h14", [B14]>, DwarfRegAlias<B14>;
-def H15 : AArch64Reg<15, "h15", [B15]>, DwarfRegAlias<B15>;
-def H16 : AArch64Reg<16, "h16", [B16]>, DwarfRegAlias<B16>;
-def H17 : AArch64Reg<17, "h17", [B17]>, DwarfRegAlias<B17>;
-def H18 : AArch64Reg<18, "h18", [B18]>, DwarfRegAlias<B18>;
-def H19 : AArch64Reg<19, "h19", [B19]>, DwarfRegAlias<B19>;
-def H20 : AArch64Reg<20, "h20", [B20]>, DwarfRegAlias<B20>;
-def H21 : AArch64Reg<21, "h21", [B21]>, DwarfRegAlias<B21>;
-def H22 : AArch64Reg<22, "h22", [B22]>, DwarfRegAlias<B22>;
-def H23 : AArch64Reg<23, "h23", [B23]>, DwarfRegAlias<B23>;
-def H24 : AArch64Reg<24, "h24", [B24]>, DwarfRegAlias<B24>;
-def H25 : AArch64Reg<25, "h25", [B25]>, DwarfRegAlias<B25>;
-def H26 : AArch64Reg<26, "h26", [B26]>, DwarfRegAlias<B26>;
-def H27 : AArch64Reg<27, "h27", [B27]>, DwarfRegAlias<B27>;
-def H28 : AArch64Reg<28, "h28", [B28]>, DwarfRegAlias<B28>;
-def H29 : AArch64Reg<29, "h29", [B29]>, DwarfRegAlias<B29>;
-def H30 : AArch64Reg<30, "h30", [B30]>, DwarfRegAlias<B30>;
-def H31 : AArch64Reg<31, "h31", [B31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [hsub] in {
-def S0 : AArch64Reg<0, "s0", [H0]>, DwarfRegAlias<B0>;
-def S1 : AArch64Reg<1, "s1", [H1]>, DwarfRegAlias<B1>;
-def S2 : AArch64Reg<2, "s2", [H2]>, DwarfRegAlias<B2>;
-def S3 : AArch64Reg<3, "s3", [H3]>, DwarfRegAlias<B3>;
-def S4 : AArch64Reg<4, "s4", [H4]>, DwarfRegAlias<B4>;
-def S5 : AArch64Reg<5, "s5", [H5]>, DwarfRegAlias<B5>;
-def S6 : AArch64Reg<6, "s6", [H6]>, DwarfRegAlias<B6>;
-def S7 : AArch64Reg<7, "s7", [H7]>, DwarfRegAlias<B7>;
-def S8 : AArch64Reg<8, "s8", [H8]>, DwarfRegAlias<B8>;
-def S9 : AArch64Reg<9, "s9", [H9]>, DwarfRegAlias<B9>;
-def S10 : AArch64Reg<10, "s10", [H10]>, DwarfRegAlias<B10>;
-def S11 : AArch64Reg<11, "s11", [H11]>, DwarfRegAlias<B11>;
-def S12 : AArch64Reg<12, "s12", [H12]>, DwarfRegAlias<B12>;
-def S13 : AArch64Reg<13, "s13", [H13]>, DwarfRegAlias<B13>;
-def S14 : AArch64Reg<14, "s14", [H14]>, DwarfRegAlias<B14>;
-def S15 : AArch64Reg<15, "s15", [H15]>, DwarfRegAlias<B15>;
-def S16 : AArch64Reg<16, "s16", [H16]>, DwarfRegAlias<B16>;
-def S17 : AArch64Reg<17, "s17", [H17]>, DwarfRegAlias<B17>;
-def S18 : AArch64Reg<18, "s18", [H18]>, DwarfRegAlias<B18>;
-def S19 : AArch64Reg<19, "s19", [H19]>, DwarfRegAlias<B19>;
-def S20 : AArch64Reg<20, "s20", [H20]>, DwarfRegAlias<B20>;
-def S21 : AArch64Reg<21, "s21", [H21]>, DwarfRegAlias<B21>;
-def S22 : AArch64Reg<22, "s22", [H22]>, DwarfRegAlias<B22>;
-def S23 : AArch64Reg<23, "s23", [H23]>, DwarfRegAlias<B23>;
-def S24 : AArch64Reg<24, "s24", [H24]>, DwarfRegAlias<B24>;
-def S25 : AArch64Reg<25, "s25", [H25]>, DwarfRegAlias<B25>;
-def S26 : AArch64Reg<26, "s26", [H26]>, DwarfRegAlias<B26>;
-def S27 : AArch64Reg<27, "s27", [H27]>, DwarfRegAlias<B27>;
-def S28 : AArch64Reg<28, "s28", [H28]>, DwarfRegAlias<B28>;
-def S29 : AArch64Reg<29, "s29", [H29]>, DwarfRegAlias<B29>;
-def S30 : AArch64Reg<30, "s30", [H30]>, DwarfRegAlias<B30>;
-def S31 : AArch64Reg<31, "s31", [H31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [ssub], RegAltNameIndices = [vreg, vlist1] in {
-def D0 : AArch64Reg<0, "d0", [S0], ["v0", ""]>, DwarfRegAlias<B0>;
-def D1 : AArch64Reg<1, "d1", [S1], ["v1", ""]>, DwarfRegAlias<B1>;
-def D2 : AArch64Reg<2, "d2", [S2], ["v2", ""]>, DwarfRegAlias<B2>;
-def D3 : AArch64Reg<3, "d3", [S3], ["v3", ""]>, DwarfRegAlias<B3>;
-def D4 : AArch64Reg<4, "d4", [S4], ["v4", ""]>, DwarfRegAlias<B4>;
-def D5 : AArch64Reg<5, "d5", [S5], ["v5", ""]>, DwarfRegAlias<B5>;
-def D6 : AArch64Reg<6, "d6", [S6], ["v6", ""]>, DwarfRegAlias<B6>;
-def D7 : AArch64Reg<7, "d7", [S7], ["v7", ""]>, DwarfRegAlias<B7>;
-def D8 : AArch64Reg<8, "d8", [S8], ["v8", ""]>, DwarfRegAlias<B8>;
-def D9 : AArch64Reg<9, "d9", [S9], ["v9", ""]>, DwarfRegAlias<B9>;
-def D10 : AArch64Reg<10, "d10", [S10], ["v10", ""]>, DwarfRegAlias<B10>;
-def D11 : AArch64Reg<11, "d11", [S11], ["v11", ""]>, DwarfRegAlias<B11>;
-def D12 : AArch64Reg<12, "d12", [S12], ["v12", ""]>, DwarfRegAlias<B12>;
-def D13 : AArch64Reg<13, "d13", [S13], ["v13", ""]>, DwarfRegAl...
[truncated]
|
@llvm/pr-subscribers-backend-aarch64 Author: Sander de Smalen (sdesmalen-arm) ChangesThis is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here #109797. There are several changes in this patch:
(This follows on from #114263 which was mysteriously closed.) Patch is 134.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114827.diff 21 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index 1a9e5899892a1b..b051122609883f 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -1541,7 +1541,10 @@ static bool canRenameMOP(const MachineOperand &MOP,
// Note that this relies on the structure of the AArch64 register file. In
// particular, a subregister cannot be written without overwriting the
// whole register.
- if (RegClass->HasDisjunctSubRegs) {
+ if (RegClass->HasDisjunctSubRegs && RegClass->CoveredBySubRegs &&
+ (TRI->getSubRegisterClass(RegClass, AArch64::dsub0) ||
+ TRI->getSubRegisterClass(RegClass, AArch64::qsub0) ||
+ TRI->getSubRegisterClass(RegClass, AArch64::zsub0))) {
LLVM_DEBUG(
dbgs()
<< " Cannot rename operands with multiple disjunct subregisters ("
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index 18290dd5f32df9..a885ce218f6eb2 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -424,6 +424,57 @@ AArch64RegisterInfo::explainReservedReg(const MachineFunction &MF,
return {};
}
+static const MCPhysReg ReservedHi[] = {
+ AArch64::B0_HI, AArch64::B1_HI, AArch64::B2_HI, AArch64::B3_HI,
+ AArch64::B4_HI, AArch64::B5_HI, AArch64::B6_HI, AArch64::B7_HI,
+ AArch64::B8_HI, AArch64::B9_HI, AArch64::B10_HI, AArch64::B11_HI,
+ AArch64::B12_HI, AArch64::B13_HI, AArch64::B14_HI, AArch64::B15_HI,
+ AArch64::B16_HI, AArch64::B17_HI, AArch64::B18_HI, AArch64::B19_HI,
+ AArch64::B20_HI, AArch64::B21_HI, AArch64::B22_HI, AArch64::B23_HI,
+ AArch64::B24_HI, AArch64::B25_HI, AArch64::B26_HI, AArch64::B27_HI,
+ AArch64::B28_HI, AArch64::B29_HI, AArch64::B30_HI, AArch64::B31_HI,
+ AArch64::H0_HI, AArch64::H1_HI, AArch64::H2_HI, AArch64::H3_HI,
+ AArch64::H4_HI, AArch64::H5_HI, AArch64::H6_HI, AArch64::H7_HI,
+ AArch64::H8_HI, AArch64::H9_HI, AArch64::H10_HI, AArch64::H11_HI,
+ AArch64::H12_HI, AArch64::H13_HI, AArch64::H14_HI, AArch64::H15_HI,
+ AArch64::H16_HI, AArch64::H17_HI, AArch64::H18_HI, AArch64::H19_HI,
+ AArch64::H20_HI, AArch64::H21_HI, AArch64::H22_HI, AArch64::H23_HI,
+ AArch64::H24_HI, AArch64::H25_HI, AArch64::H26_HI, AArch64::H27_HI,
+ AArch64::H28_HI, AArch64::H29_HI, AArch64::H30_HI, AArch64::H31_HI,
+ AArch64::S0_HI, AArch64::S1_HI, AArch64::S2_HI, AArch64::S3_HI,
+ AArch64::S4_HI, AArch64::S5_HI, AArch64::S6_HI, AArch64::S7_HI,
+ AArch64::S8_HI, AArch64::S9_HI, AArch64::S10_HI, AArch64::S11_HI,
+ AArch64::S12_HI, AArch64::S13_HI, AArch64::S14_HI, AArch64::S15_HI,
+ AArch64::S16_HI, AArch64::S17_HI, AArch64::S18_HI, AArch64::S19_HI,
+ AArch64::S20_HI, AArch64::S21_HI, AArch64::S22_HI, AArch64::S23_HI,
+ AArch64::S24_HI, AArch64::S25_HI, AArch64::S26_HI, AArch64::S27_HI,
+ AArch64::S28_HI, AArch64::S29_HI, AArch64::S30_HI, AArch64::S31_HI,
+ AArch64::D0_HI, AArch64::D1_HI, AArch64::D2_HI, AArch64::D3_HI,
+ AArch64::D4_HI, AArch64::D5_HI, AArch64::D6_HI, AArch64::D7_HI,
+ AArch64::D8_HI, AArch64::D9_HI, AArch64::D10_HI, AArch64::D11_HI,
+ AArch64::D12_HI, AArch64::D13_HI, AArch64::D14_HI, AArch64::D15_HI,
+ AArch64::D16_HI, AArch64::D17_HI, AArch64::D18_HI, AArch64::D19_HI,
+ AArch64::D20_HI, AArch64::D21_HI, AArch64::D22_HI, AArch64::D23_HI,
+ AArch64::D24_HI, AArch64::D25_HI, AArch64::D26_HI, AArch64::D27_HI,
+ AArch64::D28_HI, AArch64::D29_HI, AArch64::D30_HI, AArch64::D31_HI,
+ AArch64::Q0_HI, AArch64::Q1_HI, AArch64::Q2_HI, AArch64::Q3_HI,
+ AArch64::Q4_HI, AArch64::Q5_HI, AArch64::Q6_HI, AArch64::Q7_HI,
+ AArch64::Q8_HI, AArch64::Q9_HI, AArch64::Q10_HI, AArch64::Q11_HI,
+ AArch64::Q12_HI, AArch64::Q13_HI, AArch64::Q14_HI, AArch64::Q15_HI,
+ AArch64::Q16_HI, AArch64::Q17_HI, AArch64::Q18_HI, AArch64::Q19_HI,
+ AArch64::Q20_HI, AArch64::Q21_HI, AArch64::Q22_HI, AArch64::Q23_HI,
+ AArch64::Q24_HI, AArch64::Q25_HI, AArch64::Q26_HI, AArch64::Q27_HI,
+ AArch64::Q28_HI, AArch64::Q29_HI, AArch64::Q30_HI, AArch64::Q31_HI,
+ AArch64::W0_HI, AArch64::W1_HI, AArch64::W2_HI, AArch64::W3_HI,
+ AArch64::W4_HI, AArch64::W5_HI, AArch64::W6_HI, AArch64::W7_HI,
+ AArch64::W8_HI, AArch64::W9_HI, AArch64::W10_HI, AArch64::W11_HI,
+ AArch64::W12_HI, AArch64::W13_HI, AArch64::W14_HI, AArch64::W15_HI,
+ AArch64::W16_HI, AArch64::W17_HI, AArch64::W18_HI, AArch64::W19_HI,
+ AArch64::W20_HI, AArch64::W21_HI, AArch64::W22_HI, AArch64::W23_HI,
+ AArch64::W24_HI, AArch64::W25_HI, AArch64::W26_HI, AArch64::W27_HI,
+ AArch64::W28_HI, AArch64::W29_HI, AArch64::W30_HI, AArch64::WSP_HI,
+ AArch64::WZR_HI};
+
BitVector
AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
const AArch64FrameLowering *TFI = getFrameLowering(MF);
@@ -490,7 +541,10 @@ AArch64RegisterInfo::getStrictlyReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, AArch64::W28);
}
- assert(checkAllSuperRegsMarked(Reserved));
+ for (Register R : ReservedHi)
+ Reserved.set(R);
+
+ assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
return Reserved;
}
@@ -514,7 +568,7 @@ AArch64RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, AArch64::LR);
}
- assert(checkAllSuperRegsMarked(Reserved));
+ assert(checkAllSuperRegsMarked(Reserved, ReservedHi));
return Reserved;
}
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index 4117d74d10c1e7..5f7f6aa7a7bf99 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -20,33 +20,39 @@ class AArch64Reg<bits<16> enc, string n, list<Register> subregs = [],
let Namespace = "AArch64" in {
// SubRegIndexes for GPR registers
- def sub_32 : SubRegIndex<32>;
- def sube64 : SubRegIndex<64>;
- def subo64 : SubRegIndex<64>;
- def sube32 : SubRegIndex<32>;
- def subo32 : SubRegIndex<32>;
+ def sub_32 : SubRegIndex<32, 0>;
+ def sub_32_hi: SubRegIndex<32, 32>;
+ def sube64 : SubRegIndex<64>;
+ def subo64 : SubRegIndex<64>;
+ def sube32 : SubRegIndex<32>;
+ def subo32 : SubRegIndex<32>;
// SubRegIndexes for FPR/Vector registers
- def bsub : SubRegIndex<8>;
- def hsub : SubRegIndex<16>;
- def ssub : SubRegIndex<32>;
- def dsub : SubRegIndex<64>;
- def zsub : SubRegIndex<128>;
+ def bsub : SubRegIndex<8, 0>;
+ def bsub_hi : SubRegIndex<8, 8>;
+ def hsub : SubRegIndex<16, 0>;
+ def hsub_hi : SubRegIndex<16, 16>;
+ def ssub : SubRegIndex<32, 0>;
+ def ssub_hi : SubRegIndex<32, 32>;
+ def dsub : SubRegIndex<64, 0>;
+ def dsub_hi : SubRegIndex<64, 64>;
+ def zsub : SubRegIndex<128, 0>;
+ def zsub_hi : SubRegIndex<-1, 128>;
// Note: Code depends on these having consecutive numbers
- def zsub0 : SubRegIndex<128, -1>;
- def zsub1 : SubRegIndex<128, -1>;
- def zsub2 : SubRegIndex<128, -1>;
- def zsub3 : SubRegIndex<128, -1>;
- // Note: Code depends on these having consecutive numbers
- def dsub0 : SubRegIndex<64>;
- def dsub1 : SubRegIndex<64>;
- def dsub2 : SubRegIndex<64>;
- def dsub3 : SubRegIndex<64>;
+ def zsub0 : SubRegIndex<-1>;
+ def zsub1 : SubRegIndex<-1>;
+ def zsub2 : SubRegIndex<-1>;
+ def zsub3 : SubRegIndex<-1>;
// Note: Code depends on these having consecutive numbers
def qsub0 : SubRegIndex<128>;
- def qsub1 : SubRegIndex<128>;
- def qsub2 : SubRegIndex<128>;
- def qsub3 : SubRegIndex<128>;
+ def qsub1 : ComposedSubRegIndex<zsub1, zsub>;
+ def qsub2 : ComposedSubRegIndex<zsub2, zsub>;
+ def qsub3 : ComposedSubRegIndex<zsub3, zsub>;
+ // Note: Code depends on these having consecutive numbers
+ def dsub0 : SubRegIndex<64>;
+ def dsub1 : ComposedSubRegIndex<qsub1, dsub>;
+ def dsub2 : ComposedSubRegIndex<qsub2, dsub>;
+ def dsub3 : ComposedSubRegIndex<qsub3, dsub>;
// SubRegIndexes for SME Matrix tiles
def zasubb : SubRegIndex<2048>; // (16 x 16)/1 bytes = 2048 bits
@@ -60,10 +66,10 @@ let Namespace = "AArch64" in {
def zasubq1 : SubRegIndex<128>; // (16 x 16)/16 bytes = 128 bits
// SubRegIndexes for SVE Predicates
- def psub : SubRegIndex<16>;
+ def psub : SubRegIndex<-1>;
// Note: Code depends on these having consecutive numbers
- def psub0 : SubRegIndex<16, -1>;
- def psub1 : SubRegIndex<16, -1>;
+ def psub0 : SubRegIndex<-1>;
+ def psub1 : SubRegIndex<-1>;
}
let Namespace = "AArch64" in {
@@ -74,6 +80,14 @@ let Namespace = "AArch64" in {
//===----------------------------------------------------------------------===//
// Registers
//===----------------------------------------------------------------------===//
+
+foreach i = 0-30 in {
+ // Define W0_HI, W1_HI, .. W30_HI
+ def W#i#_HI : AArch64Reg<-1, "w"#i#"_hi"> { let isArtificial = 1; }
+}
+def WSP_HI : AArch64Reg<-1, "wsp_hi"> { let isArtificial = 1; }
+def WZR_HI : AArch64Reg<-1, "wzr_hi"> { let isArtificial = 1; }
+
def W0 : AArch64Reg<0, "w0" >, DwarfRegNum<[0]>;
def W1 : AArch64Reg<1, "w1" >, DwarfRegNum<[1]>;
def W2 : AArch64Reg<2, "w2" >, DwarfRegNum<[2]>;
@@ -106,44 +120,42 @@ def W28 : AArch64Reg<28, "w28">, DwarfRegNum<[28]>;
def W29 : AArch64Reg<29, "w29">, DwarfRegNum<[29]>;
def W30 : AArch64Reg<30, "w30">, DwarfRegNum<[30]>;
def WSP : AArch64Reg<31, "wsp">, DwarfRegNum<[31]>;
-let isConstant = true in
-def WZR : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP>;
-
-let SubRegIndices = [sub_32] in {
-def X0 : AArch64Reg<0, "x0", [W0]>, DwarfRegAlias<W0>;
-def X1 : AArch64Reg<1, "x1", [W1]>, DwarfRegAlias<W1>;
-def X2 : AArch64Reg<2, "x2", [W2]>, DwarfRegAlias<W2>;
-def X3 : AArch64Reg<3, "x3", [W3]>, DwarfRegAlias<W3>;
-def X4 : AArch64Reg<4, "x4", [W4]>, DwarfRegAlias<W4>;
-def X5 : AArch64Reg<5, "x5", [W5]>, DwarfRegAlias<W5>;
-def X6 : AArch64Reg<6, "x6", [W6]>, DwarfRegAlias<W6>;
-def X7 : AArch64Reg<7, "x7", [W7]>, DwarfRegAlias<W7>;
-def X8 : AArch64Reg<8, "x8", [W8]>, DwarfRegAlias<W8>;
-def X9 : AArch64Reg<9, "x9", [W9]>, DwarfRegAlias<W9>;
-def X10 : AArch64Reg<10, "x10", [W10]>, DwarfRegAlias<W10>;
-def X11 : AArch64Reg<11, "x11", [W11]>, DwarfRegAlias<W11>;
-def X12 : AArch64Reg<12, "x12", [W12]>, DwarfRegAlias<W12>;
-def X13 : AArch64Reg<13, "x13", [W13]>, DwarfRegAlias<W13>;
-def X14 : AArch64Reg<14, "x14", [W14]>, DwarfRegAlias<W14>;
-def X15 : AArch64Reg<15, "x15", [W15]>, DwarfRegAlias<W15>;
-def X16 : AArch64Reg<16, "x16", [W16]>, DwarfRegAlias<W16>;
-def X17 : AArch64Reg<17, "x17", [W17]>, DwarfRegAlias<W17>;
-def X18 : AArch64Reg<18, "x18", [W18]>, DwarfRegAlias<W18>;
-def X19 : AArch64Reg<19, "x19", [W19]>, DwarfRegAlias<W19>;
-def X20 : AArch64Reg<20, "x20", [W20]>, DwarfRegAlias<W20>;
-def X21 : AArch64Reg<21, "x21", [W21]>, DwarfRegAlias<W21>;
-def X22 : AArch64Reg<22, "x22", [W22]>, DwarfRegAlias<W22>;
-def X23 : AArch64Reg<23, "x23", [W23]>, DwarfRegAlias<W23>;
-def X24 : AArch64Reg<24, "x24", [W24]>, DwarfRegAlias<W24>;
-def X25 : AArch64Reg<25, "x25", [W25]>, DwarfRegAlias<W25>;
-def X26 : AArch64Reg<26, "x26", [W26]>, DwarfRegAlias<W26>;
-def X27 : AArch64Reg<27, "x27", [W27]>, DwarfRegAlias<W27>;
-def X28 : AArch64Reg<28, "x28", [W28]>, DwarfRegAlias<W28>;
-def FP : AArch64Reg<29, "x29", [W29]>, DwarfRegAlias<W29>;
-def LR : AArch64Reg<30, "x30", [W30]>, DwarfRegAlias<W30>;
-def SP : AArch64Reg<31, "sp", [WSP]>, DwarfRegAlias<WSP>;
-let isConstant = true in
-def XZR : AArch64Reg<31, "xzr", [WZR]>, DwarfRegAlias<WSP>;
+def WZR : AArch64Reg<31, "wzr">, DwarfRegAlias<WSP> { let isConstant = true; }
+
+let SubRegIndices = [sub_32, sub_32_hi], CoveredBySubRegs = 1 in {
+def X0 : AArch64Reg<0, "x0", [W0, W0_HI]>, DwarfRegAlias<W0>;
+def X1 : AArch64Reg<1, "x1", [W1, W1_HI]>, DwarfRegAlias<W1>;
+def X2 : AArch64Reg<2, "x2", [W2, W2_HI]>, DwarfRegAlias<W2>;
+def X3 : AArch64Reg<3, "x3", [W3, W3_HI]>, DwarfRegAlias<W3>;
+def X4 : AArch64Reg<4, "x4", [W4, W4_HI]>, DwarfRegAlias<W4>;
+def X5 : AArch64Reg<5, "x5", [W5, W5_HI]>, DwarfRegAlias<W5>;
+def X6 : AArch64Reg<6, "x6", [W6, W6_HI]>, DwarfRegAlias<W6>;
+def X7 : AArch64Reg<7, "x7", [W7, W7_HI]>, DwarfRegAlias<W7>;
+def X8 : AArch64Reg<8, "x8", [W8, W8_HI]>, DwarfRegAlias<W8>;
+def X9 : AArch64Reg<9, "x9", [W9, W9_HI]>, DwarfRegAlias<W9>;
+def X10 : AArch64Reg<10, "x10", [W10, W10_HI]>, DwarfRegAlias<W10>;
+def X11 : AArch64Reg<11, "x11", [W11, W11_HI]>, DwarfRegAlias<W11>;
+def X12 : AArch64Reg<12, "x12", [W12, W12_HI]>, DwarfRegAlias<W12>;
+def X13 : AArch64Reg<13, "x13", [W13, W13_HI]>, DwarfRegAlias<W13>;
+def X14 : AArch64Reg<14, "x14", [W14, W14_HI]>, DwarfRegAlias<W14>;
+def X15 : AArch64Reg<15, "x15", [W15, W15_HI]>, DwarfRegAlias<W15>;
+def X16 : AArch64Reg<16, "x16", [W16, W16_HI]>, DwarfRegAlias<W16>;
+def X17 : AArch64Reg<17, "x17", [W17, W17_HI]>, DwarfRegAlias<W17>;
+def X18 : AArch64Reg<18, "x18", [W18, W18_HI]>, DwarfRegAlias<W18>;
+def X19 : AArch64Reg<19, "x19", [W19, W19_HI]>, DwarfRegAlias<W19>;
+def X20 : AArch64Reg<20, "x20", [W20, W20_HI]>, DwarfRegAlias<W20>;
+def X21 : AArch64Reg<21, "x21", [W21, W21_HI]>, DwarfRegAlias<W21>;
+def X22 : AArch64Reg<22, "x22", [W22, W22_HI]>, DwarfRegAlias<W22>;
+def X23 : AArch64Reg<23, "x23", [W23, W23_HI]>, DwarfRegAlias<W23>;
+def X24 : AArch64Reg<24, "x24", [W24, W24_HI]>, DwarfRegAlias<W24>;
+def X25 : AArch64Reg<25, "x25", [W25, W25_HI]>, DwarfRegAlias<W25>;
+def X26 : AArch64Reg<26, "x26", [W26, W26_HI]>, DwarfRegAlias<W26>;
+def X27 : AArch64Reg<27, "x27", [W27, W27_HI]>, DwarfRegAlias<W27>;
+def X28 : AArch64Reg<28, "x28", [W28, W28_HI]>, DwarfRegAlias<W28>;
+def FP : AArch64Reg<29, "x29", [W29, W29_HI]>, DwarfRegAlias<W29>;
+def LR : AArch64Reg<30, "x30", [W30, W30_HI]>, DwarfRegAlias<W30>;
+def SP : AArch64Reg<31, "sp", [WSP, WSP_HI]>, DwarfRegAlias<WSP>;
+def XZR : AArch64Reg<31, "xzr", [WZR, WZR_HI]>, DwarfRegAlias<WSP> { let isConstant = true; }
}
// Condition code register.
@@ -290,6 +302,14 @@ def CCR : RegisterClass<"AArch64", [i32], 32, (add NZCV)> {
// Floating Point Scalar Registers
//===----------------------------------------------------------------------===//
+foreach i = 0-31 in {
+ def B#i#_HI : AArch64Reg<-1, "b"#i#"_hi"> { let isArtificial = 1; }
+ def H#i#_HI : AArch64Reg<-1, "h"#i#"_hi"> { let isArtificial = 1; }
+ def S#i#_HI : AArch64Reg<-1, "s"#i#"_hi"> { let isArtificial = 1; }
+ def D#i#_HI : AArch64Reg<-1, "d"#i#"_hi"> { let isArtificial = 1; }
+ def Q#i#_HI : AArch64Reg<-1, "q"#i#"_hi"> { let isArtificial = 1; }
+}
+
def B0 : AArch64Reg<0, "b0">, DwarfRegNum<[64]>;
def B1 : AArch64Reg<1, "b1">, DwarfRegNum<[65]>;
def B2 : AArch64Reg<2, "b2">, DwarfRegNum<[66]>;
@@ -323,144 +343,144 @@ def B29 : AArch64Reg<29, "b29">, DwarfRegNum<[93]>;
def B30 : AArch64Reg<30, "b30">, DwarfRegNum<[94]>;
def B31 : AArch64Reg<31, "b31">, DwarfRegNum<[95]>;
-let SubRegIndices = [bsub] in {
-def H0 : AArch64Reg<0, "h0", [B0]>, DwarfRegAlias<B0>;
-def H1 : AArch64Reg<1, "h1", [B1]>, DwarfRegAlias<B1>;
-def H2 : AArch64Reg<2, "h2", [B2]>, DwarfRegAlias<B2>;
-def H3 : AArch64Reg<3, "h3", [B3]>, DwarfRegAlias<B3>;
-def H4 : AArch64Reg<4, "h4", [B4]>, DwarfRegAlias<B4>;
-def H5 : AArch64Reg<5, "h5", [B5]>, DwarfRegAlias<B5>;
-def H6 : AArch64Reg<6, "h6", [B6]>, DwarfRegAlias<B6>;
-def H7 : AArch64Reg<7, "h7", [B7]>, DwarfRegAlias<B7>;
-def H8 : AArch64Reg<8, "h8", [B8]>, DwarfRegAlias<B8>;
-def H9 : AArch64Reg<9, "h9", [B9]>, DwarfRegAlias<B9>;
-def H10 : AArch64Reg<10, "h10", [B10]>, DwarfRegAlias<B10>;
-def H11 : AArch64Reg<11, "h11", [B11]>, DwarfRegAlias<B11>;
-def H12 : AArch64Reg<12, "h12", [B12]>, DwarfRegAlias<B12>;
-def H13 : AArch64Reg<13, "h13", [B13]>, DwarfRegAlias<B13>;
-def H14 : AArch64Reg<14, "h14", [B14]>, DwarfRegAlias<B14>;
-def H15 : AArch64Reg<15, "h15", [B15]>, DwarfRegAlias<B15>;
-def H16 : AArch64Reg<16, "h16", [B16]>, DwarfRegAlias<B16>;
-def H17 : AArch64Reg<17, "h17", [B17]>, DwarfRegAlias<B17>;
-def H18 : AArch64Reg<18, "h18", [B18]>, DwarfRegAlias<B18>;
-def H19 : AArch64Reg<19, "h19", [B19]>, DwarfRegAlias<B19>;
-def H20 : AArch64Reg<20, "h20", [B20]>, DwarfRegAlias<B20>;
-def H21 : AArch64Reg<21, "h21", [B21]>, DwarfRegAlias<B21>;
-def H22 : AArch64Reg<22, "h22", [B22]>, DwarfRegAlias<B22>;
-def H23 : AArch64Reg<23, "h23", [B23]>, DwarfRegAlias<B23>;
-def H24 : AArch64Reg<24, "h24", [B24]>, DwarfRegAlias<B24>;
-def H25 : AArch64Reg<25, "h25", [B25]>, DwarfRegAlias<B25>;
-def H26 : AArch64Reg<26, "h26", [B26]>, DwarfRegAlias<B26>;
-def H27 : AArch64Reg<27, "h27", [B27]>, DwarfRegAlias<B27>;
-def H28 : AArch64Reg<28, "h28", [B28]>, DwarfRegAlias<B28>;
-def H29 : AArch64Reg<29, "h29", [B29]>, DwarfRegAlias<B29>;
-def H30 : AArch64Reg<30, "h30", [B30]>, DwarfRegAlias<B30>;
-def H31 : AArch64Reg<31, "h31", [B31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [hsub] in {
-def S0 : AArch64Reg<0, "s0", [H0]>, DwarfRegAlias<B0>;
-def S1 : AArch64Reg<1, "s1", [H1]>, DwarfRegAlias<B1>;
-def S2 : AArch64Reg<2, "s2", [H2]>, DwarfRegAlias<B2>;
-def S3 : AArch64Reg<3, "s3", [H3]>, DwarfRegAlias<B3>;
-def S4 : AArch64Reg<4, "s4", [H4]>, DwarfRegAlias<B4>;
-def S5 : AArch64Reg<5, "s5", [H5]>, DwarfRegAlias<B5>;
-def S6 : AArch64Reg<6, "s6", [H6]>, DwarfRegAlias<B6>;
-def S7 : AArch64Reg<7, "s7", [H7]>, DwarfRegAlias<B7>;
-def S8 : AArch64Reg<8, "s8", [H8]>, DwarfRegAlias<B8>;
-def S9 : AArch64Reg<9, "s9", [H9]>, DwarfRegAlias<B9>;
-def S10 : AArch64Reg<10, "s10", [H10]>, DwarfRegAlias<B10>;
-def S11 : AArch64Reg<11, "s11", [H11]>, DwarfRegAlias<B11>;
-def S12 : AArch64Reg<12, "s12", [H12]>, DwarfRegAlias<B12>;
-def S13 : AArch64Reg<13, "s13", [H13]>, DwarfRegAlias<B13>;
-def S14 : AArch64Reg<14, "s14", [H14]>, DwarfRegAlias<B14>;
-def S15 : AArch64Reg<15, "s15", [H15]>, DwarfRegAlias<B15>;
-def S16 : AArch64Reg<16, "s16", [H16]>, DwarfRegAlias<B16>;
-def S17 : AArch64Reg<17, "s17", [H17]>, DwarfRegAlias<B17>;
-def S18 : AArch64Reg<18, "s18", [H18]>, DwarfRegAlias<B18>;
-def S19 : AArch64Reg<19, "s19", [H19]>, DwarfRegAlias<B19>;
-def S20 : AArch64Reg<20, "s20", [H20]>, DwarfRegAlias<B20>;
-def S21 : AArch64Reg<21, "s21", [H21]>, DwarfRegAlias<B21>;
-def S22 : AArch64Reg<22, "s22", [H22]>, DwarfRegAlias<B22>;
-def S23 : AArch64Reg<23, "s23", [H23]>, DwarfRegAlias<B23>;
-def S24 : AArch64Reg<24, "s24", [H24]>, DwarfRegAlias<B24>;
-def S25 : AArch64Reg<25, "s25", [H25]>, DwarfRegAlias<B25>;
-def S26 : AArch64Reg<26, "s26", [H26]>, DwarfRegAlias<B26>;
-def S27 : AArch64Reg<27, "s27", [H27]>, DwarfRegAlias<B27>;
-def S28 : AArch64Reg<28, "s28", [H28]>, DwarfRegAlias<B28>;
-def S29 : AArch64Reg<29, "s29", [H29]>, DwarfRegAlias<B29>;
-def S30 : AArch64Reg<30, "s30", [H30]>, DwarfRegAlias<B30>;
-def S31 : AArch64Reg<31, "s31", [H31]>, DwarfRegAlias<B31>;
-}
-
-let SubRegIndices = [ssub], RegAltNameIndices = [vreg, vlist1] in {
-def D0 : AArch64Reg<0, "d0", [S0], ["v0", ""]>, DwarfRegAlias<B0>;
-def D1 : AArch64Reg<1, "d1", [S1], ["v1", ""]>, DwarfRegAlias<B1>;
-def D2 : AArch64Reg<2, "d2", [S2], ["v2", ""]>, DwarfRegAlias<B2>;
-def D3 : AArch64Reg<3, "d3", [S3], ["v3", ""]>, DwarfRegAlias<B3>;
-def D4 : AArch64Reg<4, "d4", [S4], ["v4", ""]>, DwarfRegAlias<B4>;
-def D5 : AArch64Reg<5, "d5", [S5], ["v5", ""]>, DwarfRegAlias<B5>;
-def D6 : AArch64Reg<6, "d6", [S6], ["v6", ""]>, DwarfRegAlias<B6>;
-def D7 : AArch64Reg<7, "d7", [S7], ["v7", ""]>, DwarfRegAlias<B7>;
-def D8 : AArch64Reg<8, "d8", [S8], ["v8", ""]>, DwarfRegAlias<B8>;
-def D9 : AArch64Reg<9, "d9", [S9], ["v9", ""]>, DwarfRegAlias<B9>;
-def D10 : AArch64Reg<10, "d10", [S10], ["v10", ""]>, DwarfRegAlias<B10>;
-def D11 : AArch64Reg<11, "d11", [S11], ["v11", ""]>, DwarfRegAlias<B11>;
-def D12 : AArch64Reg<12, "d12", [S12], ["v12", ""]>, DwarfRegAlias<B12>;
-def D13 : AArch64Reg<13, "d13", [S13], ["v13", ""]>, DwarfRegAl...
[truncated]
|
def sube32 : SubRegIndex<32>; | ||
def subo32 : SubRegIndex<32>; | ||
def sub_32 : SubRegIndex<32, 0>; | ||
def sub_32_hi: SubRegIndex<32, 32>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment the artificial cases? Could we use an explicit marker for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment to describe the artificial cases. I'm not sure what you mean with an explicit marker though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean an isArtificial bit on SubRegIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isArtificial
is a property of a Register or a RegisterClass, but not a SubRegIndex.
@Rin18 would you be happy to have a look at the |
I took a look and compared the MCA output with and without your patch. The first thing that jumped to my attention is that your patch adds dependencies on super registers. If you run the V1-clear-upper-regs.s test with
Similarly for the other super registers in the remaining tests in that file. I've not been able to find the exact cause. I think this has something to do with the addition of |
I'm not entirely sure how llvm-mca works, but I think the issue is that the code adds dependences for all sub-registers of a register, but when |
That sounds right. Before your patch, there were no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general register changes sound to me.
@@ -396,6 +399,9 @@ class MCRegisterInfo { | |||
/// Returns true if the given register is constant. | |||
bool isConstant(MCRegister RegNo) const { return get(RegNo).IsConstant; } | |||
|
|||
/// Returns true if the given register is artificial. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth explaining what Artificial means more precisely?
# CHECK-NEXT: # rdefs left : 0 | ||
# CHECK-NEXT: Latency : 3 | ||
# CHECK-NEXT: Depth : 0 | ||
# CHECK-NEXT: Height : 7 | ||
# CHECK-NEXT: Successors: | ||
# CHECK-NEXT: SU(7): Out Latency=1 | ||
# CHECK-NEXT: SU(7): Out Latency=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this happen just for bundles or for other instructions? I was wondering if it was worth filtering out Artificial dependencies as they should be redundant, I believe? It might not matter too much if it is only bundles, as it doesn't seem to be doing any harm and they come up less often so would have less of a compile-time overhead to have so many deps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not specific to BUNDLEs, but it is a little specific to how this test was written:
BUNDLE implicit-def $z1, implicit-def $q1, implicit-def $d1, implicit-def $s1, implicit-def $h1, implicit-def $b1, implicit $z5, implicit $p0, implicit killed $z4, implicit killed $z3 {
the instruction unnecessarily specifies all sub-registers of z1
as well as implicit-def
s; b1, h1, s1, .. q1
, even though implicit-def $z1
would have been sufficient. When I remove the implicit-def for the sub-registers, the scheduler no longer creates the extra out-dependences.
I can see that the scheduler goes through all register units for a given register to create dependences. Before this change there was only a single RegUnit for all registers z1, q1, s1, .. b1
, whereas now there are 6 separate regunits, hence the extra dependences.
I'd need to dig a bit further to understand why the scheduler adds these when the implicit-def operands are present, although this isn't a common case for regular instructions (it may be more common for BUNDLE's?) so don't think this has much of an effect in practice. If you're satisfied with that answer, I'd prefer to move forward with the patch as-is, since I've got a few more follow-up patches to fix up other things. I'll look further into it anyway, just in case there's something that needs fixing.
✅ With the latest revision this PR passed the C/C++ code formatter. |
4ef35e2
to
ab05ae5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. If there are no objections to the name "Artificial", then LGTM.
Thanks @davemgreen! The name Artificial is already established in |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/14632 Here is the relevant piece of the build log for the reference
|
Landing #114827 broke these tests, because they did not account for the new artificial registers.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/3244 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/119/builds/3088 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/92/builds/9698 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1601 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/85/builds/2780 Here is the relevant piece of the build log for the reference
|
…4827) This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here llvm#109797. There are several changes in this patch: * AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class which meant that we had to update 'magic numbers' in several tests. The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by an explicit unit tests. * LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fullly covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples.
Landing llvm#114827 broke these tests, because they did not account for the new artificial registers.
Hi, We noticed a large clang/llvm test time regression on our clang linux-arm64 and mac-arm64 builders after this patch is landed. The test step took around 41mins after this change was landed. see: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-arm64/b8731304524561843377/overview We also noticed that our stage2 build now takes 2x of time. It appears this patch caused a large performance regression on arm64 binaries generated by clang. Could you take a look? |
Also this really affects our mac arm64 builders from 2.8hrs to do a full build and run tests (https://ci.chromium.org/ui/p/fuchsia/builders/prod/clang-mac-arm64/b8731305238018805313/overview) to >5hrs for a full build and run tests (https://ci.chromium.org/ui/p/fuchsia/builders/prod/clang-mac-arm64/b8731291539321484417/overview). |
Thanks for letting me know. From what I can quickly see there is only a time difference for the |
The patch is confirmed through bisecting. https://ci.chromium.org/b/8730624321112150225 builds e52238b and the test time looks normal. https://ci.chromium.org/b/8730630997932921601 builds c1c68ba and the test time regressed. c1c68ba (#114827) is 1 commit after e52238b For stage2 builders, I didn't do a bisection since they use LTO and they take very long time to build. But I have some indirect data: https://ci.chromium.org/ui/p/fuchsia/builders/prod/clang-linux-arm64/b8730794949594129265/overview was from 2 days ago (of course after this patch was landed). It took 2.5hrs for building clang and runtimes. and 2.1hrs for running all the tests. I suspect the patch caused a general performance regression of binaries built by clang. Because the stage1 clang build was using the host toolchian, so the build time wasn't affected. But clang, lld and runtimes tests were using the freshly built clang, they were affected. The stage2 builder was using the stage1 toolchain (which has the performance bug), so the build time also regressed. Let us know if you need additional information. |
This patch makes libcxx tests to timeouts here Many tests slowed down 10x or more. CC @fmayer |
…114827)" The issue with slow compile-time was caused by an assert in AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked' after adding all the reserved registers. This call gets very expensive after adding the _HI registers due to the way the function searches in the 'Exception' list, which is expected to be a small list but isn't (the patch added 190 _HI regs). It was possible to rewrite the code in such a way that the _HI registers are marked as reserved after the check. This makes the problem go away entirely and restores compile-time to what it was before (tested for `check-runtimes`, which previously showed a ~5x slowdown). This reverts commits: 1434d2a 2704647
This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here #109797.
There are several changes in this patch:
AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class which meant that we had to update 'magic numbers' in several tests.
The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by an explicit unit tests.
LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fullly covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples.
TableGen: The handling of the isArtificial flag was entirely broken. Skipping out too early from some of the loops led to incorrect internal representation of the (sub)register(index) hierarchy, and thus resulted in incorrect TableGen info.
(This follows on from #114263 which was mysteriously closed.)