[AArch64] Disable FP loads/stores when fp-armv8 not enabled #77817
Most of the floating-point instructions are already gated on the fp-armv8 subtarget feature (or some other feature), but most of the load and store instructions, and one move instruction, were not. I found this list of instructions with a script which consumes the output of llvm-tblgen --dump-json, looking for instructions which have an FPR operand but no predicate. That script now finds zero instructions. This only affects assembly, not codegen, because the floating-point types and registers are already not marked as legal when the FPU is disabled, so it is impossible for any of these to be selected.
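The check described above could be sketched roughly as follows. This is not the author's script, only a minimal illustration under stated assumptions: that the JSON dump lists instruction records under `"!instanceof"["Instruction"]`, that operand lists are dumped as `dag` objects whose `args` are `[value, name]` pairs, and that floating-point operand classes can be recognised by an `FPR` name prefix. The function names `operand_defs`, `unpredicated_fp_instructions` and `load_records` are hypothetical.

```python
import json


def operand_defs(dag):
    """Return the def names used as arguments of one dumped dag value."""
    if not isinstance(dag, dict) or dag.get("kind") != "dag":
        return []
    names = []
    for arg in dag.get("args", []):
        value = arg[0]  # assumed layout: each argument is a [value, name] pair
        if isinstance(value, dict) and value.get("kind") == "def":
            names.append(value["def"])
    return names


def unpredicated_fp_instructions(records):
    """List instructions that use an FPR operand but carry no predicate."""
    found = []
    for name in records.get("!instanceof", {}).get("Instruction", []):
        rec = records[name]
        operands = operand_defs(rec.get("OutOperandList")) + operand_defs(
            rec.get("InOperandList")
        )
        # Heuristic (assumption): FP operand classes are named FPR*.
        uses_fpr = any(op.startswith("FPR") for op in operands)
        if uses_fpr and not rec.get("Predicates"):
            found.append(name)
    return sorted(found)


def load_records(path):
    """Read a file holding the output of `llvm-tblgen --dump-json`."""
    with open(path) as f:
        return json.load(f)
```

Under these assumptions, an empty list returned for the AArch64 records would correspond to the "zero instructions" result reported above. The name-prefix heuristic is deliberately crude; a more careful check would follow each operand to its register class.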
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-mc
Author: None (ostannard)
Patch is 22.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77817.diff
3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 62b2bf490f37a2..3f4875998fc004 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -2925,27 +2925,33 @@ def UDF : UDFType<0, "udf">;
// Pair (indexed, offset)
defm LDPW : LoadPairOffset<0b00, 0, GPR32z, simm7s4, "ldp">;
defm LDPX : LoadPairOffset<0b10, 0, GPR64z, simm7s8, "ldp">;
+let Predicates = [HasFPARMv8] in {
defm LDPS : LoadPairOffset<0b00, 1, FPR32Op, simm7s4, "ldp">;
defm LDPD : LoadPairOffset<0b01, 1, FPR64Op, simm7s8, "ldp">;
defm LDPQ : LoadPairOffset<0b10, 1, FPR128Op, simm7s16, "ldp">;
+}
defm LDPSW : LoadPairOffset<0b01, 0, GPR64z, simm7s4, "ldpsw">;
// Pair (pre-indexed)
def LDPWpre : LoadPairPreIdx<0b00, 0, GPR32z, simm7s4, "ldp">;
def LDPXpre : LoadPairPreIdx<0b10, 0, GPR64z, simm7s8, "ldp">;
+let Predicates = [HasFPARMv8] in {
def LDPSpre : LoadPairPreIdx<0b00, 1, FPR32Op, simm7s4, "ldp">;
def LDPDpre : LoadPairPreIdx<0b01, 1, FPR64Op, simm7s8, "ldp">;
def LDPQpre : LoadPairPreIdx<0b10, 1, FPR128Op, simm7s16, "ldp">;
+}
def LDPSWpre : LoadPairPreIdx<0b01, 0, GPR64z, simm7s4, "ldpsw">;
// Pair (post-indexed)
def LDPWpost : LoadPairPostIdx<0b00, 0, GPR32z, simm7s4, "ldp">;
def LDPXpost : LoadPairPostIdx<0b10, 0, GPR64z, simm7s8, "ldp">;
+let Predicates = [HasFPARMv8] in {
def LDPSpost : LoadPairPostIdx<0b00, 1, FPR32Op, simm7s4, "ldp">;
def LDPDpost : LoadPairPostIdx<0b01, 1, FPR64Op, simm7s8, "ldp">;
def LDPQpost : LoadPairPostIdx<0b10, 1, FPR128Op, simm7s16, "ldp">;
+}
def LDPSWpost : LoadPairPostIdx<0b01, 0, GPR64z, simm7s4, "ldpsw">;
@@ -2953,9 +2959,11 @@ def LDPSWpost : LoadPairPostIdx<0b01, 0, GPR64z, simm7s4, "ldpsw">;
// Pair (no allocate)
defm LDNPW : LoadPairNoAlloc<0b00, 0, GPR32z, simm7s4, "ldnp">;
defm LDNPX : LoadPairNoAlloc<0b10, 0, GPR64z, simm7s8, "ldnp">;
+let Predicates = [HasFPARMv8] in {
defm LDNPS : LoadPairNoAlloc<0b00, 1, FPR32Op, simm7s4, "ldnp">;
defm LDNPD : LoadPairNoAlloc<0b01, 1, FPR64Op, simm7s8, "ldnp">;
defm LDNPQ : LoadPairNoAlloc<0b10, 1, FPR128Op, simm7s16, "ldnp">;
+}
def : Pat<(AArch64ldp (am_indexed7s64 GPR64sp:$Rn, simm7s8:$offset)),
(LDPXi GPR64sp:$Rn, simm7s8:$offset)>;
@@ -2973,11 +2981,13 @@ defm LDRW : Load32RO<0b10, 0, 0b01, GPR32, "ldr", i32, load>;
defm LDRX : Load64RO<0b11, 0, 0b01, GPR64, "ldr", i64, load>;
// Floating-point
+let Predicates = [HasFPARMv8] in {
defm LDRB : Load8RO<0b00, 1, 0b01, FPR8Op, "ldr", i8, load>;
defm LDRH : Load16RO<0b01, 1, 0b01, FPR16Op, "ldr", f16, load>;
defm LDRS : Load32RO<0b10, 1, 0b01, FPR32Op, "ldr", f32, load>;
defm LDRD : Load64RO<0b11, 1, 0b01, FPR64Op, "ldr", f64, load>;
defm LDRQ : Load128RO<0b00, 1, 0b11, FPR128Op, "ldr", f128, load>;
+}
// Load sign-extended half-word
defm LDRSHW : Load16RO<0b01, 0, 0b11, GPR32, "ldrsh", i32, sextloadi16>;
@@ -3147,6 +3157,7 @@ defm LDRX : LoadUI<0b11, 0, 0b01, GPR64z, uimm12s8, "ldr",
defm LDRW : LoadUI<0b10, 0, 0b01, GPR32z, uimm12s4, "ldr",
[(set GPR32z:$Rt,
(load (am_indexed32 GPR64sp:$Rn, uimm12s4:$offset)))]>;
+let Predicates = [HasFPARMv8] in {
defm LDRB : LoadUI<0b00, 1, 0b01, FPR8Op, uimm12s1, "ldr",
[(set FPR8Op:$Rt,
(load (am_indexed8 GPR64sp:$Rn, uimm12s1:$offset)))]>;
@@ -3162,6 +3173,7 @@ defm LDRD : LoadUI<0b11, 1, 0b01, FPR64Op, uimm12s8, "ldr",
defm LDRQ : LoadUI<0b00, 1, 0b11, FPR128Op, uimm12s16, "ldr",
[(set (f128 FPR128Op:$Rt),
(load (am_indexed128 GPR64sp:$Rn, uimm12s16:$offset)))]>;
+}
// bf16 load pattern
def : Pat <(bf16 (load (am_indexed16 GPR64sp:$Rn, uimm12s2:$offset))),
@@ -3339,12 +3351,14 @@ def LDRWl : LoadLiteral<0b00, 0, GPR32z, "ldr",
[(set GPR32z:$Rt, (load (AArch64adr alignedglobal:$label)))]>;
def LDRXl : LoadLiteral<0b01, 0, GPR64z, "ldr",
[(set GPR64z:$Rt, (load (AArch64adr alignedglobal:$label)))]>;
+let Predicates = [HasFPARMv8] in {
def LDRSl : LoadLiteral<0b00, 1, FPR32Op, "ldr",
[(set (f32 FPR32Op:$Rt), (load (AArch64adr alignedglobal:$label)))]>;
def LDRDl : LoadLiteral<0b01, 1, FPR64Op, "ldr",
[(set (f64 FPR64Op:$Rt), (load (AArch64adr alignedglobal:$label)))]>;
def LDRQl : LoadLiteral<0b10, 1, FPR128Op, "ldr",
[(set (f128 FPR128Op:$Rt), (load (AArch64adr alignedglobal:$label)))]>;
+}
// load sign-extended word
def LDRSWl : LoadLiteral<0b10, 0, GPR64z, "ldrsw",
@@ -3367,6 +3381,7 @@ defm LDURX : LoadUnscaled<0b11, 0, 0b01, GPR64z, "ldur",
defm LDURW : LoadUnscaled<0b10, 0, 0b01, GPR32z, "ldur",
[(set GPR32z:$Rt,
(load (am_unscaled32 GPR64sp:$Rn, simm9:$offset)))]>;
+let Predicates = [HasFPARMv8] in {
defm LDURB : LoadUnscaled<0b00, 1, 0b01, FPR8Op, "ldur",
[(set FPR8Op:$Rt,
(load (am_unscaled8 GPR64sp:$Rn, simm9:$offset)))]>;
@@ -3382,6 +3397,7 @@ defm LDURD : LoadUnscaled<0b11, 1, 0b01, FPR64Op, "ldur",
defm LDURQ : LoadUnscaled<0b00, 1, 0b11, FPR128Op, "ldur",
[(set (f128 FPR128Op:$Rt),
(load (am_unscaled128 GPR64sp:$Rn, simm9:$offset)))]>;
+}
defm LDURHH
: LoadUnscaled<0b01, 0, 0b01, GPR32, "ldurh",
@@ -3641,11 +3657,13 @@ defm LDTRSW : LoadUnprivileged<0b10, 0, 0b10, GPR64, "ldtrsw">;
// (immediate pre-indexed)
def LDRWpre : LoadPreIdx<0b10, 0, 0b01, GPR32z, "ldr">;
def LDRXpre : LoadPreIdx<0b11, 0, 0b01, GPR64z, "ldr">;
+let Predicates = [HasFPARMv8] in {
def LDRBpre : LoadPreIdx<0b00, 1, 0b01, FPR8Op, "ldr">;
def LDRHpre : LoadPreIdx<0b01, 1, 0b01, FPR16Op, "ldr">;
def LDRSpre : LoadPreIdx<0b10, 1, 0b01, FPR32Op, "ldr">;
def LDRDpre : LoadPreIdx<0b11, 1, 0b01, FPR64Op, "ldr">;
def LDRQpre : LoadPreIdx<0b00, 1, 0b11, FPR128Op, "ldr">;
+}
// load sign-extended half-word
def LDRSHWpre : LoadPreIdx<0b01, 0, 0b11, GPR32z, "ldrsh">;
@@ -3666,11 +3684,13 @@ def LDRSWpre : LoadPreIdx<0b10, 0, 0b10, GPR64z, "ldrsw">;
// (immediate post-indexed)
def LDRWpost : LoadPostIdx<0b10, 0, 0b01, GPR32z, "ldr">;
def LDRXpost : LoadPostIdx<0b11, 0, 0b01, GPR64z, "ldr">;
+let Predicates = [HasFPARMv8] in {
def LDRBpost : LoadPostIdx<0b00, 1, 0b01, FPR8Op, "ldr">;
def LDRHpost : LoadPostIdx<0b01, 1, 0b01, FPR16Op, "ldr">;
def LDRSpost : LoadPostIdx<0b10, 1, 0b01, FPR32Op, "ldr">;
def LDRDpost : LoadPostIdx<0b11, 1, 0b01, FPR64Op, "ldr">;
def LDRQpost : LoadPostIdx<0b00, 1, 0b11, FPR128Op, "ldr">;
+}
// load sign-extended half-word
def LDRSHWpost : LoadPostIdx<0b01, 0, 0b11, GPR32z, "ldrsh">;
@@ -3695,30 +3715,38 @@ def LDRSWpost : LoadPostIdx<0b10, 0, 0b10, GPR64z, "ldrsw">;
// FIXME: Use dedicated range-checked addressing mode operand here.
defm STPW : StorePairOffset<0b00, 0, GPR32z, simm7s4, "stp">;
defm STPX : StorePairOffset<0b10, 0, GPR64z, simm7s8, "stp">;
+let Predicates = [HasFPARMv8] in {
defm STPS : StorePairOffset<0b00, 1, FPR32Op, simm7s4, "stp">;
defm STPD : StorePairOffset<0b01, 1, FPR64Op, simm7s8, "stp">;
defm STPQ : StorePairOffset<0b10, 1, FPR128Op, simm7s16, "stp">;
+}
// Pair (pre-indexed)
def STPWpre : StorePairPreIdx<0b00, 0, GPR32z, simm7s4, "stp">;
def STPXpre : StorePairPreIdx<0b10, 0, GPR64z, simm7s8, "stp">;
+let Predicates = [HasFPARMv8] in {
def STPSpre : StorePairPreIdx<0b00, 1, FPR32Op, simm7s4, "stp">;
def STPDpre : StorePairPreIdx<0b01, 1, FPR64Op, simm7s8, "stp">;
def STPQpre : StorePairPreIdx<0b10, 1, FPR128Op, simm7s16, "stp">;
+}
// Pair (post-indexed)
def STPWpost : StorePairPostIdx<0b00, 0, GPR32z, simm7s4, "stp">;
def STPXpost : StorePairPostIdx<0b10, 0, GPR64z, simm7s8, "stp">;
+let Predicates = [HasFPARMv8] in {
def STPSpost : StorePairPostIdx<0b00, 1, FPR32Op, simm7s4, "stp">;
def STPDpost : StorePairPostIdx<0b01, 1, FPR64Op, simm7s8, "stp">;
def STPQpost : StorePairPostIdx<0b10, 1, FPR128Op, simm7s16, "stp">;
+}
// Pair (no allocate)
defm STNPW : StorePairNoAlloc<0b00, 0, GPR32z, simm7s4, "stnp">;
defm STNPX : StorePairNoAlloc<0b10, 0, GPR64z, simm7s8, "stnp">;
+let Predicates = [HasFPARMv8] in {
defm STNPS : StorePairNoAlloc<0b00, 1, FPR32Op, simm7s4, "stnp">;
defm STNPD : StorePairNoAlloc<0b01, 1, FPR64Op, simm7s8, "stnp">;
defm STNPQ : StorePairNoAlloc<0b10, 1, FPR128Op, simm7s16, "stnp">;
+}
def : Pat<(AArch64stp GPR64z:$Rt, GPR64z:$Rt2, (am_indexed7s64 GPR64sp:$Rn, simm7s8:$offset)),
(STPXi GPR64z:$Rt, GPR64z:$Rt2, GPR64sp:$Rn, simm7s8:$offset)>;
@@ -3738,11 +3766,13 @@ defm STRX : Store64RO<0b11, 0, 0b00, GPR64, "str", i64, store>;
// Floating-point
+let Predicates = [HasFPARMv8] in {
defm STRB : Store8RO< 0b00, 1, 0b00, FPR8Op, "str", i8, store>;
defm STRH : Store16RO<0b01, 1, 0b00, FPR16Op, "str", f16, store>;
defm STRS : Store32RO<0b10, 1, 0b00, FPR32Op, "str", f32, store>;
defm STRD : Store64RO<0b11, 1, 0b00, FPR64Op, "str", f64, store>;
defm STRQ : Store128RO<0b00, 1, 0b10, FPR128Op, "str">;
+}
let Predicates = [UseSTRQro], AddedComplexity = 10 in {
def : Pat<(store (f128 FPR128:$Rt),
@@ -3851,6 +3881,7 @@ defm STRX : StoreUIz<0b11, 0, 0b00, GPR64z, uimm12s8, "str",
defm STRW : StoreUIz<0b10, 0, 0b00, GPR32z, uimm12s4, "str",
[(store GPR32z:$Rt,
(am_indexed32 GPR64sp:$Rn, uimm12s4:$offset))]>;
+let Predicates = [HasFPARMv8] in {
defm STRB : StoreUI<0b00, 1, 0b00, FPR8Op, uimm12s1, "str",
[(store FPR8Op:$Rt,
(am_indexed8 GPR64sp:$Rn, uimm12s1:$offset))]>;
@@ -3864,6 +3895,7 @@ defm STRD : StoreUI<0b11, 1, 0b00, FPR64Op, uimm12s8, "str",
[(store (f64 FPR64Op:$Rt),
(am_indexed64 GPR64sp:$Rn, uimm12s8:$offset))]>;
defm STRQ : StoreUI<0b00, 1, 0b10, FPR128Op, uimm12s16, "str", []>;
+}
defm STRHH : StoreUIz<0b01, 0, 0b00, GPR32z, uimm12s2, "strh",
[(truncstorei16 GPR32z:$Rt,
@@ -3985,6 +4017,7 @@ defm STURX : StoreUnscaled<0b11, 0, 0b00, GPR64z, "stur",
defm STURW : StoreUnscaled<0b10, 0, 0b00, GPR32z, "stur",
[(store GPR32z:$Rt,
(am_unscaled32 GPR64sp:$Rn, simm9:$offset))]>;
+let Predicates = [HasFPARMv8] in {
defm STURB : StoreUnscaled<0b00, 1, 0b00, FPR8Op, "stur",
[(store FPR8Op:$Rt,
(am_unscaled8 GPR64sp:$Rn, simm9:$offset))]>;
@@ -4000,6 +4033,7 @@ defm STURD : StoreUnscaled<0b11, 1, 0b00, FPR64Op, "stur",
defm STURQ : StoreUnscaled<0b00, 1, 0b10, FPR128Op, "stur",
[(store (f128 FPR128Op:$Rt),
(am_unscaled128 GPR64sp:$Rn, simm9:$offset))]>;
+}
defm STURHH : StoreUnscaled<0b01, 0, 0b00, GPR32z, "sturh",
[(truncstorei16 GPR32z:$Rt,
(am_unscaled16 GPR64sp:$Rn, simm9:$offset))]>;
@@ -4156,11 +4190,13 @@ defm STTRB : StoreUnprivileged<0b00, 0, 0b00, GPR32, "sttrb">;
// (immediate pre-indexed)
def STRWpre : StorePreIdx<0b10, 0, 0b00, GPR32z, "str", pre_store, i32>;
def STRXpre : StorePreIdx<0b11, 0, 0b00, GPR64z, "str", pre_store, i64>;
+let Predicates = [HasFPARMv8] in {
def STRBpre : StorePreIdx<0b00, 1, 0b00, FPR8Op, "str", pre_store, i8>;
def STRHpre : StorePreIdx<0b01, 1, 0b00, FPR16Op, "str", pre_store, f16>;
def STRSpre : StorePreIdx<0b10, 1, 0b00, FPR32Op, "str", pre_store, f32>;
def STRDpre : StorePreIdx<0b11, 1, 0b00, FPR64Op, "str", pre_store, f64>;
def STRQpre : StorePreIdx<0b00, 1, 0b10, FPR128Op, "str", pre_store, f128>;
+}
def STRBBpre : StorePreIdx<0b00, 0, 0b00, GPR32z, "strb", pre_truncsti8, i32>;
def STRHHpre : StorePreIdx<0b01, 0, 0b00, GPR32z, "strh", pre_truncsti16, i32>;
@@ -4210,11 +4246,13 @@ def : Pat<(pre_store (v8f16 FPR128:$Rt), GPR64sp:$addr, simm9:$off),
// (immediate post-indexed)
def STRWpost : StorePostIdx<0b10, 0, 0b00, GPR32z, "str", post_store, i32>;
def STRXpost : StorePostIdx<0b11, 0, 0b00, GPR64z, "str", post_store, i64>;
+let Predicates = [HasFPARMv8] in {
def STRBpost : StorePostIdx<0b00, 1, 0b00, FPR8Op, "str", post_store, i8>;
def STRHpost : StorePostIdx<0b01, 1, 0b00, FPR16Op, "str", post_store, f16>;
def STRSpost : StorePostIdx<0b10, 1, 0b00, FPR32Op, "str", post_store, f32>;
def STRDpost : StorePostIdx<0b11, 1, 0b00, FPR64Op, "str", post_store, f64>;
def STRQpost : StorePostIdx<0b00, 1, 0b10, FPR128Op, "str", post_store, f128>;
+}
def STRBBpost : StorePostIdx<0b00, 0, 0b00, GPR32z, "strb", post_truncsti8, i32>;
def STRHHpost : StorePostIdx<0b01, 0, 0b00, GPR32z, "strh", post_truncsti16, i32>;
@@ -4531,7 +4569,8 @@ def : Pat<(f64 (fdiv (f64 (any_uint_to_fp (i32 GPR32:$Rn))), fixedpoint_f64_i32:
defm FMOV : UnscaledConversion<"fmov">;
// Add pseudo ops for FMOV 0 so we can mark them as isReMaterializable
-let isReMaterializable = 1, isCodeGenOnly = 1, isAsCheapAsAMove = 1 in {
+let isReMaterializable = 1, isCodeGenOnly = 1, isAsCheapAsAMove = 1,
+ Predicates = [HasFPARMv8] in {
def FMOVH0 : Pseudo<(outs FPR16:$Rd), (ins), [(set f16:$Rd, (fpimm0))]>,
Sched<[WriteF]>;
def FMOVS0 : Pseudo<(outs FPR32:$Rd), (ins), [(set f32:$Rd, (fpimm0))]>,
@@ -4758,6 +4797,7 @@ def : Pat<(bf16 (AArch64csel (bf16 FPR16:$Rn), (bf16 FPR16:$Rm), (i32 imm:$cond)
// CSEL instructions providing f128 types need to be handled by a
// pseudo-instruction since the eventual code will need to introduce basic
// blocks and control flow.
+let Predicates = [HasFPARMv8] in
def F128CSEL : Pseudo<(outs FPR128:$Rd),
(ins FPR128:$Rn, FPR128:$Rm, ccode:$cond),
[(set (f128 FPR128:$Rd),
diff --git a/llvm/lib/Target/AArch64/AArch64SystemOperands.td b/llvm/lib/Target/AArch64/AArch64SystemOperands.td
index 0b80f263e12ee1..0564741c497000 100644
--- a/llvm/lib/Target/AArch64/AArch64SystemOperands.td
+++ b/llvm/lib/Target/AArch64/AArch64SystemOperands.td
@@ -986,8 +986,10 @@ def : RWSysReg<"SPSR_irq", 0b11, 0b100, 0b0100, 0b0011, 0b000>;
def : RWSysReg<"SPSR_abt", 0b11, 0b100, 0b0100, 0b0011, 0b001>;
def : RWSysReg<"SPSR_und", 0b11, 0b100, 0b0100, 0b0011, 0b010>;
def : RWSysReg<"SPSR_fiq", 0b11, 0b100, 0b0100, 0b0011, 0b011>;
+let Requires = [{ {AArch64::FeatureFPARMv8} }] in {
def : RWSysReg<"FPCR", 0b11, 0b011, 0b0100, 0b0100, 0b000>;
def : RWSysReg<"FPSR", 0b11, 0b011, 0b0100, 0b0100, 0b001>;
+}
def : RWSysReg<"DSPSR_EL0", 0b11, 0b011, 0b0100, 0b0101, 0b000>;
def : RWSysReg<"DLR_EL0", 0b11, 0b011, 0b0100, 0b0101, 0b001>;
def : RWSysReg<"IFSR32_EL2", 0b11, 0b100, 0b0101, 0b0000, 0b001>;
diff --git a/llvm/test/MC/AArch64/no-fp-errors.s b/llvm/test/MC/AArch64/no-fp-errors.s
new file mode 100644
index 00000000000000..1595ba4798b082
--- /dev/null
+++ b/llvm/test/MC/AArch64/no-fp-errors.s
@@ -0,0 +1,193 @@
+// RUN: not llvm-mc -triple aarch64-none-eabi -mattr=-fp-armv8 < %s 2>&1 | FileCheck %s --implicit-check-not error
+
+ ldr s0, [x0]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ str q0, [x0]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ fmov d0, xzr
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldnp s0, s1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldnp d0, d1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldnp q0, q1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldp s0, s1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp d0, d1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp q0, q1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldp s0, s1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp d0, d1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp q0, q1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldp s0, s1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp d0, d1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldp q0, q1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+
+ ldr b0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr h0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr s0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldr b0, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr h0, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr s0, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldr b0, [x0, x1]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr h0, [x0, x1]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr s0, [x0, x1]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, [x0, x1]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, [x0, x1]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldr b0, [x0, w1, sxtw]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr h0, [x0, w1, sxtw]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr s0, [x0, w1, sxtw]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, [x0, w1, sxtw]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, [x0, w1, sxtw]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ ldr b0, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr h0, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr s0, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+label:
+ ldr s0, label
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr d0, label
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ ldr q0, label
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ stnp s0, s1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stnp d0, d1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stnp q0, q1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ stp s0, s1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp d0, d1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp q0, q1, [x0, #16]
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ stp s0, s1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp d0, d1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp q0, q1, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ stp s0, s1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp d0, d1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ stp q0, q1, [x0, #16]!
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+
+ str b0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ str h0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ str s0, [x0], #16
+// CHECK: [[@LINE-1]]:3: error: instruction requires: fp-armv8
+ str d0, [x0], #16
+// CHECK: [[@LINE-1]]:3: ...
[truncated]
This is the script I used to check for non-predicated instructions: https://gist.github.com/ostannard/f15ad6b4cffc9b00b1ee50f990dd801d
LGTM
We've observed a breakage in Fuchsia due to this commit: https://luci-milo.appspot.com/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.arm64-debug/b8758715042351668065/overview
That module broke due to (amongst others) the inline asm
I'd agree that these instructions should be disabled on systems that are actually configured to lack an FPU, but it would be prudent to roll this back and alter the semantics of
After a bit more looking around, I found that this has been broken for quite a while (#30140), and it sounded like it was not quite trivial to fix. This PR just makes the issue somewhat worse.
Since clang's Would something like this work for you as a workaround? I've checked and this is accepted by both gcc and clang with