Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARM] Emit an error when the hard-float ABI is enabled but can't be used. #111334

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

chrisnc
Copy link
Contributor

@chrisnc chrisnc commented Oct 7, 2024

Currently, compiling for eabihf with a CPU lacking floating-point
registers will silently use the soft-float ABI instead, even though the
Arm attributes section still has Tag_ABI_VFP_args: VFP registers, which
leads to silent ABI mismatches at link time.

Fixes #110383.

@llvmbot
Copy link
Member

llvmbot commented Oct 7, 2024

@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-arm

Author: Chris Copeland (chrisnc)

Changes

Currently, compiling for eabihf with a CPU lacking floating-point
registers will silently use the soft-float ABI instead, even though the
Arm attributes section still has Tag_ABI_VFP_args: VFP registers, which
leads to silent ABI mismatches at link time.

Fixes #110383.


Full diff: https://github.com/llvm/llvm-project/pull/111334.diff

1 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMTargetMachine.cpp (+5-1)
diff --git a/llvm/lib/Target/ARM/ARMTargetMachine.cpp b/llvm/lib/Target/ARM/ARMTargetMachine.cpp
index 7553778c574033..82927a7430b5d0 100644
--- a/llvm/lib/Target/ARM/ARMTargetMachine.cpp
+++ b/llvm/lib/Target/ARM/ARMTargetMachine.cpp
@@ -309,11 +309,15 @@ ARMBaseTargetMachine::getSubtargetImpl(const Function &F) const {
     // function that reside in TargetOptions.
     resetTargetOptions(F);
     I = std::make_unique<ARMSubtarget>(TargetTriple, CPU, FS, *this, isLittle,
-                                        F.hasMinSize());
+                                       F.hasMinSize());
 
     if (!I->isThumb() && !I->hasARMOps())
       F.getContext().emitError("Function '" + F.getName() + "' uses ARM "
           "instructions, but the target does not support ARM mode execution.");
+
+    if (I->isTargetHardFloat() && !I->hasFPRegs())
+      F.getContext().emitError("The hardfloat ABI is enabled, but the target "
+                               "lacks floating-point registers.");
   }
 
   return I.get();

@chrisnc chrisnc force-pushed the arm-error-eabihf-nofp branch 2 times, most recently from b0e9c78 to b73e1b3 Compare October 7, 2024 04:34
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Oct 7, 2024
@chrisnc
Copy link
Contributor Author

chrisnc commented Oct 7, 2024

clang tests are passing now, but some llvm tests are not, e.g., invocations of llc in CodeGen/ARM/arm-eabi.ll that specify --target=arm-none-eabihf don't include enough features to actually use eabihf. Any suggestions on the right place to address this would be appreciated. Should the test just add the necessary -mattr=+fpregs, or should this be implicitly added somewhere else?

@davemgreen
Copy link
Collaborator

It looks like there is already a warning for this in clang, it only triggers from -mfloat-abi=hard though: https://godbolt.org/z/dcaz8had4. Could it be made to work with any hard-float env? And maybe be made an error down-gradable to a warning?

Generally clang-level warnings/errors are more user friendly then the llvm-level errors (but both may be useful for other frontends).

@chrisnc
Copy link
Contributor Author

chrisnc commented Oct 7, 2024

It looks like there is already a warning for this in clang, it only triggers from -mfloat-abi=hard though: https://godbolt.org/z/dcaz8had4. Could it be made to work with any hard-float env? And maybe be made an error down-gradable to a warning?

Generally clang-level warnings/errors are more user friendly then the llvm-level errors (but both may be useful for other frontends).

That seems like a decent improvement to the clang warning, but the backend behavior of still emitting the VFP ABI tag in this case seems like a bug, regardless of the frontend, so IMO codegen should be aborted in this situation.

@smithp35
Copy link
Collaborator

smithp35 commented Oct 7, 2024

Whatever we choose to do I think clang and llvm should be consistent. There's no point in clang giving a warning if llvm is going to give an error.

I can see two positions most clearly:

  • clang gives an error message for an eabihf target (or -mfloat-abi=hard). This is what GCC does today. While we could also put an error in LLVM I'm not sure it is necessary.
  • clang gives a warning message for both eabihf target (of -mfloat-abi=hard) and uses -mfloat-abi=soft, the warning should ideally say that is changing ABI rather than just unsupported. We'd need to change LLVM to get the build attributes right.

Personally I don't have a problem with clang giving an error in this case as that is what GCC does.

Appendix, some empirical data:
GCC 13

arm-none-eabi-gcc -c -mcpu=cortex-m3 -mfloat-abi=hard float.c
cc1: error: '-mfloat-abi=hard': selected architecture lacks an FPU

Clang -mfloat-abi

clang --target=arm-none-eabi -mcpu=cortex-m3 float.c -c -mfloat-abi=hard
clang: warning: '-mfloat-abi=hard': selected processor lacks floating point registers [-Wunsupported-abi]

Actual behaviour is soft-float but with incorrect build attributes.

Clang --target=arm-none-eabihf

clang --target=arm-none-eabi -mcpu=cortex-m3 float.c -c -mfloat-abi=hard

Actual behaviour is soft-float but with incorrect build attributes.

@chrisnc
Copy link
Contributor Author

chrisnc commented Oct 7, 2024

My preference would be to remove the clang warning and add the LLVM error so that other frontends can benefit from this checking, as well as the LLVM tools themselves. The LLVM test failures reveal a few places where this issue seems to be ignored.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 7, 2024

Generally clang-level warnings/errors are more user friendly then the llvm-level errors (but both may be useful for other frontends).

What would be really nice is if the LLVM backend could tell the frontend about the error rather than reporting it itself, that way frontends can use their nice error reporting code but don't have to re-implement the error detection logic. :)

Anyway, that's a different issue...

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 3, 2024

I've updated this PR to remove the aforementioned clang warning for this case, and I've updated all clang and llvm tests that were implicitly using the soft-float ABI despite requesting hard-float. In some cases, explicitly testing hard-float seemed to be intended because there was a corresponding line for the soft-float version, so I added a feature or CPU selection that accomplishes this. In other cases, the hard-float-ness of the test did not seem relevant, so I just switched those to soft-float. I attempted to write a test case for the new LLVM error, but I'm struggling with having a lit test that tolerates llc exiting with a nonzero exit code, even if I can get the CHECK directives to match. Any pointers on this would be appreciated. Update: new test for llc is added, which checks that llc will error out with the correct message in this case.

@chrisnc chrisnc force-pushed the arm-error-eabihf-nofp branch from 5e916c3 to 4bd5ec3 Compare November 6, 2024 05:22
…sed.

Currently, compiling for an eabihf target with a CPU lacking
floating-point registers will silently use the soft-float ABI instead,
even though the Arm attributes section still has Tag_ABI_VFP_args: VFP
registers, which leads to silent ABI mismatches at link time.

Update all ARM tests that were using an affected combination to enable
the necessary FPU features or use a soft-float ABI.

[clang] Remove the warning from clang that detected this case only if
-mfloat-abi=hard or -mhard-float were specified explicitly.

Fixes llvm#110383.
@chrisnc chrisnc force-pushed the arm-error-eabihf-nofp branch from 4bd5ec3 to 9903206 Compare November 6, 2024 13:46
@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 7, 2024

All unit tests are now updated and passing and this change is ready for review. Thanks!

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 13, 2024

Friendly bump :) @davemgreen @smithp35

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not yet convinced that llvm is the right place for this error message. There's quite a lot of test changes that I presume needed to make and at least the LTO use case looks like we don't want to require extra information. I'd also prefer this to be a front-end error/warning as this is a front-end/source-level mistake, with better diagnostic control options available.

I think if it is implemented in llvm then it will need to use the per-function subtarget to not risk fallout with LTO builds.

Apologies I'm probably not particularly enthusiastic about the approach. I've not worked in this area for some time so I'm happy for other reviewer's opinions to take precedence.


if (!I->isThumb() && !I->hasARMOps())
F.getContext().emitError("Function '" + F.getName() + "' uses ARM "
"instructions, but the target does not support ARM mode execution.");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC [*] TargetMachine::getSubtargetImpl is the "global" subtarget. It has been a long time since I've looked at the area so I could be wrong.

Putting the warning here makes me nervous as individual functions have their own subtargets. It is common for LTO to rely entirely on the per function subtarget. I can remember an Android problem that was caused by using the global subtarget for a per-function property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSubtargetImpl takes a const Function & as a parameter, and looks at that function's attributes, and can give per-function diagnostics for the thumb check, so I believe it is not the "global" subtarget.

@@ -459,9 +459,6 @@ def warn_drv_assuming_mfloat_abi_is : Warning<
def warn_drv_unsupported_float_abi_by_lib : Warning<
"float ABI '%0' is not supported by current library">,
InGroup<UnsupportedABI>;
def warn_drv_no_floating_point_registers: Warning<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a build break for anyone that is explicitly referring to this warning.

I'm not sure what the protocol for removing warnings is in clang. For example are they release noted/deprecated or just left in and ignored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-Wunsupported-abi still exists (it is a group containing several warnings). The only thing that is removed here is the particular manifestation of this warning for the ARM target. It should only be a build breakage if -Wunsupported-abi were removed completely.

@RalfJung
Copy link
Contributor

I'm not yet convinced that llvm is the right place for this error message. There's quite a lot of test changes that I presume needed to make and at least the LTO use case looks like we don't want to require extra information. I'd also prefer this to be a front-end error/warning as this is a front-end/source-level mistake, with better diagnostic control options available.

There's a lot of complicated logic required to determine, for each architecture, which target features affect the ABI in which ways. Are you suggesting that logic should be duplicated across all frontends? That seems like a huge waste of effort to me. Most frontends are inevitably going to get it subtly wrong, so in the end we have a ton more bugs than we did if there was a central location where such know-how could be properly encoded, and where there are enough people that can confidently answer ABI questions like this.

@smithp35
Copy link
Collaborator

front-end error/warning as this is a front-end/source-level mistake, with better diagnostic control options available.

There's a lot of complicated logic required to determine, for each architecture, which target features affect the ABI in which ways. Are you suggesting that logic should be duplicated across all frontends? That seems like a huge waste of effort to me. Most frontends are inevitably going to get it subtly wrong, so in the end we have a ton more bugs than we did if there was a central location where such know-how could be properly encoded, and where there are enough people that can confidently answer ABI questions like this.

It is a tradeoff. Yes it means redoing effort in front-ends, but it can mean a better experience for the end-user, and to me that is often more important than the work that we do as toolchain developers. I'm not saying I've definitely got the balance right in this case which is why I invite more opinions :-)

An error in the backend does not have a lot of context. In some cases it may end up triggering on a large LTO build without any source file information given in the error message.

@RalfJung
Copy link
Contributor

RalfJung commented Nov 13, 2024

Yeah ideally LLVM would provide APIs where frontends can explicitly query that kind of information, and then provide nice errors with context. What this PR does is only the second-best thing, in my eyes.

But very few frontends are able to have an expert for each and every architecture LLVM supports, so "do it in the frontend" is often simply not an option.

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 13, 2024

LLVM not checking for this leads to it producing incorrect outputs given certain inputs and arguments when using llc and the like directly. Even in an ideal world where every frontend is target-aware and implements the rules itself (or delegates to LLVM to check before attempting to codegen), LLVM itself should still generate correct artifacts and emit an error if it's not possible to do so when invoked with invalid options.

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 14, 2024

There's quite a lot of test changes that I presume needed to make

FWIW, virtually all of the test changes were due to cases where LLVM would silently use soft-float despite the hard-float ABI being explicitly requested, which is exactly the incorrect behavior that this PR is meant to fix. The total number of affected test cases is fairly small.

and at least the LTO use case looks like we don't want to require extra information.

The direct invocation of llvm-lto case without target information is one I will look into; however, like the others, this change was made because LLVM was generating code in a manner where the hard-float ABI was requested but couldn't be fulfilled. Anywhere that might be happening can lead to unsound ABI-mixing. To me the question is whether the equivalent of that command is invoked internally without the correct target information in typical usage of LTO driven by clang/lld. I'm inclined to say no, based on there being zero other test failures related to this change, but I will do some digging to understand the situation.

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 20, 2024

@ostannard or @davemgreen any other feedback on this change?

@ostannard
Copy link
Collaborator

I agree with the previous comments from @davemgreen and @smithp35, user-facing diagnostics should always be emitted by clang, so that cases like LTO work correctly. If you want to reduce the duplication of the code which works out what ABIs are compatible (which I think would be a good idea), you could probably put that code in TargetParser, and use it from clang to decide what error/warning to emit.

@davemgreen
Copy link
Collaborator

I agree but was having trouble putting it into words. I don't have a reference I can put my hands on, but we have generally considered the llvm error messages to be a poor substitute to those produced by clang and have often gone the other way, converting llvm errors to clang errors. I would think the best interface here is a error/warning that is downgradable to a warning/silent and that is difficult to do in llvm at the moment with the way the errors are setup. They are mostly reserved for things that have gone very wrong or cannot be supported.

It would be good to share this kind of logic where we can though. Either by building a proper way for the backends to report errors to the frontends, or for the logic to be shared in TargetParser.

@RalfJung
Copy link
Contributor

converting llvm errors to clang errors.

I assume you didn't also add those errors to all the other frontends, so every time you do this (assuming you actually remove the LLVM error at that point) you risk introducing bugs into other frontends (if they were relying on LLVM to reject some nonsensical situations). IMO this should be done with more consideration of non-clang frontends.

@RalfJung
Copy link
Contributor

As a Rust compiler dev, this part of LLVM feels extremely fragile. If we do anything wrong, LLVM will just use some ABI, and things will keep working for some situations, but really we have subtle ABI bugs that will eventually bite our users. The rules we have to follow are not documented anywhere (to my knowledge), and require target-specific expertise -- even if we get the x86 part right, we'll need a PowerPC expert to also get soft-float handling right for that target.

It's not a pleasant experience at all. Having LLVM be more robust, and throw errors rather than being like "yeah let's just continue, I'm sure it's going to be fine", would go a long way towards making LLVM easier to use and more reliable as a backend. It is concerning if attempts at improving the robustness of LLVM in this space are being blocked.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in responding.

I don't have much to add over davemgreen and ostannard.

Leaving aside whether this should be an error in the front or back-end. As mentioned before I haven't worked in this area for a long time, and there are enough test changes here that make me nervous enough that this is going to break some existing use cases. Some do look like sloppy triple selection, although a lot of the tests don't use any floating point so it could be seen as harsh to fault them. If I understood the area better that wouldn't be a problem, but as I'm not enthusiastic about moving a diagnostic to the backend, this isn't a change that I am best placed to approve. I won't object if there's a consensus in the other direction.

One use case I'd want to keep working is where we have two functions in the same object that are hardfp with hardware registers, and soft (with or without hardware registers). Something like:

__attribute__((target("arch=cortex-m33"))) __attribute__((pcs("aapcs-vfp"))) float f1(float a, float b) {
  return a + b;
}

__attribute__((target("arch=cortex-m0"))) __attribute__((pcs("aapcs"))) float f2(float a, float b) {
  return a + b;
}

There are some source files that do run-time selection based on hardware detection.

@@ -16,7 +16,7 @@

// RUN: %clang --target=arm-none-eabi -mcpu=cortex-m33 -mfloat-abi=hard -O1 %s -flto=thin -c -o %t.call_thin.bc -DCALL_LIB
// RUN: %clang --target=arm-none-eabi -mcpu=cortex-m33 -mfloat-abi=hard -O1 %s -flto=thin -c -o %t.define_thin.bc -DDEFINE_LIB
// RUN: llvm-lto2 run -o %t.lto_thin -save-temps %t.call_thin.bc %t.define_thin.bc \
// RUN: llvm-lto2 run --mcpu=cortex-m33 --float-abi=hard -o %t.lto_thin -save-temps %t.call_thin.bc %t.define_thin.bc \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was this particular test update that makes me most nervous about the change.

IIUC https://reviews.llvm.org/D89573 after a fix for https://bugs.llvm.org/show_bug.cgi?id=45524

From disassembling the bitcode the module has triple thumbv8m.main-unknown-none-eabihf and have target features that include the necessary hardware floating point registers.

The llvm-dis output from lto also looks like it preserves these. I would not expect any additional flags to be necessary here.

The following example is very common and we would need to avoid requiring additional mcpu and float-abi flags in the LTO step.

clang --target=arm-none-eabi -mcpu=cortex-m33 -mfloat-abi=hard -c -flto file1.c
clang --target=arm-none-eabi -mcpu=cortex-m33 -mfloat-abi=hard -c -flto file2.c
clang file1.o file2.o -o file.exe

Copy link
Contributor Author

@chrisnc chrisnc Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug into this further as promised, and the error goes away if you just remove the call to printf from the test, which is otherwise expected to be replaced with a call to putchar. This means that the llvm-lto2 invocation is eventually creating a new subtarget in which the target features and ABI are not inherited from the settings of either input file, and because the ones it lands on are incompatible, an error is raised. Pre-existing functions that were generated with a specific ABI and features are preserved as you would expect.

Is this possibly because, when generating code for builtins, the lto pass does actually need the caller to specify the target and features those builtins should be generated with respect to, and if it needs to insert a call to a builtin that it is providing (with default / no target features, because none were given) into a context where the ABI is set by a caller that came from the input which requires some target features, this leads to the error?

Copy link
Contributor Author

@chrisnc chrisnc Nov 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To test this theory, I had the error output the function name just like the nearby check for thumb does, and this is what I get when running this test unmodified:

The hard-float ABI is enabled for function 'llvm.lifetime.start.p0', but the target lacks floating-point registers.

This means that this function was being generated during LTO itself with an invalid (feature set, ABI) combination. Depending on what this function's parameters are, that could lead to incorrect behavior, so in general, it may not be safe to LTO without enough features enabled to support the ABI that the compiled inputs use, and it is good for LLVM to complain about this rather than potentially emit erroneous code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the investigation. I think that this use case of LTO needs to work without an error as while we can provide the target for this specific test case, it may cause other uses of LTO without a target on the linker driver to fail. This may mean that we have to fix this in LTO.

From git annotate the test case was added in https://reviews.llvm.org/D89573 ostensibly to make sure that if LTO calls a C function it uses the correct ABI.

Looking at llvm.lifetime.start this is a lifetime marker intrinsic https://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic . IIUC lifetime markers are for communicating lifetimes of objects to optimisers, so while they influence code-generation they don't get code generated for them.

It may be an area where the subtarget for the function isn't being passed through correctly, although in the case of lifetime markers it is benign. This is an area that needs a lot more in depth expertise of LLVM and LTO than I posess. I can ask around internally to see if anyone is willing to take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making the error only fire if any of the function's argument types or return types are floating-point? That would eliminate the need for this test change and probably several others that weren't actually using floating-point, and be more defensible in that if you encounter such a function with the ABI/feature combination being checked for, it will generate incorrect code for it if not for this error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be a good way forward. That would help reduce the chances of this breaking some existing projects.

That would then leave us with the debate about front-end and back-end error messages. I've been thinking about how to get past the differences of opinion here. My thoughts so far:

  • Make the clang warning an error (to match GCC) and make sure it triggers for the target and the -mfloat-abi=hard case. Then clang users will see a front-end error and shouldn't see the backend error unless they've messed about behind the compilers back with target attributes. Other front-ends can choose their own policy.
  • Make the back-end error opt-in, with clang opting out and retaining the existing clang warning. That retains the existing behaviour for clang. I personally would prefer clang/gcc to behave the same way and give an error message.

While I think that either of those would take care of user facing concerns, there's a possible objection of code complexity, but I'd have to leave that in the hands of the code-owners.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted this approach, but there's an issue due to how the getSubtargetImpl function works. It essentially maintains a map of subtargets it's seen before, keyed by a string of cpu + feature list, and only does the error checking when it sees a new combination, otherwise returns the subtarget that's already been created for the given cpu + feature list key. What this means is that if a non-fp-using function is encountered first in a translation unit, then the check passes because its use of inconsistent ABI+feature is harmless, but subsequent functions that should be flagged will be ignored because the map already has a subtarget for them.

Any suggestions on how to deal with this would be appreciated, or if there is a maintainer for this specific subsystem who could provide guidance here that would also be helpful.

@chrisnc
Copy link
Contributor Author

chrisnc commented Nov 22, 2024

One use case I'd want to keep working is where we have two functions in the same object that are hardfp with hardware registers, and soft (with or without hardware registers). Something like:

__attribute__((target("arch=cortex-m33"))) __attribute__((pcs("aapcs-vfp"))) float f1(float a, float b) {
  return a + b;
}

__attribute__((target("arch=cortex-m0"))) __attribute__((pcs("aapcs"))) float f2(float a, float b) {
  return a + b;
}

There are some source files that do run-time selection based on hardware detection.

I can confirm that this does not emit an error.
The way this change is implemented, each function will be checked for consistency of ABI + features, and both of these are fine, so there's no issue.

@workingjubilee
Copy link
Contributor

I agree but was having trouble putting it into words. I don't have a reference I can put my hands on, but we have generally considered the llvm error messages to be a poor substitute to those produced by clang and have often gone the other way, converting llvm errors to clang errors.

I am not exactly sure how LLVM developers are imagining that frontend developers determine that they should diagnose a situation and provide a friendlier error? The primary motivation we have is that we notice that LLVM vomits up a nasty stacktrace in some situation, and then we figure it out and cover up the sewer hole that people could otherwise fallthrough to.

I suppose this also works if LLVM developers use their mystical powers to instantly transmit an awareness of all the situations every frontend developer should be diagnosing and erroring on. Was there a psychic mind meld I missed? Because I had to spend an entire night and day going over all the edge cases for the interrupt ABIs that we "support" in Rust, instead of merely possessing mystical knowledge about which cases the frontend should catch and error on.

@chrisnc chrisnc requested a review from smithp35 November 22, 2024 16:17
Comment on lines +320 to +322
if (I->isTargetHardFloat() && !I->hasFPRegs())
F.getContext().emitError("The hard-float ABI is enabled, but the target "
"lacks floating-point registers.");
Copy link
Contributor

@RalfJung RalfJung Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to check the target triple and do automatic hard float detection (in ARMTargetMachine.isTargetHardFloat). Shouldn't this instead check Options.FloatABIType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. I initially avoided that because one of the values is Default which is meant to defer to the triple, but in the constructor of ARMBaseTargetMachine it sets the internal Options.FloatABIType to one or the other based on the passed in option and the triple, so it should be correct to just use that. I was testing with things like --target=arm-none-eabi -mcpu=cortex-m0 -mfloat-abi=hard and seeing the error trigger, so something else in the frontend may be generating a different triple that makes this "work".

Copy link
Contributor Author

@chrisnc chrisnc Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I remember now where I got stuck on this before. I need to obtain this property on the ARMSubtarget rather than the ARMBaseTargetMachine, but neither of them expose Options publicly, and ARMSubtarget just invokes isTargetHardFloat on its TM member, without accounting for its own Options, even though they are copied from the ARMBaseTargetMachine when the subtarget is created, which should be set to either hard or soft, based on the value that was set in ARMBaseTargetMachine's constructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ARMSubtarget.isTargetHardFloat seems like a footgun, I hope the rest of the backend properly has Options.FloatABIType override the target default.

@RalfJung
Copy link
Contributor

RalfJung commented Jan 1, 2025

RISC-V has similar checks here:

if ((ABI == RISCVABI::ABI_ILP32F || ABI == RISCVABI::ABI_LP64F) &&
!Subtarget.hasStdExtF()) {
errs() << "Hard-float 'f' ABI can't be used for a target that "
"doesn't support the F instruction set extension (ignoring "
"target-abi)\n";
ABI = Subtarget.is64Bit() ? RISCVABI::ABI_LP64 : RISCVABI::ABI_ILP32;
} else if ((ABI == RISCVABI::ABI_ILP32D || ABI == RISCVABI::ABI_LP64D) &&
!Subtarget.hasStdExtD()) {
errs() << "Hard-float 'd' ABI can't be used for a target that "
"doesn't support the D instruction set extension (ignoring "
"target-abi)\n";
ABI = Subtarget.is64Bit() ? RISCVABI::ABI_LP64 : RISCVABI::ABI_ILP32;
}

So maybe the ARM checks could be added in a similar place? RISC-V handles ABI variants in a very clean way, and it does seem to work in practice too, so it'd be a good model for other architectures to follow.

@chrisnc
Copy link
Contributor Author

chrisnc commented Jan 2, 2025

RISC-V has similar checks here:

if ((ABI == RISCVABI::ABI_ILP32F || ABI == RISCVABI::ABI_LP64F) &&
!Subtarget.hasStdExtF()) {
errs() << "Hard-float 'f' ABI can't be used for a target that "
"doesn't support the F instruction set extension (ignoring "
"target-abi)\n";
ABI = Subtarget.is64Bit() ? RISCVABI::ABI_LP64 : RISCVABI::ABI_ILP32;
} else if ((ABI == RISCVABI::ABI_ILP32D || ABI == RISCVABI::ABI_LP64D) &&
!Subtarget.hasStdExtD()) {
errs() << "Hard-float 'd' ABI can't be used for a target that "
"doesn't support the D instruction set extension (ignoring "
"target-abi)\n";
ABI = Subtarget.is64Bit() ? RISCVABI::ABI_LP64 : RISCVABI::ABI_ILP32;
}

So maybe the ARM checks could be added in a similar place? RISC-V handles ABI variants in a very clean way, and it does seem to work in practice too, so it'd be a good model for other architectures to follow.

I could try this approach here and see how it goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category debuginfo
Projects
None yet
7 participants