-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ARM] Emit a warning when the hard-float ABI is enabled but can't be used. #111334
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-debuginfo @llvm/pr-subscribers-backend-arm Author: Chris Copeland (chrisnc) ChangesCurrently, compiling for eabihf with a CPU lacking floating-point Fixes #110383. Full diff: https://github.com/llvm/llvm-project/pull/111334.diff 1 Files Affected:
diff --git a/llvm/lib/Target/ARM/ARMTargetMachine.cpp b/llvm/lib/Target/ARM/ARMTargetMachine.cpp
index 7553778c574033..82927a7430b5d0 100644
--- a/llvm/lib/Target/ARM/ARMTargetMachine.cpp
+++ b/llvm/lib/Target/ARM/ARMTargetMachine.cpp
@@ -309,11 +309,15 @@ ARMBaseTargetMachine::getSubtargetImpl(const Function &F) const {
// function that reside in TargetOptions.
resetTargetOptions(F);
I = std::make_unique<ARMSubtarget>(TargetTriple, CPU, FS, *this, isLittle,
- F.hasMinSize());
+ F.hasMinSize());
if (!I->isThumb() && !I->hasARMOps())
F.getContext().emitError("Function '" + F.getName() + "' uses ARM "
"instructions, but the target does not support ARM mode execution.");
+
+ if (I->isTargetHardFloat() && !I->hasFPRegs())
+ F.getContext().emitError("The hardfloat ABI is enabled, but the target "
+ "lacks floating-point registers.");
}
return I.get();
|
b0e9c78
to
b73e1b3
Compare
clang tests are passing now, but some llvm tests are not, e.g., invocations of |
It looks like there is already a warning for this in clang, it only triggers from -mfloat-abi=hard though: https://godbolt.org/z/dcaz8had4. Could it be made to work with any hard-float env? And maybe be made an error down-gradable to a warning? Generally clang-level warnings/errors are more user friendly then the llvm-level errors (but both may be useful for other frontends). |
That seems like a decent improvement to the clang warning, but the backend behavior of still emitting the VFP ABI tag in this case seems like a bug, regardless of the frontend, so IMO codegen should be aborted in this situation. |
Whatever we choose to do I think clang and llvm should be consistent. There's no point in clang giving a warning if llvm is going to give an error. I can see two positions most clearly:
Personally I don't have a problem with clang giving an error in this case as that is what GCC does. Appendix, some empirical data:
Clang -mfloat-abi
Actual behaviour is soft-float but with incorrect build attributes. Clang --target=arm-none-eabihf
Actual behaviour is soft-float but with incorrect build attributes. |
My preference would be to remove the clang warning and add the LLVM error so that other frontends can benefit from this checking, as well as the LLVM tools themselves. The LLVM test failures reveal a few places where this issue seems to be ignored. |
What would be really nice is if the LLVM backend could tell the frontend about the error rather than reporting it itself, that way frontends can use their nice error reporting code but don't have to re-implement the error detection logic. :) Anyway, that's a different issue... |
b73e1b3
to
5e916c3
Compare
I've updated this PR to remove the aforementioned |
4bd5ec3
to
9903206
Compare
All unit tests are now updated and passing and this change is ready for review. Thanks! |
Friendly bump :) @davemgreen @smithp35 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not yet convinced that llvm is the right place for this error message. There's quite a lot of test changes that I presume needed to make and at least the LTO use case looks like we don't want to require extra information. I'd also prefer this to be a front-end error/warning as this is a front-end/source-level mistake, with better diagnostic control options available.
I think if it is implemented in llvm then it will need to use the per-function subtarget to not risk fallout with LTO builds.
Apologies I'm probably not particularly enthusiastic about the approach. I've not worked in this area for some time so I'm happy for other reviewer's opinions to take precedence.
|
||
if (!I->isThumb() && !I->hasARMOps()) | ||
F.getContext().emitError("Function '" + F.getName() + "' uses ARM " | ||
"instructions, but the target does not support ARM mode execution."); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC [*] TargetMachine::getSubtargetImpl is the "global" subtarget. It has been a long time since I've looked at the area so I could be wrong.
Putting the warning here makes me nervous as individual functions have their own subtargets. It is common for LTO to rely entirely on the per function subtarget. I can remember an Android problem that was caused by using the global subtarget for a per-function property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSubtargetImpl
takes a const Function &
as a parameter, and looks at that function's attributes, and can give per-function diagnostics for the thumb check, so I believe it is not the "global" subtarget.
@@ -459,9 +459,6 @@ def warn_drv_assuming_mfloat_abi_is : Warning< | |||
def warn_drv_unsupported_float_abi_by_lib : Warning< | |||
"float ABI '%0' is not supported by current library">, | |||
InGroup<UnsupportedABI>; | |||
def warn_drv_no_floating_point_registers: Warning< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a build break for anyone that is explicitly referring to this warning.
I'm not sure what the protocol for removing warnings is in clang. For example are they release noted/deprecated or just left in and ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Wunsupported-abi
still exists (it is a group containing several warnings). The only thing that is removed here is the particular manifestation of this warning for the ARM target. It should only be a build breakage if -Wunsupported-abi
were removed completely.
There's a lot of complicated logic required to determine, for each architecture, which target features affect the ABI in which ways. Are you suggesting that logic should be duplicated across all frontends? That seems like a huge waste of effort to me. Most frontends are inevitably going to get it subtly wrong, so in the end we have a ton more bugs than we did if there was a central location where such know-how could be properly encoded, and where there are enough people that can confidently answer ABI questions like this. |
It is a tradeoff. Yes it means redoing effort in front-ends, but it can mean a better experience for the end-user, and to me that is often more important than the work that we do as toolchain developers. I'm not saying I've definitely got the balance right in this case which is why I invite more opinions :-) An error in the backend does not have a lot of context. In some cases it may end up triggering on a large LTO build without any source file information given in the error message. |
Yeah ideally LLVM would provide APIs where frontends can explicitly query that kind of information, and then provide nice errors with context. What this PR does is only the second-best thing, in my eyes. But very few frontends are able to have an expert for each and every architecture LLVM supports, so "do it in the frontend" is often simply not an option. |
LLVM not checking for this leads to it producing incorrect outputs given certain inputs and arguments when using |
FWIW, virtually all of the test changes were due to cases where LLVM would silently use soft-float despite the hard-float ABI being explicitly requested, which is exactly the incorrect behavior that this PR is meant to fix. The total number of affected test cases is fairly small.
The direct invocation of llvm-lto case without target information is one I will look into; however, like the others, this change was made because LLVM was generating code in a manner where the hard-float ABI was requested but couldn't be fulfilled. Anywhere that might be happening can lead to unsound ABI-mixing. To me the question is whether the equivalent of that command is invoked internally without the correct target information in typical usage of LTO driven by clang/lld. I'm inclined to say no, based on there being zero other test failures related to this change, but I will do some digging to understand the situation. |
I agree with the previous comments from @davemgreen and @smithp35, user-facing diagnostics should always be emitted by clang, so that cases like LTO work correctly. If you want to reduce the duplication of the code which works out what ABIs are compatible (which I think would be a good idea), you could probably put that code in |
I agree but was having trouble putting it into words. I don't have a reference I can put my hands on, but we have generally considered the llvm error messages to be a poor substitute to those produced by clang and have often gone the other way, converting llvm errors to clang errors. I would think the best interface here is a error/warning that is downgradable to a warning/silent and that is difficult to do in llvm at the moment with the way the errors are setup. They are mostly reserved for things that have gone very wrong or cannot be supported. It would be good to share this kind of logic where we can though. Either by building a proper way for the backends to report errors to the frontends, or for the logic to be shared in TargetParser. |
I assume you didn't also add those errors to all the other frontends, so every time you do this (assuming you actually remove the LLVM error at that point) you risk introducing bugs into other frontends (if they were relying on LLVM to reject some nonsensical situations). IMO this should be done with more consideration of non-clang frontends. |
As a Rust compiler dev, this part of LLVM feels extremely fragile. If we do anything wrong, LLVM will just use some ABI, and things will keep working for some situations, but really we have subtle ABI bugs that will eventually bite our users. The rules we have to follow are not documented anywhere (to my knowledge), and require target-specific expertise -- even if we get the x86 part right, we'll need a PowerPC expert to also get soft-float handling right for that target. It's not a pleasant experience at all. Having LLVM be more robust, and throw errors rather than being like "yeah let's just continue, I'm sure it's going to be fine", would go a long way towards making LLVM easier to use and more reliable as a backend. It is concerning if attempts at improving the robustness of LLVM in this space are being blocked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay in responding.
I don't have much to add over davemgreen and ostannard.
Leaving aside whether this should be an error in the front or back-end. As mentioned before I haven't worked in this area for a long time, and there are enough test changes here that make me nervous enough that this is going to break some existing use cases. Some do look like sloppy triple selection, although a lot of the tests don't use any floating point so it could be seen as harsh to fault them. If I understood the area better that wouldn't be a problem, but as I'm not enthusiastic about moving a diagnostic to the backend, this isn't a change that I am best placed to approve. I won't object if there's a consensus in the other direction.
One use case I'd want to keep working is where we have two functions in the same object that are hardfp with hardware registers, and soft (with or without hardware registers). Something like:
__attribute__((target("arch=cortex-m33"))) __attribute__((pcs("aapcs-vfp"))) float f1(float a, float b) {
return a + b;
}
__attribute__((target("arch=cortex-m0"))) __attribute__((pcs("aapcs"))) float f2(float a, float b) {
return a + b;
}
There are some source files that do run-time selection based on hardware detection.
I can confirm that this does not emit an error. |
I am not exactly sure how LLVM developers are imagining that frontend developers determine that they should diagnose a situation and provide a friendlier error? The primary motivation we have is that we notice that LLVM vomits up a nasty stacktrace in some situation, and then we figure it out and cover up the sewer hole that people could otherwise fallthrough to. I suppose this also works if LLVM developers use their mystical powers to instantly transmit an awareness of all the situations every frontend developer should be diagnosing and erroring on. Was there a psychic mind meld I missed? Because I had to spend an entire night and day going over all the edge cases for the interrupt ABIs that we "support" in Rust, instead of merely possessing mystical knowledge about which cases the frontend should catch and error on. |
if (I->isTargetHardFloat() && !I->hasFPRegs()) | ||
F.getContext().emitError("The hard-float ABI is enabled, but the target " | ||
"lacks floating-point registers."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to check the target triple and do automatic hard float detection (in ARMTargetMachine.isTargetHardFloat
). Shouldn't this instead check Options.FloatABIType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right. I initially avoided that because one of the values is Default
which is meant to defer to the triple, but in the constructor of ARMBaseTargetMachine
it sets the internal Options.FloatABIType
to one or the other based on the passed in option and the triple, so it should be correct to just use that. I was testing with things like --target=arm-none-eabi -mcpu=cortex-m0 -mfloat-abi=hard
and seeing the error trigger, so something else in the frontend may be generating a different triple that makes this "work".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I remember now where I got stuck on this before. I need to obtain this property on the ARMSubtarget
rather than the ARMBaseTargetMachine
, but neither of them expose Options
publicly, and ARMSubtarget
just invokes isTargetHardFloat
on its TM
member, without accounting for its own Options
, even though they are copied from the ARMBaseTargetMachine
when the subtarget is created, which should be set to either hard or soft, based on the value that was set in ARMBaseTargetMachine
's constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah ARMSubtarget.isTargetHardFloat
seems like a footgun, I hope the rest of the backend properly has Options.FloatABIType
override the target default.
RISC-V has similar checks here: llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp Lines 88 to 100 in ed572f2
So maybe the ARM checks could be added in a similar place? RISC-V handles ABI variants in a very clean way, and it does seem to work in practice too, so it'd be a good model for other architectures to follow. |
I could try this approach here and see how it goes. |
@chrisnc what are the next steps here? Maybe one solution, for maximum compatibility, would be to not emit any warnings when |
I've been fairly busy over the past few months so haven't had time to pursue this further. One of the issues is that, as far as I can tell, there doesn't seem to be a mechanism to issue a warning from within LLVM codegen, only to error. The behavior you are describing is already what My next step was going to experiment with the approach the RISC-V backend uses and see how it goes, especially to evaluate the blast radius in the tests and compare. |
RISC-V does Rust will always set |
Oh, good to know... I specifically went to check where that interface was defined to see if there was an equivalent Even if Rust always sets a non-default ABI, it's not clear which use cases of clang and the llvm tools result in the same, but that will be clearer once I see the test runs. I'll try and pick this up again this weekend. |
9903206
to
d5cae4e
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
I've adapted the approach used in the RISC-V backend to this PR and carried over the new test I added before (updated the warning message and changed the expected exit code of the erroneous Would appreciate another review. I've opted to leave the |
…used. Compiling for an eabihf target with a CPU lacking floating-point registers will silently use the soft-float ABI instead, even though the Arm attributes section still has Tag_ABI_VFP_args: VFP registers, which leads to silent ABI mismatches at link time. Fixes llvm#110383.
d5cae4e
to
06fede0
Compare
That seems to implement what was sketched above, but I am not familiar with this code so I can't say if there's any possible problems this could cause. |
CI is passing with no changes to existing tests. @davemgreen @ostannard @smithp35 please let me know if any further changes are needed. Having this backstop will give the users of other tools a chance at avoiding these situations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I don't have any fundamental objections to a diagnostic given that the existing clang diagnostic has been kept in place. As maintainers I think the final say needs to be with @ostannard and @davemgreen .
I'm wondering if there's any surrounding context that can used to make the warning more useful. Right now it seems to be appearing once per Subtarget. I've left a more specific comment where more context is available.
@@ -791,7 +791,9 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM, | |||
setAllExpand(MVT::f32); | |||
if (!Subtarget->hasFP64()) | |||
setAllExpand(MVT::f64); | |||
} | |||
} else if (TM.Options.FloatABIType == FloatABI::Hard) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I run this on a test program I get a couple of warnings
bin/clang --target=arm-none-eabihf -march=armv7-r nofp.c -c -mfloat-abi=hard -ffunction-sections
clang: warning: '-mfloat-abi=hard': selected processor lacks floating point registers [-Wunsupported-abi]
The hard-float ABI can't be used for a target that doesn't support floating-point (ignoring float-abi)
The hard-float ABI can't be used for a target that doesn't support floating-point (ignoring float-abi)
I think this is because this constructor is run once per ARMSubtarget. Tracing this back to getSubtargetImpl where ARMSubtargets are created https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMTargetMachine.cpp#L280
It looks like that might be a better place to put this diagnostic. We may also want to see if some additional information like the function name can be used, possibly something from the Subtarget if the output is human readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does look like a better place for this, and already has one check like this, emitted through the LLVMContext
so that the compiler can exit with a non-zero status code, or report the error through whatever other means it normally used. I think an error is better than a warning for this, because it's not possible to generate the requested code correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think the best place to report errors like this is in the frontend or driver, so that the errors can point to exact source files, line numbers, etc. We could probably add some functions to TargetParser
to help other frontends with these sort of compatibility checks.
However, I'm fine with this as a backup to avoid generating incorrect code when the frontend either misses a check, or is fine with the lower-quality error messages which the backend can create.
@@ -791,7 +791,9 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM, | |||
setAllExpand(MVT::f32); | |||
if (!Subtarget->hasFP64()) | |||
setAllExpand(MVT::f64); | |||
} | |||
} else if (TM.Options.FloatABIType == FloatABI::Hard) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does look like a better place for this, and already has one check like this, emitted through the LLVMContext
so that the compiler can exit with a non-zero status code, or report the error through whatever other means it normally used. I think an error is better than a warning for this, because it's not possible to generate the requested code correctly.
Would it make sense to adjust the riscv backend as well for consistency? Seems a bit odd to have this check in different places for different targets.
|
Yes, I think moving the riscv check too would make sense. |
Compiling for an eabihf target with a CPU lacking floating-point
registers will silently use the soft-float ABI instead, even though the
Arm attributes section still has Tag_ABI_VFP_args: VFP registers, which
leads to silent ABI mismatches at link time.
Fixes #110383.