-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support --fix-v4bx #50764
Comments
There's two separate issues here:
Supporting armv4t requires some modifications to the relocation handling for calls; that's what the warning is about. Supporting armv4 without thumb means we need to use a different sequence for function returns. Historically, this is a compiler thing, not a linker thing, but R_ARM_V4BX is a trick to allow writing code that works for both armv4t and armv4-without-thumb targets. --fix-v4bx essentially converts arm-mode armv4t code to armv4. If you're not using --fix-v4bx, the linker doesn't really care about the difference between armv4t and armv4; it doesn't generate function returns, anyway. |
(The documentation says the support is for >=v6.) Isn't armv4 so obscure that it it quite irrelevant now? |
armv5t has both the register and immediate forms of blx, I think. I think the issue here is there are some legacy hardware enthusiasts who build Linux for very old hardware, like https://en.wikipedia.org/wiki/Risc_PC , so legacy configs targeting armv4 persist in the kernel source tree, and randconfig stumbles over them. |
There are three classes of ARMv4 (not T) users that I can see from a kernel perspective:
For reference, we are still adding support for ARMv5E (ARM926E) SoC platforms, including SoCs designed as recently as 2020 (Microchip SAM9X60). |
On the LLD side I originally implemented v7-A and above then later added v5 as there was at least one person had some need for it, I think on the BSD side. It probably wouldn't be a huge amount of extra code and tests to support v4t but up until now there hasn't been anyone needing it in LLD. The assumption was that older hardware would already have toolchains they could use, and we could avoid extra complexity in the code. For v4t support in lld, there are a few things we'd need to do:
If the Thunks code has done its job properly then the relocation handling code should never be presented with an opportunity to use a BLX. On the linker side R_ARM_V4BX is trivial to implement (just write Arm state mov pc, lr encoding to the location), but as Eli points out the compiler has to generate the relocation when targeting Arm v4. If there is a need for support then it may be worth approaching Linaro TCWG. I could do it, but I've not got a lot of spare time so it might take a few weeks rather than a few days. |
The first thing any v4t Thumb thunk will do is switch to ARM mode. The remaining code is the same, so it could be shared. Not sure that's worth doing, though. |
Also, the kernel doesn't actually have any thumb code, so it doesn't actually need v4t thunks, I think. It would be enough to just suppress the warning if there aren't any Thumb symbols. |
It is possible, although it would mean two entry points for thunk that we'd need to select between per relocation type. My instinct would be to keep it simple unless there was a real need to minimise the code-size.
That would be possible, although at the moment that would require LLD to scan all the local symbols of every input object looking for the absence of $t. Although only programs that could trigger the warning would need to run the check. We can't use the BuildAttributes as objects built with -marm still have Tag_THUMB_ISA_use=1 (Thumb instructions were permitted to be used) Would a suppression of the warning be acceptable to the kernel folk? Or are there v4t kernels that use Thumb code that need supporting? |
I'm sure there is no thumb code in pre-v7 kernels that we need to link against. Suppressing the warning would seem to do the trick for armv4t kernels, but if I build for armv4, I still see 'bx' instructions in the thunks. I checked the compiler output from a recent clang and I see that building for armv4 does not generate any 'bx' instructions, so I don't think we actually need the R_ARM_V4BX/--fix-v4bx trick any more to work around compiler-generated instructions, but we still need to work around the linker adding them. Could we just have a linker flag to ask for armv4 style thunks, and use them for both v4 and v4t kernels? |
It does look like (ARMInstrInfo.td MOVPCLR https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMInstrInfo.td#L2443) the return sequence is predicated so clang/llvm will use mov pc, lr for -march=armv4 so in theory no compiler work required. We can tell if all objects are ArmV4 from the build attributes. Wouldn't need a new command line option. Looking at LLD I'd expect that the only thunks needed for an Arm v4 kernel would be long range Arm to Arm. If --pic-veneer is set then these will have bx lr in them. However if --pic-veneer isn't set I'd expect to see all the thunks to be of the form: In theory these should be OK for v4. May be worth trying a build without --pic-veneer (The kernel wouldn't be position independent though). I think a new PI Thunk for v4 would look something like: Although if we're adding thunks for armv4, it may just be worth adding v4t thunks at the same time. |
Maybe it was the 'blx mcount' when building with '-pg' that caused the problems? I remember debugging into that in the past, but don't remember the exactly problem and this seems fine now, it always uses 'bl mcount' here with the clang version I see.
Correct
Yes, that works. I see the --pic-veneer flag was added by Ard in to work around veneers that are inserted into the early-boot code that is run before the MMU is enabled. We can probably get away with this on actual ARMv4/v4t hardware since their memory is too constrained to actually run a kernel that is large enough to require veneers, but it seems a little wrong. In theory one can run a large (allyesconfig style) kernel with ARMv4 enabled on an ARMv5 machine with a lot of memory.
That would be nice for user space, but in the kernel we don't really care about the distinction. I suggested a command line flag since we could then just unconditionally pick the v4 veneers for linking a v4/v4t kernel and avoid both the bx instructions and the warning. |
lld cannot build for ARMv4/v4T targets because it inserts 'blx' instructions that are unsupported there: ld.lld: warning: lld uses blx instruction, no object with architecture supporting feature detected Add a Kconfig time dependency to prevent those targets from being selected in randconfig builds. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: llvm/llvm-project#50764 Link: ClangBuiltLinux#964 Link: https://lore.kernel.org/r/20221215162635.3750763-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
https://reviews.llvm.org/rG62c605771a30 looks perhaps related? cc @stuij |
Ah, this is bad timing! I was unaware of this ticket. Yes, so my patch stops lld inserting BLX instructions for v4(t), and instead now creates veneers that are sensible for these architectures. Currently I only added support for absolute relocations, but PIC relocations are high on the list and should follow shortly. I think it's just two cases that we should still account for, and adding them is trivial. Which includes v4 PIC thunks not generating a BX which right now they still do. And as we don't insert BLX anymore, the LLD BLX warning is not emitted anymore. Let me know if I'm missing more things that need fixing. ATM I'm quite motivated to get v4(t) working for LLVM, so happy to help out where I can on this. |
I just put up a review for LLD that will make it support position independent thunks for v4/v4T: I think that would fix this issue, and no more work would be needed on the LLVM side. To summarise:
Am I missing anything? This would mean the intel-lab-lkp/linux@6a7ee50 patch can be reverted. Would we need to wait until say LLVM 16 is released (assuming this patch lands before that)? By the way, relatedly, I just pushed a fix (747fc27) for v4/v5/v6 where a |
- Position independent thunks now work for both Armv4 and Armv4T - Armv4 arm->arm thunks don't emit a BX anymore, which doesn't exist for the arch. This fixes llvm/llvm-project#50764. - Armv4 and Armv4T both have the same arm->arm behaviour. Which also is desirable for the above ticket. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D141272
I just tested this with clang 17.0.6 on the Gemini FA526 target, close to ARMv4T. sl-boot>go 0x400000 It boots all the way to userspace, gets network, routes packets and has framebuffer. |
Also tested with clang 17.0.6 on the SA110 Netwinder, which is a pure ARMv4 targer. Press '*' TWICE to enter debug......Booting kernel... All the way to userspace, network and graphics. |
Yay! Tackar for posting this, and I bet I speak for all armv4(t) fans all over the world. I Googled the Gemini SoC, and landed on a page you wrote. I'm tempted to buy a router, or perhaps one of those strongarm servers. |
- Position independent thunks now work for both Armv4 and Armv4T - Armv4 arm->arm thunks don't emit a BX anymore, which doesn't exist for the arch. This fixes llvm/llvm-project#50764. - Armv4 and Armv4T both have the same arm->arm behaviour. Which also is desirable for the above ticket. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D141272
Extended Description
In order to link armv4 Linux kernel images with LLD, it's required that LLD support --fix-v4bx, otherwise we get the warning:
IIUC, there's even a relocation type for this: R_ARM_V4BX.
ClangBuiltLinux/linux#964
The text was updated successfully, but these errors were encountered: