-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
__int128_t multiply with -fsanitize=undefined fails to link (undefined reference to __muloti4) #16778
Comments
You'll need to make sure and link against compiler-rt for this to work. What does -v on your command line do? |
We don't link in compiler-rt by default on linux; we expect the symbols there to be provided by libgcc instead, but these ones aren't. |
I'm confused about why you think I should need to explicitly link with compiler-rt : if the compiler doesn't automatically link in its own runtime functions as it needs them surely that's a compiler bug? I don't need to explicitly link with libgcc if I'm compiling with gcc... Output if compiling with -v, as requested:
|
Well, LLVM is a modular project and there aren't clear rules on what you have to build or link, so that people can mix&match. The compiler-rt is not a required module for LLVM and until now there was no real dependency between both projects. The sanitizers have changed that, so maybe we need a different policy on compiler-rt when the sanitizers are enabled. At least a warning at compile time would suffice. Also, for distribution packagers, would be good to leave an advice that there is this dependency, if they're packing the sanitizers with clang. |
Well, LLVM may be modular, but at the point that you're distributing a C compiler binary configured to produce binaries for a specific OS/runtime (in this case your prebuilt clang binary building Linux binaries) it needs to automatically link in its runtime routines that it generates references to. |
Yes, this should work out of the box, and ideally we should be linking to compiler-rt by default. However, getting to that state is going to require someone putting in the relevant work updating the build system and our documentation and performing enough testing that we can be sure that preferring compiler-rt over libgcc does not regress anything (correctness, performance, binary size, ...). Perhaps we can work around this in the short term by including the int128 overflow routines in the UBSan runtime library as well as in libclang. |
Peter, I was not trying to excuse LLVM, but to explain the source of the bug. ;) The modularity of LLVM, added with the number of side-projects that depend on it, makes it harder for LLVM to have hard dependencies like GCC, and whenever one feature is implemented across modules, it'll be harder to make sure they keep coherence during release packaging. I could be wrong, but my view is that it should be the job of the distribution packager, not the LLVM community, to make sure that the distributed Clang has all its dependencies met, whether on the same package, or via additional packages, and have the appropriate warnings when these dependencies are not met. My reasoning is that different distributions might chose all-in-one approach, while others will prefer more modular, perhaps, because they also release other tools depending on LLVM core and want to reduce redundancy. It'll be very hard for the LLVM do that for every single distribution. |
This is problematic, since not all targets have the same support as libgcc, and not as tested and benchmarked. Maybe, the sanitizer libraries shouldn't have been on compiler-rt to begin with, especially because it became an integral part of Clang, which had no previous strong dependency on it.
That's a moving target, and there are more people working on libgcc on each target than on compiler-rt at all. I don't think that's a feasible goal by default unless the vendors adopt compiler-rt as they have adopted libgcc, and make it a reality.
Another solution would be to include them into libgcc. |
I don't think changing that would affect this bug at all. And note that the sanitizers are a red herring here: the same problem affects -ftrapv (and always has).
That doesn't help people who don't have bleeding-edge GCC. Perhaps a better option would be to include both libgcc and compiler-rt on the link line, such that libgcc is preferred if both provide a symbol. |
Of course. I should have said "a long term solution". Including both is the obvious short-term solution, but that should only be required is the feature is enabled, and it's up to the binary distributors (official packagers and distribution releasers) to make sure RT is bundled together. |
This is still an issue in 3.5 (though in my case it was __mulodi4). The only workaround I've found is to overtly link against libclang_rt.builtins-${ARCH}, which requires you tell the compiler where to look. |
You can link with builtins by providing --rtlib=compiler-rt. This line works for me: |
Please can you reopen this bug? It's already clear as per the discussion back in 2013 that you can work around this by manually linking against the runtime. The bug is exactly that the compiler is failing to automatically link against its own runtime. The user should not be having to manually work around the failures of the compiler like that. |
Right, sorry. I agree we should find out if __muloti4 is available, and, if it's not, either fallback to manually generated code, or avoid emitting the check altogether. |
This just bit us in Chromium. One of our developers was trying to use __builtin_mul_overflow, which Clang supports but not really? int f(__int128 a, __int128 b, __int128 *p) { $ bin/clang /tmp/a.c How can we make this work out of the box? |
Easiest would be to always link against compiler-rt. I think we're in a position to start making that a reality at this point. Thoughts? |
Always linking RT means always building the huge baggage that comes with it and testing, which add a big strain on our bots. If there was a way to only build the builtins, I'd gladly always include that, and even make that default on ARM. |
Do you mean building all of the sanitizer bits etc? I'm not sure I'd count it as "huge baggage", but our definitions might be different. Can you elaborate here? |
Compiler-RT, on a clean build, adds about 50% more files to build, tests to build and to run. The sanitizer tests also take a lot longer than the quick clang/llvm tests. On ARM, a clean LLVM+Clang takes at least 1h to build and 16 minutes to check-all. With RT it goes to 2h to build and 40 minutes to check-all. Of that, only a small amount is due to the builtins, and by far the largest part is due to the sanitizers, profilers, safestack, xray, etc. They are all cool tools, but they also slow down a lot our testing. To be honest, I'd be very happy if we moved the builtins inside LLVM (or Clang) and left Compiler-RT as the "cool run-time tools we have". They have very different purposes, very different footprints, and very specific dependencies. It would also make it easier to "always include the builtins" on all builds because they'd be guaranteed to be there. |
It definitely seems time to me to switch clang's default to clang_rt rather than libgcc_s. Note that compiler-rt already supports building just the builtins library and not the sanitizers etc. Am I right in thinking that the "building and testing a huge amount of stuff" concern is really about the set of targets built/tested by "make all" / "make check"? If so, it seems like one way to address that would be a configure-time setting to specify which pieces should be part of "all". I expect we'll need such a mechanism anyway once we switch to a monorepo. |
Compiler-RT's CMake is a big mess at the moment, especially for cross-building. It would be very hard to validate all configurations where Clang/LLVM work perfectly fine today if RT was made a requirement. As I said before, It'd be perfectly fine requiring just the builtins if they were either part of Clang/LLVM's repositories or on a repository in itself.
No. As above, separating the libraries in a clean fashion only works on the platforms that were made work, and a lot of work still needs to be done (Ask Chris M., Stephen H., Saleem A., Jonathan R., Joerg S. and others), before RT builds and tests correctly on the platforms Clang already works. The "make check-all" is a problem only as long as the issue above is a problem. Once we have a CMake that can only build certain components and knows which one of them to test based on that, we could "require" any number of components for its sub-components.
As was largely discussed for the past few years, this is not a small task, not a task that any one person (or company) alone can do it. I agree that requiring our own builtins is a very important thing to do, not just for dog feeding, but also to evolve as a platform on its own. But until we can solve those problems, requiring Compiler-RT is not an option. |
Attempting to compile GNU m4 for Windows commandline, I get Toolchain: LLVM 5.0 for Windows 64-bit (from LLVM binary repository) How to reproduce, I'm running make from Ubuntu WSL like so I checked clang_rt.builtins-x86_64.lib and confirmed that it contain muloti4.c.obj, but lack the symbol __muloti4. As a quick workaround I thought I could just compile muloti4.c separate and add it to the build. However well, it's not that easy. First we have #if (defined(LP64) || defined(wasm) || defined(__mips64)) Windows have LLP64 so this will not be defined, which also effectively disables the __muloti4 implementation (see muloti4.c), thus explains why the symbol is not present in the clang_rt.builtins-x86_64.lib. If I define CRT_HAS_128BIT anyway the code is not portable as is. I have tried a quick port with intel intrinsics __m128, but well ... no. Now I wonder regards Erik |
I can solve the dependencies above by compiling using clang.exe and define CRT_HAS_128BIT |
I know that we don't always know if we're using compiler_rt at compile time (as opposed to link time), but would it be feasible to define __has_builtin(__builtin_mul_overflow) to be false when we know we're going to be linking with libgcc? |
Linking against clang-11, libstdc++11, Ubuntu 20.04. |
UPD: adding |
It's a shame that 8 years have passed and this is still a problem, and linking libgcc is still the solution. :( With compiler-rt and the other libraries in the monorepo, it should be easier to build the basic runtime library (without santitizers and other tools) as part of the official Clang build. One problem we had back then was that a lot of downstream projects want modularity wrt components, ie. not force dependencies when they have their own alternatives in-house. GCC doesn't have that problem, of course, but this is a big part in what makes LLVM different than GCC. But I think the LLVM project is mature enough today to have a "toolchain" build, where we use our own components and build an actual toolchain. Something like "ninja toolchain && ninja check", which includes all components and allows distros and users to make sure they build and link all components, so we don't need third party stuff any more. This bug is as relevant today as it was back then. Perhaps even more so. Users shouldn't need to guess they need to build their own RT sources or include libgcc when using clang... |
This affects apple clang on macos with system libc++. No libgcc generally |
I am having some troubles with --rtlib=compiler-rt: llvm/llvm-bugzilla-archive#51227 |
I fixed the bug in the initial report in d0eeb64 / https://reviews.llvm.org/D108928 in clang-14. So closing this as resolved/fixed. That said, it's kind of tragic from a code size perspective that we can't use these compiler-rt-only symbols. It's also tragic IMO that clang doesn't use LLVM's own compiler runtime (or linker for that matter) by default. And because it doesn't, there's less incentive to ever get those working for a platform; if we can just depend on libgcc (or ld64), why put any effort into compiler-rt (or LLD) for those platforms? Part of me thinks it would be nice to still emit libcalls to mulodi/muloti and friends for platforms that do currently link against compiler-rt. To do that, LLVM would need knowledge of which compiler runtime was being linked against. Clang knows this (ToolChain::GetRuntimeLibType and ToolChain::GetDefaultRuntimeLibType are used to make code gen decisions in clang), but LLVM doesn't AFAICT. Perhaps we should have module level IR for this that clang can set? Then LLVM can make better code gen decisions? The one thing I haven't checked yet is whether one can specify something like Should muloti/mulodi just be deleted from compiler-rt? At the least, we should file a feature request for libgcc to add these. Another part of me thinks it would be nice to switch the default compiler runtime to compiler-rt for clang, and encourage platforms to get their compiler runtimes working with compiler-rt. Is having the compiler builtins separated from the reset of compiler-rt interesting? Perhaps. That feels orthogonal to the above point about module level IR, and not a yak I intend to shave. |
I think you have mismatched expectations of compiler-rt (and libgcc), but one that makes a lot of sense, regardless of how hard it is to fix this problem. When GCC was the only compiler, libgcc had a 1:1 relationship with it, so it could always say "I know I can use that". Other compilers had their own runtime and pretty much the same relationship. When languages are designed together with the compiler and runtime library, they all have the same certainty relationship with each other. However, Clang/LLVM came along as a compiler, with its own runtime, but not one that had all functionality and quirks required to build Linux binaries. Once we started pushing for people to use Clang on Linux, the immediate problem was GCC compatibility. We had to emulate GCC extensions and bugs, but we couldn't simply copy libgcc functions to compiler-rt (license wise), so the only practical way was to use libgcc directly. GCC probably still doesn't care about what compiler-rt implements or doesn't, or what Clang does it for that matter. I tried to create a communication channel between GCC and LLVM worlds (at the GNU cauldron) but they voiced a strong disdain for the idea. I only wanted to create an external standard that any Linux compiler would follow (language extensions, runtime functionality) but I believe they've seen it as a way for the LLVM community to meddle in the GCC's design choices. So, to make upstream Clang work on Linux, there are two remaining alternatives:
The first one is how we solved this issue. The second one would be a disaster for the users. Bear in mind that this only applies to upstream Clang. Downstream clang (Apple, Google, Arm, Intel, Microsoft, etc) can all control what they build and what runtime libraries they use and can happily use compiler-rt as a base for their own runtime. For your proposal to add this functionality to libgcc, it fixes this particular problem only when compiling on Linux assuming libgcc compatibility. It does not fix the bigger problem of compatibility, not it offers any progress in a good direction. Worse still, we end up with the same problem as "if libgcc_version > x.y.z, then use muloti4", which comes back to the same problem you describe when the compiler assumes a different runtime than the linker uses. |
Thanks for the additional thoughts and context. Sorry for hijacking this bug, but this context is very useful to me and being to link to it may come in handy in the future (modulo any moves to github issues).
What are the implications for the second? I can understand relying on libgcc until compiler-rt was a viable substitute; I guess I'm curious at what point that flips?
Then it seems like we're stuck in time with whatever libgcc has implemented from 100 years ago, or whatever a more reasonable/less hyperbolic time frame is. |
Assuming compiler-rt always comes with Clang has multiple implications. If we want people to be able to seemingly switch between Clang and other compilers on their natural habitat, we'll have to mirror the runtime libraries of all compilers we want to emulate, which is not limited to GCC, MSVC, ICC, IBM, etc. by either:
Either strategies is unfeasible for an open source project focused on common compiler infrastructure. Otherwise, we'll have to focus on what clang provides in compiler-rt and not be compatible with other "native" compilers, meaning native compiler users won't be able to try clang out on their code, and clang users won't be able to interoperate with other compilers for static objects, etc. Furthermore, if support for one environment is better than another, clang users will have to worry about cross-platform AND cross-runtime support issues in their code, which will make it even less compatible with other compilers if they decide to change.
Clang is already "native" to a few environments, for example MacOS, FreeBSD, Mandriva, Android. Those environments have their own C/C++ libraries and use compiler-RT (plus additional stuff, I think) for the runtime support. Linux uses GCC, Windows uses MSVC, etc. and the use of their runtime evolves with the native compiler. If we want Clang to work at all on those environments (we do), then we need to support all libraries (and extensions, and bugs, etc) that the native compiler provides. To answer your question, I don't think there will be a time when it "flips" for Linux/Windows, because GCC/MSVC are always evolving, so libgcc/MSVC-RT will have constant changes for the future.
We're not stuck, we're following them, at their speed. :) We chose to re-use the existing runtime libraries to make it easier for people to try/switch to Clang and that has worked fantastically well. Now we're paying the maintenance cost (that we knew we'd pay). If GCC picks up But I don't think replacing all native runtime libraries with one compiler-RT to rule them all will have less maintenance than what we have today... |
This is far from "fixed": https://reviews.llvm.org/D141836 I created a proposal to address this here: https://discourse.llvm.org/t/proposal-split-built-ins-from-the-rest-of-compiler-rt/67978 |
This is still a problem with clang-19 |
On which target? I believe that this issue has been fixed in the meantime by marking this builtin as unavailable for most targets, but it's not unlikely that one was missed. |
In my case it's on aarch64 (fedora 41/Asahi Linux) - happens when trying to build Qt with ubsan. |
Yeah, it looks like the libcalls is still emitted on aarch64. For reference, here's how the change to disable this looks like (for ppc): c1a31ee |
The logic for marking runtime libcalls unavailable currently duplicates essentially the logic for some subset of targets, where someone reported an issue and then someone went and fixed the issue for that specific target. However, the availability for most of these is completely target independent. In particular: * MULO_I128 is never available in libgcc * Various I128 libcalls are only available for 64-bit targets in libgcc * powi is never available in MSVCRT Unify the logic for these, so we don't miss any targets. This fixes llvm#16778 on AArch64, which is one of the targets that was previously missed in this logic.
The logic for marking runtime libcalls unavailable currently duplicates essentially the same logic for some random subset of targets, where someone reported an issue and then someone went and fixed the issue for that specific target only. However, the availability for most of these is completely target independent. In particular: * MULO_I128 is never available in libgcc * Various I128 libcalls are not available for 32-bit targets in libgcc * powi is never available in MSVCRT Unify the logic for these, so we don't miss any targets. This fixes #16778 on AArch64, which is one of the targets that was previously missed in this logic.
Extended Description
clang fails to link code which does a multiply of two int128_t variables on x86_64 if the code is compiled with -fsanitize=undefined : it fails with "undefined reference to '__muloti4'. The same code compiles OK if the -fsanitize argument is omitted:
(I'm using the prebuilt clang binaries on Ubuntu 12.04:
http://llvm.org/releases/3.3/clang+llvm-3.3-amd64-Ubuntu-12.04.2.tar.gz)
The text was updated successfully, but these errors were encountered: