Clang and compiler-rt disagree on preconditions for __aeabi_llsr and __aeabi_llsl #62743
Comments
Given the way LLVM IR defines shifts, if we require any preconditions on __aeabi_llsr and __aeabi_llsl, LLVM would have to mask the shift amount of every shift. (There's no way to tell if a specific shift is being executed speculatively.) I'd prefer to define the functions in a way which avoids that. |
Preconditions would also prevent the optimization here, and I assume that optimization is beneficial. 😄 Perhaps the answer is to define it to be wrapping? I.e. Looking at the compiler-rt function, I think the only changes needed to make it wrapping are:
Inspecting the output, the two versions seem to generate comparable code. The only variation is sometimes But I'm not actually a compiler person, so will defer to you all. I only stick my nose into such things because C/C++ are such catastrophic UB disasters that it's impossible to use them without a little exposure to such things. So take this as just a suggestion from an exasperated C/C++ user that maybe we don't need UB here. :-) |
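To make the "wrapping" suggestion concrete, here is a minimal C sketch of a 64-bit logical right shift built from 32-bit halves that reduces the shift amount modulo 64. It is not the compiler-rt source and not the change proposed in the comment above; the name `llsr_wrapping` and the explicit `hi`/`lo` decomposition are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical wrapping variant: the shift amount is reduced modulo 64,
   so every value of b is accepted and no C-level shift is out of range. */
uint64_t llsr_wrapping(uint64_t a, unsigned b) {
    uint32_t hi = (uint32_t)(a >> 32);
    uint32_t lo = (uint32_t)a;
    b &= 63u;                       /* wrap: b is now in [0, 63] */
    if (b & 32u) {                  /* 32 <= b <= 63 */
        lo = hi >> (b - 32u);
        hi = 0;
    } else if (b != 0) {            /* 1 <= b <= 31 */
        lo = (hi << (32u - b)) | (lo >> b);
        hi >>= b;
    }                               /* b == 0: value unchanged */
    return ((uint64_t)hi << 32) | lo;
}
```

Under this definition a shift by 64 behaves like a shift by 0 and a shift by 68 like a shift by 4, which is one possible total contract, not necessarily the one an existing runtime implements.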
Making the implementation in compiler-rt wrap is easy enough; the issue is other existing implementations that don't wrap. Since this is part of the ABI we can't expect users to use the latest version of compiler-rt; they can use any compatible runtime. It should be safe to assume existing implementations never crash, though, since both clang and gcc speculatively call them with out-of-bounds values. For ARM specifically, these functions are defined as part of the platform ABI (https://github.com/ARM-software/abi-aa/blob/main/rtabi32/rtabi32.rst), so we should coordinate. CC @davemgreen @stuij ? |
The plot thickens further! Looks like Rust already ran into this at some point and made their function total on the latest revisions: Though I'm not sure if their function wraps. It might just return some arbitrary garbage? (Which is sufficient for the optimization in this case, just kinda weird.) @taiki-e FYI. |
Hmm, given these are defined in so many places, maybe we'll have to settle for "function is guaranteed to be total, but out-of-bounds inputs return some arbitrary unspecified value", just to avoid invalidating a bunch of impls. Do Clang and GCC only need the functions to be total, or do they have stricter requirements on them? |
Returning an arbitrary value is definitely sufficient for clang. Should also be fine for gcc, I think, although I haven't verified it. |
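A sketch in C of what "total, but unspecified out of range" could mean for a caller. The helper `llsr_total` is a hypothetical stand-in, not any real runtime's implementation, and its out-of-range result (here, the input returned unchanged) is an arbitrary choice. The point is that a guarded caller discards the helper's result whenever the amount is out of range, so any helper that terminates gives the same final answer.

```c
#include <stdint.h>

/* Hypothetical helper: correct for 0 <= b < 64, and for any other b it
   returns an arbitrary value (here simply a) without invoking UB. */
uint64_t llsr_total(uint64_t a, unsigned b) {
    return b < 64u ? a >> b : a;
}

/* Guarded shift in the speculated shape: the helper is called
   unconditionally and the guard only picks which value to keep. */
uint64_t guarded_shr(uint64_t x, unsigned r) {
    uint64_t shifted = llsr_total(x, r);  /* may run with r > 63 */
    return r > 63u ? 0 : shifted;
}
```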
`uint64_t{a} >> b` in C is only defined for `0 <= b < 64`. On 32-bit platforms, `uint64_t` is sometimes implemented with an intrinsic. But what are the preconditions on that intrinsic?

The compiler-rt implementations say they share the precondition. See the "Precondition:" comment. In code, they seem to also assume this. The `b & bits_in_word` check only makes sense because of the bound, and if `b` were too large, those functions (themselves written in C) would hit the C-level UB.

https://github.com/llvm/llvm-project/blob/689de4c6759fa810d827aee06a0ab060b01172ce/compiler-rt/lib/builtins/ashldi3.c
https://github.com/llvm/llvm-project/blob/2f4c96097a3cbf9d07eab699efd25f767bb4fdd5/compiler-rt/lib/builtins/lshrdi3.c
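
To illustrate why the `b & bits_in_word` test depends on the bound, here is a simplified C sketch in the spirit of the linked `lshrdi3.c`. It is not the verbatim compiler-rt source: the function name `lshr64_precond` and the explicit `hi`/`lo` variables are assumptions. With `0 <= b < 64`, a single bit of `b` tells whether the shift crosses the word boundary; outside that range the inner 32-bit shifts would themselves be undefined in C.

```c
#include <stdint.h>

/* Precondition: 0 <= b < 64. Behavior is undefined otherwise. */
uint64_t lshr64_precond(uint64_t a, int b) {
    const int bits_in_word = 32;
    uint32_t hi = (uint32_t)(a >> 32);
    uint32_t lo = (uint32_t)a;
    if (b & bits_in_word) {
        /* 32 <= b < 64 under the precondition. With b = 100 this test
           is also true, but then b - 32 = 68 is an out-of-range shift. */
        lo = hi >> (b - bits_in_word);
        hi = 0;
    } else {
        /* 0 <= b < 32 under the precondition. With b = 64 this branch
           is taken too, and 32 - b becomes a negative shift amount. */
        if (b == 0)
            return a;
        lo = (hi << (bits_in_word - b)) | (lo >> b);
        hi >>= b;
    }
    return ((uint64_t)hi << 32) | lo;
}
```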
However, Clang seems to believe otherwise:
https://godbolt.org/z/rjPx5j6ne
First, note that `-fsanitize=undefined` does not add checks to the guarded functions, only the unguarded functions. So Clang seems to believe that `r > 63 ? 0 : x >> r` is sufficient to discharge the preconditions. However, the generated code for the guarded functions looks like:

There's no branch before calling `__aeabi_llsr` and `__aeabi_llsl`. Clang seems to be transforming this into:

That is, Clang believes those functions are defined for all inputs. It doesn't seem to care what the output is, but it is relying on them to not diverge. This contradicts the compiler-rt implementation, so I think one needs to change. We have to either believe:
(No preferences on my end which option makes more sense.)
We ran into this experimenting with Rust in Chrome, because Rust also provides a copy of these functions. When building that with trapping overflow, we get a miscompile. Initially we thought this was just a Rust bug, but it seems the preconditions are unclear even without Rust. I expect a UBSan-built compiler-rt would have the same issues. Possibly further fun times if one ever does LTO here... (We'll file a bug with Rust that they need to make sure their functions match whatever's decided here, depending on which it is.)
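
The mismatch described above can be modeled in a few lines of C. This is a hypothetical sketch, not Clang's output and not the Rust code: `llsr_checked` stands in for a helper built with trapping overflow checks, and the `trapped` flag stands in for the actual trap so the effect can be observed without aborting. `shr_branchy` is the shape the source describes (`r > 63 ? 0 : x >> r`), and `shr_speculated` is the shape with no branch before the helper call.

```c
#include <stdint.h>

/* Stand-in for a trap: set instead of aborting, so it can be inspected. */
int trapped = 0;

/* Hypothetical helper that enforces the precondition 0 <= b < 64. */
uint64_t llsr_checked(uint64_t a, unsigned b) {
    if (b >= 64u) {
        trapped = 1;    /* a real trapping build would stop here */
        return 0;
    }
    return a >> b;
}

/* Source-level shape: the helper only runs when r is in range. */
uint64_t shr_branchy(uint64_t x, unsigned r) {
    if (r > 63u)
        return 0;
    return llsr_checked(x, r);
}

/* Speculated shape: the helper runs first, the guard selects afterwards. */
uint64_t shr_speculated(uint64_t x, unsigned r) {
    uint64_t shifted = llsr_checked(x, r);
    return r > 63u ? 0 : shifted;
}
```

Both functions return the same value for every input, but only `shr_speculated(x, 64)` reaches the check inside the helper, which is where a trapping build would fail.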