Small SIMD test fails with --release but passes without #50154
Are you compiling with the avx2 target feature enabled?
Great question: I can avoid the bug by specifying But in the past, LLVM was able to polyfill the instructions down to SSE2 with stdsimd-0.0.4, so I didn't have to specify any flag at all. I've noticed that asking LLVM to polyfill the instructions from avx2 down to core-avx-i actually improves performance on most available AVX2 hardware unless your instructions are sufficiently dense. This is because AVX2 instructions downclock the chip for some time, so it's much better to keep the full clock speed and keep the AVX2 code around for newer chips like Skylake. I really loved writing AVX2 intrinsics and having the compiler match them to my desired architecture. Having 4 code paths (AVX2, AVX, SSE4.2 and SSE2) to match my desired targets is a significant support burden, so from my perspective the old behavior with the SSE2 polyfill was ideal.
This... really isn't what those intrinsics are for. Sometimes the path of least resistance for the compiler is to treat some intrinsics as a generic operation that can be lowered to other instruction sets as well, but that is not at all guaranteed. If you want something portable across different SIMD instruction sets, you should use the (in-development) portable SIMD types, not AVX2 intrinsics.
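For readers weighing the "several code paths" burden discussed above, here is a minimal sketch of runtime dispatch using only the standard library. It is not what either commenter proposed; the function names `add_one`, `add_one_avx2` and `add_one_fallback` are hypothetical, and whether the compiler actually vectorizes either loop is not guaranteed.

```rust
/// Scalar fallback (hypothetical name). The compiler may or may not
/// auto-vectorize this loop for the baseline target.
fn add_one_fallback(data: &mut [i16]) {
    for x in data.iter_mut() {
        *x = x.wrapping_add(1);
    }
}

/// Same loop, compiled with AVX2 enabled for this function only.
/// The data is passed as a slice, so no SIMD vector crosses the call boundary.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_one_avx2(data: &mut [i16]) {
    for x in data.iter_mut() {
        *x = x.wrapping_add(1);
    }
}

/// Picks a code path at run time based on what the CPU reports.
pub fn add_one(data: &mut [i16]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked on the line above.
            unsafe { add_one_avx2(data) };
            return;
        }
    }
    add_one_fallback(data);
}
```

Calling `add_one(&mut values)` gives the same result on every path; only the generated instructions differ.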
Good suggestion about the portable SIMD types: I was able to make this helper function, which seems to translate into the AVX2 intrinsic when needed. Would this kind of thing be worth providing for all portable SIMD types?

#[inline(always)]
fn cmp_gt_i16x16(lhs: i16x16, rhs: i16x16) -> i16x16 {
    // A negative difference means lhs > rhs (as long as rhs - lhs does not overflow).
    let lz = rhs - lhs;
    // Keep only the sign bit of each lane, then smear it across the lane.
    let sign_bit = lz & i16x16::splat(-32768);
    sign_bit >> 15
}

I do, however, think that this should either error out in the development build, or preferably yield a compiler error (or at least SIGILL) instead of providing wrong arithmetic results in release. Also, I suspect this is a recent LLVM bug in their polyfill, and it may still be worth correcting.
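To make the sign-bit trick in the helper above easier to check, here is a scalar model of a single lane. This is a sketch only: `cmp_gt_lane` is a hypothetical name, and it assumes the vector subtraction wraps on overflow the way `wrapping_sub` does.

```rust
/// Scalar model of one lane of the helper above (hypothetical name).
/// Returns -1 (all bits set) when `lhs > rhs` and 0 otherwise,
/// provided `rhs - lhs` does not overflow an i16.
fn cmp_gt_lane(lhs: i16, rhs: i16) -> i16 {
    // Assumption: lane arithmetic wraps on overflow.
    let lz = rhs.wrapping_sub(lhs);
    // Isolate the sign bit: the result is either 0 or i16::MIN.
    let sign_bit = lz & i16::MIN;
    // Arithmetic shift copies the sign bit into every bit position.
    sign_bit >> 15
}
```

The model also shows a caveat: when the difference overflows, for example `lhs = i16::MAX` and `rhs = i16::MIN`, the wrapped difference is positive and the function reports 0 even though `lhs > rhs`. A dedicated greater-than comparison does not have this limitation.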
This is an issue with opt-level 3 specifically, and I believe it is a bug inside of LLVM. The problem is that we're passing all arguments by reference (the SIMD arguments) and LLVM is accidentally promoting them to by-value, which is known to produce bugs. Specifically LLVM's I've opened an upstream LLVM bug at https://bugs.llvm.org/show_bug.cgi?id=37358
fn baseline() {
    let data = i16x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
    let one_to_16 = i16x16::new(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
    let output = one_to_16.gt(i16x16::splat(2i16)).select(data + 1i16, data);
    // note: if the mask is often false for all lanes you could guard the select
    // behind an `if mask.any() { ... }`
    assert_eq!(
        output,
        i16x16::new(4, 8, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65)
    );
}
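As a cross-check on the expected vector in the test above, the same computation can be written lane by lane with plain arrays. This is only a scalar reference sketch; `baseline_scalar` is a hypothetical name and nothing here uses a SIMD crate.

```rust
/// Scalar reference for the test above (hypothetical name): for each lane,
/// take `data + 1` where the mask `one_to_16 > 2` is true, else `data`.
fn baseline_scalar() -> [i16; 16] {
    let data: [i16; 16] = [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64];
    let one_to_16: [i16; 16] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];
    let mut output = [0i16; 16];
    for i in 0..16 {
        // Equivalent of `mask.select(data + 1, data)` for a single lane.
        output[i] = if one_to_16[i] > 2 { data[i] + 1 } else { data[i] };
    }
    output
}
```

Only the first two lanes fail the `> 2` comparison, so they stay at 4 and 8 while every other lane is incremented. Any build that produces a different vector for the SIMD version is miscompiled.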
Is there a way to tell which of the I-unsound issues affect stable Rust? There are a lot of them, and many affect only nightly Rust, but it is hard to tell them apart.
Thanks for the extra pings @gnzlbg! Let's see how that plays out...
@gnzlbg Your playground does not work anymore :(
@hellow554 you need to use the packed_simd crate, |
So it appears that this won't be fixed in LLVM any time soon, and AFAICT this is not something we can easily warn about on the Rust side of things for the time being :/
I find this bug unfortunate, as I'm trying to build safe wrappers (called fearless_simd), but this basically makes that approach unworkable. If the bug won't be fixed soon, maybe we should document the danger zone. I read the LLVM issue. It's interesting that this bug has persisted so long without getting triggered; it's evidence that the ways people use C++ and Rust are quite different in spite of the similar approaches to zero-cost abstractions etc.
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has seen [quite a lot] of [discussion], and although we thought [we fixed it][quite a lot] it [wasn't]! This PR is a change to rustc to, again, try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument is passed by-value in LLVM's function type, then if the caller and callee disagree on target features a miscompile happens. We solve this by never passing SIMD vectors by-value, but LLVM will still thwart us with its argument promotion pass to promote by-ref SIMD arguments to by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before codegen, will take yet another look at the LLVM module and demote any by-value SIMD arguments we see. This is a very manual attempt by us to ensure the codegen for a module keeps working, and it unfortunately is likely producing suboptimal code, even in release mode. The saving grace for this, in theory, is that if SIMD types are passed by-value across a boundary in release mode it's pretty unlikely to be performance sensitive (as it's already doing a load/store, and otherwise perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely copied from LLVM's own argument promotion pass, only doing the reverse. In local testing this...

Closes #50154
Closes #52636
Closes #54583
Closes #55059

[quite a lot]: #47743
[discussion]: #44367
[wasn't]: #50154
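The "never pass SIMD vectors by-value" rule described above can be illustrated at the source level. The following is a rough sketch and not rustc's actual fix, which operates on the LLVM module: the function names are hypothetical, and the point is only that the 256-bit value stays in memory across the boundary between code compiled with and without AVX2.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: adds 1 to each of 16 lanes using AVX2.
/// The lanes cross the function boundary by reference, so the caller and
/// callee never have to agree on how a by-value `__m256i` is passed.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_one_in_place_avx2(lanes: &mut [i16; 16]) {
    let ptr = lanes.as_mut_ptr() as *mut __m256i;
    // Unaligned load/store, since `[i16; 16]` is only 2-byte aligned.
    let v = _mm256_loadu_si256(ptr as *const __m256i);
    let sum = _mm256_add_epi16(v, _mm256_set1_epi16(1));
    _mm256_storeu_si256(ptr, sum);
}

/// Safe entry point compiled without AVX2; falls back to scalar code.
fn add_one_in_place(lanes: &mut [i16; 16]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked at run time just above.
            unsafe { add_one_in_place_avx2(lanes) };
            return;
        }
    }
    for lane in lanes.iter_mut() {
        *lane = lane.wrapping_add(1);
    }
}
```

The extra load and store is the cost the PR text mentions; when the callee is inlined, the compiler is free to keep the vector in a register.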
I'm posting a revert for the fix in #55281 because I don't think the fix was quite right (causing segfaults for me). LLVM, however, in the meantime should have an official fix, so this should hopefully get closed out in the near future once that lands.
Upstream fix has landed in Rust's LLVM fork: rust-lang/llvm-project@3d36e5c
Does this still reproduce with nightly? If not we can close this.
Is there any reason this is not yet closed?
Full repro here:
https://github.com/danielrh/simd_playground run with
cargo test --release
failing test is here:
using:
on OSX 10.12.6 (16G1314)
and also on Linux
I was unable to reproduce the above problem by making a simple main with the above function, so it could be something to do with the build options.
One more note: the exact same test and code have been working for months with the stdsimd 0.0.3 and 0.0.4 crates. I couldn't get that crate to build with nightly 1.27.0, so I couldn't see if it still worked.