SIMD Misoptimization during "Promote 'by reference' arguments to scalars on SCC" #36706
Comments
assigned to @tstellar |
It's pretty clearly an LLVM bug. I'm not sure what part of LLVM, though; arguably, the bug is in the x86 backend and the way it chooses a calling convention based on target features. Related testcase; try with clang --target=i686-pc-freebsd: __attribute((target("sse2"),noinline)) static float x(float z) { return z+3; } This fails because we're transforming the call to x from the C calling convention to the LLVM-internal "fastcc" calling convention, and the fastcc convention uses SSE registers for arguments depending on the target features. |
Should this issue be moved to the x86 backend component then? |
Any X86 folks on this bug that can take a look? Sounds like this might be nice to get fixed for the release. |
This bug also appears to affect a Rust SIMD library of mine, preventing it from being used with recursive functions that inline into target_feature functions. Would love to see this fixed. |
The argument promotion pass needs to consider whether the caller/callee pair's attributes make the promotion in question valid. The inliner already has exactly this kind of check in place for things like target attributes. The downside is that this check in argument promotion is somewhat more subtle. However, I don't know that this makes sense to try to fix in time for the release... This bug seems likely to have always existed with LLVM's target attributes. =[ Eric, Craig, and I are probably the best people to talk about how to fix this long-term. |
Thanks, sounds like there's not enough traction to wait on this for the release. |
Inlining with different target feature functions shouldn't happen. There are checks to ensure that they're at least compatible. |
It is a generic function in Rust, which at Rust compile time becomes either an SSE2 or AVX2 function, and is then inlined into the appropriate target_feature-tagged functions. But when the function is recursive this breaks down. Alex Crichton took a look at the IL and thought this was likely the cause. |
If it's an inlining failure it's unrelated to this. You might want to open a different bug with a stripped down testcase there. |
In general, fixing this is going to require argument promotion to know ABI requirements from the back end in the presence of target attributes. This is probably going to be a little tricky, though I'm surprised that this hasn't caused problems before now. Right now the easiest mechanism is going to be a bit heavyweight: disable argument promotion in the face of conflicting target attributes. This is even more extreme than the inlining code, in that we don't want promotion to happen at all when the attributes conflict. Then each target can add ABI knowledge about how and whether we want arguments to be promoted after that. |
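To make the difference between the two checks concrete, here is a minimal sketch in C. It is a toy model, not LLVM code: the `fn_info` struct, the feature bits, and both function names are hypothetical, and the subset rule for inlining is an assumption about what "at least compatible" means in this thread. The stricter check simply refuses promotion unless both functions have identical features.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; a real target tracks many more. */
enum {
    FEAT_SSE2 = 1 << 0,
    FEAT_AVX  = 1 << 1,
    FEAT_AVX2 = 1 << 2
};

/* Toy stand-in for a function's target attributes (not an LLVM type). */
struct fn_info {
    const char *name;
    uint32_t features;
};

/* Inlining-style check (assumed rule): every feature the callee needs
 * must also be enabled in the caller. */
static bool inline_compatible(const struct fn_info *caller,
                              const struct fn_info *callee)
{
    return (caller->features & callee->features) == callee->features;
}

/* Heavyweight check for argument promotion: any mismatch in features
 * may change how a by-value argument is passed, so require equality. */
static bool promotion_compatible(const struct fn_info *caller,
                                 const struct fn_info *callee)
{
    return caller->features == callee->features;
}

static const struct fn_info baseline = { "baseline", FEAT_SSE2 };
static const struct fn_info avx2_fn  = { "avx2_fn",
                                         FEAT_SSE2 | FEAT_AVX | FEAT_AVX2 };

/* Usage: a baseline caller must not have its call into an AVX2 callee
 * rewritten to pass vectors by value. */
void check_examples(void)
{
    assert(inline_compatible(&avx2_fn, &baseline));
    assert(!promotion_compatible(&baseline, &avx2_fn));
    assert(promotion_compatible(&avx2_fn, &avx2_fn));
}
```

Equality is deliberately conservative: it blocks some promotions that would be safe, which is the gap that per-target ABI knowledge could later narrow.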
For the record, rustc now works around this problem by manually undoing argument promotion as a custom pass, see rust-lang/rust#55073. |
Is anyone working on a fix for this? If not, I will give it a try. |
I'm not, the advice I gave is probably how I'd go. I'm happy to look if you end up implementing it. |
Proposed Fix: https://reviews.llvm.org/D53554 |
For those following along with the rustc side of things, Nikita's prior comment is now inaccurate as #55073 has been reverted: rust-lang/rust#55281 . The sentiment now seems to be to just wait for an LLVM fix, and it's heartening to see that a patch is already being considered. |
A fix has been merged in r351296; can you verify that this is fixed? |
Thanks so much for landing the fix! We're updating rust-lang/rust in rust-lang/rust#57675 and I'll start running some tests with that once it's in-tree. In the meantime I'll go ahead and close this as resolved, and I'll follow up with more issues if they crop up, but I suspect we should be good to go! |
mentioned in issue #38454 |
Extended Description
We've got an upstream bug in rust-lang/rust at rust-lang/rust#50154 where LLVM at opt-level=3 is mis-optimizing promotion of an argument passed by reference to pass-by-value. The attached IR exhibits the difference by looking at:
Note that at opt-level=2 the two arguments to this function continue to be passed by reference, but at opt-level=3 they're promoted to being passed by value. In this situation the target function, _mm256_cmpgt_epi16, has the "avx2" feature enabled. The caller, baseline, does not have any extra target features enabled (aka doesn't have "avx2" available). This means that if attempting to pass by value this'll be an ABI mismatch at codegen time, producing invalid results on optimized IR.
Using opt-bisect-limit I found that this happens during the "Promote 'by reference' arguments to scalars on SCC" pass. Are we correct in thinking that this optimization shouldn't happen? Or is this a valid optimization that we'll need to work around on rustc's end?
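For readers unfamiliar with the transformation, the following portable C sketch shows its shape only. The `v256` struct is a hypothetical stand-in for a 256-bit vector type, and both functions are illustrative names, not code from the attached IR. A plain struct is used so the example runs anywhere; with a real vector type, how the by-value form is passed depends on the target features enabled for each function, which is where the caller/callee mismatch described above arises.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 256-bit vector of sixteen 16-bit lanes. */
struct v256 { int16_t lane[16]; };

/* Before promotion: the callee receives pointers and loads the data. */
static int count_gt_by_ref(const struct v256 *a, const struct v256 *b)
{
    int n = 0;
    for (int i = 0; i < 16; i++)
        n += a->lane[i] > b->lane[i];
    return n;
}

/* After promotion: the loads move to the caller and the values are
 * passed directly, so the calling convention for the value matters. */
static int count_gt_by_val(struct v256 a, struct v256 b)
{
    int n = 0;
    for (int i = 0; i < 16; i++)
        n += a.lane[i] > b.lane[i];
    return n;
}

/* Both forms must agree; the optimization is only valid when they do. */
int demo_promotion(void)
{
    struct v256 a = {{0}}, b = {{0}};
    for (int i = 0; i < 16; i++) {
        a.lane[i] = (int16_t)i;
        b.lane[i] = 8;
    }
    assert(count_gt_by_ref(&a, &b) == count_gt_by_val(a, b));
    return count_gt_by_val(a, b);
}
```

In this sketch the two forms always agree because caller and callee are compiled with the same settings; the bug reported here is the case where they are not.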