Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIMD Misoptimization during "Promote 'by reference' arguments to scalars on SCC" #36706

Closed
alexcrichton opened this issue May 7, 2018 · 19 comments
Assignees
Labels
backend:X86 bugzilla Issues migrated from bugzilla

Comments

@alexcrichton
Copy link
Contributor

Bugzilla Link 37358
Resolution FIXED
Resolved on Jan 16, 2019 13:21
Version trunk
OS Windows NT
Blocks #38454
Attachments failing IR
CC @chandlerc,@topperc,@davidbolvansky,@echristo,@efriedma-quic,@gnzlbg,@zmodem,@hfinkel,@cuviper,@RKSimon,@nikic,@rotateright,@tstellar
Fixed by commit(s) r351296

Extended Description

We've got an upstream bug in rust-lang/rust at rust-lang/rust#50154 where LLVM at opt-level=3 is mis-optimizing promotion of an argument passed by reference to pass-by-value. The attached IR exhibits the difference by looking at:

$ opt -O2 tmp.ll -S | grep '^define.*m256'
define internal fastcc void @&#8203;_mm256_cmpgt_epi16(<4 x i64>* nocapture, <4 x i64>* nocapture readonly %a, <4 x i64>* nocapture readonly %b) unnamed_addr #&#8203;2 {
$ opt -O3 tmp.ll -S | grep '^define.*m256'
define internal fastcc void @&#8203;_mm256_cmpgt_epi16(<4 x i64>* nocapture, <4 x i64> %a.val, <4 x i64> %b.val) unnamed_addr #&#8203;2 {

Note that at opt-level=2 the two arguments to this function continue to be passed by reference, but at opt-level=3 they're promoted to being passed by value. In this situation the target function, _mm256_cmpgt_epi16, has the "avx2" feature enabled. The caller, baseline, does not have any extra target features enabled (aka doesn't have "avx2" available). This means that if attempting to pass by value this'll be an ABI mismatch at codegen time, producing invalid results on optimized IR.

Using opt-bisect-limit I found that this happens during the "Promote 'by reference' arguments to scalars on SCC" pass. Are we correct in thinking that this optimization shouldn't happen? Or is this a valid optimization that we'll need to work around on rustc's end?

@alexcrichton
Copy link
Contributor Author

assigned to @tstellar

@efriedma-quic
Copy link
Collaborator

It's pretty clearly an LLVM bug. I'm not sure what part of LLVM, though; arguably, the bug is in the x86 backend and the way it chooses a calling convention based on target features.


Related testcase; try with clang --target=i686-pc-freebsd":

__attribute((target("sse2"),noinline)) static float x(float z) { return z+3; }
float y(float z) { return x(z)+3; }

This fails because we're transforming the call to x from the C calling convention to the LLVM-internal "fastcc" calling convention, and the fastcc convention uses SSE registers for arguments depending on the target features.

@gnzlbg
Copy link
Mannequin

gnzlbg mannequin commented Aug 6, 2018

Should this issue be moved to the x86 backend component then ?

@zmodem
Copy link
Collaborator

zmodem commented Aug 16, 2018

Any X86 folks on this bug that can take a look? Sounds like this might be nice to get fixed for the release.

@llvmbot
Copy link
Member

llvmbot commented Sep 1, 2018

This bug appears to also be affect a Rust simd library of mine, preventing it from being used with recursive functions that inline into target_feature functions. Would love to set this fixed.

@chandlerc
Copy link
Member

Argument promotion pass needs to consider whether the caller/callee pair attributes make the promotion in question valid. The inliner already has exactly this kind of checks in place for things like target attributes.

The downside is that this check in argument promotion is somewhat more subtle.

However, I don't know that this makes sense to try to fix in time for the release... This bug seems likely to have always existed with LLVM's target attributes. =[

Eric, Craig, and myself are probably the best people to talk about how to fix this long-term.

@zmodem
Copy link
Collaborator

zmodem commented Sep 4, 2018

Thanks, sounds like there's not enough traction to wait on this for the release.

@echristo
Copy link
Contributor

This bug appears to also be affect a Rust simd library of mine, preventing
it from being used with recursive functions that inline into target_feature
functions. Would love to set this fixed.

Inlining with different target feature functions shouldn't happen. There are checks to ensure that they're at least compatible.

@llvmbot
Copy link
Member

llvmbot commented Oct 19, 2018

This bug appears to also be affect a Rust simd library of mine, preventing
it from being used with recursive functions that inline into target_feature
functions. Would love to set this fixed.

Inlining with different target feature functions shouldn't happen. There are
checks to ensure that they're at least compatible.

It is a generic function in Rust, which at Rust compile time becomes either an SSE2 or AVX2 function, and is then inlined into appropriate target_feature tagged functions. But when the function is recursive this breaks down. Alex Crichton took at look at the IL and thought this was likely the cause.

@echristo
Copy link
Contributor

This bug appears to also be affect a Rust simd library of mine, preventing
it from being used with recursive functions that inline into target_feature
functions. Would love to set this fixed.

Inlining with different target feature functions shouldn't happen. There are
checks to ensure that they're at least compatible.

It is a generic function in Rust, which at Rust compile time becomes either
an SSE2 or AVX2 function, and is then inlined into appropriate
target_feature tagged functions. But when the function is recursive this
breaks down. Alex Crichton took at look at the IL and thought this was
likely the cause.

If it's an inlining failure it's unrelated to this. You might want to open a different bug with a stripped down testcase there.

@echristo
Copy link
Contributor

Argument promotion pass needs to consider whether the caller/callee pair
attributes make the promotion in question valid. The inliner already has
exactly this kind of checks in place for things like target attributes.

The downside is that this check in argument promotion is somewhat more
subtle.

However, I don't know that this makes sense to try to fix in time for the
release... This bug seems likely to have always existed with LLVM's target
attributes. =[

Eric, Craig, and myself are probably the best people to talk about how to
fix this long-term.

In general fixing this is going to require argument promotion to know ABI requirements from the back end in the presence of target attributes. This is probably going to be a little tricky - though I'm surprised that this hasn't caused problems before now.

Right now the easiest mechanism is going to be a bit heavyweight: disable arg promotion in the face of conflicting target attributes even more extreme than the inlining code in that we don't want it to happen at all. Then each target can add ABI knowledge into how/whether we want arguments to be promoted after that.

@nikic
Copy link
Contributor

nikic commented Oct 21, 2018

For the record, rustc now works around this problem by manually undoing argument promotion as a custom pass, see rust-lang/rust#55073.

@tstellar
Copy link
Collaborator

Is anyone working on a fix for this? If not, I will give it a try.

@echristo
Copy link
Contributor

I'm not, the advice I gave is probably how I'd go. I'm happy to look if you end up implementing it.

@tstellar
Copy link
Collaborator

Proposed Fix: https://reviews.llvm.org/D53554

@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2018

For those following along with the rustc side of things, Nikita's prior comment is now inaccurate as #​55073 has been reverted: rust-lang/rust#55281 . The sentiment now seems to be to just wait for an LLVM fix, and it's heartening to see that a patch is already being considered.

@tstellar
Copy link
Collaborator

A fix has been merged in r351296, can you verify that this is fixed?

@alexcrichton
Copy link
Contributor Author

Thanks so much for landing the fix! We're updating rust-lang/rust in rust-lang/rust#57675 and I'll start running some tests with that once it's in-tree. In the meantime I'll go head and close this as resolved, and I'll follow-up with more issues if they crop up, but I suspect we should be good to go!

@tstellar
Copy link
Collaborator

mentioned in issue #38454

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

8 participants