Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nontemporal_store: make sure that the intrinsic is truly just a hint #128149

Merged
merged 3 commits into from
Aug 12, 2024

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Jul 24, 2024

The !nontemporal flag for stores in LLVM sounds like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is highly dubious. Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the _mm_stream intrinsics are unaffected by this change.

Fixes #114582
Cc @Amanieu @workingjubilee

@rustbot
Copy link
Collaborator

rustbot commented Jul 24, 2024

r? @fmease

rustbot has assigned @fmease.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 24, 2024
@rustbot
Copy link
Collaborator

rustbot commented Jul 24, 2024

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rust-log-analyzer

This comment has been minimized.

@RalfJung RalfJung force-pushed the nontemporal_store branch from 05e50ef to 8d0d26e Compare July 24, 2024 19:40
@RalfJung RalfJung force-pushed the nontemporal_store branch 2 times, most recently from e2efa65 to d1e92af Compare July 24, 2024 20:03
@Kobzol
Copy link
Contributor

Kobzol commented Jul 24, 2024

It's a bit sad if this becomes an actual no-op on x86, nontemporal stores can be (very rarely) very useful. Is this usable through anything else than mm_stream_* and friends? These functions already contain a safety note that a fence has to be used after them.

@workingjubilee
Copy link
Member

It's a bit sad if this becomes an actual no-op on x86, nontemporal stores can be (very rarely) very useful. Is this usable through anything else than mm_stream_* and friends? These functions already contain a safety note that a fence has to be used after them.

Those intrinsic functions are, in the latest commits to std::arch, internally assembly. They should behave identically to how they did before, please file a bug if they emit notably worse codegen. The distinction is that, in general, LLVM does not try to perform store deduplication into assembly code.

@Kobzol
Copy link
Contributor

Kobzol commented Jul 25, 2024

So mm_stream will still emit nontemporal instructions on x86? Thanks for clarifying!

@RalfJung
Copy link
Member Author

Yes it will, that's why rust-lang/stdarch#1541 needs to propagate first.

@Kobzol
Copy link
Contributor

Kobzol commented Jul 25, 2024

Ok, great, then I misunderstood this change, sorry!

@workingjubilee
Copy link
Member

Thank you, @RalfJung.

@RalfJung RalfJung force-pushed the nontemporal_store branch from 0135bb1 to 697787a Compare August 5, 2024 08:57
@RalfJung
Copy link
Member Author

RalfJung commented Aug 5, 2024

rust-lang/stdarch#1541 has propagated, so this should be good to go. We haven't heard from @fmease though so let's re-roll the reviewer dice.

r? compiler

@rustbot rustbot assigned TaKO8Ki and unassigned fmease Aug 5, 2024
Copy link
Member

@jieyouxu jieyouxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The changes themselves look good based on the prior discussions. Just a question about the codegen test with regards to target coverage.

Comment on lines +740 to +741
const WELL_BEHAVED_NONTEMPORAL_ARCHS: &[&str] =
&["aarch64", "arm", "riscv32", "riscv64"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question [REV1 (2/2)]: If I understand it correctly, the codegen test is modified such that it only codegens for aarch64-unknown-linux-gnu target, but here we have 4 "well-behaved" architectures that we do emit !nontemporal for.

@jieyouxu
Copy link
Member

r? jieyouxu

@rustbot rustbot assigned jieyouxu and unassigned TaKO8Ki Aug 11, 2024
Copy link
Member

@jieyouxu jieyouxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, feel free to r=me,Amanieu,Jubilee after maybe fixing the nit :3

@RalfJung
Copy link
Member Author

@bors r=jieyouxu,Amanieu,Jubilee

@bors
Copy link
Contributor

bors commented Aug 12, 2024

📌 Commit 5be6c7d has been approved by jieyouxu,Amanieu,Jubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 12, 2024
@RalfJung
Copy link
Member Author

@bors r=jieyouxu,Amanieu,Jubilee

@bors
Copy link
Contributor

bors commented Aug 12, 2024

📌 Commit 75743dc has been approved by jieyouxu,Amanieu,Jubilee

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 12, 2024
…llaumeGomez

Rollup of 10 pull requests

Successful merges:

 - rust-lang#128149 (nontemporal_store: make sure that the intrinsic is truly just a hint)
 - rust-lang#128394 (Unify run button display with "copy code" button and with mdbook buttons)
 - rust-lang#128537 (const vector passed through to codegen)
 - rust-lang#128632 (std: do not overwrite style in `get_backtrace_style`)
 - rust-lang#128878 (Slightly refactor `Flags` in bootstrap)
 - rust-lang#128886 (Get rid of some `#[allow(rustc::untranslatable_diagnostic)]`)
 - rust-lang#128929 (Fix codegen-units tests that were disabled 8 years ago)
 - rust-lang#128937 (Fix warnings in rmake tests on `x86_64-unknown-linux-gnu`)
 - rust-lang#128978 (Use `assert_matches` around the compiler more)
 - rust-lang#128994 (Fix bug in `Parser::look_ahead`.)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 095ca33 into rust-lang:master Aug 12, 2024
6 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 12, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 12, 2024
Rollup merge of rust-lang#128149 - RalfJung:nontemporal_store, r=jieyouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
@RalfJung RalfJung deleted the nontemporal_store branch August 14, 2024 12:15
bjorn3 pushed a commit to bjorn3/rust that referenced this pull request Sep 22, 2024
…ouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
antoyo pushed a commit to antoyo/rust that referenced this pull request Jan 13, 2025
…ouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-temporal stores (and _mm_stream operations in stdarch) break our memory model
10 participants