Fix alignment passed down to LLVM for simd_masked_load #118864

Merged
merged 1 commit into from
Dec 13, 2023

farnoy
Contributor

@farnoy farnoy commented Dec 12, 2023

Follow-up to #117953.

The alignment passed for a masked load operation should be that of the element/lane, not that of the vector as a whole.

Passing the vector's alignment can cause miscompilations: once the LLVM optimizer notices the higher alignment, it can promote the masked load to an unmasked, aligned load followed by a blend/select. Example: https://rust.godbolt.org/z/KEeGbevbb
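
To make the element-versus-vector distinction concrete, here is a minimal sketch in stable Rust. It does not use the real intrinsic: `Vec2F32` is a hypothetical stand-in for a two-lane vector type, and its `align(8)` is an assumption chosen for illustration, as are the helper names `lane_align`, `vector_align`, `is_aligned_to`, and `demo`. It shows a pointer to an `f32` that is valid for a per-lane access but does not satisfy the larger whole-vector alignment.

```rust
use std::mem::align_of;

/// Hypothetical stand-in for a two-lane SIMD vector. The `align(8)` mimics a
/// whole-vector alignment and is an assumption made for this sketch.
#[repr(C, align(8))]
#[derive(Clone, Copy)]
struct Vec2F32([f32; 2]);

/// Alignment of a single lane: what a masked load can safely claim.
fn lane_align() -> usize {
    align_of::<f32>()
}

/// Alignment of the vector as a whole (under the assumption above).
fn vector_align() -> usize {
    align_of::<Vec2F32>()
}

/// Returns true if `ptr` is a multiple of `align` bytes.
fn is_aligned_to(ptr: *const f32, align: usize) -> bool {
    (ptr as usize) % align == 0
}

/// Builds a pointer to the second `f32` of 8-aligned storage and reports
/// whether it meets (lane alignment, vector alignment).
fn demo() -> (bool, bool) {
    let storage = [Vec2F32([1.0, 2.0]), Vec2F32([3.0, 4.0])];
    let base = storage.as_ptr() as *const f32;
    // One element past an 8-aligned address: still inside `storage`.
    let p = unsafe { base.add(1) };
    (is_aligned_to(p, lane_align()), is_aligned_to(p, vector_align()))
}
```

On targets where `f32` is 4-byte aligned, `demo()` returns `(true, false)`: the pointer is fine for element-wise access, yet an alignment claim of 8 on it would be false.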

@rustbot
Collaborator

rustbot commented Dec 12, 2023

r? @compiler-errors

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review and T-compiler labels Dec 12, 2023
@@ -21,7 +21,7 @@ extern "platform-intrinsic" {
 #[no_mangle]
 pub unsafe fn load_f32x2(mask: Vec2<i32>, pointer: *const f32,
                          values: Vec2<f32>) -> Vec2<f32> {
-    // CHECK: call <2 x float> @llvm.masked.load.v2f32.p0(ptr {{.*}}, i32 {{.*}}, <2 x i1> {{.*}}, <2 x float> {{.*}})
+    // CHECK: call <2 x float> @llvm.masked.load.v2f32.p0(ptr {{.*}}, i32 4, <2 x i1> {{.*}}, <2 x float> {{.*}})
Contributor Author

This serves as a regression test; it fails without the code change in intrinsic.rs.

@farnoy
Contributor Author

farnoy commented Dec 12, 2023

r? @workingjubilee

@jhorstmann
Contributor

The alignment changes look good to me.

However, I think in the godbolt example the transformation to an unmasked load is more likely caused by the dereferenceable(16) attribute on the [i32; 4] parameters plus maybe some cost heuristics of aligned vs unaligned loads.

@workingjubilee
Member

workingjubilee commented Dec 12, 2023

Yes. The memory behind a reference is already known to be readable: with this alignment change, whenever a &[i32; 4] is pointer-cast and then passed to simd_masked_load, it is always acceptable to promote the resulting llvm.masked.load to a simple unaligned load of the vector type, followed by an and-or dance with the mask (or a reg-to-reg move under a writemask, depending on the ISA) to produce the desired value.

Consider the following instead, which still demonstrates the alignment problem without the transformation to the unmasked load, because the origin is a pointer-to-element: https://rust.godbolt.org/z/scM87WbjW

@bors r+
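
The two cases discussed in this thread can be modeled in scalar Rust. This is only a sketch under stated assumptions: `masked_load_model`, `load_then_select`, and `load_from_last_element` are hypothetical helpers written for illustration, not the `simd_masked_load` intrinsic or anything LLVM emits. The first function reads only the enabled lanes; the second is the promoted form (read everything, then select), which is acceptable when the origin is a `&[i32; 4]` because all 16 bytes are readable; the third starts from a pointer to a single element, where only lane 0 is known to be readable.

```rust
/// Scalar model of a four-lane masked load (hypothetical helper).
///
/// Safety: every lane whose mask entry is `true` must be readable via `ptr`.
unsafe fn masked_load_model(mask: [bool; 4], ptr: *const i32, passthru: [i32; 4]) -> [i32; 4] {
    let mut out = passthru;
    for lane in 0..4 {
        if mask[lane] {
            // Only enabled lanes are read, one element at a time, so each
            // read needs nothing more than `i32` alignment.
            out[lane] = unsafe { ptr.add(lane).read() };
        }
    }
    out
}

/// The promoted form: read all four lanes, then select per lane.
/// Sound here because `&[i32; 4]` guarantees the whole array is readable.
fn load_then_select(mask: [bool; 4], array: &[i32; 4], passthru: [i32; 4]) -> [i32; 4] {
    let all = *array; // unmasked read of every lane
    let mut out = passthru;
    for lane in 0..4 {
        if mask[lane] {
            out[lane] = all[lane];
        }
    }
    out
}

/// Pointer-to-element origin: only the last element of `data` is known to be
/// readable, so only lane 0 is enabled. Panics if `data` is empty.
fn load_from_last_element(data: &[i32]) -> [i32; 4] {
    let last: *const i32 = &data[data.len() - 1];
    unsafe { masked_load_model([true, false, false, false], last, [0, -1, -2, -3]) }
}
```

For a full array both models agree, for example `load_then_select([true, false, true, false], &[10, 20, 30, 40], [0; 4])` gives `[10, 0, 30, 0]`. In the pointer-to-element case there is no whole vector behind the pointer, so only the masked form is meaningful, and only element alignment can be claimed.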

@bors
Contributor

bors commented Dec 12, 2023

📌 Commit 95b5a80 has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Dec 12, 2023

@workingjubilee
Member

@bors rollup

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2023
…kingjubilee

Rollup of 10 pull requests

Successful merges:

 - rust-lang#118858 (Remove dead codes in core)
 - rust-lang#118864 (Fix alignment passed down to LLVM for simd_masked_load)
 - rust-lang#118872 (Add rustX check to codeblock attributes lint)
 - rust-lang#118873 (fix `waker_getters` tracking issue number)
 - rust-lang#118884 (NFC: simplify merging of two vecs)
 - rust-lang#118885 (clippy::complexity fixes)
 - rust-lang#118886 (Clean up variables in `search.js`)
 - rust-lang#118887 (Typo)
 - rust-lang#118889 (more clippy::complexity fixes)
 - rust-lang#118891 (Actually parse async gen blocks correctly)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit a33f1a3 into rust-lang:master Dec 13, 2023
11 checks passed
@rustbot rustbot added this to the 1.76.0 milestone Dec 13, 2023
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2023
Rollup merge of rust-lang#118864 - farnoy:masked-load-store-fixes, r=workingjubilee

@farnoy farnoy mentioned this pull request Dec 13, 2023
6 tasks
@workingjubilee workingjubilee added the A-SIMD and PG-portable-simd labels Dec 14, 2023
celinval added a commit to celinval/rust-dev that referenced this pull request Jun 4, 2024
Update Rust toolchain from nightly-2023-12-13 to nightly-2023-12-14
without any other source changes.
This is an automatically generated pull request. If any of the CI checks
fail, manual intervention is required. In such a case, review the
changes at https://github.com/rust-lang/rust from
rust-lang@3340d49
up to
rust-lang@eeff92a.
The log for this commit range is:
rust-lang@eeff92ad32 Auto merge of
rust-lang#118402 - notriddle:notriddle/ranking-and-filtering, r=GuillaumeGomez
rust-lang@a90372c6e8 Auto merge of
rust-lang#118213 - Urgau:check-cfg-diagnostics-rustc-cargo, r=petrochenkov
rust-lang@2862500152 Auto merge of
rust-lang#118919 - matthiaskrgr:rollup-02udckl, r=matthiaskrgr
rust-lang@bec6672984 rustdoc-search:
clean up handleSingleArg type handling
rust-lang@9dfcf131b3 rustdoc-search:
better hashing, faster unification
rust-lang@9a9695a052 rustdoc-search: use
set ops for ranking and filtering
rust-lang@fd1d256d61 rustdoc-search:
remove the now-redundant `validateResult`
rust-lang@251d1af0d2 Rollup merge of
rust-lang#118906 - Kobzol:bootstrap-is-windows, r=petrochenkov
rust-lang@666353e7ba Rollup merge of
rust-lang#118883 - HosseinAssaran:patch-1, r=fmease
rust-lang@1dd36119d0 Rollup merge of
rust-lang#118871 - tmiasko:coroutine-maybe-uninit-fields, r=compiler-errors
rust-lang@dbc6ec6636 Rollup merge of
rust-lang#118759 - compiler-errors:bare-unit-structs, r=petrochenkov
rust-lang@f6617d050d Remove dangling
check-cfg ui tests files
rust-lang@5345a166fe Add more suggestion
to unexpected cfg names and values
rust-lang@7176b8babd Auto merge of
rust-lang#118894 - dtolnay:bootstrapwrite, r=onur-ozkan
rust-lang@c3def263a4 Auto merge of
rust-lang#118870 - Enselic:rustc_passes-query-stability, r=compiler-errors
rust-lang@56d25ba5ea Auto merge of
rust-lang#118500 - ZetaNumbers:tcx_hir_refactor, r=petrochenkov
rust-lang@2fdd9eda0c Auto merge of
rust-lang#118534 - RalfJung:extern-type-size-of-val, r=WaffleLapkin
rust-lang@066e6ffa02 Fix LLD thread flag
selection for Windows targets
rust-lang@c5208518fa Add
`TargetSelection::is_windows` method
rust-lang@f651b436ce Auto merge of
rust-lang#117050 - c410-f3r:here-we-go-again, r=petrochenkov
rust-lang@9f1bfe53b6 Auto merge of
rust-lang#118900 - workingjubilee:rollup-wkv9hq1, r=workingjubilee
rust-lang@f9078a40ee Rollup merge of
rust-lang#118891 - compiler-errors:async-gen-blocks, r=eholk
rust-lang@4583a0134f Rollup merge of
rust-lang#118889 - matthiaskrgr:compl_2023_2, r=WaffleLapkin
rust-lang@df0686b629 Rollup merge of
rust-lang#118887 - smoelius:patch-1, r=Nilstrieb
rust-lang@2f937c720d Rollup merge of
rust-lang#118886 - GuillaumeGomez:clean-up-search-vars, r=notriddle
rust-lang@5308733112 Rollup merge of
rust-lang#118885 - matthiaskrgr:compl_2023, r=compiler-errors
rust-lang@89d4a9bee9 Rollup merge of
rust-lang#118884 - matthiaskrgr:auszweimacheins, r=Nadrieril
rust-lang@18e0966f39 Rollup merge of
rust-lang#118873 - lukas-code:fix_waker_getter_tracking_issue_number,
r=workingjubilee
rust-lang@0430782d1d Rollup merge of
rust-lang#118872 - GuillaumeGomez:codeblock-attr-lint, r=notriddle
rust-lang@a33f1a3d3a Rollup merge of
rust-lang#118864 - farnoy:masked-load-store-fixes, r=workingjubilee
rust-lang@2d1d443d7f Rollup merge of
rust-lang#118858 - mu001999:dead_code/clean, r=cuviper
rust-lang@77d1699756 Auto merge of
rust-lang#116438 - ChrisDenton:truncate, r=thomcc
rust-lang@b30e94b7bb Unbreak non-unix
non-windows bootstrap
rust-lang@1d78ce681e Actually parse async
gen blocks correctly
rust-lang@2a1acc26a0 Update
compiler/rustc_pattern_analysis/src/constructor.rs
rust-lang@3795cc8eb0 more
clippy::complexity fixes
rust-lang@046f2dea33 Typo
rust-lang@58327c10c5 Add a test for a
codeblock with multiple invalid attributes
rust-lang@f1342f30a5 Clean up variables
in `search.js`
rust-lang@d707461a1a clippy::complexity
fixes
rust-lang@6892fcd690 simplify merging of
two vecs
rust-lang@a2ffff0708 Change a typo
mistake in the-doc-attribute.md
rust-lang@f813ccd784 also add a Miri test
rust-lang@edcb7aba6b also test projecting
to some sized fields at non-zero offset in structs with an extern type
tail
rust-lang@a47416beb5 test that both
size_of_val and align_of_val panic
rust-lang@bb0fd665a8 Follow guidelines
for lint suggestions
rust-lang@98aa20b0a7 Add test for `rustX`
codeblock attribute
rust-lang@d3cb25f4cf Add `rustX` check to
codeblock attributes lint
rust-lang@24f009c5e5 Move some methods
from `tcx.hir()` to `tcx`
rust-lang@04f3adb4a7 fix `waker_getters`
tracking issue number
rust-lang@e9b16cc2c5 rustc_passes:
Enforce `rustc::potential_query_instability` lint
rust-lang@95b5a80f47 Fix alignment passed
down to LLVM for simd_masked_load
rust-lang@fb32eb3529 Clean up
CodeBlocks::next code
rust-lang@df227f78c6 make it more clear
what comments refer to; avoid dangling unaligned references
rust-lang@b9c9b3e7a2 remove a cranelift
test that doesn't make sense any more
rust-lang@9ef1e35166 reject projecting to
fields whose offset we cannot compute
rust-lang@b1613ebc43 codegen: panic when
trying to compute size/align of extern type
rust-lang@6c0dbb8cc6 Remove dead codes in
core
rust-lang@a48cebc4b8 Coroutine variant
fields can be uninitialized
rust-lang@d473bdfdc3 Support bare unit
structs in destructuring assignments
rust-lang@0278505691 Attempt to try to
resolve blocking concerns
rust-lang@c6f7aa0eea Make File::create
work on Windows hidden files

Co-authored-by: celinval <celinval@users.noreply.github.com>
Labels
A-SIMD Area: SIMD (Single Instruction Multiple Data) PG-portable-simd Project group: Portable SIMD (https://github.com/rust-lang/project-portable-simd) S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
6 participants