rustc: Add support for some more x86 SIMD ops #45367
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override)
cc @eddyb
src/librustc/ty/layout.rs
Outdated
@@ -1022,6 +1022,9 @@ pub enum Layout {
        count: u64
    },

    /// The `x86_mmx` type, structs marked with `#[repr(x86_mmx)]`
    X86Mmx,
My intuition would be to use `Vector` and special-case it in `rustc_trans` for LLVM.
#45225 changes things a bit, and I'm not sure I'm comfortable merging something before then, as there would be roughly a dozen individual rebase conflicts. That's not the worst part, though; figuring out a coherent design is.
All the FFI ABI decisions in #45225 are done without involving the Rust type, intentionally, so probably the best compromise right now would be to have an `x86_mmx` `bool` inside `Vector`.
EDIT: Actually, is it always a `u64` and not semantically a vector in any way (aside from what LLVM does with its own type)? Because then we have another option: putting it in `Scalar`, which is a `struct` in #45225 and would require less work.
To make a relatively optimal decision here I'd suggest completely ignoring LLVM as a target and focusing on the closest match in terms of memory access and call ABIs.
Also, do we need this in anything other than intrinsics? For platform intrinsics we already transform the arguments and return value, so there's no real need for matching Rust types (not even for `#[repr(simd)]`, although we do use it right now).
src/librustc_trans/abi.rs
Outdated
@@ -139,7 +139,8 @@ impl ArgAttributes {
 pub enum RegKind {
     Integer,
     Float,
-    Vector
+    Vector,
+    X86Mmx,
Same here (`x86_mmx: bool` inside `Vector`), with the caveat that the system here is subject to change/removal in the future.
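As a rough illustration of the suggestion in the two review comments above (a flag on the existing vector variant instead of a separate `X86Mmx` variant), the shape could look something like the sketch below. This is not the compiler's code: the `llvm_type_name` helper and the string-based type names are hypothetical and only show where a backend would special-case the flag.

```rust
/// Illustrative register classification; the real `RegKind` in
/// `src/librustc_trans/abi.rs` has no payload on `Vector`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RegKind {
    Integer,
    Float,
    /// `x86_mmx` marks a 64-bit value that LLVM wants typed as `x86_mmx`.
    Vector { x86_mmx: bool },
}

/// Hypothetical helper: picks an LLVM type name for a register of `bits` bits.
/// Only the backend-facing step looks at the flag; ABI logic can keep
/// treating the value as an ordinary vector.
pub fn llvm_type_name(kind: RegKind, bits: u64) -> String {
    match kind {
        RegKind::Integer => format!("i{}", bits),
        RegKind::Float if bits == 32 => "float".to_string(),
        RegKind::Float if bits == 64 => "double".to_string(),
        RegKind::Float => format!("<unsupported float of {} bits>", bits),
        RegKind::Vector { x86_mmx: true } => "x86_mmx".to_string(),
        RegKind::Vector { x86_mmx: false } => format!("<{} x i8>", bits / 8),
    }
}
```

The point of the design is that code matching on `RegKind::Vector { .. }` does not need to know about MMX at all.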
Force-pushed from ca75d34 to d8026b8.
Good points @eddyb! I've updated with a slightly hacky solution, but I think it should be adequate for now.
The error index test failed without any error message. The failure seems to be related to the test cases around E0509 to E0517, since the test stopped there.
I suspect E0511, since it is the only one that touches SIMD.
// should be run-pass
#![feature(repr_simd)]
#![feature(platform_intrinsics)]
#[repr(simd)]
#[derive(Copy, Clone)]
struct i32x1(i32);
extern "platform-intrinsic" {
    fn simd_add<T>(a: T, b: T) -> T;
}
unsafe { simd_add(i32x1(0), i32x1(1)); } // ok!
src/librustc_trans/abi.rs
Outdated
// much else.
Layout::Vector { count, .. } => {
    let size = self.size(ccx);
    let x86_mmx = count == 1;
Can you also check `size.bits() == 64` and maybe the architecture? Otherwise this is a neat hack!
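A minimal sketch of the stricter check requested here, assuming a simplified stand-in for the layout data. `VectorLayout`, `is_x86_mmx`, and the string-based architecture name are hypothetical and not the types used in `abi.rs`:

```rust
/// Hypothetical, simplified view of a SIMD vector layout.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VectorLayout {
    /// Number of lanes in the vector.
    pub count: u64,
    /// Total size of the vector in bits.
    pub size_bits: u64,
}

/// Treats a vector as `x86_mmx` only when all three conditions hold:
/// a single lane, exactly 64 bits, and an x86 target.
pub fn is_x86_mmx(layout: VectorLayout, target_arch: &str) -> bool {
    let is_x86 = target_arch == "x86" || target_arch == "x86_64";
    is_x86 && layout.count == 1 && layout.size_bits == 64
}
```

Checking the size and architecture as well keeps a one-lane vector of another width, or one on a non-x86 target, from being mapped to the MMX type by accident.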
src/librustc_trans/type_of.rs
Outdated
@@ -178,7 +178,13 @@ pub fn in_memory_type_of<'a, 'tcx>(cx: &CrateContext<'a, 'tcx>, t: Ty<'tcx>) ->
}
let llet = in_memory_type_of(cx, e);
let n = t.simd_size(cx.tcx()) as u64;
Type::vector(&llet, n)
// see comment in abi.rs where we set `x86_mmx` to true for why we
// compare to 1 here and return a different type.
Oh, I'd expect this to have the "main copy" of the comment and `abi.rs` to point to it. Not a big deal though. The same thing about checking the size applies here too, I guess.
I think we should not care about MMX. Anything that can be done with MMX can be done with SSE2, and all x86 chips produced from 2000 onwards support SSE2. Moreover, MMX is full of traps (reusing the floating-point registers, etc.).
I also think the absolute minimum should be SSSE3 (NetBurst), which is 17 years old now, and that is only if we want to run on dinosaurs. My real recommendation is to forget anything built more than 7 years ago and require that x86 supports at minimum SSE4.2 + AES-NI. This means Westmere onwards, which is 7 years old now. SSE4.2 is already decent SIMD and AES-NI is extremely nice to have by default.
Surely it won't work for everyone; that said, 7 years is an eternity in hardware. So we should be weighing the cost-benefit ratio of these choices. How many use cases do we target by supporting anything before Westmere? Also consider that this number is diminishing year over year.
TL;DR: do not add support for MMX; require at least SSSE3 or preferably SSE4.2 + AES-NI.
@alkis thanks for the comment! To be clear though this is just providing support for the |
Force-pushed from d8026b8 to 52fbb75.
@eddyb updated!
Why do we want to expose MMX APIs through stdsimd? (I assume stdsimd is https://github.com/rust-lang-nursery/simd, correct?)
@alkis the job of stdsimd (this crate) is to provide all platform intrinsics, and the platform intrinsics apparently cover MMX pieces. It's not really up to us to decide what or what not to expose; it's up to authors what to use.
This statement resonates with me - we don't want to cherry-pick functionality unless there is a very good reason for it. This is the same approach we took when designing a SIMD API for C++. We excluded MMX and everything that would be slow to implement on top of intrinsics. MMX is dead weight and adding support for it is a disservice to your users:
The same care that is applied to the standard library for minimal, interoperable and ergonomic APIs, which Rust does a good job at, should apply here as well. For MMX the costs of adding it are high because it introduces more types and surprising APIs (not to mention the extra work done to add support for it), and the benefits of adding it are arguably negative.
On the other end of the spectrum, if you decide to eschew support for older architectures you can provide added value to users. Saying Rust has AES on every platform you support is a pretty nice guarantee and will have tangible benefits for random number generation, security, etc. Doing AES in software is risky and troublesome: you need to be careful to avoid secret-dependent branches and loads. Relying on its presence in hardware can be much safer. For me the cost-benefit ratio of that decision is as clear as that of dropping MMX support.
As mentioned, I was recently involved in the design of a SIMD API for C++ as a secondary author/reviewer. The primary author is quite the expert in SIMD development. The library is open source, but the open source version is only periodically pushed to GitHub. As I understand the spirit of stdsimd, it has very similar goals to this library. Perhaps you can take a look at the design and reference; maybe you can steal some of the good ideas.
I don't really have a dog in this fight but a couple quick notes:
Requiring this still seems way too early. For example, looking at the Steam hardware survey (look under "other settings"), if the survey is somewhat accurate, about 91% of Steam users have an SSE4.2-capable processor. That is, even amongst people who play games and would probably have far more up-to-date computers than the average user, 8-9% don't have a processor supporting these instructions, which seems way too high to consider dropping support for. Granted, this is probably the wrong thread to be discussing this.
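For readers weighing the baseline question, a small sketch of how a program can ask at run time which of the extensions named in this thread the current CPU supports. It uses the standard library's `is_x86_feature_detected!` macro, which postdates this discussion; the `Feature` enum and function names are hypothetical.

```rust
/// The instruction-set extensions mentioned in this thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Feature {
    Mmx,
    Sse2,
    Ssse3,
    Sse42,
    Aes,
}

/// Runtime check on x86 targets; the macro requires string literals,
/// hence the explicit match.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn has(feature: Feature) -> bool {
    match feature {
        Feature::Mmx => is_x86_feature_detected!("mmx"),
        Feature::Sse2 => is_x86_feature_detected!("sse2"),
        Feature::Ssse3 => is_x86_feature_detected!("ssse3"),
        Feature::Sse42 => is_x86_feature_detected!("sse4.2"),
        Feature::Aes => is_x86_feature_detected!("aes"),
    }
}

/// On other architectures none of these extensions exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn has(_feature: Feature) -> bool {
    false
}

/// Collects the detection result for every listed feature.
pub fn detected_features() -> Vec<(Feature, bool)> {
    [Feature::Mmx, Feature::Sse2, Feature::Ssse3, Feature::Sse42, Feature::Aes]
        .iter()
        .map(|&f| (f, has(f)))
        .collect()
}
```

Detection of this kind lets a library pick a faster path when an extension is present without making it a hard requirement for every user.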
True, but in absolute terms ~140 intrinsics add unnecessary clutter. Also, the "intrusive" x86_mmx backend changes could be avoided. I agree it makes sense to drop MMX, including any subsequent instructions that still use its registers. |
Force-pushed from 52fbb75 to a8c33c6.
Thanks again for the input everyone! I personally still feel that we should stay the current course of "bind all the intrinsics", but keep in mind that everything here is unstable and will go through more discussion before stabilization. In that sense I think this is certainly all highly relevant for stabilization! |
Also ping r? @arielb1 |
⌛ Testing commit 74dd1c2 with merge db393cf3f409943fada59a18409ebf8c46001118... |
💔 Test failed - status-travis |
rustc: Add support for some more x86 SIMD ops
This commit adds compiler support for two basic operations needed for binding SIMD on x86 platforms:
* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen in rust-lang/stdarch#114. This was relatively straightforward and is quite similar to the volatile store intrinsic.
* Next, and much more intrusively, a new type to the backend was added. The `x86_mmx` type is used in LLVM for a 64-bit vector register and is used in various intrinsics like `_mm_abs_pi8` as seen in rust-lang/stdarch#74. This new type was added as a new layout option as well as having support added to the trans backend. The type is enabled with the `#[repr(x86_mmx)]` attribute which is intended to just be an implementation detail of SIMD in Rust.
I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes or thoughts on that would be greatly appreciated!
💔 Test failed - status-travis |
Force-pushed from 74dd1c2 to 952f6e9.
@bors: r=eddyb |
📌 Commit 952f6e9 has been approved by |
⌛ Testing commit 952f6e9fca35564bed8ffe436e80618c60a6f59e with merge f5520ed945f4e729b4edf8650db73b8d6890aff8... |
💔 Test failed - status-travis |
CI failed on
|
Force-pushed from 952f6e9 to fe53a81.
@bors: r=eddyb |
📌 Commit fe53a81 has been approved by |
To be safe I think you need to ignore everything like
Or, maybe, could we just add |
☀️ Test successful - status-appveyor, status-travis |
This commit adds compiler support for two basic operations needed for binding SIMD on x86 platforms:
* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen in "Rust support for nontemporal stores?" (stdarch#114). This was relatively straightforward and is quite similar to the volatile store intrinsic.
* Next, and much more intrusively, a new type to the backend was added. The `x86_mmx` type is used in LLVM for a 64-bit vector register and is used in various intrinsics like `_mm_abs_pi8` as seen in "ssse3 `_mm_abs_pi8`: Intrinsic has incorrect return type!" (stdarch#74). This new type was added as a new layout option as well as having support added to the trans backend. The type is enabled with the `#[repr(x86_mmx)]` attribute which is intended to just be an implementation detail of SIMD in Rust.
I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes or thoughts on that would be greatly appreciated!
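To show what the first item is for, here is a small sketch of a non-temporal store through `_mm_stream_ps`. It uses the `std::arch::x86_64` bindings available in current stable Rust rather than the compiler intrinsic added in this PR, and the `Aligned` wrapper and `stream_copy` function are hypothetical names chosen for the example.

```rust
/// `_mm_stream_ps` requires a 16-byte-aligned destination.
#[repr(align(16))]
struct Aligned([f32; 4]);

/// Copies four floats using a non-temporal (cache-bypassing) store.
#[cfg(target_arch = "x86_64")]
pub fn stream_copy(src: &[f32; 4]) -> [f32; 4] {
    use std::arch::x86_64::{_mm_loadu_ps, _mm_sfence, _mm_stream_ps};

    let mut dst = Aligned([0.0; 4]);
    // SSE is part of the x86_64 baseline, so no runtime detection is needed.
    unsafe {
        // Unaligned load of the four source lanes.
        let v = _mm_loadu_ps(src.as_ptr());
        // Non-temporal store to the aligned destination.
        _mm_stream_ps(dst.0.as_mut_ptr(), v);
        // Order the streaming store before any later reads of `dst`.
        _mm_sfence();
    }
    dst.0
}

/// Plain copy on targets without the intrinsic.
#[cfg(not(target_arch = "x86_64"))]
pub fn stream_copy(src: &[f32; 4]) -> [f32; 4] {
    *src
}
```

Streaming into a small stack buffer has no performance benefit; real uses write large buffers that will not be read again soon, which is the case the cache hint targets.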