XSAVE, CPUID, Runtime AVX-512 and ARM support #175
This all looks great to me, thanks so much for digging into all this! One day I'd love to dig into all the cpu detection but it looks quite thorough here and there's plenty of links to follow so I'm more than ok with the current implementation. I'm ok personally landing this whenever CI is green (I don't think we need AVX-512 support in rustc to block this b/c it doesn't add any AVX-512 intrinsics, right?). It looks like CI currently has an error with |
|
I'll also note that personally I'm a little anxious about these intrinsics in the sense that there's not an Intel manual with the signatures (or maybe there is?) in the same way we have such a guide for all the SIMD intrinsics. That being said though I think this library will go through normal stabilization processes as other libraries in libstd, so it's ok for part of this library to be unstable (such as these intrinsics which I'm not currently sure about) and the other parts can be stable (like the Intel-defined signatures). |
AFAIK such a manual does not exist (maybe there is one in the Intel icc compiler?) but there is something much better: Intel's online database! The database gives you the C signatures, the assembly instruction, and the CPUID flag (or flags) that you need to check... Clang and GCC deviate slightly from the C signatures in the Intel guide though (mainly when it comes to signed vs unsigned integers), so it is useful to also check the:
at least, when you read that Intel says some
I got these on my computer as well but I thought I had fixed them; I will recheck these once the rust PR is merged. LLVM is picky about the pointer types that the intrinsics take. |
|
Also technically |
Right. The only thing blocking this is the |
Nice! That's perfect. This all sounds great to me, let's merge when CI is green |
3dcee13 to 294aef9
|
@gnzlbg could the 4000 line macro be left to a separate PR perhaps? I'm not sure I'm entirely convinced we should add it yet... |
|
@alexcrichton yes, I can split that to a different PR before merging this one. But there was some runtime detection code there that I wanted to have here. Is that ok if I do that when the |
bf2b5cf to 6445928
d103013 to 605b5ae
a3ee736 to be24581
|
@BurntSushi @alexcrichton this one should be ready to go as well. The ARM CI still doesn't use run-time feature detection because there is a PR for
@alexcrichton this PR also contains the bextri fix. Shall I split it into a different PR? |
|
@gnzlbg yeah want to split out the bextri fix? Otherwise seems fine by me! |
be24581 to 0a0aef6
|
@alexcrichton done :) |
0a0aef6 to 4126259
4126259 to 2aa0e81
strict = []
std = []
intel_sde = []
When you get a chance, could you add documentation here for what intel_sde is?
Sure, I'll add a comment to the Cargo.toml.
This PR implements:
- xsave intrinsics,
- __readeflags/__writeeflags intrinsics,
- cpuid intrinsics (except for GCC's __get_cpuid_max, issue #174, "Implement GCC's __get_cpuid_max"), and
- xsave and AVX-512,
and fixes one bug:
- CPUID instruction was used before testing whether the CPU supported it.
Note that the has_cpuid intrinsic exists in neither GCC nor Clang (it is a subset of GCC's __get_cpuid_max), but I've included it because we can make it a no-op on x86_64, which is what most folks are targeting nowadays. It is useful for our own run-time feature detection support and for cpuid crates like @shepmaster's cupid.
This PR arguably does a lot of stuff. We need the EFLAGS intrinsics to properly refactor cpuid into its own intrinsic. We need xsave run-time support to properly refactor the xgetbv intrinsic. And xsave run-time support is tightly coupled with AVX run-time feature detection, which is different for AVX/AVX2 and AVX-512, and since we want to support AVX-512 soon anyway I did not want to have to go through the spec twice. I also wanted to test the xgetbv intrinsic and ended up implementing all of the xsave intrinsics for this. Arguably I just needed to implement xsetbv but, again, I did not want to have to go through the spec twice.
The run-time feature detection for x86 grows significantly with AVX-512 support, so I backported one of the refactorings I have in the run-time feature detection module for ARM (which is even bigger). There I refactor it further into its own top-level module to be able to share some code between both.
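To make the has_cpuid point concrete, here is a rough sketch of the usual way to test for CPUID on 32-bit x86: try to flip the ID flag (bit 21) of EFLAGS and see whether the change sticks. This is not the code in this PR. The names EFLAGS_ID, probe_id_flag and has_cpuid_sketch are made up for illustration, and the two closures are stand-ins for reading and writing EFLAGS, so the logic can be exercised without touching real registers.

```rust
/// Bit 21 of EFLAGS is the ID flag. Software can toggle it only on CPUs
/// that implement the CPUID instruction.
const EFLAGS_ID: u32 = 1 << 21;

/// Hypothetical probe: flips the ID flag through the supplied read/write
/// callbacks (stand-ins for EFLAGS accessors), checks whether the flip
/// stuck, and restores the original value afterwards.
fn probe_id_flag(read: impl Fn() -> u32, write: impl Fn(u32)) -> bool {
    let original = read();
    write(original ^ EFLAGS_ID);
    let toggled = read();
    write(original); // leave the flags as we found them
    (original ^ toggled) & EFLAGS_ID != 0
}

/// Hypothetical wrapper: on x86_64 CPUID always exists, so the probe can
/// be skipped entirely; elsewhere we fall back to the EFLAGS probe.
fn has_cpuid_sketch(read: impl Fn() -> u32, write: impl Fn(u32)) -> bool {
    if cfg!(target_arch = "x86_64") {
        true
    } else {
        probe_id_flag(read, write)
    }
}
```

Any code that executes CPUID on 32-bit x86 would call something like has_cpuid_sketch first, which is the ordering the bug fix above restores.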
This is blocked on:
- [xsave] whitelist xsave target feature rust#45761 (whitelisting XSAVE in rustc)
Closes #171.
Helps with the run-time part of #146.
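For anyone wondering why xsave and AVX detection are so entangled, the sketch below shows the general shape of the check as I understand it from the Intel manuals. It is not the code in this PR. The CPU must report the feature via CPUID, the OS must have enabled XSAVE (OSXSAVE), and only then is it safe to read XCR0 (via xgetbv) to see whether the OS saves the relevant register state. AVX/AVX2 need XCR0 bits 1 and 2; AVX-512 additionally needs bits 5 to 7. All names here (AvxSupport, detect_avx, the constants) are hypothetical, and the raw register values are passed in so the logic can be tested on any target.

```rust
// CPUID.1:ECX feature bits.
const ECX_XSAVE: u32 = 1 << 26;
const ECX_OSXSAVE: u32 = 1 << 27;
const ECX_AVX: u32 = 1 << 28;
// CPUID.(EAX=7,ECX=0):EBX feature bit.
const EBX_AVX512F: u32 = 1 << 16;
// XCR0 state-component bits.
const XCR0_SSE: u64 = 1 << 1;
const XCR0_AVX: u64 = 1 << 2;
const XCR0_AVX512: u64 = 0b111 << 5; // opmask, ZMM_Hi256, Hi16_ZMM

#[derive(Debug, PartialEq, Eq)]
struct AvxSupport {
    avx: bool,
    avx512f: bool,
}

/// Hypothetical helper. `read_xcr0` stands in for `xgetbv(0)` and is only
/// called once OSXSAVE is known to be set, because XGETBV faults otherwise.
fn detect_avx(leaf1_ecx: u32, leaf7_ebx: u32, read_xcr0: impl FnOnce() -> u64) -> AvxSupport {
    let xsave_enabled =
        (leaf1_ecx & ECX_XSAVE) != 0 && (leaf1_ecx & ECX_OSXSAVE) != 0;
    if !xsave_enabled {
        return AvxSupport { avx: false, avx512f: false };
    }
    let xcr0 = read_xcr0();
    // The OS must save both the SSE (XMM) and AVX (YMM) state for AVX/AVX2.
    let os_avx = (xcr0 & (XCR0_SSE | XCR0_AVX)) == (XCR0_SSE | XCR0_AVX);
    // AVX-512 additionally needs the opmask and both ZMM state components.
    let os_avx512 = os_avx && (xcr0 & XCR0_AVX512) == XCR0_AVX512;
    AvxSupport {
        avx: os_avx && (leaf1_ecx & ECX_AVX) != 0,
        avx512f: os_avx512 && (leaf7_ebx & EBX_AVX512F) != 0,
    }
}
```

Passing XCR0 as a closure rather than a value is deliberate: it keeps the "check OSXSAVE before xgetbv" ordering visible in the signature.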