Skip to content

XSAVE, CPUID, Runtime AVX-512 and ARM support #175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Nov 17, 2017

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Nov 6, 2017

This PR implements:

  • the xsave intrinsics,
  • the __readeflags/__writeeflags intrinsics,
  • the cpuid intrinsics (except for GCC's __get_cpuid_max, Issue Implement GCC's __get_cpuid_max #174 ), and
  • run-time feature detection for xsave and AVX-512,

and fixes one bug:

  • the CPUID instruction was used before testing whether the CPU supported it.

Note that the has_cpuid intrinsic does not exist in GCC nor Clang (it is a sub-set of GCC's __get_cpuid_max) but I've included it because we can make it a no-op on x86_64 which is what most folks are targeting nowadays, which is useful for our own run-time feature detection support and for cpuid crates like @shepmaster's cupid.

This PR does arguably a lot of stuff. We need the EFLAGS intrinsics to properly refactor cpuid into its own intrinsic. We need xsave run-time support for properly refactoring the xgetbv intrinsic. And xsave run-time support is tightly coupled with AVX run-time feature detection, which is different for AVX/AVX2 and AVX-512 and since we want to support AVX-512 soon anyways I did not want to have to go through the spec twice. I also wanted to test the xgetbv intrinsic and ended implementing all of the xsave intrinsics for this. Arguably I just needed to implement xsetbv but, again, I did not wanted to have to go through the spec twice.

The run-time feature detection for x86 grows significantly with AVX-512 support, so I backported one of the refactorings I have in the run-time feature detection module for ARM (which is even bigger). There I refactor it further into its own top-level module to be able to share some code between both.


This is blocked on:

Closes #171 .
Helps with the run-time part of #146 .

@alexcrichton
Copy link
Member

This all looks great to me, thanks so much for digging into all this! One day I'd love to dig into all the cpu detection but it looks quite thorough here and there's plenty of links to follow so I'm more than ok with the current implementation.

I'm ok personally landing this whenever CI is green (I don't think we need AVX-512 support in rustc to block this b/c it doesn't add any AVX-512 intrinsics, right?). It looks like CI currently has an error with llvm.x86.xrstor though?

@alexcrichton
Copy link
Member

I'll also note that personally I'm a little anxious about these intrinsics in the sense that there's not an Intel manual with the signatures (or maybe there is?) in the same way we have such a guide for all the SIMD intrinsics. That being said though I think this library will go through normal stabilization processes as other libraries in libstd, so it's ok for part of this library to be unstable (such as these intrinsics which I'm not currently sure about) and the other parts can be stable (like the Intel-defined signatures).

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 6, 2017

I'm a little anxious about these intrinsics in the sense that there's not an Intel manual with the signatures (or maybe there is?)

AFAIK such a manual does not exist (maybe there is one in the Intel icc compiler?) but there is something much better, Intel's online data-base !

The data-base gives you the C signatures, the assembly instruction the CPUID flag (or flags) that you need to check... Clang and GCC deviate slightly from the C signatures in the Intel guide though (mainly when it comes to signed vs unsigned integers), so it is useful to also check the:

at least, when you read that Intel says some int should be signed, but the operation is clearly doing bit manipulations, in that case, if clang chooses an unsigned integer type I would suggest doing the same in Rust.


It looks like CI currently has an error with llvm.x86.xrstor though?

I got these in my computer as well but I thought I fixed them, will recheck these once the rust PR is merged. LLVM is picky with the pointer types that the intrinsics take.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 6, 2017

Also technically XSAVE intrinsics are SIMD-related, because they allow you to save the state of the SIMD registers, so if a Rust library is going to, for whatever reasons (e.g. green threads) implement context switches correctly, it should probably use them.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 6, 2017

I don't think we need AVX-512 support in rustc to block this b/c it doesn't add any AVX-512 intrinsics, right?

Right. The only thing blocking this is the XSAVE whitelisting PR that you already accepted.

@alexcrichton
Copy link
Member

AFAIK such a manual does not exist (maybe there is one in the Intel icc compiler?) but there is something much better, Intel's online data-base !

Nice! That's perfect.

This all sounds great to me, let's merge when CI is green

@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 4 times, most recently from 3dcee13 to 294aef9 Compare November 9, 2017 17:30
@gnzlbg gnzlbg mentioned this pull request Nov 9, 2017
@alexcrichton
Copy link
Member

@gnzlbg could the 4000 line macro be left to a separate PR perhaps? I'm not sure I'm entirely convinced we should add it yet...

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 9, 2017

@alexcrichton yes, I can split that to a different PR before merging this one. But there was some runtime detection code there that I wanted to have here. Is that ok if I do that when the xsave PR is merged into rustc (it seems to be taking a while longer than I expected).

@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 3 times, most recently from bf2b5cf to 6445928 Compare November 10, 2017 12:13
@gnzlbg gnzlbg changed the title XSAVE, CPUID, and Runtime AVX-512 support XSAVE, CPUID, Runtime AVX-512 and ARM support Nov 10, 2017
@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 13 times, most recently from d103013 to 605b5ae Compare November 13, 2017 18:28
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 13, 2017

@BurntSushi @alexcrichton this one should be ready to go as well. The ARM ci still doesn't use run-time feature detection because there is a PR for cross that adds support for qemu-system and I think it makes more sense to use cross instead of trying to roll our own solution.


@alexcrichton this PR also contains the bextri fix. Shall I split it into a different PR?

@alexcrichton
Copy link
Member

@gnzlbg yeah want to split out the bextri fix? Otherwise seems fine by me!

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 14, 2017

@alexcrichton done :)

This was referenced Nov 16, 2017
@gnzlbg gnzlbg merged commit c68f66a into rust-lang:master Nov 17, 2017
std = []
intel_sde = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you get a chance, could you add documentation here for what intel_sde is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add a comment to the Cargo.toml.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement missing xsave intrinsics
2 participants