Skip to content

[x86] expose cpuid, xgetbv, pushfd, popfd #166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Nov 2, 2017

This refactors the cpuid and xgetby intrinsics into its own functions and exposes them as part of the x86 and x86_64 architectures.

The cpuid intrinsic is not available on all x86 CPUs, so this also exposes a has_cpuid() -> bool intrinsic that detect this on non x86_64 hosts, and the pushfd and popfd intrinsics required to implement it (GCC exposes all of this as intrinsics as well).

When doing run-time feature detection for x86/x86_64 we now check whether the cpuid instruction is available, and if not, then just "disable" all features.

One TODO:


@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 2 times, most recently from 276cd3e to 980f50c Compare November 2, 2017 20:54
@petrochenkov
Copy link
Contributor

AFAIK if there is an x86 processor for which it is not available, detecting this at run-time is impossible because... that's cpuids job.

Presence of CPUID can be checked by setting/clearing EFLAGS.ID (if you need to support really old CPUs).

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 2, 2017

@petrochenkov do you have an example of how that is done, or a link to some relevant material? I would then make cpuid unsafe and update the docs about how to check it (maybe offering a simpler way to check it). I'll check it out tomorrow and try to see what can be done about it.

@gnzlbg gnzlbg closed this Nov 2, 2017
@petrochenkov
Copy link
Contributor

(I wish I didn't leave that comment, 80486 is very old.)
Intel SDM / Vol.2 Instruction Set Reference / CPUID
http://wiki.osdev.org/CPUID

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 3, 2017

@petrochenkov don't worry, I am on it.

I have added a has_cpuid function that returns true if the CPU has the cpuid intrinsic. On x86_64 it always returns true. On x86 it does this: https://github.com/gnzlbg/stdsimd/blob/dogfood_cpuid/src/x86/x86.rs#L52

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 3, 2017

So this is mostly fixed. For some reason, i586 fails to compile, but I can't differentiate between i686 and i586 (target_arch = "x86" is the only tool I have available here). Is there a way to make this portable?

@gnzlbg gnzlbg reopened this Nov 3, 2017
@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 2 times, most recently from bae1b96 to dd69d94 Compare November 3, 2017 17:27
@nominolo
Copy link
Contributor

nominolo commented Nov 4, 2017

To distinguish between i586 and i686 you should check for target_arch="x86" and target_feature="sse2". The latter is only available on i686.

@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 9 times, most recently from 9d60edc to 9bc28f6 Compare November 4, 2017 11:29
src/x86/misc.rs Outdated
let eflags_after: u32 = pushfd();

// Check if the ID bit changed:
eflags_after != eflags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnzlbg
Have you checked what exact instruction sequence is produced?
OSDev wiki warns: "Note: Implementing this routine in for example C can lead to issues, because the compiler may change EFLAGS at any time."
(Any intermediate arithmetic operations can change EFLAGS.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I have, and the build should be green now :)

Copy link
Contributor Author

@gnzlbg gnzlbg Nov 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It produces this in release:

has_cpuid:
	pushfl
	popl	%eax
	movl	%eax, %ecx
	orl	$2097152, %ecx
	pushl	%ecx
	popfl
	pushfl
	popl	%ecx
	cmpl	%eax, %ecx
	setne	%al
	retl

and this in debug:

has_cpuid:
	subl	$12, %esp
	pushfl
	popl	%eax
	movl	%eax, 4(%esp)
	movl	4(%esp), %eax
	movl	%eax, %ecx
	orl	$2097152, %ecx
	pushl	%ecx
	popfl
	pushfl
	popl	%ecx
	movl	%ecx, 8(%esp)
	cmpl	%eax, 8(%esp)
	setne	%dl
	movb	%dl, 3(%esp)
	movb	3(%esp), %al
	andb	$1, %al
	movzbl	%al, %eax
	addl	$12, %esp
	retl

for i686; not "perfect" but good enough, plus I got to factor pushfd and popf out :)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 4, 2017

So @alexcrichton @petrochenkov this is finished and green.

The only thing that worries me is that pushfd clobbers the stack but AFAIK it is impossible to indicate this via the asm! macro; it is not possible in clang and gcc either.

Please review this with care since I am an inline assembly noob.

@gnzlbg gnzlbg changed the title [x86] expose cpuid and xgetbv intrinsics [x86] expose cpuid, xgetbv, pushfd, popfd Nov 4, 2017
@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 2 times, most recently from 8732131 to 73938f2 Compare November 4, 2017 12:31
src/x86/misc.rs Outdated
#[cfg(target_arch = "x86")]
#[inline(always)]
pub unsafe fn popfd(eflags: u32) {
asm!("pushl $0; popfd" : : "r"(eflags) : "cc", "flags" : "volatile");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I am not 100% sure if the cc clobber is required here.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 4, 2017

@BurntSushi general design question: I've put these intrinsics in a misc module. For users this doesn't matter much because our internal modules are private, but from the point-of-view of library organization I think misc is a pretty bad name.

The only other idea I've had is to name the modules .../i386/i486/i586/i686/... instead. The main issue I have with this is that, for example, havint to use cfg!(all(target_arch = "x86", target_feature = "-sse2")) to detect i686 is... messy.


EDIT: I've opened #169 so that we don't have to block deciding on this to merge this. We can always fix this later.

@alexcrichton
Copy link
Member

Are these listed in Intel's list of intrinsics? Or are there otherwise known canonical names for these?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 4, 2017

@alexcrichton

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 4, 2017

I could rename pushfd and popfd to __readeflags and __writeeflags and put them in a ia32.rs module, rename xgetbv to _xgetbv and move that to an intrin.rs module, and move cpuid into its own module and rename it to __cpuid_count. (but I don't know whether I should rename it to __cpuid or __get_cpuid, clang and gcc differ here).

Note that the signature of cpuid in clang and gcc is identical and takes 4 int by reference but we can just return a struct or a tuple here (I prefer the struct with the named members but can switch that to a tuple).

@gnzlbg gnzlbg force-pushed the dogfood_cpuid branch 3 times, most recently from 09b358c to e935a62 Compare November 4, 2017 15:41
Expose the `__cpuid` and `_xgetby` `x86`/`x86_64` intrinsics.

The `__cpuid` and `__cpuid_count` intrinsics are not available on all
`x86` CPUs. The `has_cpuid() -> bool` intrinsic detect this on non
`x86_64` hosts. For convenience, this is exposed on `x86_64` as well
but there it always returns `true`. These are exposed by Clang and
GCC.

The `__readeflags` and `__writeeflags` intrinsics, which read/write
the `EFLAGS` register and are required to implement `has_cpuid`, are
exposed as well. GCC and Clang exposes them too.

When doing run-time feature detection for `x86`/`x86_64` we
now properly check whether the `cpuid` instruction is
available before using it. If it is not available,
are features are exposes as "not available".

One TODO:

- The `_xgetbv` intrinsic requires the `xsave` target feature
but this is not currently exposed by rustc, see rust-lang#167 .
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 4, 2017

@alexcrichton I've refactor this to agree naming-wise with GCC and Clang.

The main difference remains that the cpuid instructions return the read values in a struct instead of modifying 4 ints via &mut. Also, AFAIK neither Clang nor GCC expose their intrinsics to detect cpuid support.

Expose the `__cpuid` and `_xgetby` `x86`/`x86_64` intrinsics.

The `__cpuid` and `__cpuid_count` intrinsics are not available on all
`x86` CPUs. The `has_cpuid() -> bool` intrinsic detect this on non
`x86_64` hosts. For convenience, this is exposed on `x86_64` as well
but there it always returns `true`. These are exposed by Clang and
GCC.

The `__readeflags` and `__writeeflags` intrinsics, which read/write
the `EFLAGS` register and are required to implement `has_cpuid`, are
exposed as well. GCC and Clang exposes them too.

When doing run-time feature detection for `x86`/`x86_64` we
now properly check whether the `cpuid` instruction is
available before using it. If it is not available,
are features are exposes as "not available".

One TODO:

- The `_xgetbv` intrinsic requires the `xsave` target feature
but this is not currently exposed by rustc, see rust-lang#167 .
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 5, 2017

Blocked by rust-lang/rust#45761 . I'll reopen this PR once that is merged upstream and I can properly use the xsave and xsaveopt target features. Turns out LLVM has a llvm.x86.xgetbv intrinsic that might allow me to remove some of the assembly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants