Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bextri #180

Closed
wants to merge 4 commits into from
Closed

Fix bextri #180

wants to merge 4 commits into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Nov 6, 2017

Closes #26 .

edx[2],
edx[3],
];
let vendor_id_amd: [u8; 12] = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could use a byte literal for this? e.g., b"AuthenticAMD". Unlike string literals, the type of a byte literal is a fixed sized u8 array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, fixed :)

@@ -350,3 +350,4109 @@ macro_rules! assert_approx_eq {
*a, *b, $eps, (*a - *b).abs());
})
}


macro_rules! constify_bextri {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... What kind of hit does this have on compile times?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprisingly low. The version using 65000 match arms took more than 2 hours to compile, and this version with 4000 match arms compiles just as fast as without the macro. There might be some quadratic or exponential behavior in rustc somewhere. I need to fill a bug.

@gnzlbg gnzlbg force-pushed the fix_bextri branch 3 times, most recently from 7a6ad1c to 39c5b77 Compare November 6, 2017 23:09
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 6, 2017

@alexcrichton something weird is going on here with the simd_test macro. It complains that the tbm features are not being tested on Intel CPUs because they don't have the tbm feature, but IIUC that's the expected behavior? That is, if the CPU cannot run the tests, then it shouldn't run them, or what am I missing?

@gnzlbg gnzlbg force-pushed the fix_bextri branch 15 times, most recently from d07b542 to 99fd26e Compare November 7, 2017 16:52
@gnzlbg gnzlbg force-pushed the fix_bextri branch 2 times, most recently from d8f5817 to f6c3b6b Compare November 7, 2017 17:05
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 7, 2017

All tests are green now.


@BurntSushi this PR does increase the compile time. Since we don't have any AMD builds, it is hard to say whether the increase that we are seeing is the real increase (or whether it gets larger once the intrinsic is used a couple of times in a program).

We might be hitting the ceiling of what the constify! macros can achieve.

Ideally, all those arguments that belong in immediate-mode registers would be const function arguments, so that the signature of bextri would become:

pub unsafe fn _bextri2_u64(a: u64, control: const u64) -> u64;

so that we don't need the constify! macros at all, and so that users calling these with non-const values get a compile-time error. Currently, they will get a SIGILL at best, or run-time undefined behavior at worst:

EDIT: All Intel CPUs support bextr in non-immediate mode. AMD CPUs' with TBM, support bextr with immediate and non-immediate registers. Reaching the bextr intrinsic with an immediate-mode register on the 3 Intel CPUs I've experimented with produces a SIGILL.


@alexcrichton I've disabled the STDSIMD_TEST_EVERYTHING on the gnu-emulation build bot because it does not support the TBM instruction set and complains about that. I doubt that such a build bot can actually exist.

@alexcrichton
Copy link
Member

@gnzlbg hm disabling "test everything" seems pretty worrisome as that's the whole point of that bot? It's using Intel's own intstruction emulator which presumably supports everything, right?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 8, 2017

It's using Intel's own intstruction emulator which presumably supports everything, right?

Intel's own instruction emulator does not support the TBM instruction set. This is not surprising given that Intel CPUs do not support TBM either, only AMD CPUs do.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 8, 2017

hm disabling "test everything" seems pretty worrisome as that's the whole point of that bot?

If the objective of that bot was to test all x86 intrinsics in a single build bot I think that we either need to use a better emulator, or a different approach.

I could add an "intel_only" dev feature to the library, use it to disable tbm, and enable the feature only in the emulator.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 8, 2017

@alexcrichton so i've done that and all looks green. We probably want to remove STDSIMD_TEST_EVERYTHING at some point, rename the intel feature to intel_test_everything, and check that instead.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 9, 2017

@alexcrichton another feature that is AMD only and probably won't work in the emulator is SSE4a support.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 9, 2017

I've moved these changes to #175 because I don't have the throughput to maintain 3 versions of the run-time detection one (the current one here, the cleaner one in that pull-request, and the one that also does run-time detection for ARM in my arm_rt branch).

@gnzlbg gnzlbg closed this Nov 9, 2017
@alexcrichton
Copy link
Member

@gnzlbg oh sorry I had no idea! That's fascinating...

Something that forces only intel tests to run sounds great though and we could look at AMD testing later

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 9, 2017

It seems that AMD has an emulator but that's proprietary... maybe I can get a qemu-system-x86_64 with an AMD cpu up and running to test those.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants