Skip to content

Work-around buggy Intel chips erroneously reporting BMI1/BMI2 support #1249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 19, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions crates/std_detect/src/detect/os/x86.rs
Original file line number Diff line number Diff line change
Expand Up @@ -249,5 +249,25 @@ pub(crate) fn detect_features() -> cache::Initializer {
}
}

// Unfortunately, some Skylake chips erroneously report support for BMI1 and
// BMI2 without actual support. These chips don't support AVX, and it seems
// that all Intel chips with non-erroneous support BMI do (I didn't check
// other vendors), so we can disable these flags for chips that don't also
// report support for AVX.
//
// It's possible this will pessimize future chips that do support BMI and
// not AVX, but this seems minor compared to a hard crash you get when
// executing an unsupported instruction (to put it another way, it's safe
// for us to under-report CPU features, but not to over-report them). Still,
// to limit any impact this may have in the future, we only do this for
// Intel chips, as it's a bug only present in their chips.
//
// This bug is documented as `SKL052` in the errata section of this document:
// http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/desktop-6th-gen-core-family-spec-update.pdf
if vendor_id == *b"GenuineIntel" && !value.test(Feature::avx as u32) {
value.unset(Feature::bmi1 as u32);
value.unset(Feature::bmi2 as u32);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you limit this to the specific cpu family that has the bug?

Copy link
Member Author

@thomcc thomcc Nov 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably in theory, I mean there's model/family info in the bits of __cpuid_count(1, 0).eax (and we already have this from earlier in the function)

But it's not clear to me at all which values for these correspond to the impacted chip :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Affected Steppings for this appear to be...

Number Y U - H - S -
- D-1 D-1 K-1 N-0 R-0 R-0 S-0
SKL052 X X X X X X X

So that looks like effectively "all Skylake (non-server)", according to Intel, therefore we can constrain this logic to all Skylake processors. We don't know which of the processors this is actually true for, since it's a "may", so we must apply it to all Skylake "client" chips. The possible values for family, model, etc., are specified on page 21 of this specification update, so we just add another constraint.

New Skylake chips are not being shipped so we can assume this won't change.


value
}