-
Notifications
You must be signed in to change notification settings - Fork 27
Handle Intel CPUID leaves 4 and 18h, specialize CPUID for VM shape #941
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/propolis/src/cpuid.rs
Outdated
| // indicate that SMT is enabled, then vCPUs are presented as pairs | ||
| // of sibling threads on vproc-many processors. | ||
| let num_vproc = if self.has_smt { | ||
| // TODO: What if num_vcpu is odd?! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, on Linux if you're indicating SMT and then have an odd number of cores, at least with the RFD 314-like CPUID leaves, the OS decides that your topology is nonsense and asserts it's all one big core:
root@debian:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0-2
root@debian:~# cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list
0-2
where the guest did in fact get reasonable leaf B bits:
root@debian:~# cpuid -r | grep 0000000b
0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000
0x0000000b 0x01: eax=0x00000008 ebx=0x00000003 ecx=0x00000201 edx=0x00000000
0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000001
0x0000000b 0x01: eax=0x00000008 ebx=0x00000003 ecx=0x00000201 edx=0x00000001
0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000002
0x0000000b 0x01: eax=0x00000008 ebx=0x00000003 ecx=0x00000201 edx=0x00000002
if you have an even number of CPUs then Linux sees them all paired up (see the new test in cpuid.rs). it'd be kinda nice to just not allow odd vCPU-count guests until there's a reason to permit them. for the time being I think it's time to let this sit, do virtual platforms, and have reasonable topologies in the non-initial platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually the guest might be in a better state if the non-paired core does not have hyperthreading indicated in its CPUID. that makes me a little less worried about potential guest issues but is more work than this merits right now.
meaning, don't change leaves as guests get them on a rack
lib/propolis/src/cpuid.rs
Outdated
| // Cache types come in any order, but type 0 means there are | ||
| // no more caches, so iterate and adjust as needed until we | ||
| // see that. | ||
| const MAX_REASONABLE_LEVEL: u32 = 0x20; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to hoist this to the top of the cpuid-utils, considering there are several places where such a heuristic limit is appropriate?
lib/propolis/src/cpuid.rs
Outdated
| let ty = leaf.eax & 0b11111; | ||
| let level = (leaf.eax >> 5) & 0b111; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other bits get named-constant masks. Should these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was on the fence because names here are a bit silly ( .. yes, let ty = leaf.eax & LEAF4_EAX_CACHE_TYPE; ..), but it's kind of nice to line up the masks as much as rustfmt permits to see they don't overlap.
lib/propolis/src/cpuid.rs
Outdated
| // that point. | ||
| if num_vcpu >= 256 { | ||
| return Err(SpecializeError::IncompatibleTopology { | ||
| leaf: 4, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaf B, not 4
Should TopoKind have some accessor like leaf() which emits a u32, so it could be passed in here, rather than typed in by hand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubly awkward: leaf, which is used at various points in the match, is the u32. the enum is also #[repr(u32)] which is why that value comes from a simple *topo as u32. i've struck the explicit numbers in these IncompatibleTopology errors in favor of just leaf,.
|
on the D-1713NTE with this change we mostly just see that migration doesn't work on Intel: the cpuid failures were because i'd used a the hyperv failures are because those tests involve migration, and trip over the broader "can't migrate on Intel". other tests that pass do include guests that boot, so some passes are legitimate exercises of functionality that now work. |
=| interesting failure mode after 417 minutes. |
This is kind of three parts:
cpuid-utilsabout Intel-only leaves 4 and 18hwith_cpu_topo()so we update to-be-specialized topo leaves instead of just discard themthis fixes #921 and is part of fixing #940 (the other part being https://www.illumos.org/issues/17529 or getting leaf B from an instance spec)
I've run Linux and OmniOS guests on a 2697v4 and Xeon D-1713NTE now, neither directly complained about the bogus leaf 18h entries but the incorrect leaf 4 entries had resulted in Linux spinning fairly early in boot.
On AMD, specializing leaf B fixes #940, but I've deferred doing that in
propolis-serveruntil at least landing oxidecomputer/omicron#8728. Since nothing is obviously wrong about behavior with the silly "multi-socket" VM topology I'd rather not change it until it's part of a Milan V2 CPU platform. That can come with a constraint that vCPU counts have to be consistent with SMT and nudge all that forward more gracefully too.