Support a baker's dozen of disks #9540
Conversation
Bump `MAX_DISKS_PER_INSTANCE` from 8 to 12. This plus the cloud-init volume means that instances can now see up to 13 disks.

All affected sagas were able to deal with this change easily, since they are parameterized either by that constant or by integer literals (in the case of the saga action node repetition syntax).

`slot_to_pci_bdf` will place these new disks after all existing ones, which will not break existing instances: if disks were added to the beginning of the device range instead, guests would number the devices in a different order.

The trickiest part here was the schema migration: the original schema added a CHECK constraint to a disk's slot, and this constraint wasn't named the way regular constraints are. A little digging into cockroach's internal tables found the generated constraint name (which the upgrader matches for compatibility).

Fixes oxidecomputer#9513
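To make the compatibility argument concrete, here is a minimal standalone sketch (not the PR's code; the exact PCI base for the new slots is an assumption read off the slot diagram below):

```rust
// Sketch of the disk slot-to-PCI-device mapping. The 0x10 base for
// slots 0..8 comes from the PR; the 0x14 base for the new slots 8..12
// is an assumption read off the slot diagram further down.
fn disk_slot_to_pci_device(logical_slot: u8) -> Option<u8> {
    match logical_slot {
        0..=7 => Some(logical_slot + 0x10),  // unchanged: devices 16..=23
        8..=11 => Some(logical_slot + 0x14), // new: appended past cloud-init
        _ => None,
    }
}

fn main() {
    // Existing disks keep their device numbers, so guests enumerate
    // them in the same order as before the bump...
    assert_eq!(disk_slot_to_pci_device(7), Some(0x17));
    // ...and every new disk sorts after the cloud-init device at 0x18.
    assert!(disk_slot_to_pci_device(8).unwrap() > 0x18);
}
```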
@lgfa29 you should take a look at this to see if/how it affects the work you're doing at the hypervisor layer to enable more disks for the CSI plugin.
hawkw left a comment
No major concerns from me, provided that works with what it sounds like @lgfa29 is doing --- maybe worth waiting to hear back from him before merging.
```rust
// disks get 16 through 23, and the cloud-init disk is device 24:
//
// 0               1
// 0123456789ABCDEF0123456789ABCDEF
//         NNNNNNNNDDDDDDDDC   DDDD
//                             ^^^^
//
// The additional disks at the end (marked with ^) were added to support up
// to 12 disks. Adding to the end of the range won't break existing
// instances: if disks were added to the beginning of the range, then guests
// (like Linux) would number the devices in a different order.
```
thank you for this!
```rust
PciDeviceKind::Nic if logical_slot < 8 => logical_slot + 0x8,
PciDeviceKind::Disk if logical_slot < 8 => logical_slot + 0x10,
PciDeviceKind::CloudInitDisk if logical_slot == 0 => 0x18,
PciDeviceKind::Disk if logical_slot >= 8 && logical_slot < 12 => {
```
fwiw, I think this could be
```diff
- PciDeviceKind::Disk if logical_slot >= 8 && logical_slot < 12 => {
+ PciDeviceKind::Disk if (8..12).contains(&logical_slot) => {
```
which...is either more or less clear, i think. probably.
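As a quick sanity check that the two guard styles agree, here is a standalone sketch (hypothetical helper names, not the PR's code):

```rust
// Both predicates accept exactly the logical slots 8, 9, 10, and 11.
fn in_range_explicit(logical_slot: u8) -> bool {
    logical_slot >= 8 && logical_slot < 12
}

fn in_range_contains(logical_slot: u8) -> bool {
    // Range::contains takes a reference, hence the `&`.
    (8..12).contains(&logical_slot)
}

fn main() {
    for slot in 0..=u8::MAX {
        assert_eq!(in_range_explicit(slot), in_range_contains(slot));
    }
}
```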
```rust
    client: &ClientTestContext,
    name: &str,
) -> Vec<Disk> {
    let url = format!("/v1/instances/{name}/disks?project={}", PROJECT_NAME);
```
unimportant nit: could also inline the PROJECT_NAME:
```diff
- let url = format!("/v1/instances/{name}/disks?project={}", PROJECT_NAME);
+ let url = format!("/v1/instances/{name}/disks?project={PROJECT_NAME}");
```
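Both spellings produce identical strings, since inline format-args capture works for any identifier in scope, consts included. A tiny sketch with a stand-in `PROJECT_NAME` value:

```rust
// Stand-in value; the real test defines its own PROJECT_NAME constant.
const PROJECT_NAME: &str = "my-project";

fn main() {
    let name = "my-instance";
    let positional =
        format!("/v1/instances/{name}/disks?project={}", PROJECT_NAME);
    let captured =
        format!("/v1/instances/{name}/disks?project={PROJECT_NAME}");
    assert_eq!(positional, captured);
}
```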
```rust
static_assertions::const_assert!(MAX_DISKS_PER_INSTANCE == 12);
seq!(N in 0..12 {
```
i presume that the `seq!` macro needs a literal and cannot be `N in 0..MAX_DISKS_PER_INSTANCE`?
yeah

```
error: expected integer
   --> nexus/src/app/sagas/instance_start.rs:175:22
    |
175 |         seq!(N in 0..MAX_DISKS_PER_INSTANCE {
    |                      ^^^^^^^^^^^^^^^^^^^^^^
```
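That's consistent with `seq-macro` (assuming this is dtolnay's `seq-macro` crate) taking literal integer bounds rather than named constants. A minimal sketch of the pattern that does compile:

```rust
use seq_macro::seq;

fn main() {
    let mut total: u32 = 0;
    // The body is repeated once per N with the literal value spliced in;
    // writing `0..MAX_DISKS_PER_INSTANCE` here fails with "expected
    // integer", as shown above.
    seq!(N in 0..12 {
        total += N;
    });
    assert_eq!(total, (0..12).sum());
}
```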
```sql
ALTER TABLE
    omicron.public.disk
DROP CONSTRAINT IF EXISTS
    check_slot_slot;
```
cool that you were able to find this!
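For future readers: CockroachDB auto-names an unnamed CHECK constraint from the column references in its expression, which is presumably how a check mentioning `slot` twice produced `check_slot_slot`. A hedged sketch (toy table, not the omicron schema) of how to find such a generated name:

```sql
-- An inline, unnamed CHECK constraint gets an auto-generated name.
CREATE TABLE disk_example (
    slot INT4 CHECK (slot >= 0 AND slot < 8)
);

-- Either of these reveals the generated name (e.g. "check_slot_slot"):
SHOW CONSTRAINTS FROM disk_example;
SELECT conname
  FROM pg_catalog.pg_constraint
 WHERE conrelid = 'disk_example'::regclass;
```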
Realistically that needs the punted-on phase 2 work outlined in #9513 (comment).

Thanks for the ping, but yeah, I was following the discussions and there's no impact on my work, so feel free to merge this whenever it's ready. Increasing the number of disks was something I was planning for a later time since it's not strictly a blocker for CSI.