Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing KubeVirt Dynamic Resource Allocation (DRA) Driver #1

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

TheRealSibasishBehera
Copy link
Owner

@TheRealSibasishBehera TheRealSibasishBehera commented Jun 25, 2024

Introducing KubeVirt Dynamic Resource Allocation (DRA) Driver [WIP]

This PR introduces the KubeVirt Device Resource Allocation (DRA) Driver. The driver is responsible for managing the lifecycle of devices and ensuring the correct allocation of resources to pods. In KubeVirt, host device passthrough is facilitated through environment variables. If the correct environment variable is set, it can be converted to a libvirt domain XML. This allows the QEMU process to access host devices. This PR aims to provide the admin with a similar style of defining permitted host devices as in KubeVirtCR , with the help of DeviceClassParameters . The user can specify the exact resource required using PciClaimParameters. This driver currently supports management of PCI devices.

Key Components

  1. Virt-DRA-Controller Deployment: The controller communicates with the kube-scheduler, making allocation decisions based on the overall cluster state and resource availability.

  2. Kubelet Plugin DaemonSet: The plugin maintains and advertises the state of devices, communicating with the centralized controller to provide local state information and receive resource allocation instructions.

  3. CRDs - NodeAllocationState, PciClaimParameters & DeviceClassParameters: These CRDs track resource state in a node, specify parameters for ResourceClaims, and provide parameters for ResourceClass respectively.

  4. Manifests: These include manifests to deploy the controller and kubelet-plugin, as well as ServiceAccount and related objects to provide appropriate permissions to these components.

Future Scope

  • Enable direct communication between controller and plugin.
  • Setting up environment variables in the virt-launcher pod.
  • Add support for mdev and other device types.

More details regarding the problems of device plugin framework in kubevirt and usecase of DRA can be found here:
kubevirt/community#254 (comment)

Signed-off-by: TheRealSibasishBehera <fangedhamster3114@gmail.com>
@TheRealSibasishBehera TheRealSibasishBehera changed the title Introducing KubeVirt Device Resource Allocation (DRA) Driver Introducing KubeVirt Dynamic Resource Allocation (DRA) Driver Jun 25, 2024
}

func (c *Client) Create(ctx context.Context) error {
crd := c.nas.DeepCopy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? You are overwriting in the next line

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DeepCopy is necessary to create a separate copy of the object that can be modified and sent to the API without affecting the original object (c.nas).

TheCreate(ctx, crd, metav1.CreateOptions{})call sends this copy to the Kubernetes API, which will modify the object

}

func (c *Client) Update(ctx context.Context, spec *nascrd.NodeAllocationStateSpec) error {
crd := c.nas.DeepCopy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above


func (d driver) GetClaimParameters(ctx context.Context, claim *resourcev1.ResourceClaim, class *resourcev1.ResourceClass, classParameters interface{}) (interface{}, error) {
if claim.Spec.ParametersRef == nil {
return pcicrd.DefaultPciClaimParametersSpec(), nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not returning nil here? The default PCI parameter looks wrong to me, like the device name shouldn't never be empty

type PciClaimParametersSpec struct {
//matches the ResourceName from class parameters
DeviceName string `json:"deviceName"`
Count int `json:"count"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, please remove the count since a single claim = single device

GetDevicePCIID(basepath string, pciAddress string) (string, error)
}

type DeviceUtilsHandler struct{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to expose this type or can we simply use the interface? Usually, the type is private to the package and there is a costructor that returns the interface:

func NewDeviceHandler() DeviceHandler {
    return &deviceUtilsHandler{}
}

Copy link
Owner Author

@TheRealSibasishBehera TheRealSibasishBehera Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes exposing this is not required , I will add the NewDeviceHandler() function

}

func (p *pcidriver) ValidateClaimParameters(claimParams *pcicrd.PciClaimParametersSpec) error {
if claimParams.DeviceName != "devices.kubevirt.io/nvme" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this can be any arbitary string, right?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No , DeviceName is supposed to be same as one of the supported resourceName in DeviceClassParameters .

I think we need to have somekind of mapping of what devices this driver supports . For now i have kept only devices.kubevirt.io/nvme

Currently VMI declaration [1] in kubevirt uses something similar as

      hostDevices:  
      - name: nvme  
        deviceName: devices.kubevirt.io/nvme  

I also think a name field needs to added in the PciClaimParameters , which will be arbitary defined by user

[1] https://kubevirt.io/user-guide/compute/host-devices/#nvme-pci-passthrough

iommuGroup, err := Handler.GetDeviceIOMMUGroup(pciBasePath, info.Name())
if err != nil {
log.Printf("IOMMU group error: %v", err)
return nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation

return nil
})
if err != nil {
log.Printf("Failed to discover host devices, error: %v", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation

pciID string
}

func DiscoverPermittedHostPCIDevices(supportedPCIDeviceMap map[string]string) (map[string][]*PCIDevice) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't it make sense to return an error as well here?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure , because , in any case we need return a empty map[string][]*PCIDevice) , in case there are error while reading the PCIdevice , its logged right now

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is in general best practise to let the caller handle errors

}

// MockDiscoverPermittedHostPCIDevices returns predefined data for testing
func MockDiscoverPermittedHostPCIDevices(supportedPCIDeviceMap map[string]string) (map[string][]*PCIDevice, map[string]string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature of this function is different from the function DiscoverPermittedHostPCIDevices, is it done on purpose?

DiscoverPermittedHostPCIDevices(supportedPCIDeviceMap map[string]string) (map[string][]*PCIDevice)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes its not right , we dont need the iommuToPCIMap for any allocation decision , I will update that

Comment on lines +101 to +106
switch device.Type() {
case nascrd.PciDeviceType:
available[device.Pci.UUID] = device.Pci
default:
// skip other devices
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for a single value you can use a if

Copy link
Owner Author

@TheRealSibasishBehera TheRealSibasishBehera Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense , we can use switch case once this driver supports other kind of devices

Comment on lines +100 to +117
for _, device := range crd.Spec.AllocatableDevices {
switch device.Type() {
case nascrd.PciDeviceType:
available[device.Pci.UUID] = device.Pci
default:
// skip other devices
}
}

for _, allocation := range crd.Spec.AllocatedClaims {
switch allocation.Type() {
case nascrd.PciDeviceType:
for _, device := range allocation.Pci.Devices {
delete(available, device.UUID)
}
default:
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified as:

Suggested change
for _, device := range crd.Spec.AllocatableDevices {
switch device.Type() {
case nascrd.PciDeviceType:
available[device.Pci.UUID] = device.Pci
default:
// skip other devices
}
}
for _, allocation := range crd.Spec.AllocatedClaims {
switch allocation.Type() {
case nascrd.PciDeviceType:
for _, device := range allocation.Pci.Devices {
delete(available, device.UUID)
}
default:
}
}
for _, device := range crd.Spec.AllocatableDevices {
if device.Type() != nascrd.PciDeviceType {
continue
}
if _, exist := crd.Spec.AllocatedClaims[device.UUID]; exist {
continue
}
available[device.Pci.UUID] = device.Pci
}

Comment on lines +124 to +125
if _, exists := crd.Spec.AllocatedClaims[claimUID]; exists {
devices := crd.Spec.AllocatedClaims[claimUID].Pci.Devices
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use the value of the mapping

Suggested change
if _, exists := crd.Spec.AllocatedClaims[claimUID]; exists {
devices := crd.Spec.AllocatedClaims[claimUID].Pci.Devices
if v, exists := crd.Spec.AllocatedClaims[claimUID]; exists {
devices := v.Pci.Devices

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants