This guide will walk you through the steps to install and use KubeVirt with the dra-pci-driver.
- Download the KubeVirt repository (the `/tree/host-rc` web URL is not cloneable; clone the repository and check out the `host-rc` branch instead):

  ```shell
  git clone -b host-rc https://github.com/TheRealSibasishBehera/kubevirt.git
  cd kubevirt
  ```
- Start a single-node Kubernetes cluster with one secondary NIC:

  ```shell
  export KUBEVIRT_PROVIDER_EXTRA_ARGS="--nvme 1G --nvme 500M" # Needed to emulate NVMe devices
  export KUBEVIRT_PROVIDER=k8s-1.30
  export FEATURE_GATES=HostDevices,DynamicResourceAllocation
  export KUBEVIRT_NUM_NODES=1
  export KUBEVIRT_NUM_SECONDARY_NICS=1
  make cluster-up
  make cluster-sync
  ```
- Verify that the nodes are up and running:

  ```shell
  cluster-up/kubectl.sh get nodes
  ```

  It should show something like this:

  ```
  NAME     STATUS   ROLES                  AGE    VERSION
  node01   Ready    control-plane,worker   122m   v1.30.3
  ```
- SSH into the node and verify that the NVMe devices are present:

  ```shell
  cluster-up/ssh.sh node01
  lspci -Dnn | grep NVM
  ```

  It should show something like this:

  ```
  0000:00:07.0 Non-Volatile memory controller [0108]: Red Hat, Inc. QEMU NVM Express Controller [1b36:0010] (rev 02)
  0000:00:08.0 Non-Volatile memory controller [0108]: Red Hat, Inc. QEMU NVM Express Controller [1b36:0010] (rev 02)
  ```
- SSH into the control plane node:

  ```shell
  cluster-up/ssh.sh node01
  ```

- Edit the manifest files to add the necessary parameters:

  - kube-apiserver: Edit `/etc/kubernetes/manifests/kube-apiserver.yaml` and add the following to the `command` section under the `containers` section:

    ```yaml
    - --feature-gates=DynamicResourceAllocation=true
    - --runtime-config=resource.k8s.io/v1alpha2=true
    ```

  - kube-controller-manager: Edit `/etc/kubernetes/manifests/kube-controller-manager.yaml` and add the following to the `command` section under the `containers` section:

    ```yaml
    - --feature-gates=DynamicResourceAllocation=true
    ```

  - kube-scheduler: Edit `/etc/kubernetes/manifests/kube-scheduler.yaml` and add the following to the `command` section under the `containers` section:

    ```yaml
    - --feature-gates=DynamicResourceAllocation=true
    ```

  - kubelet config: Edit `/var/lib/kubelet/config.yaml` and add the following:

    ```yaml
    featureGates:
      DynamicResourceAllocation: true
    ```

    The static pods restart automatically when their manifest files change, but the kubelet must be restarted for its config change to take effect (for example, `sudo systemctl restart kubelet`).
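As a quick sanity check after editing, you can count how many of the three static pod manifests carry the feature gate; this is just a convenience, assuming you are still on the node:

```shell
# List each static pod manifest that contains the DRA feature gate and
# count them; on a correctly edited node all three paths are listed.
grep -l "feature-gates=DynamicResourceAllocation=true" \
  /etc/kubernetes/manifests/kube-apiserver.yaml \
  /etc/kubernetes/manifests/kube-controller-manager.yaml \
  /etc/kubernetes/manifests/kube-scheduler.yaml | wc -l
```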
- SSH into the control plane node:

  ```shell
  cluster-up/ssh.sh node01
  ```

- Create a script that unbinds a device from its default driver and binds it to the `vfio-pci` driver:

  ```shell
  vi pci-nvme-bind.sh
  ```

  Paste the following script and save:

  ```shell
  #!/bin/bash

  # Check if a PCI address is provided
  if [ -z "$1" ]; then
      echo "Usage: $0 <PCI_ADDRESS>"
      exit 1
  fi

  PCI_ADDRESS=$1

  # Unbind the device from its current driver
  echo "Unbinding the device $PCI_ADDRESS from its current driver..."
  echo "$PCI_ADDRESS" > /sys/bus/pci/drivers/nvme/unbind

  # Set the driver override to vfio-pci
  echo "Setting the driver override to vfio-pci for device $PCI_ADDRESS..."
  echo "vfio-pci" > /sys/bus/pci/devices/$PCI_ADDRESS/driver_override

  # Bind the device to the vfio-pci driver
  echo "Binding the device $PCI_ADDRESS to the vfio-pci driver..."
  echo "$PCI_ADDRESS" > /sys/bus/pci/drivers/vfio-pci/bind

  echo "Device $PCI_ADDRESS has been successfully bound to vfio-pci."
  ```
- Make the script executable and run it, passing the PCI address of each device as an argument. The addresses can be found using `lspci -Dnn | grep NVM`:

  ```shell
  chmod +x pci-nvme-bind.sh
  sudo ./pci-nvme-bind.sh 0000:00:07.0
  sudo ./pci-nvme-bind.sh 0000:00:08.0
  ```
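If the node has more NVMe controllers, the two invocations above generalize; a small sketch (assuming the `pci-nvme-bind.sh` script from the previous step) that binds every NVMe controller reported by `lspci`:

```shell
# Extract the PCI address (the first whitespace-separated field) of every
# NVMe controller reported by lspci and bind each one to vfio-pci.
lspci -Dnn | grep NVM | awk '{print $1}' | while read -r addr; do
  sudo ./pci-nvme-bind.sh "$addr"
done
```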
- Verify that the devices are bound to the `vfio-pci` driver:

  ```shell
  lspci -Dnnk -s 0000:00:07.0
  lspci -Dnnk -s 0000:00:08.0
  ```

  It should show `Kernel driver in use: vfio-pci` for the devices:

  ```
  0000:00:07.0 Non-Volatile memory controller [0108]: Red Hat, Inc. QEMU NVM Express Controller [1b36:0010] (rev 02)
          Subsystem: Red Hat, Inc. Device [1af4:1100]
          Kernel driver in use: vfio-pci
          Kernel modules: nvme
  0000:00:08.0 Non-Volatile memory controller [0108]: Red Hat, Inc. QEMU NVM Express Controller [1b36:0010] (rev 02)
          Subsystem: Red Hat, Inc. Device [1af4:1100]
          Kernel driver in use: vfio-pci
          Kernel modules: nvme
  ```
- Disable SELinux inside the node:

  ```shell
  sudo setenforce 0
  ```
- Check whether your Kubernetes cluster supports dynamic resource allocation:

  ```shell
  cluster-up/kubectl.sh get resourceclasses
  ```

  - If your cluster supports dynamic resource allocation, the response will be either a list of `ResourceClass` objects or:

    ```
    No resources found
    ```

  - If dynamic resource allocation is not supported, you will see the following error:

    ```
    error: the server does not have a resource type "resourceclasses"
    ```
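For reference, a `ResourceClass` is a small cluster-scoped object that names the DRA driver serving it. The names in this sketch are hypothetical placeholders for illustration only; they are not objects shipped by the dra-pci-driver:

```yaml
# Minimal sketch of a v1alpha2 ResourceClass; both names below are
# hypothetical and do not come from the dra-pci-driver manifests.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClass
metadata:
  name: example-pci-class
driverName: pci.example.dra.k8s.io
```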
- Download the KubeVirt DRA PCI Driver repository:

  ```shell
  git clone https://github.com/kubevirt/dra-pci-driver.git
  cd dra-pci-driver
  ```
- Build the driver:

  ```shell
  cd demo
  ./build-driver.sh
  ```

  The driver will be saved as an image named `registry.example.com/dra-pci-driver:v0.1.0`.

- Push the driver image into the cluster where KubeVirt is running:

  ```shell
  export K8S_VERSION=k8s-1.30 # Change this if you use a different version
  export CONT=${K8S_VERSION}-dnsmasq
  chmod +x image-push-docker.sh
  ./image-push-docker.sh registry.example.com/dra-pci-driver:v0.1.0
  ```
- Apply the DRA PCI Driver manifests:

  ```shell
  export KUBECONFIG=$(path/to/kubevirt/cluster-up/kubeconfig.sh)
  ./deploy-native.sh
  ```
- Verify the node state:

  ```shell
  kubectl describe node node01 -n dra-pci-driver
  ```

  The Spec should contain only allocatable devices:

  ```
  Spec:
    Allocatable Devices:
      Pci:
        Pci Address:    0000:00:07.0
        Resource Name:  devices.kubevirt.io/nvme
        Uuid:           bc628854-6471-463a-878d-b96b8c7022dd
      Pci:
        Pci Address:    0000:00:08.0
        Resource Name:  devices.kubevirt.io/nvme
        Uuid:           c98572f0-37a0-41bf-b4e0-70d8a12278a3
  ```
- Deploy an example VMI:

  ```shell
  kubectl apply -f vmi-resource-claim.yaml
  ```
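The actual `vmi-resource-claim.yaml` ships with the dra-pci-driver repository. As a rough sketch of the claim half of such a file (the claim name and class name here are hypothetical placeholders, and the VMI would reference the claim by name):

```yaml
# Sketch only: NOT the contents of vmi-resource-claim.yaml.
# The names below are hypothetical placeholders.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: nvme-claim
spec:
  resourceClassName: example-pci-class
```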
- Verify that the PCI device is allocated:

  ```shell
  kubectl describe node node01 -n dra-pci-driver
  ```

  The Spec should now also contain allocated and prepared claims:

  ```
  Spec:
    Allocatable Devices:
      Pci:
        Pci Address:    0000:00:07.0
        Resource Name:  devices.kubevirt.io/nvme
        Uuid:           bc628854-6471-463a-878d-b96b8c7022dd
      Pci:
        Pci Address:    0000:00:08.0
        Resource Name:  devices.kubevirt.io/nvme
        Uuid:           c98572f0-37a0-41bf-b4e0-70d8a12278a3
    Allocated Claims:
      d313f2ab-a84f-449e-bf98-1a379f256ec3:
        Pci:
          Devices:
            Uuid:  bc628854-6471-463a-878d-b96b8c7022dd
    Prepared Claims:
      d313f2ab-a84f-449e-bf98-1a379f256ec3:
        Pci:
          Devices:
            Uuid:  bc628854-6471-463a-878d-b96b8c7022dd
  ```
- Verify that the `virt-launcher` pod is running:

  ```shell
  kubectl get pods -A
  ```

  This should show a `virt-launcher` pod named `virt-launcher-vmi-nvme-xxx` in the `Running` state.