Add new Dynamic Resource Allocation examples #49079

Status: Draft. Wants to merge 3 commits into base branch `main`.
@@ -383,3 +383,5 @@ is enabled in the kube-apiserver.
- For more information on the design, see the
[Dynamic Resource Allocation with Structured Parameters](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
KEP.
- For more examples showing different ways DRA can be configured for workloads,
see [Assign Resources to Containers and Pods with Dynamic Resource Allocation](/docs/tasks/configure-pod-container/assign-dra-resource/).
Contributor:

Early feedback: this sounds like it belongs in the Tutorials section, not Tasks.

Why:

  • this suggests deploying a whole new cluster
  • if we had cluster admins saying "look, I'm in a hurry, just show me how to deploy the Example Hardware driver, where are the docs?" then a task would be the right fit; this isn't like that though

Contributor:

There may be specific other tasks to cover, though.

For example:

  • how do I troubleshoot resource allocation?
  • how do I check what devices are allocatable?
  • how do I find out about the utilization of my dynamically-allocated devices?

All of those are questions we could cover with a task page (typically a task page per question, though sometimes we combine them).

Contributor (Author):

Taking another look at the example driver, I'm thinking it's likely we can describe workable examples here given "a cluster with DRA enabled" without requiring the exact kind cluster the driver's docs describe. I forgot that the example driver doesn't publish the Helm chart anywhere though, so it needs to be built locally. If we can publish that chart somewhere publicly like GitHub and we find that any DRA-enabled cluster works, do you think that would simplify the setup enough to justify keeping this as a Task?

+1 to those other topics, I think those would be great to include. I'll add placeholders for those.

Contributor:

Nope. I will never meet a cluster admin who wants guidance on setting up the example driver in their existing production cluster.

If you'd never do it outside of learning context, it's unlikely to be a task.

Contributor (Author):

Sounds good, I've split out the examples requiring the example driver into a new tutorial.

@@ -0,0 +1,70 @@
---
title: Assign Resources to Containers and Pods with Dynamic Resource Allocation
Contributor:

-title: Assign Resources to Containers and Pods with Dynamic Resource Allocation
+title: Learn About Dynamic Resource Allocation

?

content_type: task
weight: 270
---

<!-- overview -->

{{< feature-state feature_gate_name="DynamicResourceAllocation" >}}

This page shows how to assign resources defined with the Dynamic Resource
Allocation (DRA) APIs to containers.


## {{% heading "prerequisites" %}}

- `kind`
- `kubectl`
- `helm`


<!-- steps -->

## Deploy an example DRA driver

- Reproduce the steps from https://github.com/kubernetes-sigs/dra-example-driver?tab=readme-ov-file#demo to create a cluster and install the driver
Contributor:

early feedback: we avoid sending people to GitHub repos to discover parts of the documentation

Contributor (Author):

When I fill out this section, I intend to essentially copy some of the steps from the linked doc into this one, so the steps here would be things like "run this script" instead of "follow the steps in this linked document." Is that in line with your suggestion here?


- Show DeviceClass
- Show ResourceSlice for a Node
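
Filled in, the inspection steps above might look like the following sketch. The `gpu.example.com` DeviceClass name is what the example driver is expected to install; verify it against the driver version you deploy:

```shell
# List the DeviceClasses installed by the example DRA driver
kubectl get deviceclasses

# Show the ResourceSlice objects that each node's driver instance publishes
kubectl get resourceslices
kubectl describe deviceclass gpu.example.com
```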


## Allocate one device for a container

- Create a ResourceClaim requesting one device
- Create a Pod with one container referencing the ResourceClaim
- Show that the ResourceClaim is allocated
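
A sketch of what this example could contain. The object names are hypothetical, and `gpu.example.com` is assumed to be the DeviceClass created by the example driver:

```yaml
# A ResourceClaim requesting one device from the example driver's DeviceClass
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu-claim
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com
---
# A Pod whose single container references that claim
apiVersion: v1
kind: Pod
metadata:
  name: single-gpu-pod
spec:
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["bash", "-c", "sleep 9999"]
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    resourceClaimName: single-gpu-claim
```

Once the Pod is scheduled, `kubectl get resourceclaim single-gpu-claim` should report the claim as allocated and reserved.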


## Allocate one device to be shared among multiple Pods

- Same as first example, with multiple Pods referencing the same
ResourceClaim
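
For instance, two Pods can name the same pre-created ResourceClaim (hypothetical names; a sketch only):

```yaml
# Both Pods reference the same ResourceClaim by name,
# so they share the single allocated device.
apiVersion: v1
kind: Pod
metadata:
  name: shared-gpu-pod-1
spec:
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["bash", "-c", "sleep 9999"]
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    resourceClaimName: shared-gpu-claim
---
apiVersion: v1
kind: Pod
metadata:
  name: shared-gpu-pod-2
spec:
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["bash", "-c", "sleep 9999"]
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    resourceClaimName: shared-gpu-claim
```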


## Allocate one device per replica of a Deployment

- Same as first example, using a Deployment with several replicas and the Pod
template references a ResourceClaimTemplate.

- Show how several ResourceClaims are generated based on the one
ResourceClaimTemplate

- Scale the Deployment beyond the number of available devices
- Show the unallocatable ResourceClaims
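
A sketch of this example, again with hypothetical object names and the example driver's assumed `gpu.example.com` DeviceClass:

```yaml
# Each replica gets its own generated ResourceClaim from this template
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: gpu-claim-template
spec:
  spec:
    devices:
      requests:
      - name: gpu
        deviceClassName: gpu.example.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: gpu-deploy
  template:
    metadata:
      labels:
        app: gpu-deploy
    spec:
      containers:
      - name: ctr
        image: ubuntu:22.04
        command: ["bash", "-c", "sleep 9999"]
        resources:
          claims:
          - name: gpu
      resourceClaims:
      - name: gpu
        resourceClaimTemplateName: gpu-claim-template
```

`kubectl get resourceclaims` should then list one generated claim per Pod. Scaling the Deployment past the number of published devices, for example with `kubectl scale deployment/gpu-deploy --replicas=100`, should leave the extra claims unallocated and the corresponding Pods Pending.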


## Clean up

- Delete the kind cluster
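
Assuming the cluster was created by the example driver's demo setup, something like the following; the cluster name is hypothetical, so use whatever `kind get clusters` reports:

```shell
kind get clusters
kind delete cluster --name dra-example-driver-cluster
```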


## {{% heading "whatsnext" %}}

### For workload administrators

* [Schedule GPUs](/docs/tasks/manage-gpus/scheduling-gpus/)

### For device driver authors

* [Example Resource Driver for Dynamic Resource Allocation](https://github.com/kubernetes-sigs/dra-example-driver)
4 changes: 4 additions & 0 deletions content/en/docs/tutorials/_index.md
@@ -60,6 +60,10 @@ Before walking through each tutorial, you may want to bookmark the

* [Running Kubelet in Standalone Mode](/docs/tutorials/cluster-management/kubelet-standalone/)

## Dynamic Resource Allocation

* [Comparing Dynamic Resource Allocation to Device Plugins](/docs/tutorials/dynamic-resource-allocation/comparing-dra-device-plugin/)

## {{% heading "whatsnext" %}}

If you would like to write a tutorial, see
@@ -0,0 +1,5 @@
---
title: "Dynamic Resource Allocation"
weight: 80
---

@@ -0,0 +1,163 @@
---
title: Comparing Dynamic Resource Allocation to Device Plugins
content_type: tutorial
weight: 10
---

<!-- overview -->

Both Dynamic Resource Allocation (DRA) and device plugins enable Kubernetes
workloads to leverage specialized hardware from various vendors. This tutorial
shows how to configure the same GPU-enabled workload with DRA and with a device
plugin, to illustrate the differences between the two sets of APIs.


## {{% heading "objectives" %}}

* Learn when to prefer using device plugins or DRA when configuring containers'
requests for devices.


## {{% heading "prerequisites" %}}

* An NVIDIA GPU-enabled cluster with GPU Operator installed
Contributor (Author):

Something vendor-specific like this might not fit well in these docs, but NVIDIA GPUs are probably the most common DRA use case right now and NVIDIA's device plugin and DRA driver make for the most apples-to-apples comparison of these APIs at the moment I think.

Contributor:

Can we use tabs to let people take part even with different vendors?

Contributor (Author):

I think that would work well if we can map the same use cases 1:1 across vendors, like if AMD eventually exposes a similar GPU and "security/network isolation" device like NVIDIA's IMEX channels. Different use cases might be better expressed as separate sections though, but we can definitely play around with how those look when we can produce more examples here.

* `kubectl`
* `helm`


<!-- lessoncontent -->

## Deploy a workload using GPUs configured via device plugin

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: device-plugin-deploy
labels:
app: device-plugin
spec:
replicas: 1
selector:
matchLabels:
app: device-plugin
template:
metadata:
labels:
app: device-plugin
spec:
containers:
- name: ctr
image: ubuntu:22.04
command: ["bash", "-c"]
args: ["export; trap 'exit 0' TERM; sleep 9999 & wait"]
resources:
limits:
nvidia.com/gpu: 1
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- device-plugin
topologyKey: nvidia.com/gpu.imex-domain
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: NotIn
values:
- device-plugin
topologyKey: nvidia.com/gpu.imex-domain
```

- GPU resources are requested through the `nvidia.com/gpu` extended resource in
the container's `resources.limits` (for extended resources, Kubernetes sets the
request equal to the limit)
- `podAffinity` keeps this Deployment's Pods on Nodes within the same IMEX
domain
- `podAntiAffinity` keeps this Deployment's Pods out of IMEX domains where
unrelated Pods (any Pod without the `app: device-plugin` label) are running


## Deploy a workload using GPUs configured via DRA

```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
name: test-gpu-claim
spec:
spec:
devices:
requests:
- name: gpu
deviceClassName: gpu.nvidia.com
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
name: test-imex-claim
spec:
devices:
requests:
- name: imex
deviceClassName: imex.nvidia.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: dra-deploy
labels:
app: dra
spec:
replicas: 1
selector:
matchLabels:
app: dra
template:
metadata:
labels:
app: dra
spec:
containers:
- name: ctr
image: ubuntu:22.04
command: ["bash", "-c"]
args: ["export; trap 'exit 0' TERM; sleep 9999 & wait"]
resources:
claims:
- name: imex
- name: gpu
resourceClaims:
- name: imex
resourceClaimName: test-imex-claim
- name: gpu
resourceClaimTemplateName: test-gpu-claim
```

- Devices are requested through the container's `resources.claims`, whose
entries reference the Pod-level `resourceClaims`; in this example `gpu` maps to
a ResourceClaimTemplate and `imex` to a shared ResourceClaim.
- All of the Deployment's Pods share a single ResourceClaim for one distinct
NVIDIA IMEX channel. This ensures all of these Pods are running within the same
IMEX domain and that other Pods will not run in that IMEX domain without also
referring to the same ResourceClaim.


## Clean up
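
One possible clean-up sequence, deleting the objects created above by name:

```shell
kubectl delete deployment dra-deploy device-plugin-deploy
kubectl delete resourceclaim test-imex-claim
kubectl delete resourceclaimtemplate test-gpu-claim
```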


## Conclusion

### Reasons to prefer device plugins

### Reasons to prefer DRA


## {{% heading "whatsnext" %}}

* Learn more about [Device Plugins](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)
* Learn more about [Dynamic Resource Allocation](/docs/concepts/scheduling-eviction/dynamic-resource-allocation/)
* See more examples of how to [Assign Resources to Containers and Pods with Dynamic Resource Allocation](/docs/tasks/configure-pod-container/assign-dra-resource/)