Extend documentation on pod group support
mimowo committed Feb 5, 2024
1 parent a8c11f3 commit 2f6fea0
Showing 2 changed files with 88 additions and 2 deletions.
48 changes: 46 additions & 2 deletions site/content/en/docs/tasks/run_plain_pods.md
@@ -1,9 +1,9 @@
---
title: "Run A Plain Pod"
title: "Run Plain Pods"
date: 2023-09-27
weight: 6
description: >
Run a Kueue scheduled Pod.
Run Jobs represented by plain pods, either single pods, or pod groups.
---

This page shows how to leverage Kueue's scheduling and resource management capabilities when running plain Pods.
@@ -105,3 +105,47 @@ You can create the Pod using the following command:
# Create the pod
kubectl apply -f kueue-pod.yaml
```

## Pod Group definition

To run a set of pods as a single unit, called a Pod Group, add the
`kueue.x-k8s.io/pod-group-name` label and the
`kueue.x-k8s.io/pod-group-total-count` annotation to every member of the
group, with the same values on each:

```yaml
metadata:
  labels:
    kueue.x-k8s.io/pod-group-name: "group-name"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
```
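
As a rough sketch of keeping the label and annotation consistent across members, the shell function below prints one manifest per pod from a single group name and count. The function name `gen_pod_group`, the use of the group name as the `generateName` prefix, and the container spec are illustrative assumptions, not part of Kueue; the `user-queue` queue name is taken from the example on this page.

```sh
#!/bin/sh
# Hypothetical helper (not part of Kueue): print COUNT Pod manifests that all
# carry the same pod-group-name label and pod-group-total-count annotation.
gen_pod_group() {
  group="$1"   # value for the kueue.x-k8s.io/pod-group-name label
  count="$2"   # number of pods, also used as the total-count annotation
  i=0
  while [ "$i" -lt "$count" ]; do
    cat <<EOF
---
apiVersion: v1
kind: Pod
metadata:
  generateName: ${group}-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "${group}"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "${count}"
spec:
  containers:
  - name: sleep
    image: busybox
    command: ["sleep", "3s"]
EOF
    i=$((i + 1))
  done
}
```

For instance, `gen_pod_group sample-group 2 | kubectl create -f -` would submit both members at once. Generating every manifest from one pair of arguments avoids a mismatch between the count annotation and the number of pods actually created.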

## Feature limitations

Kueue provides only the minimal functionality required to run pod groups,
intended for environments where pods are managed directly by external
controllers, without a Job-level CRD.

As a consequence of this design decision, Kueue does not re-implement core
functionality that is available at the Job-level API, such as advanced retry
policies. In particular, Kueue does not re-create failed pods.

Note that this design choice affects
[preemption](/docs/concepts/cluster_queue/#preemption).
When a workload represented by a pod group is preempted, Kueue terminates all
of its pods by issuing delete requests. When the workload is later
re-admitted, Kueue does not re-create the terminated pods; this task is left
to the user (or the external controller).

**NOTE:** We recommend migrating to Job-level APIs for managing sets of pods.

## Example Pod Group

Here is a sample Pod Group of two Pods, each of which just sleeps for a few seconds:

{{< include "examples/pods-kueue/kueue-pod-group.yaml" "yaml" >}}

You can create the Pod Group using the following command:
```sh
# The manifests use generateName, which kubectl apply does not support
kubectl create -f kueue-pod-group.yaml
```
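
To check on the members of a group after submitting them, one option is to select pods by the group label. This is a minimal sketch: the wrapper name `list_group_pods` is hypothetical, and it assumes `kubectl` is configured for the cluster and namespace where the pods were created.

```sh
#!/bin/sh
# Hypothetical helper: list the pods that carry the given pod-group-name label.
list_group_pods() {
  kubectl get pods -l "kueue.x-k8s.io/pod-group-name=$1"
}
```

Calling `list_group_pods sample-group` lists the pods from the example above. Because Kueue does not re-create pods, fewer pods than the `pod-group-total-count` value may indicate that members have to be submitted again.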
42 changes: 42 additions & 0 deletions site/static/examples/pods-kueue/kueue-pod-group.yaml
@@ -0,0 +1,42 @@
---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-pod-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  containers:
  - name: sleep
    image: busybox
    command:
      - sleep
    args:
      - 3s
    resources:
      requests:
        cpu: 3
---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-pod-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  containers:
  - name: sleep
    image: busybox
    command:
      - sleep
    args:
      - 3s
    resources:
      requests:
        cpu: 3
