Merge branch 'main' into release-0.2
Change-Id: I47993710ef423009836f848680240078a1b9fe34
alculquicondor committed Aug 25, 2022
2 parents ddadb5b + 8016971 commit 7e47a65
Showing 16 changed files with 153 additions and 143 deletions.
30 changes: 19 additions & 11 deletions CHANGELOG/CHANGELOG-0.2.md
@@ -3,24 +3,32 @@
Changes since `v0.1.0`:

### Features
- Bumped the API version from v1alpha1 to v1alpha2. v1alpha1 is no longer supported and Queue is now named LocalQueue.
- Add webhooks to validate and add defaults to all kueue APIs.

- Upgrade the API version from v1alpha1 to v1alpha2. v1alpha1 is no longer supported.
v1alpha2 includes the following changes:
- Rename Queue to LocalQueue.
- Remove ResourceFlavor.labels. Use ResourceFlavor.metadata.labels instead.
- Add webhooks to validate and to add defaults to all kueue APIs.
- Add internal cert manager to serve webhooks with TLS.
- Use finalizers to prevent ClusterQueues and ResourceFlavors in use from being
deleted prematurely.
- Support [codependent resources](/docs/concepts/cluster_queue.md#codepedent-resources)
by assigning the same flavor to codependent resources in a pod set.
- Support [pod overhead](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-overhead/)
in Workload pod sets.
- Default requests to limits if requests are not set in a Workload pod set, to
match internal defaulting for k8s Pods.
- Added [prometheus metrics](/docs/reference/metrics.md) to monitor health of
- Set requests to limits if requests are not set in a Workload pod set,
matching [internal defaulting for k8s Pods](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#resources).
- Add [prometheus metrics](/docs/reference/metrics.md) to monitor health of
the system and the status of ClusterQueues.
- Use Server Side Apply for Workload admission to reduce API conflicts.

### Bug fixes

- Prevent Workloads that don't match the ClusterQueue's namespaceSelector from
blocking other Workloads in a StrictFIFO ClusterQueue.
- Fixed number of pending workloads in a BestEffortFIFO ClusterQueue.
- Fixed bug in a BestEffortFIFO ClusterQueue where a workload might not be
- Fix bug that caused Workloads that don't match the ClusterQueue's
namespaceSelector to block other Workloads in StrictFIFO ClusterQueues.
- Fix the number of pending workloads in BestEffortFIFO ClusterQueues status.
- Fix a bug in BestEffortFIFO ClusterQueues where a workload might not be
retried after a transient error.
- Fixed requeuing an out-of-date workload when failed to admit it.
- Fixed bug in a BestEffortFIFO ClusterQueue where unadmissible workloads
- Fix requeuing an out-of-date workload when failed to admit it.
- Fix a bug in BestEffortFIFO ClusterQueues where inadmissible workloads
were not removed from the ClusterQueue when removing the corresponding Queue.
17 changes: 8 additions & 9 deletions README.md
@@ -8,16 +8,15 @@ created) and when it should stop (as in active pods should be deleted).
## Why use Kueue

Kueue is a lean controller that you can install on top of a vanilla Kubernetes
cluster without replacing any components. It is compatible with cloud
environments where:
- Nodes and other compute resources can be scaled up and down.
cluster. Kueue does not replace any existing Kubernetes components. Kueue is
compatible with cloud environments where:
- Compute resources are elastic and can be scaled up and down.
- Compute resources are heterogeneous (in architecture, availability, price, etc.).

Kueue APIs allow you to express:
- Quotas and policies for fair sharing among tenants.
- Resource fungibility: if a [resource flavor](docs/concepts/cluster_queue.md#resourceflavor-object)
is fully utilized, run the [job](docs/concepts/workload.md) using a different
flavor.
is fully utilized, Kueue can admit the job using a different flavor.

The main design principle for Kueue is to avoid duplicating mature functionality
in [Kubernetes components](https://kubernetes.io/docs/concepts/overview/components/)
@@ -62,11 +61,11 @@ Learn more about:

<!-- TODO(#64) Remove links to google docs once the contents have been migrated to this repo -->

Learn more about the architecture of Kueue in the design docs:
Learn more about the architecture of Kueue with the following design docs:

- [bit.ly/kueue-apis](https://bit.ly/kueue-apis) (please join the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-batch)
to get access) discusses the API proposal and a high-level description of how it
operates.
- [bit.ly/kueue-apis](https://bit.ly/kueue-apis) discusses the API proposal and a
high-level description of how Kueue operates. Join the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-batch)
to get document access.
- [bit.ly/kueue-controller-design](https://bit.ly/kueue-controller-design)
presents the detailed design of the controller.

14 changes: 5 additions & 9 deletions apis/kueue/v1alpha2/resourceflavor_types.go
@@ -24,19 +24,15 @@ import (
//+kubebuilder:object:root=true
//+kubebuilder:resource:scope=Cluster

// ResourceFlavor is the Schema for the resourceflavors API
// ResourceFlavor is the Schema for the resourceflavors API.
//
// .metadata.labels associated with this flavor are matched against or
// converted to node affinity constraints on the workload’s pods.
// .metadata.labels can be up to 8 elements.
type ResourceFlavor struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

// labels associated with this flavor. They are matched against or
// converted to node affinity constraints on the workload’s pods.
// For example, cloud.provider.com/accelerator: nvidia-tesla-k80.
// More info: http://kubernetes.io/docs/user-guide/labels
//
// labels can be up to 8 elements.
Labels map[string]string `json:"labels,omitempty"`

// taints associated with this flavor that workloads must explicitly
// “tolerate” to be able to use this flavor.
// For example, cloud.provider.com/preemptible="true":NoSchedule
7 changes: 0 additions & 7 deletions apis/kueue/v1alpha2/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 1 addition & 3 deletions apis/kueue/webhooks/resourceflavor_webhook.go
@@ -88,11 +88,9 @@ func (w *ResourceFlavorWebhook) ValidateDelete(ctx context.Context, obj runtime.
func ValidateResourceFlavor(rf *kueue.ResourceFlavor) field.ErrorList {
var allErrs field.ErrorList

labelsPath := field.NewPath("labels")
if len(rf.Labels) > 8 {
allErrs = append(allErrs, field.TooMany(labelsPath, len(rf.Labels), 8))
allErrs = append(allErrs, field.TooMany(field.NewPath("metadata", "labels"), len(rf.Labels), 8))
}
allErrs = append(allErrs, metavalidation.ValidateLabels(rf.Labels, labelsPath)...)

taintsPath := field.NewPath("taints")
if len(rf.Taints) > 8 {
20 changes: 1 addition & 19 deletions apis/kueue/webhooks/resourceflavor_webhook_test.go
@@ -50,24 +50,6 @@ func TestValidateResourceFlavor(t *testing.T) {
Effect: corev1.TaintEffectNoSchedule,
}).Obj(),
},
{
name: "invalid label name",
rf: utiltesting.MakeResourceFlavor("resource-flavor").MultiLabels(map[string]string{
"foo@bar": "",
}).Obj(),
wantErr: field.ErrorList{
field.Invalid(field.NewPath("labels"), nil, ""),
},
},
{
name: "invalid label value",
rf: utiltesting.MakeResourceFlavor("resource-flavor").MultiLabels(map[string]string{
"foo": "@abcdefg",
}).Obj(),
wantErr: field.ErrorList{
field.Invalid(field.NewPath("labels"), nil, ""),
},
},
{
// Taint validation is not exhaustively tested, because the code was copied from upstream k8s.
name: "invalid taint",
@@ -88,7 +70,7 @@ func TestValidateResourceFlavor(t *testing.T) {
return m
}()).Obj(),
wantErr: field.ErrorList{
field.TooMany(field.NewPath("labels"), 9, 8),
field.TooMany(field.NewPath("metadata", "labels"), 9, 8),
},
},
{
14 changes: 4 additions & 10 deletions config/components/crd/bases/kueue.x-k8s.io_resourceflavors.yaml
@@ -18,7 +18,10 @@ spec:
- name: v1alpha2
schema:
openAPIV3Schema:
description: ResourceFlavor is the Schema for the resourceflavors API
description: "ResourceFlavor is the Schema for the resourceflavors API. \n
.metadata.labels associated with this flavor are matched against or converted
to node affinity constraints on the workload’s pods. .metadata.labels can
be up to 8 elements."
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
@@ -30,15 +33,6 @@ spec:
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
labels:
additionalProperties:
type: string
description: "labels associated with this flavor. They are matched against
or converted to node affinity constraints on the workload’s pods. For
example, cloud.provider.com/accelerator: nvidia-tesla-k80. More info:
http://kubernetes.io/docs/user-guide/labels \n labels can be up to 8
elements."
type: object
metadata:
type: object
taints:
75 changes: 43 additions & 32 deletions docs/concepts/cluster_queue.md
@@ -1,9 +1,9 @@
# Cluster Queue

A ClusterQueue is a cluster-scoped object that governs a pool of resources
such as CPU, memory and hardware accelerators. A `ClusterQueue` defines:
- The [resource _flavors_](#resourceflavor-object) that it manages, with usage
limits and order of consumption.
such as CPU, memory, and hardware accelerators. A ClusterQueue defines:
- The [resource _flavors_](#resourceflavor-object) that the ClusterQueue manages,
with usage limits and order of consumption.
- Fair sharing rules across the tenants of the cluster.

Only [cluster administrators](/docs/tasks#batch-administrator) should create `ClusterQueue` objects.
@@ -39,29 +39,29 @@ You can specify the quota as a [quantity](https://kubernetes.io/docs/reference/k
## Resources
In a ClusterQueue, you can define quotas for multiple [compute resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-types)
(cpu, memory, GPUs, etc.).
(CPU, memory, GPUs, etc.).
For each resource, you can define quotas for multiple _flavors_. A
flavor represents different variations of a resource. The variations can be
defined in a [ResourceFlavor object](#resourceflavor-object).
For each resource, you can define quotas for multiple _flavors_.
Flavors represent different variations of a resource (for example, different GPU
models). A flavor is defined using a [ResourceFlavor object](#resourceflavor-object).
In a process called [admission](.#admission), Kueue assigns
[Workload pod sets](workload.md#pod-sets) a flavor for each resource it requests.
In a process called [admission](.#admission), Kueue assigns to the
[Workload pod sets](workload.md#pod-sets) a flavor for each resource the pod set
requests.
Kueue assigns the first flavor in the ClusterQueue's `.spec.resources[*].flavors`
list that has enough unused `min` quota in the ClusterQueue or the
ClusterQueue's [cohort](#cohort).

### Codepedent resources

It is possible that multiple resources are tied to the same flavors. This is
typical for `cpu` and `memory`, where the flavors are generally tied to a
machine family or availability guarantees.
It is possible that multiple resources in a ClusterQueue have the same flavors.
This is typical for `cpu` and `memory`, where the flavors are generally tied to
a machine family or VM availability policies. When two or more resources in a
ClusterQueue match their flavors, they are said to be codependent resources.

If this is the case, the resources in the ClusterQueue must list the same
flavors in the same order. When two or more resources match their flavors,
they are said to be codependent. During admission, for each pod set in a
Workload, Kueue assigns the same flavor to the codependent resources that the
pod set requests.
To manage codependent resources, you should list the flavors in the ClusterQueue
resources in the same order. During admission, for each pod set in a Workload,
Kueue assigns the same flavor to the codependent resources that the pod set requests.

An example of a ClusterQueue with codependent resources looks like the following:

@@ -150,8 +150,8 @@ Resources in a cluster are typically not homogeneous. Resources could differ in:
- architecture (ex: x86 vs ARM CPUs)
- brands and models (ex: Radeon 7000 vs Nvidia A100 vs T4 GPUs)

A ResourceFlavor is an object that represents these variations and allows you
to associate them with node labels and taints.
A ResourceFlavor is an object that represents these resource variations and
allows you to associate them with node labels and taints.

**Note**: If your cluster is homogeneous, you can use an [empty ResourceFlavor](#empty-resourceflavor)
instead of adding labels to custom ResourceFlavors.
@@ -163,8 +163,8 @@ apiVersion: kueue.x-k8s.io/v1alpha1
kind: ResourceFlavor
metadata:
name: spot
labels:
instance-type: spot
labels:
instance-type: spot
taints:
- effect: NoSchedule
key: spot
@@ -177,7 +177,7 @@ ClusterQueue in the `.spec.resources[*].flavors[*].name` field.
### ResourceFlavor labels

To associate a ResourceFlavor with a subset of nodes of your cluster, you can
configure the `.labels` field with matching node labels that uniquely identify
configure the `.metadata.labels` field with matching node labels that uniquely identify
the nodes. If you are using [cluster autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler)
(or equivalent controllers), make sure it is configured to add those labels when
adding new nodes.
@@ -197,8 +197,8 @@ steps:

For example, for a [batch/v1.Job](https://kubernetes.io/docs/concepts/workloads/controllers/job/),
Kueue adds the labels to the `.spec.template.spec.nodeSelector` field. This
guarantees that the workload Pods run on the nodes associated to the flavor
that Kueue decided that the workload should use.
guarantees that the Workload's Pods can only be scheduled on the nodes
targeted by the flavor that Kueue assigned to the Workload.
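
As a rough illustration of the step described above, the following Go sketch merges a flavor's labels into a Pod template's node selector. It is a minimal sketch and not Kueue's actual code: the helper name `withFlavorLabels` is hypothetical, and letting flavor labels overwrite an existing key is an assumption made only to keep the example short.

```go
package main

import "fmt"

// withFlavorLabels returns a copy of nodeSelector extended with the labels of
// the assigned flavor. Hypothetical helper, not part of the Kueue API.
// Assumption: on a key conflict, the flavor's value wins.
func withFlavorLabels(nodeSelector, flavorLabels map[string]string) map[string]string {
	out := make(map[string]string, len(nodeSelector)+len(flavorLabels))
	for k, v := range nodeSelector {
		out[k] = v
	}
	for k, v := range flavorLabels {
		out[k] = v
	}
	return out
}

func main() {
	// Existing selector from the Job's .spec.template.spec.nodeSelector.
	existing := map[string]string{"disk": "ssd"}
	// Labels taken from the ResourceFlavor's .metadata.labels.
	flavor := map[string]string{"instance-type": "spot"}

	fmt.Println(withFlavorLabels(existing, flavor))
}
```

Because the result is a copy, the original selector stays untouched, which makes the sketch easy to reason about when the same template is evaluated against several flavors.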

### ResourceFlavor taints

@@ -208,7 +208,7 @@ with taints.
Taints on the ResourceFlavor work similarly to [node taints](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/).
For Kueue to admit a workload to use the ResourceFlavor, the PodSpecs in the
workload should have a toleration for it. As opposed to the behavior for
[ResourceFlavor labels](#resourceflavor-labels), Kueue will not add tolerations
[ResourceFlavor labels](#resourceflavor-labels), Kueue does not add tolerations
for the flavor taints.

### Empty ResourceFlavor
@@ -238,16 +238,27 @@ ClusterQueue.

### Flavors and borrowing semantics

When borrowing, Kueue satisfies the following admission semantics:
When a ClusterQueue is part of a cohort, Kueue satisfies the following admission
semantics:

- When assigning flavors, Kueue goes through the list of flavors in the
ClusterQueue's `.spec.resources[*].flavors`. For each flavor, Kueue attempts
to fit a Workload's pod set using the `min` quota of the ClusterQueue or the
unused `min` quota of other ClusterQueues in the cohort, up to the `max` quota
of the ClusterQueue. If the workload doesn't fit, Kueue proceeds evaluating the next
flavor in the list.
- A ClusterQueue can only borrow quota of flavors it defines and it can only
borrow quota for one flavor.
to fit a Workload's pod set according to the quota defined in the
ClusterQueue for the flavor and the unused quota in the cohort.
If the workload doesn't fit, Kueue evaluates the next flavor in the list.
- A Workload's pod set resource fits in a flavor defined for a ClusterQueue
resource if the sum of requests for the resource:
1. Is less than or equal to the unused `.quota.min` for the flavor in the
ClusterQueue; or
2. Is less than or equal to the sum of unused `.quota.min` for the flavor in
the ClusterQueues in the cohort, and
3. Is less than or equal to the unused `.quota.max` for the flavor in the
ClusterQueue.
In Kueue, when (2) and (3) are satisfied, but not (1), this is called
_borrowing quota_.
- A ClusterQueue can only borrow quota for flavors that the ClusterQueue defines.
- For each pod set resource in a Workload, a ClusterQueue can only borrow quota
for one flavor.
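
The fit conditions above can be sketched in Go for a single resource of a single pod set. This is one reading of the rules, where a request fits if (1) holds, or if both (2) and (3) hold, and it is not Kueue's implementation. The type `flavorQuota` and the functions `fits` and `firstFit` are hypothetical names, and quantities are simplified to plain integers.

```go
package main

import "fmt"

// flavorQuota holds hypothetical unused-quota figures for one flavor of one
// resource. Real Kueue uses resource quantities; integers keep the sketch short.
type flavorQuota struct {
	name            string
	unusedMin       int64 // unused .quota.min for the flavor in this ClusterQueue
	cohortUnusedMin int64 // sum of unused .quota.min for the flavor across the cohort
	unusedMax       int64 // unused .quota.max for the flavor in this ClusterQueue
}

// fits reports whether the request fits the flavor and whether fitting
// requires borrowing, following conditions (1), (2), and (3).
func fits(request int64, q flavorQuota) (ok, borrows bool) {
	if request <= q.unusedMin { // condition (1)
		return true, false
	}
	if request <= q.cohortUnusedMin && request <= q.unusedMax { // (2) and (3)
		return true, true
	}
	return false, false
}

// firstFit walks the flavors in list order and returns the first one that fits.
func firstFit(request int64, flavors []flavorQuota) (name string, borrows, ok bool) {
	for _, q := range flavors {
		if fit, b := fits(request, q); fit {
			return q.name, b, true
		}
	}
	return "", false, false
}

func main() {
	flavors := []flavorQuota{
		{name: "on-demand", unusedMin: 2, cohortUnusedMin: 3, unusedMax: 4},
		{name: "spot", unusedMin: 10, cohortUnusedMin: 10, unusedMax: 10},
	}
	for _, request := range []int64{1, 3, 5, 20} {
		name, borrows, ok := firstFit(request, flavors)
		fmt.Printf("request=%d flavor=%q borrows=%t ok=%t\n", request, name, borrows, ok)
	}
}
```

In this sketch a request of 3 lands on the first flavor by borrowing, while a request of 5 falls through to the second flavor, which matches the in-order evaluation the list describes.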

### Borrowing example

4 changes: 2 additions & 2 deletions docs/concepts/local_queue.md
@@ -4,9 +4,9 @@ A `LocalQueue` is a namespaced object that groups closely related workloads
belonging to a single tenant. A `LocalQueue` points to one [`ClusterQueue`](cluster_queue.md)
from which resources are allocated to run its workloads.

Users submit jobs to a `LocalQueue`, instead of directly to a `ClusterQueue`.
Users submit jobs to a `LocalQueue`, instead of to a `ClusterQueue` directly.
Tenants can discover which queues they can submit jobs to by listing the
local queues in their namespace. The command looks similar to the following:
local queues in their namespace. The command is similar to the following:

```sh
kubectl get -n my-namespace localqueues
3 changes: 2 additions & 1 deletion docs/setup/install.md
@@ -50,7 +50,8 @@ kubectl delete -f https://github.com/kubernetes-sigs/kueue/releases/download/$VE

### Upgrading from 0.1 to 0.2

Upgrading from `0.1.x` to `0.2.y` is not supported due to breaking API changes.
Upgrading from `0.1.x` to `0.2.y` is not supported because of breaking API
changes.
To install Kueue `0.2.y`, [uninstall](#uninstall) the older version first.

## Install a custom-configured released version
8 changes: 4 additions & 4 deletions docs/tasks/administer_cluster_quotas.md
@@ -143,8 +143,8 @@ apiVersion: kueue.x-k8s.io/v1alpha1
kind: ResourceFlavor
metadata:
name: x86
labels:
cpu-arch: x86
labels:
cpu-arch: x86
```
```yaml
@@ -153,8 +153,8 @@ apiVersion: kueue.x-k8s.io/v1alpha1
kind: ResourceFlavor
metadata:
name: arm
labels:
cpu-arch: arm
labels:
cpu-arch: arm
```
To create the ResourceFlavors, run the following command: