Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use annotations to set labels and taints for clusterapi nodegroups #5382

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 16 additions & 4 deletions cluster-autoscaler/cloudprovider/clusterapi/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -255,10 +255,22 @@ rules:

#### Pre-defined labels and taints on nodes scaled from zero

The Cluster API provider currently does not support the addition of pre-defined
labels and taints for node groups that are scaling from zero. This work is on-going
and will be included in a future release once the API for specifying those
labels and taints has been accepted by the community.
To provide labels or taint information for scale from zero, the optional
capacity annotations may be supplied as a comma separated list, as
demonstrated in the example below:

```yaml
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
annotations:
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "5"
cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "0"
capacity.cluster-autoscaler.kubernetes.io/memory: "128G"
capacity.cluster-autoscaler.kubernetes.io/cpu: "16"
capacity.cluster-autoscaler.kubernetes.io/labels: "key1=value1,key2=value2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can avoid asking the users to specify the list of labels by inferring if from the MachineDeployment itself.

According to what is defined in https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20220927-label-sync-between-machine-and-nodes.md and in https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20221003-In-place-propagation-of-Kubernetes-objects-only-changes.md

  • Labels that will be propagated to nodes derive from labels defined in MachineDeployment.spec.template.metadata.labels
  • Among those labels only the one node-role.kubernetes.io/* label, node-restriction.kubernetes.io/ domain, node.cluster.x-k8s.io domain are going to be propagated

(NOTE: this will apply to CAPI >=1.4 if we manage to complete the design proposal implementation during the current release cycle)
(NOTE: this is only the subset of labels that CAPI will apply, it doesn't include labels Kubelet adds automatically)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can avoid asking the users to specify the list of labels by inferring if from the MachineDeployment itself.

i think we definitely want to do this once that implementation is released, but i think we could still keep these annotations as a continuation of the overrides we currently have (eg for cpu, memory, etc).

so, ultimately, the cluster autoscaler would inspect the labels from MachineDeployment.spec.template.metadata.labels, and use some of those labels (based on the rules in the enhancement), and then inspect the annotations on the machinedeployment to see if any labels should be overridden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabriziopandini just a follow up here, is there an objection to adding these override annotations in addition to whatever we do with reading the labels from the MachineDeployment?

capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute"
```

## Specifying a Custom Resource Group

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/validation"
klog "k8s.io/klog/v2"
)

Expand Down Expand Up @@ -169,15 +170,36 @@ func (r unstructuredScalableResource) MarkMachineForDeletion(machine *unstructur
}

func (r unstructuredScalableResource) Labels() map[string]string {
// TODO implement this once the community has decided how they will handle labels
// this issue is related, https://github.com/kubernetes-sigs/cluster-api/issues/7006

annotations := r.unstructured.GetAnnotations()
// annotation value of the form "key1=value1,key2=value2"
if val, found := annotations[labelsKey]; found {
labels := strings.Split(val, ",")
kv := make(map[string]string, len(labels))
for _, label := range labels {
split := strings.SplitN(label, "=", 2)
if len(split) == 2 {
kv[split[0]] = split[1]
}
}
return kv
}
return nil
}

func (r unstructuredScalableResource) Taints() []apiv1.Taint {
// TODO implement this once the community has decided how they will handle taints

annotations := r.unstructured.GetAnnotations()
// annotation value the form of "key1=value1:condition,key2=value2:condition"
if val, found := annotations[taintsKey]; found {
taints := strings.Split(val, ",")
ret := make([]apiv1.Taint, 0, len(taints))
for _, taintStr := range taints {
taint, err := parseTaint(taintStr)
if err == nil {
ret = append(ret, taint)
}
}
return ret
}
return nil
}

Expand Down Expand Up @@ -359,3 +381,44 @@ func resourceCapacityFromInfrastructureObject(infraobj *unstructured.Unstructure

return capacity
}

// adapted from https://github.com/kubernetes/kubernetes/blob/release-1.25/pkg/util/taints/taints.go#L39
func parseTaint(st string) (apiv1.Taint, error) {
var taint apiv1.Taint

var key string
var value string
var effect apiv1.TaintEffect

parts := strings.Split(st, ":")
switch len(parts) {
case 1:
key = parts[0]
case 2:
effect = apiv1.TaintEffect(parts[1])

partsKV := strings.Split(parts[0], "=")
if len(partsKV) > 2 {
return taint, fmt.Errorf("invalid taint spec: %v", st)
}
key = partsKV[0]
if len(partsKV) == 2 {
value = partsKV[1]
if errs := validation.IsValidLabelValue(value); len(errs) > 0 {
return taint, fmt.Errorf("invalid taint spec: %v, %s", st, strings.Join(errs, "; "))
}
}
default:
return taint, fmt.Errorf("invalid taint spec: %v", st)
}

if errs := validation.IsQualifiedName(key); len(errs) > 0 {
return taint, fmt.Errorf("invalid taint spec: %v, %s", st, strings.Join(errs, "; "))
}

taint.Key = key
taint.Value = value
taint.Effect = effect

return taint, nil
}
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@ import (
"testing"
"time"

"github.com/stretchr/testify/assert"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
Expand Down Expand Up @@ -287,12 +289,15 @@ func TestAnnotations(t *testing.T) {
diskQuantity := resource.MustParse("100Gi")
gpuQuantity := resource.MustParse("1")
maxPodsQuantity := resource.MustParse("42")
expectedTaints := []v1.Taint{{Key: "key1", Effect: v1.TaintEffectNoSchedule, Value: "value1"}, {Key: "key2", Effect: v1.TaintEffectNoExecute, Value: "value2"}}
annotations := map[string]string{
cpuKey: cpuQuantity.String(),
memoryKey: memQuantity.String(),
diskCapacityKey: diskQuantity.String(),
gpuCountKey: gpuQuantity.String(),
maxPodsKey: maxPodsQuantity.String(),
taintsKey: "key1=value1:NoSchedule,key2=value2:NoExecute",
labelsKey: "key3=value3,key4=value4,key5=value5",
}

test := func(t *testing.T, testConfig *testConfig, testResource *unstructured.Unstructured) {
Expand Down Expand Up @@ -333,6 +338,15 @@ func TestAnnotations(t *testing.T) {
} else if maxPodsQuantity.Cmp(maxPods) != 0 {
t.Errorf("expected %v, got %v", maxPodsQuantity, maxPods)
}

taints := sr.Taints()
assert.Equal(t, expectedTaints, taints)

labels := sr.Labels()
assert.Len(t, labels, 3)
assert.Equal(t, "value3", labels["key3"])
assert.Equal(t, "value4", labels["key4"])
assert.Equal(t, "value5", labels["key5"])
}

t.Run("MachineSet", func(t *testing.T) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,8 @@ const (
gpuTypeKey = "capacity.cluster-autoscaler.kubernetes.io/gpu-type"
gpuCountKey = "capacity.cluster-autoscaler.kubernetes.io/gpu-count"
maxPodsKey = "capacity.cluster-autoscaler.kubernetes.io/maxPods"
taintsKey = "capacity.cluster-autoscaler.kubernetes.io/taints"
labelsKey = "capacity.cluster-autoscaler.kubernetes.io/labels"
)

var (
Expand Down