cluster-autoscaler: priority-expander sort order always changing #16664

Closed
elliotdobson opened this issue Jul 9, 2024 · 0 comments · Fixed by #16670
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@elliotdobson
Contributor

/kind bug

1. What kops version are you running? The command kops version will display
this information.

Client version: 1.28.5 (git-v1.28.5)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Server Version: v1.28.11

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?
Configure the cluster-autoscaler managed addon with expander: priority, and define multiple instance groups that have autoscale: true and autoscalePriority set.
Then run kops update cluster a few times and observe that the order of the entries in the cluster-autoscaler-priority-expander ConfigMap changes on each run.

5. What happened after the commands executed?
The order of the instance groups in the cluster-autoscaler-priority-expander ConfigMap changes on every run, so kops reports that updates are required even though nothing was modified.

First run (nothing should have changed):

Will modify resources:
  ManagedFile/cluster.example.com-addons-bootstrap
  	Contents            
  	                    	...
  	                    	    - id: k8s-1.15
  	                    	      manifest: cluster-autoscaler.addons.k8s.io/k8s-1.15.yaml
  	                    	+     manifestHash: 2632a6222ec08e5fd1166ff29c0fed020dd81f7ca9c8400a8ec0e24a48d4e2c9
  	                    	-     manifestHash: 9383ffd41a1eb9e3d299f9e3ddaf1f1a4d440aaba92dedd7b1d4bd2f0fa3818d
  	                    	      name: cluster-autoscaler.addons.k8s.io
  	                    	      selector:
  	                    	...
  	                    	

  ManagedFile/cluster.example.com-addons-cluster-autoscaler.addons.k8s.io-k8s-1.15
  	Contents            
  	                    	...
  	                    	    priorities: |-
  	                    	      0:
  	                    	+     - nodes-b.cluster.example.com
  	                    	      - nodes-a.cluster.example.com
  	                    	-     - nodes-b.cluster.example.com
  	                    	      10:
  	                    	      - nodes-cifs.cluster.example.com
  	                    	      20:
  	                    	+     - nodes-a-ondemand.cluster.example.com
  	                    	-     - nodes-b-ondemand.cluster.example.com
  	                    	+     - nodes-b-ondemand.cluster.example.com
  	                    	-     - nodes-a-ondemand.cluster.example.com
  	                    	      30:
  	                    	+     - nodes-b-worker.cluster.example.com
  	                    	-     - nodes-b-import-worker.cluster.example.com
  	                    	+     - nodes-b-courier-worker.cluster.example.com
  	                    	-     - nodes-b-worker.cluster.example.com
  	                    	+     - nodes-a-spot.cluster.example.com
  	                    	-     - nodes-a-worker.cluster.example.com
  	                    	-     - nodes-b-spot.cluster.example.com
  	                    	+     - nodes-b-import-worker.cluster.example.com
  	                    	-     - nodes-a-spot.cluster.example.com
  	                    	+     - nodes-a-courier-worker.cluster.example.com
  	                    	-     - nodes-b-courier-worker.cluster.example.com
  	                    	      - nodes-a-import-worker.cluster.example.com
  	                    	+     - nodes-b-spot.cluster.example.com
  	                    	+     - nodes-a-worker.cluster.example.com
  	                    	-     - nodes-a-courier-worker.cluster.example.com
  	                    	  kind: ConfigMap
  	                    	  metadata:
  	                    	...
  	                    	

Second run (again, nothing should have changed):

Will modify resources:
  ManagedFile/cluster.example.com-addons-bootstrap
  	Contents            
  	                    	...
  	                    	    - id: k8s-1.15
  	                    	      manifest: cluster-autoscaler.addons.k8s.io/k8s-1.15.yaml
  	                    	+     manifestHash: 312c1246f94a347ff98f6aba70d6060f0de5d1ae488bedfdcd082d7a14b2555e
  	                    	-     manifestHash: 9383ffd41a1eb9e3d299f9e3ddaf1f1a4d440aaba92dedd7b1d4bd2f0fa3818d
  	                    	      name: cluster-autoscaler.addons.k8s.io
  	                    	      selector:
  	                    	...
  	                    	

  ManagedFile/cluster.example.com-addons-cluster-autoscaler.addons.k8s.io-k8s-1.15
  	Contents            
  	                    	...
  	                    	    priorities: |-
  	                    	      0:
  	                    	+     - nodes-b.cluster.example.com
  	                    	      - nodes-a.cluster.example.com
  	                    	-     - nodes-b.cluster.example.com
  	                    	      10:
  	                    	      - nodes-cifs.cluster.example.com
  	                    	...
  	                    	      - nodes-a-ondemand.cluster.example.com
  	                    	      30:
  	                    	+     - nodes-a-courier-worker.cluster.example.com
  	                    	-     - nodes-b-import-worker.cluster.example.com
  	                    	+     - nodes-a-import-worker.cluster.example.com
  	                    	-     - nodes-b-worker.cluster.example.com
  	                    	+     - nodes-a-spot.cluster.example.com
  	                    	-     - nodes-a-worker.cluster.example.com
  	                    	+     - nodes-a-worker.cluster.example.com
  	                    	-     - nodes-b-spot.cluster.example.com
  	                    	+     - nodes-b-courier-worker.cluster.example.com
  	                    	-     - nodes-a-spot.cluster.example.com
  	                    	+     - nodes-b-import-worker.cluster.example.com
  	                    	-     - nodes-b-courier-worker.cluster.example.com
  	                    	+     - nodes-b-spot.cluster.example.com
  	                    	-     - nodes-a-import-worker.cluster.example.com
  	                    	+     - nodes-b-worker.cluster.example.com
  	                    	-     - nodes-a-courier-worker.cluster.example.com
  	                    	  kind: ConfigMap
  	                    	  metadata:
  	                    	...
  	                    	

6. What did you expect to happen?
The order of the instance groups in the cluster-autoscaler-priority-expander ConfigMap should be the same on every run, so that an unchanged cluster produces no diff.
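
To illustrate the expected behaviour, here is a minimal Go sketch of a deterministic rendering of the priorities block. It is not the kops implementation: renderPriorities and its map input are hypothetical names chosen for this example. The underlying assumption, which this report does not confirm, is that the unstable order comes from iterating over an unordered collection such as a Go map. Sorting the priorities and the names within each priority makes the output, and therefore the manifestHash, stable across runs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderPriorities is a hypothetical helper, not the kops implementation.
// It renders priority -> node group names in a stable order:
// priorities ascending, names alphabetical within each priority.
func renderPriorities(groups map[int][]string) string {
	// Go map iteration order is unspecified, so collect and sort the keys first.
	priorities := make([]int, 0, len(groups))
	for p := range groups {
		priorities = append(priorities, p)
	}
	sort.Ints(priorities)

	var b strings.Builder
	for _, p := range priorities {
		// Copy before sorting so the caller's slice is left untouched.
		names := append([]string(nil), groups[p]...)
		sort.Strings(names)
		fmt.Fprintf(&b, "%d:\n", p)
		for _, n := range names {
			fmt.Fprintf(&b, "- %s\n", n)
		}
	}
	return b.String()
}

func main() {
	// Names taken from the diffs in this report; input order is deliberately mixed.
	groups := map[int][]string{
		20: {"nodes-b-ondemand.cluster.example.com", "nodes-a-ondemand.cluster.example.com"},
		0:  {"nodes-b.cluster.example.com", "nodes-a.cluster.example.com"},
		10: {"nodes-cifs.cluster.example.com"},
	}
	fmt.Print(renderPriorities(groups))
}
```

Calling renderPriorities repeatedly with the same input returns the same string regardless of map iteration order or the order of names in each slice, which is the property the ConfigMap needs.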

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
...
spec:
...
  cloudProvider: aws
  clusterAutoscaler:
    awsUseStaticInstanceList: false
    balanceSimilarNodeGroups: true
    cordonNodeBeforeTerminating: false
    cpuRequest: 100m
    enabled: true
    expander: priority
    maxNodeProvisionTime: 10m0s
    memoryRequest: 384Mi
    newPodScaleUpDelay: 30s
    scaleDownDelayAfterAdd: 10m0s
    scaleDownUnneededTime: 10m0s
    scaleDownUnreadyTime: 20m0s
    scaleDownUtilizationThreshold: "0.5"
    skipNodesWithLocalStorage: false
    skipNodesWithSystemPods: false
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-a
...
spec:
  autoscale: false
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-a-ondemand
...
spec:
  autoscale: true
  autoscalePriority: 20
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-a-spot
...
spec:
  autoscale: true
  autoscalePriority: 30
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-a-worker
...
spec:
  autoscale: true
  autoscalePriority: 30
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-b
...
spec:
  autoscale: false
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-b-ondemand
...
spec:
  autoscale: true
  autoscalePriority: 20
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-b-spot
...
spec:
  autoscale: true
  autoscalePriority: 30
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-b-worker
...
spec:
  autoscale: true
  autoscalePriority: 30
...

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?
I also noticed that instance groups with autoscale: false are included in the cluster-autoscaler-priority-expander ConfigMap. IMO they shouldn't be; as expected, they are not included in the cluster-autoscaler Deployment.

Removing autoscale: false from an instance group does remove it from the cluster-autoscaler-priority-expander ConfigMap, which is good, but it also adds the group to the cluster-autoscaler Deployment, which is not.

This behaviour needs tidying up too.
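
One way to keep the ConfigMap and the Deployment in agreement is to derive both from a single predicate. The Go sketch below is only an illustration under stated assumptions: instanceGroup, isAutoscaled, and autoscaledGroups are hypothetical names, not kops types or functions, and treating an unset autoscale field as enabled is inferred from the behaviour described in this report rather than taken from the kops source.

```go
package main

import "fmt"

// instanceGroup is a hypothetical, simplified stand-in for a kops InstanceGroup.
type instanceGroup struct {
	Name      string
	Autoscale *bool // nil means the field is not set in the manifest
	Priority  int
}

// isAutoscaled reports whether a group should be visible to cluster-autoscaler.
// Assumption: an unset field counts as enabled; only an explicit false opts out.
func isAutoscaled(ig instanceGroup) bool {
	return ig.Autoscale == nil || *ig.Autoscale
}

// autoscaledGroups returns the groups that pass isAutoscaled, preserving order.
// Both the priority ConfigMap and the Deployment arguments would be built from
// this one list, so they cannot disagree.
func autoscaledGroups(igs []instanceGroup) []instanceGroup {
	var out []instanceGroup
	for _, ig := range igs {
		if isAutoscaled(ig) {
			out = append(out, ig)
		}
	}
	return out
}

func main() {
	off, on := false, true
	igs := []instanceGroup{
		{Name: "nodes-a", Autoscale: &off},
		{Name: "nodes-a-ondemand", Autoscale: &on, Priority: 20},
		{Name: "nodes-a-spot", Autoscale: &on, Priority: 30},
	}
	for _, ig := range autoscaledGroups(igs) {
		fmt.Printf("%d %s\n", ig.Priority, ig.Name)
	}
}
```

With this shape, a group marked autoscale: false appears in neither artifact, and a group without the field appears in both.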

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 9, 2024