When using externalgrpc provider, nodes are created and then immediately deleted next loop iteration #5935

PeterGrace · 2023-07-06T19:09:07Z

Which component are you using?:
Cluster-Autoscaler externalgrpc provider

What version of the component are you using?:
registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.2

Component version:
helm chart version 9.29.x (edited to allow externalgrpc to consume cloudConfig path)

What k8s version are you using (kubectl version)?:

kubectl version Output

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:13:27Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.5+k3s1", GitCommit:"7cefebeaac7dbdd0bfec131ea7a43a45cb125354", GitTreeState:"clean", BuildDate:"2023-05-27T00:05:40Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?:
Lab environment

What did you expect to happen?:
I expected the node to be given enough time to provision and register with the Kubernetes master (60-90 seconds).

What happened instead?:
I'm working on a libvirt module for the externalgrpc provider. After a node is created via the externalgrpc provisioner, the autoscaler deletes it on the next loop iteration, reporting that the scale-up has timed out.

How to reproduce it (as minimally and precisely as possible):
Write a gRPC server for the externalgrpc provider that creates nodes, then observe the scale-up timeout occurring on the next loop iteration.

Anything else we need to know?:
I spent a lot of time in the Kubernetes #sig-autoscaling Slack channel discussing this with @vadasambar. They indicated that this appears to be an issue in the protobuf definition for externalgrpc: PR #5649 moved MaxNodeProvisionTime into NodeGroupAutoscalingOptions, but the externalgrpc/protobuf code was not updated to match this change. They mentioned they will likely submit a PR to fix this within the next day or so.


vadasambar · 2023-07-06T19:27:10Z

/assign vadasambar

vadasambar · 2023-07-06T19:27:34Z

WIP PR: #5936. If you are interested you can give it a try (you might have to re-build CA image which you can do using make dev-release).

vadasambar · 2023-07-07T06:11:13Z

> I spent a lot of time in the Kubernetes #sig-autoscaling slack channel discussing this with @vadasambar.

Link to the slack thread: https://kubernetes.slack.com/archives/C09R1LV8S/p1688649364267339

vadasambar · 2023-07-07T06:22:46Z

PR is ready for review. Waiting for reviews (#5936).

PeterGrace added the kind/bug Categorizes issue or PR as related to a bug. label Jul 6, 2023

This was referenced Jul 6, 2023

fix(cloudprovider/externalgrpc): GetOptions returns MaxNodeProvisionTime as 0 seconds #5936

Merged

Jul 2023 vadafoss/daily-updates#11

Closed

k8s-ci-robot assigned vadasambar Jul 6, 2023

jbartosik added the area/cluster-autoscaler label Jul 24, 2023

k8s-ci-robot closed this as completed in #5936 Sep 4, 2023
