Please reconsider deprecating google_container_node_pool -> initial_node_count #1160

Closed
james-masson opened this issue Mar 7, 2018 · 6 comments · Fixed by #1176
Labels: enhancement, forward/review, service/container

Comments

@james-masson

Terraform Version

Terraform v0.11.3
google-provider 1.6.0

Affected Resource(s)

google_container_node_pool

Terraform Configuration Files

resource "google_container_node_pool" "k8s" {
  name_prefix         = "${var.nodepool_name}-"
  zone                = "${var.region}-${element( split(",", lookup(var.zones_lookup, var.region)), 0 )}"
  cluster             = "${var.cluster_name}"
  initial_node_count  = "${var.initial_nodes_per_zone}"
.....

  autoscaling {
    min_node_count = "${var.min_nodes_per_zone}"
    max_node_count = "${var.max_nodes_per_zone}"
  }

  lifecycle {
    create_before_destroy = true
  }
}

Problem

I'm trying to provide seamless nodepool upgrade/replacement with Terraform - always maintaining enough nodes to run all services on the cluster during the upgrade.
The cluster workload is highly elastic, and uses node autoscaling heavily.

I use blue/green nodepools with create_before_destroy, and rely on initial_node_count to ensure there's enough free capacity for this seamless migration.

I cannot use node_count, as every subsequent Terraform run will show that my actual number of autoscaled nodes does not match the pre-configured amount, and offer to destroy the nodepool.

Not specifying either node_count or initial_node_count results in zero nodes in the nodepool initially, which doesn't provide the capacity to run all services during the upgrade. Node autoscaling catches up to workload demand too slowly during the nodepool replacement cycle to be useful here.

initial_node_count provides this "on-creation" boost to the nodepool, giving it capacity for the seamless migration before autoscaling brings the node count back down to the required minimum.
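
To make the intent concrete, here's a sketch of how the variables behind the config above could be set (illustrative values only; the point is that initial_nodes_per_zone sits well above the autoscaling minimum and close to the maximum):

variable "min_nodes_per_zone" {
  default = 1
}

variable "max_nodes_per_zone" {
  default = 10
}

variable "initial_nodes_per_zone" {
  # illustrative value only: close to max_nodes_per_zone so the replacement
  # pool can absorb the whole workload before autoscaling trims it back down
  default = 8
}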

@paultyng

paultyng commented Mar 7, 2018

@james-masson I'm not sure it would work in this scenario, but have you experimented with ignore_changes?
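
Something along these lines, untested, just to illustrate the idea (using node_count as the example attribute):

resource "google_container_node_pool" "k8s" {
  # ... existing arguments ...

  lifecycle {
    create_before_destroy = true

    # Terraform 0.11 syntax: drift in node_count caused by the autoscaler
    # would no longer show up as a planned change
    ignore_changes = ["node_count"]
  }
}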

@danawillow
Contributor

There's a bit of discussion around this in #844. Can you take a look at #844 (comment) and see if solution #2 would work for your use case?

@james-masson
Author

Apologies for not finding #844 - it is a similar use case.

Unfortunately, the solution you mentioned wouldn't work for me. I want the initial_node_count to be significantly higher than min_node_count and close to the max_node_count.

This initial capacity boost, combined with multiple nodepools, is the only way I've found to make the upgrade process close to seamless.

The reason I have blue/green nodepools is that the Google API for creating nodepools (and hence Terraform) returns success too soon. It's quite normal for the API to report success while there are still zero nodes available to host services for quite a while afterwards. I've also seen cases where no nodes are ever created successfully, due to bugs or misconfiguration.

create_before_destroy gives the new nodes a chance to be up and ready before the old ones are removed, although this isn't guaranteed. Multiple nodepools give some protection against bugs and against losing the create/destroy race.

What I'm trying to say is that the large initial_node_count and everything else is a workaround, because I can't trust Google (and hence Terraform) to replace nodepools in a reliable way.

I presume you're waiting for "RUNNING" from https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/projects.locations.clusters.nodePools#status ? I wonder why there's a discrepancy between the API state and what Kubernetes sees...

@paultyng

paultyng commented Mar 8, 2018

So you are saying something similar to how ASGs work on the AWS provider (https://www.terraform.io/docs/providers/aws/r/autoscaling_group.html#wait_for_elb_capacity) is what you are looking for here? The ability to wait for a certain number of healthy nodes before moving on to dependencies?
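
For reference, the AWS equivalent looks roughly like this (sketch only; it assumes an ELB and launch configuration defined elsewhere, and the numbers are illustrative):

resource "aws_autoscaling_group" "example" {
  availability_zones   = ["us-east-1a"]
  launch_configuration = "${aws_launch_configuration.example.name}"
  load_balancers       = ["${aws_elb.example.name}"]

  min_size         = 1
  max_size         = 10
  desired_capacity = 8

  # terraform apply blocks until this many instances pass the ELB
  # health checks, instead of returning as soon as the ASG exists
  wait_for_elb_capacity = 8
}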

@james-masson
Author

Exactly.

modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Sep 27, 2019
Signed-off-by: Modular Magician <magic-modules@google.com>
@ghost

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020
@github-actions github-actions bot added the service/container and forward/review labels Jan 15, 2025