
Concurrency issues when doing multiple Google Container Engine cluster or Pool operations #930

Closed
Stono opened this issue Jan 8, 2018 · 8 comments · Fixed by #937


Stono commented Jan 8, 2018

Hi there,
We have a Terraform file that manages multiple GKE clusters and their associated node pools. However, GKE only allows a single operation on a cluster (or its node pools) at a time.

In the example below, I made changes to two different node pools within the same cluster. Terraform attempted the calls in parallel, and one of them failed:

Error: Error applying plan:

1 error(s) occurred:

* module.k8-elastic.google_container_node_pool.warm-nodes (destroy): 1 error(s) occurred:

* google_container_node_pool.warm-nodes: Error deleting NodePool: googleapi: Error 400: Operation operation-1515398931908-b36cbd64 is currently upgrading cluster elastic-dev. Please wait and try again once it's done., failedPrecondition

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

There should be a concurrency limit on node pool and cluster operations, so parallel applies don't trip this API restriction.

sl1pm4t (Contributor) commented Jan 8, 2018

We've worked around this by using the depends_on meta-argument to set explicit dependencies between the node pools.
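Something like this (the cluster and pool names here are made up): chaining the pools with depends_on makes Terraform apply them one at a time instead of in parallel:

resource "google_container_node_pool" "warm" {
  name       = "warm"
  cluster    = "${google_container_cluster.elastic.name}"
  zone       = "europe-west2-a"
  node_count = 2
}

resource "google_container_node_pool" "hot" {
  name       = "hot"
  cluster    = "${google_container_cluster.elastic.name}"
  zone       = "europe-west2-a"
  node_count = 2

  # Explicit dependency: this pool is only modified after "warm" finishes.
  depends_on = ["google_container_node_pool.warm"]
}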

rosbo (Contributor) commented Jan 9, 2018

I added a mutex to ensure operations on the same cluster are applied serially. The workaround using depends_on won't be necessary in the next release :)
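Roughly, the pattern looks like this (a simplified sketch, not the provider's exact code): a map of mutexes keyed by cluster name, so operations on the same cluster queue up while operations on different clusters still run in parallel:

package main

import (
	"fmt"
	"sync"
)

// mutexKV hands out one mutex per key, so callers that share a key
// (here, a cluster name) serialize while other keys stay independent.
type mutexKV struct {
	lock  sync.Mutex
	store map[string]*sync.Mutex
}

// get returns the mutex for a key, creating it on first use.
func (m *mutexKV) get(key string) *sync.Mutex {
	m.lock.Lock()
	defer m.lock.Unlock()
	mu, ok := m.store[key]
	if !ok {
		mu = &sync.Mutex{}
		m.store[key] = mu
	}
	return mu
}

func (m *mutexKV) Lock(key string)   { m.get(key).Lock() }
func (m *mutexKV) Unlock(key string) { m.get(key).Unlock() }

func main() {
	mkv := &mutexKV{store: make(map[string]*sync.Mutex)}
	var wg sync.WaitGroup
	for _, pool := range []string{"warm", "hot", "hot-spot"} {
		wg.Add(1)
		go func(pool string) {
			defer wg.Done()
			// All three pools belong to cluster "elastic-dev", so they
			// contend on the same lock and the API calls run one at a time.
			mkv.Lock("elastic-dev")
			defer mkv.Unlock("elastic-dev")
			fmt.Println("updating node pool", pool)
		}(pool)
	}
	wg.Wait()
}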

Stono (Author) commented Jan 9, 2018 via email

rosbo (Contributor) commented Jan 9, 2018

No date set yet, but it should be soon :) We have a good number of changes and bug fixes waiting for a release.

Stono (Author) commented Jan 9, 2018

@rosbo just to confirm that it will solve this problem (this is me trying to update the min node count on three pools):

module.k8-elastic.module.warm-nodes.google_container_node_pool.elastic-pool: Modifying... (ID: europe-west2-a/elastic-dev/warm)
  autoscaling.0.min_node_count: "2" => "0"
module.k8-elastic.module.hot-nodes.google_container_node_pool.elastic-pool: Modifying... (ID: europe-west2-a/elastic-dev/hot)
  autoscaling.0.min_node_count: "2" => "0"
module.k8-elastic.module.hot-nodes-spot.google_container_node_pool.elastic-pool: Modifying... (ID: europe-west2-a/elastic-dev/hot-spot)
  autoscaling.0.min_node_count: "2" => "0"
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 10s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 20s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 30s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 40s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 50s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 1m0s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 1m10s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 1m20s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 1m30s elapsed)
module.k8-elastic.warm-nodes.google_container_node_pool.elastic-pool: Still modifying... (ID: europe-west2-a/elastic-dev/warm, 1m40s elapsed)
module.k8-elastic.module.warm-nodes.google_container_node_pool.elastic-pool: Modifications complete after 1m46s (ID: europe-west2-a/elastic-dev/warm)

Error: Error applying plan:

2 error(s) occurred:

* module.k8-elastic.module.hot-nodes.google_container_node_pool.elastic-pool: 1 error(s) occurred:

* google_container_node_pool.elastic-pool: googleapi: Error 400: Operation operation-1515531645547-17de0f4d is currently upgrading cluster elastic-dev. Please wait and try again once it's done., failedPrecondition
* module.k8-elastic.module.hot-nodes-spot.google_container_node_pool.elastic-pool: 1 error(s) occurred:

* google_container_node_pool.elastic-pool: googleapi: Error 400: Operation operation-1515531645547-17de0f4d is currently upgrading cluster elastic-dev. Please wait and try again once it's done., failedPrecondition

rosbo (Contributor) commented Jan 9, 2018

Yes, it should. I tested my fix with a similar example.

Stono (Author) commented Jan 9, 2018 via email

modular-magician added a commit to modular-magician/terraform-provider-google referencing this issue on Sep 27, 2019 (Signed-off-by: Modular Magician <magic-modules@google.com>).

steved added a commit to dominodatalab/terraform-gcp-gke referencing this issue on Feb 11, 2020, with the note: "hashicorp/terraform-provider-google#930 seems to imply that each provider has a mutex lock, so having one node pool on a different provider could cause parallel ops."
ghost commented Mar 30, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

ghost locked and limited conversation to collaborators on Mar 30, 2020