ECS plan fails if cluster has been deleted outside Terraform #15917

adam-tylr · 2020-10-29T18:45:07Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v0.13.2
+ provider registry.terraform.io/hashicorp/aws v3.12.0

Affected Resource(s)

aws_ecs_service
aws_ecs_cluster

Terraform Configuration Files

resource "aws_ecs_cluster" "foo" {
  name = "my-cluster"
}

resource "aws_ecs_task_definition" "task" {
  family                = "service"
  container_definitions = file("service.json")
}

resource "aws_ecs_service" "service" {
  name            = "my-service"
  cluster         = aws_ecs_cluster.foo.id
  task_definition = aws_ecs_task_definition.task.arn
}

service.json is taken straight from the example in the docs: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ecs_task_definition

Debug Output

I cannot provide the full debug output because of security restrictions with my employer, but this is the relevant section:

2020-10-29T14:11:51.026-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: 2020/10/29 14:11:51 [DEBUG] [aws-sdk-go] DEBUG: Response ecs/DescribeServices Details:
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: ---[ RESPONSE ]--------------------------------------
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: HTTP/1.1 400
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: Connection: close
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: Content-Length: 68
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: Content-Type: application/x-amz-json-1.1
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: Date: Thu, 29 Oct 2020 18:11:50 GMT
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: X-Amzn-Requestid: 
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe:
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe:
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: -----------------------------------------------------
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: 2020/10/29 14:11:51 [DEBUG] [aws-sdk-go] {"__type":"ClusterNotFoundException","message":"Cluster not found."}
2020-10-29T14:11:51.027-0400 [DEBUG] plugin.terraform-provider-aws_v3.12.0_x5.exe: 2020/10/29 14:11:51 [DEBUG] [aws-sdk-go] DEBUG: Validate Response ecs/DescribeServices failed, attempt 0/25, error ClusterNotFoundException: Cluster not found.
2020/10/29 14:11:51 [ERROR] eval: *terraform.EvalRefresh, err: Error reading ECS service: ClusterNotFoundException: Cluster not found.
2020/10/29 14:11:51 [ERROR] eval: *terraform.EvalSequence, err: Error reading ECS service: ClusterNotFoundException: Cluster not found.

Panic Output

Expected Behavior

I created an ECS cluster with an associated service, then manually deleted the cluster and the service in the console (or as part of a regular cleanup script). When I run terraform plan again, I expect it to produce a valid plan that re-creates the cluster and the service.

Actual Behavior

I created an ECS cluster with an associated service, then manually deleted the cluster and the service in the console. When I run terraform plan again, Terraform fails with Error: Error reading ECS service: ClusterNotFoundException: Cluster not found.

Steps to Reproduce

1. Run terraform apply to create the cluster and service.
2. Manually delete the cluster in the AWS console.
3. Wait an undetermined amount of time for the cluster to actually be removed (see Important Factoids), or simulate this by manually editing the Terraform state so that the service points to a non-existent cluster.
4. Run terraform plan.

Important Factoids

When an ECS cluster and service are deleted, they are put in an inactive state and disappear from the UI, but they are not actually removed from the account. Described here. As long as they exist in an inactive state, there is no issue. What we've seen happen is the cluster being removed completely, such that aws ecs describe-clusters --clusters <cluster-arn> produces an error instead of returning an inactive cluster. During the failed plan I see a sequence of events like:

1. Terraform calls ecs/DescribeClusters with the expected cluster ARN from state.
2. The call returns a 200 status code, but with a message saying the cluster is missing.
3. Terraform logs [WARN] ECS Cluster (arn:aws:ecs:us-east-1::cluster/my-cluster) not found, removing from state
4. Terraform calls ecs/DescribeServices with the expected service and cluster ARN from state.
5. The call returns a 400 status code with a ClusterNotFoundException message.
6. The plan fails.
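To make the first three steps concrete, here is a rough Go sketch of how a 200 response from DescribeClusters can still mean "cluster is gone". It is not the provider's code. It assumes the response body lists an unknown cluster under failures with reason MISSING and reports a deleted-but-retained cluster with status INACTIVE; the type describeClustersOutput and the helper clusterState are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describeClustersOutput mirrors only the fields of a DescribeClusters
// response body that this sketch needs (hypothetical, trimmed-down type).
type describeClustersOutput struct {
	Clusters []struct {
		ClusterArn string `json:"clusterArn"`
		Status     string `json:"status"`
	} `json:"clusters"`
	Failures []struct {
		Arn    string `json:"arn"`
		Reason string `json:"reason"`
	} `json:"failures"`
}

// clusterState is a hypothetical helper. It returns the cluster's status
// (for example ACTIVE or INACTIVE) if the cluster is listed, the failure
// reason (for example MISSING) if it is reported as a failure, and
// "UNKNOWN" if the ARN does not appear in the response at all.
func clusterState(body []byte, arn string) (string, error) {
	var out describeClustersOutput
	if err := json.Unmarshal(body, &out); err != nil {
		return "", err
	}
	for _, c := range out.Clusters {
		if c.ClusterArn == arn {
			return c.Status, nil
		}
	}
	for _, f := range out.Failures {
		if f.Arn == arn {
			return f.Reason, nil
		}
	}
	return "UNKNOWN", nil
}

func main() {
	arn := "arn:aws:ecs:us-east-1::cluster/my-cluster"

	// Assumed shape of the HTTP 200 body once the cluster is fully removed.
	missing := []byte(`{"clusters":[],"failures":[{"arn":"` + arn + `","reason":"MISSING"}]}`)
	// Assumed shape of the body while the cluster is only inactive.
	inactive := []byte(`{"clusters":[{"clusterArn":"` + arn + `","status":"INACTIVE"}],"failures":[]}`)

	for _, body := range [][]byte{missing, inactive} {
		state, err := clusterState(body, arn)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Println(state)
	}
}
```

In this sketch, the cluster read succeeds at the HTTP level in both cases, so the caller has to inspect the body to tell an inactive cluster from a missing one.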

So it seems like Terraform needs to interpret a ClusterNotFoundException as a sign that the service needs to be re-created.

It's difficult to fully replicate the issue because it depends on the cluster being removed from the account. I'm not sure how long that takes. I've had two internal customers come to me with this issue within 2 weeks of an account clean up. I was able to re-create for my simple example by updating the state of the service to point to a cluster that never existed.

References

adam-tylr · 2020-10-29T18:48:14Z

I should also add that our current workaround is to run terraform state rm aws_ecs_service.service after seeing this failure, then run the plan again.

bflad · 2020-11-09T15:34:47Z

The fix for this has been merged and will release in version 3.15.0 of Terraform AWS Provider, later this week. Thanks to @adam-tylr for the implementation. 👍

ghost · 2020-11-12T23:41:10Z

This has been released in version 3.15.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

ghost · 2020-12-09T17:10:45Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

ghost added the service/ecs label (Issues and PRs that pertain to the ecs service) Oct 29, 2020

github-actions bot added the needs-triage label (Waiting for first response or review from a maintainer) Oct 29, 2020

adam-tylr mentioned this issue Oct 30, 2020

[WIP] Check for ClusterNotFoundException when reading ECS service #15927

Merged

anGie44 added the bug label (Addresses a defect in current functionality) and removed the needs-triage label Oct 30, 2020

bflad added this to the v3.15.0 milestone Nov 9, 2020

bflad closed this as completed in #15927 Nov 9, 2020

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
