
ECS Cluster : Cycle when using capacity_providers #12739

Closed
meriouma opened this issue Apr 9, 2020 · 6 comments
Labels
new-resource Introduces a new resource. service/ecs Issues and PRs that pertain to the ecs service.

Comments


meriouma commented Apr 9, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.24
+ provider.archive v1.3.0
+ provider.aws v2.54.0
+ provider.null v2.1.2
+ provider.random v2.2.1
+ provider.template v2.1.2

Affected Resource(s)

  • aws_ecs_cluster

Terraform Configuration Files

locals {
    hyphenized_name = replace(var.cluster_name, "/\\s+/", "-")
    capacity_provider_name = "${local.hyphenized_name}-${random_id.capacity_provider.b64_url}"
}

resource "aws_ecs_cluster" "cluster" {
    name = local.hyphenized_name
    capacity_providers = [local.capacity_provider_name]
    setting {
        name = "containerInsights"
        value = "enabled"
    }
}

data "aws_ami" "ecs-optimized-ami" {
    most_recent = true
    owners = ["amazon"]

    filter {
        name = "name"
        values = ["amzn-ami-*-amazon-ecs-optimized"]
    }
}

resource "aws_launch_configuration" "ecs" {
    image_id = data.aws_ami.ecs-optimized-ami.image_id
    instance_type = "t3a.medium"
    security_groups = var.security_group_ids_for_ec2_instances
    iam_instance_profile = aws_iam_instance_profile.ecs.name
    user_data_base64 = base64encode(templatefile("${path.module}/ecs-user-data.sh", {
        cluster_name = aws_ecs_cluster.cluster.name
    }))

    lifecycle {
        create_before_destroy = true
    }
}

resource "aws_autoscaling_group" "ecs" {
    name = "${local.hyphenized_name}-scaling-group"
    launch_configuration = aws_launch_configuration.ecs.name

    min_size = var.minimum_instances
    max_size = var.maximum_instances
    desired_capacity = var.desired_instance_count

    vpc_zone_identifier = var.subnet_ids
    health_check_type = "ELB"
    health_check_grace_period = 30
}

/*
 aws_ecs_capacity_provider cannot be deleted or updated so we need a unique ID
 in case of destroy or update so we can create a new one
 https://github.com/aws/containers-roadmap/issues/632
*/
resource "random_id" "capacity_provider" {
    byte_length = 16
    keepers = {
        auto_scaling_group_arn = aws_autoscaling_group.ecs.arn
    }
}

resource "aws_ecs_capacity_provider" "capacity_provider" {
    name = local.capacity_provider_name

    auto_scaling_group_provider {
        auto_scaling_group_arn = random_id.capacity_provider.keepers.auto_scaling_group_arn
        managed_scaling {
            status = "ENABLED"
            minimum_scaling_step_size = 1
            maximum_scaling_step_size = 1
            target_capacity = 75
        }
    }
}

Expected Behavior

Apply works.

Actual Behavior

Error: Cycle: module.engine.module.main_ecs_cluster.data.template_file.ecs_instance_policy, module.engine.module.main_ecs_cluster.aws_iam_policy.ecs-instance-policy, module.engine.module.main_ecs_cluster.aws_iam_role_policy_attachment.ecs_role, module.engine.module.main_ecs_cluster.aws_iam_instance_profile.ecs, module.engine.module.main_ecs_cluster.aws_launch_configuration.ecs, module.engine.module.main_ecs_cluster.aws_autoscaling_group.ecs, module.engine.module.main_ecs_cluster.random_id.capacity_provider, module.engine.module.main_ecs_cluster.local.capacity_provider_name, module.engine.module.main_ecs_cluster.aws_ecs_cluster.cluster

Steps to Reproduce

  1. terraform apply

meriouma commented Apr 9, 2020

I think the capacity_providers argument should not live on the aws_ecs_cluster resource. Since it uses the PutClusterCapacityProviders API call, a separate resource would allow referencing the capacity provider without creating a cycle.

It's easy to create a cycle when using capacity providers: aws_ecs_capacity_provider references the aws_autoscaling_group ARN, which references an aws_launch_configuration, which references the aws_ecs_cluster because the cluster name has to go into the user_data. As a result, we cannot attach the capacity provider when creating the ECS cluster.
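A decoupled setup along those lines could be sketched as follows. This is a hedged illustration, not code from this thread: the attribute names follow the standalone aws_ecs_cluster_capacity_providers resource that much later versions of the provider ship, and the resource names mirror the configuration above.

```hcl
# Sketch only: assumes a provider version that includes the standalone
# aws_ecs_cluster_capacity_providers resource (not available in v2.54.0).
resource "aws_ecs_cluster" "cluster" {
  name = local.hyphenized_name

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# The cluster no longer references the capacity provider name, so the chain
# cluster -> launch configuration -> ASG -> capacity provider -> attachment
# stays acyclic.
resource "aws_ecs_cluster_capacity_providers" "cluster" {
  cluster_name       = aws_ecs_cluster.cluster.name
  capacity_providers = [aws_ecs_capacity_provider.capacity_provider.name]
}
```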


toredash commented Apr 16, 2020

It's easy to create a cycle when using capacity providers, since aws_ecs_capacity_provider references the aws_autoscaling_group arn, which references an aws_launch_configuration, which references the aws_ecs_cluster because we need to add the cluster name in the user_data. Therefore, we cannot add the capacity_provider when creating the ECS cluster.

@meriouma I've had the same issue. My workaround was to modify my launch config block as such:

resource "aws_launch_configuration" "ecs" {
    image_id = data.aws_ami.ecs-optimized-ami.image_id
    instance_type = "t3a.medium"
    security_groups = var.security_group_ids_for_ec2_instances
    iam_instance_profile = aws_iam_instance_profile.ecs.name
    user_data_base64 = base64encode(templatefile("${path.module}/ecs-user-data.sh", {
-       cluster_name = aws_ecs_cluster.cluster.name
+       cluster_name = local.hyphenized_name
    }))

    lifecycle {
        create_before_destroy = true
    }
}

This works because an ECS cluster's name is deterministic: it contains no random suffix, so recreating the cluster yields the same name/ARN. Recreating an ASG, by contrast, produces a new ARN.

Hope this can help until the provider is fixed.

@samjgalbraith

This workaround works, thanks for that, although it does violate one principle of good IaC: it creates a pair of resources that depend on one another without that dependency being expressed in the code or understood by Terraform. That makes Terraform more prone to producing badly ordered plans.

For that reason the workaround is acceptable temporarily, but I think this ultimately needs to be fixed rather than just worked around.

@chriskinsman

Also seeing issues with this around apply vs destroy ordering.

If you set it up to apply in this order:
autoscale group
capacity provider
ecs cluster

Destroy is typically the opposite:
ecs cluster
capacity provider
autoscale group

The ecs cluster destroy hangs because the container instances are still running from the auto scale group:

Error: Error deleting ECS cluster: ClusterContainsContainerInstancesException: The Cluster cannot be deleted while Container Instances are active or draining.
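One way to mitigate that destroy hang, not proposed in this thread, is a destroy-time provisioner that drains the cluster's container instances before Terraform attempts the cluster delete. This is a hedged sketch: it assumes the AWS CLI is available on the machine running Terraform, and the resource name drain_on_destroy is hypothetical.

```hcl
# Hedged sketch: drain container instances on destroy so deleting the ECS
# cluster does not fail with ClusterContainsContainerInstancesException.
# Destroy-time provisioners may only reference self, hence the triggers map.
resource "null_resource" "drain_on_destroy" {
  triggers = {
    cluster_name = aws_ecs_cluster.cluster.name
  }

  provisioner "local-exec" {
    when    = destroy
    command = <<-EOT
      instances=$(aws ecs list-container-instances \
        --cluster ${self.triggers.cluster_name} \
        --query 'containerInstanceArns[]' --output text)
      if [ -n "$instances" ]; then
        aws ecs update-container-instances-state \
          --cluster ${self.triggers.cluster_name} \
          --container-instances $instances --status DRAINING
      fi
    EOT
  }
}
```

Draining alone may not be sufficient; the instances may still need to deregister or terminate before the cluster delete succeeds, so treat this as a starting point rather than a complete fix.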

@DrFaust92
Collaborator

Closed via #22672

@github-actions

github-actions bot commented May 5, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 5, 2022