
Apply fails when aws_autoscaling_group is in a module and its aws_launch_configuration changes #11557

Closed
jordiclariana opened this issue Jan 31, 2017 · 15 comments

Comments

@jordiclariana

Terraform Version

Terraform v0.8.5

Affected Resource(s)

  • aws_autoscaling_group
  • aws_launch_configuration

Terraform Configuration Files

The main.tf file:

resource "aws_launch_configuration" "aws_lc" {
  name_prefix = "test"
  image_id = "ami-fe408091"
  instance_type = "t2.small"

  security_groups = ["sg-73d80d1b"]
  key_name = "ubuntu"

  lifecycle {
    create_before_destroy = true
  }
}

module "asg_module" {
  source = "aws_autoscaling_group_module"
  name = "test_asg"
  vpc_subnets = ["subnet-6760e60e"]
  availability_zones = ["eu-central-1a"]
  launch_configuration_name = "${aws_launch_configuration.aws_lc.name}"
}

The aws_autoscaling_group_module module file:

variable "name" {}
variable "vpc_subnets" { type = "list" }
variable "availability_zones" { type = "list" }
variable "launch_configuration_name" {}

resource "aws_autoscaling_group" "asg" {
  name = "${var.name}"
  launch_configuration = "${var.launch_configuration_name}"
  vpc_zone_identifier = ["${var.vpc_subnets}"]
  availability_zones = ["${var.availability_zones}"]
  max_size = "0"
  min_size = "0"
  health_check_type = "EC2"

}

Debug Output

https://gist.github.com/jordiclariana/151f1d04c32b60c856fab970ab560bd7

Expected Behavior

The apply is expected to succeed the first time, and also after the aws_launch_configuration is changed and terraform apply is run again.

Actual Behavior

It works the first time, but when run again after changing the aws_launch_configuration it fails with this message:

Error applying plan:

1 error(s) occurred:

* aws_launch_configuration.aws_lc (deposed #0): ResourceInUse: Cannot delete launch configuration test00e459ef7a52f055982e412c68 because it is attached to AutoScalingGroup test_asg
	status code: 400, request id: 4e2e469b-e7d3-11e6-92ef-f139a587520d

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

Steps to Reproduce

  1. terraform apply
  2. Modify something in aws_launch_configuration.aws_lc (for instance, change instance_type from t2.small to t2.medium, as sketched below)
  3. Run terraform apply again. We then get the error.
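
For illustration, the modified launch configuration in step 2 looks like this (the same resource as above, with only instance_type changed). Because launch configurations are immutable, this forces Terraform to create a replacement and depose the old one:

resource "aws_launch_configuration" "aws_lc" {
  name_prefix   = "test"
  image_id      = "ami-fe408091"
  instance_type = "t2.medium" # changed from t2.small; forces a new launch configuration

  security_groups = ["sg-73d80d1b"]
  key_name        = "ubuntu"

  lifecycle {
    create_before_destroy = true
  }
}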

References

This is related to #1109; the proposed solution there (adding the lifecycle parameter) normally works, but it fails when the aws_autoscaling_group is in a separate module.

@StefanSmith

StefanSmith commented Feb 9, 2017

I am experiencing this as well (v0.8.5).

In my case, as part of a blue/green deployment mechanism, I am issuing terraform apply after intentionally orphaning the launch configuration and autoscaling group by changing their resource IDs in the terraform configuration (e.g. aws_autoscaling_group.blue becomes aws_autoscaling_group.green). Terraform should therefore destroy both resources, but when the ASG and LC are in a child module it fails with the ResourceInUse error mentioned above.
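
For concreteness, a minimal sketch of that rename against the child-module configuration shown below (only the resource labels and the interpolation change; Terraform then treats the old blue resources as orphans to destroy):

# Before (since removed from the configuration):
#   resource aws_autoscaling_group blue { ... }
#   resource aws_launch_configuration blue { ... }

# After: the same resources under new labels, so "blue" is now orphaned
resource aws_autoscaling_group green {
  max_size             = 1
  min_size             = 1
  launch_configuration = "${aws_launch_configuration.green.name}"
  vpc_zone_identifier  = ["${var.subnet_ids}"]
}

resource aws_launch_configuration green {
  image_id      = "ami-e83ed4fe"
  instance_type = "t2.nano"
}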

To provide more detail, when both resources are in the root module, running grep -C1 Destroying terraform.log over the terraform apply trace log produces:

2017/02/09 17:59:30 [DEBUG] apply: aws_autoscaling_group.blue: executing Apply
aws_autoscaling_group.blue: Destroying...
2017/02/09 17:59:30 [TRACE] [walkApply] Entering eval tree: aws_launch_configuration.test5
--
2017/02/09 18:00:59 [DEBUG] root: eval: *terraform.EvalApply
aws_launch_configuration.blue: Destroying...
2017/02/09 18:00:59 [DEBUG] apply: aws_launch_configuration.blue: executing Apply

Notice the roughly 89-second delay (17:59:30 to 18:00:59) between commencing ASG destruction and commencing LC destruction, implying the correct dependency ordering.

In contrast, when both resources are in a child module, Terraform attempts to destroy them at the same time:

module.test.aws_launch_configuration.blue: Destroying...
2017/02/09 17:43:07 [DEBUG] vertex "module.test.aws_autoscaling_group.blue (destroy)", got dep: "module.test.provider.aws"
--
2017/02/09 17:43:07 [DEBUG] plugin: terraform: -----------------------------------------------------
module.test.aws_autoscaling_group.blue: Destroying...
2017/02/09 17:43:07 [DEBUG] root.test: eval: *terraform.EvalDiff

Notice how destruction of both resources is commenced at the same time.

Note, the bug does not occur when I issue terraform destroy, only when terraform apply is destroying an orphaned ASG and LC in a child module. Further, nothing in the terraform graph output indicates a material change in the DAG between the two scenarios.
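
For reference, the graphs were dumped and compared with something like the following (assuming Graphviz's dot is available; file names are arbitrary):

terraform graph > root.dot    # from Configuration 1 below
terraform graph > child.dot   # from Configuration 2 below
diff root.dot child.dot       # no material difference visible between the two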

Terraform Configuration 1: ASG and LC in root module

./service.tf

variable aws_region {
  type = "string"
}

variable subnet_ids {
  type = "list"
}

provider aws {
  region = "${var.aws_region}"
}

resource aws_autoscaling_group blue {
  max_size = 1
  min_size = 1
  launch_configuration = "${aws_launch_configuration.blue.name}"
  vpc_zone_identifier = [
    "${var.subnet_ids}"
  ]
}

resource aws_launch_configuration blue {
  image_id = "ami-e83ed4fe"
  instance_type = "t2.nano"
}

Terraform Configuration 2: ASG and LC in child modules

./service.tf:

variable aws_region {
  type = "string"
}

variable subnet_ids {
  type = "list"
}

provider aws {
  region = "${var.aws_region}"
}

module test {
  source = "./module"
  subnet_ids = "${var.subnet_ids}"
}

./module/config.tf

variable subnet_ids {
  type = "list"
}

resource aws_autoscaling_group blue {
  max_size = 1
  min_size = 1
  launch_configuration = "${aws_launch_configuration.blue.name}"
  vpc_zone_identifier = [
    "${var.subnet_ids}"
  ]
}

resource aws_launch_configuration blue {
  image_id = "ami-e83ed4fe"
  instance_type = "t2.nano"
}

Note that the usual create_before_destroy statement is not required in my case, since the ASG and LC are always created and destroyed together.

@StefanSmith

As of today, I have noticed that the ResourceInUse error still occurs intermittently even when the ASG and LC are in the root module. Adding an explicit depends_on = [ "aws_launch_configuration.blue" ] to the aws_autoscaling_group resource seems to force Terraform to defer destroying the LC until the ASG is destroyed. This shouldn't be necessary, since there is already a launch_configuration = "${aws_launch_configuration.blue.name}" directive. Note that adding depends_on = [ "aws_launch_configuration.blue" ] does not change the behaviour when the resources are in a child module.
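
A sketch of that workaround applied to the root-module configuration from my earlier comment:

resource aws_autoscaling_group blue {
  max_size             = 1
  min_size             = 1
  launch_configuration = "${aws_launch_configuration.blue.name}"
  vpc_zone_identifier  = [
    "${var.subnet_ids}"
  ]

  # Should be implied by the launch_configuration reference above, but in
  # practice seems needed to defer destroying the LC until the ASG is gone.
  depends_on = ["aws_launch_configuration.blue"]
}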

@mhlias

mhlias commented Feb 10, 2017

I have similar issues with both v0.8.4 and v0.8.6.

Both my launch configuration and autoscaling group resources are inside modules.

I have lifecycle create_before_destroy set on both, and I pass the launch configuration id as an output of its module into an input of the module that contains the autoscaling group. Initial creation works fine, but if anything changes that forces recreation of the launch configuration I get the following:
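
The wiring between the two modules is roughly as follows (a minimal sketch; the actual module, variable, and resource names differ):

# Launch-configuration module: expose the LC as an output
output "lc_id" {
  value = "${aws_launch_configuration.lc.id}"
}

# Root module: pass that output into the ASG module (module names are placeholders)
module "bastion_lc" {
  source = "./modules/lc"
}

module "bastion_asg" {
  source               = "./modules/asg"
  launch_configuration = "${module.bastion_lc.lc_id}"
}

# ASG module: consume the variable (sizes and zones are placeholders)
variable "launch_configuration" {}

resource "aws_autoscaling_group" "asg" {
  name                 = "bastion-asg"
  launch_configuration = "${var.launch_configuration}"
  min_size             = 1
  max_size             = 1
  availability_zones   = ["eu-west-1a"]

  lifecycle {
    create_before_destroy = true
  }
}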


1 error(s) occurred:

* aws_launch_configuration.lc (deposed #0): ResourceInUse: Cannot delete launch configuration bastion-lc-00fe8619e1004544b91f5db782 because it is attached to AutoScalingGroup bastion-asg
	status code: 400, request id: cc756603-ef8a-11e6-91dd-6f06b479e5ba

That wasn't an issue with 0.7.x.

@strongoose

I have similar issues in 0.8.6 using aws_cloudformation_stack to create the autoscaling group (in order to make use of the RollingUpdate functionality available through CloudFormation). Interestingly, I also experience the same issue when the launch configuration name is referenced in a template instead of inline within the aws_cloudformation_stack resource.

Here are detailed configs and logs for three cases:

  1. launch_configuration.name is referenced inline inside aws_cloudformation_stack (this case works fine and is just for comparison)
  2. launch_configuration.name is passed as a variable to a template. The rendered template is then used inside aws_cloudformation_stack
  3. launch_configuration.name is passed as a variable to a module, and then the variable is referenced inline inside aws_cloudformation_stack

In each case I attach the debug log output for the initial terraform apply and the debug log output for the modification of the launch config.

Inline launch config name (works as expected)

Terraform config:

# Centos7 default
variable "image_id" {
  default = "ami-bb373ddf"
}

provider "aws" {
  region = "eu-west-2"
}

resource "aws_launch_configuration" "launch_configuration" {
  image_id      = "${var.image_id}"
  instance_type = "t2.micro"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_cloudformation_stack" "all_zones_asg" {
  name          = "my-asg"
  template_body = <<EOF
{
  "Resources": {
    "MyAsg": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "AvailabilityZones": {"Fn::GetAZs": "eu-west-2"},
        "LaunchConfigurationName": "${aws_launch_configuration.launch_configuration.name}",
        "MaxSize": 2,
        "MinSize": 2
      },
      "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
          "MinInstancesInService": "1",
          "MaxBatchSize": "1",
          "PauseTime": "PT0S"
        }
      }
    }
  }
}
EOF
}

terraform apply (succeeds): logs | raw

TF_VAR_image_id="ami-ede2e889" terraform apply (succeeds): logs | raw

Launch config name as template variable (fails on recreation of launch_configuration)

Terraform config:

# Centos7 default
variable "image_id" {
  default = "ami-bb373ddf"
}

provider "aws" {
  region = "eu-west-2"
}

resource "aws_launch_configuration" "launch_configuration" {
  image_id      = "${var.image_id}"
  instance_type = "t2.micro"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_cloudformation_stack" "all_zones_asg" {
  name          = "my-asg"
  template_body = "${data.template_file.cloudformation_auto_scaling_group.rendered}"
}

data "template_file" "cloudformation_auto_scaling_group" {
  template = "${file("./cf.tpl")}"

  vars {
    launch_configuration = "${aws_launch_configuration.launch_configuration.name}"
    max_size             = 2
    min_size             = 2
  }
}

Template:

{
  "Resources": {
    "MyAsg": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "AvailabilityZones": {"Fn::GetAZs": "eu-west-2"},
        "LaunchConfigurationName": "${launch_configuration}",
        "MaxSize": ${max_size},
        "MinSize": ${min_size}
      },
      "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
          "MinInstancesInService": "1",
          "MaxBatchSize": "1",
          "PauseTime": "PT0S"
        }
      }
    }
  }
}

terraform apply (succeeds): logs | raw

TF_VAR_image_id="ami-ede2e889" terraform apply (fails): logs | raw

Launch config name as module variable (fails on recreation of launch_configuration)

Terraform config:

# Centos7 default
variable "image_id" {
  default = "ami-bb373ddf"
}

provider "aws" {
  region = "eu-west-2"
}

resource "aws_launch_configuration" "launch_configuration" {
  image_id      = "${var.image_id}"
  instance_type = "t2.micro"

  lifecycle {
    create_before_destroy = true
  }
}

module "asg" {
  source                   = "./modules/asg"
  launch_config_name       = "${aws_launch_configuration.launch_configuration.name}"
  max_size                 = 2
  min_size                 = 2
  min_instances_in_service = 1
}

Module:

variable "launch_config_name" {}

variable "max_size" {}

variable "min_size" {}

variable "min_instances_in_service" {}

variable "max_batch_size" {
  default = 1
}

variable "region" {
  default = "eu-west-2"
}

resource "aws_cloudformation_stack" "multi_zone_rolling_upgrade_asg" {
  name = "my-asg"

  template_body = <<EOF
{
  "Resources": {
    "MyAsg": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "AvailabilityZones": {"Fn::GetAZs": "${var.region}"},
        "LaunchConfigurationName": "${var.launch_config_name}",
        "MaxSize": ${var.max_size},
        "MinSize": ${var.min_size}
      },
      "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
          "MinInstancesInService": ${var.min_instances_in_service},
          "MaxBatchSize": ${var.max_batch_size}
        }
      }
    }
  }
}
EOF
}

terraform apply (succeeds): logs | raw

TF_VAR_image_id="ami-ede2e889" terraform apply (fails): logs | raw

Notes

  • Though the API call to destroy the old LaunchConfiguration fails, manually inspecting it through the AWS management console afterwards shows that it has been successfully detached from the AutoScalingGroup, and the new config is attached in its place (a CLI equivalent of this check is sketched below).
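
The same check can be made from the CLI, for example (substituting the actual AutoScalingGroup name from the error message):

aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names <asg-name-from-error> \
  --query 'AutoScalingGroups[0].LaunchConfigurationName'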

@theramis

After a bunch of investigation I've figured out where the bug lies, and a workaround for now. With the 0.8.x releases the new core graph system is used to determine dependencies (mentioned at the bottom of this page: https://www.terraform.io/upgrade-guides/0-8.html).

By using -Xlegacy-graph when doing an apply, everything seems to work correctly, so by the looks of it there is a bug in the new graph system.

@jsmickey

@theramis We used your workaround terraform apply -Xlegacy-graph for a launch configuration change and can confirm it works

@jordiclariana
Author

Can you confirm that it also works when using -target?

Something like:
terraform apply -target=aws_cloudformation_stack.all_zones_asg -Xlegacy-graph
or
terraform apply -target=module.asg_module -Xlegacy-graph

@eedwardsdisco

Confirmed hitting this on 0.8.6. Same situation as others, using launch configuration and ASGs inside modules.

@strongoose

Just to confirm that -Xlegacy-graph fixes this for me too. Looks like a core bug then.

@eedwardsdisco

this is nasty...

-Xlegacy-graph didn't fix it for me!

Maybe due to using the plan/apply outfile behavior?

e.g. I can use -Xlegacy-graph on the plan, but it's not supported on the apply when using a plan file.

So I still reproduce the issue when doing a plan/apply that way.

@theramis

@eedwardsdisco I'm using -Xlegacy-graph with a plan file too.

This is how my command is structured: terraform apply -input=false -Xlegacy-graph plan_file

I think, from memory, that putting -Xlegacy-graph after the plan_file fails, but that's because all optional arguments should come before the positional argument.
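
Written out in full, the working order is something like this (the plan file name is arbitrary):

terraform plan -Xlegacy-graph -out=plan_file
terraform apply -input=false -Xlegacy-graph plan_file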

@eedwardsdisco

@theramis Aha!

I just confirmed I was using terraform apply plan_file -Xlegacy-graph.

I will try your suggestion.

I would certainly prefer that this were limited to the new graph system; good repros are always easier to fix ;)

@eedwardsdisco

I've confirmed it's still broken in 0.9.0

@tyjonesAncestry

Still broken in 0.9.5. Causing mayhem with our deploys today as we roll out patched Windows AMIs to 200+ ASGs. Also appears to be a duplicate of #13517

@ghost

ghost commented Apr 10, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 10, 2020