core: Fix interp error msgs on module vars during destroy #6557
Conversation
Wow this one was tricky! This bug presents itself only when using planfiles, because when doing a straight `terraform apply` the interpolations are left in place from the Plan graph walk and paper over the issue. (This detail is what made it so hard to reproduce initially.)

Basically, graph nodes for module variables are visited during the apply walk and attempt to interpolate. During a destroy walk, no attributes are interpolated from resource nodes, so these interpolations fail.

This scenario is supposed to be handled by the `PruneNoopTransformer` - in fact it's described as the example use case in the comment above it! So the bug had to do with the actual behavior of the Noop transformer. The resource nodes were not properly reporting themselves as Noops during a destroy, so they were being left in the graph. This in turn triggered the module variable nodes to see that they had another node depending on them, so they also reported that they could not be pruned.

Therefore we had two nodes in the graph that were effectively noops but were being visited anyways. The module variable nodes were already graph leaves, which is why this error presented itself as just stray messages instead of an actual failure to destroy.

Fixes #5440
Fixes #5708
Fixes #4988
Fixes #3268
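For reference, a minimal configuration shape that can trigger these stray messages looks roughly like the sketch below. This is a hypothetical reproduction assembled from the description above (the resource type, module name, and variable name are all illustrative), not the exact configuration from the linked issues.

```hcl
# main.tf -- root module (illustrative names throughout)
resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}

module "child" {
  source = "./child"

  # During a destroy walk the resource's attributes are never interpolated,
  # so visiting this module variable node is what produced the stray errors.
  instance_id = "${aws_instance.web.id}"
}

# child/variables.tf
variable "instance_id" {}
```

As noted above, a plain `terraform apply` papers over this, so the messages reportedly only show up when the destroy goes through a planfile, e.g. `terraform plan -destroy -out=destroy.tfplan` followed by `terraform apply destroy.tfplan`.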
LGTM - great catch - a typically stupid amount of effort for a three-line core diff once again!
Still seeing this issue on v0.6.16. The output of my test is below, let me know if more info is required.

`Terraform v0.6.16 Setting up remote state... 6 error(s) occurred:`

`Terraform does not automatically rollback in the face of errors.`
@phinze ^
Hi @richardbowden - yeah it does look like there is another issue here. Digging in now!
The fix that landed in #6557 was unfortunately the wrong subset of the work I had been doing locally, and users of the attached bugs are still reporting problems with Terraform v0.6.16. At the very last step, I attempted to scope down both the failing test and the implementation to their bare essentials, but ended up with a test that did not exercise the root of the problem and a subset of the implementation that was insufficient for a full bugfix. The key thing I removed from the test was a _referencing output_ for the module, which is what breaks down the #6557 solution. I've re-tested the examples in #5440 and #3268 to verify this solution does indeed solve the problem.
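To make the missing piece concrete, a hedged sketch of the kind of referencing output being described is shown below (names are hypothetical and build on the earlier sketch; this is not the actual test configuration). The child module exposes an output derived from its variable, and the root module references it:

```hcl
# child/outputs.tf -- the child module exposes a value derived from its variable
output "instance_id" {
  value = "${var.instance_id}"
}

# main.tf -- a root-level output that references the module; this is the
# piece the scoped-down #6557 test was missing
output "child_instance_id" {
  value = "${module.child.instance_id}"
}
```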
For `terraform destroy`, we currently build up the same graph we do for `plan` and `apply` and we do a walk with a special Diff that says "destroy everything". We have fought the interpolation subsystem time and again through this code path.

Beginning in #2775 we gained a new feature to selectively prune out problematic graph nodes. The past chain of destroy fixes I have been involved with (#6557, #6599, #6753) have attempted to massage the "noop" definitions to properly handle the edge cases reported. "Variable is depended on by provider config" is another such edge case, which this change handles (see the sketch below).

This dive only makes me more convinced that the whole `terraform destroy` code path needs to be reworked. For now, I went with a "surgical strike" approach to the problem expressed in #7047. I found a couple of issues with the existing Noop and DestroyEdgeInclude logic, especially with regard to flattening, but I'm explicitly ignoring these for now so we can get this particular bug fixed ahead of the 0.7 release. My hope is that we can circle around with a fully specced initiative to refactor `terraform destroy`'s graph to be more state-derived than config-derived.

Until then, this fixes #7407
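For illustration, the "variable is depended on by provider config" shape looks roughly like this (a hedged sketch with hypothetical names, not the configuration from the referenced issue): a provider block inside a module interpolates one of the module's variables, so during the destroy walk the variable node has a dependent and cannot prune itself.

```hcl
# child/main.tf (illustrative): the provider configuration depends on a
# module variable, giving the variable node a dependent during destroy.
variable "region" {}

provider "aws" {
  region = "${var.region}"
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}

# main.tf -- the root module wires a value into that variable
module "child" {
  source = "./child"
  region = "us-east-1"
}
```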