core: Fix interp error msgs on module vars during destroy #6557
Conversation
Wow this one was tricky! This bug presents itself only when using planfiles, because when doing a straight `terraform apply` the interpolations are left in place from the Plan graph walk and paper over the issue. (This detail is what made it so hard to reproduce initially.)

Basically, graph nodes for module variables are visited during the apply walk and attempt to interpolate. During a destroy walk, no attributes are interpolated from resource nodes, so these interpolations fail.

This scenario is supposed to be handled by the `PruneNoopTransformer` - in fact it's described as the example use case in the comment above it! So the bug had to do with the actual behavior of the Noop transformer. The resource nodes were not properly reporting themselves as Noops during a destroy, so they were being left in the graph. This in turn triggered the module variable nodes to see that they had another node depending on them, so they also reported that they could not be pruned.

Therefore we had two nodes in the graph that were effectively noops but were being visited anyways. The module variable nodes were already graph leaves, which is why this error presented itself as just stray messages instead of an actual failure to destroy.

Fixes #5440
Fixes #5708
Fixes #4988
Fixes #3268
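For reference, a minimal configuration shape that can trigger these stray messages looks roughly like the sketch below. This is a hypothetical reproduction assembled from the description above (the resource type, module name, and variable name are all illustrative), not the exact configuration from the linked issues.

```hcl
# main.tf -- root module (illustrative names throughout)
resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}

module "child" {
  source = "./child"

  # During a destroy walk the resource's attributes are never interpolated,
  # so visiting this module variable node is what produced the stray errors.
  instance_id = "${aws_instance.web.id}"
}

# child/variables.tf
variable "instance_id" {}
```

As noted above, a plain `terraform apply` papers over this, so the messages reportedly only show up when the destroy goes through a planfile, e.g. `terraform plan -destroy -out=destroy.tfplan` followed by `terraform apply destroy.tfplan`.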
LGTM - great catch - a typically stupid amount of effort for a three-line core diff once again!
Still seeing this issue on v0.6.16. The output of my test is below, let me know if more info is required.

`Terraform v0.6.16 Setting up remote state... 6 error(s) occurred:`

`Terraform does not automatically rollback in the face of errors.`
@phinze ^
Hi @richardbowden - yeah it does look like there is another issue here. Digging in now!
The fix that landed in #6557 was unfortunately the wrong subset of the work I had been doing locally, and users of the attached bugs are still reporting problems with Terraform v0.6.16. At the very last step, I attempted to scope down both the failing test and the implementation to their bare essentials, but ended up with a test that did not exercise the root of the problem and a subset of the implementation that was insufficient for a full bugfix. The key thing I removed from the test was a _referencing output_ for the module, which is what breaks down the #6557 solution. I've re-tested the examples in #5440 and #3268 to verify this solution does indeed solve the problem.
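To make the missing piece concrete, a hedged sketch of the kind of referencing output being described is shown below (names are hypothetical and build on the earlier sketch; this is not the actual test configuration). The child module exposes an output derived from its variable, and the root module references it:

```hcl
# child/outputs.tf -- the child module exposes a value derived from its variable
output "instance_id" {
  value = "${var.instance_id}"
}

# main.tf -- a root-level output that references the module; this is the
# piece the scoped-down #6557 test was missing
output "child_instance_id" {
  value = "${module.child.instance_id}"
}
```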
For `terraform destroy`, we currently build up the same graph we do for `plan` and `apply` and we do a walk with a special Diff that says "destroy everything". We have fought the interpolation subsystem time and again through this code path.

Beginning in #2775 we gained a new feature to selectively prune out problematic graph nodes. The past chain of destroy fixes I have been involved with (#6557, #6599, #6753) have attempted to massage the "noop" definitions to properly handle the edge cases reported. "Variable is depended on by provider config" is another such edge case, which this change handles (see the sketch below).

This dive only makes me more convinced that the whole `terraform destroy` code path needs to be reworked. For now, I went with a "surgical strike" approach to the problem expressed in #7047. I found a couple of issues with the existing Noop and DestroyEdgeInclude logic, especially with regard to flattening, but I'm explicitly ignoring these for now so we can get this particular bug fixed ahead of the 0.7 release. My hope is that we can circle around with a fully specced initiative to refactor `terraform destroy`'s graph to be more state-derived than config-derived.

Until then, this fixes #7407
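For illustration, the "variable is depended on by provider config" shape looks roughly like this (a hedged sketch with hypothetical names, not the configuration from the referenced issue): a provider block inside a module interpolates one of the module's variables, so during the destroy walk the variable node has a dependent and cannot prune itself.

```hcl
# child/main.tf (illustrative): the provider configuration depends on a
# module variable, giving the variable node a dependent during destroy.
variable "region" {}

provider "aws" {
  region = "${var.region}"
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}

# main.tf -- the root module wires a value into that variable
module "child" {
  source = "./child"
  region = "us-east-1"
}
```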