prevent lifecycle destroy breaks a plan output, rendering diagnosis very difficult. #30271

Closed
gtmtech opened this issue Dec 28, 2021 · 7 comments · Fixed by #34312

Comments


gtmtech commented Dec 28, 2021

Terraform 0.13.7 - but I believe it affects all later versions.

I did some state file moves to prepare for a refactor moving resources from one module to another, and now when I plan, I see this error:

Resource
module.aws_account["foo"].aws_organizations_account.this
has lifecycle.prevent_destroy set, but the plan calls for this resource to be
destroyed. To avoid this error and continue with the plan, either disable
lifecycle.prevent_destroy or reduce the scope of the plan using the -target
flag.

This is an error which prevents the plan from being displayed at all, stopping the plan in its tracks. This is unfortunate because if I have made mistakes in my state resource moves, and for some reason they don't match the codebase, I now have no way of working out what the problem is, on account of prevent_destroy.

What I would need in this situation is for Terraform to actually output the plan so I can see if and where there is a mismatch between the actual resources and the renamed resources in the state file. I am unable to remove the lifecycle prevent_destroy stanzas because that would require an apply, which in turn requires a plan, and the plan won't output because of this error.

I am left to my own devices to figure out where and how I have a mismatch. As far as I can tell, my codebase should produce an identically named resource, so this shouldn't be occurring. Since Terraform errors, I know that isn't the case, but there is no way of finding out why other than analysing the codebase in minute detail.

What I would like to see

  1. prevent_destroy causing an error at apply time and not at plan time - OR
  2. Errors at plan time still resulting in a readable plan, albeit one that can't be applied - to better aid diagnosis.

Thanks.

gtmtech added the bug and new (new issue not yet triaged) labels on Dec 28, 2021
Author

gtmtech commented Dec 28, 2021

What I need answered is the question of WHY, for this part of the error:

but the plan calls for this resource to be destroyed

Why is the plan in my case calling for the resource to be destroyed - is it because the resource is named something different? Or because an attribute is forcing a recreation which is forcing a destroy+create, which is blocked by the prevent_destroy?

There is no way of knowing from Terraform. I have to puzzle it out myself by manually comparing the state file with the codebase.
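
One way to make that manual comparison less painful is to diff the resource addresses recorded in state (for example, the output of `terraform state list`) against the addresses the configuration is expected to declare. This is only a minimal sketch: it assumes both lists have already been collected as plain strings, and the function name `diff_addresses` and the sample addresses are hypothetical.

```python
def diff_addresses(state_addresses, expected_addresses):
    """Compare resource addresses in state with those expected from config.

    Returns a tuple (only_in_state, only_in_config) of sorted lists.
    An address found only in state is a candidate for a planned destroy;
    an address found only in config is a candidate for a planned create.
    """
    state = set(state_addresses)
    expected = set(expected_addresses)
    return sorted(state - expected), sorted(expected - state)


if __name__ == "__main__":
    # Hypothetical addresses: one module was renamed in state but not in config.
    state = [
        'module.aws_account["foo"].aws_organizations_account.this',
        'module.accounts["bar"].aws_organizations_account.this',
    ]
    config = [
        'module.aws_account["foo"].aws_organizations_account.this',
        'module.aws_account["bar"].aws_organizations_account.this',
    ]
    only_in_state, only_in_config = diff_addresses(state, config)
    print("in state only:", only_in_state)
    print("in config only:", only_in_config)
```

A name mismatch shows up as a pair of entries, one in each list. If both lists come back empty, the destroy is more likely caused by an attribute that forces replacement, which this comparison cannot detect.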

jbardin added the config, core, and enhancement labels and removed the bug and new (new issue not yet triaged) labels on Jan 3, 2022
Collaborator

crw commented Jan 5, 2022

Thanks for the request! I've added it to the list to triage.

@apparentlymart
Contributor

Hi @gtmtech! Thanks for sharing this feedback.

I think what you've described here is the same problem that #16392 was describing. That's a pretty old issue at this point, but it does describe the same problem you've described here.

It has honestly been a while since we spent time on the prevent_destroy feature. Indeed, it hasn't really changed significantly since we originally added it, and I think we've come to recognize it as a design mistake and in retrospect we regret not spending more time designing it before adding it.

We do agree that there's a real use-case here for declaring that a particular object ought to have some additional "friction" when deleting it, such as if it's a stateful object that would be difficult or impossible to recreate, but it also seems clear that the prevent_destroy doesn't really meet that need, and that we probably need an entirely new feature that is better suited to the use-case. (Lots of folks have shared possible ideas in that direction in the other issue, though most of them seem to have other drawbacks; it doesn't seem like there's an "easy answer".)

We'll continue to use that other issue to represent that need, and post updates there when we have them, though I want to be up front that nobody is currently working on design work there and we'll likely need to do some non-trivial R&D before we'd be able to pick that up again, because it's been a long time.

In the meantime, we've typically been suggesting that folks rely instead on features of their underlying cloud platform to meet similar use-cases. For example, for an object in AWS it's typically possible to write an IAM policy that allows for modification but not for destruction of a particular object, in which case the additional "friction" to delete comes in the form of first changing the policy to permit the deletion. That approach is also superior to prevent_destroy in a number of other ways; for example, because it's being enforced by the remote system rather than by Terraform itself it can make the stronger guarantee of not deleting the object even if the entire resource block were removed from the Terraform configuration, whereas with prevent_destroy that would also simultaneously remove that argument and therefore defeat the mechanism altogether.
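
As an illustration of the IAM approach, the sketch below builds a policy document with an explicit `Deny` statement for delete actions on one resource. An explicit deny in IAM takes precedence over any allow, so deletion stays blocked until this statement is changed. The helper name `deny_delete_policy`, the S3 bucket ARN, and the choice of `s3:DeleteBucket` are assumptions made for the example; they are not taken from this thread, and the right actions depend on the service being protected.

```python
import json


def deny_delete_policy(resource_arn, delete_actions):
    """Build an IAM policy document that explicitly denies the given
    delete actions on a single resource (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDestroy",
                "Effect": "Deny",
                "Action": list(delete_actions),
                "Resource": resource_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Example values only: an S3 bucket protected against deletion.
    policy = deny_delete_policy(
        "arn:aws:s3:::example-bucket",
        ["s3:DeleteBucket"],
    )
    print(json.dumps(policy, indent=2))
```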


With all of that said though, I think you're right that it would be reasonable for Terraform to produce at least a partial plan result even when encountering this error, and so we can use this particular issue to represent that shorter-term idea, just in case implementing that is more straightforward than all of what I described above.

I suspect that to be true, but we'll still need to do some more research to be sure. In particular, Terraform Core is currently built in such a way that any error immediately halts the planning process without actually producing a plan, and so unfortunately to achieve this I think we'll need to rework how exactly this particular error condition works. Perhaps Terraform Core would no longer treat it as an error diagnostic at all and instead somehow mark the specific actions in the plan as invalid (in a machine-readable way), leaving the frontend UI to be responsible for annotating the invalid actions for the human user, and the apply step to be responsible for noticing the invalid actions and refusing to take any of the actions described in the plan.
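
To make that idea concrete, here is a rough sketch of what "mark the action as invalid instead of raising an error" could look like. It is purely illustrative and is not how Terraform Core is implemented: the `PlannedAction` type, its fields, and the `mark_invalid` and `can_apply` functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlannedAction:
    """Hypothetical stand-in for one planned resource change."""
    address: str
    actions: list                 # e.g. ["delete"] or ["delete", "create"]
    prevent_destroy: bool = False
    invalid_reason: str = ""      # empty string means the action is valid


def mark_invalid(plan):
    """Planning step: annotate forbidden actions instead of halting."""
    for item in plan:
        if item.prevent_destroy and "delete" in item.actions:
            item.invalid_reason = "lifecycle.prevent_destroy is set"
    return plan


def can_apply(plan):
    """Apply step: refuse the whole plan if any action is invalid."""
    return all(not item.invalid_reason for item in plan)


if __name__ == "__main__":
    plan = mark_invalid([
        PlannedAction("aws_instance.web", ["update"]),
        PlannedAction("aws_organizations_account.this", ["delete"],
                      prevent_destroy=True),
    ])
    # The UI can still render every action, flagging the invalid ones.
    for item in plan:
        flag = f"  (INVALID: {item.invalid_reason})" if item.invalid_reason else ""
        print(item.address, item.actions, flag)
    print("applicable:", can_apply(plan))
```

The point of the split is that planning always yields a complete, inspectable result, while the refusal moves to the apply step.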

@dorkamotorka

+1

@apparentlymart
Contributor

At some point in the last few minor releases -- unfortunately I don't recall which -- we changed terraform plan and terraform apply to be able to return a partial plan if the planning process produced an error partway through, at least in most cases where Terraform Core is still able to proceed far enough to produce a valid partial plan data structure.

This specific error seems like one that should be supported under that new behavior. I would expect the output in this case, for recent Terraform versions, to start by saying something like "Terraform planned the following actions, but then encountered an error:" and then list out whatever it was able to successfully plan before reaching the resource that had prevent_destroy set, and then it will append the error.

That behavior was one of the outcomes listed under "What I would like to see", so if we can reproduce that it is indeed behaving that way in modern Terraform then perhaps this issue is complete. If not, then hopefully we can figure out what's preventing Terraform from using that behavior for this particular error message and fix it so that it will behave as I described; the necessary mechanics should already be in place so I would expect this to only change the fine details.
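
Where a plan (partial or complete) is available in Terraform's JSON plan representation, the "why is this being destroyed?" question can be narrowed down by listing every resource change whose actions include `delete`. The sketch below relies only on the `resource_changes`, `address`, and `change.actions` fields of that format. Whether a plan that ended with this particular error can be saved and rendered with `terraform show -json` is an assumption that should be checked against the Terraform version in use; the function name `destructive_changes` is hypothetical.

```python
import json


def destructive_changes(plan):
    """Return (address, kind) for each resource change that deletes an object.

    kind is "replace" when the actions also include "create" (destroy and
    recreate), otherwise "destroy".
    """
    results = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions:
            kind = "replace" if "create" in actions else "destroy"
            results.append((rc.get("address"), kind))
    return results


if __name__ == "__main__":
    # Inline sample standing in for a file produced by `terraform show -json`.
    sample = json.loads("""
    {
      "resource_changes": [
        {"address": "aws_s3_bucket.logs",
         "change": {"actions": ["update"]}},
        {"address": "module.old.aws_organizations_account.this",
         "change": {"actions": ["delete"]}},
        {"address": "aws_instance.web",
         "change": {"actions": ["delete", "create"]}}
      ]
    }
    """)
    for address, kind in destructive_changes(sample):
        print(f"{kind}: {address}")
```

A plain "destroy" usually points at an address that exists in state but not in configuration, while a "replace" points at an attribute change that forces recreation.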

@ravick4u

I am also having issues with a lifecycle event, as described in #34305.

Contributor

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Dec 30, 2023