Do not delete a resource but create a new resource when change is detected #15485

Open · Puneeth-n opened this issue Jul 6, 2017 · 66 comments

@Puneeth-n
Contributor

Can Terraform be configured to create a new resource but not delete the existing resource when it sees a change? For example, with AWS Step Functions, one can only create or delete a state machine, not modify it.

I want Terraform to create a new state machine each time it sees a change, but not delete the old one, as it might contain states.

@Puneeth-n changed the title from "Terraform do not delete but create a new resource" to "Do not delete a resource but create a new resource when change is detected" on Jul 6, 2017
@apparentlymart
Contributor

Interesting idea, @Puneeth-n! Thanks for suggesting it.

I think a question we'd need to figure out here is what happens to the old instance. Should Terraform just "forget" it (no longer reference it from state) and leave some other process to clean it up? Or maybe it should become "deposed" but not deleted. In that case, a subsequent run would delete it, so that's probably not what you want here.

@Puneeth-n
Contributor Author

Thanks @apparentlymart. I have been giving this some thought for the past few days, as we plan to use AWS Step Functions in the near future.

My thoughts on this:

resource "aws_sfn_state_machine" "sfn_state_machine" {
  name     = "my-state-machine"
  role_arn = "${aws_iam_role.test_role.arn}"

  definition = <<EOF
{
  "Comment": "A Hello World example of the Amazon States Language using an AWS Lambda Function",
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:337909755902:function:test_lambda",
      "Next": "wait_using_seconds"
    },
    "wait_using_seconds": {
      "Type": "Wait",
      "Seconds": 10,
      "End": true
    }
  }
}
EOF

  destroy_action {
    create_new   = true
    decommission = true
  }
}

and a new CLI option, terraform --cleanup -force -target=resource, to clean up decommissioned resources.

@apparentlymart
Contributor

Okay... so this implies an entirely new instance state "decommissioned", in addition to "tainted" and "deposed", which behaves a bit like deposed but only gets deleted when specifically requested.

Ideally I'd rather avoid the complexity of introducing a new instance state, so I'd like to let this soak for a little while and see if we can find a way to build something similar out of our existing concepts, or to add a new concept that addresses a broader problem in this same area of gradually decommissioning things.

For example: a common problem is gracefully migrating between clusters that have actions to be taken when nodes leave, like Consul and Nomad clusters. Currently people do this in several manual steps of ramping up a new cluster, gradually phasing out the old cluster, and then destroying the old cluster. This seems like a similar problem to yours, and in both cases it seems like there's some action that is to be taken between the creation of the new thing and the destruction of the old thing. This idea has come up a number of times with different details.

@cobusbernard

@apparentlymart: This would be really useful for dealing with, e.g., AWS Launch Configuration changes. Each time you change them, you need to first create a new one, point your ASG to it, then destroy the old one via

lifecycle {
  create_before_destroy = true
}

It would be great to be able to somehow indicate to always create a new one and not delete the old one. That way you preserve the history of what the values were and can switch back easily.
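
For reference, a minimal sketch of that pattern (resource names and the AMI variable are illustrative, not from this thread): name_prefix plus create_before_destroy lets the replacement launch configuration exist alongside the old one while the ASG is repointed.

resource "aws_launch_configuration" "app" {
  name_prefix   = "app-"
  image_id      = var.ami_id      # assumed variable
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "app" {
  # Referencing the launch configuration name forces the ASG to be replaced
  # (new first, old second) whenever the launch configuration changes.
  name                 = "app-${aws_launch_configuration.app.name}"
  launch_configuration = aws_launch_configuration.app.name
  availability_zones   = ["us-east-1a"]   # placeholder
  min_size             = 1
  max_size             = 3

  lifecycle {
    create_before_destroy = true
  }
}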

@maulik887

Hi, any chance of getting this in an upcoming release?

@Puneeth-n
Contributor Author

@maulik887 what is your use case? When I was working with Step Functions a year ago I had this requirement, since there was no Update API and we didn't want Terraform to delete our step functions.

@maulik887

My case is, I'm creating an API Gateway API and using it as a Lambda proxy. Now I want to create an API stage per Lambda version, and I don't want to delete the old stage version.
E.g. on a fresh start I will create the API, the Lambda, and a stage called v1_0; when a new Lambda version comes, I want to create a new API stage v1_1 but don't want to delete the older version.

@ChappIO

ChappIO commented Apr 17, 2018

I would like to check on the state of this. My use case is that I have an application which performs long-running processes (hours, sometimes days), but I would like to roll out updates seamlessly.
During an update I would place a new instance next to the current one (which I can do with create_before_destroy) and then leave the current instance running until all processes are finished (which I cannot do).

I have two suggestions: either a way to query the application for its status (an HTTP request against an endpoint I create for Terraform), or a way to schedule a resource to be deleted after X time (in my case I would probably set it to a week). That way an update would also delete all older (finished) instances.

@sjmh

sjmh commented Jun 21, 2018

I'd also find this useful. I have a use case where I'm deploying an S3 object with the version tag in the object name, e.g. 'myjs-<version>.js'. When we change the version, I want a new S3 object deployed, but I don't want the old version removed.

@jsmartt

jsmartt commented Jun 25, 2018

I have a similar need to create a new version of a resource without destroying the old one, but in my use case, I don't really care about cleaning up old versions, so I'd be OK with Terraform just forgetting about the old resource. The way I'd see it working would be to have an additional attribute modifier similar to ForceNew, but just with a different workflow. It could be called ForceNewNoDelete for example, where it basically just skips the delete and verify via read steps.

I'm not sure how this would work with dependent resources though, which I wouldn't necessarily want destroyed, even though I need to grab the ID of the new resource that got created.

@philnielsen

Similar use case to @sjmh: I want to keep my old Lambda code versions around for a bit in S3, since they can now be attached to older Lambda versions with an alias and (soon) I should be able to route traffic to those old versions, but there is no way to update and add a new code version and alias without deleting the old code version with aws_s3_bucket_object.

@apparentlymart
Contributor

The Terraform Core team is not currently doing any work in the area of this issue due to being focused elsewhere. In the meantime, I think some of the use-cases described here (thanks!) could be handled more locally by features within the providers themselves, such as flags to skip deletion on a per-resource-type basis. So I'd encourage you all to open an issue within the relevant provider (unless I've missed someone, it looks like you're all talking about AWS provider stuff) to discuss the more specific use-case and see if a more short-term solution is possible.

There is already some precedent for resource-type-specific flags to disable destroying an object, such as nomad_job's deregister_on_destroy argument. Terraform Core still thinks it's destroying the job, but the provider ignores that request if the flag is set and leaves the job present in Nomad, no longer being tracked by Terraform at all.

Having some specific examples of solutions for this in individual providers is often a good way to figure out what a general solution might look like, or even to see if a general solution is warranted, so if you do open such a ticket please mention hashicorp/terraform#15485 in it so we can find them all again later when doing design/prototype work for this issue.
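
A rough sketch of what that looks like with the Nomad provider (the jobspec path is illustrative):

resource "nomad_job" "app" {
  jobspec = file("${path.module}/app.nomad")

  # Terraform Core still "destroys" this resource, but with this flag set to
  # false the provider skips deregistering the job, leaving it running in
  # Nomad and no longer tracked by Terraform.
  deregister_on_destroy = false
}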

@mohitm108

I am also trying to get a similar feature. Whenever there is a feature change, it fires up a Jenkins job and new tasks/services are created in an AWS ECS cluster. I want to keep the old tasks/services as well, so that if anything goes wrong I can roll my load balancer back to the old tasks/services.

@Tensho
Contributor

Tensho commented Jan 24, 2019

I build an EBS volume with Packer, then take an EBS volume snapshot with Terraform. For potential rollback purposes, I don't want Terraform to replace (delete) the old EBS volume snapshot. ATM there is a hack with Terraform state manipulation in the wrapper script that runs my Terraform commands to achieve the desired behavior:

terraform apply ...
terraform state rm aws_ebs_snapshot.component

I just remove the resource from the Terraform state to make room for the new one on the next apply.

It would be nice to have an HCL resource declaration for this.

@jpbochi

jpbochi commented Jan 25, 2019

The solution my team implemented was similar to what @Tensho described. We just use terraform state rm ... in between applies so that the Lambda aliases that we create don't get destroyed.

@WigglesMcMuffin

For some things we did recently, we actually used terraform state mv to move the resource out of the way, and -target to get a new one built. Subsequent applies then want to tear down the old resources, so we could apply that when we were ready. That way the state was still tracked, and we didn't have to do any manual clicking about.

@tomelliff
Contributor

tomelliff commented Feb 15, 2019 via email

@thakkerdhawal

+1

@deindorfer

Hard to believe Terraform doesn't have this feature. In the case of AWS snapshots, I somewhat obviously want more than one snapshot. I want to create a new snapshot, but KEEP THE EXISTING ONES, TOO!

Y'all don't support that? Seriously? "Make new, but keep existing"

Like when I install a new binary on my windows laptop, I don't want to delete and reinstall all of the other binaries, I want TO KEEP WHAT I'VE GOT and ADD THE NEW ONE.

Could this please get a look from The Hashicorp Core Dev Team?

@hdryx

hdryx commented Jun 28, 2019

Same thing here with an EMR cluster. I want to launch a new cluster but keep the old one running. Terraform always destroys the old one and replaces it with a new one.
Hope there is a solution for that.

@DavidGamba

DavidGamba commented Aug 21, 2019

Similar workflow here: I want to deploy a new EC2 instance and leave the old one running until the load balancer marks the new one as good, then run another plan/apply to delete the old instance.
I would like something like terraform plan -no-destroy.

It raises some issues around indexes, because it has to increment them. For example, my LB is using the instance count to add the entries to the target group. During the no-destroy phase I expect the extra entry to be added (so somehow the state needs to increment its count), and then when we actually apply the plan that destroys, the count goes back to normal.

@tomelliff
Contributor

@DavidGamba you can already do that, as long as your change to the EC2 instance forces a destroy (e.g. changing the AMI), by using the create_before_destroy lifecycle block.

It will also do this for you in a single terraform apply.
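
Roughly (values are placeholders):

resource "aws_instance" "web" {
  ami           = var.ami_id    # changing this forces a replacement
  instance_type = "t3.micro"

  lifecycle {
    # Create the replacement instance before destroying the old one,
    # all within a single apply.
    create_before_destroy = true
  }
}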

@RaniSputnik

I think this feature would be useful in many cases and should be considered cross-provider.

I want to keep my old Lambda code versions around for a bit in S3, since they can now be attached to older Lambda versions with an alias and (soon) I should be able to route traffic to those old versions, but there is no way to update and add a new code version and alias without deleting the old code version with aws_s3_bucket_object.

This is exactly the same issue I face but with Google Cloud. Ideally I can have Terraform create new objects in cloud storage and publish new function versions without destroying the old ones.

Another area where I see this being useful is certificate management (again, cross-provider): you never really want to delete your old cert, just provision a new one. This feature would also help there.

In terms of where I would expect this to be surfaced, I would like to see it as a lifecycle rule, perhaps duplicate_on_modify or abandon_on_destroy (as already suggested). Personally, I prefer abandon_on_destroy, because if a resource can be modified then it wouldn't be replaced anyway, but perhaps there's a use case for that too?
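
To make the suggestion concrete, a purely hypothetical sketch (neither flag exists in Terraform today), using a Cloud Storage object as in my case:

resource "google_storage_bucket_object" "function_source" {
  name   = "function-${var.release}.zip"    # illustrative names
  bucket = google_storage_bucket.artifacts.name
  source = "build/function.zip"

  lifecycle {
    abandon_on_destroy = true   # hypothetical: forget the old object instead of deleting it
  }
}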

@avoidik

avoidik commented Jun 18, 2021

How do you plan to pick up resources left after abandon_on_destroy?

@MeNsaaH

MeNsaaH commented Jun 27, 2021

In my case I create an AMI from an image every time I run Terraform. It would be great if Terraform could create new AMIs without having to destroy the old AMIs. We could have something like a max_history lifecycle setting, where Terraform deletes the oldest resource once the number of resources created reaches max_history. The variable could be set to null if there should be no limit.

Something like: replacing a resource should respect max_history and create a new resource instead, while an outright destroy should destroy all of the tracked resources.

I don't think abandon_on_destroy is such a great approach; it'll create a lot of orphaned resources. Having Terraform keep a historical record of resources seems like the best way to go.

@okedeji

okedeji commented Sep 9, 2021

terraform state rm aws_ebs_snapshot.component

This works well for me. I am able to forget the resource from the state and keep the history of the resource in AWS.

@ccmattr

ccmattr commented Oct 5, 2021

My use case is that we are trying to migrate to a new AWS account, and we want to create the new resources in the target destination first and test them thoroughly, then do a cutover, then tidy up on success. Ideally I would like to do this in 3 applies:

  1. create resources
  2. cutover
  3. destroy old resources

Along those lines, abandoning the resource wouldn't be ideal, as I would have to manually clean up the old resources.

Is that something that could be done?

@jammymalina

My use case is not to destroy the Lambda layer version when the source code changes. I want Terraform to deploy the new layer version and keep the old ones.

@siran

siran commented Dec 16, 2021

I am sharing a Lambda layer between accounts. Since this is done by version, when the layer is deleted we have to update/redeploy the Lambda functions in the other accounts that use this layer (update the version number, since the previous layer version is destroyed).

@blafry

blafry commented Jun 6, 2022

In our case, abandon_on_destroy would solve the problem of certificate rotation on Azure Key Vault with purge protection turned on.

@crose-varde

We have a use case for this: we have some Terraform configurations that manage both a resource and a CloudWatch log group that the resource logs to. If we ever want to change the name of the log group, we can't just change it in the configuration, because a log group name change forces recreation, and our log groups are undeletable for audit reasons. To accomplish what we want, we have to manually terraform state rm the log group before applying our changes. abandon_on_destroy is exactly what we need to avoid this manual step.

@shridhargavai1

shridhargavai1 commented Aug 29, 2022

There is a way:

  1. Plan
  2. Apply
  3. terraform state rm "resource_name" (this removes the resource from the current state)
  4. Apply again

This worked perfectly on GCP for creating 2 successive VMs using the same TF script. The only thing is that we need to write code to capture the current resources, store them somewhere, and generate the commands for step 3. While destroying, we can add them back using terraform state mv "resource_name".

Note: this has a risk, as the very first VM does not get deleted because it is considered to be outside the scope of Terraform, so its cost may persist. So you have to keep backups of the state (incremental).

Hope this helps.

@flovouin

flovouin commented Nov 8, 2022

I have another use case, similar but not identical to those that were presented here, which could be solved by something like abandon_on_destroy.

On GCP, I'd like to remove a BigQuery table from the Terraform state without deleting the actual underlying table, which would result in an unrecoverable loss of data. Setting any kind of lifecycle parameter would make it clear that I know what a destroy means, and that I do not want the actual data to be deleted. The entire process is part of CI/CD and running terraform state rm is not really an option.
The reason behind this use case is tables storing events piped from Pub/Sub topics: when a topic is created, a BigQuery table and a Pub/Sub subscription are created at the same time by Terraform. When a Pub/Sub topic is deleted (after having been deprecated), the Pub/Sub subscription should also be deleted. However, the BigQuery table should be kept around. No new data will be piped into the table, but the historical data is still relevant for analysis and auditing.

Please note that google_bigquery_table has a deletion_protection argument that kinda interferes with the lifecycle (it has to be set to false in order for a real deletion to succeed). One could argue that my feature request should be implemented by the GCP provider, and I'd be fine with that. However, it sounds like the deletion_protection argument is close to the prevent_destroy Terraform lifecycle argument (the documentation even states "a measure of safety against the accidental replacement of [...] database instances").
For me, this shows that the boundary between Terraform's and the provider's responsibilities is not crystal clear. If I had to choose, I'd rather have the abandon_on_destroy behaviour implemented once by Terraform in a generic manner than rely on each provider implementing it in its own way.
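
For reference, the provider-level safeguard I'm referring to looks roughly like this (dataset and table names are made up):

resource "google_bigquery_table" "events" {
  dataset_id = google_bigquery_dataset.analytics.dataset_id
  table_id   = "events"

  # Provider-level guard: a destroy (or forced replacement) fails while this is true.
  deletion_protection = true
}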

On a slightly different note, by browsing around I stumbled upon CloudFormation's DeletionPolicy, which looks like their solution to the need expressed in this issue. (I never used CloudFormation though, and could be completely wrong.)

@mt-empty

I also want this.
I wanted to automate GitHub repository creation, so I created a simple Terraform script to create a new GitHub repository.
The problem is that whenever I want to create a new repo, I have to delete the old state.

@rasmus-rudling

Any existing workarounds?

@marocchino

I can't give you a code sample, but if you erase the task definition diff after tf apply with jq or some script, the plan's behavior will be as intended.

@mvadu

mvadu commented Apr 11, 2023

Adding another use case: we use Terraform to automate creating new stacks in Grafana Cloud, with the stack name passed as a variable. The first time, it creates a new stack and stores the details in the state. We don't want to destroy the first stack (and thus lose all its data) when creating the second one. The current approach is a teardown stage that removes the references from state: terraform state list | %{if($_ -match "new_stack|grafana_data_source|grafana_cloud_api_key"){terraform state rm $_}} but a less hacky way would be preferred.

@shridhargavai1

shridhargavai1 commented Apr 12, 2023 via email

@aharo-lumificyber

Does anyone have any good ideas for doing this when you're building things within Azure?

@spkane

spkane commented Feb 15, 2024

I am adding this here; it was originally its own feature request but has been deemed a duplicate of this one...


Use Cases

The core idea is to create a way to tell Terraform to remove a resource from the state file during a destroy workflow instead of contacting the owning API to delete the object.

This would make it possible to handle nested objects, like kubernetes_namespace resources, that exist inside a Kubernetes cluster which you will destroy in the same workflow, and which you therefore do not need Terraform to remove via the owning API.

Attempted Solutions

HashiCorp would recommend that people use one Terraform workflow to spin up a k8s cluster and then install things into that cluster in a separate Terraform workflow.

However, there are many times that at least some bootstrapping will occur in the initial Terraform workflow. This fix would allow users to quickly identify resources that need not be destroyed via their API.

Proposal

lifecycle {
  # This would cause Terraform to remove the resource from the state file
  # instead of calling the owning API to delete it.
  state_only_destroy = true
}
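
For example, applied to the namespace case above (hypothetical, since this flag does not exist today):

resource "kubernetes_namespace" "apps" {
  metadata {
    name = "apps"
  }

  lifecycle {
    # Hypothetical: on destroy, drop the namespace from state and let the
    # cluster teardown remove it, instead of calling the Kubernetes API.
    state_only_destroy = true
  }
}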

@darkn3rd

Any update? It's been 7 years?

@alexeyinkin

Another use case for abandon_on_destroy.

I have a Google Cloud Spanner instance, created outside Terraform, that should be permanent. With Terraform, I want to create a database in that instance and then drop the database on destroy (but keep the instance).

I need the instance in the configuration to refer to its properties, so I use an import block, which is a declarative way to skip creation. If only for symmetry, we need a declarative way to skip the deletion too, which abandon_on_destroy is.

state rm before destroy is a poor substitute in my case, because if the destroy fails, the instance will already have been removed from the state. This brings tons of logical problems. The very concept of deleting in reverse dependency order says that state rm before destroy is wrong. The right way is to let destroy figure out the order of deletion (or abandoning).

Another idea for this use case is to add an option to the import block to un-import the resource when destroying the configuration. This is even better symmetry, but it does not allow for the other use cases suggested for abandon_on_destroy here.
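
For context, the import-block half of that symmetry looks roughly like this in my setup (names are placeholders, and the exact import ID format should be checked against the google provider docs):

import {
  to = google_spanner_instance.permanent
  id = "my-project/my-instance"   # placeholder import ID
}

resource "google_spanner_instance" "permanent" {
  name         = "my-instance"
  config       = "regional-us-central1"
  display_name = "Permanent instance"
  num_nodes    = 1
}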

@bcsgh

bcsgh commented Apr 17, 2024

@alexeyinkin Your case, where the config doesn't actually manage the resource, seems like a prime example of where to use a data "google_spanner_instance" ...
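
A minimal sketch of that (names are placeholders, assuming the google provider's google_spanner_instance data source), with a database resource to show how the instance's properties would be referenced:

data "google_spanner_instance" "permanent" {
  name = "my-instance"
}

resource "google_spanner_database" "app" {
  instance = data.google_spanner_instance.permanent.name
  name     = "app-db"
}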

Now, I was kind of expecting you to say you want Terraform to manage the configuration (i.e. all the knobs) but not the lifetime (creation/deletion). I'd see that as a possible case, but from what you described, that doesn't seem to be your use case.

@umesh07feb2022

I'm creating an AMI from the instances on which I'm deploying code and using that AMI for my launch template. When I create an AMI from another instance, it destroys the previous AMI. In this case I'm stuck, because if something goes wrong I can't roll back to my previous AMI version. Is there any way in Terraform to create a new AMI while keeping the older AMIs?

@Bombdog

Bombdog commented Oct 4, 2024

This "state_only_destroy = true" flag would be the dogs bollocks for us. If you do frequent destroy and rebuild cycles you can preserve one or two resources and then actually pick them up again using an import{ } block. There's not only pet things like databases but sometimes you have infrastructure that simply wont go away. I can think of a few resources on GCP, but also I am working with Vault and a vault auth backend - it can only be emptied out on our setup, curiously it cant be deleted at all. So when you absolutely cannot delete something but you need to destroy everything else then this state_only_destroy = true idea is a winner as far as I'm concerned.

@par-texx

Adding my 2 cents on this.

Our use case is the rotation of passwords/certificates/etc. We are using azuread_application_password to create a client ID and secret, but we want to rotate them before they expire. Re-running azuread_application_password to rotate it deletes the existing secret, which kicks out everyone using it.

If we can just append the new one to the application, we can set an expiry of 2x the rotation period, allowing applications to cycle through their lifecycle and grab the new one from our Vault instance without disrupting the applications running at the time of secret creation.
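
A rough workaround sketch, not something we run today (argument names assume the azuread 2.x provider and the application resource name is made up): keep both the current and the previous secret managed by keying the resource on a rotation label, so an old secret is only deleted when its label is dropped from the set.

variable "active_secret_labels" {
  type    = set(string)
  default = ["2024-h1", "2024-h2"]   # current and previous rotation periods
}

resource "azuread_application_password" "rotating" {
  for_each              = var.active_secret_labels
  application_object_id = azuread_application.example.object_id   # assumed application resource
  display_name          = "client-secret-${each.key}"
  end_date_relative     = "8760h"   # roughly 2x the rotation period
}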
