Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terraform should take backups of the DB by default when removing an RDS instance #1139

Closed
ColinHebert opened this issue Mar 6, 2015 · 19 comments

Comments

@ColinHebert
Copy link
Contributor

Terraform tends to remove the DB for certain trivial changes. It would be better for new users to assume that by default a snapshot needs to be taken. This will prevent DBs from being removed by mistake (yes I just did that...)

@phinze
Copy link
Contributor

phinze commented Mar 6, 2015

I hear what you're saying, and I'm sorry this caused you trouble. The other side of this tradeoff is "should terraform by default leak artifacts that a user will have to pay for?"

One of the key features of Terraform that helps prevent it from ever doing something unexpected is terraform plan. It's part of the core design of the project that Terraform will never take an action that does not show up in the plan.

I'll mark this as thinking to stew on whether there might be a way to make this nicer for new users.

@ColinHebert
Copy link
Contributor Author

Another option could be something along the lines of making final_snapshot_identifier mandatory, with a special value (empty string/null/none) which would disable the final snapshot.

@willejs
Copy link

willejs commented Mar 9, 2015

perhaps termination protection on resources?

@ColinHebert
Copy link
Contributor Author

That would be nice, if AWS supported it

@ketzacoatl
Copy link
Contributor

If Terraform is to be used by ops, Terraform ought to have reasonable support for typical migrations and long-term handling of resources. The AWS RDS API is sensitive to change and Terraform will often want to re-create an RDS instance for something the user might disagee with. Having support for persistence of the db is critical for real-world use.

@eukaryote
Copy link

I agree very strongly with this issue, and I am right now trying to deal with total data loss. Terraform reported:

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed.

Your plan was also saved to the path below. Call the "apply" subcommand
with this plan file and Terraform will exactly execute this execution
plan.

Path: plan

~ module.db
    1 resource(s)

There was no 'red' showing, only yellow, and I expected an in-place update when applying the saved plan, but then I ran the plan and watched in horror as terraform was deleting the instance.

My database terraform conf contained "final_snapshot_identifier", but it may not have contained the param when the database was created a month ago or so (I don't remember adding it later, but it's possible it wasn't there originally). No final snapshot appears to have been taken, nor was there any indication why that didn't happen, so now I'm paying for a month of technical support so I can file a tech support case and beg Amazon for some way to restore the data, because all the existing snapshots are gone.

[edit] No final snapshot was taken, and I ended up losing data -- Amazon had already deleted all their automatic snapshots. As I explained in #564 (comment) though, the plan output shown above does not show the changes that will be made in the module.db because I didn't include "-module-depth=1" as a plan parameter, and when I do that, terraform does correctly show that it intends to delete and then recreate the database. I still am not sure why the final snapshot wasn't made though, unless it's because it wasn't specified when the database was created a month ago.

@knuckolls
Copy link
Contributor

The fact that Terraform does automated destruction of things makes it a very powerful and very dangerous tool. When it comes to stateful services I believe that there will have to be primitives supplied for doing much more controlled destruction of things. I have considered, but not yet implemented a "decomission-on-delete" flag that would upon deletion of a resource, merely put it into a defunct part of the statefile. That way the cleanup and destruction of the resource old resource is on the human. Terraform would then nag that decomissioned resources need to be cleaned up in a controlled manner by the operator.

I hate to hear that you lost data, that's a nightmare scenario. Regardless, I fully believe that "terraform apply" should always be run on a UAT or Staging environment before being run on a production environment. That's one of those practices that I feel like is hard to encode into the tool, without formally introducing the notion of environments and promotion into the tool itself. Food for thought, at the very least.

@mitchellh
Copy link
Contributor

@knuckolls Actually we've talked about this elsewhere and have planned a lifecycle (for the lifecycle block) parameter on one extreme which tells Terraform to never delete something for any reason. We can then get fancier and perhaps add more modes which are confirmations or something.

@ketzacoatl
Copy link
Contributor

+2 on improving the user experience with terraform's destroy, and -1 on keeping it brittle or otherwise inflexible. Terraform's destroy operation could help me a lot by letting me choose what to delete, rather than removing everything on me if I say yes. Destroy could also show you what will be deleted in a concise list. But in either case, it would be even more awesome to have a way to soft delete and object to keep it around alittle while longer (but not in your way).

I have attempted to use terraform plan/apply with the plan file to come up with a more structured method for deploying/testing against one environment before another, but it doesn't work so well in practice. I do not have clear suggestions for improvement, but it would be wonderful if terraform had a way to help along migrations. In the context of this issue about dbs, that would mean helping to make sure your data is still available after performing a change that AWS would require killing the existing db instance for.

@matthewford
Copy link

👍

@nathanielks
Copy link
Contributor

I believe an interim solution is to actually utilize 2 separate Terraform configs, one for the ephemeral resources and one for the not-so (rds). I think it'd be possible to hardcode the ids of the other resources when connecting the two, yeah? (thinking in context of the VPC, subnet, and routing table relationships)

@stack72
Copy link
Contributor

stack72 commented Nov 10, 2015

There is an open PR #3853 that will now follow the AWS Guidelines and make a final snapshot on db instance delete - http://docs.aws.amazon.com/cli/latest/reference/rds/delete-db-instance.html

--skip-final-snapshot | --no-skip-final-snapshot (boolean)

Determines whether a final DB snapshot is created before the DB instance is deleted. If true is specified, no DBSnapshot is created. If false is specified, a DB snapshot is created before the DB instance is deleted.
Note that when a DB instance is in a failure state and has a status of 'failed', 'incompatible-restore', or 'incompatible-network', it can only be deleted when the SkipFinalSnapshot parameter is set to "true".
Specify true when deleting a Read Replica.
Note
The FinalDBSnapshotIdentifier parameter must be specified if SkipFinalSnapshot is false .
Default: false

@stack72
Copy link
Contributor

stack72 commented Dec 11, 2015

@phinze / @catsby as my PR above has been closed now, I think this can be closed

@phinze
Copy link
Contributor

phinze commented Dec 15, 2015

👍

@phinze phinze closed this as completed Dec 15, 2015
@ketzacoatl
Copy link
Contributor

Is the snapshot id or similar made available in a way we can store it in state or as output?

@stack72
Copy link
Contributor

stack72 commented Dec 15, 2015

@ketzacoatl you actually have to tell it the final snapshot identifier so you can just output that

@ketzacoatl
Copy link
Contributor

oh right, silly question from me. sorry for the noise!

@stack72
Copy link
Contributor

stack72 commented Dec 15, 2015

No noise at all - good to be sure

@ghost
Copy link

ghost commented Apr 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants