Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deletion Policy property as a workaround to CloudFormation Stack Delete Fail #19

Merged
merged 1 commit into from
Aug 17, 2018

Conversation

iDVB
Copy link
Contributor

@iDVB iDVB commented Jul 29, 2018

Do to Lambda@Edge's replica deletion time not yet supporting CFN.... deleting stacks causes the delete to fail very ungracefully. Hopefully AWS will fix this soon.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-edge-delete-replicas.html

Until then, this is something we need so that we can safely tear down CFN stacks and NOT have them fail.

Later we automate orphan Function cleanup.

@coveralls
Copy link

coveralls commented Jul 29, 2018

Coverage Status

Coverage increased (+1.002%) to 10.185% when pulling 531b0c8 on iDVB:master into e586c8b on silvermine:master.

Copy link
Member

@jthomerson jthomerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iDVB! Nice work. Caught one typo to be fixed.

Also, I'm vacillating a bit on how we want to explain this to users. While you're obviously pro-Retain, I'm sort of against it. Why? Because I've found that if you delete the stack, you'll get the error the first time. But in the background, AWS does start deleting your replicated function. So if you delete the stack again a few hours later, it should succeed. To me this is preferable because:

  1. stack deletions are very rare (at least for us, and I think in general they're fairly rare)
  2. using "Retain" means that now you're not cleaning up after yourself
  3. the most common place to do stack deletions is in a development environment, where it's also really bad to leave random junk laying around (and I think devs are more prone to forget to go back and manually clean up after themselves if they used Retain)

Thoughts? Maybe we can incorporate something in the documentation explaining both approaches?

README.md Outdated
@@ -71,6 +71,12 @@ objects, the objects all have the same fields, each of which is explained here:
behaviors in the specified distribution if you want this function to be associated
with a specific cache behavior. If the path pattern is not defined here, the function
will be associated with the default cache behavior for the specified distribution.
* **`deletionPolicy`** (optional): a string, can only be set to `Retain` which
sets the [DeletionPolicy][DeletionPolicy] of the function resource. Can be
used to avoid the currently enevitable CloudFormation stack deletion failure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/enevitable/inevitable/

Copy link
Member

@jthomerson jthomerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to mark it as "request changes". Also, please rebase according to our coding standards so that this PR is a single commit for the feature and doesn't have fixup commits with it.

Thanks!

@iDVB
Copy link
Contributor Author

iDVB commented Jul 30, 2018

@jthomerson Very fair. I was not aware that re-runing the stack deletion succeeds the second time. I'll have to give that a go as well.

We do a lot stack deleting, due to the amount of prototyping in dev we do. Because of that, we must have the ability to fire up and tear down stacks in short order and to automate it. So for us, it's more important to know the stack is down or fail if something actually went wrong.

Whether you are:
A) Stack PASS w Retain and cron cleanup
or
B) Stack FAIL w Delete and cron rerunning later

Either way you have to automate something later. But with A at least it's more apparent if it failed due to a real error vs this known one.

In any event, I think having this option allows people to choose for themselves which method works best for their use-case.

I also struggled with how best to implement it, since it seemed redundant to allow people to set it to Delete and made no sense to allow them to set it to Snapshot. Also, AWS must be fixing this soon. There is no reason CFN is smart enough to wait for standard CF Distro deletion but can't wait for Edge function replication deletion. So I'm guessing this feature is just going to be around until they fix it.

Happy to put any additional context in there if you like.... or maybe even put this convo in a closed issue and point to the debate from the readme?

thoughts?

@jthomerson
Copy link
Member

@iDVB thanks for all the thought put into this. Thinking about the question you pose about how to implement it: perhaps we should consider implementing this as a flag in the user's custom variables. That's something that some plugins do when they need plugin-wide configuration. I thought of this because of the comment you had in the README about only needing to add the deletion policy to one of the distributions. Perhaps instead we support a service-wide setting like:

custom:
   foo: bar
   retainEdgeFunctions: true

Thoughts?

Either way I think it makes sense to have at least a paragraph in the README explaining why the option exists (what the error would be without it, etc). You could also add a link from that paragraph to this PR so they can see the background.

@iDVB
Copy link
Contributor Author

iDVB commented Jul 31, 2018

@jthomerson made the changes you requested and squashed the commits.

I implemented the global section under the common property name lambdaAtEdge to keep things consistent. I then setup this one as a sub prop to that as retain: true. I also updated the ReadMe

Let me know what you think.

@iDVB
Copy link
Contributor Author

iDVB commented Aug 11, 2018

@jthomerson Any chance this PR can go in soon?
Full disclosure, I'm authoring a course right now that references this plugin and would like to reference this feature as well.

@jthomerson jthomerson merged commit afe1f9f into silvermine:master Aug 17, 2018
@jthomerson
Copy link
Member

Thanks @iDVB! Sorry for the delay - I was on vacation and have been slowly getting caught up. I made some stylistic and typo changes to your commit and commit message, pushed, tested, and merged.

@jthomerson
Copy link
Member

@iDVB published this in v2.1.0

@iDVB iDVB mentioned this pull request Aug 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants