
r/resource_aws_security_group: increase deletion timeout #1052

Closed
wants to merge 1 commit

Conversation

s-urbaniak

This bumps the timeout for deleting security groups, as we experienced a high flake count during cluster destruction. Increasing the timeout reduces flakes significantly.

This is a stop-gap solution until resource providers get timeout override support.

Fixes coreos/tectonic-installer#1242

/cc @radeksimko @jasminSPC
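
For readers skimming this thread, here is a minimal sketch of the kind of change being proposed: retrying `DeleteSecurityGroup` on `DependencyViolation` for a longer window. The function and surrounding wiring are illustrative assumptions based on the provider SDK of that era, not the PR's actual diff:

```go
package aws

import (
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/service/ec2"
	"github.com/hashicorp/terraform/helper/resource"
)

// deleteSecurityGroupWithRetry retries the delete call on DependencyViolation
// until the (bumped) window elapses. Illustrative sketch only, not the PR's diff.
func deleteSecurityGroupWithRetry(conn *ec2.EC2, groupID string) error {
	return resource.Retry(30*time.Minute, func() *resource.RetryError { // previously a much shorter window
		_, err := conn.DeleteSecurityGroup(&ec2.DeleteSecurityGroupInput{
			GroupId: aws.String(groupID),
		})
		if err != nil {
			if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "DependencyViolation" {
				// Dependent ENIs/instances may still be draining; keep retrying.
				return resource.RetryableError(err)
			}
			return resource.NonRetryableError(err)
		}
		return nil
	})
}
```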

@radeksimko radeksimko added the bug Addresses a defect in current functionality. label Jul 4, 2017
@radeksimko
Member

Hi @s-urbaniak
I'd love to solve this problem, but I think 30 minutes is a bit too high for a resource like this, given the consequences. Keep in mind this timeout also applies to users who hit a genuine DependencyViolation - i.e. users who have a dependency lying around will have to wait 30 minutes before they see the error. That's not a great user experience.

The good news is that if you can consistently reproduce this problem, we can dig in and find out what's causing it, like we did in #1021 - would you mind following the same process, i.e. sending me the debug logs + relevant tf configs?

Thanks.

@radeksimko radeksimko added the waiting-response Maintainers are waiting on response from community or contributor. label Jul 4, 2017
@s-urbaniak
Author

@radeksimko Sure, I'll try to provoke the timeout as I did in the previous case.

@jasmingacic

Maybe it wouldn't hurt to parameterise the Delete timeout?
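
For context on this suggestion: Terraform's helper/schema supports per-resource customizable timeouts, so parameterising the Delete timeout could look roughly like the sketch below. Wiring it into aws_security_group here is hypothetical, not something this PR does:

```go
package aws

import (
	"time"

	"github.com/hashicorp/terraform/helper/schema"
)

// Hypothetical sketch: declaring a configurable Delete timeout on the resource.
// Users could then override it in config with a `timeouts { delete = "30m" }` block,
// and the delete function would read it via d.Timeout(schema.TimeoutDelete).
func resourceAwsSecurityGroupWithTimeouts() *schema.Resource {
	return &schema.Resource{
		// Schema and CRUD functions elided for brevity.
		Timeouts: &schema.ResourceTimeout{
			Delete: schema.DefaultTimeout(10 * time.Minute), // default; overridable per resource instance
		},
	}
}
```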

@radeksimko
Member

radeksimko commented Jul 4, 2017

@jasminSPC I'd prefer not to do it in this context for this resource. See my longer explanation in #945 (comment)

There's no good reason for an SG removal to take that long. Yes, the APIs are eventually consistent and sometimes laggy, but that's a matter of minutes, not half an hour. If the user has specified references between resources correctly (i.e. we aren't deleting everything at the same time), then Terraform will go resource by resource, and the only reason for being stuck so long in this situation is a resource not cleaning up after itself on Amazon's side.

That is likely what's happening here, so once we understand what's holding the SG back from removal, we can do the cleanup ourselves, without having to crank up timeouts, and make things work out of the box for everyone. 🙂
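
To make the "clean up ourselves" idea concrete, here is a rough sketch of what that could look like: before deleting the group, find and remove detached network interfaces that still reference it. This is purely illustrative and not code from the provider:

```go
package aws

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// cleanupLingeringENIs deletes unattached network interfaces that still
// reference the given security group, so the group itself can be removed
// without waiting out a long DependencyViolation retry window.
// Illustrative sketch only.
func cleanupLingeringENIs(conn *ec2.EC2, groupID string) error {
	out, err := conn.DescribeNetworkInterfaces(&ec2.DescribeNetworkInterfacesInput{
		Filters: []*ec2.Filter{
			{Name: aws.String("group-id"), Values: []*string{aws.String(groupID)}},
		},
	})
	if err != nil {
		return err
	}
	for _, eni := range out.NetworkInterfaces {
		// Skip interfaces that are still attached; those belong to resources
		// Terraform should destroy first via its dependency graph.
		if eni.Attachment != nil {
			continue
		}
		if _, err := conn.DeleteNetworkInterface(&ec2.DeleteNetworkInterfaceInput{
			NetworkInterfaceId: eni.NetworkInterfaceId,
		}); err != nil {
			return err
		}
	}
	return nil
}
```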

@Ninir
Contributor

Ninir commented Aug 17, 2017

Hey @s-urbaniak ,

As @radeksimko pointed out, this issue seems quite specific, since not many people are complaining about it.

I don't want to rush anyone on this one (😅), but we should either find a solution if there is a real issue, or close it if it's not reproducible.

Do you think you would have time to investigate?

Thanks!

@s-urbaniak
Author

Closing, as we will try to gather more data points with trace logging (TF_LOG=TRACE) enabled and come up with a more sensible solution.

@s-urbaniak s-urbaniak closed this Aug 28, 2017
@Ninir
Contributor

Ninir commented Aug 28, 2017

Thank you for that @s-urbaniak :)

@ghost

ghost commented Apr 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 11, 2020
@breathingdust breathingdust removed the waiting-response Maintainers are waiting on response from community or contributor. label Sep 17, 2021
Labels
bug Addresses a defect in current functionality.
Successfully merging this pull request may close these issues:
aws flake aws_security_group.*: DependencyViolation
5 participants