helper/resource: restore retval of resource.Retry on timeout #5460
In #4700 while fixing a data race I made an incorrect assumption about the return value of StateChangeConf, and ended up changing the behavior in the timeout case to always return:

> timeout while waiting for state to become '[success]'

when it used to capture the "most recent error" from the function itself.

It's much more useful to see that error bubbling up, so here we revert to pulling it out of the function as we did before, and we protect against the data race with a good old-fashioned mutex.
@phinze you sir are awesome! This works a treat :)

[BEFORE and AFTER output not preserved]
This fixes #5454 (at least)
Is there any workaround for the
This LGTM. The lock/unlock seems cleaner than pulling out yet another variable for return.
What I still don't quite understand is how this change fixes an actual behavior problem. My reading of the code suggests that, under an existing timeout condition, a more useful error message might be shadowed by the timeout error, but it seems that real issues are being caused instead. I'm going to take another look to see why this might be happening.
So here is what I think is happening: The timeout can interrupt anywhere inside the

If the timeout fires when a happy retval has already been recorded but before we've caught the done channel close, it (prior to this PR) overrides the return value w/ the timeout error.

I still haven't grokked why it seems to happen often enough to be noticeable, but I'm willing to just merge this as is and continue to triage incoming bug reports rather than trying to fully explain this thing step by step.
I am getting the error "Error launching source instance: timeout while waiting for state to become 'success' (timeout: 15s)" while launching an instance. My Terraform version is "Terraform v0.9.6". Can someone please help me?
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.