Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helper: Add ContinuousTargetOccurence to work around inconsistency #4447

Merged

Conversation

radeksimko
Copy link
Member

This is to work around the known issue with inconsistent APIs, like IAM or KMS, namely #4375 #4306 #2869 #2499 #2136

Very similar issue has been raised recently in Vault too: hashicorp/vault#687
cc @jefferai

As you may expect such issues are difficult to reproduce unless you have a full copy of AWS environment.

Current solution(s)/work-around(s)

Just simply wait for the 1st "successful" state

The problem here is that 1st success != replication done, see below a snippet from my KMS testing:

KEY_ID=$(aws kms create-key --description "aws-eventually-consistent-kms-3" | jq -r .KeyMetadata.KeyId)
echo "Disabling $KEY_ID"
aws kms disable-key --key-id $KEY_ID
echo "$KEY_ID should be now disabled"
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
aws kms describe-key --key-id $KEY_ID | jq .KeyMetadata.Enabled
Disabling 128169fc-a5ac-481f-99f5-730c6980d924
128169fc-a5ac-481f-99f5-730c6980d924 should be now disabled
false
true
false
false
true
true

Retry dependent API calls

This for example means retrying ec2:StartInstance if the IAM Instance Profile has a policy which has not been replicated yet or retrying ecs:CreateService if the IAM Role has a policy which has not been replicated yet.

It seemed to be a pretty good solution, until I realised this approach fails exactly the same way as the approach above - see #4375 (comment) where ecs:CreateService succeeded and subsequent ecs:UpdateService failed.

@radeksimko radeksimko changed the title helper: Add StateChangeConf.ContinuousTargetOccurence (int) helper: Add ContinuousTargetOccurence to work around incosistency Dec 27, 2015
@radeksimko
Copy link
Member Author

This would help us to unblock the KMS PR (or at least it is my feeling that people will less likely complain about the eventual consistency effects of it).

@catsby @phinze What do you think of this concept? I know it may look a bit "hacky", but I was not able to come up with anything better. Vault apparently somehow worked around this by using STS (which is region specific), but that's not quite the option for us in Terraform I'm afraid as we're working with the resources directly, not solving auth problem.

@phinze
Copy link
Contributor

phinze commented Feb 23, 2016

Just finally got around to looking at this. Seems like a totally reasonable thing for us to add, and I like the unit tests!

LGTM

@radeksimko
Copy link
Member Author

ok, I will resolve the conflicts, repush and wait for green light from Travis, then merge

@radeksimko radeksimko changed the title helper: Add ContinuousTargetOccurence to work around incosistency helper: Add ContinuousTargetOccurence to work around inconsistency Feb 23, 2016
radeksimko added a commit that referenced this pull request Feb 24, 2016
helper: Add ContinuousTargetOccurence to work around inconsistency
@radeksimko radeksimko merged commit 8ce0458 into hashicorp:master Feb 24, 2016
@radeksimko radeksimko deleted the f-work-around-incosistent-api branch February 24, 2016 21:51
@deitch
Copy link

deitch commented Oct 19, 2018

It looks like this was merged in almost three years ago, but still seeing people reporting issues with IAM resource creation and eventual consistency.

Is there some (possibly unknown) config/flag one needs to use to indicate, "this is an eventually consistent resource, wait until you have x consistent successful gets before moving on"?

@ghost
Copy link

ghost commented Apr 1, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 1, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants