provider/aws: iam_instance_profile not yet ready when ec2 instance is launched #1885
Comments
I am unsure how to solve this properly. How can we be sure the instance profile is propagated? Simply checking if it exists is not enough, because the problem here is that it simply isn't visible to the created instance yet. At the same time, the error simply returns HTTP code Furthermore, I suspect this may actually need to be solved upstream by the Go AWS SDK. I "solved" it for now by adding |
+1 I just ran into this issue as well. |
+1 |
aaaaand I just hit this as well. I wonder if a new terraform option for retry or (uggg) sleep should be added as a generic resource option for advanced usage. I hate it but ..... |
We've solved these sorts of issues with the AWS api by doing a smart retry with a back-off. Try again, then back-off, try again, back-off a little more, try again, back-off a little more, error. |
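The retry-with-back-off pattern described in the comment above could be sketched roughly like this. It is a minimal Python illustration, not Terraform's or the AWS SDK's actual code; `retry_with_backoff`, `is_retryable`, and the delay parameters are hypothetical names chosen for the example.

```python
import time


def retry_with_backoff(operation, is_retryable, max_attempts=4,
                       base_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call operation(); on a retryable error, wait and try again.

    All names and default values are illustrative assumptions.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as err:
            # Give up on non-retryable errors or once attempts run out.
            if attempt == max_attempts or not is_retryable(err):
                raise
            sleep(delay)      # back off ...
            delay *= factor   # ... a little more each time
```

The caller decides which errors count as retryable, so an error that is not caught yet (as the next comment mentions) would be handled by extending `is_retryable` rather than changing the loop.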
@adamhjk Yep! We do the same thing, but the error message here is one we don't catch yet. We'll add that. |
We solved this on Thursday with a local-exec provisioner on the iam_instance_profile w/ an inline command = "sleep 5". Experimentation may be needed to find the right sleep. It sucks to put artificial sleeps in local-execs, but it currently avoids having to re-run apply, so that's a win imo. I agree with the backoff patch being the correct solution. Just sharing our bandaid. |
Glad to find this. Thought I was going crazy. |
Proposing #2037 to fix this |
Aaand I've finally hit this issue as well! terraform version
Terraform v0.5.3-dev (a6b8b65e6e0b5cf1f0b4fcf3f6abde3b7db21a97), which was built 5 days after the PR was merged. Am I missing something obvious? When I apply all resources on the first run (from scratch, no resources present), TF fails because of the slow IAM propagation (see error output below). Error output when creating all VPC resources in one run:
|
This has always been a problem with the AWS API: the results of some methods are only eventually consistent. I believe there are more cases than just IAM; I've definitely run into other cases over the years. So maybe a more generic approach is a good idea? Incremental back-off seems like a decent idea, maybe parameterised at the resource level, i.e. adding backoff_step (time in seconds) and backoff_attempts properties to resources, then calculating the cumulative wait time from there. This would at least allow people to sidestep the issue rather than it being a blocker, as there will inevitably be more cases like this; such is the nature of the AWS API. I'd definitely like to be able to change the behaviour via config rather than changing the source and recompiling Terraform. |
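To make the proposal above concrete, here is a small Python sketch of how a wait schedule might be derived from the two suggested properties. Note that `backoff_step` and `backoff_attempts` are only the names proposed in this comment, not existing Terraform options, and the linear growth (one extra step per attempt) is an assumption about what "incremental" means here.

```python
def backoff_schedule(backoff_step, backoff_attempts):
    """Return the wait (in seconds) before each retry.

    Assumes the wait grows by one step per attempt:
    step, 2 * step, 3 * step, ...
    """
    return [backoff_step * n for n in range(1, backoff_attempts + 1)]


def total_backoff(backoff_step, backoff_attempts):
    """Cumulative worst-case wait across all attempts, in seconds."""
    return sum(backoff_schedule(backoff_step, backoff_attempts))
```

Under these assumptions, a step of 5 seconds with 4 attempts would wait 5, 10, 15, and 20 seconds, for a worst case of 50 seconds before giving up.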
@stefancocora do you have a sample config that reproduces this error for you consistently? #2037 addressed the simple case I was able to reproduce, but maybe it was too simple. If you do have such a config, please share, being sure to omit any secrets! |
As a note, I do not get the same issue if the instances are booted by an autoscaling group. This race condition occurs when launching an instance directly with the aws_instance resource that depends directly on an IAM instance profile. So if you can use autoscaling, it sidesteps the issue, as AWS must either make sure the instance profile is available or let enough time pass before it tries to create the minimum instances in the launch config. |
PR #2037 does not fix the issue for me, either...
@knuckolls local-exec provisioner workaround has gotten us past the issue for now. |
@ljohnston What's the command you're local-exec-ing? |
@jkodroff I would assume it's something like "sleep 10" |
@jkodroff ...
|
Just hit this same issue on terraform v0.6.4. Lots of errors like this:
Adding |
I'm hitting this error with 0.6.12 |
I am having the same issue with 0.6.15 and it's very intermittent. |
+1 |
+1 |
+1 seem to be hitting this with latest |
+1 |
+1 |
0.7.11 having issues here |
kubernetes v1.4.5
SOLUTION - tear down the cluster and retry ... this error is internally shielded from the end user and fixes itself, so there is no need to manually muck about creating any IAM gadgetry
|
We've been using 0.7.5 and have been hitting the issue. So, we tried 0.7.11 today and it had the same problem. |
I finally got past it using the sleep hack, started at 5 seconds and had to bump it all the way to 90s before deployment would succeed
|
Same issue here |
Just hit this with Terraform v0.8.6. Perhaps worth mentioning is that this, at least in my case, usually leads to a stray EC2 instance running on EC2 that isn't picked up by Terraform and requires manual termination/cleanup. |
I haven't hit this in Terraform, but I have when using the API directly. I often have to put sleeps in for things that take a little time to propagate across AWS services. I've noticed that instance profiles take quite a while (relative to other resources) to be seen as a valid profile to attach to an instance. I just wanted to note that CloudFormation appears to wait (sleep) a full 2 minutes (!) after an instance profile is created! (Assuming this is still accurate, over a year later) https://forums.aws.amazon.com/thread.jspa?messageID=593651 |
Found in v0.9.1 |
Likewise, hitting this in v0.9.3. It's still resulting in a stray instance, too. |
Still hitting this bug also on 0.8.x so I am not sure why it's closed |
Found in v0.9.4 also. |
Also hitting this in v0.9.0.4. Running a second time does solve the issue, but would be great to have a longer term fix. |
Yep the second run worked for me as well. |
I have a similar issue
Putting sleep for 2 min works for me.
Ref: https://forums.aws.amazon.com/thread.jspa?messageID=593651 |
Still present in 0.9.10 |
Probably hashicorp/terraform-provider-aws#838 |
I think this issue occurs because when terraform tries to find
Let me know if this works for anyone else too. Better than waiting for some random amount of time. |
It's worse than that. |
Why is this issue closed? It still exists in v0.11.7. So far I'm using this workaround, which is weird:
|
Hi all, Issues with the terraform aws provider should be opened in the aws provider repository. Because this closed issue is generating notifications for subscribers, I am going to lock it and encourage anyone experiencing issues with the aws provider to open tickets there. Please continue to open issues here for any other terraform issues you encounter, and thanks! |
When launching an EC2 instance with a new IAM instance profile, Terraform returns an error:
I believe this is due to a problem mentioned here in the AWS docs:
I think Terraform should catch this error and retry when it occurs.
Right now, simply re-running plan/apply solves the issue for me.