IAM instance profile attachment race condition #9474

Closed
tomelliff opened this issue Oct 20, 2016 · 9 comments · Fixed by #11678

Comments

@tomelliff
Contributor

tomelliff commented Oct 20, 2016

It looks like there's yet another eventual consistency issue with IAM roles: the API returns before the role is fully created, so the AWS API throws errors when an instance attempts to use it.

#7324 and #7938 each fixed one set of issues we were seeing when creating the IAM role and profile in the same folder as the instance that used it.

We're now seeing the following error:

04:12:49 * aws_instance.scope.2: Error launching source instance: InvalidParameterValue: Value (test01-tp-pipe-selenium-node) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
04:12:49    status code: 400, request id: f4b4e246-f1c2-49ee-bf03-782a85a28260
04:12:49 * aws_instance.scope.6: Error launching source instance: InvalidParameterValue: Value (test01-tp-pipe-selenium-node) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
04:12:49    status code: 400, request id: 73afb476-3ebd-4208-a9a3-def9f3152cd9
04:12:49 * aws_instance.scope.1: Error launching source instance: InvalidParameterValue: Value (test01-tp-pipe-selenium-node) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
04:12:49    status code: 400, request id: 1f512f73-6493-44d4-bad0-202402bb4130
04:12:49 * aws_instance.scope.4: Error launching source instance: InvalidParameterValue: Value (test01-tp-pipe-selenium-node) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
04:12:49    status code: 400, request id: 7774f7b9-7818-454b-b7e4-5e0d521b6a58

which follows these creating/creation complete logs:

04:12:28 module.selenium_node_instance.aws_instance.scope.2: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.6: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.1: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.3: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.5: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.4: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.7: Still creating... (10s elapsed)
04:12:28 module.selenium_node_instance.aws_instance.scope.0: Still creating... (10s elapsed)
04:12:38 module.selenium_node_instance.aws_instance.scope.3: Still creating... (20s elapsed)
04:12:38 module.selenium_node_instance.aws_instance.scope.5: Still creating... (20s elapsed)
04:12:38 module.selenium_node_instance.aws_instance.scope.7: Still creating... (20s elapsed)
04:12:38 module.selenium_node_instance.aws_instance.scope.0: Still creating... (20s elapsed)
04:12:47 module.selenium_node_instance.aws_instance.scope.0: Creation complete
04:12:48 module.selenium_node_instance.aws_instance.scope.3: Still creating... (30s elapsed)
04:12:48 module.selenium_node_instance.aws_instance.scope.5: Still creating... (30s elapsed)
04:12:48 module.selenium_node_instance.aws_instance.scope.7: Still creating... (30s elapsed)
04:12:48 module.selenium_node_instance.aws_instance.scope.3: Creation complete
04:12:48 module.selenium_node_instance.aws_instance.scope.7: Creation complete
04:12:49 module.selenium_node_instance.aws_instance.scope.5: Creation complete

It looks like we need to catch and retry on "InvalidParameterValue" with "Invalid IAM Instance Profile name", as well as on "InvalidParameterValue" with "Invalid IAM Instance Profile" and "InvalidParameterValue" with " has no associated IAM Roles".

Terraform Version

Currently running Terraform v0.7.1 but I can't see anything in the changelog or the code to suggest this has been fixed since.

Affected Resource(s)

Please list the resources as a list, for example:

  • aws_instance

Expected Behavior

Instances should have been created with correct IAM instance profile and not error.

Actual Behavior

The instances that were created first failed with the error even though they had been launched with an IAM instance profile; because of the error they were not tagged and not added to the state file. A second run then successfully created these "missing" instances.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply

References

Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? For example:
#2660
#4709

EDIT: Just spotted that this exact error was thrown for the OP on this closed issue: #1885

@ghost

ghost commented Oct 27, 2016

We are also running into this issue.
After poking around in the terraform code we found that there is a retry:
https://github.com/hashicorp/terraform/blob/master/builtin/providers/aws/resource_aws_instance.go#L374
and awserr checks by substring so it looks like it should work:
https://github.com/hashicorp/terraform/blob/master/builtin/providers/aws/awserr.go#L11

Is this a problem with the timeout being too short?

@brainrake

Still present in 0.7.11.
Workaround: extend resource "aws_iam_instance_profile" with

  provisioner "local-exec" {
    command = "sleep 60" # wait for instance profile to appear :(
  }

30 seconds wasn't long enough for me.

@vasiliyb

vasiliyb commented Nov 16, 2016

@brainrake What is the correct syntax? Does this look right?

resource "aws_iam_instance_profile" "test" {
    name = "test"
    roles = ["${aws_iam_role.test.id}"]
    provisioner "local-exec" {
        command = "sleep 60" # wait for instance profile to appear :(
    }
}

@brainrake

👍

@catsby
Contributor

catsby commented Feb 6, 2017

This should be addressed in the next release with #11678, where we use AWS waiters to ensure the IAM Profile exists before moving on.

arcadiatea pushed a commit to ticketmaster/terraform that referenced this issue Feb 9, 2017
@s-nakka

s-nakka commented Aug 1, 2017

I still see this on Terraform v0.9.9

@dictvm

dictvm commented Aug 11, 2017

Same here, though I'm on v0.10.0.

@realflash
Contributor

realflash commented Sep 9, 2017

Probably a duplicate of hashicorp/terraform-provider-aws#838

@ghost

ghost commented Apr 7, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 7, 2020