aws_cloudformation_stack may need retry logic when utilizing iam_role_arn attribute #22834

Closed
thatderek opened this issue Jan 29, 2022 · 2 comments · Fixed by #22840
Labels
  • eventual-consistency: Pertains to eventual consistency issues.
  • service/cloudformation: Issues and PRs that pertain to the cloudformation service.
  • service/iam: Issues and PRs that pertain to the iam service.

Comments

@thatderek
Contributor

thatderek commented Jan 29, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v1.1.4
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v3.74.0

Affected Resource(s)

  • aws_cloudformation_stack

Terraform Configuration Files

data "aws_iam_policy_document" "cloudformation_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["cloudformation.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "cloudformation" {
  assume_role_policy = data.aws_iam_policy_document.cloudformation_assume.json
}

resource "aws_iam_policy" "cloudformation" {
  name_prefix = "cfn"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["*"]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "cloudformation" {
  policy_arn = aws_iam_policy.cloudformation.arn
  role       = aws_iam_role.cloudformation.name
}

data "aws_iam_role" "cloudformation" {
  # This data source makes aws_cloudformation_stack depend on the policy attachment
  # rather than on the role directly. Without it, the policy may be detached on destroy
  # before CloudFormation has had a chance to delete its resources.
  name = aws_iam_role_policy_attachment.cloudformation.role
}

resource "aws_cloudformation_stack" "sns" {
  name = "eks-${random_string.main.result}"

  on_failure   = "DO_NOTHING"
  iam_role_arn = data.aws_iam_role.cloudformation.arn

  template_body = <<JSON
{
  "Resources": {
    "Topic": {
      "Type": "AWS::SNS::Topic"
    }
  }
}
JSON
}

Debug Output

Note: providing the AWS CloudTrail event in lieu of debug output, as it seems more relevant.

{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "...",
        "arn": "...",
        "accountId": "...",
        "accessKeyId": "...",
        "userName": "..."
    },
    "eventTime": "2022-01-28T22:51:45Z",
    "eventSource": "cloudformation.amazonaws.com",
    "eventName": "CreateStack",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "...",
    "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.1.4 (+https://www.terraform.io) terraform-provider-aws/3.74.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.42.38 (go1.16; darwin; arm64)",
    "errorCode": "ValidationException",
    "errorMessage": "Role arn:aws:iam::...:role/terraform-2022012822513824810000000a is invalid or cannot be assumed",
    "requestParameters": null,
    "responseElements": null,
    "requestID": "...",
    "eventID": "...",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "...",
    "eventCategory": "Management",
    "tlsDetails": {
        "tlsVersion": "TLSv1.2",
        "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
        "clientProvidedHostHeader": "cloudformation.us-east-1.amazonaws.com"
    }
}

Expected Behavior

The aws_cloudformation_stack should have been created, with CloudFormation assuming the role given in iam_role_arn and provisioning with it.

Actual Behavior

Note: the behavior is intermittent and seems to depend on time of day, region, AWS load, etc. The problem is on the AWS side, not the Terraform side, though I think the provider could handle it with retry logic.

I assume the IAM role takes a second to propagate on the AWS side and isn't usable yet when CreateStack is called. Looking at the tests for the aws_cloudformation_stack resource, I noticed that none of them create or use an IAM role. I assume that if one did, retry logic would need to be added to stack.go for the case where CreateStack fails with a ValidationException saying the role "is invalid or cannot be assumed".

Steps to Reproduce

  1. terraform apply
  2. Repeat in us-east-1 until it fails. It will, eventually.

Important Factoids

Intermittent, sadly, but trust me, it happens.

References

Here is a post from AWS explaining that IAM changes can be slightly delayed because IAM is a distributed system.

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/cloudformation Issues and PRs that pertain to the cloudformation service. service/iam Issues and PRs that pertain to the iam service. labels Jan 29, 2022
@ewbankkit ewbankkit added eventual-consistency Pertains to eventual consistency issues. and removed needs-triage Waiting for first response or review from a maintainer. labels Jan 31, 2022
@github-actions github-actions bot added this to the v3.75.0 milestone Jan 31, 2022
@ewbankkit ewbankkit modified the milestones: v3.75.0, v4.0.0 Feb 1, 2022
@github-actions

This functionality has been released in v4.0.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 16, 2022