Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Replace the use of toset() with static keys for node IAM role policy attachment #2962

Merged

Conversation

bryantbiggs
Copy link
Member

@bryantbiggs bryantbiggs commented Mar 10, 2024

Description

  • Use static keys for IAM role policy attachment

Motivation and Context

The use of toset() in for_each loops is known to be problematic. In hindsight, this should have been corrected during v20.0 but it must have been an oversight on my part. The use of toset() on the static policy attachment (custom/additional policy attachment was added awhile back with static keys) has worked without issue except for scenarios where users try to use an explicit depends_on (which is not recommended across modules for a number of reasons).

Why is this an issue now? We have an issue with our user data where the cluster service CIDR is not being passed properly, causing nodes to have the wrong internal IP. Oddly enough, we haven't seen user reports for this prior to #2955, and I first noticed it on #2960 (comment) (see screen shot - some internal IPs are IPv6, others are IPv4 - they all should be IPv6). While drafting a PR to resolve the service CIDR issue, the dreaded Error: Invalid for_each argument appeared for reasons I can't quite understand (yet):

Error: Invalid for_each argument
│
│   on ../../modules/eks-managed-node-group/main.tf line 506, in resource "aws_iam_role_policy_attachment" "this":
│  506:   for_each = var.create && var.create_iam_role ? toset(compact([
│  507:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  508:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  509:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  510:   ])) : object()
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.iam_role_policy_prefix is "arn:aws:iam::aws:policy"
│     │ var.create is true
│     │ var.create_iam_role is true
│     │ var.iam_role_attach_cni_policy is true
│
│ The "for_each" set includes values derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys that will identify the instances
│ of this resource.

Breaking Changes

From an API perspective, no. However, there will be a very brief (< 1s-2s) period where permission(s) will be removed and re-attached to the IAM role used by nodes. This has been minimized with the use of Terraform state move blocks, but due to limitations of the current move functionality, they are unable to support moving from multiple sources to a single target (i.e. - the source will vary based on the partition - aws, aws-cn, aws-us-gov) so only the commercial aws region (majority) is supported. In addition, move blocks do not support variables/dynamic values (which would also resolve the prior issue), which means that the IPv6 CNI policy attachment, which is not a managed policy and therefore user created and its ARN will contain the account ID, cannot be moved either. However, the majority use case for the commercial aws partition, using the managed IPv4 CNI policy, is supported without disruption

The potential disruption while unfortunate, is quite minimal and could be equated to a small blip in network connectivity. Therefore, I am proposing this change be made under the current v20 major version so that we can continue on to resolve the service CIDR issue and avoid introducing further, larger disruptions that would be captured under v21 at a later date.

How Has This Been Tested?

  • I have updated at least one of the examples/* to demonstrate and validate my change(s)
  • I have tested and validated these changes using one or more of the provided examples/* projects
  • I have executed pre-commit run -a on my pull request

@bryantbiggs bryantbiggs changed the title fix: Use static keys for node IAM role policy attachment feat: Use static keys for node IAM role policy attachment; remove the use of toset() Mar 10, 2024
@bryantbiggs bryantbiggs changed the title feat: Use static keys for node IAM role policy attachment; remove the use of toset() feat: Replace the use of toset() with static keys for node IAM role policy attachment Mar 10, 2024
@bryantbiggs bryantbiggs marked this pull request as ready for review March 10, 2024 13:26
@bryantbiggs bryantbiggs merged commit 57f5130 into terraform-aws-modules:master Mar 10, 2024
18 checks passed
@bryantbiggs bryantbiggs deleted the fix/policy-attachment branch March 10, 2024 16:49
antonbabenko pushed a commit that referenced this pull request Mar 10, 2024
## [20.8.0](v20.7.0...v20.8.0) (2024-03-10)

### Features

* Replace the use of `toset()` with static keys for node IAM role policy attachment ([#2962](#2962)) ([57f5130](57f5130))
@antonbabenko
Copy link
Member

This PR is included in version 20.8.0 🎉

@mmerickel
Copy link

Given the comment that this hasn't been reported prior to #2955 and #2960 I'll just note that this is the exact fix requested in #2681 and #2747. Thank you for releasing it.

@bryantbiggs
Copy link
Member Author

My comments wasn't around not having seen this type of error reported before on the attach policies, it was that we didn't have a reported issue with a valid configuration. Meaning, the explicit depends_on was not a valid configuration. But regardless, it was something that should have been corrected in v20 so hopefully this puts these type of issues to bed 🤞🏽

Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants