Add support for AWS Managed NodeGroups #280

Merged (3 commits, Dec 4, 2019)

Conversation

@metral (Contributor) commented Nov 26, 2019

Proposed changes

This PR includes a new example/test: examples/managed-nodegroups.

Support Details:

Feature parity, compared to self-managed node groups that use CloudFormation Stacks and ASGs (a usage sketch follows this list):

  • Security Groups
    • Amazon EKS clusters beginning with Kubernetes version 1.14 and platform version eks.3 create a cluster security group as part of cluster creation (or when a cluster is upgraded to this Kubernetes version and platform version). This security group is designed to allow all traffic to flow freely between the control plane and the managed node groups.
    • Amazon EKS managed node groups are automatically configured to use the cluster security group.
    • Users cannot provide their own security group or ingress rules for use with the cluster.
    • Users can only provide their own security group for remote access configuration (node SSH access).
  • SSH Key Pair
    • Only a key pair name can be provided.
  • Cluster Attributes
    • Only k8s labels and AWS tags on the managed node groups are supported.
    • kubelet-extra-args is not supported.
      • Consequently, taints (normally registered via kubelet-extra-args) are not supported either.
  • Node Updates
    • Node updates and terminations gracefully drain nodes to ensure that your applications stay available.

  • Instance Type
    • Terraform takes instance_types (plural) as a set, but EKS accepts only a single string value.
      • The Pulumi support is mapped to a pulumi.Output<string>, which is in line with EKS support.
      • The variable in Pulumi is instanceTypes (plural, per the TF provider), even though its value is singular.
  • AMI ID
    • Instances in a managed node group use the latest version of the Amazon EKS-optimized Amazon Linux 2 AMI for the cluster's Kubernetes version. You can choose between standard and GPU variants of the Amazon EKS-optimized Amazon Linux 2 AMI.
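
To make the above concrete, here is a minimal sketch of standing up a cluster with a managed node group via the API this PR adds. Resource names and option values are illustrative assumptions, not the canonical example; see examples/managed-nodegroups in this PR for the real thing.

```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";

// IAM role assumed by the managed node group's worker instances.
const role = new aws.iam.Role("managed-ng-role", {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({ Service: "ec2.amazonaws.com" }),
});

// Standard EKS worker policies required by managed node groups.
[
    "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
].forEach((policyArn, i) => {
    new aws.iam.RolePolicyAttachment(`managed-ng-policy-${i}`, { role, policyArn });
});

// Cluster without the default self-managed node group. The managed node
// group below is automatically wired to the cluster security group.
const cluster = new eks.Cluster("my-cluster", {
    skipDefaultNodeGroup: true,
    instanceRoles: [role],
});

// Per the support details: only labels, tags, scaling config, and an SSH
// key pair *name* are configurable; custom security groups, ingress rules,
// kubelet-extra-args, and taints are not.
const managedNodeGroup = eks.createManagedNodeGroup("my-managed-ng", {
    cluster: cluster,
    nodeRoleArn: role.arn,
    labels: { ondemand: "true" },
    tags: { org: "pulumi" },
    scalingConfig: { desiredSize: 2, minSize: 1, maxSize: 3 },
});
```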


Related issues (optional)

Closes #278

@metral metral force-pushed the metral/managed-ng branch 2 times, most recently from 72a742b to 272178a on November 26, 2019 23:10
@metral metral force-pushed the metral/managed-ng branch 4 times, most recently from 248fad9 to 761a64e on November 26, 2019 23:19
@metral metral changed the title Add support for AWS Managed NodeGroups Add support for AWS Managed NodeGroups with examples/tests Nov 27, 2019
@metral metral changed the title Add support for AWS Managed NodeGroups with examples/tests Add support for AWS Managed NodeGroups Nov 27, 2019
@lukehoban (Contributor) left a comment

Awesome to see this! A few questions on little details. The one larger question - which I think I know the answer to but would love to hear your thoughts - do we need createManagedNodeGroup instead of just allowing folks to use aws.eks.NodeGroup directly? What benefits do users get from having the API available here? And are there enough benefits to non-managed NodeGroups that you expect we’ll still want them here long term?

[4 review threads on nodejs/eks/nodegroup.ts, resolved]
@metral (Contributor, Author) commented Nov 27, 2019

The one larger question - which I think I know the answer to but would love to hear your thoughts - do we need createManagedNodeGroup instead of just allowing folks to use aws.eks.NodeGroup directly? What benefits do users get from having the API available here?

Great point, I'll refactor.

And are there enough benefits to non-managed NodeGroups that you expect we’ll still want them here long term?

I believe we should keep self-managed NodeGroups in maintenance mode for now, and reassess later. Keeping them around:

  • Gives existing clusters a migration path to AWS managed node groups if necessary, and
  • Remains the best-suited option when a cluster needs maximum configurability.

As the Support Details in the OP note, some users may want specificity not yet supported in AWS managed node groups, e.g. kubelet-extra-args, taints, custom security groups, and AMI version pinning (see the sketch below).
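
As a contrast to the managed path, here is a hedged sketch of a self-managed node group using the knobs managed node groups don't yet expose. It assumes a `cluster` created as in the earlier sketch with its instance role attached; option names follow the existing @pulumi/eks NodeGroup API.

```typescript
// Self-managed node groups keep the extra configurability: taints,
// kubelet-extra-args, instance-level settings, and custom AMIs.
const customNodeGroup = cluster.createNodeGroup("custom-ng", {
    instanceType: "t2.medium",
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 3,
    labels: { ondemand: "true" },
    taints: {
        special: { value: "true", effect: "NoSchedule" },
    },
    kubeletExtraArgs: "--max-pods=110",
});
```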

@metral metral force-pushed the metral/managed-ng branch 5 times, most recently from 17e343f to 236346d on November 27, 2019 08:52
@stack72 (Contributor) left a comment

I haven’t looked at backwards compatibility, but I love the simplicity of the API for creating the managedNodeGroup - nice work here!

@metral (Contributor, Author) commented Nov 27, 2019

Instance Type

  • TF allows instance_types (plural) to be a set, but EKS appears to accept only a single string value.
  • The Pulumi support is mapped to a pulumi.Output<string>, which is in line with EKS support.
  • The variable in Pulumi is instanceTypes (plural, per the TF provider), even though its value is singular.

^ Thoughts on these support details from the OP, @stack72?

@metral metral marked this pull request as ready for review November 28, 2019 03:20
@metral metral force-pushed the metral/managed-ng branch 3 times, most recently from e815fe7 to c8112c8 on November 28, 2019 03:30
@metral (Contributor, Author) commented Nov 28, 2019

@joeduffy @lukehoban @stack72

CI is currently broken, but I know where the fix is: in the k8s test framework. I have to update it to account for managed node groups in addition to CloudFormation node groups.

PTAL.

@stack72 (Contributor) commented Nov 28, 2019

Instance Type

  • TF allows instance_types (plural) to be a set, but EKS appears to accept only a single string value.
  • The Pulumi support is mapped to a pulumi.Output<string>, which is in line with EKS support.
  • The variable in Pulumi is instanceTypes (plural, per the TF provider), even though its value is singular.

^ Thoughts on these support details from the OP, @stack72?

@metral So the AWS CLI actually uses InstanceTypes (plural) as well - I think this could be for an upcoming feature, so we should keep that name.

Commit: Set defaults where possible:
  - clusterName
  - scalingConfig
  - subnets
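
If those defaults land as the commit suggests, a managed node group call could shrink to something like this (a hedged sketch; clusterName, scalingConfig, and subnets would be inferred from the cluster):

```typescript
// Minimal call relying on the defaults named in the commit above.
const simpleNodeGroup = eks.createManagedNodeGroup("simple-ng", {
    cluster: cluster,
    nodeRoleArn: role.arn,
});
```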
@metral (Contributor, Author) commented Nov 29, 2019

CI is ✅

PTAL @lukehoban @joeduffy

[12 review threads on nodejs/eks/cluster.ts, nodejs/eks/nodegroup.ts, and nodejs/eks/examples/managed-nodegroups/iam.ts, resolved]
@metral (Contributor, Author) commented Dec 3, 2019

I've addressed feedback where possible, but have not yet identified a path forward for how to manage the aws-auth configmap between self-managed and managed node groups.

To summarize what we're working with:

  • With self-managed node groups, we will always need to create the configmap, as AWS does not create it.
  • With managed node groups, in my experience, only the first node group to connect to the cluster gets its role mapped into the aws-auth configmap, which AWS creates automatically. This is not documented anywhere that I've found on TF or AWS.
  • Any subsequent managed node groups created after the first one is up will not be automatically added to the configmap. It appears AWS does not handle this scenario.
    • The subsequent managed node groups register with the cluster, but then become NotReady shortly after, and the console shows access denied for the nodes attempting to join the cluster.
  • The TF provider and the current AWS API do not seem to take user and role mappings for the configmap into account. The only reason we know the aws-auth configmap gets auto-created and populated with the first managed node group's IAM role mapping is that creating the eksNodeAccess configmap errors with an "existing resource" error when we create it while also using managed node groups.
  • Given that we don't have a kubectl apply or a get & patch approach (Allow a model similar to kubectl apply -f pulumi-kubernetes#264) for an existing configmap, and that we encounter errors removing eksNodeAccess for existing node group clusters, we must conditionally create or apply the aws-auth configmap based on whether the node groups are self-managed or managed. This doesn't even cover the case where the user has both self-managed and managed node groups in the same cluster.

The only "clean" way I see of creating the configmap is to get & patch (or apply) the user's args for the configmap against any data already in the existing configmap, or to create the configmap as usual if it does not exist. But this relies on pulumi/pulumi-kubernetes#264 or some temporary workaround, and I'm not sure how to proceed.

Thoughts @lukehoban @lblackstone?

@lukehoban (Contributor) commented:

I believe we should be able to keep the original model of creating the aws-auth ConfigMap ourselves for both self-managed and managed NodeGroups. We just need to make the aws.eks.NodeGroup depend on the ConfigMap, so that we always create the ConfigMap before AWS creates it itself.
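
A hedged sketch of that ordering fix, using the raw aws.eks.NodeGroup against a Pulumi-managed ConfigMap. The names (eksNodeAccess, nodeRole) mirror the discussion above but are illustrative:

```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";
import * as k8s from "@pulumi/kubernetes";

// Assumed to exist earlier in the program (see the prior sketches).
declare const cluster: eks.Cluster;
declare const nodeRole: aws.iam.Role;

// Pulumi creates the aws-auth ConfigMap up front, mapping the node role.
const eksNodeAccess = new k8s.core.v1.ConfigMap("eksNodeAccess", {
    metadata: { name: "aws-auth", namespace: "kube-system" },
    data: {
        mapRoles: nodeRole.arn.apply(arn =>
            `- rolearn: ${arn}\n` +
            `  username: system:node:{{EC2PrivateDNSName}}\n` +
            `  groups:\n` +
            `    - system:bootstrappers\n` +
            `    - system:nodes\n`),
    },
}, { provider: cluster.provider });

// The node group depends on the ConfigMap, so Pulumi always creates
// aws-auth before AWS would otherwise auto-create it for the first
// managed node group that joins the cluster.
const managedNodeGroup = new aws.eks.NodeGroup("managed-ng", {
    clusterName: cluster.eksCluster.name,
    nodeRoleArn: nodeRole.arn,
    subnetIds: cluster.core.subnetIds,
    scalingConfig: { desiredSize: 2, minSize: 1, maxSize: 3 },
}, { dependsOn: [eksNodeAccess] });
```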

@metral (Contributor, Author) commented Dec 3, 2019

Feedback has been addressed, and CI is currently running with changes that make the node group depend on the original eksNodeAccess configmap managed by Pulumi.

This is working as expected on the managed-nodegroups example/test, as well as on an existing basic cluster and an existing nodegroup cluster.

@lukehoban @lblackstone PTAL

@lukehoban (Contributor) left a comment

LGTM 🎉
