
[EKS] [request]: Rolling update to change instance type for Managed Nodes #746

Closed
badaldavda opened this issue Feb 11, 2020 · 9 comments
Labels: EKS Managed Nodes · EKS (Amazon Elastic Kubernetes Service) · Proposed (Community submitted issue)

Comments

@badaldavda

Tell us about your request
What do you want us to build?
Currently, when we edit a nodegroup via update-nodegroup-config, we can only update the scaling config:
https://docs.aws.amazon.com/cli/latest/reference/eks/update-nodegroup-config.html

Can we also change the instance type for managed nodes and have a rolling update?
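
For reference, a minimal sketch of what the API allows today (cluster and node group names below are hypothetical):

```sh
# Today only the scaling configuration of an existing managed node group can be
# changed in place; the instance type is fixed at creation time.
aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --scaling-config minSize=2,maxSize=10,desiredSize=4
```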

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, to update a nodegroup we need to create a new nodegroup and then delete the older nodegroup. But in some cases, modifying the same nodegroup to a different instance type is needed, in the same way we update the nodegroup version to the latest release version.

Are you currently working around this issue?
Currently, to update a nodegroup we need to create a new nodegroup and then delete the older nodegroup.

Additional context
NA

Attachments
NA

@badaldavda added the Proposed (Community submitted issue) label Feb 11, 2020
@cdenneen

@badaldavda I believe the proposed method for any sort of nodegroup update is a new nodegroup and then a deletion of the old one.

@mikestef9 added the EKS (Amazon Elastic Kubernetes Service) label Apr 9, 2020
@mikestef9 (Contributor)

This will be possible with #585

@mikestef9 added the EKS Managed Nodes label Jun 11, 2020
@8398a7 commented Jul 4, 2020

> @badaldavda I believe the proposed method for any sort of nodegroup update is a new nodegroup and then a deletion of the old one.

In this case, the user needs to take the following steps:

  1. Taint the old node group
  2. Move the Pods to the new node group
  3. When the move is complete, delete the old node group

When I delete a node group today, all of its nodes appear to start the termination process at the same time. Since I don't want them terminated simultaneously, I have to move the Pods myself in step 2, one node at a time (see the sketch below). This is a lot of work, and I'd also like to see managed node groups support rolling updates of instance types.
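
A minimal sketch of doing step 2 by hand, assuming a cluster named my-cluster and an old node group named old-ng (both hypothetical); the label used is the one EKS applies to managed nodes:

```sh
# Drain the old node group one node at a time so Pods reschedule gradually.
for node in $(kubectl get nodes -l eks.amazonaws.com/nodegroup=old-ng \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl cordon "$node"   # stop new Pods from landing on this node
  # --delete-emptydir-data requires kubectl >= 1.20 (--delete-local-data before that)
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done

# Once all workloads are running on the new node group, delete the old one.
aws eks delete-nodegroup --cluster-name my-cluster --nodegroup-name old-ng
```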

@mikestef9 (Contributor)

While this will be possible with #585, one thing to note is that if you switch to a smaller instance type, there is a chance you could disrupt running workloads if there are not sufficient resources available on the instances after the update. We will document this behavior, but it is something to keep in mind if you choose to leverage this functionality.

@mikestef9 (Contributor)

Closing as this feature request is addressed by launch template support. See #585 for details!

@shibataka000

@mikestef9 I don't think this issue is completely resolved in the case of spot managed node groups.

As you said, we can perform a rolling update to change the instance type of a managed node group when we pass the instance type through the launch template. But in the case of spot managed node groups, passing an instance type through the launch template is NOT recommended, as far as I understand.

https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html#managed-node-group-capacity-types says

> When deploying your node group with the Spot capacity type that's using a custom launch template, use the API to pass multiple instance types instead of passing a single instance type through the launch template. For more information about deploying a node group using a launch template, see Launch template support.

We still cannot perform a rolling update to change the instance types of managed node groups that have multiple instance types.
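
For reference, the recommended shape for spot looks roughly like this (all names hypothetical): the instance types go through the API rather than the launch template, and once set this way they cannot be changed without replacing the node group:

```sh
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name spot-ng \
  --capacity-type SPOT \
  --instance-types m5.large m5a.large m5n.large \
  --launch-template name=my-template,version=1 \
  --node-role arn:aws:iam::123456789012:role/eksNodeRole \
  --subnets subnet-aaaa1111 subnet-bbbb2222
```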

Could you reopen this issue? Or should I create another issue?

@psyhomb commented Jun 20, 2022

> When I delete a node group today, all of its nodes appear to start the termination process at the same time. Since I don't want them terminated simultaneously, I have to move the Pods myself in step 2, one node at a time. This is a lot of work, and I'd also like to see managed node groups support rolling updates of instance types.

This is still an issue and not possible in 2022. A rolling update within the same node group only works when updating the AMI with Maximum unavailable set to 1 in the update config. If we change the instance type instead, which triggers creation of a new node group, we still hit the situation where all nodes running in the old node group are drained and terminated at the same time. That is definitely not acceptable in a production environment because it can cause downtime.
One possible solution would be to somehow apply the update config between node groups as well: if Maximum unavailable is set to 1, drain and terminate the nodes of the old node group one at a time (see the sketch below for today's in-place form of that setting).
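
For context, this is how that setting is applied in place today, and it only takes effect for version/AMI updates within the same node group (names hypothetical):

```sh
# Limit in-place rolling updates to replacing one node at a time.
aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --update-config maxUnavailable=1
```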

@arunsisodiya

Do we have any resolution on this?
Still, in 2023, we are facing this issue. In a production environment it is really difficult to change the instance type, because the process does not offer zero downtime.

Do any of you have a workaround for EKS-managed node groups to handle this situation?

@wxGold commented Apr 1, 2023

> Do we have any resolution on this? Still, in 2023, we are facing this issue. In a production environment it is really difficult to change the instance type, because the process does not offer zero downtime.
>
> Do any of you have a workaround for EKS-managed node groups to handle this situation?

Hey @arunsisodiya,
Have you managed to figure this out? I'm facing this issue as well when using the node group module with instance_types set on the node group resource rather than on a launch_template. For example, changing
instance_types = ["m6a.4xlarge", "m5a.4xlarge"]
to
instance_types = ["m6a.4xlarge"]

Terraform will perform the following actions:

  # module.node_group_tools.aws_eks_node_group.this[0] must be replaced
+/- resource "aws_eks_node_group" "this" {
      ~ ami_type               = "CUSTOM" -> (known after apply)
      ~ arn                    = "arn:aws:eks:eu-west-2:218111588114:nodegroup/example/tools-Kpb_zA/x6x398f7-x1x5-x25x-067x-xx6xxx927dxx" -> (known after apply)
      ~ capacity_type          = "ON_DEMAND" -> (known after apply)
      ~ disk_size              = 0 -> (known after apply)
      ~ id                     = "example:tools-Kpb_zA" -> (known after apply)
      ~ instance_types         = [ # forces replacement
            # (1 unchanged element hidden)
            "m5a.4xlarge",
          - "m5.4xlarge",
        ]
      + node_group_name_prefix = (known after apply)
      ~ release_version        = "ami-020622bc6d23e2c90" -> (known after apply)
      ~ resources              = [
          - {
              - autoscaling_groups              = [
                  - {
                      - name = "eks-tools-Kpb_zA-x6x398f7-x1x5-x25x-067x-xx6xxx927dxx"
                    },
                ]
              - remote_access_security_group_id = ""
            },
        ] -> (known after apply)
      ~ status                 = "ACTIVE" -> (known after apply)
        tags                   = {
            "Client"                    = "example"
            "Environment"               = "dynamic"
            "Name"                      = "tools"
            "Owner"                     = "terraform"
            "kubernetes.io/cluster/eks" = "owned"
            "workload_type"             = "tools"
        }
      ~ version                = "1.22" -> (known after apply)
        # (6 unchanged attributes hidden)

      ~ launch_template {
            id      = "lt-0fb3efe295efbd2e6"
          ~ name    = "example-tools-lt-Kpb_zA" -> (known after apply)
            # (1 unchanged attribute hidden)
        }

      - update_config {
          - max_unavailable            = 1 -> null
          - max_unavailable_percentage = 0 -> null
        }

        # (2 unchanged blocks hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.node_group_tools.aws_eks_node_group.this[0]: Creating...
╷
│ Error: error creating EKS Node Group (example:tools-Kpb_zA): ResourceInUseException: NodeGroup already exists with name tools-Kpb_zA and cluster name example
│ {
│   RespMetadata: {
│     StatusCode: 409,
│     RequestID: "9a567eab-0983-424a-aefb-1d13dbd3b857"
│   },
│   ClusterName: "example",
│   Message_: "NodeGroup already exists with name tools-Kpb_zA and cluster name example",
│   NodegroupName: "tools-Kpb_zA"
│ }
│ 
│   with module.node_group_tools.aws_eks_node_group.this[0],
│   on .terraform/modules/node_group_tools/main.tf line 322, in resource "aws_eks_node_group" "this":
│  322: resource "aws_eks_node_group" "this" {

Why should the node group be recreated at all if the first instance type is still present? And even if it must be recreated, why is it not created with a new name (I added a random suffix), as happens when I make other changes? The lifecycle is configured with create_before_destroy.

albgus added a commit to albgus/terraform-aws-eks that referenced this issue Jul 6, 2023
This allows setting the instance type on the launch template rather than on the EKS managed node group. This is useful because it is not possible to change the instance types set on a managed node group after creation. The suggested way to work around this is to simply define the instance type in the launch template.

aws/containers-roadmap#746 (comment)
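
In CLI terms, that workaround amounts to the following sketch (all names hypothetical): keep the instance type only in the launch template, then roll the node group onto a new template version, which EKS performs as a rolling update honoring the update config:

```sh
# 1. Publish a new launch template version with the desired instance type.
aws ec2 create-launch-template-version \
  --launch-template-name my-eks-lt \
  --launch-template-data '{"InstanceType":"m6a.4xlarge"}'

# 2. Move the managed node group to that version; nodes are replaced gradually.
aws eks update-nodegroup-version \
  --cluster-name my-cluster \
  --nodegroup-name tools \
  --launch-template name=my-eks-lt,version=2
```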