[EKS] [bug]: EKS Auto-Mode NodePools NotReady with Custom Path Node IAM Role #2486

Open
jmsaturno opened this issue Dec 4, 2024 · 7 comments
Labels
EKS Auto Mode EKS Amazon Elastic Kubernetes Service Proposed Community submitted issue

@jmsaturno

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
This is to report a bug related to NodePool creation for custom path Node IAM roles.

Which service(s) is this request for?
EKS, more specifically EKS Auto-Mode related to NodePools

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
NodePools stay in NotReady status when the Node IAM role has a custom path. During NodePool creation, the backend SDK used by EKS Auto-Mode calls the "DescribeAccessEntry" API to verify that the Node IAM role is mapped to an Access Entry. The request includes the IAM role name but omits any custom path associated with the role, so for Node IAM roles with custom paths the API responds with AccessDenied. As a result, the NodePools never progress beyond NotReady and cannot be used.

The expected behavior should be for the backend SDK to account for both custom and non-custom path based node IAM roles.

The observed behavior is the following:

    "eventTime": "2024-12-04T21:49:03Z",
    "eventSource": "eks.amazonaws.com",
    "eventName": "DescribeAccessEntry",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "eks.amazonaws.com",
    "userAgent": "eks.amazonaws.com",
    "errorCode": "AccessDenied",
    "requestParameters": {
        "name": "CLUSTER_NAME_XXXX",
        "principalArn": "arn%3Aaws%3Aiam%3A%3A123456789%3Arole%2FAmazonEKSAutoNodeRole"
    }
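To make the mismatch easier to see, here is a minimal Python sketch that decodes the percent-encoded `principalArn` from the event above and compares it with the ARN a path-based role would have. The path `/custom/path/` and the helper names `decode_principal_arn` and `build_role_arn` are hypothetical and only for illustration; the ARN layout (`role` followed by the path, then the role name) is the standard IAM format.

```python
from urllib.parse import unquote


def decode_principal_arn(request_parameters: dict) -> str:
    """Return the percent-decoded principalArn from CloudTrail requestParameters."""
    return unquote(request_parameters["principalArn"])


def build_role_arn(account_id: str, role_name: str, path: str = "/") -> str:
    """Build an IAM role ARN; IAM paths begin and end with a slash."""
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("IAM path must start and end with '/'")
    return f"arn:aws:iam::{account_id}:role{path}{role_name}"


# requestParameters copied from the CloudTrail event in this issue.
request_parameters = {
    "name": "CLUSTER_NAME_XXXX",
    "principalArn": "arn%3Aaws%3Aiam%3A%3A123456789%3Arole%2FAmazonEKSAutoNodeRole",
}

# ARN that the backend looked up (no path component).
looked_up = decode_principal_arn(request_parameters)

# ARN the role would actually have, assuming a hypothetical path "/custom/path/".
actual = build_role_arn("123456789", "AmazonEKSAutoNodeRole", path="/custom/path/")

print("looked up:", looked_up)
print("actual:   ", actual)
print("match:    ", looked_up == actual)
```

If the two ARNs differ only by the path segment, the failed lookup is consistent with the behavior described in this report.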

Are you currently working around this issue?
Using IAM roles without custom paths works around this issue. However, this is not an option for organizations that require path-based IAM roles.

Additional context
EKS cluster version: 1.31

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@jmsaturno jmsaturno added the Proposed Community submitted issue label Dec 4, 2024
@mikestef9 mikestef9 added EKS Amazon Elastic Kubernetes Service EKS Auto Mode labels Dec 5, 2024
@aslatter

aslatter commented Dec 7, 2024

This was very frustrating to debug. The error in the NodeClass was "Role (my roles) is unauthorized to join nodes to the cluster", so I was looking in CloudTrail for failures related to the role.

Having access to logs for the managed controls would have been a great help here.

@jatinmehrotra

@aslatter

Role (my roles) is unauthorized to join nodes to the cluster

I am also facing the same issue. How did you solve this role error? I tried the following steps, but nothing helped.

@aslatter

aslatter commented Dec 8, 2024

I am also facing the same issue. How did you solve this role error? I tried the following steps, but nothing helped.

Does your EKS cluster role have a path? That's what's documented in this issue.

@jmsaturno
Author

Adding that the EKS cluster role has the same issue. If either the Node or Cluster IAM role is created with a path, NodePools fail to move out of NotReady status.

I reproduced this in my sandbox environment.
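For anyone checking whether their setup is affected, here is a rough Python sketch that reports which of a cluster's roles carry a custom path. It only parses ARN strings; the role names, the path `/custom/path/`, and the helper names `role_path` and `roles_with_custom_path` are hypothetical examples, not anything EKS provides.

```python
def role_path(role_arn: str) -> str:
    """Return the IAM path of a role ARN ("/" when the role has no custom path)."""
    # ARN layout: arn:partition:iam::account-id:role/<path/>name
    resource = role_arn.split(":", 5)[5]
    if not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn}")
    remainder = resource[len("role"):]  # e.g. "/custom/path/MyRole"
    return remainder[: remainder.rfind("/") + 1]


def roles_with_custom_path(roles: dict) -> list:
    """Return the labels of roles whose path is anything other than "/"."""
    return [label for label, arn in roles.items() if role_path(arn) != "/"]


# Hypothetical role ARNs; substitute the Node and Cluster role ARNs of your cluster.
roles = {
    "node": "arn:aws:iam::123456789:role/custom/path/AmazonEKSAutoNodeRole",
    "cluster": "arn:aws:iam::123456789:role/AmazonEKSAutoClusterRole",
}

affected = roles_with_custom_path(roles)
print("roles with a custom path:", affected)
```

Per the reports in this thread, any role listed there is a candidate cause for NodePools stuck in NotReady.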

@nicolas-fidel-wmx

In the NodeClass configuration, you must enter the role name and not the ARN, as noted in the documentation. Then it works fine.

@saurav-agarwalla

Thanks for reporting this. We're actively working on this.

Based on what we know so far, there are a bunch of different things in play here:

  1. The documentation incorrectly uses a role ARN in spec.role. It should be the role name; we only support passing in the role name today. This is slightly unrelated to the issue reported here, but I'm mentioning it since there's some confusion around it. We are getting the docs fixed.
  2. A few changes are needed to support a custom path in the IAM role. We're working on that and should have a fix/update soon.
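As an illustration of the first point, here is a small Python sketch that derives the role name from a role ARN, which is the value spec.role expects today. The helper `role_name_from_arn` and the example path are hypothetical. Note that this only strips the ARN prefix and path; it does not make path-based roles work, since that is the fix still in progress.

```python
def role_name_from_arn(role_arn_or_name: str) -> str:
    """Return the bare IAM role name, accepting either a role ARN or a name."""
    if not role_arn_or_name.startswith("arn:"):
        # Already a plain role name; nothing to strip.
        return role_arn_or_name
    # ARN layout: arn:partition:iam::account-id:role/<path/>name
    resource = role_arn_or_name.split(":", 5)[5]
    if not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn_or_name}")
    # The role name is the last segment; anything before it is the path.
    return resource.rsplit("/", 1)[1]


# Value to place in spec.role instead of the full ARN.
spec_role = role_name_from_arn("arn:aws:iam::123456789:role/AmazonEKSAutoNodeRole")
print(spec_role)
```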

@sftim

sftim commented Dec 13, 2024

@saurav-agarwalla is this viable to triage as accepted?

8 participants