
aws_eks: kubectl function shouldn't require the vpc when endpointAccess is PUBLIC_AND_PRIVATE #27055

Open
diranged opened this issue Sep 7, 2023 · 5 comments
Labels
@aws-cdk/aws-ec2 Related to Amazon Elastic Compute Cloud @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. p2

Comments

diranged commented Sep 7, 2023

Describe the bug

See AWS Case ID #13749604991 for more details.

We've been working on building out new EKS clusters based on CDK rather than native CloudFormation. As part of that, we're iterating with the integration test runner. We were struggling to understand why the DELETE process for the tests was taking 30+ minutes, when we discovered that the KubectlHandler function is being created inside the VPC. This was surprising to us, as we specifically are not requiring that (we are not setting the placeClusterHandlerInVpc flag).

It turns out that the Cluster class makes the decision to place the KubectlHandler into the VPC at https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-eks/lib/cluster.ts#L1594:

    if (this.endpointAccess._config.privateAccess && privateSubnets.length !== 0) {

      // when private access is enabled and the vpc has private subnets, lets connect
      // the provider to the vpc so that it will work even when restricting public access.

      // validate VPC properties according to: https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html
      if (this.vpc instanceof ec2.Vpc && !(this.vpc.dnsHostnamesEnabled && this.vpc.dnsSupportEnabled)) {
        throw new Error('Private endpoint access requires the VPC to have DNS support and DNS hostnames enabled. Use `enableDnsHostnames: true` and `enableDnsSupport: true` when creating the VPC.');
      }

      this.kubectlPrivateSubnets = privateSubnets;

      // the vpc must exist in order to properly delete the cluster (since we run `kubectl delete`).
      // this ensures that.
      this._clusterResource.node.addDependency(this.vpc);
    }
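Distilled into a standalone predicate (hypothetical names, not the actual CDK internals), the placement decision above looks like this; note that placeClusterHandlerInVpc never enters into it:

```typescript
// Minimal sketch of the kubectl-handler placement decision, using
// hypothetical input names; the real logic lives in aws-eks/lib/cluster.ts.
interface PlacementInputs {
  privateAccess: boolean;            // true for PRIVATE and PUBLIC_AND_PRIVATE
  privateSubnetCount: number;        // private subnets selected from the VPC
  placeClusterHandlerInVpc: boolean; // only affects the *Cluster* handler
}

// Returns true when the kubectl handler gets attached to the VPC.
function kubectlHandlerInVpc(inputs: PlacementInputs): boolean {
  // placeClusterHandlerInVpc is deliberately ignored here -- that is
  // the surprising behavior this issue reports.
  return inputs.privateAccess && inputs.privateSubnetCount !== 0;
}
```

With PUBLIC_AND_PRIVATE access and any private subnets present, the handler always ends up in the VPC, regardless of the placeClusterHandlerInVpc setting.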

Yes, we want our clusters to be PUBLIC and PRIVATE so that API calls from within the VPC can reach the API directly without leaving the network. On the other hand, we couldn't care less about the Lambda function being inside the VPC. According to our case rep, any time a DELETE call is made on a Lambda function with attached ENIs, there is a 20-minute delay between the DELETE starting and finishing. This delay adds a huge amount of wasted time to our integration tests and slows everything down for little value.

Expected Behavior

I would expect that if the cluster is PUBLIC_AND_PRIVATE and I set placeClusterHandlerInVpc: false, the Lambda functions would indeed NOT be placed in the VPC.

Current Behavior

They are placed in the VPC, regardless of our placeClusterHandlerInVpc setting.

Reproduction Steps

Create a cluster with endpointAccess set to PUBLIC_AND_PRIVATE, then try deleting it. Watch it take 20 minutes to delete.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.94.0

Framework Version

No response

Node.js Version

18

OS

linux

Language

Typescript

Language Version

No response

Other information

No response

@diranged diranged added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 7, 2023
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Sep 7, 2023

diranged commented Sep 7, 2023

We developed a hacky fix here using the escape hatches ...

    // assumes: import { CfnFunction } from 'aws-cdk-lib/aws-lambda';
    // reach into the KubectlProvider and strip the VPC config from its handler
    const kubectlProvider = Stack.of(this).node.findChild('@aws-cdk/aws-eks.KubectlProvider');
    const handler = kubectlProvider.node.findChild('Handler');
    const resource = handler.node.findChild('Resource') as CfnFunction;
    resource.vpcConfig = undefined;

Re-running the integration test then shows the dependency changes:

    $ npx integ-runner --no-clean --force integ/constructs/aws_eks/integ.cluster.ts

Verifying integration test snapshots...

  CHANGED    integ/constructs/aws_eks/integ.cluster 4.488s
      Resources
[~] AWS::IAM::Role TestKubectlHandlerRole7073B199 
 └─ [-] DependsOn
     └─ ["TestClusterCIDRBlockA2FCD90D","TestClusterCIDRBlockSubnet0CD56F47A","TestClusterCIDRBlockSubnet1216B045C","TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8","TestClusterCIDRBlockSubnetRouteTableAssociation15F930483"]
[~] AWS::IAM::Policy TestKubectlHandlerRoleDefaultPolicy40956F86 
 └─ [-] DependsOn
     └─ ["TestClusterCIDRBlockA2FCD90D","TestClusterCIDRBlockSubnet0CD56F47A","TestClusterCIDRBlockSubnet1216B045C","TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8","TestClusterCIDRBlockSubnetRouteTableAssociation15F930483"]
[~] AWS::IAM::Role TestRole17AB2208 
 └─ [-] DependsOn
     └─ ["TestClusterCIDRBlockA2FCD90D","TestClusterCIDRBlockSubnet0CD56F47A","TestClusterCIDRBlockSubnet1216B045C","TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8","TestClusterCIDRBlockSubnetRouteTableAssociation15F930483"]
[~] AWS::EC2::SecurityGroup TestControlPlaneSecurityGroup05BB1761 
 └─ [-] DependsOn
     └─ ["TestClusterCIDRBlockA2FCD90D","TestClusterCIDRBlockSubnet0CD56F47A","TestClusterCIDRBlockSubnet1216B045C","TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8","TestClusterCIDRBlockSubnetRouteTableAssociation15F930483"]
[~] AWS::IAM::Role TestCreationRoleE90BFD6D 
 └─ [~] DependsOn
     └─ @@ -1,9 +1,4 @@
        [ ] [
        [-]   "TestClusterCIDRBlockA2FCD90D",
        [-]   "TestClusterCIDRBlockSubnet0CD56F47A",
        [-]   "TestClusterCIDRBlockSubnet1216B045C",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation15F930483",
        [ ]   "UserSuppliedVpcIGWCA64A826",
        [ ]   "UserSuppliedVpcPrivateSubnet1DefaultRoute966544D4",
        [ ]   "UserSuppliedVpcPrivateSubnet1RouteTableC32F4B57",
[~] AWS::IAM::Policy TestCreationRoleDefaultPolicy592DF1D6 
 └─ [~] DependsOn
     └─ @@ -1,9 +1,4 @@
        [ ] [
        [-]   "TestClusterCIDRBlockA2FCD90D",
        [-]   "TestClusterCIDRBlockSubnet0CD56F47A",
        [-]   "TestClusterCIDRBlockSubnet1216B045C",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation15F930483",
        [ ]   "UserSuppliedVpcIGWCA64A826",
        [ ]   "UserSuppliedVpcPrivateSubnet1DefaultRoute966544D4",
        [ ]   "UserSuppliedVpcPrivateSubnet1RouteTableC32F4B57",
[~] Custom::AWSCDK-EKS-Cluster Test3ACBFD1D 
 └─ [~] DependsOn
     └─ @@ -1,9 +1,4 @@
        [ ] [
        [-]   "TestClusterCIDRBlockA2FCD90D",
        [-]   "TestClusterCIDRBlockSubnet0CD56F47A",
        [-]   "TestClusterCIDRBlockSubnet1216B045C",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation15F930483",
        [ ]   "TestCreationRoleDefaultPolicy592DF1D6",
        [ ]   "TestCreationRoleE90BFD6D",
        [ ]   "UserSuppliedVpcIGWCA64A826",
[~] AWS::SSM::Parameter TestKubectlReadyBarrier082A53D9 
 └─ [~] DependsOn
     └─ @@ -1,9 +1,4 @@
        [ ] [
        [-]   "TestClusterCIDRBlockA2FCD90D",
        [-]   "TestClusterCIDRBlockSubnet0CD56F47A",
        [-]   "TestClusterCIDRBlockSubnet1216B045C",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation15F930483",
        [ ]   "TestCreationRoleDefaultPolicy592DF1D6",
        [ ]   "TestCreationRoleE90BFD6D",
        [ ]   "Test3ACBFD1D"
[~] Custom::AWSCDK-EKS-KubernetesResource TestAwsAuthmanifestE302BF21 
 └─ [~] DependsOn
     └─ @@ -1,8 +1,3 @@
        [ ] [
        [-]   "TestClusterCIDRBlockA2FCD90D",
        [-]   "TestClusterCIDRBlockSubnet0CD56F47A",
        [-]   "TestClusterCIDRBlockSubnet1216B045C",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation0A86BFFB8",
        [-]   "TestClusterCIDRBlockSubnetRouteTableAssociation15F930483",
        [ ]   "TestKubectlReadyBarrier082A53D9"
        [ ] ]


  CHANGED    integ/constructs/aws_eks/integ.cluster 4.488s
      Resources
[~] AWS::Lambda::Function Handler886CB40B 
 └─ [-] VpcConfig
     └─ {"SecurityGroupIds":[{"Ref":"referencetoINFRACDKK8SClusterTest9112B301ClusterSecurityGroupId"}],"SubnetIds":[{"Ref":"referencetoINFRACDKK8SClusterTestUserSuppliedVpcPrivateSubnet1SubnetF8460491Ref"},{"Ref":"referencetoINFRACDKK8SClusterTestUserSuppliedVpcPrivateSubnet2SubnetDDB7C6A0Ref"}]}


diranged commented Sep 8, 2023

Looks like there's another resource in the provider framework that also needs its vpcConfig removed to fix the issue:

    // onEvent handler of the custom-resource provider framework
    const provider = kubectlProvider.node.findChild('Provider');
    const framework = provider.node.findChild('framework-onEvent');
    const onEventResource = framework.node.findChild('Resource') as CfnFunction;
    onEventResource.vpcConfig = undefined;

@indrora indrora added p2 @aws-cdk/aws-ec2 Related to Amazon Elastic Compute Cloud and removed needs-triage This issue or PR still needs to be triaged. labels Sep 11, 2023

indrora commented Sep 11, 2023

I can't comment on why the handler is placed inside the private side of the VPC, except to assume it is done for security reasons.

Please also understand that this team cannot see AWS cases.

While this workaround isn't great, I'm glad it works for you.

@diranged
Author

@indrora,
I'm glad I was able to come up with a workaround, but I still believe the fundamental behavior is wrong. It should follow the behavior of the placeClusterHandlerInVpc setting.

@xazhao
Contributor

xazhao commented Oct 24, 2024

Hi @diranged,

Thanks for raising the issue. There are 2 lambda functions involved here:

  • Cluster Handler - which uses EKS API to create the EKS Cluster
  • Kubectl Handler - which is to issue kubectl commands against the cluster

The placeClusterHandlerInVpc setting applies to the Cluster Handler, so it doesn't affect the Kubectl Handler. Regarding the Kubectl Handler's VPC setup, here is the description: https://github.com/aws/aws-cdk/tree/main/packages/aws-cdk-lib/aws-eks#kubectl-handler

Breaking this down, it means that if the endpoint exposes private access 
(via EndpointAccess.PRIVATE or EndpointAccess.PUBLIC_AND_PRIVATE), 
and the VPC contains private subnets, the Lambda function will be provisioned inside the VPC 
and use the private subnets to interact with the cluster. This is the common use-case.

We're about to rewrite the current EKS module and will definitely consider your feedback when making decisions.

Thanks!
