Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Knob to disable/enable leaked eni cleanup #1641

Closed
wants to merge 6 commits into from

Conversation

jayanthvn
Copy link
Contributor

@jayanthvn jayanthvn commented Sep 28, 2021

What type of PR is this?
Enhancement

Which issue does this PR fix:
N/A

What does this PR do / Why do we need it:
This adds 2 additional tags for ENIs allocated by aws-node -

    1. kubernetes.io/cluster/<cluster-name>: owned
    2. eks:eni:owner: amazon-vpc-cni

Also added a knob to disable leaked ENI collection.

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
n/a

Testing done on this change:

Yes

UTs

PASS
coverage: 100.0% of statements
ok  	github.com/aws/amazon-vpc-cni-k8s/pkg/utils/retry	0.003s	coverage: 100.0% of statements
?   	github.com/aws/amazon-vpc-cni-k8s/pkg/utils/ttime	[no test files]
?   	github.com/aws/amazon-vpc-cni-k8s/pkg/utils/ttime/mocks	[no test files]
?   	github.com/aws/amazon-vpc-cni-k8s/pkg/version	[no test files]

Automation added to e2e:

Will add as a follow up PR

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No

Does this change require updates to the CNI daemonset config files to work?:

No

Does this PR introduce any user-facing change?:

2 additional tags will be created on ENIs allocated by aws-node


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@@ -411,7 +416,7 @@ func New(useCustomNetworking, disableENIProvisioning, v4Enabled, v6Enabled bool)
}

// Clean up leaked ENIs in the background
if !disableENIProvisioning {
if !disableENIProvisioning && !disableLeakedENICollection {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it..Why don't we disable this for v6 completely since v6 support in PD mode will not be creating any additional ENIs? We only have/need Primary ENI in v6 PD mode..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can. Only catch is if they moved from v4 to v6 cluster and there are leaked ENIs. Maybe should be fine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, v6 cluster will have to be a brand new cluster and more importantly the proposed IPv6 IAM policy for an IPv6 cluster will not have permissions to delete an ENI, so it is not going to be of any use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say in future if we support custom networking with v6 then we might still need this logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we might but we can probably remove the v6 check at that point along with v6 specific checks we placed around custom nw code. Problem I see with this right now is the ipamd logs might be filled with 403s(Unauthorized) every hr when the aws-node pod attempts to delete a leaked ENI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense to disable it for now until we have actual support for custom networking

@srini-ram srini-ram requested a review from achevuru September 28, 2021 15:44
@jayanthvn jayanthvn added this to the v1.11.0 milestone Oct 20, 2021
@jayanthvn jayanthvn requested a review from a team as a code owner January 11, 2022 16:30
@jayanthvn jayanthvn modified the milestones: v1.11.0, v1.11.1 Mar 7, 2022
@jayanthvn jayanthvn requested a review from srini-ram March 28, 2022 21:31
@@ -54,6 +54,11 @@ const (
eniClusterTagKey = "cluster.k8s.amazonaws.com/name"
additionalEniTagsEnvVar = "ADDITIONAL_ENI_TAGS"
reservedTagKeyPrefix = "k8s.amazonaws.com"
clusterNameTagKeyFormat = "kubernetes.io/cluster/%s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already add a cluster name tag -

tags[eniClusterTagKey] = cache.clusterName
. We use this for our Cluster specific IAM policy right now. MAO will also populate this env variable by default going forward. Do we still need another tag for cluster name?

@github-actions
Copy link

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Jul 20, 2022
@jayanthvn
Copy link
Contributor Author

/not stale

@github-actions github-actions bot removed the stale Issue or PR is stale label Jul 21, 2022
@vikasmb vikasmb removed this from the v1.11.3 milestone Aug 4, 2022
@github-actions
Copy link

github-actions bot commented Oct 4, 2022

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Oct 4, 2022
@jayanthvn
Copy link
Contributor Author

/not stale

@github-actions github-actions bot removed the stale Issue or PR is stale label Oct 13, 2022
@github-actions
Copy link

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Dec 13, 2022
@jdn5126 jdn5126 removed the stale Issue or PR is stale label Dec 14, 2022
@github-actions
Copy link

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Feb 13, 2023
@jayanthvn
Copy link
Contributor Author

/not stale. This will be needed once we move cleanup logic to VPC RC.

@github-actions github-actions bot removed the stale Issue or PR is stale label Feb 19, 2023
@github-actions
Copy link

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale Issue or PR is stale label Apr 21, 2023
@jayanthvn
Copy link
Contributor Author

/not stale

@github-actions github-actions bot removed the stale Issue or PR is stale label Apr 26, 2023
@jayanthvn
Copy link
Contributor Author

@M00nF1sh - As part of this PR, I will remove the tagging and have just the disable logic. Once we have the final design we can add the tagging..

@jayanthvn jayanthvn added this to the v1.12.7 milestone Apr 29, 2023
@jdn5126
Copy link
Contributor

jdn5126 commented May 3, 2023

Discarding in favor of #2370

@jdn5126 jdn5126 closed this May 3, 2023
@jayanthvn jayanthvn modified the milestones: v1.12.7, v1.13.0 May 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants