Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Knob to disable/enable leaked eni cleanup #1641

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -522,17 +522,23 @@ This plugin interacts with the following tags on ENIs:
* `cluster.k8s.amazonaws.com/name`
* `node.k8s.amazonaws.com/instance_id`
* `node.k8s.amazonaws.com/no_manage`
* `kubernetes.io/cluster/<cluster-name>`
* `eks:eni:owner`

#### Cluster Name tag

The tag `cluster.k8s.amazonaws.com/name` will be set to the cluster name of the
The tag `cluster.k8s.amazonaws.com/name` and `kubernetes.io/cluster/<cluster-name>` will be set to the cluster name of the
aws-node daemonset which created the ENI.

#### Instance ID tag

The tag `node.k8s.amazonaws.com/instance_id` will be set to the instance ID of
the aws-node instance that allocated this ENI.

#### ENI owner tag

The tag `eks:eni:owner` will be set to `amazon-vpc-cni` since aws-node created this ENI.

#### No Manage tag

The tag `node.k8s.amazonaws.com/no_manage` is read by the aws-node daemonset to
Expand Down
14 changes: 11 additions & 3 deletions pkg/awsutils/awsutils.go
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,11 @@ const (
eniClusterTagKey = "cluster.k8s.amazonaws.com/name"
additionalEniTagsEnvVar = "ADDITIONAL_ENI_TAGS"
reservedTagKeyPrefix = "k8s.amazonaws.com"
clusterNameTagKeyFormat = "kubernetes.io/cluster/%s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already add a cluster name tag -

tags[eniClusterTagKey] = cache.clusterName
. We use this for our Cluster specific IAM policy right now. MAO will also populate this env variable by default going forward. Do we still need another tag for cluster name?

clusterNameTagValue = "owned"

networkInterfaceOwnerTagKey = "eks:eni:owner"
networkInterfaceOwnerTagValue = "amazon-vpc-cni"
// UnknownInstanceType indicates that the instance type is not yet supported
UnknownInstanceType = "vpc ip resource(eni ip limit): unknown instance type"

Expand Down Expand Up @@ -371,7 +376,7 @@ func (i instrumentedIMDS) GetMetadataWithContext(ctx context.Context, p string)
}

// New creates an EC2InstanceMetadataCache
func New(useCustomNetworking, disableENIProvisioning, v4Enabled, v6Enabled bool) (*EC2InstanceMetadataCache, error) {
func New(useCustomNetworking, disableENIProvisioning, v4Enabled, v6Enabled, disableLeakedENICollection bool) (*EC2InstanceMetadataCache, error) {
//ctx is passed to initWithEC2Metadata func to cancel spawned go-routines when tests are run
ctx := context.Background()

Expand Down Expand Up @@ -411,7 +416,7 @@ func New(useCustomNetworking, disableENIProvisioning, v4Enabled, v6Enabled bool)
}

// Clean up leaked ENIs in the background
if !disableENIProvisioning {
if !disableENIProvisioning && !disableLeakedENICollection {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it..Why don't we disable this for v6 completely since v6 support in PD mode will not be creating any additional ENIs? We only have/need Primary ENI in v6 PD mode..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can. Only catch is if they moved from v4 to v6 cluster and there are leaked ENIs. Maybe should be fine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, v6 cluster will have to be a brand new cluster and more importantly the proposed IPv6 IAM policy for an IPv6 cluster will not have permissions to delete an ENI, so it is not going to be of any use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say in future if we support custom networking with v6 then we might still need this logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we might but we can probably remove the v6 check at that point along with v6 specific checks we placed around custom nw code. Problem I see with this right now is the ipamd logs might be filled with 403s(Unauthorized) every hr when the aws-node pod attempts to delete a leaked ENI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense to disable it for now until we have actual support for custom networking

go wait.Forever(cache.cleanUpLeakedENIs, time.Hour)
}

Expand Down Expand Up @@ -828,13 +833,16 @@ func (cache *EC2InstanceMetadataCache) createENI(useCustomCfg bool, sg []*string
// buildENITags computes the desired AWS Tags for eni
func (cache *EC2InstanceMetadataCache) buildENITags() map[string]string {
tags := map[string]string{
eniNodeTagKey: cache.instanceID,
eniNodeTagKey: cache.instanceID,
networkInterfaceOwnerTagKey: networkInterfaceOwnerTagValue,
}

// If clusterName is provided,
// tag the ENI with "cluster.k8s.amazonaws.com/name=<cluster_name>"
// and "kubernetes.io/cluster/<cluster-name>: owned"
if cache.clusterName != "" {
tags[eniClusterTagKey] = cache.clusterName
tags[fmt.Sprintf(clusterNameTagKeyFormat, cache.clusterName)] = clusterNameTagValue
}
for key, value := range cache.additionalENITags {
tags[key] = value
Expand Down
38 changes: 34 additions & 4 deletions pkg/awsutils/awsutils_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -914,6 +914,7 @@ func TestEC2InstanceMetadataCache_buildENITags(t *testing.T) {
},
want: map[string]string{
"node.k8s.amazonaws.com/instance_id": "i-xxxxx",
"eks:eni:owner": "amazon-vpc-cni",
},
},
{
Expand All @@ -923,8 +924,10 @@ func TestEC2InstanceMetadataCache_buildENITags(t *testing.T) {
clusterName: "awesome-cluster",
},
want: map[string]string{
"node.k8s.amazonaws.com/instance_id": "i-xxxxx",
"cluster.k8s.amazonaws.com/name": "awesome-cluster",
"node.k8s.amazonaws.com/instance_id": "i-xxxxx",
"cluster.k8s.amazonaws.com/name": "awesome-cluster",
"kubernetes.io/cluster/awesome-cluster": "owned",
"eks:eni:owner": "amazon-vpc-cni",
},
},
{
Expand All @@ -940,6 +943,7 @@ func TestEC2InstanceMetadataCache_buildENITags(t *testing.T) {
"node.k8s.amazonaws.com/instance_id": "i-xxxxx",
"tagKey-1": "tagVal-1",
"tagKey-2": "tagVal-2",
"eks:eni:owner": "amazon-vpc-cni",
},
},
}
Expand Down Expand Up @@ -1430,6 +1434,14 @@ func TestEC2InstanceMetadataCache_TagENI(t *testing.T) {
Key: aws.String("cluster.k8s.amazonaws.com/name"),
Value: aws.String("awesome-cluster"),
},
{
Key: aws.String("eks:eni:owner"),
Value: aws.String("amazon-vpc-cni"),
},
{
Key: aws.String("kubernetes.io/cluster/awesome-cluster"),
Value: aws.String("owned"),
},
{
Key: aws.String("node.k8s.amazonaws.com/instance_id"),
Value: aws.String("i-xxxx"),
Expand All @@ -1455,8 +1467,10 @@ func TestEC2InstanceMetadataCache_TagENI(t *testing.T) {
args: args{
eniID: "eni-xxxx",
currentTags: map[string]string{
"node.k8s.amazonaws.com/instance_id": "i-xxxx",
"cluster.k8s.amazonaws.com/name": "awesome-cluster",
"node.k8s.amazonaws.com/instance_id": "i-xxxx",
"cluster.k8s.amazonaws.com/name": "awesome-cluster",
"kubernetes.io/cluster/awesome-cluster": "owned",
"eks:eni:owner": "amazon-vpc-cni",
},
},
wantErr: nil,
Expand All @@ -1475,6 +1489,14 @@ func TestEC2InstanceMetadataCache_TagENI(t *testing.T) {
Key: aws.String("cluster.k8s.amazonaws.com/name"),
Value: aws.String("awesome-cluster"),
},
{
Key: aws.String("eks:eni:owner"),
Value: aws.String("amazon-vpc-cni"),
},
{
Key: aws.String("kubernetes.io/cluster/awesome-cluster"),
Value: aws.String("owned"),
},
},
},
},
Expand Down Expand Up @@ -1503,6 +1525,14 @@ func TestEC2InstanceMetadataCache_TagENI(t *testing.T) {
Key: aws.String("cluster.k8s.amazonaws.com/name"),
Value: aws.String("awesome-cluster"),
},
{
Key: aws.String("eks:eni:owner"),
Value: aws.String("amazon-vpc-cni"),
},
{
Key: aws.String("kubernetes.io/cluster/awesome-cluster"),
Value: aws.String("owned"),
},
{
Key: aws.String("node.k8s.amazonaws.com/instance_id"),
Value: aws.String("i-xxxx"),
Expand Down
11 changes: 10 additions & 1 deletion pkg/ipamd/ipamd.go
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,9 @@ const (
envManageUntaggedENI = "MANAGE_UNTAGGED_ENI"

eniNodeTagKey = "node.k8s.amazonaws.com/instance_id"

//envDisableLeakedENICollection is used to disable leaked eni go routine in aws-node
envDisableLeakedENICollection = "DISABLE_LEAKED_ENI_COLLECTION"
)

var log = logger.Get()
Expand Down Expand Up @@ -248,6 +251,7 @@ type IPAMContext struct {
enablePrefixDelegation bool
lastInsufficientCidrError time.Time
enableManageUntaggedMode bool
disableLeakedENICollection bool
}

// setUnmanagedENIs will rebuild the set of ENI IDs for ENIs tagged as "no_manage"
Expand Down Expand Up @@ -359,8 +363,9 @@ func New(rawK8SClient client.Client, cachedK8SClient client.Client) (*IPAMContex
c.enableIPv6 = isIPv6Enabled()

c.disableENIProvisioning = disablingENIProvisioning()
c.disableLeakedENICollection = isLeakedENICollectionDisabled()

client, err := awsutils.New(c.useCustomNetworking, c.disableENIProvisioning, c.enableIPv4, c.enableIPv6)
client, err := awsutils.New(c.useCustomNetworking, c.disableENIProvisioning, c.enableIPv4, c.enableIPv6, c.disableLeakedENICollection)
if err != nil {
return nil, errors.Wrap(err, "ipamd: can not initialize with AWS SDK interface")
}
Expand Down Expand Up @@ -1687,6 +1692,10 @@ func enableManageUntaggedMode() bool {
return getEnvBoolWithDefault(envManageUntaggedENI, true)
}

func isLeakedENICollectionDisabled() bool {
return getEnvBoolWithDefault(envDisableENIProvisioning, false)
}

// filterUnmanagedENIs filters out ENIs marked with the "node.k8s.amazonaws.com/no_manage" tag
func (c *IPAMContext) filterUnmanagedENIs(enis []awsutils.ENIMetadata) []awsutils.ENIMetadata {
numFiltered := 0
Expand Down