# This is a combination of 11 commits.
# This is the 1st commit message:

Add VlanId in the cmdAdd Result struct
This VlanId will appear in the prevResult during cmdDel request

Test prevResult contents

CleanUp Pod Network using vlanId from prevResult in CNI itself
No need to call ipamd

Log formatting changes

Added hostNetworking Setup test for pods using security groups

revoke unnecessary test agent image changes

Revoke unnecessary changes

remove focussed test
set replica count to total number of branch interface

Fix replica count

# This is the commit message aws#2:

Updated cleanUpPodENI method

# This is the commit message aws#3:

Skip processing Delete request if prevResult is nil
Add Logging vlanId to ipamd

# This is the commit message aws#4:

Add support to test with containerd nodegroup in pod-eni test

# This is the commit message aws#5:

Add check for empty Netns() in cni

# This is the commit message aws#6:

Manifests and Readme updates (aws#1732)

* Manifests and Readme updates

* update manifest.jsonnet
# This is the commit message aws#7:

Readme updates (aws#1735)


# This is the commit message aws#8:

Updates to troubleshooting doc (aws#1737)

* Updates to troubleshooting doc

* updates to troubleshooting doc
# This is the commit message aws#9:

imdsv2 changes (aws#1743)


# This is the commit message aws#10:

fix flaky canary test (aws#1742)


# This is the commit message aws#11:

add CODEOWNERS (aws#1747)
Chinmay Gadgil committed Dec 9, 2021
1 parent 6a15a84 commit d6a1cee
Showing 27 changed files with 344 additions and 74 deletions.
1 change: 1 addition & 0 deletions CODEOWNERS
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
* @aws/eks-networking
79 changes: 72 additions & 7 deletions README.md
@@ -27,7 +27,11 @@ scheduling that exceeds the IP address resources available to the kubelet.

The default manifest expects `--cni-conf-dir=/etc/cni/net.d` and `--cni-bin-dir=/opt/cni/bin`.

L-IPAM requires following [IAM policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html):
## IAM Policy

L-IPAM requires one of the following [IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html) depending on the IP Family configured:

**IPv4 Mode:**

```
{
@@ -56,6 +60,31 @@ L-IPAM requires following [IAM policy](https://docs.aws.amazon.com/IAM/latest/Us
}
```

**IPv6 Mode:**

```
{
"Effect": "Allow",
"Action": [
"ec2:AssignIpv6Addresses",
"ec2:DescribeInstances",
"ec2:DescribeTags",
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeInstanceTypes"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": [
"arn:aws:ec2:*:*:network-interface/*"
]
}
```

Alternatively there is also a [Helm](https://helm.sh/) chart: [eks/aws-vpc-cni](https://github.com/aws/eks-charts/tree/master/stable/aws-vpc-cni)

## Building
@@ -474,14 +503,16 @@ Type: Boolean as a String

Default: `false`

To enable IPv4 prefix delegation on nitro instances. Setting `ENABLE_PREFIX_DELEGATION` to `true` will start allocating a /28 prefix
instead of a secondary IP in the ENIs subnet. The total number of prefixes and private IP addresses will be less than the
To enable prefix delegation on nitro instances. Setting `ENABLE_PREFIX_DELEGATION` to `true` will start allocating a prefix (/28 for IPv4
and /80 for IPv6) instead of a secondary IP in the ENIs subnet. The total number of prefixes and private IP addresses will be less than the
limit on private IPs allowed by your instance. Setting or resetting of `ENABLE_PREFIX_DELEGATION` while pods are running or if ENIs are attached is supported and the new pods allocated will get IPs based on the mode of IPAMD but the max pods of kubelet should be updated which would need either kubelet restart or node recycle.

Custom networking and Security group per pods are supported with this feature.

Setting ENABLE_PREFIX_DELEGATION to true will not increase the density of branch ENI pods. The limit on number of branch network interfaces per instance type will remain the same - https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html#supported-instance-types. Each branch network will be allocated a primary IP and this IP will be allocated for the branch ENI pods.

Please refer to [VPC CNI Feature Matrix](https://github.com/aws/amazon-vpc-cni-k8s#vpc-cni-feature-matrix) section below for additional information around using Prefix delegation with Custom Networking and Security Groups Per Pod features.

**Note:** `ENABLE_PREFIX_DELEGATION` needs to be set to `true` when VPC CNI is configured to operate in IPv6 mode (supported in v1.10.0+).
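
As a rough illustration of the prefix sizes mentioned above, the number of addresses in a delegated prefix is 2 raised to (address width minus prefix length). The helper name `addressesInPrefix` is hypothetical and not part of the plugin; this is only a worked arithmetic sketch.

```go
package main

import (
	"fmt"
	"math/big"
)

// addressesInPrefix returns how many addresses a prefix of length prefixLen
// contains in an address family that is addrBits wide (32 for IPv4, 128 for IPv6).
// Hypothetical helper for illustration only.
func addressesInPrefix(prefixLen, addrBits uint) *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), addrBits-prefixLen)
}

func main() {
	fmt.Println("IPv4 /28:", addressesInPrefix(28, 32))  // 2^4 addresses
	fmt.Println("IPv6 /80:", addressesInPrefix(80, 128)) // 2^48 addresses
}
```

So each /28 IPv4 prefix covers 16 addresses, while a single /80 IPv6 prefix covers 2^48.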

---

#### `WARM_PREFIX_TARGET` (v1.9.0+)
@@ -522,10 +553,10 @@ Type: Boolean as a String
Default: `false`

Setting `ANNOTATE_POD_IP` to `true` will allow IPAMD to add an annotation `vpc.amazonaws.com/pod-ips` to the pod with pod IP.

There is a known [issue](https://github.com/kubernetes/kubernetes/issues/39113) with kubelet taking time to update `Pod.Status.PodIP` leading to calico being blocked on programming the policy. Setting `ANNOTATE_POD_IP` to `true` will enable AWS VPC CNI plugin to add Pod IP as an annotation to the pod spec to address this race condition.

To annotate the pod with pod IP, you will have to add "patch" permission for pods resource in aws-node clusterrole. You can use the below command -
To annotate the pod with pod IP, you will have to add "patch" permission for pods resource in aws-node clusterrole. You can use the below command -

```
cat << EOF > append.yaml
@@ -543,6 +574,40 @@ kubectl apply -f <(cat <(kubectl get clusterrole aws-node -o yaml) append.yaml)
```
---

#### `ENABLE_IPv4` (v1.10.0+)

Type: Boolean as a String

Default: `true`

VPC CNI can operate in either IPv4 or IPv6 mode. Setting `ENABLE_IPv4` to `true` will configure it in IPv4 mode (default mode).

**Note:** Dual stack mode isn't yet supported. So, enabling both IPv4 and IPv6 will be treated as invalid configuration.

---

#### `ENABLE_IPv6` (v1.10.0+)

Type: Boolean as a String

Default: `false`

VPC CNI can operate in either IPv4 or IPv6 mode. Setting `ENABLE_IPv6` to `true` (both under `aws-node` and `aws-vpc-cni-init` containers in the manifest)
will configure it in IPv6 mode. IPv6 is only supported in Prefix Delegation mode, so `ENABLE_PREFIX_DELEGATION` needs to be set to `true` if VPC CNI is
configured to operate in IPv6 mode. Prefix delegation is only supported on nitro instances.


**Note:** Please make sure that the required IPv6 IAM policy is applied (Refer to [IAM Policy](https://github.com/aws/amazon-vpc-cni-k8s#iam-policy) section above). Dual stack mode isn't yet supported. So, enabling both IPv4 and IPv6 will be treated as invalid configuration. Please refer to the [VPC CNI Feature Matrix](https://github.com/aws/amazon-vpc-cni-k8s#vpc-cni-feature-matrix) section below for additional information.

---

### VPC CNI Feature Matrix

IP Mode | Secondary IP Mode | Prefix Delegation | Security Groups Per Pod | WARM & MIN IP/Prefix Targets | External SNAT
------ | ------ | ------ | ------ | ------ | ------
`IPv4` | Yes | Yes | Yes | Yes | Yes
`IPv6` | No | Yes | No | No | No
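
The constraints described in the `ENABLE_IPv4` and `ENABLE_IPv6` sections (no dual stack, IPv6 only with prefix delegation) could be checked along these lines. This is a minimal sketch, not the plugin's own validation: `boolSetting` and `validateIPFamily` are hypothetical names, and the defaults are the ones documented above.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// boolSetting reads a "Boolean as a String" value, falling back to def when unset.
func boolSetting(env map[string]string, key string, def bool) (bool, error) {
	raw, ok := env[key]
	if !ok {
		return def, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("%s: %w", key, err)
	}
	return v, nil
}

// validateIPFamily is a hypothetical check of the documented rules:
// dual stack is invalid, and IPv6 mode requires prefix delegation.
func validateIPFamily(env map[string]string) error {
	v4, err := boolSetting(env, "ENABLE_IPv4", true) // documented default: true
	if err != nil {
		return err
	}
	v6, err := boolSetting(env, "ENABLE_IPv6", false) // documented default: false
	if err != nil {
		return err
	}
	pd, err := boolSetting(env, "ENABLE_PREFIX_DELEGATION", false) // documented default: false
	if err != nil {
		return err
	}
	switch {
	case v4 && v6:
		return errors.New("dual stack is not supported: enable only one of ENABLE_IPv4 and ENABLE_IPv6")
	case v6 && !pd:
		return errors.New("IPv6 mode requires ENABLE_PREFIX_DELEGATION=true")
	}
	return nil
}

func main() {
	ipv6 := map[string]string{
		"ENABLE_IPv4":              "false",
		"ENABLE_IPv6":              "true",
		"ENABLE_PREFIX_DELEGATION": "true",
	}
	fmt.Println(validateIPFamily(ipv6)) // <nil>
}
```

Under this sketch, setting only `ENABLE_IPv6` to `true` while leaving `ENABLE_IPv4` at its default is rejected as dual stack, which follows the wording of the notes above; the behavior when both are `false` is not documented here and is left unchecked.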

### ENI tags related to Allocation

This plugin interacts with the following tags on ENIs:
4 changes: 2 additions & 2 deletions charts/aws-vpc-cni/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v1
name: aws-vpc-cni
version: 1.1.10
appVersion: "v1.9.3"
version: 1.1.11
appVersion: "v1.10.0"
description: A Helm chart for the AWS VPC CNI
icon: https://raw.githubusercontent.com/aws/eks-charts/master/docs/logo/aws.png
home: https://github.com/aws/amazon-vpc-cni-k8s
5 changes: 3 additions & 2 deletions charts/aws-vpc-cni/values.yaml
@@ -8,7 +8,7 @@ nameOverride: aws-node

init:
image:
tag: v1.9.3
tag: v1.10.0
region: us-west-2
account: "602401143452"
pullPolicy: Always
@@ -17,12 +17,13 @@ init:
# override: "repo/org/image:tag"
env:
DISABLE_TCP_EARLY_DEMUX: "false"
ENABLE_IPv6: "false"
securityContext:
privileged: true

image:
region: us-west-2
tag: v1.9.3
tag: v1.10.0
account: "602401143452"
domain: "amazonaws.com"
pullPolicy: Always
4 changes: 2 additions & 2 deletions charts/cni-metrics-helper/Chart.yaml
@@ -15,9 +15,9 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.4
version: 0.1.5

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
appVersion: v1.9.3
appVersion: v1.10.0
2 changes: 1 addition & 1 deletion charts/cni-metrics-helper/values.yaml
@@ -4,7 +4,7 @@ nameOverride: cni-metrics-helper

image:
region: us-west-2
tag: v1.9.3
tag: v1.10.0
account: "602401143452"
domain: "amazonaws.com"
# Set to use custom image
56 changes: 54 additions & 2 deletions cmd/routed-eni-cni-plugin/cni.go
@@ -22,6 +22,7 @@ import (
"net"
"os"
"runtime"
"strconv"
"strings"

"github.com/containernetworking/cni/pkg/skel"
@@ -43,6 +44,7 @@ import (
)

const ipamdAddress = "127.0.0.1:50051"
const vlanInterfaceName = "vlanId"

var version string

@@ -95,6 +97,12 @@ func LoadNetConf(bytes []byte) (*NetConf, logger.Logger, error) {
return nil, nil, errors.Wrap(err, "add cmd: error loading config from args")
}

if conf.RawPrevResult != nil {
if err := cniSpecVersion.ParsePrevResult(&conf.NetConf); err != nil {
return nil, nil, fmt.Errorf("could not parse prevResult: %v", err)
}
}

logConfig := logger.Configuration{
LogLevel: conf.PluginLogLevel,
LogLocation: conf.PluginLogFile,
@@ -122,6 +130,8 @@ func add(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
log.Infof("Received CNI add request: ContainerID(%s) Netns(%s) IfName(%s) Args(%s) Path(%s) argsStdinData(%s)",
args.ContainerID, args.Netns, args.IfName, args.Args, args.Path, args.StdinData)

log.Infof("Prev Result: %v\n", conf.PrevResult)

var k8sArgs K8sArgs
if err := cniTypes.LoadArgs(args.Args, &k8sArgs); err != nil {
log.Errorf("Failed to load k8s config from arg: %v", err)
@@ -194,14 +204,12 @@ func add(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
var hostVethName string
if r.PodVlanId != 0 {
hostVethName = generateHostVethName("vlan", string(k8sArgs.K8S_POD_NAMESPACE), string(k8sArgs.K8S_POD_NAME))

err = driverClient.SetupPodENINetwork(hostVethName, args.IfName, args.Netns, v4Addr, v6Addr, int(r.PodVlanId), r.PodENIMAC,
r.PodENISubnetGW, int(r.ParentIfIndex), mtu, log)
} else {
// build hostVethName
// Note: the maximum length for linux interface name is 15
hostVethName = generateHostVethName(conf.VethPrefix, string(k8sArgs.K8S_POD_NAMESPACE), string(k8sArgs.K8S_POD_NAME))

err = driverClient.SetupNS(hostVethName, args.IfName, args.Netns, v4Addr, v6Addr, int(r.DeviceNumber), r.VPCv4CIDRs, r.UseExternalSNAT, mtu, log)
}

@@ -241,12 +249,15 @@ func add(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap

hostInterface := &current.Interface{Name: hostVethName}
containerInterface := &current.Interface{Name: args.IfName, Sandbox: args.Netns}
vlanInterface := &current.Interface{Name: vlanInterfaceName, Mac: fmt.Sprint(r.PodVlanId)}
log.Infof("Using vlanInterface: %v", vlanInterface)

result := &current.Result{
IPs: ips,
Interfaces: []*current.Interface{
hostInterface,
containerInterface,
vlanInterface,
},
}

@@ -270,6 +281,8 @@ func del(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
driverClient driver.NetworkAPIs) error {

conf, log, err := LoadNetConf(args.StdinData)
if err != nil {
return errors.Wrap(err, "add cmd: error loading config from args")
}
// Log only after the error check: LoadNetConf returns a nil conf and log on failure.
log.Infof("Prev Result: %v\n", conf.PrevResult)

@@ -283,6 +296,35 @@ func del(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
return errors.Wrap(err, "del cmd: failed to load k8s config from args")
}

prevResult, ok := conf.PrevResult.(*current.Result)

if !ok || args.Netns == "" {
log.Info("prevResult is nil or Netns() is empty, skip processing this request")
return nil
}

for _, iface := range prevResult.Interfaces {
if iface.Name == vlanInterfaceName {
podVlanId, err := strconv.Atoi(iface.Mac)
if err != nil {
return errors.Wrap(err, "Failed to parse vlanId from prevResult")
}
// podVlanId == 0 means pod is not using branch ENI
// then fallback to existing cleanup
if podVlanId == 0 {
break
}
// if podVlanId != 0 means pod is using branch ENI
err = cleanUpPodENI(podVlanId, log, args.ContainerID, driverClient)
if err != nil {
return err
}
log.Infof("Received del network response for pod %s namespace %s sandbox %s with vlanId: %v", string(k8sArgs.K8S_POD_NAME),
string(k8sArgs.K8S_POD_NAMESPACE), string(k8sArgs.K8S_POD_INFRA_CONTAINER_ID), podVlanId)
return nil
}
}

// notify local IP address manager to free secondary IP
// Set up a connection to the server.
conn, err := grpcClient.Dial(ipamdAddress, grpc.WithInsecure())
@@ -362,6 +404,16 @@ func del(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
return nil
}

func cleanUpPodENI(podVlanId int, log logger.Logger, containerId string, driverClient driver.NetworkAPIs) error {
err := driverClient.TeardownPodENINetwork(podVlanId, log)
if err != nil {
log.Errorf("Failed on TeardownPodNetwork for container ID %s: %v",
containerId, err)
return errors.Wrap(err, "del cmd: failed on tear down pod network")
}
return nil
}

func main() {
log := logger.DefaultLogger()
about := fmt.Sprintf("AWS CNI %s", version)
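
The change above carries the VLAN ID from the add path to the delete path by appending an extra interface entry named `vlanId` to the CNI result and reading it back from `prevResult`. The round trip can be sketched in isolation as follows. The `iface` struct is a stand-in for the `Name` and `Mac` fields of the CNI `Interface` type, and `encodeVlan` / `decodeVlan` are hypothetical helper names that do not appear in the commit.

```go
package main

import (
	"fmt"
	"strconv"
)

const vlanInterfaceName = "vlanId"

// iface is a stand-in for the two fields of the CNI result's
// Interface type that the commit relies on (assumption for this sketch).
type iface struct {
	Name string
	Mac  string
}

// encodeVlan builds the extra entry the add path appends to the result.
// The VLAN ID is stored as a decimal string in the Mac field.
func encodeVlan(vlanID int) iface {
	return iface{Name: vlanInterfaceName, Mac: strconv.Itoa(vlanID)}
}

// decodeVlan scans the interfaces from prevResult for the vlanId entry.
// The bool is false when no such entry exists.
func decodeVlan(ifaces []iface) (int, bool, error) {
	for _, i := range ifaces {
		if i.Name != vlanInterfaceName {
			continue
		}
		id, err := strconv.Atoi(i.Mac)
		if err != nil {
			return 0, true, fmt.Errorf("parse vlanId %q: %w", i.Mac, err)
		}
		return id, true, nil
	}
	return 0, false, nil
}

func main() {
	prev := []iface{{Name: "hostVeth"}, {Name: "eth0"}, encodeVlan(7)}
	id, found, err := decodeVlan(prev)
	fmt.Println(id, found, err) // 7 true <nil>
}
```

As in the diff, a decoded value of 0 would mean the pod is not using a branch ENI and the existing ipamd-based cleanup applies, while a nonzero value lets the plugin tear down the pod ENI network without calling ipamd.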
16 changes: 9 additions & 7 deletions config/master/aws-k8s-cni-cn.yaml
@@ -9,7 +9,7 @@ metadata:
app.kubernetes.io/name: aws-node
app.kubernetes.io/instance: aws-vpc-cni
k8s-app: aws-node
app.kubernetes.io/version: "v1.9.3"
app.kubernetes.io/version: "v1.10.0"
---
# Source: aws-vpc-cni/templates/customresourcedefinition.yaml
apiVersion: apiextensions.k8s.io/v1
@@ -20,7 +20,7 @@ metadata:
app.kubernetes.io/name: aws-node
app.kubernetes.io/instance: aws-vpc-cni
k8s-app: aws-node
app.kubernetes.io/version: "v1.9.3"
app.kubernetes.io/version: "v1.10.0"
spec:
scope: Cluster
group: crd.k8s.amazonaws.com
@@ -47,7 +47,7 @@ metadata:
app.kubernetes.io/name: aws-node
app.kubernetes.io/instance: aws-vpc-cni
k8s-app: aws-node
app.kubernetes.io/version: "v1.9.3"
app.kubernetes.io/version: "v1.10.0"
rules:
- apiGroups:
- crd.k8s.amazonaws.com
@@ -80,7 +80,7 @@ metadata:
app.kubernetes.io/name: aws-node
app.kubernetes.io/instance: aws-vpc-cni
k8s-app: aws-node
app.kubernetes.io/version: "v1.9.3"
app.kubernetes.io/version: "v1.10.0"
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
@@ -100,7 +100,7 @@ metadata:
app.kubernetes.io/name: aws-node
app.kubernetes.io/instance: aws-vpc-cni
k8s-app: aws-node
app.kubernetes.io/version: "v1.9.3"
app.kubernetes.io/version: "v1.10.0"
spec:
updateStrategy:
rollingUpdate:
@@ -121,10 +121,12 @@ spec:
hostNetwork: true
initContainers:
- name: aws-vpc-cni-init
image: "961992271922.dkr.ecr.cn-northwest-1.amazonaws.com.cn/amazon-k8s-cni-init:v1.9.3"
image: "961992271922.dkr.ecr.cn-northwest-1.amazonaws.com.cn/amazon-k8s-cni-init:v1.10.0"
env:
- name: DISABLE_TCP_EARLY_DEMUX
value: "false"
- name: ENABLE_IPv6
value: "false"
securityContext:
privileged: true
volumeMounts:
@@ -137,7 +139,7 @@ spec:
{}
containers:
- name: aws-node
image: "961992271922.dkr.ecr.cn-northwest-1.amazonaws.com.cn/amazon-k8s-cni:v1.9.3"
image: "961992271922.dkr.ecr.cn-northwest-1.amazonaws.com.cn/amazon-k8s-cni:v1.10.0"
ports:
- containerPort: 61678
name: metrics