Add event recorder utils to raise aws-node pod events #1536

Merged Apr 13, 2022 (2 commits)

@sushrk (Contributor) commented Jul 20, 2021

What type of PR is this?
feature

Which issue does this PR fix:
#1520

What does this PR do / Why do we need it:
Add event recorder utils that raise aws-node pod events for faster triaging. This PR adds a missing-permissions pod event that is raised when an EC2 API call fails because the CNI policy is missing from the IAM role.

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
Repro steps: Detach AmazonEKS_CNI_Policy from the IAM role and restart the aws-node pods. The pods will not become ready because of the missing permissions.

Testing done on this change:

  1. Nodegroup creation when CNI policy is missing
  2. Restart aws-node pods when CNI policy missing

Events can be retrieved with "kubectl describe pod aws-node-xxx" & "kubectl get events":

2m29s       Warning   MissingIAMPermissions   pod/aws-node-lps4s   Unauthorized operation: failed to call ec2:DescribeNetworkInterfaces due to missing permissions. Please refer https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/iam-policy.md to attach relevant policy to IAM role

Ginkgo test run:

ipamd git:(addPodEvent) ✗ ginkgo -v -- \
 --cluster-kubeconfig=$KUBECONFIG \
 --cluster-name=$CLUSTER_NAME \
 --aws-region=$AWS_REGION \
 --aws-vpc-id=$VPC_ID \
 --ng-name-label-key=$NG_NAME_LABEL_KEY \
 --ng-name-label-val=$NG_NAME_LABEL_VAL
Running Suite: VPC IPAMD Test Suite - /Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd
=========================================================================================================================
Random Seed: 1649808955

Will run 1 of 26 specs
------------------------------
[BeforeSuite] 
/Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_suite_test.go:39
STEP: Delete coredns addon if it exists 04/12/22 17:16:22.748
------------------------------
[BeforeSuite] PASSED [4.469 seconds]
[BeforeSuite] 
/Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_suite_test.go:39

  Begin Captured GinkgoWriter Output >>
    STEP: Delete coredns addon if it exists 04/12/22 17:16:22.748
  << End Captured GinkgoWriter Output
------------------------------
SSSSSSSSSSS
------------------------------
test aws-node pod event when iam role is missing VPC_CNI policy
  unauthorized event must be raised on aws-node pod
  /Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_event_test.go:104
STEP: getting the iam role 04/12/22 17:16:23.151
STEP: getting the node instance role 04/12/22 17:16:23.689
STEP: detaching VPC_CNI policy and restart aws-node pods 04/12/22 17:16:24.07
STEP: checking aws-node pods not running 04/12/22 17:20:23.582
STEP: attaching VPC_CNI policy and restart aws-node pods 04/12/22 17:20:28.6
STEP: checking aws-node pods are running 04/12/22 17:25:04.428
------------------------------
• [SLOW TEST] [531.288 seconds]
test aws-node pod event
/Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_event_test.go:32
  when iam role is missing VPC_CNI policy
  /Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_event_test.go:35
    unauthorized event must be raised on aws-node pod
    /Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_event_test.go:104

  Begin Captured GinkgoWriter Output >>
    STEP: getting the iam role 04/12/22 17:16:23.151
    STEP: getting the node instance role 04/12/22 17:16:23.689
    STEP: detaching VPC_CNI policy and restart aws-node pods 04/12/22 17:16:24.07
    STEP: checking aws-node pods not running 04/12/22 17:20:23.582
    STEP: attaching VPC_CNI policy and restart aws-node pods 04/12/22 17:20:28.6
    STEP: checking aws-node pods are running 04/12/22 17:25:04.428
  << End Captured GinkgoWriter Output
------------------------------
SSSSSSSSSSSSSS
------------------------------
[AfterSuite] 
/Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_suite_test.go:61
------------------------------
[AfterSuite] PASSED [0.000 seconds]
[AfterSuite] 
/Users/ravsushm/go/src/github.com/aws/amazon-vpc-cni-k8s/test/integration-new/ipamd/ipamd_suite_test.go:61
------------------------------

Ran 1 of 26 Specs in 535.762 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 25 Skipped
PASS | FOCUSED
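
The check behind the "unauthorized event must be raised on aws-node pod" spec can be pictured roughly as below. This is a sketch under assumptions, not the test code from this PR: `podEvent` is a hypothetical flattened view of an event, and `hasEvent` is an illustrative helper that matches on the reason and the pod name prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// podEvent is a hypothetical, flattened view of a Kubernetes event,
// holding only the columns shown by "kubectl get events".
type podEvent struct {
	Type    string
	Reason  string
	Object  string
	Message string
}

// hasEvent reports whether any event with the given reason was raised on a
// pod whose name starts with podPrefix.
func hasEvent(events []podEvent, podPrefix, reason string) bool {
	for _, e := range events {
		if e.Reason == reason && strings.HasPrefix(e.Object, "pod/"+podPrefix) {
			return true
		}
	}
	return false
}

func main() {
	// Sample data for illustration only.
	events := []podEvent{
		{Type: "Warning", Reason: "MissingIAMPermissions", Object: "pod/aws-node-lps4s",
			Message: "Unauthorized operation: failed to call ec2:DescribeNetworkInterfaces due to missing permissions."},
	}
	fmt.Println(hasEvent(events, "aws-node", "MissingIAMPermissions"))
}
```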

Automation added to e2e:
None

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No. Upgrade tested.

Does this change require updates to the CNI daemonset config files to work?:
No.

Does this PR introduce any user-facing change?:
No.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sushrk sushrk marked this pull request as draft July 20, 2021 22:42
@sushrk sushrk marked this pull request as ready for review July 20, 2021 22:43
@sushrk (Contributor, Author) commented Jul 21, 2021

@jayanthvn @abhipth please review.

@jayanthvn (Contributor)

Can you also update the documentation? Maybe we can include it here: https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/troubleshooting.md

Ref : #389

@jayanthvn (Contributor)

@sushrk - Can you please fix the branch conflicts?

@sushrk sushrk requested a review from a team as a code owner January 11, 2022 01:40
@sushrk sushrk changed the title from "Add pod event when ec2 permissions are missing" to "[WIP]Add pod event when ec2 permissions are missing" Jan 11, 2022
@sushrk sushrk force-pushed the addPodEvent branch 3 times, most recently from c3281b2 to f23e16d Compare January 28, 2022 21:17
@sushrk sushrk changed the title from "[WIP]Add pod event when ec2 permissions are missing" to "Add pod event when ec2 permissions are missing" Jan 28, 2022
@achevuru (Contributor) commented Feb 11, 2022

I have a general comment. The PR only captures the scenario where NodeInit fails (due to missing permissions for the DescribeNetworkInterfaces call). However, the CNI relies on a number of other EC2/VPC APIs (Attach/Detach/Modify ENIs, Assign/Unassign IPs, etc.), and it would be very useful to log an event whenever we receive a 403 for any of these API calls. Our event message can specifically call out the API for which we ran into the Unauthorized error, along with a generic statement pointing to relevant resources that can guide the user. With the current PR, we only log an event if the CNI pod is missing permissions for the DescribeNetworkInterfaces call, but I believe the end goal is to alert the customer via events if they are missing permissions for any EC2/VPC API that the CNI relies on for its operation.

@jayanthvn (Contributor)

That's right, @achevuru. The end goal is to capture all important events, so making the function common will make it easy to extend.

@sushrk sushrk changed the title from "Add pod event when ec2 permissions are missing" to "Add event recorder utils to raise aws-node pod events" Feb 21, 2022
@achevuru (Contributor) commented Feb 28, 2022

Thanks for making it generic. However, I only see the event being logged for the DescribeNetworkInterfaces call. Now that we have generic infra in place, we should log Unauthorized errors for all of the EC2 calls VPC CNI relies on, i.e. Create/Delete/Attach/Detach/Modify ENIs and Assign/Unassign IPs.

@jayanthvn (Contributor)

> Thanks for making it generic. However, I only see the event being logged for the DescribeNetworkInterfaces call. Now that we have generic infra in place, we should log Unauthorized errors for all of the EC2 calls VPC CNI relies on, i.e. Create/Delete/Attach/Detach/Modify ENIs and Assign/Unassign IPs.

Agreed.

@jayanthvn (Contributor)

Can you please fix the conflicts?

@sftim (Contributor) commented Apr 2, 2022

@sushrk if you know how to do a rebase and squash, I recommend that you do. That would make these changes easier to review and to accept.

@jayanthvn (Contributor) left a comment

lgtm

@jayanthvn jayanthvn merged commit fd8bcf0 into aws:master Apr 13, 2022
@sushrk sushrk deleted the addPodEvent branch April 19, 2022 19:04
jayanthvn added a commit that referenced this pull request Jul 14, 2022
* 1.10.3 release artifacts (#1962)

* Stale PR and issue cleanup wrkflow (#1964)

* fix image name during build (#1968)

* add event recorder utils to raise aws-node pod events (#1536)

* refactor uploader scripts (#1972)

* Fix cni panic due to pod.Annotations is a nil map (#1974)

Co-authored-by: Relk Li <relk@maicoin.com>

* chart: Add extraVolumes and extraVolumeMounts (#1949)

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

* Add the new command in the section of CNI Plugin Sequence (#1813)

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

* Bump github.com/containernetworking/cni from 0.8.0 to 0.8.1 (#1966)

Bumps [github.com/containernetworking/cni](https://github.com/containernetworking/cni) from 0.8.0 to 0.8.1.
- [Release notes](https://github.com/containernetworking/cni/releases)
- [Commits](containernetworking/cni@v0.8.0...v0.8.1)

---
updated-dependencies:
- dependency-name: github.com/containernetworking/cni
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

* Update README to highlight containerd.sock edge case with EKS AMI. (#1884)

* Update README to highlight containerd.sock edge case with EKS AMI.

* Updated Instructions as per review.

* add cni release test script (#1971)

* Multus release manifest (#1984)

* release manifest for Multus v3.8.0-eksbuild.1

* minor change to Readme

* Added Tests for validating Multus Installation (#1811)

* Added Tests for validating Multus Installation

Added missing files

Refactored code
Tried to make it modular and extensible.

* Deleted redundant file

* Fixed compilation issues

* fixed minor error

* Added script to trigger Multus tests (will be used by prow job)

* remove multus installation logic from ginkgo

* remove redundant changes

* Cleaned up run-multus-tests helper script

* Updated Readme for running multus tests
Added few checks in canary helper script

* revert changes to canary.sh

* Pass tag as an argument

* Updated Readme

* Updated tag for multus tests to use latest image

* Port new integration tests (#1928)

* Minor changes to run-integration-tests
Added integration-new framework tests

* Modified run-integration-tests to use new integration tests

* reverted redundant changes

* Merge integration with integration-new

* increase timeout (#1985)

fix syntax for ginkgo-v2

* Added configurable flag to create test nodes with arm64 and containerd runtime (#1977)

* Cleanup binary file (#1987)

* log error in ipamd on api server timeout (#1988)

* Refactored code and Added cni addon upgrade/downgrade regression test (#1861)

* Refactored code
Addon upgrade/downgrade test similar to #1795

Added tests for addon upgrade/downgrade

Changed DEFAULT version
Added addon status checks

Fetch latest addon version for given K8s Cluster

Update kops cluster config used in weekly tests (#1862)

* Change to kops cluster creation scripts

* Add logging for retry attempt

* Switch kops cluster to use docker container runtime

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

Renamed package name for adddon tests

removed unnecessary changes
Fixed replica count for MTU and Veth test in host networking

Updated ENI/IP limits file for newly added instances (#1864)

* Added new instances

* Updated test readme

* needed rebase

* formatting

* remove all references to integration-new
migrate to ginkgo v2 in addon test files

* fix maxIPPerInterface count on pod_networking_suite

* Increase default deployment ready timeout

Co-authored-by: Vikas Basavaraj <5373156+vikasmb@users.noreply.github.com>

* Remove generation of calico manifests (#1905)

* cni manifest upgrade downgrade test (#1863)

* Added upgrade/downgrade script template

Refactored code
Addon upgrade/downgrade test similar to #1795

Added tests for addon upgrade/downgrade

Changed DEFAULT version
Added addon status checks

Fetch latest addon version for given K8s Cluster

Update kops cluster config used in weekly tests (#1862)

* Change to kops cluster creation scripts

* Add logging for retry attempt

* Switch kops cluster to use docker container runtime

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

Added upgrade/downgrade test for custom cni-manifest-file

Added missing files

remove upgrade-downgrade.sh

* Add eks.go file , deleted by mistake

* Extract apply manifest logic in common
Remove redundant code

* Add PD traffic test for cni upgrade downgrade test

* Update golang to Go 1.18 (#1991)

* Update CNI Plugins to v1.1.1 (#1997)

* Update release manifests for VPC CNI v1.11.2 (#2001) (#2002)

* Enable Calico on ARM64 and add configureable flags for Calico installation (#2004)

* Enable Calico on ARM64 and add configureable flags for Calico
installation

* Add v to Calico version in release test script

* fix integration test script (#1998)

* Updated dependencies (#2012)

* Fix readme (#2013)

* Added upgrade/downgrade script template

Refactored code
Addon upgrade/downgrade test similar to #1795

Added tests for addon upgrade/downgrade

Changed DEFAULT version
Added addon status checks

Fetch latest addon version for given K8s Cluster

Update kops cluster config used in weekly tests (#1862)

* Change to kops cluster creation scripts

* Add logging for retry attempt

* Switch kops cluster to use docker container runtime

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

Added upgrade/downgrade test for custom cni-manifest-file

Added missing files

remove upgrade-downgrade.sh

* Add eks.go file , deleted by mistake

* Extract apply manifest logic in common
Remove redundant code

* Add PD traffic test for cni upgrade downgrade test

* Updated Readme

* Merge fix-ginkgo to master (#2014)

* fix path failure

* seperate makefile for test

Co-authored-by: abhipth <abhipth@amazon.com>

* Multus manifest for release v3.9.0-eksbuild.1 (#2016)

* Updating new instances - p4de (#2018)

* Updating new instances

* fix formatting

* Fix go build failure with v6 networking suite. (#2020)

* Update README.md (#2021)

* Fix Go build for ipamd test package. (#2023)

* Fix Go build for ipamd test package.

* Fix format with make format

* Fix go build for cni test package. (#2024)

* Prevent allocate/free ENIs when node is marked noSchedule (#1927)

* Prevent allocate/free ENIs when node is marked noSchedule

* Update UTs

* Re-use logger instance (#2029)

* Re-use logger instance

- Existing logger initialization constructed different logger
  instances upon call to Get() method.
- Fixed the initailiation logic to re-use the logger instance.

* Added unit tests for logger initialization fix

Co-authored-by: M00nF1sh <yyyng@amazon.com>
Co-authored-by: Sushmitha Ravikumar <58063229+sushrk@users.noreply.github.com>
Co-authored-by: Relk Li <YiJiun.Li.C@gmail.com>
Co-authored-by: Relk Li <relk@maicoin.com>
Co-authored-by: Jan-Otto Kröpke <github@jkroepke.de>
Co-authored-by: Shuntaro Azuma <azush.work@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Senthil Kumaran <senthilx@amazon.com>
Co-authored-by: cgchinmay <cgadgil@amazon.com>
Co-authored-by: Vikas Basavaraj <5373156+vikasmb@users.noreply.github.com>
Co-authored-by: Hao Zhou <haouc@users.noreply.github.com>
Co-authored-by: abhipth <abhipth@amazon.com>
Co-authored-by: Prasad Jivane <prasad.jivane@walchandsangli.ac.in>
vikasmb added a commit that referenced this pull request Jul 22, 2022
* fix addOn version api for beta (#2034)

* Update yaml.v3 package dependency (#2036)

* Update yaml.v3 package dependency

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>
Co-authored-by: M00nF1sh <yyyng@amazon.com>
Co-authored-by: Sushmitha Ravikumar <58063229+sushrk@users.noreply.github.com>
Co-authored-by: Relk Li <YiJiun.Li.C@gmail.com>
Co-authored-by: Relk Li <relk@maicoin.com>
Co-authored-by: Jan-Otto Kröpke <github@jkroepke.de>
Co-authored-by: Shuntaro Azuma <azush.work@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Senthil Kumaran <senthilx@amazon.com>
Co-authored-by: cgchinmay <cgadgil@amazon.com>
Co-authored-by: Hao Zhou <haouc@users.noreply.github.com>
Co-authored-by: abhipth <abhipth@amazon.com>
Co-authored-by: Prasad Jivane <prasad.jivane@walchandsangli.ac.in>
jayanthvn added a commit that referenced this pull request Aug 16, 2022
* Increase cpu requests limit (#2038)

- Porting changes from release-1.10 branch made in PR #1749

* fix ipamd integration failures and cleanup (#2039)

* fix integration test failures and cleanup

* README update

* cleanup info logs in event recorder and test script (#2043)

* add nodeSelector to cni-metrics-helper test deployment and update image tag (#2047)

* fix makefile path in canary test script (#2051)

* disable arm build (#2052)

* Updated changelog for 1.11.3 release (#2053)

Co-authored-by: Vikas Basavaraj Mallapura <“5373156+vikasmb@users.noreply.github.com”>

* Updating master branch config files to release 1.11.3 (#2055)

* Updated changelog for 1.11.3 release

* Image tag and chart version update for 1.11.3 release (#2050)

Co-authored-by: Vikas Basavaraj Mallapura <“5373156+vikasmb@users.noreply.github.com”>
(cherry picked from commit ff42a83)

Co-authored-by: Vikas Basavaraj Mallapura <“5373156+vikasmb@users.noreply.github.com”>

* update aws-node clusterrole permissions (#2058)

* Fix minor typo on documentation (#2059)

s/varibales/variables/

* multus manifest for release v3.9.0-eksbuild.2 (#2057)

* Setting AWS_VPC_K8S_CNI_RANDOMIZESNAT to the default value (#2028)

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

* Fixing prefixes per ENI value in example (#2060)

Prefixes per ENI in row 6 should be 1 not 3.

Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>

* IPAMD optimizations and makefile changes (#1975)

* IPAMD optimizations and makefile changes

* Minor comments

* Removed IMDS dependency

* fix test

* fix test

* fix test-format

* Updated new instances (#2062)

* Updated new instances

* fix format

Co-authored-by: M00nF1sh <yyyng@amazon.com>
Co-authored-by: Sushmitha Ravikumar <58063229+sushrk@users.noreply.github.com>
Co-authored-by: Relk Li <YiJiun.Li.C@gmail.com>
Co-authored-by: Relk Li <relk@maicoin.com>
Co-authored-by: Jan-Otto Kröpke <github@jkroepke.de>
Co-authored-by: Shuntaro Azuma <azush.work@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Senthil Kumaran <senthilx@amazon.com>
Co-authored-by: cgchinmay <cgadgil@amazon.com>
Co-authored-by: Vikas Basavaraj <5373156+vikasmb@users.noreply.github.com>
Co-authored-by: Hao Zhou <haouc@users.noreply.github.com>
Co-authored-by: abhipth <abhipth@amazon.com>
Co-authored-by: Prasad Jivane <prasad.jivane@walchandsangli.ac.in>
Co-authored-by: Vikas Basavaraj Mallapura <“5373156+vikasmb@users.noreply.github.com”>
Co-authored-by: Guillaume Delacour <guillaume.delacour@gmail.com>
Co-authored-by: Venkata Gunapati <gvsukumar@gmail.com>
Co-authored-by: Muhammed Karakas <karakas@amazon.com>
- apiGroups: ["", "events.k8s.io"]
  resources:
    - events
  verbs: ["create", "patch", "list", "get"]

why does it need patch?

Contributor Author

This is required for event aggregation. For example, without patch we wouldn't see the aggregated count (x12 over 21s) in the event below:
Warning MissingIAMPermissions 20s (x12 over 21s) aws-node Unauthorized operation ...

FYI, we recently removed get because it was not required.
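
To illustrate the aggregation point above, here is a minimal, standard-library-only sketch of why repeated events need the patch verb. It is a simplified model, not the client-go or aws-node implementation: the `Recorder` type, its `AllowPatch` flag, and the `EventKey` fields are hypothetical names introduced only for this example. The assumption it encodes is that the first occurrence of an event is created, and every repeat updates the count on the existing object.

```go
package main

import "fmt"

// EventKey identifies "the same" event for aggregation (hypothetical type).
type EventKey struct {
	Pod, Reason, Message string
}

// Recorder is a toy stand-in for an event recorder plus API server.
// AllowPatch models whether the RBAC rule grants the "patch" verb.
type Recorder struct {
	AllowPatch bool
	Creates    int
	Patches    int
	counts     map[EventKey]int
}

// NewRecorder returns a Recorder with an empty event store.
func NewRecorder(allowPatch bool) *Recorder {
	return &Recorder{AllowPatch: allowPatch, counts: map[EventKey]int{}}
}

// Emit records one occurrence and returns the verb it used:
// "create" for the first occurrence, "patch" for a repeat,
// or "denied" when a repeat cannot be patched.
func (r *Recorder) Emit(k EventKey) string {
	if _, seen := r.counts[k]; !seen {
		r.counts[k] = 1
		r.Creates++
		return "create"
	}
	if !r.AllowPatch {
		return "denied" // the stored count stays at its old value
	}
	r.counts[k]++
	r.Patches++
	return "patch"
}

// Count returns the aggregated count stored for k.
func (r *Recorder) Count(k EventKey) int { return r.counts[k] }

func main() {
	k := EventKey{Pod: "aws-node-lps4s", Reason: "MissingIAMPermissions", Message: "Unauthorized operation"}
	for _, allow := range []bool{true, false} {
		r := NewRecorder(allow)
		for i := 0; i < 12; i++ {
			r.Emit(k)
		}
		fmt.Printf("patch allowed=%v: x%d (%d create, %d patch)\n", allow, r.Count(k), r.Creates, r.Patches)
	}
}
```

In this model, twelve identical emissions with patch allowed produce one create and eleven patches, so the stored count reaches 12. With patch denied, the count stays at 1 and the aggregated summary never appears.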


thanks for the explanation!

6 participants