Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netchecker: update images to 1.2.2 from Mirantis #8074

Merged
merged 2 commits into from
Oct 19, 2021

Conversation

cristicalin
Copy link
Contributor

@cristicalin cristicalin commented Oct 12, 2021

Update netchecker to a slightly less older version than what we have today in kubespray. We rely on this tool in our CI so it does not hurt to be a bit more up-to-date.

What type of PR is this?

/kind feature

What this PR does / why we need it:
Netchecker, while old and has not seen many changes recently, does have slightly newer versions published by Mirantis. This PR updates our netchecker to the newer version.

Which issue(s) this PR fixes:

Fixes #8075

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Update Netchecker to v1.2.2 - now local etcd backend is needed to run

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 12, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @cristicalin. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cristicalin cristicalin changed the title netchecker: update images to 1.2.2 from Mirantis which is slightly le… netchecker: update images to 1.2.2 from Mirantis Oct 12, 2021
@k8s-ci-robot k8s-ci-robot requested review from EppO and oomichi October 12, 2021 12:05
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 12, 2021
@floryut floryut added the release-note-none Denotes a PR that doesn't merit a release note. label Oct 12, 2021
Copy link
Member

@floryut floryut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ok-to-test
Indeed.. not sure an image from 12/2017 will make a difference compare to 06/2017 but well no harm done in upgrading it

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 12, 2021
@cristicalin cristicalin marked this pull request as draft October 12, 2021 13:00
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 12, 2021
@cristicalin
Copy link
Contributor Author

Seems the server is not quite happy:

E1012 12:57:35.019182       1 server.go:58] Error while setting up the handler. Details: the server could not find the requested resource
panic: the server could not find the requested resource

goroutine 1 [running]:
main.main()
	/home/travis/gopath/src/github.com/Mirantis/k8s-netchecker-server/cmd/server.go:59 +0x6d3

@cristicalin
Copy link
Contributor Author

Bit more detail:

I1012 13:16:13.714714       1 server.go:48] K8s netchecker. Compiled at: 20180102-12:47:46-UTC
I1012 13:16:13.714855       1 server.go:54] Start listening on 0.0.0.0:8081
I1012 13:16:13.716319       1 request.go:991] Request Body: {"kind":"CustomResourceDefinition","apiVersion":"apiextensions.k8s.io/v1beta1","metadata":{"name":"agents.network-checker.ext","creationTimestamp":null},"spec":{"group":"network-checker.ext","version":"v1","names":{"plural":"agents","kind":"Agent"},"scope":"Namespaced"},"status":{"conditions":null,"acceptedNames":{"plural":"","kind":""}}}
I1012 13:16:13.716448       1 round_trippers.go:386] curl -k -v -XPOST  -H "Content-Type: application/json" -H "User-Agent: netchecker-server/v1.7.4 (linux/amd64) kubernetes/$Format" -H "Accept: application/json, */*" -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IklONkVpd0tTcFNTbnhNSzBNUU5iZGFRa3NVZkZKeXFxcEVQRDROdERmeWcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjY1NTgwNTcwLCJpYXQiOjE2MzQwNDQ1NzAsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJkZWZhdWx0IiwicG9kIjp7Im5hbWUiOiJuZXRjaGVja2VyLXNlcnZlci03ZmRjN2Y4OTQ3LXo0NWdzIiwidWlkIjoiZmZiMTExNzgtMzU0MS00YjYxLTk3NmQtYmVlNTc1ZGQ0ZjlhIn0sInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJuZXRjaGVja2VyLXNlcnZlciIsInVpZCI6ImRjYzE4YjRiLWIxMzItNGM4NS1hZjYwLTY1ZWE0MDJiODAzZSJ9LCJ3YXJuYWZ0ZXIiOjE2MzQwNDgxNzd9LCJuYmYiOjE2MzQwNDQ1NzAsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0Om5ldGNoZWNrZXItc2VydmVyIn0.k5tCcOY9VjZ9M1nTIPa7Yj66QArjkKgs8VrzK05okQ3A7gWs87c5LZgZVgZ4K4qTqnkEoCOEIxrENtZNfiHALJvF2PSL5ibrK6P7MydTp_6jebVzeJA6xFL3UOkPH2RJ1dzFhHqwYrL5n8q-gR-mR-whBVqAqNB_YnqLbyQ67hWuDOAsVml-faByd1HiEKKRTAAp214KOaHlu6hX3VUD9H6dThzhjGwdCmJDa_Z3SBfqzk2okelFPz_7BP2RyLqoClXUCXYBKa66u3aLcAM7ybDIGSvvWCrudrisy-a6YUnFvKFx_AK0V2xy2n0xn0j2jkLZpTirfQN_Aznd5mAiAw" https://10.133.0.1:443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions
I1012 13:16:13.722769       1 round_trippers.go:405] POST https://10.133.0.1:443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions 404 Not Found in 6 milliseconds
I1012 13:16:13.722795       1 round_trippers.go:411] Response Headers:
I1012 13:16:13.722801       1 round_trippers.go:414]     Audit-Id: 2d20a555-8570-48e0-93d6-b044aae0fbe7
I1012 13:16:13.722805       1 round_trippers.go:414]     Cache-Control: no-cache, private
I1012 13:16:13.722808       1 round_trippers.go:414]     Content-Type: application/json
I1012 13:16:13.722811       1 round_trippers.go:414]     X-Kubernetes-Pf-Flowschema-Uid: b5d70319-af8c-4ca1-9ca2-a26caab6b154
I1012 13:16:13.722814       1 round_trippers.go:414]     X-Kubernetes-Pf-Prioritylevel-Uid: 66ce347c-fdcb-4ac5-b6e4-62882391c4aa
I1012 13:16:13.722816       1 round_trippers.go:414]     Content-Length: 174
I1012 13:16:13.722819       1 round_trippers.go:414]     Date: Tue, 12 Oct 2021 13:16:13 GMT
I1012 13:16:13.722834       1 request.go:991] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"the server could not find the requested resource","reason":"NotFound","details":{},"code":404}
E1012 13:16:13.722979       1 storer_k8s.go:68] the server could not find the requested resource
E1012 13:16:13.722988       1 server.go:58] Error while setting up the handler. Details: the server could not find the requested resource
panic: the server could not find the requested resource

goroutine 1 [running]:
main.main()
	/home/travis/gopath/src/github.com/Mirantis/k8s-netchecker-server/cmd/server.go:59 +0x6d3

This looks like the Mirantis version is trying to create some CRDs using the v1beta1 API which was dropped in kube 1.22.

@cristicalin
Copy link
Contributor Author

cristicalin commented Oct 13, 2021

I tried a couple of hacks on the Mirantis code to modernize it but it's just to ancient and my golang is to weak to be able to apply a quick and dirty fix.

For now, it looks like the v1.0 images from l23networks still work with modern kubernetes.

An alternative would be to deploy an etcd cluster and have the netchecker server use etcd instead of CRDs, a solution I don't quite like due to the wasted resources.

@floryut
Copy link
Member

floryut commented Oct 14, 2021

I tried a couple of hacks on the Mirantis code to modernize it but it's just to ancient and my golang is to weak to be able to apply a quick and dirty fix.

For now, it looks like the v1.0 images from l23networks still work with modern kubernetes.

An alternative would be to deploy an etcd cluster and have the netchecker server use etcd instead of CRDs, a solution I don't quite like due to the wasted resources.

Yup, even if I guess at some point netchecker won't work with newer version, so for now we could shrug that but it'll become an issue in the future 😢

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 14, 2021
@cristicalin cristicalin force-pushed the update_netchecker branch 4 times, most recently from 129efd7 to 1770981 Compare October 14, 2021 18:24
@cristicalin
Copy link
Contributor Author

While I'm not 100% happy that I had to add a separate sidecar for this to work, I now have a workable solution with an updated version.

The main benefit of the v1.2.2 images is that they pack a prometheus exporter which can be used to monitor connectivity in the cluster, so for production clusters this is a big win.

The solution was to add an etcd sidecar in the netchecker server pod and make netchecker server talk to the local etcd. The etcd container is not exposed outside the pod so it does make this setup somewhat brittle since we cannot reliably scale up the server deployment. An additional caveat is that the code is so ancient it only has support for etcd v2 API which no longer works properly in our current etcd version (as of this writing this is 3.5.0) and I had to use etcd 3.4.17 for which I had to add a separate version variable to make it work.

The good news is that this now works reliably beyond kube 1.22 but the solution is messy. We should consider trying to revive or fork the upstream project as I'm not aware of a similar project that is currently maintained.

@cristicalin cristicalin marked this pull request as ready for review October 14, 2021 20:24
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 14, 2021
@cristicalin
Copy link
Contributor Author

@floryut I think the current solution should be good enough to merge and allow us to not worry about the changes in 1.22 and beyond, even though it's a bit dirty

@floryut
Copy link
Member

floryut commented Oct 18, 2021

@floryut I think the current solution should be good enough to merge and allow us to not worry about the changes in 1.22 and beyond, even though it's a bit dirty

Agreed, good job here 👍

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cristicalin, floryut

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@floryut
Copy link
Member

floryut commented Oct 19, 2021

/cc @oomichi @EppO

@oomichi
Copy link
Contributor

oomichi commented Oct 19, 2021

Thanks for doing this.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 19, 2021
@k8s-ci-robot k8s-ci-robot merged commit 6a5b87d into kubernetes-sigs:master Oct 19, 2021
@floryut floryut removed the release-note-none Denotes a PR that doesn't merit a release note. label Oct 20, 2021
@floryut floryut mentioned this pull request Dec 21, 2021
sakuraiyuta pushed a commit to sakuraiyuta/kubespray that referenced this pull request Apr 16, 2022
* netchecker: update images to 1.2.2 from Mirantis which is slightly less ancinet than the l23networks images

* Netchecker: use local etcd instead of kubernetes v1beta1 crds which are no longer suported by kube 1.22+
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Jun 27, 2023
* netchecker: update images to 1.2.2 from Mirantis which is slightly less ancinet than the l23networks images

* Netchecker: use local etcd instead of kubernetes v1beta1 crds which are no longer suported by kube 1.22+
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

upgrade netchecker
4 participants