Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix using ovn kubernetes as the CNI #1689

Merged
merged 6 commits into from
Jun 4, 2022
Merged

Conversation

astoycos
Copy link
Contributor

@astoycos astoycos commented Feb 7, 2022

This PR re-writes the submariner networkplugin-syncer for ovn-kubernetes. Mainly it converts from using go-ovn/nbctl to libovsdb for interacting with the ovn-dbs. I will update this PR with more commits/explanations as I finish tidying things up.

Fixes #1608

Depends on ovn-kubernetes/ovn-kubernetes#2747
Depends on submariner-io/shipyard#704

@submariner-bot
Copy link
Contributor

🤖 Created branch: z_pr1689/astoycos/fix-ovn-k
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@astoycos
Copy link
Contributor Author

astoycos commented Feb 8, 2022

I would expect E2E to fail until the dependent PRs (updating our KIND setup) works

@astoycos astoycos marked this pull request as ready for review February 11, 2022 05:04
@astoycos
Copy link
Contributor Author

I've tested this locally by running E2E and things look good so far, so I'll go ahead and open this up for review.

@stale
Copy link

stale bot commented Feb 26, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Feb 26, 2022
@stale stale bot removed the wontfix This will not be worked on label Feb 28, 2022
@github-actions github-actions bot removed the dependent label Mar 2, 2022
@mangelajo
Copy link
Contributor

Very cool, @astoycos pong me to see if I can do a review. I'm not sure if I have enough bandwidth currently to go through the whole patch / do a detailed review, I will try do diagonally-read as much as possible.

I'd make sure it works with OCP + kind.

k8s.io/client-go v1.5.2
k8s.io/klog v1.0.0
k8s.io/utils v0.0.0-20210305010621-2afb4311ab10
k8s.io/klog/v2 v2.30.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we need to keep logr at 0.4.0 so we need to keep K8s at 0.21.11 and klog/v2 at v2.9.0 (or pin it).

@@ -113,3 +113,57 @@ func getOVNTLSConfig(pkFile, certFile, caFile string) (*tls.Config, error) {
MinVersion: tls.VersionTLS12,
}, nil
}

func createLibovsdbClient(dbAddress string, tlsConfig *tls.Config, dbModel model.ClientDBModel) (libovsdbclient.Client, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func createLibovsdbClient(dbAddress string, tlsConfig *tls.Config, dbModel model.ClientDBModel) (libovsdbclient.Client, error) {
func createLibOvsdbClient(dbAddress string, tlsConfig *tls.Config, dbModel model.ClientDBModel) (libovsdbclient.Client, error) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libovsdb is a single word here :)

@astoycos
Copy link
Contributor Author

I still need to work on vendoring issues, since now they're breaking unit tests

@astoycos astoycos force-pushed the fix-ovn-k branch 2 times, most recently from d7f3831 to 89337ae Compare May 12, 2022 16:04
// TODO: Remove this workaround for https://github.com/kubernetes/client-go/issues/949 once
// admiral has been updated
t.scheme.AddKnownTypeWithName(schema.GroupVersionKind{Group: "fake-dynamic-client-group", Version: "v1", Kind: "List"}, &unstructured.UnstructuredList{})

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should pin to K8s 0.21.11 to avoid having to do this. The rest of the submariner projects use 0.21.11 so I think we should be consistent.

Copy link
Contributor Author

@astoycos astoycos May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tpantelis I was more just trying to figure out why that was happening :) ..... Also I believe this problem occurred from 0.19 -> 0.20 so we'll probably still run into this specific issue

We ran into issues in submariner-operator with lighthouse bumped to v0.23.5 - we had to pin a bunch of things. We're going to have to do the same here (ie pin k8s 0.21.11). sigs.k8s.io/controller-runtime v0.11.2 uses the backwards incompatible logr 1.2.0 - we need to directly set or pin it to sigs.k8s.io/controller-runtime v0.9.7. Also github.com/go-logr/logr pinned to v0.4.0 and k8s.io/klog/v2 to v2.9.0.

I'll give pinning to 0.21.11 a try...But currently there's some dependencies with the imported OVN-K libraries which are forcing the version up to v0.23.3 since that's what's defined here -> https://github.com/ovn-org/ovn-kubernetes/blob/master/go-controller/go.mod#L38. Also something to keep in mind... We don't use logr in this repo so I don't think that's a problem anymore... I updated klog/v2 to v2.30 (in this current iteration) and it looks like everything is running alright

But I do need to ensure that we can import those ovn-k libraries without depending on k8s version.. Or else then every time ovn-k bumps their K8s version we'll have to, creating an unnecessary dependency (unless we are always just keeping it pinned...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example when I pin to 0.21.11

#14 29.60 # github.com/ovn-org/ovn-kubernetes/go-controller/pkg/libovsdbops
#14 29.60 vendor/github.com/ovn-org/ovn-kubernetes/go-controller/pkg/libovsdbops/transact.go:21:20: undefined: wait.PollImmediateUntilWithContext

Copy link
Contributor

@tpantelis tpantelis May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also something to keep in mind... We don't use logr in this repo so that's not a problem anymore...

But I intend/want to use logr, specifically for zerolog.

But I do need to ensure that we can import those ovn-k libraries without depending on k8s version.. Or else then every time ovn-k bumps their K8s version we'll have to, creating an unnecessary dependency.

I see that you're using the master branch from github.com/ovn-org/ovn-kubernetes. Not sure how stable that is but I guess we have to since no prior version has a go.mod file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait.PollImmediateUntilWithContext was added in K8s 0.22 so perhaps try pinning to 0.22.9, which uses k8s.io/klog/v2 v2.9.0 and logr v0.4.0.

Copy link
Contributor Author

@astoycos astoycos May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you're using the master branch from github.com/ovn-org/ovn-kubernetes. Not sure how stable that is but I guess we have to since no prior version has a go.mod file.

Right they have no upstream release versioning process so it is a bit unstable.... granted these bits were already pretty unstable, and regardless I don't think what we have here is necessarily the best long-term ovn-k integration solution since any way we cut it there will always be a tightly coupled dependency with the ovn-k version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we would bump to K8s 0.23 across the board. At one point lighthouse was using 0.23 due to coredns bump but we downgraded due to submariner-operator issues. Obviously submariner is building fine with 0.23 with this PR. I'm not sure what's holding us back. @skitt? Is it the older operator framework in submariner-operator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried pinning to K8s 0.22.9 et al but hit a showstopper:

github.com/submariner-io/submariner/pkg/networkplugin-syncer/handlers/ovn imports
        github.com/ovn-org/libovsdb/client imports
        github.com/go-logr/stdr imports
        github.com/go-logr/logr/funcr: no required module provides package github.com/go-logr/logr/funcr; to add it:
        go get github.com/go-logr/logr/funcr
~/go/src/github.com/submariner-io/submariner>go get github.com/go-logr/logr/funcr
go: module github.com/go-logr/logr@upgrade found (v1.2.3, replaced by github.com/go-logr/logr@v0.4.0), but does not contain package github.com/go-logr/logr/funcr

So github.com/go-logr/logr/funcr was added in logr v1.0.0 which also introduces the backwards incompatible Logger change from interface to struct. So in order to use libovsdb, we need K8s 0.23 et al. We'll see what happens when the submariner version is bumped in submariner-operator 😄 Of course bumping everything to K8s 0.23 would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I experienced the exact same thing :(

And sounds good, I'll let this sit for now since CI is green and continue to monitor the best way forward

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @skitt - after bumping the operator framework version and adjusting our operator code (planned epic for this release) and requiring go 1.17, there should be no obstacle in bumping to K8s 0.23 across the projects.

@astoycos astoycos force-pushed the fix-ovn-k branch 2 times, most recently from dc71dc8 to 627ff30 Compare May 12, 2022 19:20
Copy link
Member

@dfarrell07 dfarrell07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @sridhargaddam / @vthapar / @aswinsuryan does anyone else want to review or can we merge this?

@sridhargaddam
Copy link
Member

LGTM! @sridhargaddam / @vthapar / @aswinsuryan does anyone else want to review or can we merge this?

@dfarrell07 I could not look into this as I was occupied with other stuff, but @vthapar wanted to try this out on an OCP cluster. I'll let him comment. I'm fine if we want to go ahead and merge it.

@vthapar
Copy link
Contributor

vthapar commented May 30, 2022

@dfarrell07 @sridhargaddam While testing I ran into a issue with OCP 4.10. This fix requires OCP 4.11 or later to work. I've been unable to get a working setup with OCP 4.11 I'd like to confirm it working with 4.11 at least once before we merge it. I should have results in a day at most.

Copy link
Contributor

@vthapar vthapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the code but did test it out and confirmed E2E to be passing in both kind and AWS+OCP.

astoycos added 6 commits June 3, 2022 11:26
Switch from using go-ovn clients to interact
with the ovn dbs and change to using libovsdb.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Around the ovn package we defined constants in a
confusing array of locations, this commit reorganizes
them into a single file.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
With the conversion to using libovsdb we no longer need to
use nbctl. This commit removes the associated helper
functions.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Rename the files which contain the core submariner/ovn
functionality bits. Convert all the ovsdb clients from
go-ovn to ovsdb.

Re-organize some submariner files to align more
closely with OVN Semantics.  Convert all calls
reguarding OVN logical_router_static_routes to
use libovsdb rather than go-ovn.

Convert all logical_router_policy calls
to libovsdb.

convert all OVN sbdb chassis
related calls to use libovsdb.

Update how clusterSet service avaliability works
in the networkplugin-syncer code. Speciafically it ensures we add
the a new OVN object (OVN cluster loadbalancer_group) to the submariner
router. This group has all the VIPs for the cluster in question.

Update vendoring since ovn-k requires k8s 1.23

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
With the OVN-K CNI we were seeing and ungraceful
termination of the GW Pod in GW migration tests,
this patch ensures we cleanup the cableengine
on node resources before exiting.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
For a Fake k8s client we must now
explicitly register all resouces
we expect to list in the scheme.

Otherwise we'll run into
kubernetes/client-go#914

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
@vthapar vthapar enabled auto-merge (rebase) June 4, 2022 12:04
@vthapar vthapar merged commit a5aee59 into submariner-io:devel Jun 4, 2022
@submariner-bot
Copy link
Contributor

🤖 Closed branches: [z_pr1689/astoycos/fix-ovn-k]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OVN priority:high ready-to-test When a PR is ready for full E2E testing
Projects
None yet
8 participants