
SDN 2316: Add GatewayConfig to OVNKubernetesConfig #1033

Merged
merged 1 commit into openshift:master on Dec 3, 2021

Conversation

tssurya
Contributor

@tssurya tssurya commented Oct 15, 2021

With this PR we enable users to specify gateway config options via the CRD.

/assign @trozet : PTAL at the gatewayConfig. Since we need to expose local gw mode anyway, I was thinking of importing the other gateway configs from upstream as well, like disableSNATMultipleGWs, in a more API-friendly way, as opposed to the hardcoded values we set via CNO. WDYT? Are there any concerns in doing so?
Based on the arch call and after talking to trozet, we have decided to add a boolean which can do gateway mode switches for us.

/assign @danwinship, the API expert :)
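
(For illustration, a rough sketch of the boolean-toggle shape described above. The field name RoutingViaHost and the exact godoc wording are assumptions for this sketch, not text taken from this thread.)

    // Sketch only: one possible shape for the gateway boolean discussed above.
    type GatewayConfig struct {
        // routingViaHost toggles local-gateway behaviour: when true, pod egress
        // traffic is routed into the host's network stack before leaving the
        // node; when false (the default), it egresses directly via OVN
        // (shared gateway mode).
        // +kubebuilder:default:=false
        // +optional
        RoutingViaHost bool `json:"routingViaHost,omitempty"`
    }

    // Hung off the existing OVNKubernetesConfig struct as an optional pointer:
    //
    //     GatewayConfig *GatewayConfig `json:"gatewayConfig,omitempty"`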

@openshift-ci openshift-ci bot requested review from bparees and jwforres October 15, 2021 16:16
@tssurya
Contributor Author

tssurya commented Oct 15, 2021

/test verify

@tssurya
Contributor Author

tssurya commented Oct 19, 2021

I brought this up at the team meeting; I think the consensus was to not expose all these values and only put out the necessary ones. @trozet wdyt?

@danwinship
Contributor

Yeah, anything we expose in the API we have to support forever.

If we think it's more likely than not that we'll have to expose additional gateway-specific config options in the future, then keep the GatewayConfig struct, even though it will only have one subfield for now. But if we think that's not likely, then just add GatewayMode to OVNKubernetesConfig directly.

@tssurya
Contributor Author

tssurya commented Oct 20, 2021

Yeah, anything we expose in the API we have to support forever.

yeah that's a good point.

If we think it's more likely than not that we'll have to expose additional gateway-specific config options in the future, then keep the GatewayConfig struct, even though it will only have one subfield for now. But if we think that's not likely, then just add GatewayMode to OVNKubernetesConfig directly.

I think we already need the two fields, mode and disableSNATMultipleGWs, so I'll use the GatewayConfig struct. Plus, like you said, if in the future we feel the need to expose the other values we can do so on a one-by-one basis.

@tssurya
Contributor Author

tssurya commented Oct 20, 2021

/hold

updating PR...

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2021
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from 9c47cdf to b055424 Compare October 20, 2021 18:49
@tssurya
Contributor Author

tssurya commented Oct 20, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2021
@@ -374,6 +373,11 @@ type OVNKubernetesConfig struct {
// reported defaults are used.
// +optional
PolicyAuditConfig *PolicyAuditConfig `json:"policyAuditConfig,omitempty"`
// gatewayConfig holds the configuration for node gateway options. If unset,
// reported defaults are used. Supported from OCP 4.10.
// +kubebuilder:default:={mode: shared}
Contributor

I'm not sure what "reported defaults are used" means, but I see you just copied that from other fields...

I don't know if we should be defaulting the value here; CNO is going to have to deal with GatewayConfig being nil in existing configs, so it seems like we should just leave it nil if unspecified in new configs as well.

Contributor Author

Yeah, let me look into this and test whether it works the way I think it should. I thought that since the option is optional, the default wouldn't really put a value there, but that if the user adds the gatewayConfig key and forgets to provide a value, it would be filled with the default. I could be completely wrong here; let me check.

I'll also remove the "if unset, reported defaults are used" part.

Contributor Author

Ok, so you are right, by default we will have:

Spec:
  Cluster Network:
    Cidr:         10.128.0.0/14
    Host Prefix:  23
  Default Network:
    Ovn Kubernetes Config:
      Gateway Config:
        Egress Multiple External Gateways:  false
        Mode:                               shared
      Geneve Port:                          6081
      Mtu:                                  1360
      Policy Audit Config:
        Destination:            null
        Max File Size:          50
        Rate Limit:             20
        Syslog Facility:        local0
    Type:                       OVNKubernetes

It's probably better not to expose these default values like this, especially if the user hasn't set them, to avoid confusion.
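
(If the +kubebuilder:default marker is dropped so that an unset gatewayConfig simply stays nil, the operator-side handling could look like the sketch below. The Mode field mirrors the {mode: shared} shape in the diff above at this point in the review; the helper itself is hypothetical.)

    type GatewayMode string

    type GatewayConfig struct {
        // +optional
        Mode GatewayMode `json:"mode,omitempty"`
    }

    // gatewayModeOrDefault is a hypothetical CNO-side helper: with no API-level
    // default, a nil or empty gatewayConfig just means "keep today's behaviour",
    // i.e. shared gateway mode.
    func gatewayModeOrDefault(gw *GatewayConfig) GatewayMode {
        if gw == nil || gw.Mode == "" {
            return "shared"
        }
        return gw.Mode
    }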

operator/v1/types_network.go
// Default is false.
// +kubebuilder:default:=false
// +optional
DisableSNATMultipleGWs bool `json:"disableSNATMultipleGWs,omitempty"`
Contributor

Isn't this part of a feature that is not supported for most customers? If so, it's weird to put it in the API...

Contributor Author

That's true, but I thought we were planning to support it for all in the near future?

Contributor Author

On second thought, if needed there is no harm in adding them later on, so we could just go with the mode field for now.

Contributor

The CNO only allows toggling this because it breaks egress IP completely (see openshift/cluster-network-operator#1145), but I am not sure it should be exposed to customers. And if we do expose it, it should have a better name than DisableSNATMultipleGWs (which is completely incomprehensible); maybe EnableICNI would be clearer (because it's only used within the scope of ICNI and its F5 integration)?

Contributor Author

The CNO only allows toggling this because it breaks egress IP completely (see openshift/cluster-network-operator#1145), but I am not sure it should be exposed to customers. And if we do expose it, it should have a better name than DisableSNATMultipleGWs (which is completely incomprehensible); maybe EnableICNI would be clearer (because it's only used within the scope of ICNI and its F5 integration)?

Yeah, you make a very good point; like you said, this option is currently needed only for the ICNI model. So if we feel this isn't something we should expose and that we should just continue using the gateway-mode-config ConfigMap, I'm fine with that. Also, we don't have any documentation on how to use this feature, so maybe the hidden feature doesn't need its hidden flag exposed :)
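
(For context, a sketch of what the hidden ConfigMap mentioned here might look like. Only the name gateway-mode-config comes from this thread; the namespace and data key are assumptions and would need to be checked against the CNO code.)

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gateway-mode-config
      namespace: openshift-network-operator   # assumed namespace
    data:
      mode: "local"                            # assumed key; "local" or "shared"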

Contributor Author

@tssurya tssurya Oct 21, 2021

I personally am against exposing all of this, FWIW. Exposing these knobs to customers means we will need to support shared and local gateway mode for OVN-Kubernetes forever. I for one was hoping that whatever networking feature/capability is provided by local but not shared would be implemented in the future and covered by some epic/user story. From my understanding, the sole argument for using local gateway mode today is: you, as a user, want to modify the kernel routing table or you manage multiple NICs/routes yourself, and shared gateway mode currently breaks you because we haven't implemented what you are doing yourself in OVN-Kubernetes (or an API for you to do that). I might be missing something critical, though, which makes it so that we can't align and fill these needs?

Yes, I think we all intended to make LGW go away at some point and only have SGW (mainly for HW offloading, as I understand). Unfortunately, a lot of users actually count on having egress traffic from the pod leave via the management port into the kernel before it is sent out from the host. This is simply never going to happen with SGW because all traffic from pods will leave via br-ex and won't touch the host stack.

Implementing this in SGW would mean adding (hacking) the necessary routes into ovn_cluster_router to ask it to send traffic out to the host. The current plan is to provide a CRD or something similar for SGW so that users can specify routes. The trouble is that this means people need to know the OVN topology :/ . Like you said, the main reason is probably secondary networks on the host that they want to connect to.

Anyway, until a proper solution in SGW takes shape, the other option was to support LGW, since we can't really regress (but I guess that ship has sailed - luckily saved by the hidden mode in the config-map).

The middle-ground solution is to have a toggle in SGW where we allow users to say: 1) I want pod egress to go via the host, or 2) I am OK with it not touching the host. The former will now be called "local gateway". At least that's the aim, if I can get the "move lgw closer to sgw" epic done properly. If we can remove all the extra bits from lgw, then it's just a toggle for how we want to direct traffic.

EDIT: well, I am more so referring to the gateway mode flag. ICNI and egress IP are logically incompatible, so customers need to choose one or the other. But the impact of that flag on OVN-Kubernetes as a project is much smaller than supporting both gateway modes.

right.

Contributor

DisableSNATMultipleGWs isn't there to enable/disable ICNI; ICNI is always enabled. The flag simply decides whether ICNI egress traffic will be SNAT'ed to the host's IP or the pod source IPs will be exposed directly. For our current use case we disable SNAT and expose pod IPs. We have never purposefully exposed this so that someone could use ICNI with SNAT. I would say we can hold off on exposing it for now, but keep the gatewayConfig block so we could add it (or other features) later.

Contributor Author

DisableSNATMultipleGWs isn't there to enable/disable ICNI; ICNI is always enabled. The flag simply decides whether ICNI egress traffic will be SNAT'ed to the host's IP or the pod source IPs will be exposed directly. For our current use case we disable SNAT and expose pod IPs. We have never purposefully exposed this so that someone could use ICNI with SNAT. I would say we can hold off on exposing it for now, but keep the gatewayConfig block so we could add it (or other features) later.

Oh okay, this is news to me :D I always thought the disableSNAT option was also needed for ICNI to work, but it seems like we first hit the ECMP route and only then SNAT towards the end, so it makes sense that SNAT is not a strict requirement for ICNI to work - just that outgoing traffic won't have the pod IPs as its source. Should we just add the toggle in this PR itself? I'm asking because in 4.9 we were actually allowing the user to specify this via the hidden config-map. Now, if in 4.10 we suddenly remove that config-map and don't provide a way to specify this option, are we in some way regressing?

Contributor

I think the purpose of it in 4.9 was to work around the incompatibility with egress IP, right? In 4.10 we are going to make egress gw and egress IP compatible, so that isn't a valid reason to expose the option. If we officially supported multiple exgw then I would agree we should keep the option and expose it via the API, but we do not officially support it for 4.10. Therefore I think we should hold off on exposing it until we officially support it, and ensure that multiple exgw and egress IP are compatible in 4.10 (which you are already working on).

Contributor Author

I think the purpose of it in 4.9 was to work around the incompatibility with egress IP, right? In 4.10 we are going to make egress gw and egress IP compatible, so that isn't a valid reason to expose the option. If we officially supported multiple exgw then I would agree we should keep the option and expose it via the API, but we do not officially support it for 4.10. Therefore I think we should hold off on exposing it until we officially support it, and ensure that multiple exgw and egress IP are compatible in 4.10 (which you are already working on).

Yes, true - the intention was to allow that exclusivity, and it has been removed with ovn-kubernetes/ovn-kubernetes#2686.

Cool; as long as we agree that we have not committed to supporting this feature through the API until it's fully Alpha, I'm good with this decision.

@tssurya tssurya force-pushed the add-gm-config-ovnk branch from b055424 to 3589e84 Compare October 25, 2021 07:11
@tssurya tssurya force-pushed the add-gm-config-ovnk branch 2 times, most recently from 30b701d to 70ecfdd Compare October 25, 2021 13:31
tssurya added a commit to tssurya/cluster-network-operator that referenced this pull request Nov 29, 2021
Let's port the API changes from
openshift/api#1033 and
use it in CNO to set the gateway mode accordingly.
Contributor

@trozet trozet left a comment

/approve

@tssurya
Contributor Author

tssurya commented Nov 30, 2021

@sjenning or @sttts or @deads2k ptal

@tssurya
Contributor Author

tssurya commented Dec 1, 2021

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 1, 2021
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from dd9b16e to 938166b Compare December 1, 2021 10:01
@tssurya tssurya requested review from trozet and danwinship December 1, 2021 10:11
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from 938166b to 99ab9dc Compare December 2, 2021 09:12
@tssurya
Contributor Author

tssurya commented Dec 2, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 2, 2021
@trozet
Contributor

trozet commented Dec 2, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2021
Contributor

@knobunc knobunc left a comment

Thanks @tssurya . Some small comments, but this is very close.

operator/v1/0000_70_cluster-network-operator_01.crd.yaml
host before sending it out. If this is not set, traffic
will always egress directly from OVN to outside without
touching the host stack. Setting this to true means
hardware offload will not be supported. Default is false.
Contributor

What do we do when we upgrade existing clusters? Do we need to handle that case specifically?

Contributor Author

@tssurya tssurya Dec 2, 2021

So upgrades are a bit different; currently, with this feature, we'll only be supporting the 4.9 -> 4.10 case. Since we more or less stopped supporting lgw around 4.8, lgw is currently unsupported in 4.8/4.9 - and the same goes for upgrades.

But with this bool, the aim is to allow users to switch between the two modes at any given time. For example, say you are on 4.9 with shared gateway mode: if you want to upgrade to 4.10 and choose lgw mode, you'd first have to upgrade to 4.10 shared, then edit this bool in the CRD, and CNO will roll out the ovnkube pods in local gateway mode. Vice versa, if you are coming from 4.9 local gateway and want to move to 4.10 shared, you'd first go to 4.10 local and then switch off this bool (which is off by default, so by default we are always shared). So essentially, users should be able to go between the two modes at any given point, since the upgrade itself doesn't really provide a way to say "get me directly to gateway mode x".
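
(As an illustration of the switch described above: flipping the bool amounts to editing the cluster Network operator CR. The gatewayConfig path matches the spec dump earlier in this thread; the field name routingViaHost is an assumption for this sketch.)

    apiVersion: operator.openshift.io/v1
    kind: Network
    metadata:
      name: cluster
    spec:
      defaultNetwork:
        type: OVNKubernetes
        ovnKubernetesConfig:
          gatewayConfig:
            routingViaHost: true   # assumed field name; true = local gateway, false (default) = shared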

Contributor Author

Another point to note: AFAIK, the API is the first thing to upgrade and CNO is the third component, so consider this scenario.

We are moving from 4.9 local to 4.10 local:

  • If the API upgrades first, the CRD changes for the gatewayConfig should be available in client-go for the user to do an oc edit (I think). So the API is already on 4.10.
  • At this point CNO is still on 4.9, and CNO is the last thing to upgrade in the sense that it rolls out all the multus/ovn-k pods and then itself undergoes the upgrade. So it rolls out all the ovnkube daemonsets from 4.9 to 4.10, and since the 4.9 CNO used the hidden config-map to set the gateway mode, it would roll out the new ones respecting that config-map's mode value.
  • Once all ovn-k pods are on 4.10, CNO also upgrades, and the new CNO recognizes the new gatewayConfig values if they are set.

I need to test this process once this bool is merged, and basically make CNO control the mode migration process during upgrades.

Contributor

I guess my question is whether we should say something like "the default is false unless upgrading from an earlier version"?

Contributor Author

Ah :D you make a good point :). Yes, we definitely need to respect the mode that existed in the previous version of the cluster. Let me update that.

@knobunc knobunc self-assigned this Dec 2, 2021
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from 99ab9dc to 92f9f42 Compare December 2, 2021 18:36
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2021
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from 92f9f42 to 65181bd Compare December 3, 2021 14:48
@knobunc
Contributor

knobunc commented Dec 3, 2021

/lgtm
/approve

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 3, 2021
@tssurya
Contributor Author

tssurya commented Dec 3, 2021

/test verify

@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@tssurya
Contributor Author

tssurya commented Dec 3, 2021

/retest-required

With this we can enable users to specify
gateway config options via CRD. For now
we are exposing only the mode of egress.
@tssurya tssurya force-pushed the add-gm-config-ovnk branch from 65181bd to a880d29 Compare December 3, 2021 16:36
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2021
@tssurya
Contributor Author

tssurya commented Dec 3, 2021

/retest-required

@knobunc
Contributor

knobunc commented Dec 3, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2021
@openshift-ci
Contributor

openshift-ci bot commented Dec 3, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc, trozet, tssurya


@tssurya
Contributor Author

tssurya commented Dec 3, 2021

/test verify

@tssurya
Contributor Author

tssurya commented Dec 3, 2021

/retest-required

@openshift-merge-robot openshift-merge-robot merged commit fb6f933 into openshift:master Dec 3, 2021