SDN 2316: Add GatewayConfig to OVNKubernetesConfig #1033
Conversation
/test verify
Brought this up at the team meeting; I think the consensus was to not expose all these values and only put out the necessary ones. @trozet wdyt?
Yeah, anything we expose in the API we have to support forever. If we think it's more likely than not that we'll have to expose additional gateway-specific config options in the future, then keep the GatewayConfig struct, even though it will only have one subfield for now. But if we think that's not likely, then just add the field directly.
yeah that's a good point.
I think we already need the two fields
/hold updating PR...
Force-pushed 9c47cdf to b055424
/hold cancel
operator/v1/types_network.go (outdated)
```diff
@@ -374,6 +373,11 @@ type OVNKubernetesConfig struct {
 	// reported defaults are used.
 	// +optional
 	PolicyAuditConfig *PolicyAuditConfig `json:"policyAuditConfig,omitempty"`
+	// gatewayConfig holds the configuration for node gateway options. If unset,
+	// reported defaults are used. Supported from OCP 4.10.
+	// +kubebuilder:default:={mode: shared}
```
I'm not sure what "reported defaults are used" means, but I see you just copied that from other fields... I don't know if we should be defaulting the value here; CNO is going to have to deal with `GatewayConfig` being `nil` in existing configs, so it seems like we should just leave it `nil` if unspecified in new configs as well.
yeah, let me look into this and test if it works the way I think it should. I thought that since the option is optional, `default` wouldn't really put a value there, but if the user just puts the `gatewayConfig` key and forgets to provide a value, it would fill it in with the default value. I could be completely wrong here, let me check. I'll also just remove the "unset reported defaults are used" part.
Ok, so you are right, by default we will have:

```
Spec:
  Cluster Network:
    Cidr: 10.128.0.0/14
    Host Prefix: 23
  Default Network:
    Ovn Kubernetes Config:
      Gateway Config:
        Egress Multiple External Gateways: false
        Mode: shared
      Geneve Port: 6081
      Mtu: 1360
      Policy Audit Config:
        Destination: null
        Max File Size: 50
        Rate Limit: 20
        Syslog Facility: local0
    Type: OVNKubernetes
```
It's probably better not to expose these default values like this, especially if the user hasn't set them, to avoid confusion.
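If we drop the struct-level default, a minimal sketch of how the field could look (paraphrasing the diff above; the placeholder `GatewayConfig` body and the exact doc wording are assumptions, since the field set was still under discussion at this point):

```go
package v1

// Sketch only: Mode mirrors the "Mode: shared" value seen in the
// describe output above.
type GatewayConfig struct {
	// mode selects the node gateway mode ("shared" or "local").
	// +optional
	Mode string `json:"mode,omitempty"`
}

type OVNKubernetesConfig struct {
	// ... other fields elided ...

	// gatewayConfig holds the configuration for node gateway options.
	// Deliberately no +kubebuilder:default marker: an unspecified
	// gatewayConfig stays nil rather than the API server materializing
	// a default block like the one shown above.
	// +optional
	GatewayConfig *GatewayConfig `json:"gatewayConfig,omitempty"`
}
```

Keeping the field a nil-able pointer also lets CNO distinguish "user never set this" from "user explicitly asked for shared mode".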
operator/v1/types_network.go (outdated)
```diff
+	// Default is false.
+	// +kubebuilder:default:=false
+	// +optional
+	DisableSNATMultipleGWs bool `json:"disableSNATMultipleGWs,omitempty"`
```
Isn't this part of a feature that is not supported for most customers? If so, it's weird to put it in the API...
That's true, but I thought we were planning to support it for all in the near future?
On second thoughts, if needed there is no harm in adding them later on. So we could just go with the mode field for now.
The CNO only allows toggling this because it breaks egress IP completely, see openshift/cluster-network-operator#1145, but I am not sure it should be exposed to customers. And if we do, it should have a better name than `DisableSNATMultipleGWs` (which is completely incomprehensible); maybe `EnableICNI` would be clearer (because it's only used within the scope of ICNI and its F5 integration)?
> The CNO only allows toggling this because it breaks egress IP completely, see openshift/cluster-network-operator#1145, but I am not sure it should be exposed to customers. And if we do, it should have a better name than `DisableSNATMultipleGWs` (which is completely incomprehensible); maybe `EnableICNI` would be clearer (because it's only used within the scope of ICNI and its F5 integration)?
yeah, you make a very good point; this option right now, like you said, is needed only for the ICNI model. So if we feel this isn't something we should expose and we should just continue using the `gateway-mode-config` ConfigMap, I'm fine with that. Also, we don't have any documentation on how to use this feature, so maybe the hidden feature doesn't need the hidden flag exposed :)
I personally am against exposing all of this, FWIW. Exposing these knobs to customers means we will need to support shared and local gateway mode for OVN-Kubernetes forever. I for one was hoping that whatever networking feature/capability was provided by local but not shared would be implemented in the future and covered by some epic/user story. From my understanding, the sole argument for using local gateway mode today is: you, as a user, want to modify the kernel routing table, or you manage multiple NICs/routes yourself, and shared gateway mode currently breaks you because we haven't implemented in OVN-Kubernetes what you are doing yourself (or an API for you to do that). I might be missing something critical though, which makes it so that we can't align and fill these needs?
Yes, I think we all intended to make LGW go away at some point and only have SGW (mainly for HW offloading, as I understand it). Unfortunately, a lot of users actually count on having the egress traffic coming from the pod leave via the management port into the kernel before sending it out from the host. This is simply never going to happen with SGW because all traffic from pods will leave via br-ex and won't touch the host stack.
Implementing this in SGW would mean adding (hacking) the necessary routes in `ovn_cluster_router` to ask it to send traffic out into the host. The plan currently is to provide a CRD or something for SGW to allow users to specify routes. Trouble is, this means people need to know the OVN topology :/ Like you said, the main reason is probably secondary networks on the host they want to connect to.
Anyway, until a proper solution in SGW gets shaped up, the other solution was to support LGW, since we can't really regress (but I guess that ship has sailed; luckily we're saved by the hidden mode in the config-map).
The middle-ground solution is to have a toggle in SGW where we allow users to say 1) I want pod egress to go via the host, or 2) I am OK with it not touching the host. The former toggle will now be called "local gateway". At least that's the aim, if I can get the "move lgw closer to sgw" epic done properly. If we can remove all the extra bits from lgw, then it's just a toggle for how we want to direct traffic.
EDIT: well, I am more so referring to the gateway mode flag. ICNI and egress IP are logically incompatible, so customers need to choose one or the other. But the impact of that flag on OVN-Kubernetes as a project is much smaller than supporting both gateway modes.
right.
DisableSNATMultipleGWs isn't to enable/disable ICNI. ICNI is always enabled. The flag simply decides whether ICNI egress traffic will be SNAT'ed to the host's IP or will expose the pod source IPs directly. For our current use case we disable SNAT and expose pod IPs. We have never purposefully exposed this so that someone could use ICNI with SNAT. I would say we can hold off on exposing it for now, but keep the `gatewayConfig` block so we could add it (or other features) later.
> DisableSNATMultipleGWs isn't to enable/disable ICNI. ICNI is always enabled. The flag simply decides whether ICNI egress traffic will be SNAT'ed to the host's IP or will expose the pod source IPs directly. For our current use case we disable SNAT and expose pod IPs. We have never purposefully exposed this so that someone could use ICNI with SNAT. I would say we can hold off on exposing it for now, but keep the `gatewayConfig` block so we could add it (or other features) later.
Oh okay, this is news to me :D I always thought the disableSNAT option was also needed for ICNI to work, but it seems like we first hit the ECMP route and only then SNAT towards the end, so it makes sense that SNAT is not a strict requirement for ICNI to work, just that outgoing traffic won't have the pod IPs as their source. Should we just add the toggle in this PR itself? Asking because in 4.9 we were actually allowing the user to specify this via the hidden config-map. Now, from 4.10, if we suddenly remove that config-map and don't provide a way to specify this option, are we in some way regressing?
I think the purpose of it in 4.9 was to work around the incompatibility with egress IP, right? In 4.10 we are going to make egress gw and egress IP compatible, so that isn't a valid reason to expose the option. If we officially supported multiple exgw then I would agree we should keep the option and expose it via the API, but we do not officially support it for 4.10. Therefore I think we should hold off on exposing it until we officially support it, and ensure that multiple exgw and egress IP are compatible in 4.10 (which you are already working on).
> I think the purpose of it in 4.9 was to work around the incompatibility with egress IP, right? In 4.10 we are going to make egress gw and egress IP compatible, so that isn't a valid reason to expose the option. If we officially supported multiple exgw then I would agree we should keep the option and expose it via the API, but we do not officially support it for 4.10. Therefore I think we should hold off on exposing it until we officially support it, and ensure that multiple exgw and egress IP are compatible in 4.10 (which you are already working on).
yes, true, the intention was to allow exclusiveness and that has been removed with ovn-kubernetes/ovn-kubernetes#2686.
Cool; as long as we agree that we have not agreed to support this feature through the API until it's fully Alpha, I'm good with this decision.
Force-pushed b055424 to 3589e84
Force-pushed 30b701d to 70ecfdd
Force-pushed 70ecfdd to dd9b16e
Let's port the API changes from openshift/api#1033 and use it in CNO to set the gateway mode accordingly.
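A minimal sketch of what consuming this could look like on the CNO side, assuming the boolean `routingViaHost` field settled on later in this thread; the function name and the `main` wrapper are illustrative, not CNO's actual code:

```go
package main

import (
	"fmt"

	operatorv1 "github.com/openshift/api/operator/v1"
)

// ovnGatewayMode derives the gateway mode to render into the ovnkube
// daemonsets. GatewayConfig may be nil on existing clusters, in which
// case we fall back to the shared-mode default.
func ovnGatewayMode(c *operatorv1.OVNKubernetesConfig) string {
	if c != nil && c.GatewayConfig != nil && c.GatewayConfig.RoutingViaHost {
		return "local"
	}
	return "shared"
}

func main() {
	conf := &operatorv1.OVNKubernetesConfig{
		GatewayConfig: &operatorv1.GatewayConfig{RoutingViaHost: true},
	}
	fmt.Println(ovnGatewayMode(conf)) // prints "local"
}
```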
/approve
/hold
Force-pushed dd9b16e to 938166b
Force-pushed 938166b to 99ab9dc
/hold cancel
/lgtm
Thanks @tssurya. Some small comments, but this is very close.
```diff
+	// host before sending it out. If this is not set, traffic
+	// will always egress directly from OVN to outside without
+	// touching the host stack. Setting this to true means
+	// hardware offload will not be supported. Default is false.
```
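Pieced together from the fragments in this thread, the field under review would look roughly like this (the surrounding struct shape and exact doc wording are assumptions):

```go
// GatewayConfig holds the configuration for node gateway options.
type GatewayConfig struct {
	// routingViaHost allows pod egress traffic to exit via the
	// management port into the host before sending it out. If this is
	// not set, traffic will always egress directly from OVN to outside
	// without touching the host stack. Setting this to true means
	// hardware offload will not be supported. Default is false.
	// +kubebuilder:default:=false
	// +optional
	RoutingViaHost bool `json:"routingViaHost,omitempty"`
}
```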
What do we do when we upgrade existing clusters? Do we need to handle that case specifically?
So upgrades are a bit different; currently, with this feature, we'll only be supporting the 4.9 -> 4.10 case. Since we kind of stopped supporting lgw around 4.8, in 4.8/4.9 lgw is currently unsupported, and the same goes for upgrades.
But with this bool, the aim is to allow users to switch between both modes at any given time. Let's say you are on 4.9 and shared gateway mode: if you want to upgrade to 4.10 and choose lgw mode, you'd have to first upgrade to 4.10 shared and then edit this bool in the CRD, and CNO will roll out the ovnkube pods in local gateway mode. Vice versa, if you are coming from 4.9 local gateway and you want to move to 4.10 shared, you'd have to first go to 4.10 local and then switch off this bool (which it is by default, so by default we are always shared). So essentially users should be able to go between the two modes at any given point, since the `--to` upgrade flag doesn't really provide a way to say "get me directly to gateway mode x".
Another point to note is that, AFAIK, the API is the first thing to upgrade and CNO is the third component. So consider the scenario where we are moving from 4.9 local to 4.10 local:
- If the API upgrades first, the CRD changes for the `gatewayConfig` should be available in client-go for the user to do an `oc edit` (I think). So the API is already on 4.10.
- At this point CNO is still on 4.9, and CNO is the last thing to upgrade in the sense that it rolls out all the multus/ovn-k pods and then itself undergoes the upgrade. So it rolls out all the ovnkube daemonsets from 4.9 to 4.10, and since the 4.9 CNO used the hidden config-map to set the gateway mode, it would roll out the new ones respecting that config-map's mode value.
- Once all ovn-k pods are on 4.10, CNO also upgrades, and the new CNO recognizes the new `gatewayConfig` values if they are set (a rough sketch of that precedence follows this comment).

I need to test this process when this bool is merged and basically make CNO control the mode migration process during upgrades.
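A sketch of that upgrade-time precedence, with illustrative names only (the real logic would live in CNO, and the legacy mode would come from the hidden `gateway-mode-config` ConfigMap):

```go
package network

import operatorv1 "github.com/openshift/api/operator/v1"

// resolveGatewayMode picks the mode CNO should roll out: an explicit
// gatewayConfig in the API wins, then the mode a pre-4.10 CNO read
// from the hidden config-map, then the shared-mode default.
func resolveGatewayMode(gw *operatorv1.GatewayConfig, legacyMode string) string {
	switch {
	case gw != nil && gw.RoutingViaHost:
		return "local"
	case gw != nil:
		return "shared"
	case legacyMode != "":
		// Respect the pre-upgrade choice so moving to 4.10 does not
		// silently flip the cluster's gateway mode.
		return legacyMode
	default:
		return "shared"
	}
}
```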
I guess my question is whether we should say something like "the default is false unless upgrading from an earlier version"?
ah :D you make a good point :). Yes, we definitely need to respect the mode that existed in the previous version of the cluster. Let me update that.
Force-pushed 99ab9dc to 92f9f42
Force-pushed 92f9f42 to 65181bd
/lgtm
/test verify
/retest-required Please review the full test history for this PR and help us cut down flakes.
/retest-required
With this we can enable users to specify gateway config options via CRD. For now we are exposing only the mode of egress.
Force-pushed 65181bd to a880d29
/retest-required
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc, trozet, tssurya. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
/test verify
/retest-required
With this PR we can enable users to specify gateway config options via CRD.

/assign @trozet: PTAL at the `gatewayConfig`. Since we need to expose `local` gw mode anyway, I was thinking of importing the other gateway configs from upstream as well, like `disableSNATMultipleGWs`, in a more API-friendly way as opposed to the hardcoded values we set via CNO. WDYT? Are there any concerns in doing so?

Based on the arch call and after talking to trozet, we have decided to add a boolean which can do gateway mode switches for us.
/assign @danwinship, the API expert :)