KEP-2086: alpha prod readiness review #2441
/assign @johnbelamaric @thockin
@@ -0,0 +1,3 @@
kep-number: 2086
alpha:
PRR is already filled out in the KEP https://github.com/kubernetes/enhancements/blob/699fd3769fd3cd1f6846020675a330cdcee5f5af/keps/sig-network/2086-service-internal-traffic-policy/README.md#production-readiness-review-questionnaire but it needs a review
/hold
@@ -0,0 +1,3 @@
kep-number: 2086
alpha:
  approver: "@johnbelamaric"
I actually claim we should rethink this proposal in light of the new approach that we're proceeding with, described in:
#2434
The arguments described in the proposal (EndpointSlice per node) will no longer hold with that (we can easily imagine "forNode" hint too).
So instead of rushing with this one, I would really like to take a step back and ensure that we can't converge this with #2434
The slices already have a `hostname` field. IIUC this is just a matter of signalling kube-proxy to leverage it; it can be an annotation or an API field, no?
// hostname of this endpoint. This field may be used by consumers of
// endpoints to distinguish endpoints from each other (e.g. in DNS names).
// Multiple endpoints which use the same hostname should be considered
// fungible (e.g. multiple A values in DNS). Must pass DNS Label (RFC 1123)
// validation.
// +optional
Hostname *string
So, the problem is that the API field can be contradictory with the new annotation `topology.kubernetes.io/topology-aware-routing`, right?
Maybe we should add a new value to the annotation to implement local traffic ...
Good point that we don't need any additional hints for that.
Regarding the API: given it's explicitly stated there that "local-traffic will require careful handling" (or something like that), I think we should consider whether it makes sense to unify. Using a single annotation is what I was thinking, but I think we should rely on API approvers (@thockin - back to you) to recommend whether that makes sense.
> Regarding API - given it's explicitly stated there as "local-traffic will require careful handling" (or sth like that), I think we should think if it doesn't make sense to unify (using single annotation is what i was thinking, but I think we should rely on API approvers (@thockin - back to you) to recommend if that makes sense).

There was a lot of back and forth on this and the conclusion IIRC is that "route to node-local" is a distinct and common enough use-case to warrant a different API from topology-aware routing. More on this email thread: https://groups.google.com/g/kubernetes-sig-network/c/wXd1D_fKjqU. But that was also under the assumption that we were going to use EndpointSlice subsetting, and an EndpointSlice per node didn't make sense.

> So, the problem is that the API field can be contradictory with the new annotations `topology.kubernetes.io/topology-aware-routing`, right?

There is an overlap if you consider `Local` or `PreferLocal`. The assumption so far has been that this annotation would only be used with `Cluster` (now `All`). My personal preference would actually be to have `TopologyAware` be another possible value of `internalTrafficPolicy` as opposed to an annotation, but I'll leave that feedback in the other KEP PR.
cc @maplain who is working on the initial implementation for internalTrafficPolicy kubernetes/kubernetes#96600
I think it makes sense to try to combine these. There is a lot of overlap. I prefer `PreferZone` over `Topology` as it leaves room for expansion in the future if we add other topology algorithms.
In regards to "Auto" meaning "decide for me": IIUC that means choosing between `Cluster`, `Local`, `PreferLocal`, and `PreferZone` based on the endpoints available. Would that require an additional field or annotation to state what was chosen? `PreferZone` is the only one that would be detectable, because of the `topologyHints`. How would kube-proxy differentiate between the others?
> I think the easiest thing to understand here might be adding PreferZone

The semantics of PreferNode are "if there is an endpoint on-node, use it. Only
if there is no endpoint should you overflow to another node". I don't think we
want the same for zone (or at least, that's not the heuristic you have
developed and almost certainly not the best for "auto").

> potentially RequireZone as options for InternalTrafficPolicy

If and when we have use-cases.

> EndpointSlice hints are only generated when internalTrafficPolicy == PreferZone

We could always generate hints, but why bother unless we are in a mode where
they matter? We might want external traffic to have topology, too?
Limiting this to internalTrafficPolicy removes a potential link to LB config.
Actually - is that a reason why topology and local-only might not be the same?
What happens if I set `internalTrafficPolicy: Topology` and I take external
traffic? Does that not follow topology? Or does topological subsetting apply
to all "whole cluster" policies?
In the context of external traffic, "Local" has a semantic implication (no
SNAT) as well as a performance/cost implication. We have to respect that. But
the default is "Cluster" and it seems to me that topology should apply in those
cases (cost/perf win but no semantic change).
This point is weighing on me - thoughts?

> In regards to "Auto" meaning "decide for me"- IIUC that means that based on the endpoints available, choosing between Cluster, Local, PreferLocal, and PreferZone. Would that require an additional field or annotation to state what was chosen? PreferZone is the only one that would be detectable because of the topologyHints. How would kube-proxy differentiate between the others?

I'm not sure it's the right approach, but "Auto" could look at the endpoints
and decide "I have at least one endpoint on every node, so I will set a hint to
use the same node" vs "I don't have enough for node, but I can bound by zone" or
even some mixed mode.
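
To make the "Auto" idea above concrete, here is a minimal Go sketch of that decision. Everything in it is hypothetical: the `Endpoint` type, the `chooseAutoHint` function, and the string results are illustrative names, not part of any Kubernetes API. The zone rule used here (every zone that has a node must have at least one endpoint) is an assumption, since the thread does not define what "enough" means.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified stand-in for an EndpointSlice endpoint.
type Endpoint struct {
	Node string
	Zone string
}

// chooseAutoHint sketches the "Auto" heuristic from the discussion:
//   - "node" if every node has at least one endpoint,
//   - "zone" if every zone has at least one endpoint (assumed rule),
//   - "cluster" otherwise.
//
// nodeZones maps each node name to its zone.
func chooseAutoHint(nodeZones map[string]string, endpoints []Endpoint) string {
	if len(nodeZones) == 0 || len(endpoints) == 0 {
		return "cluster"
	}
	nodesWithEndpoint := map[string]bool{}
	zonesWithEndpoint := map[string]bool{}
	for _, ep := range endpoints {
		nodesWithEndpoint[ep.Node] = true
		zonesWithEndpoint[ep.Zone] = true
	}
	allNodes, allZones := true, true
	for node, zone := range nodeZones {
		if !nodesWithEndpoint[node] {
			allNodes = false
		}
		if !zonesWithEndpoint[zone] {
			allZones = false
		}
	}
	switch {
	case allNodes:
		return "node"
	case allZones:
		return "zone"
	default:
		return "cluster"
	}
}

func main() {
	nodes := map[string]string{"n1": "a", "n2": "a", "n3": "b"}
	eps := []Endpoint{{Node: "n1", Zone: "a"}, {Node: "n3", Zone: "b"}}
	// n2 has no endpoint, but both zones do.
	fmt.Println(chooseAutoHint(nodes, eps))
}
```

A "mixed mode", as mentioned above, would need a per-node or per-zone result rather than a single value; this sketch does not attempt that.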
Maybe we should all jump on a call? I don't think we need to block the KEP(s), though.
Updates from Slack thread:
- rename `internalTrafficPolicy` to `trafficPolicy` to reflect increased scope
- if `externalTrafficPolicy=Cluster`, fall back to `trafficPolicy` for external sources
- if `externalTrafficPolicy=Local`, this rule only takes precedence for external sources, with internal traffic following `trafficPolicy`
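
The precedence rules from the Slack thread can be sketched in Go as follows. This is only an illustration under stated assumptions: the `Policy` type, its constants, and `effectivePolicy` are hypothetical names, and the set of values mirrors the ones discussed in this PR rather than a finalized API.

```go
package main

import "fmt"

// Policy is a hypothetical type covering the values discussed in this thread.
type Policy string

const (
	Cluster     Policy = "Cluster"
	Local       Policy = "Local"
	Topology    Policy = "Topology"
	PreferLocal Policy = "PreferLocal"
)

// effectivePolicy returns the policy that applies to one connection:
//   - external source and externalTrafficPolicy=Local: Local takes precedence
//   - external source and externalTrafficPolicy=Cluster: fall back to trafficPolicy
//   - internal source: always follow trafficPolicy
func effectivePolicy(externalSource bool, externalTrafficPolicy, trafficPolicy Policy) Policy {
	if externalSource && externalTrafficPolicy == Local {
		return Local
	}
	return trafficPolicy
}

func main() {
	// External traffic with externalTrafficPolicy=Cluster follows trafficPolicy.
	fmt.Println(effectivePolicy(true, Cluster, Topology))
	// External traffic with externalTrafficPolicy=Local stays Local.
	fmt.Println(effectivePolicy(true, Local, Topology))
	// Internal traffic ignores externalTrafficPolicy.
	fmt.Println(effectivePolicy(false, Local, PreferLocal))
}
```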
@thockin - feel free to hold cancel when you're fine with it; I just wanted to flag the overlap and it seems I succeeded. I will not have time to go back to it by feature freeze, so I'm leaving the resolution to you.
Thanks @wojtek-t, I will also update PRR reviewer to you. I think you still need to approve the PR for changes in
I'll let @wojtek-t take this one...
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
Thanks for the work on this! Just a few nits but nothing that should block this PR from getting in.
InternalTrafficPolicy ServiceInternalTrafficPolicyType `json:"internalTrafficPolicy,omitempty"`
// trafficPolicy denotes if the traffic for a Service should route
// to cluster-wide endpoints or node-local endpoints. "Cluster" routes traffic
// to a Service to all cluster-wide endpoints. "Topology" routes traffic based on
Nit: I'd personally prefer "PreferZone" over "Topology" here but no reason to block this PR.
@@ -34,7 +34,7 @@ milestone:
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
 feature-gates:
-  - name: ServiceInternalTrafficPolicy
+  - name: ServiceITrafficPolicy
Suggested change:
-  - name: ServiceITrafficPolicy
+  - name: ServiceTrafficPolicy
@@ -34,7 +34,7 @@ milestone:
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
 feature-gates:
-  - name: ServiceInternalTrafficPolicy
+  - name: ServiceITrafficPolicy
     components:
       - kube-apiserver
Nit: This will also need to be read from kube-controller-manager now that topology is dependent on this feature gate.
/lgtm
1. Cluster (default): route to all cluster-wide endpoints (or use topology aware subsetting if enabled).
2. Topology: route to endpoints using topology-aware routing. See Topology Aware Hints KEP for more details.
We can argue names in the PR. I need to think about it, but I don't think I like this layout. :) Minor.
* when `internalTrafficPolicy=PreferLocal`, route to endpoints in EndpointSlice that matches the local node's topology (topology defined by `kubernetes.io/hostname`),
* when `trafficPolicy=Cluster`, default to existing behavior today.
* when `trafficPolicy=Topology`, use topology hints from EndpointSlice API.
* when `trafficPolicy=PreferLocal`, route to endpoints in EndpointSlice that matches the local node's topology (topology defined by `kubernetes.io/hostname`),
  fall back to "Cluster" behavior if there are no local endpoints.
We'll have to figure out if the fallback is Cluster or Topology or if we need 2 modes. OK for here, the PR will need to explore that.
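
As a rough illustration of the three `trafficPolicy` behaviors quoted in the diff above, the following Go sketch filters a list of endpoints for one node. It is not kube-proxy code: `Endpoint`, `ZoneHints`, and `selectEndpoints` are hypothetical names. `PreferLocal` falls back to "Cluster" behavior here, as the quoted text says, although the review comment leaves open whether the fallback should be Cluster or Topology. Falling back to all endpoints when no hint matches under `Topology` is an assumption.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified view of an EndpointSlice endpoint.
type Endpoint struct {
	Addr      string
	NodeName  string
	ZoneHints []string // stand-in for the topology hints on the endpoint
}

// selectEndpoints returns the endpoints a proxy on (localNode, localZone)
// would route to under the given trafficPolicy.
func selectEndpoints(policy, localNode, localZone string, eps []Endpoint) []Endpoint {
	switch policy {
	case "Topology":
		var hinted []Endpoint
		for _, ep := range eps {
			for _, zone := range ep.ZoneHints {
				if zone == localZone {
					hinted = append(hinted, ep)
					break
				}
			}
		}
		if len(hinted) > 0 {
			return hinted
		}
		return eps // assumption: no matching hints means use all endpoints
	case "PreferLocal":
		var local []Endpoint
		for _, ep := range eps {
			if ep.NodeName == localNode {
				local = append(local, ep)
			}
		}
		if len(local) > 0 {
			return local
		}
		return eps // fall back to "Cluster" behavior
	default: // "Cluster": existing behavior, all endpoints
		return eps
	}
}

func main() {
	eps := []Endpoint{
		{Addr: "10.0.0.1", NodeName: "n1", ZoneHints: []string{"a"}},
		{Addr: "10.0.0.2", NodeName: "n2", ZoneHints: []string{"b"}},
	}
	for _, ep := range selectEndpoints("PreferLocal", "n1", "a", eps) {
		fmt.Println(ep.Addr)
	}
}
```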
/approve

## Proposal

Introduce a new field in Service `spec.internalTrafficPolicy`. The field will have 3 codified values:
Introduce a new field in Service `spec.trafficPolicy`. The field will have 4 codified values:
1. Cluster (default): route to all cluster-wide endpoints (or use topology aware subsetting if enabled).
Should #2166 be rebased, or could we take the chance to update `Cluster` to `All`?
cc @howardjohn
/approve PRR
@andrewsykim - please apply the remaining minor comments in the follow-up PR.
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andrewsykim, thockin, wojtek-t
The PRR for KEP-2086 is already filled out for alpha but it needs to be approved
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>