Endpoint selected by multiple headless services only has 1 dns hostname #124207
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig network
I'll just tack on -- this could totally be expected from statefulsets. It feels like the intersection of a lot of different pieces. If it's expected, I'd say the DNS spec I linked doesn't reflect reality and is overly strict.
/assign @adrianmoisey
@danwinship: GitHub didn't allow me to assign the following users: adrianmoisey. Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
> /assign @adrianmoisey
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign
Hello! So it seems that the issue you're facing here is that DNS records for a StatefulSet's Pods are only created for the Service set in its `serviceName` field. This behaviour is documented here already: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id, specifically the part about the governing Service.
This behaviour is consistent with the code that you pointed out (https://github.com/kubernetes/kubernetes/blob/v1.29.3/staging/src/k8s.io/endpointslice/util/controller_utils.go#L113-L117) in the issue description. Regarding a path forward: is this a case where the DNS spec needs to be updated (https://github.com/kubernetes/dns/blob/master/docs/specification.md#24---records-for-a-headless-service), or is there a use case where the behaviour needs to change?
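For reference, the linked check (as of v1.29.3) boils down to a single condition. A paraphrased sketch, not a verbatim quote of the file:

```go
package endpointsliceutil

import v1 "k8s.io/api/core/v1"

// shouldSetHostname paraphrases the linked controller_utils.go lines: an
// endpoint's hostname is only populated when the Pod's subdomain matches the
// name of the Service currently being reconciled (in the same namespace).
func shouldSetHostname(pod *v1.Pod, svc *v1.Service) bool {
	return len(pod.Spec.Hostname) > 0 &&
		pod.Spec.Subdomain == svc.Name &&
		svc.Namespace == pod.Namespace
}
```

Since the StatefulSet controller sets each Pod's `spec.hostname` to the Pod name and `spec.subdomain` to the StatefulSet's `serviceName`, only the governing Service can pass this check.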
I'd probably say updating the spec is the best path forward.
"need" is a strong word 🙂 but yes, there is a scenario where this is useful for us in practice. admiralty + linkerd multicluster provides a really easy and lightweight multicluster service topology. linkerd multicluster specifically supports per-endpoint routing for statefulsets by hostname. We depend on this today, as we require per-endpoint tracking across clusters (we use endpoints, not k8s services, and this is a workload requirement today for resource allocation/tracking reasons).

Sometimes we want to do something like canary rollouts, or test a minority of endpoints on a new version. Deploying two statefulsets with matching selectors works perfectly for this use case, except that we cannot track the hostnames correctly for both statefulsets across clusters due to the DNS issue.

This is already a fairly niche case and can likely be handled by something like SMI/traffic splitting further up the stack (well, that doesn't solve endpoint tracking, but we can deal with that, I guess), but given that it is a technical spec violation (even if it's longstanding behavior and the spec came later?), I figured I'd raise the issue. Other approaches that can deliver the same functionality either impose much heavier requirements (a flat disjoint network across clusters and identical CNI configuration -- cilium is a good example) or don't seem to exist yet (a global overlay/secondary VPN with minimal complexity; something like kilo is close, we found liqo awfully heavy, and Submariner is also somewhat close conceptually). I imagine changing that behavior would be far more disruptive today.
Part of the problem is that there are a few things conflated here, the Endpoint `hostname` among them. In hindsight, this is a sloppy design, but it comes from a different era, and is hard to undo.
Turns out there are several more things here!
- multiple PTRs for an overlapping service with non-unique subdomains (probably, subdomain should be the determining thing here)
- colliding PTRs
this seems to intersect with #60789
additionally:
I can't speak to intention here, but it seems correct to me? Embedding a PodSpec is all-or-nothing, sadly.
Yeah, this should be updated in the spec.
I had a look at making a PR to update the DNS spec, but as I was digging into it, I realised that the spec seems fine.
This part of the spec seems to be talking about Endpoints, not Pods or StatefulSets. I also see that the StatefulSet docs describe the DNS-creation behaviour.
You're right that this is speaking in terms of endpoints. If you think the spec is clear enough, we should close this. But fresh eyes are worth a fortune, so if you think it could be clarified, I am open to it.
Overall I think it's clear enough. If it started getting into high-level concepts, such as relationships between Deployments and Pods and Endpoints, I think it would be a bit too much. My only suggestion here is to capitalise the references to Endpoint, making it clear that the spec is talking about Endpoint objects. But I'm not sure if that is just an implementation detail that may change in the future.
The use of "endpoint" here is a little vague on purpose. It used to mean "records in the `Endpoints` object".
Yup, that's what I assumed.
OK, thanks! |
What happened?
A pod in multiple endpoint slices selected by a headless service only resolves the hostname for one of the services.
Per the spec (https://github.com/kubernetes/dns/blob/master/docs/specification.md#24---records-for-a-headless-service), each ready endpoint of a headless Service that has a hostname should get a record of the form `<hostname>.<service>.<ns>.svc.<zone>`.
So I feel like I'm missing something?
What did you expect to happen?
Both names resolve (maybe?)
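Concretely, under that reading of the spec, both of the following names should exist. A tiny sketch (service/pod names are the ones from the repro below; the default `cluster.local` zone is assumed):

```go
package main

import "fmt"

// headlessRecord builds the expected DNS name for a ready endpoint of a
// headless Service, following the spec's <hostname>.<service>.<ns>.svc.<zone>
// form.
func headlessRecord(hostname, service, namespace, zone string) string {
	return fmt.Sprintf("%s.%s.%s.svc.%s", hostname, service, namespace, zone)
}

func main() {
	// Both Services select the same pod, so both names should resolve to it.
	for _, svc := range []string{"second-service-name", "shared-service"} {
		fmt.Println(headlessRecord("second-pod", svc, "ns", "cluster.local"))
	}
}
```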
How can we reproduce it (as minimally and precisely as possible)?
now things get interesting:
end result is that only `second-pod.second-service-name.ns.svc.cluster.local` resolves to the pod IP, even though `second-pod.shared-service.ns.svc.cluster.local` should too. My read is that this is due to this logic:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/endpointslice/util/controller_utils.go#L113-L117
https://github.com/coredns/coredns/blob/e3f83cb1fabb9b1cbaffb9df3c4b65476e92c39b/plugin/kubernetes/kubernetes.go#L476-L480
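For concreteness, a runnable sketch of the endpointslice check applied to the objects in this repro (the condition mirrors the linked `controller_utils.go` lines; the structs are reduced to the fields the check reads):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// shouldSetHostname mirrors the linked condition: an endpoint's hostname is
// copied from the Pod only when the Pod's subdomain matches the name of the
// Service being reconciled.
func shouldSetHostname(pod *v1.Pod, svc *v1.Service) bool {
	return len(pod.Spec.Hostname) > 0 &&
		pod.Spec.Subdomain == svc.Name &&
		svc.Namespace == pod.Namespace
}

func main() {
	pod := &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "second-pod", Namespace: "ns"},
		Spec: v1.PodSpec{
			Hostname:  "second-pod",
			Subdomain: "second-service-name", // points at the governing Service
		},
	}
	for _, name := range []string{"second-service-name", "shared-service"} {
		svc := &v1.Service{ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: "ns"}}
		fmt.Printf("%s: hostname set = %v\n", name, shouldSetHostname(pod, svc))
	}
	// Prints true for second-service-name and false for shared-service, so the
	// shared-service endpoint never gets a hostname for CoreDNS to serve.
}
```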
so who is wrong?
Anything else we need to know?
interestingly, on the endpoints with the issue, I see in the EndpointSlice YAML that they don't have the hostname set, due to the k8s code above
(collapsed in the original issue: long `dig` outputs and the full EndpointSlice YAMLs)
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)