New method for providing configurable self-hosted LB/DNS/VIP for on-prem #524
Conversation
cc: @patrickdillon
What a small world! I stumbled on this today right before I planned to write a script to remove all the KNI stuff from the MCO configs, since I have external DNS/LB available. I'm a 👍 on this!
Great to see some work on giving the cluster admin more control over the LB/DNS stack. This is a highly desired feature for deployments on OpenStack.
Some thoughts inline.
As I mentioned in one of my comments, I'm still a little unclear about the plan for supporting external LBs. What I had in mind is not quite what I'm hearing from other people, so it might be good if we could have a sync meeting when I'm back from PTO.
/cc @Miciah

### Open Questions [optional]

- Is `cluster-hosted-net-services-operator` an acceptable name?
cluster-self-hosted-load-balancers-operator ?
In order to run the CHNSO components, the Kubelet on each node must first join the cluster by communicating with the control plane; however, the components provided by the CHNSO are what make that communication possible.
As a result, the following circular dependency must be addressed:
* Kubelet can't talk to the control plane until the CHNSO has started.
* CHNSO can't start until Kubelet can talk to the control plane.
this also applies to kube-controller-manager and kube-scheduler. They both use the API VIP to communicate with the API. In other words: without API VIP no deployments, no replicasets, no pods.
### Open Questions [optional]

- Is `cluster-hosted-net-services-operator` an acceptable name?
s/net/network ?
I know there was some objection to the "cluster-hosted" part of the name previously. What if we replaced that with "internal"? I think the main thing we're trying to communicate here is that on other platforms these are external services provided by the cloud, but for on-prem they need to be hosted internally.
I believe the "services" part was also considered redundant, so would something like "internal-network-operator" be a better option? It's shorter, at least. :-) The one major objection I could see to that is keepalived is providing an externally available VIP, and the internal name might be confusing in that context.
@yboaron, what I'm missing in the enhancement is:

### Suggested design

To support early clustering requirements, Keepalived will continue running as static pods through the MCO; additionally, a new dnsmasq service (also deployed through the MCO) will run on each node, while HAProxy, CoreDNS-MDNS and MDNS-publisher will run as static pods through the new operator.
This contradicts line 64. Who will run keepalived on masters?
Keepalived should be run by the MCO, and according to the description from line 64, services on masters will be managed by both the MCO and the new operator.
It seems OK to me.
Am I missing something?
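For readers less familiar with how the MCO piece of this design works, here is a rough sketch of the kind of MachineConfig that could deliver a keepalived static pod manifest to master nodes. The name, file path and payload below are illustrative only, not the actual manifests shipped by the installer or the MCO.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  # Illustrative name; the real manifests rendered for on-prem platforms differ.
  name: 99-master-keepalived-example
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        # Static pod manifests placed here are started directly by kubelet,
        # without needing a reachable API server, which is what lets keepalived
        # come up before the VIP (and hence the API endpoint) exists.
        - path: /etc/kubernetes/manifests/keepalived.yaml
          mode: 420
          contents:
            source: data:text/plain;charset=utf-8;base64,<base64-encoded-static-pod-manifest>
```

Because static pods are managed by kubelet alone, delivering keepalived this way sidesteps the circular dependency described earlier in the thread.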
#### Self hosted API Loadbalancer

- The current self-hosted LB implementation (based on Keepalived and HAProxy) doesn't support graceful switchover, which means connections will break upon shutdown of the node holding the VIP.
- The self-hosted API loadbalancer will run similarly to the current mode.
why only similarly? Why not equally?
Will change that
`Progressing=False` when the `Infrastructure` resource is a platform
type other than openstack, baremetal, ovirt or vsphere.
- Update `ClusterOperator` DEGRADED field in accordance with the following healthchecks (in case the self-hosted stack is enabled):
  - single node holds the VIP
can there be multiple holding the VIP ? How would you know?
Ohh, my fault, I'll update this line
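To make the quoted status requirements more concrete, the operator's `ClusterOperator` conditions could end up looking roughly like the sketch below when the VIP healthcheck fails. The operator name, reasons and messages are invented for illustration; only the Progressing/Degraded condition types come from the quoted text and the standard ClusterOperator API.

```yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  name: cluster-hosted-net-services   # placeholder; the operator name is still under discussion above
status:
  conditions:
    - type: Progressing
      status: "False"
      reason: AsExpected
    - type: Degraded
      status: "True"
      reason: VIPHealthCheckFailed                            # illustrative reason
      message: The API VIP is not held by exactly one node.   # illustrative message
    - type: Available
      status: "True"
```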
Below is a possible option for a CRD instance of this operator:

```yaml
apiVersion: clusterHostedNetServices.operator.io/v1alpha1
```
operator.openshift.io/v1
OK
spec:
  dns:
    nodesResolution: Enabled
    appsResolution: Enabled
what are apps? Do you mean services?
It's the .apps wildcard DNS record
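Putting the review feedback in this thread together, a revised version of the example CR could look roughly like this. The kind, the `loadBalancer` stanza and the concrete values are illustrative guesses assembled from fragments quoted elsewhere in this PR (the `apiintIpAddress` field and the 192.168.111.5 address appear in the section quoted just below); only the `operator.openshift.io/v1` apiVersion is the reviewer's actual suggestion.

```yaml
apiVersion: operator.openshift.io/v1   # apiVersion suggested above instead of the v1alpha1 group
kind: ClusterHostedNetServices         # illustrative kind; the operator name is still being debated
metadata:
  name: cluster
spec:
  dns:
    nodesResolution: Enabled    # resolve node names via the self-hosted DNS stack
    appsResolution: Enabled     # serve the *.apps wildcard record (not Kubernetes Services)
  loadBalancer:
    apiintIpAddress: 192.168.111.5   # illustrative; matches the api-int example quoted below
```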
- resolve node names and the .apps wildcard record
- resolve api-int to 192.168.111.5.
- Run the self-hosted Loadbalancer for the API only if apiintIpAddress is equal to the API-VIP value provided in the install-config file.
what is that API-VIP value during cluster runtime?
The VIP address provided in the install-config.yaml file
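For context, the API-VIP being referred to is the one set at install time. On the baremetal platform, for example, it is typically specified in `install-config.yaml` along these lines (the addresses are made up for the example):

```yaml
# Excerpt of an install-config.yaml for a bare metal IPI install (illustrative values)
platform:
  baremetal:
    apiVIP: 192.168.111.5      # becomes the api-int address served by the self-hosted LB
    ingressVIP: 192.168.111.4  # VIP used for *.apps ingress traffic
```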
In order to migrate the self-hosted LB to an external load balancer, the admin should:

- Provide a new IP address (!= API-VIP from install-config) pointing to the external load balancer front end.
is there no way to use a DNS name only?
I don't think it's possible, since there are kubeconfig files (e.g. /var/lib/kubelet/kubeconfig) pointing to the https://api-int.ostest.test.metalkube.org:6443 server
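To sketch what such a migration step might look like in practice, the admin-facing change could be an edit to the operator CR that repoints `apiintIpAddress` at the external load balancer's frontend address. The field name is reused from the illustrative design quoted above and is not a finalized API:

```yaml
spec:
  loadBalancer:
    # New address, different from the API-VIP in install-config, fronted by the
    # external load balancer; per the quoted design, the operator would then
    # stop running the self-hosted HAProxy/Keepalived pair for the API.
    apiintIpAddress: 192.168.111.20   # illustrative external LB frontend IP
```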
Removing hold now that processes keep running as static pods. /hold cancel
…r on-prem

In the current implementation, the self-hosted DNS/LB/VIP stack runs under the MCO umbrella by default on all on-prem platforms and there is no option to configure it. Some customers have their own external load balancing and DNS resolution; besides, there are cases in which some of the traffic used to provide these self-hosted services is disallowed in the customer's network. The outcome of such scenarios is that cluster resources are spent on providing unused services. This PR suggests a new configurable method for providing the self-hosted stack.
As this seems to have stalled: I have removed my hold, as the technical concerns about DaemonSet are gone after targeting static pods again. Hence, this is mainly a concern of the MCO team now and of general platform strategy. And speaking about my view of general platform strategy: this enhancement allows customers to use their own LB even on these on-prem platforms. I think this is the wrong direction. We should rather become or stay opinionated where we can. I don't see value for customers in having their own internal API LB. Instead, this increases support costs of the product. Note: I cannot comment on the need for the DNS part of this enhancement. As this is about general direction, I would like to hear the architects' opinion: @smarterclayton @derekwaynecarr
Sorry for the late response (it took time to clarify the detailed requirements around internal API/Ingress traffic with the PMs). After further discussions with PM, it was recently decided that until we have detailed requirements from customers regarding internal API/Ingress traffic, it would be possible for external API/Ingress traffic to run through an external LB, similar to what shiftstack is doing [0], while internal API/Ingress will continue running through the self-hosted stack. As for the on-prem infra DNS components (coredns-mdns and mdns-publisher), there is already ongoing work [1], [2] to replace them with dnsmasq (similar to what is described in the [3] enhancement). So, I assume I should close this PR. [0] https://docs.openshift.com/container-platform/4.7/networking/load-balancing-openstack.html#nw-osp-configuring-external-load-balancer_load-balancing-openstack
/close
@yboaron: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.