
Load balancing #45

Open
zonca opened this issue Feb 8, 2021 · 10 comments

@zonca
Owner

zonca commented Feb 8, 2021

The latest version of Kubespray has support for MetalLB, a load-balancer implementation for Kubernetes that provides a load-balancing service without relying on a cloud provider.

This would make the deployment more resilient: currently the master node running the NGINX ingress is a single point of failure.

For larger deployments that need to be more resilient, this would be really nice to implement.

References:

@zonca zonca self-assigned this Feb 8, 2021
@zonca
Owner Author

zonca commented Feb 8, 2021

I don't understand how MetalLB can work on OpenStack. It needs to be given a range of IPs that it can assign to Services, but if I point a bunch of OpenStack floating IPs at one node and MetalLB hands them out to Services, the problem is that if that node goes down there is no way to tell OpenStack to redirect those IPs to another node.

So it seems MetalLB is only a convenience for automatically publishing Services externally, but we don't gain any resiliency.
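
For reference, this is the kind of address pool MetalLB expects in layer-2 mode (pre-0.13 ConfigMap syntax; the range below is just an example): on OpenStack those addresses would still have to be routed to whichever node currently answers for them, which is exactly the failover step OpenStack does not handle by itself.

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.0.0.240-10.0.0.250   # example range reserved for Services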

@zonca
Owner Author

zonca commented Feb 8, 2021

So I would return to my initial idea of having one machine outside of Kubernetes act as a load balancer: a single node running only HAProxy, so that it is very unlikely to fail (in principle we could have round-robin DNS across a couple of these).

This HAProxy node points to the Kubernetes cluster deployed by Kubespray, which has at least 2 master nodes. If one of the 2 master nodes fails, traffic is routed to the second master node.

So we no longer have a single point of failure (except the HAProxy node itself, which is really lightweight).

The HAProxy node should be on the same local network as Kubernetes and be able to forward requests directly to the internal Service IP, so the internal networking can then be handled by k8s.
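
A minimal sketch of the HAProxy side, assuming the ingress is reachable on the two master nodes via a hypothetical NodePort 30080 (names and IPs are placeholders):

# /etc/haproxy/haproxy.cfg (excerpt)
frontend http_in
    bind *:80
    mode tcp
    default_backend k8s_masters

backend k8s_masters
    mode tcp
    balance roundrobin
    # placeholder private IPs of the two master nodes; a node is dropped
    # from the rotation when its health check fails
    server master1 10.0.0.11:30080 check
    server master2 10.0.0.12:30080 check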

@zonca
Owner Author

zonca commented Feb 8, 2021

For HTTPS we can start with pass-through and have the NGINX ingress and the Kubernetes Let's Encrypt service handle it.
But if we are only hosting one service (i.e. JupyterHub) it is probably easier to do SSL termination with Let's Encrypt at the HAProxy node.
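
A minimal sketch of the termination option on the HAProxy node, assuming certbot in standalone mode (paths, names and the NodePort are placeholders):

# obtain a certificate; port 80 must be free while certbot runs
certbot certonly --standalone -d <fqdn>

# HAProxy expects certificate and key concatenated in a single PEM file
mkdir -p /etc/haproxy/certs
cat /etc/letsencrypt/live/<fqdn>/fullchain.pem \
    /etc/letsencrypt/live/<fqdn>/privkey.pem > /etc/haproxy/certs/<fqdn>.pem

# haproxy.cfg: terminate TLS here and forward plain HTTP to the cluster
frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/<fqdn>.pem
    mode http
    default_backend k8s_masters_http

backend k8s_masters_http
    mode http
    balance roundrobin
    # same hypothetical NodePort as in the sketch above
    server master1 10.0.0.11:30080 check
    server master2 10.0.0.12:30080 check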

@zonca
Owner Author

zonca commented Feb 8, 2021

@julienchastang do you think a load balancer would be useful for your deployments? or not worth the extra effort?

@julienchastang
Contributor

In practice, the problems I have encountered have mostly been with volumes:

  • volumes "stuck" in reserved (described previously).
  • volumes slow to attach

There have also been LDAP problems at TACC affecting IU.

That is what I need the most help with. Not sure if load balancers address those problems.

@zonca
Owner Author

zonca commented Feb 8, 2021

thanks @julienchastang, no, a load balancer would not help with those issues

@sebastian-luna-valero

Hi,

As mentioned in #43 I deploy a separate VM to act as a reverse proxy for the JupyterHub deployment on Kubernetes with Magnum on the same OpenStack project/VLAN. At the moment this is the simplest reverse proxy configuration, and I haven't configured load balancing yet.

On the reverse proxy VM I follow the official documentation:

https://jupyterhub.readthedocs.io/en/stable/reference/config-proxy.html

I deploy a VM with Ubuntu 20.04 and Apache with:

# enable the modules required for the reverse proxy
# (rewrite for the websocket rules, proxy_wstunnel for the ws:// backend)
a2enmod rewrite proxy proxy_http proxy_wstunnel
 
# edit the default virtual host
vi /etc/apache2/sites-available/000-default.conf
 
# adding, inside the <VirtualHost> block:
  ServerName <fqdn>
  RewriteEngine On
  RewriteCond %{HTTP:Connection} Upgrade [NC]
  RewriteCond %{HTTP:Upgrade} websocket [NC]
  RewriteRule /<subpath>/(.*) ws://<private-ip-on-kube-master>:30001/<subpath>/$1 [NE,P,L]
  RewriteRule /<subpath>/(.*) http://<private-ip-on-kube-master>:30001/<subpath>/$1 [NE,P,L]
  ProxyPreserveHost on
  ProxyPass /<subpath>/ http://<private-ip-on-kube-master>:30001/<subpath>/
  ProxyPassReverse /<subpath>/  http://<private-ip-on-kube-master>:30001/<subpath>/
 
# restart apache
systemctl restart apache2

On the Kubernetes cluster, the values.yaml is configured with:

proxy:
  service:
    type: NodePort
    nodePorts:
      http: 30001

hub:
  baseUrl: /<subpath>

A floating IP is assigned to the reverse proxy VM, and an SSL certificate has also been configured.

In terms of load balancing, I would follow the steps in http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html but this would be a static configuration that does not account for autoscaling in Kubernetes.
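
For reference, a static balancer along those lines would look roughly like this (hypothetical node IPs, reusing the NodePort from the configuration above; requires the proxy_balancer and lbmethod_byrequests modules, e.g. a2enmod proxy_balancer lbmethod_byrequests):

<Proxy "balancer://jupyterhub">
    BalancerMember "http://<private-ip-node-1>:30001"
    BalancerMember "http://<private-ip-node-2>:30001"
</Proxy>
ProxyPass        /<subpath>/ "balancer://jupyterhub/<subpath>/"
ProxyPassReverse /<subpath>/ "balancer://jupyterhub/<subpath>/"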

I hope that helps.

Best regards,
Sebastian

@zonca
Owner Author

zonca commented Feb 19, 2021

thanks @sebastian-luna-valero, the pointer to the JupyterHub docs is really useful!
So I think it is not worth pursuing this now; I don't have any deployment large enough to need 2 master nodes.
But I think it won't be too difficult to implement if needed.

Another case where this could be useful, in theory, is if we want to host both JupyterHub and another website on the same domain, so the reverse proxy could serve both. However, in that case it seems easier to just have 2 subdomains, so that one points only to JupyterHub/Kubernetes.

@zonca
Owner Author

zonca commented Feb 22, 2023

we might have Octavia deployed to Jetstream 2 soon!

@zonca
Owner Author

zonca commented Oct 13, 2023

Octavia has been available for a while, see https://docs.jetstream-cloud.org/general/octavia/
The next step is to modify the recipe to put a load balancer in front of the master node instead of using the IP of the master node directly.
This would also allow us to deploy multiple master nodes instead of just 1.
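
A rough sketch of the Octavia CLI steps (load balancer name, subnet, IPs and the NodePort are placeholders to adapt to the recipe):

# create the load balancer on the cluster's private subnet
openstack loadbalancer create --name jupyterhub-lb --vip-subnet-id <subnet-id>

# listener + pool forwarding TCP 443 to the ingress NodePort on each master
openstack loadbalancer listener create --name https --protocol TCP \
    --protocol-port 443 jupyterhub-lb
openstack loadbalancer pool create --name masters --lb-algorithm ROUND_ROBIN \
    --listener https --protocol TCP
openstack loadbalancer member create --subnet-id <subnet-id> \
    --address <master-1-private-ip> --protocol-port <ingress-nodeport> masters
openstack loadbalancer member create --subnet-id <subnet-id> \
    --address <master-2-private-ip> --protocol-port <ingress-nodeport> masters

# associate a floating IP with the load balancer's VIP port
openstack floating ip set --port <lb-vip-port-id> <floating-ip>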
