Propagate ip #121

Merged: arianvp merged 8 commits into develop from propagate-ip on Sep 23, 2019
Conversation

arianvp (Contributor) commented Aug 5, 2019

This makes sure that nginx-controller has access to the source IP address of a request.

Fixes #119
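
For context, the usual way to get this behaviour is to set the ingress controller Service's externalTrafficPolicy to Local, so kube-proxy stops forwarding traffic to pods on other nodes and the client source IP survives. A minimal sketch of such an override, with value paths assumed from the upstream nginx-ingress chart rather than quoted from this PR's diff:

# sketch only; value paths assumed from the upstream nginx-ingress chart
controller:
  service:
    type: NodePort
    # "Local" preserves the original client IP, but traffic is only served
    # by nodes that actually run an ingress-controller pod
    externalTrafficPolicy: Local

Because Local-policy traffic is only answered on nodes that host a controller pod, the DaemonSet-vs-Deployment placement question discussed below matters directly.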

arianvp (author) commented Aug 5, 2019

I haven't been able to test this yet because I accidentally removed my local wire app config and I forgot what hacks I needed to get it working (something with content security policies that didn't work properly). I will try to set it up again tomorrow.

@arianvp arianvp marked this pull request as ready for review September 5, 2019 09:26
arianvp (author) commented Sep 11, 2019

This works on my installation of wire!

tiago-loureiro (Contributor) left a comment

Is it worth making this the default setting? I don't think it breaks anything. Any thoughts, @jschaul?

Either way, could you add a potential reference here to make it easier for the reader? Thanks!

jschaul (Member) commented Sep 11, 2019

I'm fine making that the default. Do I understand correctly that setting this to DaemonSet ensures there is one pod running on each Kubernetes worker node? Or do we need to take care that the number of replicas matches the number of workers manually?

Another question: in the demo values you set DaemonSet, but in the prod values you don't. Why exactly?

arianvp (author) commented Sep 16, 2019

I was reluctant to change it there because for a "prod" environment running it as a DaemonSet probably isn't the best idea. (If you have 10 worker nodes, you'll have 10 nginx servers).

I saw that in our prod version we have a (commented-out) nodeSelector field, which can be used to force the pods to run on certain nodes. For example, you might have 3 nodes behind your load balancer in your 10-node cluster; a Deployment together with nodeSelector then lets you ensure that each of those 3 nodes runs nginx-ingress.

# these could be lines in our kubespray playbook
kubectl label nodes worker1 wire.com/role=ingress
kubectl label nodes worker2 wire.com/role=ingress
kubectl label nodes worker3 wire.com/role=ingress
# worker4-10 are not labeled

# deployment.yaml
  nodeSelector:
    wire.com/role: ingress

Maybe I should add a comment about that. This seemed like a saner choice than running nginx on every node.

But again, every prod environment is going to be different with regards to load balancing, and it's hard to give a "right" answer for that use case. So we should probably instead link to https://kubernetes.github.io/ingress-nginx/deploy/baremetal/ or to a more "wire-specific" doc on https://docs.wire.com/ that explains what choices a person should make while setting up a load balancer and/or ingress for their k8s cluster.

arianvp (author) commented Sep 16, 2019

> I'm fine making that the default. Do I understand correctly that setting this to DaemonSet ensures there is one pod running on each Kubernetes worker node? Or do we need to take care that the number of replicas matches the number of workers manually?

Your assumption is correct. There is one pod running on each worker node.
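
For reference, the upstream nginx-ingress chart exposes this choice through controller.kind; a minimal sketch of the setting under discussion (the value name is assumed from the upstream chart, not quoted from this PR):

# values override, sketch only
controller:
  # DaemonSet: the scheduler places one controller pod on every eligible node.
  # Deployment (the upstream default): a fixed replicaCount, spread across
  # nodes on a best-effort basis only.
  kind: DaemonSet

With kind: DaemonSet the replica-count question goes away, since the set of pods tracks the set of nodes automatically.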

jschaul (Member) commented Sep 16, 2019

> I was reluctant to change it there because for a "prod" environment running it as a DaemonSet probably isn't the best idea. (If you have 10 worker nodes, you'll have 10 nginx servers.)
>
> I saw that in our prod version we have a (commented-out) nodeSelector field, which can be used to force the pods to run on certain nodes. For example, you might have 3 nodes behind your load balancer in your 10-node cluster; a Deployment together with nodeSelector then lets you ensure that each of those 3 nodes runs nginx-ingress.
>
> # these could be lines in our kubespray playbook
> kubectl label nodes worker1 wire.com/role=ingress
> kubectl label nodes worker2 wire.com/role=ingress
> kubectl label nodes worker3 wire.com/role=ingress
> # worker4-10 are not labeled
>
> # deployment.yaml
>   nodeSelector:
>     wire.com/role: ingress
>
> Maybe I should add a comment about that. This seemed like a saner choice than running nginx on every node.
>
> But again, every prod environment is going to be different with regards to load balancing, and it's hard to give a "right" answer for that use case. So we should probably instead link to https://kubernetes.github.io/ingress-nginx/deploy/baremetal/ or to a more "wire-specific" doc on https://docs.wire.com/ that explains what choices a person should make while setting up a load balancer and/or ingress for their k8s cluster.

We can of course link to, or provide a brief description of, the available choices in the docs. Is there a way to do the kubectl label part in a way that is not ad hoc? At this point in time I would prefer a more automatic way of configuring this (for instance with a DaemonSet) as a default, rather than having to manually know the names of workers, label them, and remember to label another worker again 6 months later when replacing a worker instance. Unless there is a way to do that labeling automatically, what do you think of making the DaemonSet the default and adding docs on how to do a different setup "in case of large clusters" where the extra cpu/memory of an nginx could be a problem?

arianvp (author) commented Sep 16, 2019

> […] Is there a way to do the kubectl label part in a way that is not ad hoc? At this point in time I would prefer a more automatic way of configuring this (for instance with a DaemonSet) as a default, rather than having to manually know the names of workers, label them, and remember to label another worker again 6 months later when replacing a worker instance.

It is exposed as a variable in kubespray so we could do something like:

# inventory.ini
[ingress]
node1
node2
node3
# group_vars.yaml
ingress:
  node_labels:
    wire.com/role: ingress

Then, when replacing nodes, Ansible will do the labelling for us.
This is implemented by passing the list of node labels as a CLI argument to kubelet:
https://github.com/kubernetes-sigs/kubespray/blob/1be788f7854dfb841a9bda4d4183e867ffb33a0d/roles/kubernetes/node/templates/kubelet.env.v1beta1.j2#L34
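
In a conventional kubespray inventory layout, that variable would presumably live in a group_vars file for the [ingress] group, roughly like this (file path and exact structure are an assumption, not taken from this repo):

# inventory/<cluster>/group_vars/ingress.yml  (assumed path)
node_labels:
  wire.com/role: ingress

Kubespray then renders node_labels into kubelet's --node-labels flag (the template linked above), so the label is reapplied automatically whenever a node in that group is provisioned or replaced.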

jschaul (Member) commented Sep 16, 2019

> > […] Is there a way to do the kubectl label part in a way that is not ad hoc? At this point in time I would prefer a more automatic way of configuring this (for instance with a DaemonSet) as a default, rather than having to manually know the names of workers, label them, and remember to label another worker again 6 months later when replacing a worker instance.
>
> It is exposed as a variable in kubespray so we could do something like:
>
> # inventory.ini
> [ingress]
> node1
> node2
> node3
> # group_vars.yaml
> ingress:
>   node_labels:
>     wire.com/role: ingress
>
> Then, when replacing nodes, Ansible will do the labelling for us.
> This is implemented by passing the list of node labels as a CLI argument to kubelet:
> https://github.com/kubernetes-sigs/kubespray/blob/1be788f7854dfb841a9bda4d4183e867ffb33a0d/roles/kubernetes/node/templates/kubelet.env.v1beta1.j2#L34

Sounds like a plan!

arianvp (author) commented Sep 16, 2019

Maybe then just do that by default? In the case that all nodes are in the ingress group, this does the same as it would if we made the deployment a DaemonSet, right?

Well, it sort of does. A DaemonSet guarantees that each node runs at least one pod. With a Deployment you could end up with one ingress node running two pods, another running one pod, and another running zero pods (spreading of replicas for Deployments is best-effort), but I think this is fine for our use case, right?

The only problem I can think of is: we have a dumb load balancer on the outside, like a fixed set of DNS records that doesn't change, and one of the nodes is not running the nginx ingress. Because the traffic policy is Local and we send a request to that node (due to DNS round robin), that request will be dropped, as kube-proxy will not forward it due to this setting. With a DaemonSet we have a slightly higher chance that there is something running on that node (but again, we cannot theoretically assume it).

tiago-loureiro (Contributor) commented

> With a DaemonSet we have a slightly higher chance that there is something running on that node (but again, we cannot theoretically assume it).

Why slightly higher? As said, being a DaemonSet forces it to run on every node unless we run out of resources, no?

I also think it's preferable to have a default that works out of the box, even if it's perhaps a little resource-wasteful; nginx is pretty light on resources in general. You have documented it nicely in the examples and we can add it to our docs as well, but for bare metal, DaemonSet + NodePort seems like a sane thing to do.

arianvp (author) commented Sep 18, 2019

A Deployment has a spread "attempt" but no guarantee. A DaemonSet has a spread guarantee, but that doesn't mean the pods it has running on each node are actually functioning; they might be in a crash loop the entire time due to a wrong nginx config, for example. The point I was trying to make is that you still need something "smart" in front that will not route traffic to dead pods (e.g. a load balancer, or a health-check-aware DNS server), even if nginx is running on each node.

jschaul (Member) commented Sep 18, 2019

Let's go with Tiago's suggestion: by default (in the chart), use a DaemonSet. In the docs, and overridable, allow disabling that and using a tagged-node approach instead. Would you be fine with that, @arianvp?
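
To make the two options concrete, a values sketch of "DaemonSet by default, tagged-node approach as an opt-in override" might look like the following; the value names are again assumed from the upstream nginx-ingress chart, and the commented lines are the override:

controller:
  # default: one ingress-controller pod per node
  kind: DaemonSet
  # opt-in override for larger clusters: run only on nodes labelled for
  # ingress (e.g. via kubespray's node_labels) with a fixed replica count
  # kind: Deployment
  # replicaCount: 3
  # nodeSelector:
  #   wire.com/role: ingress

Uncommenting the override and switching kind back to Deployment gives the tagged-node setup discussed above.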

tiago-loureiro (Contributor) commented

> […] spread guarantee, but that doesn't mean the pods it has running on each node are actually functioning; they might be in a crash loop the entire time

Oh right I see, good point 👍

> The point I was trying to make is that you still need something "smart" in front that will not route traffic to dead pods (e.g. a load balancer, or a health-check-aware DNS server), even if nginx is running on each node.

Yup, clear :)

arianvp (author) commented Sep 18, 2019

Sounds good to me!

jschaul changed the base branch from master to develop on September 18, 2019.
Commits pushed:

* Configure nginx-controller to run on each node, so that we preserve source IP addresses
* TODO: Maybe we should let the production config 'inherit' from the demo config, as now the comments are duplicated, which is prone to error
* Explains all the kinds of configuration that we think are useful, and adds links with extra information
arianvp (author) commented Sep 19, 2019

OK, I have added the new and shiny explanations to the values.yml.
I should probably update https://github.com/wireapp/wire-server-deploy/blob/develop/docs/configuration.md#load-balancer-on-bare-metal-servers to reflect this newly gained knowledge? It's out of date anyway. Or should I move that file to docs.wire.com and then update it there?

jschaul (Member) commented Sep 19, 2019

> I should probably update https://github.com/wireapp/wire-server-deploy/blob/develop/docs/configuration.md#load-balancer-on-bare-metal-servers to reflect this newly gained knowledge? It's out of date anyway. Or should I move that file to docs.wire.com and then update it there?

Either way works; that file has not been migrated to docs.wire.com yet. Whether you first update it and then we migrate it, or the other way round, is up to you.

Review thread on the new values file:

@@ -0,0 +1,80 @@
# Normally, NodePort will listen to traffic on all nodes, and uses kube-proxy

Review comment (Member):

This file now looks great, with all the explanations. Could you make this the default? By "making it the default", I mean actually moving this to charts/nginx-ingress-controller/values.yaml. That way, we will not have any file under values/nginx-ingress-controller at all, and installing with helm install wire/nginx-ingress-controller will provide one working solution out-of-the-box.

arianvp merged commit 442c556 into develop on Sep 23, 2019.
arianvp deleted the propagate-ip branch on September 23, 2019.
tiago-loureiro mentioned this pull request on Oct 8, 2019.
q3k pushed a commit that referenced this pull request on Oct 8, 2019:
* Configure nginx-controller to run on each node, so that we preserve source IP addresses

* Add a heavily commented example of values.yaml for nginx ingress

* Add example of how to set up inventory file

* Make our suggested nginx ingress config the default config
Linked issue (#119): Ensure that we have a way to check a client's IP addresses