-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support for ingress-nginx + cert-manager #10
feat: add support for ingress-nginx + cert-manager #10
Conversation
Thanks for the pull request, @MoisesGSalas! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green. |
tutor-multi-chart/values.yaml
Outdated
|
||
# Configuration for the Traefik load balancer: | ||
traefik: | ||
# By default use ingress-nginx + cert-manager. If you want to use the Traefik controller | ||
# instead remember to disable ingress-nginx and cert manager. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be the case that nobody actually wants to use Traefik. It's what I'm personally familiar with, so I used it to get a proof of concept built quickly, but it may make sense to just remove it as an option to simplify our approach here. I think most people will be happy with nginx. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm indeed most familiar with nginx and as far as I know the same applies to most people involved in this project. Unless someone is well versed with Traefik I agree with dropping it to make things easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed all the references to traefik
# Open edX installation to add to your ingress routes. This especially evident | ||
# when installing additional plugins such as tutor-ecommerce or tutor-minio. | ||
# The workaround is to manually add a list of hosts to be routed to the caddy | ||
# instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a bit of an issue. I'd like to see a more automatic solution in the future.
In the future, the Ingress
objects could be defined in each tutor plugin, e.g. https://github.com/overhangio/tutor-ecommerce/blob/master/tutorecommerce/patches/k8s-services instead of requiring this separate plugin. If they use an ingress class name like tutor-multi-nginx
, they'll be ignored in the "normal" Tutor k8s environment (which has no ingress controller) but they'll work automatically in our helm chart with its central load balancer. So I think that would be a nice solution. Otherwise, what you have here seems like a good approach for now.
Thoughts?
CC @regisb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that could work, or maybe a different kind of patch like k8s-ingress
or k8s-ingress-route
.
I think allowing some kind of code introspection between plugins can open interesting possibilities, but this is mostly a passing thought I had at the moment. I need to explore the plugin API more in depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we leverage the Caddy API? On a local installation of Open edX this is what I'm getting with curl localhost:2019/config/
inside the caddy container:
{
"apps" : {
"http" : {
"servers" : {
"srv0" : {
"listen" : [
":80"
],
"logs" : {
"logger_names" : {
"apps.local.overhang.io" : "log4",
"discovery.local.overhang.io" : "log2",
"ecommerce.local.overhang.io" : "log3",
"local.overhang.io" : "log0",
"preview.local.overhang.io" : "log0",
"studio.local.overhang.io" : "log1"
}
},
"routes" : [
{
"handle" : [
{
"handler" : "subroute",
"routes" : [
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 10000000
},
{
"encodings" : {
"gzip" : {}
},
"handler" : "encode",
"prefer" : [
"gzip"
]
},
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"X-Forwarded-Port" : [
"80"
]
}
}
},
"upstreams" : [
{
"dial" : "discovery:8000"
}
]
}
]
}
]
}
],
"match" : [
{
"host" : [
"discovery.local.overhang.io"
]
}
],
"terminal" : true
},
{
"handle" : [
{
"handler" : "subroute",
"routes" : [
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 10000000
},
{
"encodings" : {
"gzip" : {}
},
"handler" : "encode",
"prefer" : [
"gzip"
]
},
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"X-Forwarded-Port" : [
"80"
]
}
}
},
"upstreams" : [
{
"dial" : "ecommerce:8000"
}
]
}
]
}
]
}
],
"match" : [
{
"host" : [
"ecommerce.local.overhang.io"
]
}
],
"terminal" : true
},
{
"handle" : [
{
"handler" : "subroute",
"routes" : [
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 1000000
}
],
"match" : [
{
"path" : [
"/api/profile_images/*/*/upload"
]
}
]
},
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 4000000
}
]
},
{
"group" : "group2",
"handle" : [
{
"handler" : "rewrite",
"uri" : "/theming/asset/images/favicon.ico"
}
],
"match" : [
{
"path_regexp" : {
"pattern" : "^/favicon.ico$"
}
}
]
},
{
"handle" : [
{
"encodings" : {
"gzip" : {}
},
"handler" : "encode",
"prefer" : [
"gzip"
]
},
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"X-Forwarded-Port" : [
"80"
]
}
}
},
"upstreams" : [
{
"dial" : "lms:8000"
}
]
}
]
}
]
}
],
"match" : [
{
"host" : [
"local.overhang.io",
"preview.local.overhang.io"
]
}
],
"terminal" : true
},
{
"handle" : [
{
"handler" : "subroute",
"routes" : [
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 250000000
}
]
},
{
"group" : "group3",
"handle" : [
{
"handler" : "rewrite",
"uri" : "/theming/asset/images/favicon.ico"
}
],
"match" : [
{
"path_regexp" : {
"pattern" : "^/favicon.ico$"
}
}
]
},
{
"handle" : [
{
"encodings" : {
"gzip" : {}
},
"handler" : "encode",
"prefer" : [
"gzip"
]
},
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"X-Forwarded-Port" : [
"80"
]
}
}
},
"upstreams" : [
{
"dial" : "cms:8000"
}
]
}
]
}
]
}
],
"match" : [
{
"host" : [
"studio.local.overhang.io"
]
}
],
"terminal" : true
},
{
"handle" : [
{
"handler" : "subroute",
"routes" : [
{
"handle" : [
{
"handler" : "request_body",
"max_size" : 2000000
},
{
"encodings" : {
"gzip" : {}
},
"handler" : "encode",
"prefer" : [
"gzip"
]
}
]
},
{
"handle" : [
{
"handler" : "static_response",
"status_code" : 204
}
],
"match" : [
{
"path" : [
"/"
]
}
]
},
{
"handle" : [
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"Host" : [
"local.overhang.io"
]
}
}
},
"upstreams" : [
{
"dial" : "lms:8000"
}
]
}
],
"match" : [
{
"path" : [
"/api/mfe_config/v1*"
]
}
]
},
{
"handle" : [
{
"handler" : "reverse_proxy",
"headers" : {
"request" : {
"set" : {
"X-Forwarded-Port" : [
"80"
]
}
}
},
"upstreams" : [
{
"dial" : "mfe:8002"
}
]
}
]
}
]
}
],
"match" : [
{
"host" : [
"apps.local.overhang.io"
]
}
],
"terminal" : true
}
]
}
}
}
},
"logging" : {
"logs" : {
"default" : {
"exclude" : [
"http.log.access.log2",
"http.log.access.log3",
"http.log.access.log0",
"http.log.access.log1",
"http.log.access.log4"
]
},
"log0" : {
"encoder" : {
"fields" : {
"common_log" : {
"filter" : "delete"
},
"request>headers" : {
"filter" : "delete"
},
"resp_headers" : {
"filter" : "delete"
},
"tls" : {
"filter" : "delete"
}
},
"format" : "filter",
"wrap" : {
"format" : "json"
}
},
"include" : [
"http.log.access.log0"
],
"writer" : {
"output" : "stdout"
}
},
"log1" : {
"encoder" : {
"fields" : {
"common_log" : {
"filter" : "delete"
},
"request>headers" : {
"filter" : "delete"
},
"resp_headers" : {
"filter" : "delete"
},
"tls" : {
"filter" : "delete"
}
},
"format" : "filter",
"wrap" : {
"format" : "json"
}
},
"include" : [
"http.log.access.log1"
],
"writer" : {
"output" : "stdout"
}
},
"log2" : {
"encoder" : {
"fields" : {
"common_log" : {
"filter" : "delete"
},
"request>headers" : {
"filter" : "delete"
},
"resp_headers" : {
"filter" : "delete"
},
"tls" : {
"filter" : "delete"
}
},
"format" : "filter",
"wrap" : {
"format" : "json"
}
},
"include" : [
"http.log.access.log2"
],
"writer" : {
"output" : "stdout"
}
},
"log3" : {
"encoder" : {
"fields" : {
"common_log" : {
"filter" : "delete"
},
"request>headers" : {
"filter" : "delete"
},
"resp_headers" : {
"filter" : "delete"
},
"tls" : {
"filter" : "delete"
}
},
"format" : "filter",
"wrap" : {
"format" : "json"
}
},
"include" : [
"http.log.access.log3"
],
"writer" : {
"output" : "stdout"
}
},
"log4" : {
"encoder" : {
"fields" : {
"common_log" : {
"filter" : "delete"
},
"request>headers" : {
"filter" : "delete"
},
"resp_headers" : {
"filter" : "delete"
},
"tls" : {
"filter" : "delete"
}
},
"format" : "filter",
"wrap" : {
"format" : "json"
}
},
"include" : [
"http.log.access.log4"
],
"writer" : {
"output" : "stdout"
}
}
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @regisb . It seems like the Caddy API has the data we need, but I'm not sure how it works with the lifecycle of Open edX with Tutor. The Caddy API won't return data until after the instance's Caddy container is already deployed and configured. But the k8s Ingress
object is generated and deployed at the same time. In other words we can't query the Caddy API when Tutor is generating the config with tutor config save
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you really don't want to query the Caddy API at configuration time. If we really to do some dynamic introspection, then it has to happen either at init-time or in a dedicated do ...
command.
# for per-instance load balancing. Eventually it would be nice to get rid of | ||
# it completely as it's not needed in this setup, but that would break | ||
# compatibility with existing Tutor k8s plugins that configure Caddy via | ||
# patches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit/optional: we can remove this comment that says "it would be nice to get rid of it completely" - as explained by Regis in this thread it is actually performing some useful functions beyond just basic routing of HTTP requests, and it's extremely lightweight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the comment, but I want to revisit the Caddy discussion in the forum at a later date because we recently faced a few issues regarding a high rate of 502 errors when running a large a mount of lms pods.
|
||
cert-manager: | ||
# Use cert-manager as a default certificate controller. | ||
enabled: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cert-manager docs say:
Be sure never to embed cert-manager as a sub-chart of other Helm charts; cert-manager manages non-namespaced resources in your cluster and care must be taken to ensure that it is installed exactly once.
But they also provide clear instructions on how to install it as a sub-chart so I guess it's fine?
I think we should probably just put something in the README which says "If you already have cert-manager on your cluster, be sure to set cert-manager.enabled: false
for this chart, to avoid any conflicts.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first glance seems like the main concern is losing track of whether you have cert-manager or not and installing it twice by mistake. I added a section in the README that mentions this.
This actually got me thinking about having a large amount of different resources living under the same namespace and if it's that what we want (autoscaling, monitoring, etc). The cert-manager docs mention that we could install each dependency in its own namespace if each chart allows it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually got me thinking about having a large amount of different resources living under the same namespace and if it's that what we want (autoscaling, monitoring, etc). The cert-manager docs mention that we could install each dependency in its own namespace if each chart allows it.
I'm open to either approach. For now I think one namespace is easier because of how helm works but it may not be the best.
Awesome work here @MoisesGSalas ! I read through the code and left a few comments/questions but I think it looks great. I'll test it next week and then give it my 👍🏻 |
Thanks a lot for the comments @bradenmacdonald. I tried to address all of them, tell me your thoughts whenever you have time. |
Thanks @MoisesGSalas! Sorry for the delay - I'll try to test and approve this soon :) |
values-example.yaml
Outdated
@@ -1,5 +1,7 @@ | |||
# Set your email address here so auto-generated HTTPS certs will work: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move this comment down to be right above where the email is set.
I'm trying to test this now on a fresh kubernetes cluster but getting an error:
It took me a while but here's what I figured out: The Helm docs say that charts should put their CRDs into a special Some possible solutions to this could be:
|
Sorry for the late response @bradenmacdonald, I'm in favor managing the CRDs for cert-manager outside the chart especially if we consider that doing that is an actual use case that is mentioned in the cert-manager docs. I've updated the Readme to address this caveat. |
Hey @bradenmacdonald, do you have any updates on when you'll be able to continue reviewing here? |
@MoisesGSalas @itsjeyd So sorry for the delay. I have finally been able to test this again this week and the CRDs are no longer an issue but something doesn't seem to be working. My new test instance is returning a 404: http://braden-harmony-test01-01.opencraft.com/ It will take me a bit more time to figure out what the issue is. Hopefully tomorrow. |
Don't worry @bradenmacdonald, I appreciate your help. I think the error might be related with the ingressClass name. I threw a default value for |
@@ -45,7 +45,7 @@ kind: Ingress | |||
metadata: | |||
name: cluster-echo-test | |||
spec: | |||
ingressClassName: tutor-multi-traefik | |||
ingressClassName: {{ ternary (index .Values "ingress-nginx" "controller" "ingressClass") "tutor-multi-traefik" (index .Values "ingress-nginx" "enabled") }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove tutor-multi-traefik
from here - I don't think it's needed anymore.
Ah, I think you're right @MoisesGSalas. Actually there are two issues: first, you'll need to update this PR to use a consistent ingress class name in both places. I can see the ingress controller is using For now I manually changed the Ingress object's ingressClassName to |
I've adjusted the default values to match, I think it should be good now. I created a new cluster and was able to deploy two instances. I will be on holidays until later next week, so I apologize in advance for not being able to answer the PR until then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
- I tested this: deployed an instance onto a DigitalOcean k8s cluster
- I read through the code
- I checked for accessibility issues: n/a
- Includes documentation
Hmm, this is ready to merge but it seems I don't actually have merge rights here. @felipemontoya do you? |
Nop, I also don't. Im waiting on openedx/axim-engineering#665 |
Ah, right. Thanks for the link. |
@bradenmacdonald you should have permissions now. |
@MoisesGSalas 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future. |
Description
This is a first attempt to add support for
ingress-nginx
andcert-manager
in response to #1.For the most part it works the same as before, cert-manager and ingress-nginx were added as dependencies and are installed by default, with the possibility of simply using traefik.
The the helm chart now provisions a
clusterissuer
to provide the certificates used globally.and the plugin includes a list of additional hosts to be routed to the caddy instance.
There doesn't seem to be a nice way to discover the public hosts used by a tutor installation, for example, in addition to LMS, CMS, MFE and PREVIEW one may need to route traffic from ECOMMERCE, DISCOVERY, MINIO or any arbitrary plugin to caddy. I tried to think of a nice way to do this but wasn't able to come up with anything. I wonder what could be a nice way for plugins to announce their hosts.