Add Traefik Dashboard #797

Merged 59 commits on Sep 13, 2021
Changes from 48 commits
Commits
f48f364 add grafana traefik route (balast, Jun 29, 2021)
8144ee0 grafana working (balast, Jul 14, 2021)
6677c3e prometheus-helm-chart-working (balast, Jul 14, 2021)
62b6be4 initial integration - wip (balast, Jul 14, 2021)
f200ab3 add external-url variable (balast, Jul 14, 2021)
1529ca9 add external-url variable (balast, Jul 14, 2021)
29e83a1 add tls var (balast, Jul 14, 2021)
43fbbc4 add tls var (balast, Jul 14, 2021)
d1d21bc add tls var (balast, Jul 14, 2021)
dc23e26 merge with main (balast, Jul 14, 2021)
06c7cac cluster monitoring docs (balast, Jul 14, 2021)
ddb28e9 fix debug change (balast, Jul 14, 2021)
6e9e4d4 fix formatting, delete ingress (balast, Jul 14, 2021)
b9b0eb5 add monitoring by default, fix routing service name (balast, Jul 14, 2021)
169766d terraform format (balast, Jul 14, 2021)
f23e99d Update monitoring instructions (Adam-D-Lewis, Jul 14, 2021)
c32074d don't include helm chart in repo (balast, Jul 16, 2021)
9da48ee Merge branch 'prometheus_grafana' of github.com:Quansight/qhub into p… (balast, Jul 16, 2021)
47be777 terraform format (balast, Jul 16, 2021)
a851719 terraform format (balast, Jul 16, 2021)
d08db3f add the values file back (balast, Jul 16, 2021)
689ca2a remove values files (Aug 2, 2021)
6d25c36 terraform fmt (Aug 2, 2021)
f30c0bf terraform fmt (Aug 2, 2021)
5130237 Merge branch 'main' into prometheus_grafana (Aug 2, 2021)
59b0ef0 Merge remote-tracking branch 'origin/main' into prometheus_grafana (Aug 10, 2021)
1bf1d19 up minikube memory (Aug 10, 2021)
120321e set CI minikube memory to 6500mb (Aug 10, 2021)
6300040 move kubernetes tests to new file (Aug 12, 2021)
844aa52 use self-hosted action runner (cirun.io) (Aug 12, 2021)
7cb1247 add .cirun.yml (Aug 12, 2021)
063ce2a Misc fixes (aktech, Aug 12, 2021)
439ea44 Install cypress after k8s tests (aktech, Aug 13, 2021)
4b8ad35 use cheapest acceptable DO droplet (Aug 13, 2021)
5d0e799 add release notes (Aug 13, 2021)
33f3440 configure traefik to output metrics to prometheus (Aug 26, 2021)
83e3f53 commented out lines for later (Aug 26, 2021)
2cb8ff2 deleted commented out lines (Aug 26, 2021)
f7074a3 Merge branch 'main' into prometheus_grafana (Aug 26, 2021)
24f3b06 enable prometheus monitoring of traefik (Aug 27, 2021)
bda0631 add some labels to automatically add traefik dashboards (Aug 30, 2021)
9c23ab0 remove unneccessary comment (Aug 30, 2021)
d0a7db2 reformat comment (Aug 30, 2021)
702b252 apply terraform fmt (Aug 30, 2021)
b876b03 apply terraform fmt (Aug 30, 2021)
6a031d6 working example of loading dashboard (Aug 30, 2021)
59b27b0 add traefik dashboard (Aug 31, 2021)
deb462a remove namespace from most graphs (Aug 31, 2021)
b30a54a remove label requirement for scraping (Sep 3, 2021)
5662e19 fix namespace selector in dashboard (Sep 3, 2021)
84ba2a9 trigger CICD (Sep 3, 2021)
e204896 fix render error (Sep 3, 2021)
a02265b fix (Sep 3, 2021)
02d55dd Combine traefik return codes into a single graph (tylerpotts, Sep 6, 2021)
b5f41ef test change to kubernetes tests (Sep 7, 2021)
d2eeedf Merge branch 'prometheus_grafana' of github.com:Quansight/qhub into p… (Sep 7, 2021)
1851697 add wait for cyprus run (Sep 7, 2021)
1e0e6ed Revert "add wait for cyprus run" (Sep 7, 2021)
07ea7f9 Merge branch 'main' into prometheus_grafana (Sep 13, 2021)
Original file line number Diff line number Diff line change
@@ -107,6 +107,38 @@ resource "kubernetes_service" "main" {
}
}

resource "kubernetes_service" "traefik_internal" {
  wait_for_load_balancer = true

  metadata {
    name      = "${var.name}-traefik-internal"
    namespace = var.namespace
    annotations = {
      "prometheus.io/scrape" = "true"
      "prometheus.io/path"   = "/metrics"
      "prometheus.io/port"   = 9000
    }
    labels = {
      "app.kubernetes.io/component" = "traefik-internal-service"
      "app.kubernetes.io/part-of"   = "traefik-ingress"
    }
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "traefik-ingress"
    }

    port {
      name        = "http"
      protocol    = "TCP"
      port        = 9000
      target_port = 9000
    }

    type = "ClusterIP"
  }
}

resource "kubernetes_deployment" "main" {
metadata {
@@ -189,6 +221,8 @@ resource "kubernetes_deployment" "main" {
"--entryPoints.traefik.address=:9000",
"--entrypoints.web.http.redirections.entryPoint.to=websecure",
"--entrypoints.web.http.redirections.entryPoint.scheme=https",
# Enable Prometheus Monitoring of Traefik
"--metrics.prometheus=true",
# Enable debug logging. Useful to work out why something might not be
# working. Fetch logs of the pod.
"--log.level=${var.loglevel}",
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
locals {
  traefik_dashboard = file("${path.module}/traefik.json")
}
Original file line number Diff line number Diff line change
@@ -5,6 +5,37 @@ resource "helm_release" "kube-prometheus-stack-helm-deployment" {
  chart   = "kube-prometheus-stack"
  version = "16.12.0"

  values = [<<EOT
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:

    # This job will scrape from any service with the label app.kubernetes.io/component=traefik-internal-service
    # and the annotation prometheus.io/scrape=true
    - job_name: 'traefik'

      kubernetes_sd_configs:
      - role: service

      relabel_configs:
      - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_component]
        action: keep
        regex: traefik-internal-service
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
@Adam-D-Lewis (Member, Author), Aug 31, 2021

As set up now, this will scrape only k8s services with both labels app.kubernetes.io/component: traefik-internal-service and prometheus.io/scrape: true (see relabel configs).

It is possible to instead scrape all services with the label prometheus.io/scrape: true, but they'll all be included under the same prometheus job. I'm not sure if that's the "right way" to do things in Prometheus given their definition of job which I've pasted below (from here):

In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding to a single process. A collection of instances with the same purpose, a process replicated for scalability or reliability for example, is called a job.

@Adam-D-Lewis (Member, Author), Aug 31, 2021

@costrouc, your thoughts?

Member

As set up now, this will scrape only k8s services with both labels app.kubernetes.io/component: traefik-internal-service and prometheus.io/scrape: true (see relabel configs).

I'm not a fan of this since for each new service we'll have to add a new job for scraping.

It is possible to instead scrape all services with the label prometheus.io/scrape: true, but they'll all be included under the same prometheus job.

This is exactly what I'd like to have. I don't want development on qhub to require additional modifications to the monitoring configuration.
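
A minimal sketch of the two selection policies discussed in this thread, written in Python purely for illustration (it is not part of the PR and does not talk to Kubernetes or Prometheus). The service names and the second service's label value are hypothetical; only the label and annotation keys come from the diff.

```python
# Hypothetical services. Only the label/annotation keys mirror the diff;
# the names and the "something-else" component value are made up.
SERVICES = [
    {
        "name": "example-traefik-internal",
        "labels": {"app.kubernetes.io/component": "traefik-internal-service"},
        "annotations": {"prometheus.io/scrape": "true"},
    },
    {
        "name": "some-new-service",
        "labels": {"app.kubernetes.io/component": "something-else"},
        "annotations": {"prometheus.io/scrape": "true"},
    },
    {
        "name": "unannotated-service",
        "labels": {},
        "annotations": {},
    },
]


def keep_label_and_annotation(svc):
    """Policy in the diff as reviewed: component label AND scrape annotation."""
    return (
        svc["labels"].get("app.kubernetes.io/component") == "traefik-internal-service"
        and svc["annotations"].get("prometheus.io/scrape") == "true"
    )


def keep_annotation_only(svc):
    """Policy requested in the reply: the scrape annotation alone is enough."""
    return svc["annotations"].get("prometheus.io/scrape") == "true"


def selected(policy):
    """Names of the services a given policy would keep as scrape targets."""
    return [svc["name"] for svc in SERVICES if policy(svc)]


print(selected(keep_label_and_annotation))
print(selected(keep_annotation_only))
```

Under the first policy every new service needs its own scrape job; under the second, annotating a service is enough, at the cost of all such services sharing one Prometheus job name.
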

      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
EOT
  ]

  set {
    name  = "grafana.grafana\\.ini.server.domain"
    value = var.external-url
@@ -19,7 +50,27 @@ resource "helm_release" "kube-prometheus-stack-helm-deployment" {
    name  = "grafana.grafana\\.ini.server.server_from_sub_path"
    value = "true"
  }
}
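
The final relabel rule in the `traefik` scrape job rewrites `__address__` so that Prometheus scrapes the port named in the `prometheus.io/port` annotation. The Python sketch below approximates that rule for illustration only; it assumes Prometheus's documented relabeling behavior (source label values joined with the default `;` separator, regex anchored at both ends, target left unchanged when the regex does not match). The function name and the sample addresses are hypothetical.

```python
import re

# Same pattern as the relabel rule. Prometheus anchors relabel regexes at
# both ends, so re.fullmatch is the closest Python analogue.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address, port_annotation):
    """Approximate the `replace` action that targets __address__.

    `address` stands in for the discovered __address__ label and
    `port_annotation` for the value of the prometheus.io/port annotation.
    """
    # Source label values are concatenated with the default separator ';'.
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        # No match: a `replace` action leaves the target label unchanged.
        return address
    # Equivalent of the replacement string "$1:$2".
    return f"{match.group(1)}:{match.group(2)}"


print(rewrite_address("traefik.dev.svc:80", "9000"))
print(rewrite_address("traefik.dev.svc", "9000"))
print(rewrite_address("traefik.dev.svc:80", ""))
```

With the annotation set to 9000, as on the `traefik_internal` service, any discovered port is replaced by 9000; a service without the annotation keeps its discovered address.
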

resource "kubernetes_manifest" "traefik_dashboard_configmap" {
  provider = kubernetes-alpha

  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "grafana-traefik-dashboard"
      namespace = var.namespace
      labels = {
        # grafana_dashboard label needed for grafana to pick it up automatically
        grafana_dashboard = "1"
      }
    }

    data = {
      "traefik-dashboard.json" = local.traefik_dashboard
    }
  }
}

resource "kubernetes_manifest" "grafana-strip-prefix-middleware" {