
Expose prometheus with basic auth #1091

Merged 18 commits · Mar 18, 2022

Conversation

@yuvipanda (Member)

This is the beginning of implementing idea 1 from the
list @GeorgianaElena made in
#328 (comment).

We have one prometheus running per cluster, but manage many clusters.
A single grafana that can connect to all such prometheus clusters
will help with monitoring as well as reporting. So we need to expose
it as securely as possible to the external world, as it can contain
private information.

In this case, we're using https + basic auth provided by
nginx-ingress
(https://kubernetes.github.io/ingress-nginx/examples/auth/basic/)
to safely expose prometheus to the outside world. We can then
use a grafana that knows these usernames / passwords to access each
prometheus instance. Each cluster needs its own username / password
(generated with pwgen 64 1), so users in one cluster cannot access
the prometheus of another cluster.

Ref #328
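
As a quick illustration of the end result (the hostname and credentials below are placeholders, not values from this PR), querying the exposed prometheus then requires the per-cluster credentials:

```bash
# Hypothetical hostname and credentials - substitute the per-cluster
# values from the encrypted support secrets.
# Without -u, nginx-ingress responds with 401 Unauthorized; with valid
# credentials, prometheus serves its query API as usual.
curl -u "<username>:<password>" \
  "https://prometheus.<cluster-domain>/api/v1/query?query=up"
```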

@yuvipanda (Member, Author) commented Mar 12, 2022

I'm manually generating and checking in the username / password for each prometheus instead of autogenerating it in the deployer, as I sense that's the preferred approach now based on conversations with @consideRatio and @sgibson91. It keeps the config explicit.

TODO:

  • Do this for all clusters
  • Document this as part of setting up the cluster

/cc @GeorgianaElena as well.

Except meom-ige and farallon, which don't yet have support
clusters set up
@yuvipanda (Member, Author)

I used a tiny helper script to generate these encrypted secret files:

```bash
#!/bin/bash
# Generate a per-cluster basic auth username / password for prometheus,
# write it to the cluster's support secrets file, and encrypt it in
# place with sops.
set -euo pipefail

F="${1}/enc-support.secret.yaml"
echo "prometheusAuth:" > "$F"
echo "  username: $(pwgen 64 1)" >> "$F"
echo "  password: $(pwgen 64 1)" >> "$F"
sops -i -e "$F"
echo "Done $F"
```

Then ran `find . -type d | xargs -L1 ./gen.bash` to run it for all directories inside `config/clusters`.

However, meom-ige and farallon still don't have a support cluster, so this can't be used there yet.

@yuvipanda (Member, Author)

Note - I've already run deploy-support for all these :)

run `pre-commit run -a`
@consideRatio (Contributor) commented Mar 12, 2022

Hmmm, the support chart could also do the trick the jupyterhub chart does and generate the secrets into a k8s Secret - if prometheus could accept reading the credentials from there.

It may not be what makes sense; I'm just floating the idea to ensure it's considered as an option before we rule it out. Someplace, the grafana server would still need access to those secrets for the other clusters, so it would probably be relevant to keep them available centrally rather than probing multiple clusters whenever they are needed.

@yuvipanda (Member, Author)

@consideRatio yeah, that's the other option - but that would require giving the centralized grafana kubernetes API access to every single cluster, which I would very much prefer not to do.

@consideRatio (Contributor) left a comment

This is a nice beginning!

My understanding summarized

  1. We assume the prometheus subdomain for each cluster points to that cluster's nginx-ingress controller.
  2. Letting the support chart create a k8s Secret to hold the username/password for prometheus access, set via support chart values.
  3. Letting the support chart's dependent prometheus Helm chart create a k8s Ingress resource that references the created k8s Secret with username/password. This reference is an annotation that an nginx-ingress controller understands: we wish to require basic authentication using the username/password provided in the k8s Secret (see the sketch after the action points below).
  4. Later we manually (?) configure a central Grafana instance through its web UI to have many data sources, one for each cluster. Following this, dashboards there can inspect any given cluster, or some dashboards can summarize the state of all clusters.

A few action points

  • Add a k8s Secret template in the support chart
  • Ensure the relevant new sub-domain names get CNAME entries in the DNS servers we control; I figure they should point to the primary domain name we use, which in turn points to the ingress controller's IP.
  • Point out a specific Grafana instance to be the central instance we use (the one in the 2i2c cluster, right?)
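
For concreteness, here is a minimal sketch of the mechanism in points 2-3, following the nginx-ingress basic auth example linked above. All resource names and the hostname are hypothetical, not the actual support chart templates; nginx-ingress expects the Secret to carry an htpasswd-formatted `auth` key.

```yaml
# Hypothetical names throughout; the real templates live in the support chart.
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-basic-auth
type: Opaque
stringData:
  # htpasswd-formatted "<username>:<hashed-password>", e.g. generated with:
  #   htpasswd -nb <username> <password>
  auth: <htpasswd output>
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  annotations:
    # The annotations nginx-ingress understands for basic auth:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication required"
spec:
  rules:
    - host: prometheus.<cluster-domain>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: support-prometheus-server  # hypothetical service name
                port:
                  number: 80
```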

Review threads (resolved): deployer/cluster.py · helm-charts/support/values.schema.yaml · helm-charts/support/values.yaml
yuvipanda and others added 5 commits March 14, 2022 13:42
Also actually add the secrets file
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
@consideRatio's reasoning in 2i2c-org#1091 (comment)
is pretty flawless.

Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
@consideRatio (Contributor) left a comment

This LGTM - if needed, let's iterate from here so we don't slow down too much!

I think the key point now in relation to this PR is that it comes with the creation of additional domain names etc., and those aspects may need documentation updates.

@yuvipanda (Member, Author)

I've added docs on what needs to happen with DNS, and I think it's already happened for all the clusters mentioned here.

@consideRatio (Contributor) left a comment

With these comments resolved, this LGTM! Nice work on this @yuvipanda!

Review threads (resolved): docs/howto/operate/grafana.md · helm-charts/support/values.yaml · deployer/cluster.py
@yuvipanda (Member, Author)

I think what's left here is to automate adding all of these as data sources to one centralized grafana.

@yuvipanda (Member, Author)

I think this is ready to go!

@yuvipanda (Member, Author)

Still need to write code that'll populate the central grafana with datasources, but let's get this merged in the meantime.
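
For reference, a rough sketch of what that automation could look like, using Grafana's datasource HTTP API. The URLs, token, and names here are placeholders; this is not the deployer code that eventually landed:

```bash
# Hypothetical values - substitute the central grafana URL / API token and
# the per-cluster prometheus credentials from the encrypted support secrets.
GRAFANA_URL="https://grafana.<central-domain>"
GRAFANA_TOKEN="<grafana-api-token>"

# Register one prometheus-type datasource per cluster, authenticating to
# the exposed prometheus with its per-cluster basic auth credentials.
curl -X POST "${GRAFANA_URL}/api/datasources" \
  -H "Authorization: Bearer ${GRAFANA_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "<cluster-name>",
    "type": "prometheus",
    "access": "proxy",
    "url": "https://prometheus.<cluster-domain>",
    "basicAuth": true,
    "basicAuthUser": "<prometheus-username>",
    "secureJsonData": {"basicAuthPassword": "<prometheus-password>"}
  }'
```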

@damianavila (Contributor)

> Still need to write code that'll populate central grafana with datasources

I think the upcoming work is actually described here: #328 (comment), let me know if you disagree.

Btw, thanks for all this work, @yuvipanda!!
