[Bug] Since 5.13.0 - Grafana Operator cannot manage TLS protected internal Grafanas #1675

Open
diranged opened this issue Sep 13, 2024 · 2 comments · May be fixed by #1690
Labels: bug (Something isn't working), triage/accepted (Indicates an issue or PR is ready to be actively worked on)

Comments

@diranged

Describe the bug
In our environment, we pass TLS certs to our Grafana service so that traffic is encrypted end to end. Since #1628 shipped in 5.13.0, the Grafana Operator can no longer connect to our Grafana instances. We get the following reconciliation errors:

    "status": {
        "hash": "9250f003846c19a973bd035ce560da23aaad2fdc855a951d63c99d75b7c40a03",
        "lastMessage": "fetching data sources: Get \"https://grafana-app-service.grafana:3000/api/datasources\": tls: failed to verify certificate: x509: certificate signed by unknown authority",
        "lastResync": "2024-09-12T17:50:18Z",
        "uid": "loki"
    }

Logs:

2024-09-13T20:58:43Z	ERROR	GrafanaDatasourceReconciler	error reconciling datasource	{"controller": "grafanadatasource", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaDatasource", "GrafanaDatasource": {"name":"grafana-app-root","namespace":"grafana"}, "namespace": "grafana", "name": "grafana-app-root", "reconcileID": "c1671f2b-3f25-436b-a479-7b2fe96edbdf", "datasource": "grafana-app-root", "grafana": "grafana-app", "error": "fetching data sources: Get \"https://grafana-app-service.grafana:3000/api/datasources\": tls: failed to verify certificate: x509: certificate signed by unknown authority"}
github.com/grafana/grafana-operator/v5/controllers.(*GrafanaDatasourceReconciler).Reconcile
	github.com/grafana/grafana-operator/v5/controllers/datasource_controller.go:252
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:222

Version
v5.13.0

To Reproduce

Create a Grafana with a TLS config...

config:
...
  server:
    ca_cert: /certs/ca.crt
    cert_file: /certs/tls.crt
    cert_key: /certs/tls.key
    domain: ....com
    protocol: https
    root_url: https://....com
deployment:
  spec:
    template:
      spec:
        containers:
          - volumeMounts:
              - mountPath: /certs/ca.crt
                name: ca
                readOnly: true
                subPath: ca.crt
              - mountPath: /certs/tls.crt
                name: tls
                readOnly: true
                subPath: tls.crt
              - mountPath: /certs/tls.key
                name: tls
                readOnly: true
                subPath: tls.key
        volumes:
          - name: ca
            secret:
              defaultMode: 420
              optional: false
              secretName: grafana-app-cacert
          - name: tls
            secret:
              defaultMode: 420
              optional: false
              secretName: grafana-app-tls

diranged added the bug and needs triage labels on Sep 13, 2024
@theSuess
Member

Thanks for reporting this. The TLS settings introduced in #1628 should only have affected external instances, but they had the unintended side effect of requiring complete certificate chains on all instances.

As a workaround, you can try mounting your ca.crt in the operator manager container under /etc/ssl/certs/ca-certificates.crt until we have a fix ready.
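
A minimal sketch of what that workaround could look like as a patch to the operator Deployment. The Deployment name, namespace, container name, and volume name below are assumptions for illustration and are not taken from this issue; only the secret name `grafana-app-cacert` and the `ca.crt` key come from the report above. Since a Secret can only be mounted by pods in its own namespace, this also assumes the CA secret exists in (or has been copied to) the operator's namespace.

```yaml
# Hypothetical patch for the operator Deployment. Adjust every name marked
# "assumed" to match your installation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-operator-controller-manager  # assumed Deployment name
  namespace: grafana                         # assumed operator namespace
spec:
  template:
    spec:
      containers:
        - name: manager                      # assumed container name
          volumeMounts:
            # Mount only the ca.crt key of the secret at the path
            # suggested in the comment above.
            - name: grafana-ca               # hypothetical volume name
              mountPath: /etc/ssl/certs/ca-certificates.crt
              subPath: ca.crt
              readOnly: true
      volumes:
        - name: grafana-ca                   # hypothetical volume name
          secret:
            # CA secret from the reproduction config; it must be present
            # in the operator's namespace for the mount to succeed.
            secretName: grafana-app-cacert
```

Note that mounting a single file over `/etc/ssl/certs/ca-certificates.crt` hides whatever CA bundle the image ships at that path, so the operator may then stop trusting publicly signed certificates (for example on external Grafana instances). If that matters, a bundle that concatenates the default CAs with your own CA is likely the safer thing to mount.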

theSuess added the triage/accepted label and removed the needs triage label on Sep 17, 2024
@diranged
Author

Thanks - for now we just rolled the operator upgrade back...

theSuess linked a pull request on Sep 24, 2024 that will close this issue
2 participants