Update installation to target Grafana Operator v5 #526

Merged · 1 commit · Jan 30, 2024
@@ -7,7 +7,6 @@ Use the third-party application, Grafana, to visualize system-level metrics that
For more information about configuring data collectors, see xref:configuring-red-hat-openstack-platform-overcloud-for-stf_assembly-completing-the-stf-configuration[].

ifdef::include_when_16[]
-//TODO: can re-work this once we have OSP13 dashboard(s) to show. Can't use container health checks or monitoring in OSP13.
You can use dashboards to monitor a cloud:

Infrastructure dashboard::
@@ -4,17 +4,9 @@
[role="_abstract"]
Grafana is not included in the default {Project} ({ProjectShort}) deployment, so you must deploy the Grafana Operator from the community-operators CatalogSource. When you use the Service Telemetry Operator to deploy Grafana, the result is a Grafana instance and the configuration of the default data sources for the local {ProjectShort} deployment.

-ifdef::include_16[The dashboards in {ProjectShort} require features that are available only in Grafana version 8.1.0 and later. By default, the Service Telemetry Operator installs a compatible version. For more information about how to override the Grafana container image, see xref:overriding-the-default-grafana-container-image_assembly-advanced-features[].]

.Procedure

-. Log in to {OpenShift}.
-. Change to the `service-telemetry` namespace:
-+
-[source,bash]
-----
-$ oc project service-telemetry
-----
+. Log in to your {OpenShift} environment where {ProjectShort} is hosted.

. Subscribe to the Grafana Operator by using the community-operators CatalogSource:
+
@@ -31,10 +23,12 @@ $ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
+  labels:
+    operators.coreos.com/grafana-operator.openshift-operators: ""
  name: grafana-operator
-  namespace: service-telemetry
+  namespace: openshift-operators
Contributor:
Why are we changing the namespace here?

Member Author:
Grafana Operator v5 can (and is recommended to) be installed as a cluster-scoped operator, so this is being updated to match our other cluster-scoped operators (such as COO).

Contributor:
I don't think it's a good idea to install non-openshift operators into that project. COO is installed there, like you say, but maybe it shouldn't be, either? At least it's our product.

One reason is that openshift-operators has the setting `pod-security.kubernetes.io/enforce: privileged`, whereas the service-telemetry namespace, for example, has `pod-security.kubernetes.io/enforce: baseline`, so I think this affects the security posture.
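The enforcement labels in question can be inspected directly; a quick sketch, assuming cluster access with the `oc` CLI (the two values are the ones quoted above):

```bash
# Print the pod-security enforcement level of each namespace.
# Per the discussion: "privileged" for openshift-operators,
# "baseline" for service-telemetry.
oc get namespace openshift-operators \
  -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}'
oc get namespace service-telemetry \
  -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}'
```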

Member Author:
Huh.... that seems to be the default recommended installation via the UI....

<removes operator, checks UI again for what happens>

In the Console UI when installing the Grafana Operator from the v5 channel, it defaults to All Namespaces (cluster scoped) and it won't even let me pick a different project other than openshift-operators... I'm not entirely sure why that is...

I've been trying to look at other operator install documents in the OpenShift documentation, and everyone seems to do it a little bit differently. We install cert-manager into its own cert-manager-operator namespace, while COO is installed in openshift-operators; that was the pattern I was following, but it seems there are some implications to that.

Should we adjust our default practices here to create, say, a grafana-operator namespace and install the Operator there? I will need to see if that results in it being cluster- or namespace-scoped by default when installed from the CLI. I'm not really sure what the Subscription configuration needs to look like to be namespace vs cluster scoped either... I'll have to investigate that.

Should we also get COO installed in a consistent manner where we create a cluster-observability-operator namespace if we want to have separate namespaces for the operators?

Member Author:
Answer to my own question:

> I'm not really sure what the Subscription configuration needs to look like to be namespace vs cluster scoped either... I'll have to investigate that.

I forgot about OperatorGroups. That's what defines the Operator as being installed in AllNamespaces or SingleNamespace.

https://docs.openshift.com/container-platform/4.14/operators/user/olm-installing-operators-in-namespace.html#olm-installing-operator-from-operatorhub-using-cli_olm-installing-operators-in-namespace

I do notice that even the documentation here directs you to use openshift-operators as the target installation namespace for AllNamespaces operators...

From step 4:

> For default AllNamespaces install mode usage, specify the openshift-operators namespace. Alternatively, you can specify a custom global namespace, if you have created one. Otherwise, specify the relevant single namespace for SingleNamespace install mode usage.
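For anyone following along, the OperatorGroup is the object that selects the install mode. A hedged sketch — the dedicated `grafana-operator` namespace here is hypothetical and is not what this PR does:

```yaml
# Hypothetical OperatorGroup for a SingleNamespace install of the
# Grafana Operator into a dedicated grafana-operator namespace.
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: grafana-operator
  namespace: grafana-operator
spec:
  targetNamespaces:
  - grafana-operator
# Omitting spec.targetNamespaces selects the AllNamespaces
# (cluster-scoped) install mode instead, which is what the default
# OperatorGroup in openshift-operators already provides.
```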

Contributor:
That's pretty convincing. I'd never really considered this issue before this PR and was trying to figure out the implications. If the openshift docs are saying to use that namespace, then I guess it's okay.

spec:
-channel: v4
+channel: v5
installPlanApproval: Automatic
name: grafana-operator
source: community-operators
@@ -46,9 +40,9 @@ EOF
+
[source,bash,options="nowrap"]
----
-$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv -l operators.coreos.com/grafana-operator.service-telemetry --timeout=600s
+$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace openshift-operators -l operators.coreos.com/grafana-operator.openshift-operators

-clusterserviceversion.operators.coreos.com/grafana-operator.v4.10.1 condition met
+clusterserviceversion.operators.coreos.com/grafana-operator.v5.6.0 condition met
----

. To launch a Grafana instance, create or modify the `ServiceTelemetry` object. Set `graphing.enabled` and `graphing.grafana.ingressEnabled` to `true`. Optionally, set the value of `graphing.grafana.baseImage` to the Grafana workload container image that will be deployed:
@@ -66,34 +60,34 @@ spec:
enabled: true
grafana:
ingressEnabled: true
-baseImage: 'registry.redhat.io/rhel8/grafana:7'
+baseImage: 'registry.redhat.io/rhel8/grafana:9'
----

. Verify that the Grafana instance deployed:
+
-[source,bash]
+[source,bash,options="nowrap"]
----
-$ oc wait --for jsonpath="{.status.phase}"=Running pod -l app=grafana --timeout=600s
+$ oc wait --for jsonpath="{.status.phase}"=Running pod -l app=default-grafana --timeout=600s

-pod/grafana-deployment-7566475c56-jlkjp condition met
+pod/default-grafana-deployment-669968df64-wz5s2 condition met
----

. Verify that the Grafana data sources installed correctly:
+
-[source,bash]
+[source,bash,options="nowrap"]
----
-$ oc get grafanadatasources
+$ oc get grafanadatasources.grafana.integreatly.org

-NAME                  AGE
-default-datasources   20h
+NAME                        NO MATCHING INSTANCES   LAST RESYNC   AGE
+default-ds-stf-prometheus                           2m35s         2m56s
----

. Verify that the Grafana route exists:
+
[source,bash,options="nowrap"]
----
-$ oc get route grafana-route
+$ oc get route default-grafana-route

-NAME            HOST/PORT                                          PATH   SERVICES          PORT   TERMINATION   WILDCARD
-grafana-route   grafana-route-service-telemetry.apps.infra.watch          grafana-service   3000   edge          None
+NAME                    HOST/PORT                                                   PATH   SERVICES                  PORT   TERMINATION   WILDCARD
+default-grafana-route   default-grafana-route-service-telemetry.apps.infra.watch           default-grafana-service   web    reencrypt     None
----