[issue#306] Add missing ClusterRoles (#465)
* [issue#306] Add missing ClusterRoles

The cluster-monitoring-operator is required for STF to install. It
creates the required alertmanager-main and prometheus-k8s
ClusterRoles, and STF relies on these being present.
These are not present when using CRC, so ClusterRoles need to be
explicitly created.

The names of the ClusterRoles have been updated, in case there is some
conflict when cluster-monitoring-operator is installed after STF.

This is a workaround for not having cluster-monitoring-operator
installed: #306

resolves #306

* Fix up the RBAC setup for prometheus-stf (#467)

Fix up the RBAC changes to fully get prometheus-stf working and
decoupled from prometheus-k8s. Changes to using a separate
prometheus-stf ClusterRole, ClusterRoleBinding, and ServiceAccount,
along with a Role and RoleBinding, all using prometheus-stf as the
ServiceAccount. Also updates the Alertmanager configuration to use
alertmanager-stf instead of alertmanager-main.

* Fix smoketest to use prometheus-stf for token retrieval

* Refactor smoketest script (#468)

* Refactor smoketest script

Perform a bit of smoketest refactoring and fix up a few bugs.

* Update alert trigger to use startsAt in order to potentially speed up
  delivery of the alerts. Failures in the SNMP_WEBHOOK_STATUS seem to
  be primarily due to delayed alert notification through
  prometheus-snmp-webhook.
* Add an alert clean up task as part of the clean up logic at the end.
* Update openssl x509 to not use the -in flag which seems unnecessary
  and on some systems causes a failure.
* Add new SMOKETEST_VERBOSE boolean so local testing can skip massive
  amounts of information dumped to stdout.
* Remove curl pod using label selector for slightly cleaner output.
* Update failure check to combine RET and SNMP_WEBHOOK_STATUS since
  testing seems to show changes are slightly more reliable.
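
As a rough illustration of the startsAt change in the first bullet, the
sketch below builds an alert payload in the shape accepted by the
Alertmanager v2 alerts endpoint (a JSON list of alerts, each with labels,
annotations, and an optional startsAt timestamp). The helper name
build_alert and the label and annotation values are hypothetical; the
smoketest itself is a shell script and its actual payload is not shown
in this commit message.

```python
import json
from datetime import datetime, timezone


def build_alert(name, starts_at=None):
    """Return a one-element alert list in the Alertmanager v2 shape.

    Hypothetical helper for illustration only.
    """
    if starts_at is None:
        starts_at = datetime.now(timezone.utc)
    return [
        {
            # Label and annotation values are made up for this sketch.
            "labels": {"alertname": name, "severity": "warning"},
            "annotations": {"summary": "smoketest alert"},
            # Explicit RFC 3339 start time instead of leaving it to
            # Alertmanager to assign one on receipt.
            "startsAt": starts_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    ]


payload = json.dumps(
    build_alert("smoketest", datetime(2023, 9, 21, tzinfo=timezone.utc))
)
print(payload)
```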

* Show logs from curl

* Remove nodes/metrics permission from ClusterRole

As part of least privilege work, remove the nodes/metrics permission as
we're not scraping nodes for information. Everything appears to continue
working in STF without this permission.

* Move SCC RBAC from ClusterRole to Role

Working on simplifying and reducing our access scope as much as
possible. It appears moving SCC RBAC from ClusterRole to Role allows
things to continue to work with Prometheus. It's possible further
testing may reveal this will need to be reverted.

* Convert alertmanager-stf Role to ClusterRole (#473)

Convert alertmanager-stf Role to ClusterRole as the tokenreviews and
subjectaccessreviews resources need to be accessible at the cluster
scope.
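
The rules involved can be sketched as plain data with a simplified
matcher. The two rules are copied from the ClusterRole/alertmanager-stf
definition in this commit; the rule_allows function is a hypothetical
illustration and ignores wildcards, resourceNames, and nonResourceURLs,
so it is not how the Kubernetes authorizer is implemented.

```python
# Rules copied from ClusterRole/alertmanager-stf in this commit.
ALERTMANAGER_STF_RULES = [
    {
        "apiGroups": ["authentication.k8s.io"],
        "resources": ["tokenreviews"],
        "verbs": ["create"],
    },
    {
        "apiGroups": ["authorization.k8s.io"],
        "resources": ["subjectaccessreviews"],
        "verbs": ["create"],
    },
]


def rule_allows(rules, api_group, resource, verb):
    """Simplified check: exact matches only, no wildcard handling."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


print(rule_allows(ALERTMANAGER_STF_RULES,
                  "authentication.k8s.io", "tokenreviews", "create"))
```

Both resources are cluster-scoped, which is why the rules are granted
through a ClusterRole and ClusterRoleBinding rather than a namespaced
Role and RoleBinding.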

* Create ClusterRoleBinding and Role for alertmanager (#475)

* Create ClusterRoleBinding and Role for alertmanager

Create appropriate ClusterRoleBinding and Role for alertmanager-stf,
breaking out SCC into a Role vs ClusterRole to keep things in alignment
with the prometheus-stf RBAC setup.

* Adjust smoketest.sh for SNMP webhook test failures

Adjust the smoketest script to also fail when the SNMP webhook test has
failed. Add a wait condition for the curl pod to complete so logs can be
retrieved.
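
A minimal sketch of the combined failure check, assuming RET and
SNMP_WEBHOOK_STATUS are shell-style statuses where 0 means success (the
script itself is not reproduced here, and the function name is
hypothetical):

```python
def smoketest_exit_code(ret, snmp_webhook_status):
    """Return 0 only when both statuses are 0; otherwise return 1."""
    return 0 if ret == 0 and snmp_webhook_status == 0 else 1


# Main checks passed, SNMP webhook check failed: overall result is a failure.
print(smoketest_exit_code(0, 1))
```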

* Add *RoleBinding rescue capabilities

If a ClusterRoleBinding or RoleBinding changes, the system generally
will not allow the existing object to be patched. Add block/rescue
logic to remove the existing ClusterRoleBinding or RoleBinding and
create it again when patching the object fails.
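
The delete-and-recreate fallback can be sketched outside Ansible as
follows. Kubernetes rejects updates that change the roleRef of an
existing binding, which is the failure this sketch models. FakeCluster,
ImmutableFieldError, and apply_with_rescue are hypothetical names for an
in-memory stand-in, not part of any Kubernetes or Ansible API. Note that
the Ansible rescue section runs on any failure of the block, whereas
this sketch only handles the roleRef case.

```python
class ImmutableFieldError(Exception):
    """Raised by the fake cluster when an update would change roleRef."""


class FakeCluster:
    """In-memory stand-in for the API server, for illustration only."""

    def __init__(self):
        self.objects = {}

    def apply(self, definition):
        key = (definition["kind"], definition["metadata"]["name"])
        existing = self.objects.get(key)
        if existing is not None and existing["roleRef"] != definition["roleRef"]:
            raise ImmutableFieldError("roleRef cannot be changed on " + "/".join(key))
        self.objects[key] = definition

    def delete(self, kind, name):
        self.objects.pop((kind, name), None)


def apply_with_rescue(cluster, definition):
    """Try to apply; on failure, delete the existing binding and recreate it."""
    try:
        cluster.apply(definition)
    except ImmutableFieldError:
        cluster.delete(definition["kind"], definition["metadata"]["name"])
        cluster.apply(definition)


def binding(role_name):
    """Build a minimal ClusterRoleBinding/alertmanager-stf definition."""
    return {
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "alertmanager-stf"},
        "roleRef": {"kind": "ClusterRole", "name": role_name},
    }


cluster = FakeCluster()
cluster.apply(binding("alertmanager-main"))            # binding left by an older version
apply_with_rescue(cluster, binding("alertmanager-stf"))  # new roleRef needs recreate
print(cluster.objects[("ClusterRoleBinding", "alertmanager-stf")]["roleRef"]["name"])
```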

---------

Co-authored-by: Leif Madsen <lmadsen@redhat.com>
elfiesmelfie and leifmadsen authored Sep 21, 2023
1 parent 3765a65 commit 805ada4
Showing 5 changed files with 327 additions and 147 deletions.
3 changes: 2 additions & 1 deletion deploy/role.yaml
@@ -22,6 +22,7 @@ rules:
- watch
- update
- patch
- delete
- apiGroups:
- authorization.k8s.io
resources:
@@ -185,4 +186,4 @@ rules:
verbs:
- get
- list
- watch
- watch
119 changes: 109 additions & 10 deletions roles/servicetelemetry/tasks/component_alertmanager.yml
@@ -66,7 +66,7 @@
kind: Route
name: '{{ ansible_operator_meta.name }}-alertmanager-proxy'

- name: Add a service account to used by Alertmanager
- name: Create ServiceAccount/alertmanager-stf with oauth redirect annotation
k8s:
definition:
apiVersion: v1
@@ -77,22 +77,121 @@
annotations:
serviceaccounts.openshift.io/oauth-redirectreference.alertmanager: '{{ alertmanager_oauth_redir_ref | to_json }}'

- name: Bind role
- name: Create ClusterRole/alertmanager-stf
k8s:
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
kind: ClusterRole
metadata:
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: alertmanager-main
subjects:
- kind: ServiceAccount
rules:
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create

- name: Setup ClusterRoleBinding for Alertmanager
block:
- name: Define ClusterRoleBinding/alertmanager-stf
set_fact:
def_alertmanager_stf_crb: |
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: alertmanager-stf
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: alertmanager-stf
subjects:
- kind: ServiceAccount
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
- name: Create ClusterRoleBinding/alertmanager-stf
k8s:
definition:
"{{ def_alertmanager_stf_crb }}"
rescue:
- name: Remove ClusterRoleBinding/alertmanager-stf when fail to update
k8s:
state: absent
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: alertmanager-stf

- name: Create ClusterRoleBinding/alertmanager-stf
k8s:
definition:
"{{ def_alertmanager_stf_crb }}"

- name: Create Role/alertmanager-stf
k8s:
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
rules:
- apiGroups:
- security.openshift.io
resourceNames:
- nonroot
resources:
- securitycontextconstraints
verbs:
- use

- name: Setup RoleBinding for Alertmanager
block:
- name: Define RoleBinding/alertmanager-stf
set_fact:
def_alertmanager_stf_rb: |
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
subjects:
- kind: ServiceAccount
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'
- name: Create RoleBinding/alertmanager-stf
k8s:
definition:
"{{ def_alertmanager_stf_rb }}"
rescue:
- name: Remove RoleBinding/alertmanager-stf when fail to update
k8s:
state: absent
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: alertmanager-stf
namespace: '{{ ansible_operator_meta.namespace }}'

- name: Create RoleBinding/alertmanager-stf
k8s:
definition:
"{{ def_alertmanager_stf_rb }}"

- name: Set default alertmanager service template
set_fact:
200 changes: 140 additions & 60 deletions roles/servicetelemetry/tasks/component_prometheus.yml
@@ -7,91 +7,171 @@
kind: Route
name: '{{ ansible_operator_meta.name }}-prometheus-proxy'

- name: Add oauth redirect annotation to prometheus-k8s service account
- name: Create ServiceAccount/prometheus-stf with oauth redirect annotation
k8s:
definition:
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus-k8s
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
annotations:
serviceaccounts.openshift.io/oauth-redirectreference.prometheus: '{{ prom_oauth_redir_ref | to_json }}'

- block:
- name: Install RBAC Role for prometheus operations
- name: Create ClusterRole/prometheus-stf for non-resource URL /metrics access
k8s:
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: prometheus-stf
rules:
- nonResourceURLs:
- /metrics
verbs:
- get
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create
- apiGroups:
- ""
resources:
- namespaces
verbs:
- get

- name: Setup ClusterRoleBinding for Prometheus
block:
- name: Define ClusterRoleBinding/prometheus-stf
set_fact:
def_prometheus_stf_crb: |
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus-stf
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus-stf
subjects:
- kind: ServiceAccount
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
- name: Create ClusterRoleBinding/prometheus-stf
k8s:
definition:
"{{ def_prometheus_stf_crb }}"
rescue:
- name: Remove ClusterRoleBinding/prometheus-stf when fail to update
k8s:
state: absent
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus-stf

- name: Create ClusterRoleBinding/prometheus-stf
k8s:
definition:
"{{ def_prometheus_stf_crb }}"

- name: Create Role/prometheus-stf for Prometheus operations
k8s:
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- pods
verbs:
- get
- list
- watch
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- security.openshift.io
resourceNames:
- nonroot
- nonroot-v2
resources:
- securitycontextconstraints
verbs:
- use

- name: Setup RoleBinding for Prometheus
block:
- name: Define RoleBinding/prometheus-stf
set_fact:
def_prometheus_stf_rb: |
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-stf
subjects:
- kind: ServiceAccount
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
- name: Create RoleBinding/prometheus-stf
k8s:
definition:
"{{ def_prometheus_stf_rb }}"
rescue:
- name: Remove RoleBinding/prometheus-stf on failure to update
k8s:
state: absent
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
kind: RoleBinding
metadata:
name: prometheus-stf
namespace: '{{ ansible_operator_meta.namespace }}'
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
- pods
verbs:
- get
- list
- watch
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- security.openshift.io
resourceNames:
- nonroot
- nonroot-v2
resources:
- securitycontextconstraints
verbs:
- use

- name: Bind the local prometheus SA to our new role

- name: Create RoleBinding/prometheus-stf
k8s:
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: prometheus-k8s-stf
namespace: '{{ ansible_operator_meta.namespace }}'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-stf
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: '{{ ansible_operator_meta.namespace }}'
when:
- observability_strategy in ['use_redhat', 'use_hybrid']
"{{ def_prometheus_stf_rb }}"

- name: Bind the local prometheus SA to prometheus cluster role (for oauth perms)
- name: Remove old ClusterRoleBinding for prometheus-k8s using CMO roleRef
k8s:
state: absent
definition:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus-k8s-{{ ansible_operator_meta.namespace }}
namespace: '{{ ansible_operator_meta.namespace }}'
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus-k8s
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: '{{ ansible_operator_meta.namespace }}'

- name: Check for existing prometheus htpasswd user secret
k8s_info: