The Failure Time Exporter captures the timestamps at which a broken deployment occurred and when it was fixed. It does this by reading, from the issue tracker(s), when the issue related to that broken deployment was created and when it was closed.
The Failure Time Exporter configuration should be adjusted to the backend in use, so that only failures related to failed production deployments are collected.
The Failure Time Exporter only collects failure events that are less than 30 minutes old. Older failures are not included unless they have already been collected.
The Failure Time Exporter may be deployed with any one of the supported issue trackers. A single cluster namespace may contain multiple Failure Time Exporter instances, one for each provider (or each project). Each provider requires its own specific configuration.
Each Failure Time Exporter configuration option must be placed under spec.exporters.instances in the Pelorus configuration object YAML file, as in this example:
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: failure-exporter
      exporter_type: failure
      [...] # Failure time exporter configuration options
Configuration part of the Failure time exporter YAML file, with some non-default options:
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: failure-exporter
      exporter_type: failure
      env_from_secrets:
      - github-secret
      env_from_configmaps:
      - failure-config
      extraEnv:
      - name: PROVIDER
        value: github
      - name: PROJECTS
        value: github_user/repository
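Since one namespace may hold one instance per provider, the same structure can list several exporters side by side. The following is a minimal sketch rather than a verified deployment: the instance names jira-failure-exporter and github-failure-exporter, the secret names, and example_server_url are illustrative assumptions.

```yaml
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    # Jira is the default PROVIDER, so it is not set explicitly here.
    - app_name: jira-failure-exporter        # hypothetical instance name
      exporter_type: failure
      env_from_secrets:
      - jira-secret                          # assumed to hold API_USER and TOKEN
      extraEnv:
      - name: SERVER
        value: example_server_url            # placeholder URL
    # A second instance in the same namespace, pointed at GitHub.
    - app_name: github-failure-exporter      # hypothetical instance name
      exporter_type: failure
      env_from_secrets:
      - github-secret                        # assumed to hold TOKEN
      extraEnv:
      - name: PROVIDER
        value: github
      - name: PROJECTS
        value: github_user/repository
```

Each instance carries its own provider-specific options, so settings for one issue tracker do not leak into another.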
This is the list of options that can be applied to the env_from_secrets, env_from_configmaps and extraEnv sections of a Failure Time Exporter.
Variable | Required | Default Value |
---|---|---|
PROVIDER | no | jira |
LOG_LEVEL | no | INFO |
SERVER | yes | - |
API_USER | no | - |
TOKEN | yes | - |
APP_LABEL | no | app.kubernetes.io/name |
APP_NAME | no | - |
APP_FIELD | no | u_application |
PROJECTS | no | - |
PELORUS_DEFAULT_KEYWORD | no | default |
JIRA_JQL_SEARCH_QUERY | no | - |
JIRA_RESOLVED_STATUS | no | - |
GITHUB_ISSUE_LABEL | no | bug |
PAGERDUTY_URGENCY | no | - |
PAGERDUTY_PRIORITY | no | - |
AZURE_DEVOPS_TYPE | no | - |
AZURE_DEVOPS_PRIORITY | no | - |
PROVIDER
- Required: no
- Default Value: jira
- Type: string
: Sets the issue tracker provider for the failure exporter. One of jira, github, servicenow, pagerduty, azure-devops.
LOG_LEVEL
- Required: no
- Default Value: INFO
- Type: string
: Sets the log level. One of DEBUG, INFO, WARNING, ERROR.
: NOTE: The DEBUG log level is very verbose; do not use it in production.
SERVER
- Required: yes
- Only applicable for PROVIDER set to jira, servicenow or azure-devops
- Type: string
: URL of the Jira, ServiceNow or Azure DevOps server.
API_USER
- Required: no for jira; yes for servicenow
- Type: string
: Issue tracker provider username.
TOKEN
- Required: yes
- For the jira PROVIDER, a Personal Access Token (PAT) is used if API_USER is not provided
- Type: string
: Issue tracker provider API token.
APP_LABEL
- Required: no
- Only applicable for PROVIDER set to jira, github or azure-devops
- Default Value: app.kubernetes.io/name
- Type: string
: Changes the label used to identify applications.
APP_NAME
- Required: no
- Only applicable for PROVIDER set to jira
- Type: string
: Sets a fallback application name for when the Jira exporter cannot determine the application related to collected issues. Otherwise, issues whose application cannot be determined are stored with the application name unknown and do not appear in Grafana dashboards.
APP_FIELD
- Required: no
- Only applicable for PROVIDER set to servicenow
- Default Value: u_application
- Type: string
: Field used for the application label.
PROJECTS
- Required: no
- Only applicable for PROVIDER set to jira, github or azure-devops
- Type: comma separated list of strings
: * Used by Jira to define which projects (keys or names) to monitor. The value is ignored if JIRA_JQL_SEARCH_QUERY is defined.
: * Used by GitHub to define which repositories' issues to monitor.
: * Used by Azure DevOps to define which projects (by name) to monitor.
PELORUS_DEFAULT_KEYWORD
- Required: no
- Default Value: default
- Type: string
: Used only when configuring an instance through a ConfigMap. It is the ConfigMap value that stands for the default value: when another data value is set to this keyword, that option's "Default Value" is used.
JIRA_JQL_SEARCH_QUERY
- Required: no
- Only applicable for PROVIDER set to jira
- Type: string
: Defines a custom JQL query used to gather issues. More information is available at the Advanced Jira Query Language (JQL) site.
JIRA_RESOLVED_STATUS
- Required: no
- Only applicable for PROVIDER set to jira
- Type: string
: Defines the issue statuses (comma separated) that indicate an issue is resolved.
GITHUB_ISSUE_LABEL
- Required: no
- Only applicable for PROVIDER set to github
- Default Value: bug
- Type: string
: Defines a custom label used in GitHub issues to identify the ones to be monitored.
PAGERDUTY_URGENCY
- Required: no
- Only applicable for PROVIDER set to pagerduty
- Type: string
: Defines the incident urgencies (comma separated) to be monitored. By default, all urgencies are monitored.
PAGERDUTY_PRIORITY
- Required: no
- Only applicable for PROVIDER set to pagerduty
- Type: string
: Defines the incident priorities (comma separated) to be monitored. By default, all priorities are monitored. To monitor incidents without a priority, add null to this value.
AZURE_DEVOPS_TYPE
- Required: no
- Only applicable for PROVIDER set to azure-devops
- Type: string
: Defines the work item types (comma separated) to be monitored. By default, all types are monitored.
AZURE_DEVOPS_PRIORITY
- Required: no
- Only applicable for PROVIDER set to azure-devops
- Type: int
: Defines the work item priorities (comma separated) to be monitored. By default, all priorities are monitored.
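To make the PELORUS_DEFAULT_KEYWORD behaviour more concrete, here is a rough sketch of a ConfigMap that could be referenced from env_from_configmaps. It assumes that the ConfigMap data keys are read as the options listed above, which this page implies but does not spell out. The name failure-config is taken from the earlier example and the data values are illustrative.

```yaml
# Hypothetical ConfigMap, referenced by name under env_from_configmaps.
apiVersion: v1
kind: ConfigMap
metadata:
  name: failure-config
data:
  PROVIDER: github
  PROJECTS: github_user/repository
  # "default" is the PELORUS_DEFAULT_KEYWORD value, so the documented
  # default for this option (bug) is expected to be used.
  GITHUB_ISSUE_LABEL: default
```

If the literal word default is itself needed as a real value, PELORUS_DEFAULT_KEYWORD can presumably be set to a different keyword so the two do not collide.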
By default, Failure Time Exporter(s) configured to work with Jira expect a specific workflow, where the monitored issues need to:
- Be in any of the projects within the SERVER.
- Be of type Bug and of priority Highest.
- Be labeled with app.kubernetes.io/name=app_name, where app_name is the name of one of the applications being monitored.
  NOTE: Issues without such a label are collected with the application name set to unknown.
- Be Resolved with resolutiondate.
Failure Time Exporter(s) configured to work with Jira can be adjusted to fit custom workflow(s), for example:
- A custom Jira JQL query to find all matching issues, using JIRA_JQL_SEARCH_QUERY.
  NOTE: In this case the PROJECTS value is ignored.
- A custom label to track the application named app_name, using APP_LABEL.
- A custom issue resolved status, using JIRA_RESOLVED_STATUS.
In the following examples, we consider that env_from_secrets contains both API_USER and TOKEN.
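One way such a secret might look is sketched below as a standard Kubernetes Secret. The name jira-secret matches the examples that follow, while the values are placeholders to be replaced with real credentials. How the secret is actually created in a given cluster is left open here.

```yaml
# Hypothetical Secret holding the Jira credentials used by the exporter.
apiVersion: v1
kind: Secret
metadata:
  name: jira-secret
type: Opaque
stringData:
  API_USER: "<jira-username>"   # placeholder, not a real account
  TOKEN: "<jira-api-token>"     # placeholder, not a real token
```

Keeping credentials in a secret rather than in extraEnv avoids storing them in the Pelorus configuration object itself.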
In this example, the Failure Time Exporter configured to work with Jira will monitor only issues:
- in the example_server_url server.
- in the Testproject, SECONDPROJECTKEY or thirdproject projects.
- of type Bug.
- with priority Highest.
- labeled with my.app.label/myname or my.app.label/myname=app_name.
- And only issues that have resolutiondate will be considered resolved.
[...]
    - app_name: jira-failure-exporter
      exporter_type: failure
      env_from_secrets:
      - jira-secret
      extraEnv:
      - name: SERVER
        value: example_server_url
      - name: PROJECTS
        value: Testproject,SECONDPROJECTKEY,thirdproject
      - name: APP_LABEL
        value: my.app.label/myname
[...]
In this example, Failure Time Exporter configured to work with Jira will monitor only issues:
- in example_server_url server.
- in Sample or MYJIRAPROJ projects.
- of type Bug.
- with priority Highest or Medium.
- labeled with my.company.org/appname or my.company.org/appname=app_name.
- And only issues that have their status changed to DONE, CLOSED or RESOLVED will be considered resolved.
[...]
    - app_name: jira-failure-exporter
      exporter_type: failure
      env_from_secrets:
      - jira-secret
      extraEnv:
      - name: SERVER
        value: example_server_url
      - name: JIRA_JQL_SEARCH_QUERY
        value: type in ("Bug") AND priority in ("Highest","Medium") AND project in ("Sample","MYJIRAPROJ")
      - name: APP_LABEL
        value: my.company.org/appname
      - name: JIRA_RESOLVED_STATUS
        value: Done,Closed,Resolved
[...]
In this example, two Failure Time Exporters are configured to work with Jira, each with a different server.
jira-failure-exporter-1 will monitor only issues:
- in any of the projects within the example_server_url_1 server.
jira-failure-exporter-2 will monitor only issues:
- in any of the projects within the example_server_url_2 server.
And both will monitor only issues:
- of type Bug.
- with priority Highest.
- labeled with my.app.label/myname or my.app.label/myname=app_name.
- And only issues that have resolutiondate will be considered resolved.
[...]
    - app_name: jira-failure-exporter-1
      exporter_type: failure
      env_from_secrets:
      - jira-secret
      extraEnv:
      - name: SERVER
        value: example_server_url_1
    - app_name: jira-failure-exporter-2
      exporter_type: failure
      env_from_secrets:
      - jira-secret
      extraEnv:
      - name: SERVER
        value: example_server_url_2
[...]
By default, Failure Time Exporter(s) configured to work with GitHub expect a specific workflow, where the monitored issues need to:
- Be in the repositories listed in PROJECTS.
- Be labeled with bug.
- Be labeled with app.kubernetes.io/name=app_name, where app_name is the name of one of the applications being monitored.
  NOTE: Issues without such a label are collected with the application name set to the repository name.
Failure Time Exporter(s) configured to work with GitHub can be adjusted to fit custom workflow(s), for example:
- A custom label to track monitored issues, using GITHUB_ISSUE_LABEL.
- A custom label to track the application named app_name, using APP_LABEL.
In the following examples, we consider that env_from_secrets contains TOKEN.
In this example, the Failure Time Exporter configured to work with GitHub will monitor only issues:
- in the github_user/repository1 or github_user/repository2 repositories.
- labeled with bug.
- labeled with important or important=app_name.
- And only issues that are closed will be considered resolved.
[...]
    - app_name: github-failure-exporter
      exporter_type: failure
      env_from_secrets:
      - github-secret
      extraEnv:
      - name: PROVIDER
        value: github
      - name: APP_LABEL
        value: important
      - name: PROJECTS
        value: github_user/repository1,github_user/repository2
[...]
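The other GitHub customization, GITHUB_ISSUE_LABEL, could be applied in much the same way. In the sketch below, the label production_issue is a made-up value used only for illustration. With it, issues would be expected to need that label instead of bug in order to be monitored.

```yaml
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: github-failure-exporter
      exporter_type: failure
      env_from_secrets:
      - github-secret                 # assumed to hold TOKEN
      extraEnv:
      - name: PROVIDER
        value: github
      - name: PROJECTS
        value: github_user/repository1
      # Replaces the default "bug" label; the value is hypothetical.
      - name: GITHUB_ISSUE_LABEL
        value: production_issue
```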
By default, Failure Time Exporter(s) configured to work with ServiceNow expect a specific workflow, where the monitored incidents need to:
- Be in the SERVER.
- Have the field u_application, which should store the name of one of the applications being monitored.
  NOTE: Since tags are not available in all versions of ServiceNow, a custom field must be configured on the Incident object to provide an application name that matches the OpenShift labels.
- And only incidents that are stage=6 (when a resolved_at field is populated) will be considered resolved.
Failure Time Exporter(s) configured to work with ServiceNow can be easily adjusted to adapt to custom workflow(s), like:
- Have the field APP_FIELD, where it should store the name of one of the applications being monitored.
A custom field can be configured with the following steps:
- Navigate to an existing Incident.
- Use the upper left Menu and select Configure -> Form Layout.
- Create a new field (String, Table or reference a List).
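Once such a field exists, a ServiceNow instance might be configured along the following lines. This is a sketch under assumptions: servicenow-secret is a hypothetical secret expected to hold API_USER and TOKEN, example_server_url is a placeholder, and u_my_application stands in for whatever custom field name was created.

```yaml
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: servicenow-failure-exporter   # hypothetical instance name
      exporter_type: failure
      env_from_secrets:
      - servicenow-secret                     # assumed: API_USER and TOKEN
      extraEnv:
      - name: PROVIDER
        value: servicenow
      - name: SERVER
        value: example_server_url             # placeholder URL
      # Custom incident field holding the application name (hypothetical).
      - name: APP_FIELD
        value: u_my_application
```

If the incident form already uses the default u_application field, APP_FIELD can be left out.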
By default, Failure Time Exporter(s) configured to work with PagerDuty will:
- Monitor all incidents in the domain of the token used to access it (PagerDuty's API Access Key manages both the API URL endpoint and the credentials information).
- Require each incident's service name to match the name of a monitored application (PagerDuty does not have labels or tags, so this is not as flexible as it is for other providers).
- Consider incidents resolved when their status changes to Resolved (Pelorus does not monitor alerts, but resolving all alerts of an incident resolves it; suppressing alerts does not resolve them).
Failure Time Exporter(s) configured to work with PagerDuty can be adjusted to fit custom workflow(s), for example:
- Monitor only incidents of specific urgencies, using PAGERDUTY_URGENCY.
- Monitor only incidents of specific priorities, using PAGERDUTY_PRIORITY.
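As an illustration of these two filters, a PagerDuty instance could be sketched as below. The secret name pagerduty-secret is hypothetical and is assumed to hold TOKEN. The urgency high is one of PagerDuty's standard urgencies, whereas the priority names P1 and P2 are assumptions that depend on how priorities are defined in the account.

```yaml
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: pagerduty-failure-exporter   # hypothetical instance name
      exporter_type: failure
      env_from_secrets:
      - pagerduty-secret                     # assumed to hold TOKEN
      extraEnv:
      - name: PROVIDER
        value: pagerduty
      # Only high-urgency incidents.
      - name: PAGERDUTY_URGENCY
        value: high
      # Assumed priority names, plus null for incidents without a priority.
      - name: PAGERDUTY_PRIORITY
        value: P1,P2,null
```

No SERVER is set here, since that option applies only to the jira, servicenow and azure-devops providers.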
By default, Failure Time Exporter(s) configured to work with Azure DevOps will:
- Monitor all work items in all projects that live in the Azure DevOps URL passed through SERVER (and that the token passed through TOKEN has access to).
- Use the app.kubernetes.io/name=app_name work item tag, where app_name is the name of one of the applications being monitored.
  NOTE: Work items without such a tag are collected with the application name set to unknown.
- Consider work items resolved when their state changes to Done.
Failure Time Exporter(s) configured to work with Azure DevOps can be adjusted to fit custom workflow(s), for example:
- Monitor work items of only specific projects within the Azure DevOps URL, using PROJECTS.
- Use a custom work item tag for getting the name of one of the applications being monitored, using APP_LABEL.
- Monitor work items of only specific types, using AZURE_DEVOPS_TYPE.
- Monitor work items of only specific priorities, using AZURE_DEVOPS_PRIORITY.
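Put together, these Azure DevOps options might be combined as in the sketch below. All values are illustrative assumptions: azure-devops-secret is a hypothetical secret expected to hold TOKEN, the project names and tag are made up, and the type and priority values should be replaced with ones that exist in the target organization.

```yaml
apiVersion: charts.pelorus.dora-metrics.io/v1alpha1
kind: Pelorus
metadata:
  name: example-configuration
spec:
  exporters:
    instances:
    - app_name: azure-devops-failure-exporter   # hypothetical instance name
      exporter_type: failure
      env_from_secrets:
      - azure-devops-secret                     # assumed to hold TOKEN
      extraEnv:
      - name: PROVIDER
        value: azure-devops
      - name: SERVER
        value: example_server_url               # placeholder URL
      - name: PROJECTS
        value: project1,project2                # hypothetical project names
      - name: APP_LABEL
        value: my.app.label/myname              # custom work item tag
      - name: AZURE_DEVOPS_TYPE
        value: Bug,Issue                        # assumed work item types
      # Quoted so the value stays a string.
      - name: AZURE_DEVOPS_PRIORITY
        value: "1,2"
```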