*: Open source #1

Merged
merged 1 commit into from
Apr 13, 2018
60 changes: 60 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,60 @@
# How to Contribute

cluster-monitoring-operator projects are [Apache 2.0 licensed](LICENSE) and accept contributions via GitHub pull requests. This document outlines some of the conventions on development workflow, commit message formatting, contact points and other resources to make it easier to get your contribution accepted.

# Certificate of Origin

By contributing to this project you agree to the Developer Certificate of Origin (DCO). This document was created by the Linux Kernel community and is a simple statement that you, as a contributor, have the legal right to make the contribution. See the [DCO](DCO) file for details.

## Getting Started

- Fork the repository on GitHub
- Read the [README](README.md) for build and test instructions
- Play with the project, submit bugs, submit patches!

## Contribution Flow

This is a rough outline of what a contributor's workflow looks like:

- Create a topic branch from where you want to base your work (usually master).
- Make commits of logical units.
- Make sure your commit messages are in the proper format (see below).
- Push your changes to a topic branch in your fork of the repository.
- Make sure the tests pass, and add any new tests as appropriate.
- Submit a pull request to the original repository.

Thanks for your contributions!

### Coding Style

cluster-monitoring-operator projects written in Go follow a set of style guidelines that we've documented [here](https://github.com/coreos/docs/tree/master/golang). Please follow them when working on your contributions.

### Format of the Commit Message

We follow a rough convention for commit messages that is designed to answer two
questions: what changed and why. The subject line should feature the what and
the body of the commit should describe the why.

```
scripts: add the test-cluster command

this uses tmux to set up a test cluster that you can easily kill and
start for debugging.

Fixes #38
```

The format can be described more formally as follows:

```
<subsystem>: <what changed>
<BLANK LINE>
<why this change was made>
<BLANK LINE>
<footer>
```

The first line is the subject and should be no longer than 70 characters, the
second line is always blank, and other lines should be wrapped at 80 characters.
This allows the message to be easier to read on GitHub as well as in various
git tools.
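
The rules above can be checked mechanically. The following is a minimal sketch, not part of the project's tooling; the function name `check_commit_message` and the wording of the reported problems are assumptions made for illustration.

```python
def check_commit_message(message: str) -> list:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing subject line"]

    problems = []
    subject = lines[0]

    # The subject has the form "<subsystem>: <what changed>".
    subsystem, separator, what = subject.partition(": ")
    if not separator or not subsystem.strip() or not what.strip():
        problems.append("subject should look like '<subsystem>: <what changed>'")

    # The subject should be no longer than 70 characters.
    if len(subject) > 70:
        problems.append("subject is longer than 70 characters")

    # The second line is always blank.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")

    # All other lines should be wrapped at 80 characters.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > 80:
            problems.append(f"line {number} is longer than 80 characters")

    return problems


example = (
    "scripts: add the test-cluster command\n"
    "\n"
    "this uses tmux to setup a test cluster that you can easily kill and\n"
    "start for debugging.\n"
    "\n"
    "Fixes #38\n"
)
print(check_commit_message(example))  # prints []
```

A check like this could run from a local `commit-msg` hook, although the project does not state that it enforces the format automatically.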
36 changes: 36 additions & 0 deletions DCO
@@ -0,0 +1,36 @@
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
5 changes: 5 additions & 0 deletions Dockerfile
@@ -0,0 +1,5 @@
FROM quay.io/prometheus/busybox:latest

ADD operator /bin/operator

ENTRYPOINT ["/bin/operator"]
4 changes: 4 additions & 0 deletions Dockerfile.generate
@@ -0,0 +1,4 @@
FROM golang:1.9.2

RUN apt-get update && \
    apt-get install -y python-yaml
Binary file added Documentation/arch.png
62 changes: 62 additions & 0 deletions Documentation/cluster-monitoring.md
@@ -0,0 +1,62 @@
# Cluster Monitoring

Cluster monitoring ships with a pre-configured and self-updating monitoring stack that is based on the [Prometheus][prometheus] open source project and its wider ecosystem. It provides monitoring of cluster components and ships with a set of alerts that notify cluster admins immediately when problems occur.

## Overview

At the heart of the monitoring stack sits the Cluster Monitoring Operator, which watches over the deployed monitoring components and resources, and ensures that they are always up to date.

One of the core components that Cluster Monitoring ships is the [Prometheus Operator][prom-operator]. The Prometheus Operator creates, configures, and manages Prometheus monitoring instances. It automatically generates monitoring target configurations based on familiar Kubernetes label queries.
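
To illustrate what a label query means here, the following is a minimal sketch of equality-based label matching in the style of Kubernetes selectors. It is not the Prometheus Operator's implementation; the function names and the sample services are hypothetical.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Return True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_targets(selector: dict, services: dict) -> list:
    """Return the sorted names of the services whose labels satisfy the selector."""
    return sorted(name for name, labels in services.items() if matches(selector, labels))


# Hypothetical services and their labels.
services = {
    "api": {"app": "api", "tier": "backend"},
    "web": {"app": "web", "tier": "frontend"},
}
print(select_targets({"tier": "backend"}, services))  # prints ['api']
```

An empty selector matches every set of labels, which mirrors how equality-based selectors behave in Kubernetes.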

![Architecture](./arch.png)

## Cluster Monitoring

A Prometheus instance dedicated to monitoring the cluster itself is also shipped, controlled by the Prometheus Operator. This instance includes a set of alerting rules to notify operators about problems in a cluster.

Use the Prometheus [Alertmanager][alertmanager] to send notifications to operators. Cluster Monitoring includes a highly available Alertmanager cluster, meant to be used not only by the Prometheus instance monitoring the cluster, but by all Prometheus instances.

In addition to Prometheus and Alertmanager, Cluster Monitoring also includes [node-exporter][node-exporter] and [kube-state-metrics][kube-state]. Node-exporter is an agent deployed on every node to collect metrics about it. The kube-state-metrics exporter agent converts Kubernetes objects to metrics consumable by Prometheus.

The targets monitored as part of the cluster monitoring are:

- Prometheus itself
- Prometheus-Operator
- Alertmanager cluster instances
- Kubernetes apiserver
- kubelets (the kubelet embeds cAdvisor for per-container metrics)
- kube-scheduler
- kube-controller-manager
- kube-state-metrics
- node-exporter

All these components are automatically updated.

Cluster Monitoring is also configurable. Learn how to [configure Cluster Monitoring][configure-monitoring].

> Note that in order to be able to deliver updates with guaranteed compatibility, configurability of the Cluster Monitoring stack is limited to the explicitly available options. Read more on [update and compatibility guarantees][update-and-compatibility-guarantees].

## Application Monitoring

Create additional Prometheus instances managed by the Prometheus Operator to monitor individual applications.

## Accessing Prometheus and Alertmanager

Cluster Monitoring ships with a Prometheus instance for cluster monitoring and a central Alertmanager cluster. In addition to Prometheus and Alertmanager, Cluster Monitoring also includes a [Grafana][grafana] instance as well as pre-built dashboards for cluster monitoring troubleshooting.

By default, all web UIs are exposed through Kubernetes Ingress and are accessible at the following URLs:

- Prometheus: https://$CLUSTER-DNS/prometheus
- Alertmanager: https://$CLUSTER-DNS/alertmanager
- Grafana: https://$CLUSTER-DNS/grafana

Authentication is performed against the OpenShift identity system and uses the same credentials or means of authentication as are used elsewhere in OpenShift.
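
The URL scheme listed above can be sketched as a small helper. This is only an illustration: the function name `monitoring_urls` and the host name `cluster.example.com` are assumptions, and the helper does not deal with authentication.

```python
def monitoring_urls(cluster_dns: str) -> dict:
    """Build the default web UI URLs for a given cluster DNS name."""
    base = f"https://{cluster_dns}"
    return {
        "prometheus": f"{base}/prometheus",
        "alertmanager": f"{base}/alertmanager",
        "grafana": f"{base}/grafana",
    }


# "cluster.example.com" is a placeholder for the value of $CLUSTER-DNS.
for name, url in monitoring_urls("cluster.example.com").items():
    print(f"{name}: {url}")
```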

[alertmanager]: https://prometheus.io/docs/alerting/alertmanager/
[grafana]: https://grafana.com/
[configure-monitoring]: user-guides/configuring-cluster-monitoring.md
[node-exporter]: https://github.com/prometheus/node_exporter
[kube-state]: https://github.com/kubernetes/kube-state-metrics
[prom-operator]: https://coreos.com/operators/prometheus/docs/latest/
[prometheus]: https://prometheus.io/
[update-and-compatibility-guarantees]: user-guides/update-and-compatibility-guarantees.md
127 changes: 127 additions & 0 deletions Documentation/user-guides/configuring-cluster-monitoring.md
@@ -0,0 +1,127 @@
# Configuring Cluster Monitoring

Parts of Cluster Monitoring are configurable. This configuration lies in a ConfigMap called `cluster-monitoring-config` in the `openshift-monitoring` namespace. The configuration file itself is defined under the `config.yaml` key within the ConfigMap's data.

Configuring Cluster Monitoring is optional. If the config does not exist, or is empty or malformed, then defaults will be used.
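
As a sketch of where the configuration lives, the snippet below wraps a configuration document in a ConfigMap manifest with the name, namespace, and key described above. The function name `build_config_map` and the sample `retention` value are assumptions for illustration only.

```python
import json


def build_config_map(config_yaml: str) -> dict:
    """Wrap a configuration document in the ConfigMap that Cluster Monitoring reads."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "cluster-monitoring-config",
            "namespace": "openshift-monitoring",
        },
        # The configuration file itself is stored under the "config.yaml" key.
        "data": {"config.yaml": config_yaml},
    }


# The retention value is an illustrative assumption, not a documented default.
manifest = build_config_map("prometheusK8s:\n  retention: 24h\n")
print(json.dumps(manifest, indent=2))
```

The manifest is printed as JSON, which Kubernetes clients accept alongside YAML, so the output could be applied with the usual tooling.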

## Configuring custom images

In certain environments it may be required that container images are downloaded from a custom registry rather than from the canonical container image repositories on [quay.io][quay].

This is an example configuration with all image parameters set to a custom registry:

[embedmd]:# (../../examples/user-guides/configuring-cluster-monitoring/custom-image-config.yaml)
```yaml
prometheusOperator:
  baseImage: custom-registry.com/prometheus-operator
  prometheusConfigReloaderBaseImage: custom-registry.com/prometheus-config-reloader
  configReloaderBaseImage: custom-registry.com/configmap-reload
prometheusK8s:
  baseImage: custom-registry.com/prometheus
alertmanagerMain:
  baseImage: custom-registry.com/alertmanager
auth:
  baseImage: custom-registry.com/openshift-oauth-proxy
nodeExporter:
  baseImage: custom-registry.com/node-exporter
kubeStateMetrics:
  baseImage: custom-registry.com/kube-state-metrics
  addonResizerBaseImage: custom-registry.com/addon-resizer
```

> Note: The container images coming from repositories of a custom registry are expected to mirror the canonical repositories on [quay.io][quay].
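
The example above appears to map each canonical repository to a repository of the same name directly under the custom registry. The following sketch derives such names; this naming convention is inferred from the example rather than required by the operator, the function name `mirrored_image` is hypothetical, and image tags are not handled.

```python
def mirrored_image(canonical: str, registry: str) -> str:
    """Map a canonical repository to a same-named repository in a custom registry."""
    # Keep only the last path segment, e.g. "prometheus-operator".
    name = canonical.rsplit("/", 1)[-1]
    return f"{registry.rstrip('/')}/{name}"


print(mirrored_image("quay.io/coreos/prometheus-operator", "custom-registry.com"))
# prints custom-registry.com/prometheus-operator
```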

## Reference

The following configuration options are available for Cluster Monitoring.

### Config

The Config object represents the top-level keys of the YAML configuration. Refer to the underlying configuration objects for their individual fields.

```yaml
[ prometheusOperator: <PrometheusOperatorConfig> ]
[ prometheusK8s: <PrometheusK8sConfig> ]
[ alertmanagerMain: <AlertmanagerMainConfig> ]
[ ingress: <IngressConfig> ]
[ auth: <AuthConfig> ]
[ nodeExporter: <NodeExporterConfig> ]
[ kubeStateMetrics: <KubeStateMetricsConfig> ]
```
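
A configuration that has already been parsed into a dictionary can be checked against these top-level keys. This is a minimal sketch; the names `KNOWN_TOP_LEVEL_KEYS` and `unknown_top_level_keys` are hypothetical, and whether the operator rejects or ignores unknown keys is not stated in this document.

```python
# Top-level keys taken from the Config reference above.
KNOWN_TOP_LEVEL_KEYS = {
    "prometheusOperator",
    "prometheusK8s",
    "alertmanagerMain",
    "ingress",
    "auth",
    "nodeExporter",
    "kubeStateMetrics",
}


def unknown_top_level_keys(config: dict) -> list:
    """Return the sorted top-level keys that the Config reference does not list."""
    return sorted(set(config) - KNOWN_TOP_LEVEL_KEYS)


# "prometheusk8s" is a deliberate misspelling of "prometheusK8s".
print(unknown_top_level_keys({"prometheusk8s": {}, "auth": {}}))  # prints ['prometheusk8s']
```

A check like this can catch misspelled keys before the ConfigMap is applied.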

### PrometheusOperatorConfig

Use PrometheusOperatorConfig to customize the base images used by the Prometheus Operator.

```yaml
# baseImage references a base container image. Defaults to "quay.io/coreos/prometheus-operator".
baseImage: <string>
# prometheusConfigReloaderBaseImage references a base container image. Defaults to "quay.io/coreos/prometheus-config-reloader".
prometheusConfigReloaderBaseImage: <string>
# configReloaderBaseImage references a base container image. Defaults to "quay.io/coreos/configmap-reload".
configReloaderBaseImage: <string>
```
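
The defaulting described in the comments above can be sketched as a merge of user-supplied values over the documented defaults. The function name `with_defaults` is hypothetical, and treating an empty string as "use the default" is an assumption rather than documented operator behavior.

```python
# Defaults as listed in the PrometheusOperatorConfig reference above.
PROMETHEUS_OPERATOR_DEFAULTS = {
    "baseImage": "quay.io/coreos/prometheus-operator",
    "prometheusConfigReloaderBaseImage": "quay.io/coreos/prometheus-config-reloader",
    "configReloaderBaseImage": "quay.io/coreos/configmap-reload",
}


def with_defaults(user_config=None) -> dict:
    """Overlay user-supplied base images on top of the documented defaults."""
    merged = dict(PROMETHEUS_OPERATOR_DEFAULTS)
    for key, value in (user_config or {}).items():
        # Only known keys are merged; empty values keep the default (assumption).
        if key in PROMETHEUS_OPERATOR_DEFAULTS and value:
            merged[key] = value
    return merged


print(with_defaults({"baseImage": "custom-registry.com/prometheus-operator"})["baseImage"])
# prints custom-registry.com/prometheus-operator
```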

### PrometheusK8sConfig

Use PrometheusK8sConfig to customize the Prometheus instance used for cluster monitoring.

```yaml
# retention time for samples.
retention: <string>
# baseImage references a base container image. Defaults to "quay.io/prometheus/prometheus".
baseImage: <string>
# nodeSelector defines the nodes on which the Prometheus server will be scheduled.
nodeSelector:
  [ - <labelname>: <labelvalue> ]
# resources defines the resource requests and limits for the Prometheus instance.
resources: [v1.ResourceRequirements](https://kubernetes.io/docs/api-reference/v1.6/#resourcerequirements-v1-core)
# externalLabels allows the external labels configuration of Prometheus to be
# specified by users.
externalLabels:
  [ - <labelname>: <labelvalue> ]
```

### AlertmanagerMainConfig

Use AlertmanagerMainConfig to customize the central Alertmanager cluster.

```yaml
# baseImage references a base container image. Defaults to "quay.io/prometheus/alertmanager".
baseImage: <string>
# nodeSelector defines the nodes on which Alertmanager instances will be scheduled.
nodeSelector:
  [ - <labelname>: <labelvalue> ]
# resources defines the resource requests and limits for the Alertmanager instances.
resources: [v1.ResourceRequirements](https://kubernetes.io/docs/api-reference/v1.6/#resourcerequirements-v1-core)
# volumeClaimTemplate defines the template to use for persistent storage for Alertmanager nodes.
volumeClaimTemplate: [v1.PersistentVolumeClaim](https://kubernetes.io/docs/api-reference/v1.6/#persistentvolumeclaim-v1-core)
```

### AuthConfig

Use AuthConfig to configure parameters for the authentication proxies of Prometheus and Alertmanager Pods.

```yaml
# baseImage is the container image repository that will be used to deploy the monitoring auth service, along with the tag specified in the asset manifest. Defaults to the repository listed in the manifests in the assets folder.
baseImage: <string>
```

### NodeExporterConfig

Use NodeExporterConfig to configure parameters for deployment of the `node-exporter` components.

```yaml
# baseImage is the container image repository that will be used to deploy the node-exporter pods.
baseImage: <string>
```

### KubeStateMetricsConfig

Use KubeStateMetricsConfig to configure parameters for deployment of the `kube-state-metrics` components.

```yaml
# baseImage is the container image repository that will be used to deploy the kube-state-metrics pods
baseImage: <string>
addonResizerBaseImage: <string>
```

[quay]: https://quay.io/