openshift · brancz · Apr 13, 2018 · Apr 12, 2018
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,60 @@
+# How to Contribute
+
+The cluster-monitoring-operator project is [Apache 2.0 licensed](LICENSE) and accepts contributions via GitHub pull requests. This document outlines some of the conventions on development workflow, commit message formatting, contact points, and other resources that make it easier to get your contribution accepted.
+
+## Certificate of Origin
+
+By contributing to this project you agree to the Developer Certificate of Origin (DCO). This document was created by the Linux Kernel community and is a simple statement that you, as a contributor, have the legal right to make the contribution. See the [DCO](DCO) file for details.
+
+## Getting Started
+
+- Fork the repository on GitHub
+- Read the [README](README.md) for build and test instructions
+- Play with the project, submit bugs, submit patches!
+
+## Contribution Flow
+
+This is a rough outline of what a contributor's workflow looks like:
+
+- Create a topic branch from where you want to base your work (usually master).
+- Make commits of logical units.
+- Make sure your commit messages are in the proper format (see below).
+- Push your changes to a topic branch in your fork of the repository.
+- Make sure the tests pass, and add any new tests as appropriate.
+- Submit a pull request to the original repository.
+
+Thanks for your contributions!
+
+### Coding Style
+
+cluster-monitoring-operator projects written in Go follow a set of style guidelines that we've documented [here](https://github.com/coreos/docs/tree/master/golang). Please follow them when working on your contributions.
+
+### Format of the Commit Message
+
+We follow a rough convention for commit messages that is designed to answer two
+questions: what changed and why. The subject line should feature the what and
+the body of the commit should describe the why.
+
+```
+scripts: add the test-cluster command
+
+this uses tmux to setup a test cluster that you can easily kill and
+start for debugging.
+
+Fixes #38
+```
+
+The format can be described more formally as follows:
+
+```
+<subsystem>: <what changed>
+<BLANK LINE>
+<why this change was made>
+<BLANK LINE>
+<footer>
+```
+
+The first line is the subject and should be no longer than 70 characters, the
+second line is always blank, and other lines should be wrapped at 80 characters.
+This allows the message to be easier to read on GitHub as well as in various
+git tools.
diff --git a/DCO b/DCO
@@ -0,0 +1,36 @@
+Developer Certificate of Origin
+Version 1.1
+
+Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
+660 York Street, Suite 102,
+San Francisco, CA 94110 USA
+
+Everyone is permitted to copy and distribute verbatim copies of this
+license document, but changing it is not allowed.
+
+
+Developer's Certificate of Origin 1.1
+
+By making a contribution to this project, I certify that:
+
+(a) The contribution was created in whole or in part by me and I
+    have the right to submit it under the open source license
+    indicated in the file; or
+
+(b) The contribution is based upon previous work that, to the best
+    of my knowledge, is covered under an appropriate open source
+    license and I have the right under that license to submit that
+    work with modifications, whether created in whole or in part
+    by me, under the same open source license (unless I am
+    permitted to submit under a different license), as indicated
+    in the file; or
+
+(c) The contribution was provided directly to me by some other
+    person who certified (a), (b) or (c) and I have not modified
+    it.
+
+(d) I understand and agree that this project and the contribution
+    are public and that a record of the contribution (including all
+    personal information I submit with it, including my sign-off) is
+    maintained indefinitely and may be redistributed consistent with
+    this project or the open source license(s) involved.
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,5 @@
+FROM quay.io/prometheus/busybox:latest
+
+ADD operator /bin/operator
+
+ENTRYPOINT ["/bin/operator"]
diff --git a/Dockerfile.generate b/Dockerfile.generate
@@ -0,0 +1,4 @@
+FROM golang:1.9.2
+
+RUN apt-get update && \
+    apt-get install -y python-yaml
diff --git a/Documentation/arch.png b/Documentation/arch.png
diff --git a/Documentation/cluster-monitoring.md b/Documentation/cluster-monitoring.md
@@ -0,0 +1,62 @@
+# Cluster Monitoring
+
+Cluster monitoring ships with a pre-configured and self-updating monitoring stack that is based on the [Prometheus][prometheus] open source project and its wider ecosystem. It monitors cluster components and ships with a set of alerts that immediately notify cluster admins about problems as they occur.
+
+## Overview
+
+At the heart of the monitoring stack sits the Cluster Monitoring Operator, which watches over the deployed monitoring components and resources, and ensures that they are always up to date.
+
+One of the core components that Cluster Monitoring ships is the [Prometheus Operator][prom-operator]. The Prometheus Operator creates, configures, and manages Prometheus monitoring instances. It automatically generates monitoring target configurations based on familiar Kubernetes label queries.
+
+![Architecture](./arch.png)
+
+## Cluster Monitoring
+
+A Prometheus instance dedicated to monitoring the cluster itself is also shipped, controlled by the Prometheus Operator. This instance includes a set of alerting rules to notify operators about problems in a cluster.
+
+Use the Prometheus [Alertmanager][alertmanager] to send notifications to operators. Cluster Monitoring includes a highly available Alertmanager cluster, meant to be used not only by the Prometheus instance monitoring the cluster but by all Prometheus instances.
+
+In addition to Prometheus and Alertmanager, Cluster Monitoring also includes [node-exporter][node-exporter] and [kube-state-metrics][kube-state]. Node-exporter is an agent deployed on every node to collect metrics about it. The kube-state-metrics exporter agent converts Kubernetes objects to metrics consumable by Prometheus.
+
+The targets monitored as part of the cluster monitoring are:
+
+- Prometheus itself
+- Prometheus-Operator
+- Alertmanager cluster instances
+- Kubernetes apiserver
+- kubelets (the kubelet embeds cAdvisor for per-container metrics)
+- kube-scheduler
+- kube-controller-manager
+- kube-state-metrics
+- node-exporter
+
+All these components are automatically updated.
+
+Cluster Monitoring is also configurable. Learn how to [configure Cluster Monitoring][configure-monitoring].
+
+> Note that in order to be able to deliver updates with guaranteed compatibility, configurability of the Cluster Monitoring stack is limited to the explicitly available options. Read more on [update and compatibility guarantees][update-and-compatibility-guarantees].
+
+## Application Monitoring
+
+Create additional Prometheus instances managed by the Prometheus Operator to monitor individual applications.
+
+## Accessing Prometheus and Alertmanager
+
+Cluster Monitoring ships with a Prometheus instance for cluster monitoring and a central Alertmanager cluster. In addition to Prometheus and Alertmanager, Cluster Monitoring also includes a [Grafana][grafana] instance as well as pre-built dashboards for cluster monitoring troubleshooting.
+
+By default, all web UIs are exposed through Kubernetes Ingress and are accessible at the following URLs:
+
+- Prometheus: https://$CLUSTER-DNS/prometheus
+- Alertmanager: https://$CLUSTER-DNS/alertmanager
+- Grafana: https://$CLUSTER-DNS/grafana
+
+Authentication is performed against the OpenShift identity system and uses the same credentials or means of authentication as are used elsewhere in OpenShift.
+
+[alertmanager]: https://prometheus.io/docs/alerting/alertmanager/
+[grafana]: https://grafana.com/
+[configure-monitoring]: user-guides/configuring-cluster-monitoring.md
+[node-exporter]: https://github.com/prometheus/node_exporter
+[kube-state]: https://github.com/kubernetes/kube-state-metrics
+[prom-operator]: https://coreos.com/operators/prometheus/docs/latest/
+[prometheus]: https://prometheus.io/
+[update-and-compatibility-guarantees]: user-guides/update-and-compatibility-guarantees.md
diff --git a/Documentation/user-guides/configuring-cluster-monitoring.md b/Documentation/user-guides/configuring-cluster-monitoring.md
@@ -0,0 +1,127 @@
+# Configuring Cluster Monitoring
+
+Parts of Cluster Monitoring are configurable. This configuration lies in a ConfigMap called `cluster-monitoring-config` in the `openshift-monitoring` namespace. The configuration file itself is defined under the `config.yaml` key within the ConfigMap's data.
+
+Configuring Cluster Monitoring is optional. If the config does not exist, or is empty or malformed, then defaults will be used.
+
+## Configuring custom images
+
+In certain environments it may be required that container images are downloaded from a custom registry rather than from the canonical container image repositories on [quay.io][quay].
+
+This is an example configuration with all image parameters set to a custom registry:
+
+[embedmd]:# (../../examples/user-guides/configuring-cluster-monitoring/custom-image-config.yaml)
+```yaml
+prometheusOperator:
+  baseImage: custom-registry.com/prometheus-operator
+  prometheusConfigReloaderBaseImage: custom-registry.com/prometheus-config-reloader
+  configReloaderBaseImage: custom-registry.com/configmap-reload
+prometheusK8s:
+  baseImage: custom-registry.com/prometheus
+alertmanagerMain:
+  baseImage: custom-registry.com/alertmanager
+auth:
+  baseImage: custom-registry.com/openshift-oauth-proxy
+nodeExporter:
+  baseImage: custom-registry.com/node-exporter
+kubeStateMetrics:
+  baseImage: custom-registry.com/kube-state-metrics
+  addonResizerBaseImage: custom-registry.com/addon-resizer
+```
+
+> Note: The container images coming from repositories of a custom registry are expected to mirror the canonical repositories on [quay.io][quay].
+
+## Reference
+
+The following configuration options are available for Cluster Monitoring.
+
+### Config
+
+The Config object represents the top-level keys of the YAML configuration. Refer to the underlying configuration objects for their individual fields.
+
+```yaml
+[ prometheusOperator: <PrometheusOperatorConfig> ]
+[ prometheusK8s: <PrometheusK8sConfig> ]
+[ alertmanagerMain: <AlertmanagerMainConfig> ]
+[ ingress: <IngressConfig> ]
+[ auth: <AuthConfig> ]
+[ nodeExporter: <NodeExporterConfig> ]
+[ kubeStateMetrics: <KubeStateMetricsConfig> ]
+```
+
+### PrometheusOperatorConfig
+
+Use PrometheusOperatorConfig to customize the base images used by the Prometheus Operator.
+
+```yaml
+# baseImage references a base container image. Defaults to "quay.io/coreos/prometheus-operator".
+baseImage: <string>
+# prometheusConfigReloaderBaseImage references a base container image. Defaults to "quay.io/coreos/prometheus-config-reloader".
+prometheusConfigReloaderBaseImage: <string>
+# configReloaderBaseImage references a base container image. Defaults to "quay.io/coreos/configmap-reload".
+configReloaderBaseImage: <string>
+```
+
+### PrometheusK8sConfig
+
+Use PrometheusK8sConfig to customize the Prometheus instance used for cluster monitoring.
+
+```yaml
+# retention time for samples.
+retention: <string>
+# baseImage references a base container image. Defaults to "quay.io/prometheus/prometheus".
+baseImage: <string>
+# nodeSelector defines the nodes on which the Prometheus server will be scheduled.
+nodeSelector:
+  [ <labelname>: <labelvalue> ... ]
+# resources defines the resource requests and limits for the Prometheus instance. See https://kubernetes.io/docs/api-reference/v1.6/#resourcerequirements-v1-core
+resources: <v1.ResourceRequirements>
+# externalLabels allows the external labels configuration of Prometheus to be
+# specified by users
+externalLabels:
+  [ <labelname>: <labelvalue> ... ]
+```
+
+### AlertmanagerMainConfig
+
+Use AlertmanagerMainConfig to customize the central Alertmanager cluster.
+
+```yaml
+# baseImage references a base container image. Defaults to "quay.io/prometheus/alertmanager".
+baseImage: <string>
+# nodeSelector defines the nodes on which Alertmanager instances will be scheduled.
+nodeSelector:
+  [ <labelname>: <labelvalue> ... ]
+# resources defines the resource requests and limits for the Alertmanager instances. See https://kubernetes.io/docs/api-reference/v1.6/#resourcerequirements-v1-core
+resources: <v1.ResourceRequirements>
+# volumeClaimTemplate defines the template to use for persistent storage for Alertmanager nodes. See https://kubernetes.io/docs/api-reference/v1.6/#persistentvolumeclaim-v1-core
+volumeClaimTemplate: <v1.PersistentVolumeClaim>
+```
+
+### AuthConfig
+
+Use AuthConfig to configure parameters for the authentication proxies of Prometheus and Alertmanager Pods.
+
+```yaml
+# baseImage is the container image repository that will be used to deploy the monitoring auth service, along with the tag specified in the asset manifest. Defaults to the repository listed in the manifests in the assets folder.
+baseImage: <string>
+```
+### NodeExporterConfig
+
+Use NodeExporterConfig to configure parameters for deployment of the `node-exporter` components.
+
+```yaml
+# baseImage is the container image repository that will be used to deploy the node-exporter pods
+baseImage: <string>
+```
+### KubeStateMetricsConfig
+
+Use KubeStateMetricsConfig to configure parameters for deployment of the `kube-state-metrics` components.
+
+```yaml
+# baseImage is the container image repository that will be used to deploy the kube-state-metrics pods
+baseImage: <string>
+addonResizerBaseImage: <string>
+```
+
+[quay]: https://quay.io/