A comprehensive guide to collecting and exporting telemetry data (metrics, logs, and traces) from a Docker Swarm environment can be found at swarmlibs/dockerswarm-monitoring-guide.

A Docker Stack deployment of a monitoring suite for Docker Swarm, including Grafana, Prometheus, cAdvisor, Node exporter, and Blackbox prober exporter.
Important
This project is a work in progress and is not yet ready for production use, but feel free to test it and provide feedback.
Table of Contents:
- About
- Concepts
- Stacks
- Pre-requisites
- Getting Started
- Grafana
- Prometheus
- Services and Ports
- Troubleshooting
- License
This section covers some concepts that are important to understand for day-to-day Promstack usage and operation.
When using Prometheus in server mode, scraped samples are stored in memory and on disk. These samples need to be preserved during disruptions, such as service replacements or cluster maintenance operations which cause evictions.
On the other hand, when running Prometheus in agent mode, samples are sent to a remote write target immediately, and are not kept locally for a long time. The only use-case for storing samples locally is to allow retries when remote write targets are not available. This is achieved by keeping scraped samples in a WAL for 2h at most. Samples which have been successfully sent to remote write targets are immediately removed from local storage.
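As a sketch of that flow, the agent's remote write section might look like the following. This is a hypothetical fragment, not the stack's actual configuration: the URL is an assumption based on the cluster DNS names used by this stack, and the receiving server must accept remote write requests (for example, by being started with `--web.enable-remote-write-receiver`).

```yaml
# Hypothetical sketch — the stack manages the real agent configuration.
# Assumes the Prometheus server's in-cluster DNS name and that it accepts
# remote write requests on the standard /api/v1/write endpoint.
remote_write:
  - url: http://prometheus.svc.cluster.local:9090/api/v1/write
```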
Note
Read more about the Prometheus Agent support proposal from the prometheus-operator repository here.
The Prometheus server is the core component of the monitoring stack. It is responsible for collecting, storing and querying the metrics data. The Prometheus server is configured to receive remote write requests from the Prometheus agent.
By design, the Prometheus agent is deployed globally to all nodes and configured to automatically discover services and tasks and scrape the metrics from those deployed within the node.

You can use Docker object labels in the `deploy` block to automagically register services as targets for Prometheus. It is also configured with config provider and config reloader services.
See Register services as Prometheus targets for more information.
The `grafana` and `prometheus` services require extra services to operate, mainly to provide configuration files. There are two types of child services: a config provider and a config reloader.

Here is an example visual representation of the services:
These are the services responsible for providing the configuration files for the `grafana` and `prometheus` services.
- swarmlibs/prometheus-config-provider
- swarmlibs/grafana-provisioning-config-reloader
- prometheus-operator/prometheus-config-reloader
Here is a list of Docker Service/Task labels that are mapped to Kubernetes labels.
| Kubernetes | Docker | Scrape config |
|---|---|---|
| `namespace` | `__meta_dockerswarm_service_label_com_docker_stack_namespace` | |
| `deployment` | `__meta_dockerswarm_service_name` | |
| `pod` | `dockerswarm_task_name` | `dockerswarm/services` |
| `service` | `__meta_dockerswarm_service_name` | `dockerswarm/services-endpoints` |
- The `dockerswarm_task_name` label is a combination of the service name, slot and task id.
- The slot portion depends on the deployment mode (replicated or global): for a replicated service it is the slot number, and for a global service it is the node id. The task id itself is a unique identifier for the task.
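As an illustration of that naming scheme, a task name can be split into its parts with standard shell tools. The task name below is a hypothetical example, not output from a real cluster:

```shell
# Hypothetical task name: <service_name>.<slot_or_node_id>.<task_id>
task_name="promstack_grafana.1.abc123def456"

service_name=$(echo "$task_name" | cut -d. -f1)   # service name
slot=$(echo "$task_name" | cut -d. -f2)           # slot (replicated) or node id (global)
task_id=$(echo "$task_name" | cut -d. -f3)        # unique task id

echo "service=$service_name slot=$slot task=$task_id"
```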
These are the services that are part of the stack:
- Blackbox exporter: https://github.com/prometheus/blackbox_exporter
- cAdvisor: https://github.com/google/cadvisor
- Grafana: https://github.com/grafana/grafana
- Node exporter: https://github.com/prometheus/node_exporter
- Prometheus: https://github.com/prometheus/prometheus
- Pushgateway: https://github.com/prometheus/pushgateway
- Docker running Swarm mode
- A Docker Swarm cluster with at least 3 nodes
- Configure Docker daemon to expose metrics for Prometheus
- The official swarmlibs stack, which provides the necessary services for other stacks to operate.
There are two ways to deploy the `promstack` stack:

- Unattended deployment
- Manually deploying the `promstack` stack

The unattended deployment is the recommended way to deploy the stack. It automatically creates the necessary networks and deploys the stack to the Docker Swarm cluster. The manual deployment is useful for debugging and troubleshooting the stack.
To deploy the stack, you can use the following command:
```sh
docker run -it --rm \
    --name promstack \
    -v /var/run/docker.sock:/var/run/docker.sock \
    swarmlibs/promstack install
```
For more documentation, visit https://github.com/swarmlibs/docker-promstack.
To get started, clone this repository to your local machine:
```sh
git clone https://github.com/swarmlibs/promstack.git
# or
gh repo clone swarmlibs/promstack
```
Navigate to the project directory:
```sh
cd promstack
```
Create user-defined networks:
```sh
make stack-networks
# or run the following commands to create the networks manually
docker network create --scope=swarm --driver=overlay --attachable public
docker network create --scope=swarm --driver=overlay --attachable prometheus
docker network create --scope=swarm --driver=overlay --attachable prometheus_gwnetwork
```
- The `public` network is used by the Ingress service and the Blackbox exporter to perform network probes.
- The `prometheus` network is used to perform service discovery for Prometheus scrape configs.
- The `prometheus_gwnetwork` network is used for internal communication between the Prometheus server, exporters and other agents.
The `grafana` and `prometheus` services require extra services to operate, mainly to provide configuration files. There are two types of child services: a config provider and a config reloader. To ensure correct placement of these services, you need to deploy the `swarmlibs` stack.
See https://github.com/swarmlibs/swarmlibs for more information.
This will deploy the stack to the Docker Swarm cluster. Please ensure you have the necessary permissions to deploy the stack and that the `swarmlibs` stack is deployed. See Pre-requisites for more information.
Important
It is important to note that `promstack` is the default stack namespace for this deployment. It is NOT RECOMMENDED to change the stack namespace, as it may cause issues with the deployment.
```sh
make deploy
```
Warning
This will remove the stack and all the services associated with it. Use with caution.
```sh
make remove
```
To verify the deployment, you can use the following commands:
```sh
docker stack services promstack
# NAME                                                 MODE         REPLICAS               IMAGE
# promstack_blackbox-exporter                          replicated   1/1 (max 1 per node)   prom/blackbox-exporter:v0.25.0
# promstack_cadvisor                                   global       1/1                    gcr.io/cadvisor/cadvisor:v0.49.1
# promstack_grafana                                    replicated   1/1 (max 1 per node)   busybox:latest
# promstack_grafana-dashboard-provider                 global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_grafana-provisioning-alerting-provider     global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_grafana-provisioning-config-reloader       global       1/1                    swarmlibs/grafana-provisioning-config-reloader:0.1.0-rc.3
# promstack_grafana-provisioning-dashboard-provider    global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_grafana-provisioning-datasource-provider   global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_grafana-server                             global       1/1                    grafana/grafana:11.3.0
# promstack_node-exporter                              global       1/1                    prom/node-exporter:v1.8.1
# promstack_prometheus                                 global       1/1                    swarmlibs/genconfig:0.1.0-rc.1
# promstack_prometheus-agent                           global       1/1                    prom/prometheus:v3.0.0
# promstack_prometheus-config-reloader                 global       1/1                    quay.io/prometheus-operator/prometheus-config-reloader:v0.74.0
# promstack_prometheus-rule-provider                   global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_prometheus-scrape-config-provider          global       1/1                    swarmlibs/prometheus-config-provider:0.1.0-rc.1
# promstack_prometheus-server                          replicated   1/1 (max 1 per node)   prom/prometheus:v3.0.0
# promstack_prometheus-service-discovery               global       1/1                    swarmlibs/prometheus-service-discovery:0.1.0-rc.1
# promstack_pushgateway                                replicated   1/1 (max 1 per node)   prom/pushgateway:v1.10.0
```
You can continuously monitor the deployment by running the following command:

```sh
# The `watch` command will continuously monitor the services in the stack and update the output every second.
watch -n1 docker stack services promstack
```
The Grafana service is configured with config provider and config reload services. The config provider service is responsible for providing the configuration files for the Grafana service. The config reloader service is responsible for reloading the Grafana service configuration when the config provider service updates the configuration files.
The following configurations are supported:
- Grafana Dashboards
- Provisioning (Datasources, Dashboards)
To inject Grafana dashboards, you need to specify a config object in your `docker-compose.yml` or `docker-stack.yml` file as shown below. The label `io.grafana.dashboard=true` is used by the config provider service to inject the dashboards into Grafana.
```yaml
# See grafana/docker-stack.yml
configs:
  # Grafana & Prometheus dashboards
  gf-dashboard-grafana-metrics:
    name: gf-dashboard-grafana-metrics-v1
    file: ./dashboards/grafana-metrics.json
    labels:
      io.grafana.dashboard: "true"
```
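Note that Docker Swarm config objects are immutable, so updating a dashboard file requires creating the config under a new name. A common pattern (an assumption here, not something this stack enforces) is to bump a version suffix in the `name` field:

```yaml
configs:
  gf-dashboard-grafana-metrics:
    # Bump the version suffix (v1 -> v2) whenever the dashboard JSON changes,
    # since Swarm config objects cannot be updated in place.
    name: gf-dashboard-grafana-metrics-v2
    file: ./dashboards/grafana-metrics.json
    labels:
      io.grafana.dashboard: "true"
```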
To inject Grafana provisioning configurations, you need to specify a config object in your `docker-compose.yml` or `docker-stack.yml` file as shown below.

There are two types of provisioning configurations:

- Dashboards: Use the `io.grafana.provisioning.dashboard=true` label to inject the provisioning configuration for dashboards.
- Datasources: Use the `io.grafana.provisioning.datasource=true` label to inject the provisioning configuration for data sources.
```yaml
# See grafana/docker-stack.yml
configs:
  # Grafana dashboards provisioning config
  gf-provisioning-dashboards:
    name: gf-provisioning-dashboards-v1
    file: ./provisioning/dashboards/grafana-dashboards.yml
    labels:
      io.grafana.provisioning.dashboard: "true"
  # Grafana datasources provisioning config
  gf-provisioning-datasource-prometheus:
    name: gf-provisioning-datasource-prometheus-v1
    file: ./provisioning/datasources/prometheus.yaml
    labels:
      io.grafana.provisioning.datasource: "true"
```
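For reference, a minimal datasource provisioning file such as the `prometheus.yaml` referenced above might look like the following. This is a hedged sketch using Grafana's standard provisioning format; the URL is an assumption based on this stack's cluster DNS names, not the file actually shipped with the stack:

```yaml
# Hypothetical sketch of a Grafana datasource provisioning file.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.svc.cluster.local:9090  # assumed in-cluster address
    isDefault: true
```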
By design, the Prometheus server is configured to automatically discover and scrape the metrics from the Docker Swarm nodes, services and tasks. The default data retention is 182 days or ~6 months.
You can use Docker object labels in the `deploy` block to automagically register services as targets for Prometheus. The server is also configured with config provider and config reloader services.
- `io.prometheus.enabled`: Enable Prometheus scraping for the service.
- `io.prometheus.job_name`: The Prometheus job name. Default is `<docker_stack_namespace>/<service_name|job_name>`.
- `io.prometheus.scrape_scheme`: The scheme used to scrape the metrics. Default is `http`.
- `io.prometheus.scrape_port`: The port to scrape the metrics from. Default is `80`.
- `io.prometheus.metrics_path`: The path to scrape the metrics from. Default is `/metrics`.
- `io.prometheus.param_<name>`: Additional Prometheus scrape parameters.
Example:
```yaml
# Annotations:
services:
  my-app:
    # ...
    networks:
      prometheus:
    deploy:
      # ...
      labels:
        io.prometheus.enabled: "true"
        io.prometheus.job_name: "my-app"
        io.prometheus.scrape_port: "8080"

# Due to limitations of Docker Swarm, you need to attach the service to the prometheus network.
# This is required to allow the Prometheus server to scrape the metrics.
networks:
  prometheus:
    name: prometheus
    external: true
```
To register a custom scrape config, you need to specify a config object in your `docker-compose.yml` or `docker-stack.yml` file as shown below. The label `io.prometheus.scrape_config=true` is used by the Prometheus config provider service to inject the scrape config into Prometheus.
```yaml
# See cadvisor/docker-stack.yml
configs:
  prometheus-cadvisor:
    name: prometheus-cadvisor-v1
    file: ./prometheus/cadvisor.yml
    labels:
      io.prometheus.scrape_config: "true"
```
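The referenced file contains an ordinary Prometheus scrape configuration. A minimal sketch of what such a file might contain is shown below; the job name and target address are assumptions for illustration, and the real `cadvisor/docker-stack.yml` may rely on Swarm service discovery instead of static targets:

```yaml
# Hypothetical sketch of a scrape config file such as ./prometheus/cadvisor.yml
scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]  # assumed in-cluster address for cAdvisor
```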
You can apply custom configurations to Prometheus via environment variables by running the `docker service update` command on the `promstack_prometheus` service:

```sh
# Set a custom scrape interval
docker service update --env-add PROMETHEUS_SCRAPE_INTERVAL=15s promstack_prometheus

# Remove the custom scrape interval
docker service update --env-rm PROMETHEUS_SCRAPE_INTERVAL promstack_prometheus
```
- `PROMETHEUS_SCRAPE_INTERVAL`: The scrape interval for Prometheus, default is `10s`
- `PROMETHEUS_SCRAPE_TIMEOUT`: The scrape timeout for Prometheus, default is `5`
- `PROMETHEUS_EVALUATION_INTERVAL`: The evaluation interval for Prometheus, default is `1m`
- `PROMETHEUS_CLUSTER_NAME`: The cluster name for Prometheus, default is `promstack`
- `PROMETHEUS_CLUSTER_REPLICA`: The cluster replica for Prometheus, default is `1`
Note
Configuration changes will be applied to the Prometheus server immediately with no downtime to the service.
Important
The Prometheus server is designed to be deployed as a single instance and should be used with the built-in Grafana dashboards for in-cluster monitoring and alerting.
It is recommended to deploy separate Prometheus storage servers in a high-availability configuration, such as Thanos, Cortex, or Grafana Mimir, for long-term storage and querying. Due to the complexity and limitations of Docker Swarm, such a deployment is not offered as part of this stack.
Important
The Prometheus Agent is currently not configurable. It is designed to be deployed globally to all nodes, automatically discover services and tasks, scrape metrics from those deployed on the node, and forward them to the Prometheus server.
Alertmanager is a Prometheus component that handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, Slack, etc.
By default, the Alertmanager is disabled. To enable the Alertmanager, you need to specify the following environment variables:
- `PROMETHEUS_ALERTMANAGER_ENABLED`: Enable Alertmanager for the Prometheus server, default is `false`
- `PROMETHEUS_ALERTMANAGER_ADDR`: The Alertmanager service address
- `PROMETHEUS_ALERTMANAGER_PORT`: The Alertmanager service port, default is `9093`
Remote write is a feature that allows Prometheus servers to send samples to a remote storage system, e.g. Thanos, Cortex, or Grafana Mimir.

By default, remote write is disabled. To enable it, you need to specify the following environment variables:

- `PROMETHEUS_REMOTE_WRITE_ENABLED`: Enable remote write for the Prometheus server, default is `false`
- `PROMETHEUS_REMOTE_WRITE_URL`: The remote write URL for the Prometheus server to send metrics to.
The following services and ports are exposed by the stack. You can access them via the `prometheus` network using the cluster DNS name.

| Service | Cluster DNS | Port |
|---|---|---|
| Grafana | `grafana.svc.cluster.local` | 3000 |
| Prometheus | `prometheus.svc.cluster.local` | 9090 |
| Pushgateway | `pushgateway.svc.cluster.local` | 9091 |
| Blackbox exporter | `blackbox-exporter.svc.cluster.local` | 9115 |
The following services and ports are exposed per node; you can access them via the node IP address, e.g. `http://<node_ip>:<port>`.

| Service | Port |
|---|---|
| Prometheus Agent | 19090 |
| cAdvisor | 18080 |
| Node exporter | 19100 |
If the Grafana dashboards are not present, please restart the `grafana` service to reload the dashboards.

```sh
# Force-updating the service restarts it and reloads the dashboards.
docker service update --force promstack_grafana
```
Please ensure the services are attached to the `prometheus` network. This is required to allow the Prometheus server to scrape the metrics.
```yaml
# Annotations:
services:
  my-app:
    # ...
    networks:
      prometheus:
    deploy:
      # ...
      labels:
        io.prometheus.enabled: "true"
        io.prometheus.job_name: "my-app"
        io.prometheus.scrape_port: "8080"

# Due to limitations of Docker Swarm, you need to attach the service to the prometheus network.
# This is required to allow the Prometheus server to scrape the metrics.
networks:
  prometheus:
    name: prometheus
    external: true
```
Licensed under the MIT License. See LICENSE for more information.