prometheus-ecs-sd

ECS Service Discovery for Prometheus

Info

This tool provides Prometheus service discovery for Docker containers running on AWS ECS. You can easily instrument your app using a Prometheus client and enable discovery adding an ENV variable at the Service Task Definition. Your container will then be added to the list of Prometheus targets to be scraped.

Bridge, host and awsvpc (EC2 and Fargate) network modes are supported.

Requires

python3
The boto3 library
Prometheus 2.x.

Developing

Local development requires the poetry tool, check the documentation for installation instructions.

To start the service run

AWS_PROFILE=<your_aws_profile> make dev-start

any AWS configuration supported by boto3 is supported, e.g. individual access/secret keys.

To format code run

make format

Setup

discoverecs.py should run alongside the Prometheus server. It generates targets using JSON file service discovery. It can be started by running:

python discoverecs.py --directory /opt/prometheus-ecs

note that the directory must already exist.

The output directory is then /opt/prometheus-ecs defined in your Prometheus config as a file_sd_config job:

- job_name: 'ecs-1m'
  scrape_interval: 1m
  file_sd_configs:
      - files:
            - /opt/prometheus-ecs/1m-tasks.json
  relabel_configs:
      - source_labels: [metrics_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

You can also specify a discovery interval with --interval (in seconds). Default is 60s. We also provide caching to minimize hitting query rate limits with the AWS ECS API. discoverecs.py runs in a loop until interrupted and will output target information to stdout.

To make your application discoverable by Prometheus, you need to set the following environment variable in your task definition:

{ "name": "PROMETHEUS", "value": "true" }

Metric path and scrape interval is supported via PROMETHEUS_ENDPOINT:

"interval:/metric_path,..."

Examples:

"5m:/mymetrics,30s:/mymetrics2"
"/mymetrics"
"30s:/mymetrics1,/mymetrics2"

Under ECS task definition (task.json):

{ "name": "PROMETHEUS_ENDPOINT", "value": "5m:/mymetrics,30s:/mymetrics2" }

Available scrape intervals: 15s, 30s, 1m, 5m.

The default metric path is /metrics. The default scrape interval is 1m.

The following Prometheus configuration should be used to support all available intervals:

- job_name: 'ecs-15s'
  scrape_interval: 15s
  file_sd_configs:
      - files:
            - /opt/prometheus-ecs/15s-tasks.json
  relabel_configs:
      - source_labels: [metrics_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

- job_name: 'ecs-30s'
  scrape_interval: 30s
  file_sd_configs:
      - files:
            - /opt/prometheus-ecs/30s-tasks.json
  relabel_configs:
      - source_labels: [metrics_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

- job_name: 'ecs-1m'
  scrape_interval: 1m
  file_sd_configs:
      - files:
            - /opt/prometheus-ecs/1m-tasks.json
  relabel_configs:
      - source_labels: [metrics_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

- job_name: 'ecs-5m'
  scrape_interval: 5m
  file_sd_configs:
      - files:
            - /opt/prometheus-ecs/5m-tasks.json
  relabel_configs:
      - source_labels: [metrics_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)

To add tags to the discovered services, set the PROMETHEUS_TAGS ENV variable:

{"name": "PROMETHEUS_TAGS", "value": ",team=foo,stage=prod,"}

EC2 IAM Policy

The following IAM Policy should be added when running discoverecs.py in EC2:

  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["ecs:Describe*", "ecs:List*"],
      "Effect": "Allow",
      "Resource": "*"
    }

You will also need EC2 Read Only Access. If you use Terraform:

# Prometheus EC2 service discovery
resource "aws_iam_role_policy_attachment" "prometheus-server-role-ec2-read-only" {
  role = "${aws_iam_role.prometheus-server-role.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess"
}

Cross-Account service discovery

To discover services in other AWS accounts, follow the below steps:

Create the above IAM policy in the target accounts, too

Create an IAM role in the target accounts with the policy attached and a trust relationship with the Prometheus account so that the discovery service is trusted to assume the role. Trust Policy example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

In the account with Prometheus running, add "sts:AssumeRole" to the list of allowed actions in the IAM policy

Provide the Role ARN or multiple Role ARNs to the discovery service with --role:

python discoverecs.py --directory /opt/prometheus-ecs --role \
  arn:aws:iam::222222222222:role/CrossAccountSdExampleRole \
  arn:aws:iam::333333333333:role/CrossAccountSdExampleRole

Special cases

For skipping labels, set PROMETHEUS_NOLABELS to true. This is useful when you use "blackbox" exporters or Pushgateway in a task and metrics are exposed at a service level. This way, no EC2/ECS labels will be exposed and the instance label will always point to the job name.

Networking

All network modes are supported (bridge, host and awsvpc).

If PROMETHEUS_PORT and PROMETHEUS_CONTAINER_PORT are not set, the script will pick the first port from the container definition (in awsvpc and host network mode) or the container host network bindings in bridge mode. On Fargate, if PROMETHEUS_PORT is not set, it will default to port 80.

If PROMETHEUS_CONTAINER_PORT is set, it will look at the container host network bindings, and find the entry with a matching containerPort. It will then use the hostPort found there as target port. This is useful when the container port is known, but the hostPort is randomly picked by ECS (by setting hostPort to 0 in the task definition).

If your container uses multiple ports, it's recommended to specify PROMETHEUS_PORT (awsvpc, host) or PROMETHEUS_CONTAINER_PORT (bridge).

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
discoverecs.py		discoverecs.py
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

prometheus-ecs-sd

Info

Developing

Setup

EC2 IAM Policy

Cross-Account service discovery

Special cases

Networking

About

Releases

Packages

Languages

License

pr11t/prometheus-ecs-sd

Folders and files

Latest commit

History

Repository files navigation

prometheus-ecs-sd

Info

Developing

Setup

EC2 IAM Policy

Cross-Account service discovery

Special cases

Networking

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages