Skep is a monitoring dashboard for Docker Swarm.
See skepdocker.github.io for more screenshots, features, etc.
You may find Skep to be a useful addition to your toolbox along with projects like these:
Launch Skep using the default configuration by running the following command on any Swarm manager:
curl -sSL https://raw.githubusercontent.com/bobf/skep/master/docker-compose.yml | docker-compose -f - config | docker stack deploy -c - skep
Skep will be available on any Swarm node on port 8080.
When you have finished evaluating Skep you can remove the stack to destroy all services and networks:
docker stack rm skep
The agent service is responsible for harvesting host and container metrics; configure this service as appropriate for your hardware/operating system setup.
See the sections below to configure each of Skep's components. The provided example docker-compose.yml can be used as a starting point.
When you have a docker-compose.yml that suits your requirements you can launch Skep by executing the following command on any Swarm manager:
docker-compose -f <your-compose-file.yml> config | docker stack deploy -c - skep
Variable | Meaning | Example |
---|---|---|
SKEP_PRIVATE_PORT | Port used for internal communications between Skep services. Do not publish this port. | 6666 (default/recommended) |
Variable | Meaning | Example |
---|---|---|
SKEP_APP_URL | URL that agent containers will use to send metrics to Skep web application | http://app:6666/ (default/recommended) |
DISKS | Comma-separated list of disk devices to monitor (disk activity) | sda,sdc |
FILE_SYSTEMS | Comma-separated list of file systems to monitor (available space) | /hostfs/root,/hostfs/backups (see file systems) |
NETWORK_INTERFACES | Comma-separated list of network devices to monitor (traffic) [not yet implemented] | eth0,eth3 |
COLLECT_INTERVAL | Time in seconds to wait between gathering metrics. | 5 |
SAMPLE_DURATION | Minimum time in seconds to monitor disk I/O etc. Will accumulate for multiple devices. | 10 |
LOG_LEVEL | By default, the agent only logs initial configuration on launch and errors. Set to DEBUG to log all statistics. | INFO (default/recommended) |
SKEP_HOST | Set to docker-desktop when running on Docker Desktop for Mac | docker-desktop |
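As a rough illustration of how these variables fit together, the sketch below reads them into a Python dictionary. It is not Skep's actual code: the function names are hypothetical, and the numeric fallbacks for COLLECT_INTERVAL and SAMPLE_DURATION are assumptions (the table lists 5 and 10 only as examples).

```python
import os


def split_list(value):
    """Split a comma-separated string into a list, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_agent_config(env=None):
    """Build an agent configuration dict from environment-style variables.

    Hypothetical helper: the numeric fallbacks below are assumptions,
    not documented Skep defaults.
    """
    env = os.environ if env is None else env
    return {
        "app_url": env.get("SKEP_APP_URL", "http://app:6666/"),
        "disks": split_list(env.get("DISKS", "")),
        "file_systems": split_list(env.get("FILE_SYSTEMS", "")),
        "network_interfaces": split_list(env.get("NETWORK_INTERFACES", "")),
        "collect_interval": float(env.get("COLLECT_INTERVAL", "5")),
        "sample_duration": float(env.get("SAMPLE_DURATION", "10")),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
```

Passing a plain dict instead of `os.environ` makes the parsing easy to try out without touching the real environment.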
Variable | Meaning | Example |
---|---|---|
SKEP_APP_URL | URL that agent containers will use to send metrics to Skep web application | http://app:6666/ (default/recommended) |
SERVICE_URL_TEMPLATE | URL template for service names | See URL templating |
IMAGE_URL_TEMPLATE | URL template for image names | See URL templating |
LOG_LEVEL | By default, the monitor only logs initial configuration on launch and errors. Set to DEBUG to log all statistics. | INFO (default/recommended) |
COLLECT_INTERVAL | Time in seconds to wait between gathering metrics. | 5 |
SAMPLE_DURATION | Minimum time in seconds to monitor disk I/O etc. Will accumulate for multiple devices. | 10 |
Variable | Meaning | Example |
---|---|---|
SKEP_APP_URL | URL that agent containers will use to send metrics to Skep web application | http://app:6666/ (default/recommended) |
SKEP_CHARTS_URL | URL that the charts service will be available on for handling chart requests. Sent to App service every time charts are updated. | http://charts:8080/ |
SKEP_CHARTS_DB_PATH | Path to statistics SQLite3 database. Mount a shared storage endpoint to this location if you want to retain data between restarts. | /charts.db (default/recommended) |
SKEP_CHARTS_DB_PERSIST | By default, the statistics database is re-initialised on startup. Set this variable to any value to retain data between restarts. | (not set) |
LOG_LEVEL | Application server log level. | INFO (default/recommended) |
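The interaction between SKEP_CHARTS_DB_PATH and SKEP_CHARTS_DB_PERSIST can be pictured with the following sketch. It only illustrates the documented behaviour (re-initialise unless the persist variable is set). The `open_charts_db` function and the `samples` table schema are hypothetical and not taken from Skep's source.

```python
import os
import sqlite3


def open_charts_db(env=None):
    """Open the statistics database, wiping it first unless persistence is on.

    Hypothetical helper; the table schema is an assumption for illustration.
    """
    env = os.environ if env is None else env
    path = env.get("SKEP_CHARTS_DB_PATH", "/charts.db")
    # "Set this variable to any value": presence alone enables persistence.
    persist = "SKEP_CHARTS_DB_PERSIST" in env
    if not persist and os.path.exists(path):
        os.remove(path)
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS samples "
        "(timestamp REAL, name TEXT, value REAL)"
    )
    connection.commit()
    return connection
```

With the persist variable set and a volume mounted at the database path, rows written before a restart would still be present afterwards.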
Skep uses the gunicorn web server in conjunction with Flask and Flask-SocketIO.
To deploy Skep behind Nginx the following configuration can be used:
upstream skep {
    # Docker Swarm Nodes:
    server node1:8080;
    server node2:8080;
    server node3:8080;
}
server {
    server_name skep.example.com;
    location / {
        proxy_pass http://skep;
    }
    location /socket.io {
        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_set_header Origin "";
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_pass http://skep;
    }
    listen 80;
    listen [::]:80;
}
To monitor a file system it must be mounted into the agent as a Docker bind mount. The FILE_SYSTEMS environment variable should refer to the destination of the bind mount. Skep uses the base path /hostfs for mounting host file systems but any valid path is acceptable.
For example, to monitor the root file system, the following configuration might be used:
stats:
  image: skep/stats
  volumes:
    - "/:/hostfs/root:ro"
  environment:
    FILE_SYSTEMS: '/hostfs/root'
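To show why the bind mount destination is what matters, here is a minimal sketch of how available space could be measured for each path in FILE_SYSTEMS using Python's standard library. The `file_system_usage` function is hypothetical. Skep's agent may gather these figures differently.

```python
import shutil


def file_system_usage(paths):
    """Return total/used/free bytes for each mounted path."""
    stats = {}
    for path in paths:
        usage = shutil.disk_usage(path)
        stats[path] = {
            "total": usage.total,
            "used": usage.used,
            "free": usage.free,
        }
    return stats
```

Inside the container, calling `file_system_usage(["/hostfs/root"])` would report on the host's root file system because that is what the bind mount exposes at that path.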
URL templating is supported for service names and image IDs. When the relevant environment variable is set, service names and image IDs will be rendered as hyperlinks according to a provided Python format string. See the table below for available parameters:
Parameter | Meaning | Example |
---|---|---|
name | Name of service | skep_app |
id | Service ID | yw1iaod282a7 |
Parameter | Meaning | Example |
---|---|---|
organization | Image organization owner | skep |
repository | Image repository name | app |
tag | Image tag | latest |
# .env
SERVICE_URL_TEMPLATE=https://github.com/bobf/{name}
IMAGE_URL_TEMPLATE=https://hub.docker.com/r/{organization}/{repository}
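Because the templates are Python format strings, their expansion can be previewed with `str.format`. The sketch below uses the example templates and parameter values from the tables above. The two helper functions are hypothetical and only illustrate the substitution.

```python
SERVICE_URL_TEMPLATE = "https://github.com/bobf/{name}"
IMAGE_URL_TEMPLATE = "https://hub.docker.com/r/{organization}/{repository}"


def render_service_url(template, name, service_id):
    """Expand a service URL template; unused parameters are ignored."""
    return template.format(name=name, id=service_id)


def render_image_url(template, organization, repository, tag):
    """Expand an image URL template; unused parameters are ignored."""
    return template.format(
        organization=organization, repository=repository, tag=tag
    )


print(render_service_url(SERVICE_URL_TEMPLATE, "skep_app", "yw1iaod282a7"))
# https://github.com/bobf/skep_app
print(render_image_url(IMAGE_URL_TEMPLATE, "skep", "app", "latest"))
# https://hub.docker.com/r/skep/app
```

A template only needs to mention the parameters it uses. For example, the image template above omits `{tag}`.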
Skep comprises four services:
- An agent which is deployed globally (i.e. to all Swarm nodes);
- A monitor which must be deployed to one manager node;
- A charts service which stores and calculates chart data; it can be deployed to any node and must have only one replica;
- A web app that can be deployed to any node and must have only one replica.
The agent periodically harvests system and container metrics which are sent to the charts and app services; the app service forwards the data to the React front end using WebSockets / socket.io. The charts service retains the data in an SQLite3 database.
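The periodic harvest-and-send cycle might look roughly like the sketch below. This is an illustration under assumptions, not the agent's real implementation. The `/metrics` path, the payload shape, and the function names are all hypothetical, and only a single destination URL is shown.

```python
import json
import time
import urllib.request


def build_metrics_request(app_url, hostname, metrics):
    """Build a JSON POST request carrying one batch of metrics.

    The "/metrics" path and payload layout are assumptions.
    """
    body = json.dumps({"hostname": hostname, "metrics": metrics}).encode("utf-8")
    return urllib.request.Request(
        app_url.rstrip("/") + "/metrics",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(app_url, hostname, collect, interval=5, iterations=None,
              send=urllib.request.urlopen, sleep=time.sleep):
    """Collect metrics, send them, wait `interval` seconds, and repeat.

    `iterations=None` loops forever; `send` and `sleep` are injectable
    so the loop can be exercised without a network.
    """
    count = 0
    while iterations is None or count < iterations:
        send(build_metrics_request(app_url, hostname, collect()))
        sleep(interval)
        count += 1
```

The `interval` argument plays the role of COLLECT_INTERVAL, and `collect` stands in for whatever gathers host and container statistics.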
Chart requests are sent to the app which forwards to the charts service. A confirmation is immediately returned to the front end while the charts service uses one of its worker processes to render the chart data. When the data has been compiled it is sent back to the front end via a WebSocket event.
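The acknowledge-now, deliver-later pattern described above can be sketched with a worker pool. Note the substitutions: this example uses threads from `concurrent.futures` rather than the worker processes Skep uses, and a plain callback stands in for the WebSocket event. The `ChartService` class and its summary "chart" are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class ChartService:
    """Accept chart requests immediately and render them in the background."""

    def __init__(self, emit, workers=2):
        # `emit(event_name, data)` stands in for a WebSocket emit call.
        self._emit = emit
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def request_chart(self, request_id, samples):
        """Queue rendering and return a confirmation straight away."""
        self._pool.submit(self._render, request_id, samples)
        return {"request_id": request_id, "status": "accepted"}

    def _render(self, request_id, samples):
        """Compile a simple summary and push it to the front end."""
        chart = {
            "request_id": request_id,
            "min": min(samples),
            "max": max(samples),
            "mean": sum(samples) / len(samples),
        }
        self._emit("chart", chart)

    def close(self):
        """Wait for queued work to finish and release the workers."""
        self._pool.shutdown(wait=True)
```

The caller gets the confirmation without waiting for rendering, and the finished data arrives later through `emit`, keyed by `request_id` so the front end can match it to the original request.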
Redux is used in the front end to manage events and data storage/manipulation.
Agents use bind mounts to access metrics from the host system (/proc, /etc/, and /dev are mounted). Agents also gather statistics about containers running on each host by mounting the Docker socket (/var/run/docker.sock).
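As an example of the kind of data a mounted /proc makes available, the sketch below parses memory figures from a meminfo file. It is illustrative only: the function names are hypothetical, and the path where the host's /proc appears inside the agent container is an assumption left to the caller.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; lines without a numeric value are skipped.
    """
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse a meminfo file; pass the mounted host path if it differs."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Keeping the parser separate from the file read makes it possible to check the parsing against captured sample text.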
All services are written in Python 3.
Skep utilises the excellent Docker SDK for Python extensively.
The web application uses the equally excellent Flask web framework and Flask-SocketIO.
The front end is read-only. No changes to a swarm can be made via the web application. A best-effort approach to filtering sensitive data (e.g. passwords in environment configurations) is implemented using simple heuristics. Regardless, as with all similar systems, it is highly recommended that you run Skep behind a firewall and/or an authentication layer.
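To give a sense of what a simple heuristic filter could look like, the sketch below masks the values of environment entries whose names look sensitive. The pattern list and the `mask_environment` function are assumptions for illustration. Skep's actual heuristics are not documented here and may differ.

```python
import re

# Hypothetical pattern: key names that commonly hold credentials.
SENSITIVE_KEY = re.compile(
    r"(password|passwd|secret|token|api_?key)", re.IGNORECASE
)


def mask_environment(variables, placeholder="********"):
    """Mask values of KEY=value entries whose key looks sensitive."""
    masked = []
    for entry in variables:
        key, separator, _value = entry.partition("=")
        if separator and SENSITIVE_KEY.search(key):
            masked.append(f"{key}={placeholder}")
        else:
            masked.append(entry)
    return masked
```

A filter like this can miss secrets stored under unremarkable names, which is one reason the firewall or authentication layer is still recommended.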
Feel free to make a pull request.