+---------------+         +---------------+         +---------------+
|   journald    |         |  Exporters:   |         |  pgw_scripts  |
+---------------+         |  -----------  |         +---------------+
        |                 |  node         |                 |
        |                 |  systemd      |                 |
        V                 |  influx       |                 |
+---------------+         +---------------+                 |
|   Syslog-ng   |                 |                         |
+---------------+                 |                         |
        |                         |                         |
        |                         |                         |
        V                         V                         V
+---------------+         +---------------+         +---------------+
|   Telegraf    |         |  Prometheus   | <------ |  Pushgateway  |
+---------------+         +---------------+         +---------------+
        |                         |
        |                         |
        V                         V
+---------------+         +---------------+
|   InfluxDB    | ------> |    Grafana    |
+---------------+         +---------------+

This setup has zero security and zero redundancy. It's meant to be minimal, take up as little space as possible and handle inconsequential data.
If needed:
- use ansible-vault for encrypted passwords, or use key-based access.
- use self-signed TLS for services.
- set up Prometheus and InfluxDB replication / HA.
Prometheus TSDB retention is left at the default (15 days).
The telegraf database in InfluxDB has a retention policy of 2 weeks.
prometheus-node-exporter is run with a flag that ignores cifs filesystems. This way, when a Windows shared folder is mounted and the machine goes offline, node-exporter keeps working.
Logrotate rotates daily and keeps one day's worth of rotated logs.
Journald is configured to use no more than 1G of disk space. This could be lowered further, as syslog-ng pulls messages from journald effectively instantaneously.
There are two scripts that utilize Pushgateway:
- pgw_exporter:
  - cpu usage by process
  - memory usage by process
  - process count
  - pinger
- pgw_pkgs:
  - package updates
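
For orientation, here is a minimal sketch of how a script could hand metrics such as the ones listed above to Pushgateway. It is not the actual pgw_exporter code. The metric names, the `format_metrics` and `push` helpers, the job name and the `http://localhost:9091` address are all assumptions made for illustration. It relies only on the Pushgateway HTTP API, which accepts metrics in the Prometheus text format under `/metrics/job/<job>`.

```python
import urllib.request


def format_metrics(samples):
    """Render (name, labels, value) tuples in the Prometheus text format.

    Label values are not escaped here, so this sketch assumes they contain
    no quotes, backslashes or newlines.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            rendered = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{rendered}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The request body has to end with a newline.
    return "\n".join(lines) + "\n"


def push(samples, gateway="http://localhost:9091", job="pgw_exporter"):
    """PUT the samples to Pushgateway, replacing what this job pushed before.

    The gateway address and job name are hypothetical defaults.
    """
    request = urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=format_metrics(samples).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

A call such as `push([("process_count", {}, 212)])` would publish a single gauge-like sample. Prometheus then picks it up on its next scrape of Pushgateway, which matches the arrow between the two in the diagram.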