This project provides an automated deployment infrastructure for the CachetHQ status page system using Podman containers, with integrated Prometheus AlertManager webhook middleware for automatic incident management.
All application sources and dependencies are prepared automatically by the deployment script; you do not need to manually clone or manage any application repositories except this one.
This infrastructure deploys a complete status page system consisting of:
- Cachet Application: Open-source status page system (Laravel-based)
- PostgreSQL Database: Data persistence layer
- Traefik Reverse Proxy: HTTP/HTTPS routing and SSL termination
- AlertManager Webhook Middleware: Python service that receives Prometheus alerts and automatically manages Cachet incidents and component statuses
- Two-tier Component Architecture: Invisible components (per-target monitoring) + visible components (aggregated service status)
The middleware handles alert lifecycle (firing/resolved) and intelligently updates component statuses and incidents based on alert state and target criticality.
To deploy the infrastructure, ensure the following prerequisites are met:
- System Requirements:
- Bash: Ensure Bash is installed as the default shell.
- Podman: Install Podman for container management.
- Podman-Compose: Install Podman-Compose. Version must be greater than 1.0.6 as version 1.0.6 is outdated and contains bugs related to volume mounting.
- Python 3: Ensure Python 3 and pip3 are installed.
- curl: Required for testing webhook endpoints.
- htpasswd: Installable via apache2-utils, used for HTTP authentication.
- openssl: Required for generating random keys (e.g., APP_KEY).
- systemctl: Required for managing the Podman rootless socket (systemd-based systems).
- sed, grep, awk: Standard utilities for file and string manipulation.
Manual steps to deploy on Fedora 42:
- If the machine does not have enough RAM, create a swapfile:

  ```
  btrfs filesystem mkswapfile --size 2G /swap
  swapon /swap
  ```

  Then add it to /etc/fstab:

  ```
  /swap none swap defaults 0 0
  ```

- Install required packages:

  ```
  dnf install git podman-compose && dnf update
  ```

- Create a dedicated user for running the containers:

  ```
  useradd cachet -m -s /bin/bash
  loginctl enable-linger cachet
  echo 'net.ipv4.ip_unprivileged_port_start=80' > /etc/sysctl.d/99-podman.conf
  sysctl -p /etc/sysctl.d/99-podman.conf
  ```

- Switch to the new user and set up SSH keys for GitHub:

  ```
  sudo su - cachet
  mkdir -p ~/.ssh
  curl https://github.com/<username>.keys >> ~/.ssh/authorized_keys
  chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
  git clone git@github.com:nethesis/status.git
  cd status
  ```

  Follow the Quick Start instructions below from this point.
Copy the example environment file and edit it with your values, following the inline instructions:

```
cp .env.example .env
nano .env
```

Leave empty (auto-generated):

- APP_KEY
- CACHET_API_TOKEN
Authentication:
The webhook endpoint is protected by Basic Auth. You can configure the credentials by setting the WEBHOOK_BASIC_AUTH environment variable in .env.
The format is user:hash. You can generate the hash using htpasswd -nb user password or an online generator (BCrypt, MD5, SHA1).
If not set, the default credentials are admin:admin.
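As an illustrative sketch, the `user:hash` pair can also be generated without apache2-utils, using Python's standard library to produce the SHA1 variant that `htpasswd -s` emits (the user name and password below are placeholders):

```python
import base64
import hashlib


def htpasswd_sha1(user: str, password: str) -> str:
    """Build a user:hash pair in htpasswd's SHA1 format: user:{SHA}base64(sha1(password))."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return f"{user}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"


# Put the result in WEBHOOK_BASIC_AUTH in .env
print(htpasswd_sha1("admin", "password"))
```

When apache2-utils is available, the bcrypt variant (`htpasswd -nbB user password`) is generally preferable to SHA1.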
Copy the example configuration and edit it with your infrastructure details:

```
cp middleware/prometheus.yml.example middleware/prometheus.yml
nano middleware/prometheus.yml
```

For a production deployment, use the Prometheus config from the private repository.
Required Prometheus labels for status page integration:
Your Prometheus targets configuration must include these custom labels:
- status_page_alert (required): Set to true to enable monitoring for this target
- status_page_component (required): Name(s) of visible component(s) affected by this target (comma-separated for multiple)
- status_page_critical_target (optional): Set to true to mark the target as critical. When a critical target fails, the entire visible component is set to major outage regardless of other targets' status. Default: false
See prometheus.yml.example for usage examples.
Important: Make sure all component names used in status_page_component labels are also defined in the groups_configuration section of middleware/config.json.
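For illustration, a scrape target carrying these labels might look like the following sketch (the job name is hypothetical; the target address and component name reuse the examples from the architecture section below, and prometheus.yml.example remains the authoritative reference):

```yaml
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["192.168.1.10:9100"]
        labels:
          status_page_alert: "true"              # enable status page handling
          status_page_component: "Web Server"    # visible component(s) affected
          status_page_critical_target: "false"   # set "true" to force major outage on failure
```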
Copy the example configuration and edit it to define your component groups:

```
cp middleware/config.json.example middleware/config.json
nano middleware/config.json
```

Configuration parameters:

- status_page_group: Name of the group that will be created on the status page
- status_page_components: Array of visible component names that belong to this group
Each visible component referenced in your Prometheus labels must be mapped to a group in this configuration. The setup-components.py script will use this mapping to automatically create groups and organize components during initialization.
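As a sketch of the mapping (the group and component names are placeholders, and the surrounding structure is an assumption — check config.json.example for the authoritative shape), a groups_configuration entry might look like:

```json
{
  "groups_configuration": [
    {
      "status_page_group": "Core Services",
      "status_page_components": ["Web Server", "Database"]
    }
  ]
}
```

The component names listed here must match the status_page_component labels used in middleware/prometheus.yml.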
For local development (without HTTPS), ensure your .env is properly configured:

```
ENVIRONMENT=local
CACHET_DOMAIN=localhost
WEBHOOK_DOMAIN=localhost
APP_ENV=local
APP_DEBUG=true
APP_URL=http://localhost:8080
ASSET_URL=http://localhost:8080
CERT_RESOLVER=
```

Run the deployment script:

```
./deploy.sh
```

The script will automatically:
- Validate configuration
- Prepare all application sources and dependencies (no manual cloning required)
- Generate the Laravel APP_KEY automatically
- Configure webhook authentication
- Build container images
- Start Traefik, PostgreSQL, Cachet and Middleware services
- Run database migrations
- Create the admin user and generate the API token
- Automatically set up components using Prometheus/config.json if requested
At the end of the process, the Cachet status page and middleware will be fully operational.
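Once deployed, the middleware can be exercised with a hand-built alert. The sketch below constructs a minimal AlertManager-style webhook body carrying the status_page labels described above (the alert name, instance address, and component name are placeholder values):

```python
import json

# Minimal AlertManager-style webhook body with the status_page labels.
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",  # use "resolved" to test the recovery path
            "labels": {
                "alertname": "InstanceDown",
                "instance": "192.168.1.10:9100",
                "status_page_alert": "true",
                "status_page_component": "Web Server",
            },
            "annotations": {"summary": "node exporter unreachable"},
        }
    ],
}

print(json.dumps(payload, indent=2))
```

Saved as payload.json, this could be POSTed to the webhook endpoint with something like `curl -u admin:admin -H 'Content-Type: application/json' -d @payload.json http://localhost/webhook` (credentials per the Authentication section; host and port depend on your Traefik setup).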
The middleware implements a two-tier component architecture:
Invisible Components (one per monitored target):
- Name format: <instance> | <component_names>
- Example: 192.168.1.10:9100 | Web Server, Database
- Status: Operational (1) or Major Outage (4)
- Purpose: Track individual target health
Visible Components (aggregated by service):
- Name: Service name (e.g., Web Server, Database)
- Status: Calculated by aggregating invisible component statuses
- Incident: Created/closed when status changes to/from Major Outage
Visible component status is determined by:
- Major Outage (4): All invisible components down OR any critical target down
- Partial Outage (3): Mixed statuses (some up, some down) without critical targets down
- Operational (1): All invisible components operational
Critical targets (marked with status_page_critical_target: true) have priority: if any critical target fails, the entire visible component immediately goes to Major Outage.
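The aggregation rules above can be sketched in Python (a simplified illustration, not the middleware's actual code; the function name and input shape are assumptions):

```python
OPERATIONAL, PARTIAL_OUTAGE, MAJOR_OUTAGE = 1, 3, 4


def aggregate_status(targets):
    """Compute a visible component's status from its invisible components.

    `targets` is a list of (is_up, is_critical) tuples, one per monitored target.
    """
    if any(not up and critical for up, critical in targets):
        return MAJOR_OUTAGE    # any failed critical target wins immediately
    if all(not up for up, _ in targets):
        return MAJOR_OUTAGE    # every target is down
    if any(not up for up, _ in targets):
        return PARTIAL_OUTAGE  # mixed: some up, some down, none critical-down
    return OPERATIONAL         # everything up


# Two targets up, one non-critical target down -> Partial Outage (3)
print(aggregate_status([(True, False), (True, False), (False, False)]))
```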
To monitor received webhook requests and component status changes from the middleware container, you can use the following commands:
To view all requests received on the /webhook endpoint:
podman logs <container-name> 2>/dev/null | grep "\[WEBHOOK_REQUEST\]"
Example output:
[WEBHOOK_REQUEST] source_ip=10.0.0.5 headers=[Host: example.com; User-Agent: curl/7.68.0; Content-Type: application/json] body={...}
To view all component status changes (both visible and invisible):
podman logs <container-name> 2>/dev/null | grep "\[COMPONENT_STATUS_CHANGE\]"
Example output:
[COMPONENT_STATUS_CHANGE] component="Database" old_status=1 (Operational) new_status=4 (Outage)
Replace <container-name> with the actual container name (e.g., cachet-middleware).
- middleware/README.md - Detailed middleware architecture and API
- Cachet Documentation
- Prometheus Documentation
- Traefik Documentation