This document describes the security architecture, practices, and controls implemented in Bosun.
Bosun follows a defense-in-depth approach with these core principles:
- Least Privilege: Operations use minimal permissions required
- Secure by Default: Security measures are always enabled (no opt-out)
- Fail Secure: Errors fail closed, never exposing sensitive data
- Input Validation: All external inputs are validated before use
- Secret Isolation: Secrets are isolated from logs, environment variables, and error messages
Bosun uses SOPS (Secrets OPerationS) for encrypting sensitive configuration files. SOPS provides:
- Encryption at rest for YAML/JSON files
- Support for multiple key management backends
- Partial encryption (only values are encrypted, keys remain readable)
- Git-friendly encrypted file format
Implementation: internal/internal/reconcile/sops.go
// SOPSOps provides SOPS decryption operations
type SOPSOps struct{}
// Decrypt decrypts a SOPS-encrypted file and returns the plaintext bytes
func (s *SOPSOps) Decrypt(ctx context.Context, file string) ([]byte, error)
Bosun uses age as the encryption backend for SOPS. Age provides:
- Modern, audited cryptography (X25519 + ChaCha20-Poly1305)
- Simple key format (single line text files)
- No external dependencies or key servers
Key Location Priority (checked in order):
1. `SOPS_AGE_KEY` environment variable (inline key)
2. `SOPS_AGE_KEY_FILE` environment variable (path to key file)
3. Default: `~/.config/sops/age/keys.txt`
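The lookup order above can be sketched as a small Go function. This is an illustrative sketch, not Bosun's or SOPS's actual code: `keySource` and `resolveAgeKey` are hypothetical names, and the environment lookup is injected so the order can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// keySource describes where an age key would be read from (hypothetical type).
type keySource struct {
	Kind  string // "inline", "file", or "default"
	Value string // key material for "inline", otherwise a file path
}

// resolveAgeKey applies the documented priority order:
// SOPS_AGE_KEY, then SOPS_AGE_KEY_FILE, then the default key file.
func resolveAgeKey(getenv func(string) string, home string) keySource {
	if key := getenv("SOPS_AGE_KEY"); key != "" {
		return keySource{Kind: "inline", Value: key}
	}
	if path := getenv("SOPS_AGE_KEY_FILE"); path != "" {
		return keySource{Kind: "file", Value: path}
	}
	def := filepath.Join(home, ".config", "sops", "age", "keys.txt")
	return keySource{Kind: "default", Value: def}
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		return
	}
	src := resolveAgeKey(os.Getenv, home)
	// Print only the kind: an inline key is secret material.
	fmt.Println("age key source:", src.Kind)
}
```

Printing only the kind, never the value, keeps the sketch consistent with the secret-isolation principle stated at the top of this document.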
Implementation: internal/internal/cmd/init.go
During bosun init, age keys are generated with secure defaults:
// Create key directory with restricted permissions
if err := os.MkdirAll(keyDir, 0700); err != nil {
return "", fmt.Errorf("create key directory: %w", err)
}
// Generate key using age-keygen
keygen := exec.Command("age-keygen", "-o", ageKeyFile)
// Set secure permissions on key file
if err := os.Chmod(ageKeyFile, 0600); err != nil {
return "", fmt.Errorf("set key permissions: %w", err)
}
Security Controls:
| Resource | Permission | Rationale |
|---|---|---|
| Key directory (`~/.config/sops/age/`) | `0700` | Owner-only access |
| Key file (`keys.txt`) | `0600` | Owner read/write only |
To rotate age keys:
1. Generate a new key: `age-keygen -o ~/.config/sops/age/keys-new.txt`
2. Update `.sops.yaml` with the new public key:

       creation_rules:
         - path_regex: .*\.sops\.yaml$
           age: age1newpublickeyhere...

3. Re-encrypt existing secrets (for each encrypted file): `sops updatekeys secrets.sops.yaml`
4. Verify that decryption works with the new key
5. Archive the old key securely (do not delete it immediately; it is still needed for backups)
Implementation: internal/internal/reconcile/template.go
Secrets are passed to templates via temporary files rather than environment variables. This prevents:
- Secret leakage in process listings (`ps aux`)
- Secret exposure in shell history
- Secret inheritance by child processes
// Write secrets to a temporary file with restricted permissions (0600)
// instead of passing the actual secret values via environment variables
secretsFile, err := os.CreateTemp("", "bosun-secrets-*.json")
if err != nil {
return fmt.Errorf("failed to create temp secrets file: %w", err)
}
secretsPath := secretsFile.Name()
defer func() {
secretsFile.Close()
os.Remove(secretsPath) // Cleanup after use
}()
// Set restrictive permissions before writing
if err := os.Chmod(secretsPath, 0600); err != nil {
return fmt.Errorf("failed to set secrets file permissions: %w", err)
}
Template Access Pattern:
// Templates access secrets via file path (not content):
// {{ $secrets := fromJson (include (env "BOSUN_SECRETS_FILE")) }}
cmd.Env = append(filterSafeEnv(os.Environ()), "BOSUN_SECRETS_FILE="+secretsPath)
| File Type | Permission | Rationale |
|---|---|---|
| Secrets temp file | 0600 | Owner read/write only |
| Rendered output files | 0644 | World-readable (contains no secrets) |
| Output directories | 0755 | Standard directory permissions |
| Staging directories | 0755 | Standard directory permissions |
Temporary secret files are cleaned up using Go's defer pattern:
defer func() {
secretsFile.Close()
os.Remove(secretsPath)
}()
This ensures cleanup occurs even if template rendering fails.
Implementation: internal/internal/reconcile/deploy.go
SSH connections include multiple security controls:
// SSH connection with security options
cmd := exec.CommandContext(ctx, "ssh",
"-o", "ConnectTimeout=5", // Prevent hanging on unreachable hosts
"-o", "BatchMode=yes", // Disable password prompts (key-only)
host, "exit", "0",
)
Security Options:
| Option | Value | Purpose |
|---|---|---|
| ConnectTimeout | 5 seconds | Prevent DoS via slow hosts |
| BatchMode | yes | Disable interactive prompts, enforce key auth |
Bosun implements exponential backoff retry for transient SSH errors:
// Transient error patterns that trigger retry
transientPatterns := []string{
"connection refused",
"connection reset",
"connection timed out",
"network is unreachable",
"no route to host",
"host is down",
"operation timed out",
"i/o timeout",
"temporary failure",
}
Non-transient errors (authentication failures, host key verification) fail immediately.
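The retry behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in deploy.go: `isTransient`, `retryTransient`, the attempt count, the base delay, and the two sample error messages are all hypothetical, and the sketch sleeps between attempts instead of honouring a context.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// transientPatterns repeats the list above.
var transientPatterns = []string{
	"connection refused", "connection reset", "connection timed out",
	"network is unreachable", "no route to host", "host is down",
	"operation timed out", "i/o timeout", "temporary failure",
}

// Sample errors for the demo in main (hypothetical messages).
var (
	errRefused = errors.New("ssh: connect to host example port 22: Connection refused")
	errAuth    = errors.New("Permission denied (publickey)")
)

// isTransient reports whether the error text matches a transient pattern.
func isTransient(err error) bool {
	if err == nil {
		return false
	}
	msg := strings.ToLower(err.Error())
	for _, p := range transientPatterns {
		if strings.Contains(msg, p) {
			return true
		}
	}
	return false
}

// retryTransient runs op up to maxAttempts times and doubles the delay
// after each transient failure. Any other error is returned immediately.
func retryTransient(maxAttempts int, baseDelay time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := baseDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil || !isTransient(err) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryTransient(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errRefused // transient: retried
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)

	err = retryTransient(4, 10*time.Millisecond, func() error { return errAuth })
	fmt.Println("authentication failure returned without retry:", err)
}
```

Matching on lowercased error text keeps the classifier independent of how a particular SSH client capitalises its messages.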
Operation Timeouts (defined in internal/internal/reconcile/deploy.go):
| Operation | Timeout | Rationale |
|---|---|---|
| SSH Connect | 5 seconds | Quick failure detection |
| SSH Commands | 30 seconds | Reasonable for remote ops |
| Rsync Transfer | 5 minutes | Large file transfers |
| Docker Compose Up | 10 minutes | Container pulls/startup |
Implementation: internal/internal/reconcile/validation.go
SSH hosts are validated to prevent command injection:
// Reject SSH option injection (arguments starting with -)
if strings.HasPrefix(host, "-") {
return fmt.Errorf("invalid host: cannot start with '-' (potential SSH option injection)")
}
// Reject shell metacharacters
shellMetachars := []string{";", "&", "|", "$", "`", "(", ")", "{", "}", "<", ">", "\\", "\n", "\r", "'", "\""}
for _, char := range shellMetachars {
if strings.Contains(host, char) {
return fmt.Errorf("invalid host: contains shell metacharacter %q", char)
}
}
// Validate format with regex
hostPattern = regexp.MustCompile(`^([a-zA-Z0-9_-]+@)?[a-zA-Z0-9.-]+$`)
Implementation: internal/internal/reconcile/template.go
Environment variables are filtered to prevent secret leakage to child processes:
Excluded Prefixes:
| Prefix | Reason |
|---|---|
| SOPS_ | Contains encryption keys |
| AWS_ | Cloud credentials |
| AZURE_ | Cloud credentials |
| GCP_, GOOGLE_ | Cloud credentials |
| DO_ | DigitalOcean credentials |
| LINODE_ | Linode credentials |
| VULTR_ | Vultr credentials |
| CLOUDFLARE_ | Cloudflare credentials |
| HETZNER_ | Hetzner credentials |
| OVH_ | OVH credentials |
| API_KEY | Generic API keys |
| SECRET | Generic secrets |
| TOKEN | Generic tokens |
| PASSWORD | Passwords |
| CREDENTIAL | Credentials |
Excluded Suffixes:
| Suffix | Reason |
|---|---|
| _TOKEN | Auth tokens |
| _SECRET | Secret values |
| _KEY | API/encryption keys |
| _PASS, _PASSWORD | Passwords |
| _AUTH | Auth credentials |
| _CREDENTIAL, _CREDENTIALS | Credentials |
Excluded Exact Matches:
| Variable | Reason |
|---|---|
| GITHUB_TOKEN | CI/CD token |
| GITLAB_TOKEN | CI/CD token |
| NPM_TOKEN | Registry auth |
| DOCKER_AUTH | Registry auth |
| REGISTRY_AUTH | Registry auth |
| SSH_AUTH_SOCK | SSH agent socket |
| GPG_TTY | GPG signing |
Only these prefixes are passed to child processes:
- `PATH=` - Required for command execution
- `HOME=` - User home directory
- `USER=` - Current user
- `LANG=` - Locale settings
- `LC_` - Locale categories
- `TERM=` - Terminal type
- `XDG_` - XDG base directories
- `TMPDIR=`, `TMP=`, `TEMP=` - Temp directories
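One way the allowlist and the exclusion rules could be combined is sketched below. How Bosun's `filterSafeEnv` actually orders or combines these checks is not stated here, so `filterEnvSketch` and its helpers are hypothetical, and the deny lists are abbreviated.

```go
package main

import (
	"fmt"
	"strings"
)

// Allowlist from the section above. Entries ending in "=" match one exact
// variable name; the others match a name prefix (LC_*, XDG_*).
var allowedPrefixes = []string{
	"PATH=", "HOME=", "USER=", "LANG=", "LC_", "TERM=", "XDG_",
	"TMPDIR=", "TMP=", "TEMP=",
}

// Abbreviated deny rules from the tables above.
var (
	deniedPrefixes = []string{"SOPS_", "AWS_", "AZURE_", "GCP_", "GOOGLE_",
		"API_KEY", "SECRET", "TOKEN", "PASSWORD", "CREDENTIAL"}
	deniedSuffixes = []string{"_TOKEN", "_SECRET", "_KEY", "_PASS",
		"_PASSWORD", "_AUTH", "_CREDENTIAL", "_CREDENTIALS"}
)

func hasAnyPrefix(s string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(s, p) {
			return true
		}
	}
	return false
}

func hasAnySuffix(s string, suffixes []string) bool {
	for _, x := range suffixes {
		if strings.HasSuffix(s, x) {
			return true
		}
	}
	return false
}

// looksSensitive applies the deny rules to a variable name.
func looksSensitive(name string) bool {
	upper := strings.ToUpper(name)
	return hasAnyPrefix(upper, deniedPrefixes) || hasAnySuffix(upper, deniedSuffixes)
}

// filterEnvSketch keeps an entry only if it is on the allowlist and its
// name does not look sensitive. Assumption: both checks must pass.
func filterEnvSketch(environ []string) []string {
	var safe []string
	for _, kv := range environ {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // malformed entry without "="
		}
		if hasAnyPrefix(kv, allowedPrefixes) && !looksSensitive(name) {
			safe = append(safe, kv)
		}
	}
	return safe
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"AWS_SECRET_ACCESS_KEY=example",
		"XDG_RUNTIME_DIR=/run/user/1000",
		"XDG_SESSION_TOKEN=example", // allowlisted prefix, sensitive suffix
		"EDITOR=vim",
	}
	for _, kv := range filterEnvSketch(env) {
		fmt.Println(kv)
	}
}
```

The `XDG_SESSION_TOKEN` entry shows why a deny check is still useful behind an allowlist: a broad prefix such as `XDG_` can otherwise admit a variable whose name marks it as a credential.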
Template error output is truncated before it is reported, which limits (but does not eliminate) the chance that a secret appears in an error message:
func sanitizeStderr(stderr string) string {
// Truncate long output that might contain secrets
const maxLen = 500
if len(stderr) > maxLen {
stderr = stderr[:maxLen] + "... (truncated)"
}
return stderr
}
Implementation: internal/internal/lock/lock.go
Bosun uses file-based locking to prevent concurrent operations:
// Lock structure
type Lock struct {
path string // e.g., .bosun/locks/provision.lock
file *os.File
}
// Acquire exclusive lock (non-blocking)
if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
// Another process holds the lock
}
Platform Support:
- Unix: Uses the `flock(2)` system call
- Windows: Uses the `LockFileEx` API

| Lock | Path | Purpose |
|---|---|---|
| Provision | .bosun/locks/provision.lock | Prevent concurrent renders |
| Reconcile | /tmp/reconcile.lock | Prevent concurrent deploys |
Lock files contain the PID of the holding process for debugging:
// Write PID to lock file for debugging
f.Truncate(0)
f.Seek(0, 0)
fmt.Fprintf(f, "%d\n", os.Getpid())
Locks are automatically released when:
- The process calls `Release()`
- The process terminates (kernel releases flock)
- The file descriptor is closed
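The acquire and release cycle described above can be sketched end to end. This is a Unix-only illustration (it relies on `syscall.Flock`), not the code in lock.go: `fileLock`, `acquire`, `release`, `errLocked`, and `lockDemo` are hypothetical names, and the demo uses a temporary directory rather than `.bosun/locks/`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// errLocked is returned when the lock is already held.
var errLocked = errors.New("lock is held elsewhere")

type fileLock struct{ f *os.File }

// acquire takes a non-blocking exclusive flock and records the holder's PID.
func acquire(path string) (*fileLock, error) {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return nil, fmt.Errorf("create lock directory: %w", err)
	}
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, fmt.Errorf("open lock file: %w", err)
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		if errors.Is(err, syscall.EWOULDBLOCK) {
			return nil, errLocked
		}
		return nil, fmt.Errorf("flock: %w", err)
	}
	// The file offset is still 0 after open, so truncating is enough.
	if err := f.Truncate(0); err != nil {
		f.Close()
		return nil, fmt.Errorf("truncate lock file: %w", err)
	}
	if _, err := fmt.Fprintf(f, "%d\n", os.Getpid()); err != nil {
		f.Close()
		return nil, fmt.Errorf("write pid: %w", err)
	}
	return &fileLock{f: f}, nil
}

// release closes the descriptor, which drops the flock.
func (l *fileLock) release() error { return l.f.Close() }

// lockDemo shows contention and re-acquisition inside one process.
func lockDemo() (blocked, reacquired bool, err error) {
	dir, err := os.MkdirTemp("", "lock-sketch-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "locks", "provision.lock")

	first, err := acquire(path)
	if err != nil {
		return false, false, err
	}
	second, err := acquire(path) // separate open file: should be refused
	blocked = errors.Is(err, errLocked)
	if err == nil {
		second.release()
	}
	if err := first.release(); err != nil {
		return blocked, false, err
	}
	third, err := acquire(path)
	if err != nil {
		return blocked, false, err
	}
	return blocked, true, third.release()
}

func main() {
	blocked, reacquired, err := lockDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "lock demo failed:", err)
		return
	}
	fmt.Println("second acquire blocked:", blocked)
	fmt.Println("reacquired after release:", reacquired)
}
```

Because `flock` locks belong to the open file description, a failed second attempt that closes its own descriptor does not disturb the first holder's lock.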
Implementation: internal/internal/cmd/emergency.go
Tar archives are validated to prevent directory traversal attacks (zip slip):
// Sanitize path to prevent directory traversal
target := filepath.Join(destDir, header.Name)
if !strings.HasPrefix(target, filepath.Clean(destDir)+string(os.PathSeparator)) {
return fmt.Errorf("invalid file path in archive: %s", header.Name)
}
This prevents malicious archives containing paths like:
- `../../../etc/passwd`
- `/etc/shadow`
- `foo/../../bar`
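To see how the prefix check treats each of these paths, the same test can be wrapped in a standalone helper. `safeJoin` is a hypothetical name for this sketch, which assumes Unix-style paths and a destination directory that is not the filesystem root.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin joins an archive entry name onto destDir and rejects any result
// that lands outside destDir. filepath.Join cleans ".." segments first.
func safeJoin(destDir, name string) (string, error) {
	base := filepath.Clean(destDir)
	target := filepath.Join(base, name)
	// The trailing separator stops "/tmp/restore-evil" from passing as a
	// child of "/tmp/restore".
	if !strings.HasPrefix(target, base+string(filepath.Separator)) {
		return "", fmt.Errorf("invalid file path in archive: %s", name)
	}
	return target, nil
}

func main() {
	names := []string{
		"data/app.db",
		"../../../etc/passwd",
		"/etc/shadow",
		"foo/../../bar",
		"../restore-evil/x",
	}
	for _, name := range names {
		target, err := safeJoin("/tmp/restore", name)
		if err != nil {
			fmt.Println("rejected:", name)
			continue
		}
		fmt.Println("accepted:", name, "->", target)
	}
}
```

Note that `filepath.Join` re-roots an absolute entry name such as `/etc/shadow` under the destination directory, so that entry is contained rather than rejected; the relative traversal paths are rejected outright.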
Extracted files are limited to prevent resource exhaustion:
// Limit copy size as a security measure
const maxFileSize = 100 * 1024 * 1024 // 100MB max per file
if _, err := io.CopyN(outFile, tr, maxFileSize); err != nil && err != io.EOF {
return err
}
Implementation: internal/internal/reconcile/validation.go
| Input | Pattern | Rejects |
|---|---|---|
| SSH Host | `^([a-zA-Z0-9_-]+@)?[a-zA-Z0-9.-]+$` | Shell metacharacters, option injection |
| Git Branch | `^[a-zA-Z0-9_/.-]+$` | Shell metacharacters, option injection |
| Container Name | `^[a-zA-Z0-9][a-zA-Z0-9_.-]*$` | Shell metacharacters, option injection |
| Docker Signal | Allowlist only | Arbitrary signals |
All validated inputs reject these characters:
shellMetachars = []string{
";", "&", "|", "$", "`",
"(", ")", "{", "}",
"<", ">", "\\",
"\n", "\r", "'", "\"",
}
Inputs starting with - are rejected to prevent:
- SSH option injection: `ssh -oProxyCommand=... evil`
- Git option injection: `git clone --upload-pack=evil ...`
- Docker option injection: `docker --config=evil ...`
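The three checks (leading dash, shell metacharacters, format pattern) can be combined into one helper that is reused for each input type. The patterns are the ones listed in the table above; `validateInput` and the variable names are hypothetical and are not taken from validation.go.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var shellMetachars = []string{
	";", "&", "|", "$", "`",
	"(", ")", "{", "}",
	"<", ">", "\\",
	"\n", "\r", "'", "\"",
}

// Patterns copied from the validation table above.
var (
	hostPattern      = regexp.MustCompile(`^([a-zA-Z0-9_-]+@)?[a-zA-Z0-9.-]+$`)
	branchPattern    = regexp.MustCompile(`^[a-zA-Z0-9_/.-]+$`)
	containerPattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_.-]*$`)
)

// validateInput rejects option injection first, then shell metacharacters,
// then anything that does not match the expected format.
func validateInput(kind, value string, pattern *regexp.Regexp) error {
	if strings.HasPrefix(value, "-") {
		return fmt.Errorf("invalid %s: cannot start with '-' (potential option injection)", kind)
	}
	for _, char := range shellMetachars {
		if strings.Contains(value, char) {
			return fmt.Errorf("invalid %s: contains shell metacharacter %q", kind, char)
		}
	}
	if !pattern.MatchString(value) {
		return fmt.Errorf("invalid %s: %q does not match the expected format", kind, value)
	}
	return nil
}

func main() {
	checks := []struct {
		kind, value string
		pattern     *regexp.Regexp
	}{
		{"host", "deploy@example.com", hostPattern},
		{"host", "-oProxyCommand=evil", hostPattern},
		{"branch", "feature/login-fix", branchPattern},
		{"branch", "--upload-pack=evil", branchPattern},
		{"container", "traefik_1", containerPattern},
		{"container", "web;reboot", containerPattern},
	}
	for _, c := range checks {
		if err := validateInput(c.kind, c.value, c.pattern); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", c.kind, c.value)
	}
}
```

The explicit dash and metacharacter checks overlap with the patterns, but they run first so that the error message names the specific reason for rejection.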
The include template function reads arbitrary files from the local filesystem:
"include": func(path string) (string, error) {
data, err := os.ReadFile(path)
// ...
}
Current threat model: Templates come from your own Git repository, which you control. The include function is used to read secrets files via {{ include (env "BOSUN_SECRETS_FILE") }}.
Risk: If Bosun ever processes templates from untrusted sources, this is an arbitrary file read vulnerability. A malicious template could {{ include "/etc/shadow" }} or read any file the Bosun process has access to.
Mitigation: Path validation for the include function is planned. See bosun-4su. Until then, only render templates from trusted repositories.
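As a rough idea of what such path validation could look like, the sketch below confines `include` to a list of allowed directories. It is not the planned design: `checkIncludePath`, `restrictedInclude`, and the example roots are hypothetical, the roots are assumed to be absolute, and symlinks are not resolved. In practice the allowed roots would also have to cover the directory that holds the temporary secrets file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// checkIncludePath returns the absolute form of path if it lies inside one
// of allowedRoots, and an error otherwise. Symlinks are not resolved here.
func checkIncludePath(allowedRoots []string, path string) (string, error) {
	abs, err := filepath.Abs(path)
	if err != nil {
		return "", fmt.Errorf("resolve include path: %w", err)
	}
	for _, root := range allowedRoots {
		rel, err := filepath.Rel(filepath.Clean(root), abs)
		if err != nil {
			continue
		}
		// A relative path that climbs out of root starts with "..".
		if rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
			return abs, nil
		}
	}
	return "", fmt.Errorf("include path %q is outside the allowed directories", path)
}

// restrictedInclude builds a template function that reads only files that
// pass checkIncludePath.
func restrictedInclude(allowedRoots []string) func(string) (string, error) {
	return func(path string) (string, error) {
		abs, err := checkIncludePath(allowedRoots, path)
		if err != nil {
			return "", err
		}
		data, err := os.ReadFile(abs)
		if err != nil {
			return "", fmt.Errorf("read include file: %w", err)
		}
		return string(data), nil
	}
}

func main() {
	roots := []string{"/srv/repo"} // hypothetical repository checkout
	for _, p := range []string{"/srv/repo/templates/a.txt", "/etc/shadow", "/srv/repo/../secret"} {
		if _, err := checkIncludePath(roots, p); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("allowed:", p)
	}
}
```

The function returned by `restrictedInclude` has the same signature as the current `include`, so under these assumptions it could be registered in the template function map in its place.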
The template renderer walks the entire cloned repository directory for .tmpl files. Non-template files are only copied from the infrastructure subdirectory. This means a .tmpl file placed outside the expected path will still be rendered, potentially producing unexpected output files.
Mitigation: Limit .tmpl files to the infrastructure subdirectory in your repository. Consider code review rules that flag .tmpl files in unexpected locations.
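A review rule of this kind could be automated with a small check over the repository's file list. The sketch below is an assumption-laden illustration: `strayTemplates` is a hypothetical helper, the directory name `infrastructure` is assumed from the description above, and the input is expected to be repository-relative, slash-separated paths (for example, the output of `git ls-files`).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// strayTemplates returns the .tmpl paths that sit outside allowedDir.
func strayTemplates(paths []string, allowedDir string) []string {
	prefix := strings.TrimSuffix(path.Clean(allowedDir), "/") + "/"
	var stray []string
	for _, p := range paths {
		clean := path.Clean(p)
		if path.Ext(clean) != ".tmpl" {
			continue // only template files are of interest
		}
		if !strings.HasPrefix(clean, prefix) {
			stray = append(stray, clean)
		}
	}
	return stray
}

func main() {
	files := []string{
		"infrastructure/traefik/config.yml.tmpl",
		"infrastructure/compose.yml",
		"docs/example.tmpl",
		"scripts/setup.sh",
	}
	for _, p := range strayTemplates(files, "infrastructure") {
		fmt.Println("template outside infrastructure/:", p)
	}
}
```

Run in CI, a non-empty result could fail the build or request an extra review before the change reaches a branch that Bosun renders.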
The auth stack (Traefik, Authelia, and the Tailscale gateway) forms a dependency chain for external access. A partial compose up failure that leaves Traefik running while Authelia is down could, in principle, serve routes without authentication middleware.
Current mitigations:
- Traefik's `forwardAuth` middleware fails closed: if Authelia is unreachable, requests receive 502 errors rather than passing through unauthenticated
- Post-deploy drift verification catches missing or unhealthy containers and logs warnings
- All external routes defined via provisions include `forwardAuth` middleware by default
Planned: A dedicated health gate that verifies the full auth chain is healthy before declaring a deploy successful. See bosun-r9n.
- Generate unique keys per environment (dev, staging, production)
- Store production keys in HSM or cloud KMS when possible
- Rotate keys annually or after personnel changes
- Never commit private keys to version control
- Use `age-keygen` rather than importing existing keys
- Rotate secrets regularly (quarterly minimum)
- Update encrypted files when rotating secrets
- Use unique secrets per service (no shared credentials)
- Audit secret access via git history of `.sops.yaml` files
- Git history tracks all secret file changes
- Lock files contain PIDs for debugging
- SSH connections use BatchMode (logged by SSH daemon)
- Timeouts prevent silent failures
- Use SSH keys with passphrases
- Configure known_hosts before first deployment
- Set ConnectTimeout to prevent hanging
- Verify host keys to prevent MITM attacks
Before deploying with Bosun:
- Age keys generated with `0600` permissions
- `.sops.yaml` configured with correct public key
- SSH keys configured on target hosts
- `known_hosts` populated for all targets
- No secrets in environment variables
- No secrets in git history (use `git-secrets` or similar)
- Production keys stored securely (not on developer machines)
- Audit log retention configured