Description
Problem
The workflow health monitoring system lacks execution metrics data, preventing comprehensive health analysis of the 128 workflows in this repository.
Current State
- Metrics Location: /tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json
- Status: File does not exist
- Impact: Cannot analyze workflow success rates, failure patterns, or mean time between failures (MTBF)
Missing Capabilities
Without metrics data, the health manager cannot:
- ✗ Calculate workflow success/failure rates
- ✗ Identify consistently failing workflows
- ✗ Track performance trends over time
- ✗ Detect workflow regressions
- ✗ Analyze error patterns across workflows
- ✗ Calculate mean time between failures (MTBF)
- ✗ Generate reliability scores based on execution history
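As a rough illustration of two of the calculations listed above, the sketch below derives a success rate and an MTBF from run history. The function names and input shapes are hypothetical and not taken from the metrics collector. MTBF is simplified here to the average gap between consecutive failure timestamps.

```python
from datetime import datetime, timedelta
from typing import Optional


def success_rate(conclusions: list[str]) -> float:
    """Fraction of runs whose conclusion is 'success' (0.0 when there are no runs)."""
    if not conclusions:
        return 0.0
    return sum(c == "success" for c in conclusions) / len(conclusions)


def mean_time_between_failures(failure_times: list[datetime]) -> Optional[timedelta]:
    """Average gap between consecutive failures; None with fewer than two failures."""
    if len(failure_times) < 2:
        return None
    ordered = sorted(failure_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


# Hypothetical run history for one workflow.
rate = success_rate(["success", "failure", "success", "success"])
mtbf = mean_time_between_failures(
    [datetime(2026, 1, 1), datetime(2026, 1, 3), datetime(2026, 1, 7)]
)
```

With fewer than two recorded failures there is no gap to average, so the function returns None rather than a misleading zero.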
Root Cause
The metrics-collector.md workflow is outdated (source modified after lock file compilation), which may be preventing metrics from being collected and stored properly.
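One way to check for this condition locally is to compare modification times of the source and its lock file. This is a minimal sketch, assuming the compiled lock file sits next to the source as metrics-collector.lock.yml; that naming and both helper functions are assumptions for illustration, not the Workflow Health Manager's actual check.

```python
from pathlib import Path


def lock_path_for(source: Path) -> Path:
    # Assumption: foo.md compiles to foo.lock.yml in the same directory.
    return source.with_name(source.stem + ".lock.yml")


def lock_is_stale(source: Path, lock: Path) -> bool:
    """True if the lock file is missing or older than its source."""
    if not lock.exists():
        return True
    return source.stat().st_mtime > lock.stat().st_mtime


if __name__ == "__main__":
    src = Path(".github/workflows/metrics-collector.md")
    if src.exists():
        print("stale" if lock_is_stale(src, lock_path_for(src)) else "up to date")
```

File modification times reflect checkout time rather than commit time, so on a fresh clone this comparison may be unreliable; comparing commit timestamps would be more robust there.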
Required Actions
1. Recompile metrics-collector workflow
make recompile
# Or specifically: gh-aw compile .github/workflows/metrics-collector.md
2. Verify metrics collection schedule
Check that metrics-collector.md is scheduled to run daily and has proper permissions to write to shared memory.
3. Wait for metrics accumulation
After fixing compilation, allow at least 7 days for metrics to accumulate for meaningful trend analysis.
4. Verify metrics storage
After first run, verify file exists:
ls -lh /tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json
jq '.timestamp' /tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json
Expected Metrics Structure
The metrics file should contain:
- Per-workflow run statistics (total runs, successes, failures)
- Calculated success rates
- Timestamps for tracking trends
- Historical data for 30-day analysis
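A structural check along these lines could confirm that the file holds the expected data. This is only a sketch: the issue does not specify the schema, so apart from the timestamp key queried in step 4, the key names (workflows, total_runs, successes, failures) are hypothetical placeholders.

```python
import json
from pathlib import Path

# Hypothetical field names; the real schema is not given in this issue.
REQUIRED_RUN_FIELDS = ("total_runs", "successes", "failures")


def validate_metrics(path: Path) -> list[str]:
    """Return a list of problems found; an empty list means these checks passed."""
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    if "timestamp" not in data:
        problems.append("missing 'timestamp'")
    workflows = data.get("workflows")
    if not isinstance(workflows, dict) or not workflows:
        problems.append("missing or empty 'workflows'")
        return problems
    for name, stats in workflows.items():
        if not isinstance(stats, dict):
            problems.append(f"{name}: stats is not an object")
            continue
        missing = [field for field in REQUIRED_RUN_FIELDS if field not in stats]
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a health report show every gap in a single pass.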
Dependencies
- Related to: #[P0 recompile issue]
- Blocks: Comprehensive workflow health analysis
- Blocks: Automated failure detection
- Blocks: Performance trend tracking
Priority
P0 - Critical: This is a meta-monitoring capability that enables all other health checks. Without metrics, health monitoring is limited to structural analysis (compilation status, configuration) and cannot detect runtime failures.
Success Criteria
- metrics-collector.md lock file is up-to-date
- Metrics collector runs successfully on schedule
- /tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json exists and contains valid data
- Historical metrics accumulate in metrics/daily/ directory
- Workflow Health Manager can analyze execution metrics
Detected by Workflow Health Manager on 2026-01-04
AI generated by Workflow Health Manager - Meta-Orchestrator