Description
Pattern Overview
The GitHub Data Analysis Framework pattern appears in 19+ workflows that fetch GitHub data (issues, PRs, workflow runs, metrics), store results in persistent memory (repo-memory or cache-memory), perform analysis with trend calculations, and generate reports. Extracting this into a shared component would eliminate ~2,850 lines of duplicated code.
Current Usage
This pattern appears in the following workflows:
- audit-workflows.md (lines 14-18, 176-220)
- workflow-health-manager.md (lines 15-17, 176-220)
- daily-code-metrics.md (lines 11-17, 43-90)
- daily-issues-report.md (lines 31-34, 59-68)
- copilot-pr-merged-report.md (repo memory not used, but similar pattern)
- daily-team-status.md
- metrics-collector.md
- daily-firewall-report.md
- daily-performance-summary.md
- org-health-report.md
- daily-copilot-token-report.md
- daily-secrets-analysis.md
- daily-team-evolution-insights.md
- agent-performance-analyzer.md
- campaign-generator.md
- copilot-session-insights.md
- portfolio-analyst.md
- repo-audit-analyzer.md
- repository-quality-improver.md
Proposed Shared Component
File: .github/workflows/shared/github-data-analysis-framework.md
Configuration:
```yaml
tools:
  repo-memory:
    description: "Shared data analysis storage"
    file-glob: ["*.json", "*.jsonl", "*.csv", "*.md"]
    max-file-size: 102400
  bash:
    - "jq *"
    - "date *"
    - "mkdir *"
    - "cp *"
    - "cat *"
    - "bc *"
steps:
  - name: Setup analysis environment
    run: |
      # Create structured directories for data analysis
      mkdir -p /tmp/gh-aw/analysis/{data,historical,output}
      mkdir -p /tmp/gh-aw/repo-memory/default/metrics/daily
      echo "Analysis environment ready"
      echo "Current run: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  - name: Load historical context
    run: |
      # Load last 30 days of historical data for trend analysis
      HISTORY_DIR="/tmp/gh-aw/repo-memory/default/metrics/daily"
      if [ -d "$HISTORY_DIR" ]; then
        # Copy last 30 days of data
        find "$HISTORY_DIR" -name "*.json" -mtime -30 \
          -exec cp {} /tmp/gh-aw/analysis/historical/ \; 2>/dev/null || true
        HIST_COUNT=$(ls -1 /tmp/gh-aw/analysis/historical/ 2>/dev/null | wc -l)
        echo "Loaded $HIST_COUNT days of historical data"
      else
        echo "No historical data found (first run)"
      fi
```
GitHub Data Analysis Framework
This shared component provides a standardized framework for workflows that analyze GitHub data with persistent storage and trend tracking.
Features
- Persistent storage with repo-memory for historical data
- Structured directories for organized data management
- Historical data loading (last 30 days automatically)
- Trend calculation helpers for 7-day and 30-day comparisons
- Standardized metrics storage format
Directory Structure
```
/tmp/gh-aw/analysis/
├── data/ # Current run data and inputs
├── historical/ # Last 30 days for comparison
└── output/ # Analysis results and reports
/tmp/gh-aw/repo-memory/default/
└── metrics/
└── daily/ # Daily metrics stored as YYYY-MM-DD.json
```
Trend Calculation Helpers
```bash
# Calculate percentage change between two values
calculate_trend() {
  local current=$1
  local previous=$2
  # Without a usable baseline there is no meaningful percentage
  if [ -z "$previous" ] || [ "$previous" = "0" ] || [ "$previous" = "null" ]; then
    echo "N/A"
    return
  fi
  # Multiply before dividing so scale=2 does not truncate the ratio early
  local change
  change=$(echo "scale=2; ($current - $previous) * 100 / $previous" | bc)
  printf "%.1f" "$change"
}

# Get trend indicator emoji (changes within +/-10% count as flat)
get_trend_indicator() {
  local change=$1
  if [ "$change" = "N/A" ]; then
    echo "➡️"
  elif (( $(echo "$change > 10" | bc -l) )); then
    echo "⬆️"
  elif (( $(echo "$change < -10" | bc -l) )); then
    echo "⬇️"
  else
    echo "➡️"
  fi
}

# Get value from N days ago
get_historical_value() {
  local metric_path=$1
  local days_ago=$2
  # GNU date first, BSD date as fallback
  local target_date
  target_date=$(date -d "$days_ago days ago" '+%Y-%m-%d' 2>/dev/null ||
    date -v-"${days_ago}"d '+%Y-%m-%d')
  local hist_file="/tmp/gh-aw/analysis/historical/${target_date}.json"
  if [ -f "$hist_file" ]; then
    jq -r "$metric_path // null" "$hist_file"
  else
    echo "null"
  fi
}
```
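The lookup above returns `null` whenever no file exists for the exact target date, which can happen if a scheduled run was skipped. A minimal sketch of one possible fallback, assuming the YYYY-MM-DD.json naming described in this document: pick the newest snapshot dated on or before the target. The function name `find_snapshot_on_or_before` and its arguments are hypothetical and not part of the proposed component.

```bash
# Hypothetical helper: print the path of the newest YYYY-MM-DD.json file in
# a directory whose date is on or before the target date. Prints nothing if
# no such file exists. ISO dates sort correctly as plain strings.
find_snapshot_on_or_before() {
  local dir="$1"
  local target="$2"
  local best_name=""
  local file name
  for file in "$dir"/*.json; do
    [ -e "$file" ] || continue            # glob matched nothing
    name=$(basename "$file" .json)
    case $name in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) continue ;;                      # not a daily snapshot
    esac
    # Ignore snapshots dated after the target
    if [ "$name" \> "$target" ]; then
      continue
    fi
    # Keep the latest snapshot seen so far
    if [ -z "$best_name" ] || [ "$name" \> "$best_name" ]; then
      best_name=$name
    fi
  done
  if [ -n "$best_name" ]; then
    printf '%s\n' "$dir/$best_name.json"
  fi
}
```

A caller could try the exact date first and fall back to this helper only when that file is missing, so that existing behavior is unchanged on days with complete data.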
Storage Pattern
Store daily metrics in standardized format:
```bash
# Store current run metrics
TODAY=$(date +%Y-%m-%d)
METRICS_FILE="/tmp/gh-aw/repo-memory/default/metrics/daily/${TODAY}.json"
# $metrics_json must already hold a valid JSON value
jq -n \
  --arg date "$TODAY" \
  --arg timestamp "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --argjson metrics "$metrics_json" \
  '{
    date: $date,
    timestamp: $timestamp,
    metrics: $metrics
  }' > "$METRICS_FILE"
echo "Metrics stored: $METRICS_FILE"
```
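Redirecting jq straight into the metrics file leaves an empty or truncated file behind if the command fails partway. One way to guard against that, sketched here under the assumption that the temporary file and the destination sit on the same filesystem, is to write to a temporary file and rename it only on success. `write_atomically` is a hypothetical helper name, not something the proposed component defines.

```bash
# Hypothetical helper: run a command, capture its stdout in a temporary file
# next to the destination, and move it into place only if the command succeeded.
# Usage: write_atomically DEST COMMAND [ARGS...]
write_atomically() {
  local dest="$1"
  shift
  local tmp
  tmp=$(mktemp "${dest}.XXXXXX") || return 1
  if "$@" > "$tmp"; then
    mv "$tmp" "$dest"
  else
    # Leave any previous version of the destination untouched
    rm -f "$tmp"
    return 1
  fi
}

# Usage sketch (commented out because it needs jq and the variables above):
# write_atomically "$METRICS_FILE" jq -n --arg date "$TODAY" '{date: $date}'
```

With this shape, a failed run keeps the previous day's file intact instead of replacing it with partial output.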
Trend Analysis Pattern
```bash
# Calculate 7-day and 30-day trends
current_value=100
value_7d_ago=$(get_historical_value '.metrics.total_count' 7)
value_30d_ago=$(get_historical_value '.metrics.total_count' 30)
trend_7d=$(calculate_trend "$current_value" "$value_7d_ago")
trend_30d=$(calculate_trend "$current_value" "$value_30d_ago")
indicator_7d=$(get_trend_indicator "$trend_7d")
indicator_30d=$(get_trend_indicator "$trend_30d")
echo "Current: $current_value"
echo "7-day change: ${indicator_7d} ${trend_7d}%"
echo "30-day change: ${indicator_30d} ${trend_30d}%"
```
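When no baseline exists, the echo lines above print `N/A%`. The following is a small formatting sketch that drops the percent sign for `N/A` and adds an explicit sign for increases. `format_trend` is a hypothetical name introduced only for illustration.

```bash
# Hypothetical helper: render an indicator and a trend value for a report line.
# Usage: format_trend INDICATOR CHANGE
format_trend() {
  local indicator="$1"
  local change="$2"
  case $change in
    N/A)          printf '%s N/A (no baseline)\n' "$indicator" ;;
    0|0.0|-0.0)   printf '%s 0.0%%\n' "$indicator" ;;
    -*)           printf '%s %s%%\n' "$indicator" "$change" ;;   # already signed
    *)            printf '%s +%s%%\n' "$indicator" "$change" ;;
  esac
}

format_trend "⬆️" "25.0"
format_trend "➡️" "N/A"
```

The two calls at the end print `⬆️ +25.0%` and `➡️ N/A (no baseline)`.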
Usage Example
```yaml
imports:
  - shared/github-data-analysis-framework.md
  - shared/issues-data-fetch.md

# Your workflow-specific tools
tools:
  github:
    toolsets: [default]
```
Then in your workflow prompt:
```bash
# Analysis environment is already set up

# Load your current data
cp /tmp/gh-aw/issues-data/issues.json /tmp/gh-aw/analysis/data/

# Perform analysis
current_total=$(jq 'length' /tmp/gh-aw/analysis/data/issues.json)

# Calculate trends using helpers
value_7d_ago=$(get_historical_value '.metrics.total_issues' 7)
trend_7d=$(calculate_trend "$current_total" "$value_7d_ago")
indicator_7d=$(get_trend_indicator "$trend_7d")

# Store results
TODAY=$(date +%Y-%m-%d)
jq -n \
  --arg date "$TODAY" \
  --argjson total "$current_total" \
  --arg trend_7d "$trend_7d" \
  '{date: $date, metrics: {total_issues: $total, trend_7d: $trend_7d}}' \
  > "/tmp/gh-aw/repo-memory/default/metrics/daily/${TODAY}.json"
```
Best Practices
- Consistent metric names: Use same field names across runs for trend tracking
- Date-based files: Always use YYYY-MM-DD.json format for daily metrics
- Null handling: Check for null/missing historical data before calculations
- Cross-platform: Use compatible date commands (GNU date with BSD fallback)
- Validation: Verify historical files exist before loading
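
On the validation point: Git does not preserve file modification times, so after a fresh clone or checkout an mtime-based filter such as `find -mtime -30` may match every stored file. Assuming repo-memory is restored that way (this document does not confirm it), filtering on the date embedded in the file name is a more predictable alternative. The sketch below uses a hypothetical `copy_snapshots_since` function and takes the cutoff date as an argument.

```bash
# Hypothetical helper: copy every YYYY-MM-DD.json file dated on or after the
# cutoff from src to dst, then print how many files were copied.
# Usage: copy_snapshots_since SRC_DIR DST_DIR CUTOFF_DATE
copy_snapshots_since() {
  local src="$1"
  local dst="$2"
  local cutoff="$3"
  local count=0
  local file name
  mkdir -p "$dst"
  for file in "$src"/*.json; do
    [ -e "$file" ] || continue            # glob matched nothing
    name=$(basename "$file" .json)
    case $name in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) continue ;;                      # skip files that are not daily snapshots
    esac
    # ISO dates compare correctly as plain strings
    if [ "$name" \< "$cutoff" ]; then
      continue
    fi
    cp "$file" "$dst/" && count=$((count + 1))
  done
  printf '%s\n' "$count"
}

# Usage sketch, with the same GNU/BSD date fallback used elsewhere in this document:
# cutoff=$(date -u -d "30 days ago" '+%Y-%m-%d' 2>/dev/null || date -u -v-30d '+%Y-%m-%d')
# copy_snapshots_since /tmp/gh-aw/repo-memory/default/metrics/daily \
#   /tmp/gh-aw/analysis/historical "$cutoff"
```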
Usage Example:
```yaml
# In a workflow
imports:
  - shared/github-data-analysis-framework.md
  - shared/issues-data-fetch.md
```
Impact
- Workflows affected: 19 workflows
- Lines saved: ~2,850 lines (150 per workflow)
- Maintenance benefit: Single source of truth for data analysis patterns, easier to add features like new trend calculations or historical comparisons
Implementation Plan
- Create shared component at .github/workflows/shared/github-data-analysis-framework.md
- Pilot: Update audit-workflows.md to use the shared framework
- Pilot: Update workflow-health-manager.md to use the shared framework
- Validate pilots work correctly with historical data
- Update daily-code-metrics.md
- Update daily-issues-report.md
- Update remaining 15 workflows in batches of 3-4
- Test all updated workflows
- Update documentation with usage examples
Related Analysis
This recommendation comes from the Workflow Pattern Harvester analysis run on 2026-01-16.
AI generated by Workflow Pattern Harvester