## Skill Overview
Multiple workflows fetch GitHub data using `gh api graphql` with custom GraphQL queries for issues, pull requests, discussions, commits, and releases. This pattern is repeated across workflows with slight variations, leading to duplication and inconsistency.

**Why this should be shared**: The existing `issues-data-fetch.md` component only uses `gh issue list` (REST API), which is limited. Many workflows need the full power of GraphQL to fetch nested data (labels, comments, reviews) in a single query. A comprehensive shared component would provide standardized GraphQL queries with intelligent caching.
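To illustrate the gap, the sketch below fetches recent issues together with their labels and comment counts in a single GraphQL round trip; getting the same nested data via `gh issue list` would take follow-up API calls per issue. The field selection here is illustrative only, not the component's final query.

```bash
# One GraphQL call returns issues plus nested labels and comment counts.
# (Illustrative field selection; the shared component defines the real queries.)
gh api graphql -f query='
  query($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      issues(first: 20, states: OPEN) {
        nodes {
          number
          title
          labels(first: 5) { nodes { name } }
          comments { totalCount }
        }
      }
    }
  }' -f owner="${GITHUB_REPOSITORY_OWNER}" -f repo="${GITHUB_REPOSITORY#*/}"
```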
## Current Usage
This skill appears in the following workflows:
- `daily-news.md` (lines 96-242) - Fetches issues, PRs, discussions, commits, and releases using GraphQL
- `copilot-pr-merged-report.md` - Fetches PR data with reviews and comments
- `weekly-issue-summary.md` - Fetches recent issues with labels and comments
- `daily-team-status.md` - Fetches team activity data
- `copilot-session-insights.md` - Uses GraphQL for session data
- `github-mcp-tools-report.md` - Fetches repository metadata
- `daily-observability-report.md` - Fetches workflow run data
- `org-health-report.md` - Fetches organization-level data
- An additional 5-7 workflows use similar patterns

**Total**: ~12-15 workflows use GraphQL queries
## Proposed Shared Component
**File**: `.github/workflows/shared/github-graphql-data-fetch.md`

**Configuration**:
```yaml
---
tools:
  cache-memory:
    key: github-graphql-data
  bash:
    - "gh api *"
    - "jq *"
    - "mkdir *"
    - "date *"
    - "cp *"
steps:
  - name: Setup data directories and cache
    id: setup
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: |
      set -e
      # Create working and cache directories
      mkdir -p /tmp/gh-aw/github-data
      mkdir -p /tmp/gh-aw/cache-memory/github-graphql-data
      # Check cache validity (< 24 hours)
      CACHE_VALID=false
      CACHE_TIMESTAMP_FILE="/tmp/gh-aw/cache-memory/github-graphql-data/.timestamp"
      if [ -f "$CACHE_TIMESTAMP_FILE" ]; then
        CACHE_AGE=$(($(date +%s) - $(cat "$CACHE_TIMESTAMP_FILE")))
        if [ "$CACHE_AGE" -lt 86400 ]; then
          echo "✓ Found valid cached data (age: ${CACHE_AGE}s)"
          CACHE_VALID=true
        fi
      fi
      echo "cache_valid=$CACHE_VALID" >> "$GITHUB_OUTPUT"
  - name: Fetch issues with GraphQL
    if: steps.setup.outputs.cache_valid != 'true'
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: |
      set -e
      echo "Fetching issues data..."
      gh api graphql -f query='
        query($owner: String!, $repo: String!) {
          repository(owner: $owner, name: $repo) {
            openIssues: issues(first: 100, states: OPEN, orderBy: {field: UPDATED_AT, direction: DESC}) {
              nodes {
                number
                title
                state
                createdAt
                updatedAt
                author { login }
                labels(first: 10) { nodes { name color } }
                comments { totalCount }
                body
              }
            }
            closedIssues: issues(first: 100, states: CLOSED, orderBy: {field: UPDATED_AT, direction: DESC}) {
              nodes {
                number
                title
                state
                createdAt
                updatedAt
                closedAt
                author { login }
                labels(first: 10) { nodes { name color } }
              }
            }
          }
        }' -f owner="${GITHUB_REPOSITORY_OWNER}" -f repo="${GITHUB_REPOSITORY#*/}" > /tmp/gh-aw/github-data/issues.json
      echo "✓ Issues data fetched"
  - name: Fetch pull requests with GraphQL
    if: steps.setup.outputs.cache_valid != 'true'
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: |
      set -e
      echo "Fetching pull requests..."
      gh api graphql -f query='
        query($owner: String!, $repo: String!) {
          repository(owner: $owner, name: $repo) {
            openPRs: pullRequests(first: 50, states: OPEN, orderBy: {field: UPDATED_AT, direction: DESC}) {
              nodes {
                number
                title
                state
                createdAt
                updatedAt
                author { login }
                additions
                deletions
                changedFiles
                reviews(first: 10) { totalCount }
                labels(first: 10) { nodes { name color } }
              }
            }
            mergedPRs: pullRequests(first: 50, states: MERGED, orderBy: {field: UPDATED_AT, direction: DESC}) {
              nodes {
                number
                title
                state
                createdAt
                updatedAt
                mergedAt
                author { login }
                additions
                deletions
              }
            }
          }
        }' -f owner="${GITHUB_REPOSITORY_OWNER}" -f repo="${GITHUB_REPOSITORY#*/}" > /tmp/gh-aw/github-data/pull_requests.json
      echo "✓ Pull requests data fetched"
  - name: Fetch discussions with GraphQL
    if: steps.setup.outputs.cache_valid != 'true'
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: |
      set -e
      echo "Fetching discussions..."
      gh api graphql -f query='
        query($owner: String!, $repo: String!) {
          repository(owner: $owner, name: $repo) {
            discussions(first: 50, orderBy: {field: UPDATED_AT, direction: DESC}) {
              nodes {
                number
                title
                createdAt
                updatedAt
                author { login }
                category { name }
                comments { totalCount }
                url
              }
            }
          }
        }' -f owner="${GITHUB_REPOSITORY_OWNER}" -f repo="${GITHUB_REPOSITORY#*/}" > /tmp/gh-aw/github-data/discussions.json
      echo "✓ Discussions data fetched"
  - name: Cache fetched data
    if: steps.setup.outputs.cache_valid != 'true'
    run: |
      set -e
      echo "Caching data for future runs..."
      cp -r /tmp/gh-aw/github-data/* /tmp/gh-aw/cache-memory/github-graphql-data/
      date +%s > "/tmp/gh-aw/cache-memory/github-graphql-data/.timestamp"
      echo "✓ Data cached"
  - name: Restore from cache
    if: steps.setup.outputs.cache_valid == 'true'
    run: |
      set -e
      echo "Restoring cached data..."
      cp -r /tmp/gh-aw/cache-memory/github-graphql-data/* /tmp/gh-aw/github-data/
      echo "✓ Cached data restored"
---
```
# GitHub GraphQL Data Fetch
Pre-fetched GitHub data is available at `/tmp/gh-aw/github-data/`:
- **`issues.json`**: Open and recently closed issues (last 100 each) with labels, comments, body
- **`pull_requests.json`**: Open and merged PRs (last 50 each) with reviews, labels, stats
- **`discussions.json`**: Recent discussions (last 50) with category, comments, URL
### Intelligent Caching
Data is cached for 24 hours to reduce API calls and improve performance. Multiple workflows running on the same day share the same cached data.
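A consuming workflow can also check how fresh the shared snapshot is by reading the `.timestamp` file the component writes; a minimal sketch, assuming the cache layout shown in the configuration above:

```bash
# Report the age of the shared GraphQL snapshot (assumes the .timestamp
# file written by the shared component's caching step).
TS_FILE="/tmp/gh-aw/cache-memory/github-graphql-data/.timestamp"
if [ -f "$TS_FILE" ]; then
  AGE=$(( $(date +%s) - $(cat "$TS_FILE") ))
  echo "Snapshot age: ${AGE}s (~$(( AGE / 3600 ))h)"
else
  echo "No cached snapshot; this run fetched fresh data"
fi
```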
### Usage Examples
```bash
# Count open issues
jq '.data.repository.openIssues.nodes | length' /tmp/gh-aw/github-data/issues.json

# Get PRs merged today
TODAY=$(date '+%Y-%m-%d')
jq --arg date "$TODAY" '.data.repository.mergedPRs.nodes | map(select(.mergedAt | startswith($date)))' /tmp/gh-aw/github-data/pull_requests.json

# Get the most active discussions (top 5 by comment count)
jq '.data.repository.discussions.nodes | sort_by(.comments.totalCount) | reverse | .[0:5]' /tmp/gh-aw/github-data/discussions.json
```
**Usage Example**:
```yaml
# In a workflow
imports:
  - shared/github-graphql-data-fetch.md

# Data is automatically available at /tmp/gh-aw/github-data/
# No need to write GraphQL queries or caching logic
```
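Downstream steps (or the agent's prompt) can then consume the pre-fetched files directly; for example, a quick summary of issue counts, assuming the `issues.json` layout produced by the component:

```bash
# Example consumer step: summarize pre-fetched issue data
# (assumes the issues.json layout produced by the shared component).
OPEN_COUNT=$(jq '.data.repository.openIssues.nodes | length' /tmp/gh-aw/github-data/issues.json)
CLOSED_COUNT=$(jq '.data.repository.closedIssues.nodes | length' /tmp/gh-aw/github-data/issues.json)
echo "Open issues: ${OPEN_COUNT}, recently closed: ${CLOSED_COUNT}"
```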
## Impact
- **Workflows affected**: 12-15 workflows
- **Lines saved**: ~150-200 lines per workflow, or ~1,800-3,000 lines total
- **Maintenance benefit**:
  - Single location to update GraphQL queries
  - Consistent data structure across all workflows
  - Intelligent caching reduces API rate limit usage
  - Easier to add new data types (commits, releases, workflow runs); a sketch follows this list
  - Reduces cognitive load - developers don't need to write GraphQL
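Following up on the "new data types" point above, adding one would be a single extra fetch step in the same pattern as the queries in the configuration; a sketch for releases (the field selection is an assumption, not part of the proposal's current queries):

```bash
# Sketch of an additional fetch step for releases, mirroring the issue/PR steps.
# The field selection is illustrative; the shared component would define the real query.
gh api graphql -f query='
  query($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      releases(first: 20, orderBy: {field: CREATED_AT, direction: DESC}) {
        nodes {
          tagName
          name
          createdAt
          publishedAt
          author { login }
        }
      }
    }
  }' -f owner="${GITHUB_REPOSITORY_OWNER}" -f repo="${GITHUB_REPOSITORY#*/}" > /tmp/gh-aw/github-data/releases.json
```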
## Implementation Plan
- Create shared component at `.github/workflows/shared/github-graphql-data-fetch.md`
- Implement GraphQL queries for issues, PRs, discussions
- Add intelligent caching with 24-hour expiry
- Test with `daily-news.md` as proof-of-concept
- Add additional queries (commits, releases, workflow runs) based on needs
- Migrate 3-5 high-traffic workflows
- Document query customization patterns for special cases
- Gradually migrate remaining workflows
## Example Before/After
**Before** (`daily-news.md`, lines 96-242):

```yaml
steps:
  - name: Setup directories and check cache
    # ... 95 lines of caching logic ...
  - name: Fetch issues data
    # ... 35 lines of GraphQL query ...
  - name: Fetch pull requests data
    # ... 48 lines of GraphQL query ...
  - name: Fetch discussions data
    # ... 24 lines of GraphQL query ...
  - name: Cache downloaded data
    # ... 10 lines of caching logic ...
```

**After**:
```yaml
imports:
  - shared/github-graphql-data-fetch.md

# All data available at /tmp/gh-aw/github-data/
# 212 lines replaced with 1 import!
```

## Related Analysis
This recommendation comes from the Workflow Skill Extractor analysis run on 2026-02-11. This is the highest impact opportunity identified, saving ~1,800-3,000 lines across 12-15 workflows.
## Additional Benefits
- **Consistency**: All workflows get the same data structure
- **Performance**: The 24-hour cache reduces API calls dramatically
- **Reliability**: Centralized error handling and retry logic; a sketch follows this list
- **Extensibility**: Easy to add new data types to the shared component
- **Discovery**: New workflow developers don't need to learn GraphQL
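The proposal doesn't spell out the retry logic; one plausible shape is a small wrapper that the component's fetch steps could share. The helper name and backoff values below are assumptions, not part of the proposal's current text:

```bash
# Hypothetical retry helper for the fetch steps; the name and backoff
# values are assumptions, not part of the proposed component yet.
fetch_with_retry() {
  local attempts=0 max_attempts=3 delay=5
  until "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$max_attempts" ]; then
      echo "✗ Failed after ${max_attempts} attempts: $*" >&2
      return 1
    fi
    echo "Retrying in ${delay}s (attempt $((attempts + 1))/${max_attempts})..."
    sleep "$delay"
    delay=$((delay * 2))  # exponential backoff
  done
}

# Usage: wrap a GraphQL call so transient API errors don't fail the run, e.g.
# fetch_with_retry gh api graphql -f query='...' -f owner="${GITHUB_REPOSITORY_OWNER}"
```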
*AI generated by Workflow Skill Extractor - expires on Feb 14, 2026, 12:05 AM UTC*