Cache copilot PR data with date-based keys for cross-workflow reuse #4248
…hing Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull Request Overview
This PR adds intelligent caching to the copilot-pr-data-fetch shared component to reduce redundant GitHub API calls and improve workflow performance. The caching strategy uses date-based file storage (e.g., copilot-prs-2024-11-18.json) to enable data sharing across multiple workflows running on the same day.
Key changes:
- Added cache-memory tool with key: copilot-pr-data for workflow-level cache sharing
- Implemented cache-first logic that checks for today's cached data before making API calls
- Added cp * bash tool for copying files between cache and working directories
- Updated documentation to explain the caching strategy and behavior
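The cache-first logic listed above can be sketched roughly as follows. This is a minimal illustration, not the script from the PR: `fetch_pr_data` is a hypothetical stand-in for the real API fetch, the function name `load_pr_data` is invented for the example, and the use of UTC for "today" is an assumption.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real GitHub API fetch; writes an empty JSON array.
fetch_pr_data() {
  printf '[]\n' > "$1"
}

# load_pr_data CACHE_DIR WORK_DIR
# Prints "hit" when today's file was reused from the cache, "miss" otherwise.
load_pr_data() {
  local cache_dir="$1" work_dir="$2"
  local today file
  today="$(date -u +%Y-%m-%d)"          # assumption: the day boundary is UTC
  file="copilot-prs-${today}.json"
  mkdir -p "$cache_dir" "$work_dir"
  if [ -s "$cache_dir/$file" ]; then
    # Today's data is already cached: copy it into the working directory.
    cp "$cache_dir/$file" "$work_dir/$file"
    echo "hit"
  else
    # No usable cache for today: fetch, then store a copy for later runs.
    fetch_pr_data "$work_dir/$file"
    cp "$work_dir/$file" "$cache_dir/$file"
    echo "miss"
  fi
}
```

With the paths mentioned in this PR, a call might look like `load_pr_data /tmp/gh-aw/cache-memory /tmp/gh-aw/pr-data`. Because the file name carries the date, a stale file from a previous day is simply ignored rather than invalidated.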
Critical issue identified: The compiled workflows use copilot-pr-data-${{ github.run_id }} as the cache key instead of the configured copilot-pr-data. Since github.run_id is unique per workflow run, this prevents the intended cross-workflow cache sharing. Each workflow run will create a new cache instead of reusing existing ones from the same day.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| .github/workflows/shared/copilot-pr-data-fetch.md | Added cache-memory configuration and date-based cache logic to shared component |
| .github/workflows/prompt-clustering-analysis.lock.yml | Compiled workflow with updated cache key and bash tools (contains cache key bug) |
| .github/workflows/copilot-pr-prompt-analysis.lock.yml | Compiled workflow with updated cache key and bash tools (contains cache key bug) |
| .github/workflows/copilot-pr-nlp-analysis.lock.yml | Compiled workflow with updated cache key and bash tools (contains cache key bug) |
| .github/workflows/copilot-agent-analysis.lock.yml | Compiled workflow with updated cache key and bash tools (contains cache key bug) |
key: copilot-pr-data-${{ github.run_id }}
path: /tmp/gh-aw/cache-memory
restore-keys: |
  memory-${{ github.workflow }}-
  memory-
  copilot-pr-data-
  copilot-pr-
  copilot-
Copilot AI, Nov 18, 2025
The cache key copilot-pr-data-${{ github.run_id }} prevents cross-workflow cache sharing because github.run_id is unique per workflow run. This defeats the purpose of the date-based caching strategy described in the PR.
Current behavior: Each workflow run gets a unique cache key, so no cache sharing occurs.
Expected behavior (per PR description): Multiple workflows running on the same day should share the cache.
Suggested fix: Use a date-based cache key instead:
key: copilot-pr-data-${{ github.run_number }}-${{ github.run_attempt }}
Or better yet, since the files themselves are already date-based (e.g., copilot-prs-2024-11-18.json), use a simple key that allows any workflow to find today's cached data:
key: copilot-pr-data-${{ github.run_id }}
restore-keys: |
  copilot-pr-data-
The restore keys will match any previous copilot-pr-data-* cache, and the shell script's date-based logic will determine if the cached files are from today.
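For comparison, github.run_number and github.run_attempt are run counters, not dates, so a key built from them still changes on every run. A key that is actually date-based could be computed in a shell step along these lines. This is a sketch under assumptions: the helper name `cache_key` is hypothetical, UTC is assumed as the day boundary, and how the compiled workflow would consume the resulting value is not shown in this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# cache_key PREFIX [DAY]
# Builds a key that is identical for every run on the same day.
# DAY defaults to the current UTC date (an assumption for this sketch).
cache_key() {
  local prefix="$1" day="${2:-$(date -u +%Y-%m-%d)}"
  printf '%s-%s\n' "$prefix" "$day"
}

key="$(cache_key copilot-pr-data)"
echo "$key"

# Inside GitHub Actions, expose the key to later steps via the step-output file.
if [ -n "${GITHUB_OUTPUT:-}" ]; then
  echo "key=$key" >> "$GITHUB_OUTPUT"
fi
```

The trade-off is that the first run of the day populates the entry and later runs only read it, whereas a per-run key combined with a copilot-pr-data- restore prefix lets every run save an updated copy.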
key: copilot-pr-data-${{ github.run_id }}
path: /tmp/gh-aw/cache-memory
restore-keys: |
  memory-${{ github.workflow }}-
  memory-
  copilot-pr-data-
  copilot-pr-
  copilot-
Copilot AI, Nov 18, 2025
The cache key copilot-pr-data-${{ github.run_id }} prevents cross-workflow cache sharing because github.run_id is unique per workflow run. This defeats the purpose of the date-based caching strategy.
Since the files themselves are already date-based (e.g., copilot-prs-2024-11-18.json), workflows should be able to share caches across runs. The restore keys will match previous caches, but the unique run_id in the primary key prevents proper cache reuse.
- "ln *"
Copilot AI, Nov 18, 2025
The ln * bash tool is declared but never used in the script. Only cp commands are used to copy files between cache and working directories.
Consider removing this unused tool declaration unless there's a planned use case for creating symbolic links.
The copilot-pr-data-fetch.md shared component downloaded PR data fresh on every workflow run, causing redundant API calls and slower execution for workflows running on the same day.

Changes
- Cache key copilot-pr-data added to the shared component
- Date-based cache files using copilot-prs-YYYY-MM-DD.json naming
- Restore keys: copilot-pr-data- → copilot-pr- → copilot-

Impact
- Workflows using the shared component: copilot-pr-nlp-analysis, copilot-pr-prompt-analysis, copilot-agent-analysis, prompt-clustering-analysis
- Data paths: /tmp/gh-aw/pr-data/