-
Notifications
You must be signed in to change notification settings - Fork 37
Implement cache-memory based campaign discovery for agnostic workers #10955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
- Add tracker-id field to code-scanning-fixer, security-fix-pr, and security-review - Recompile workflows to generate updated lock files with GH_AW_TRACKER_ID - Add campaign:security-alert-burndown label to code-scanning-fixer PRs Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
|
@copilot Does this fall in line with campaign logic (i.e. agnostic workers)? How are workers meant to be tracked by the campaign? Are there alternatives? |
You're right to question this. The approach I took partially violates the "agnostic workers" principle. How workers should be tracked:
The issue with my fix:
Alternatives (in order of preference):
Recommendation: Switch to Alternative #1 (label-based) for true worker agnosticism. The tracker-id in workers should only be used when workers are not part of campaigns. |
|
@copilot We can remain agnostic if we use memory? |
No, memory alone cannot maintain worker agnosticism for campaign discovery. Here's why: Memory limitations:
Discovery requirements:
Workers can write to memory that they "created PR #456", but they can't provide the full URL/metadata the campaign needs to add items to the project board. Why label/tracker-id work:
Hybrid approach (if you want memory involved): |
|
@copilot The benefit would be not adding redundant information (tracker id, etc.) to the system when worker memory can record what was done, which the campaign can retrieve. |
You're right about the benefit. A memory-based approach could work and would be cleaner. Here's how: Architecture:
Advantages over labels/tracker-id:
Implementation requirements:
Trade-offs:
Would you like me to prototype this memory-based approach? |
|
@copilot Yes, implement this new approach |
Remove tracker-id from workers (reverted previous approach)
Add repo-memory configuration to all three worker workflows
Workers now write PR metadata to campaign memory:
- Path: /tmp/gh-aw/repo-memory/campaigns/security-alert-burndown/workers/
- Format: {"pr_url": "...", "pr_number": ..., "created_at": "...", "worker": "...", "alert_number": ..., "repo": "..."}
Benefits:
- Workers remain 100% agnostic (no campaign awareness)
- No HTML comments in PR/issue bodies
- No campaign labels required (cleaner)
- Single source of truth in memory
- Campaign orchestrator reads from memory for discovery
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Implemented in commit dad1058. What changed:
Memory format: {"pr_url": "https://github.com/owner/repo/pull/456", "pr_number": 456, "created_at": "...", "worker": "code-scanning-fixer", "alert_number": 123, "severity": "high", "repo": "githubnext/gh-aw"}Next step needed: Would you like me to modify the campaign discovery script, or will you handle that separately? |
|
@copilot Yes, modify the campaign discovery script accordingly. |
|
@copilot Summary of which files were touched, making sure these are the core files (no the derivates of |
Files Touched SummaryCore Source Files (4 modified):
Generated Files (130 .lock.yml files): Result of Documentation: Core changes: 4 files, ~150 LOC added SWAT Analysis: Memory-Based vs Tracker-Based DiscoveryStrengths
Weaknesses
Opportunities
Threats
RecommendationMemory-based approach is superior for worker agnosticism but requires more infrastructure. Best suited for:
Tracker-based approach is simpler and best for:
|
|
@copilot It should not be necessary to modify the worker file (add repo-memory config) for every worker. This makes it messy again. |
You're right - modifying every worker file couples them to campaign infrastructure. Here are cleaner alternatives: Option 1: Campaign reads worker cache-memory (cleanest)
Option 2: Dispatch with campaign context
Option 3: Campaign-level repo-memory (shared infrastructure)
Which approach would you prefer? Option 1 (read from worker cache) seems cleanest - no worker modifications at all. |
|
@copilot Option 1 |
Reverted worker workflow modifications - workers remain 100% agnostic
Modified campaign_discovery.cjs to read from worker cache-memory instead of repo-memory
Changes:
- Reverted .github/workflows/{code-scanning-fixer,security-fix-pr,security-review}.md
- Workers keep using existing cache-memory (no changes needed)
- Campaign discovery reads from /tmp/gh-aw/cache-memory/{workflow} files
- Transforms cache records (pr_number, alert_number, fixed_at) to discovery format
- Tries multiple cache file patterns (fixed-alerts.jsonl, {workflow}-outputs.jsonl, etc)
Benefits:
- Zero worker modifications (truly agnostic)
- No repo-memory infrastructure needed
- Workers remain functional outside campaigns
- All 17 tests pass
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Implemented in commit c6088bf. Changes:
Result:
Workers now function independently; campaign discovers outputs by reading their cache files after dispatch. |
|
@copilot How does this work with passive versus active campaigns? Explain orchestrator and worker flow in both scenarios given the new implementation. |
Passive vs Active Campaign FlowsCampaign TypesActive Campaign:
Passive Campaign:
Cache-Memory Discovery FlowPassive Campaign (Current Implementation) Active Campaign (If Implemented) Key Differences
Implementation StatusCurrent (commit c6088bf):
Discovery happens same way regardless - orchestrator reads cache-memory after workers run, whether independently or dispatched. |
|
@copilot Only keep relevant changes based on what we discussed, revert the rest to keep this PR non-intrusive. |
Reverted all worker workflow modifications to keep PR non-intrusive
Only change: campaign_discovery.cjs cache-memory discovery logic
Changes:
- Reverted .github/workflows/{code-scanning-fixer,security-fix-pr,security-review}.md to original
- Reverted specs/artifacts.md to original
- Kept actions/setup/js/campaign_discovery.cjs cache-memory discovery
- Recompiled lock files (generated)
Result:
- Zero worker modifications (100% agnostic)
- Single focused change: discovery script reads cache-memory
- All 17 tests pass
- Non-intrusive PR
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Done in commit 033fc8c. Reverted:
Kept:
Result:
|
Campaign discovery returned 0 items because worker workflows had no mechanism to record their outputs for campaign tracking. Implemented a cache-memory based approach where campaigns read from existing worker cache files, maintaining complete worker agnosticism.
Changes
Single file modified:
actions/setup/js/campaign_discovery.cjs- Added cache-memory discovery logicdiscoverFromMemory()function to read from worker cache-memory/tmp/gh-aw/cache-memory/(workers' existing cache files)fixed-alerts.jsonl,{workflow}-outputs.jsonl,{workflow}.jsonlWorker workflows:
Architecture
Discovery Flow
Cache Transformation
Workers write cache records:
{"alert_number": 123, "fixed_at": "2024-01-15T10:30:00Z", "pr_number": 456}Campaign transforms to discovery format:
{ "url": "https://github.com/owner/repo/pull/456", "pr_number": 456, "created_at": "2024-01-15T10:30:00Z", "worker": "code-scanning-fixer", "metadata": {"alert_number": 123, "source": "cache-memory"} }Benefits
✅ Non-Intrusive - Single file change, no worker modifications
✅ 100% Worker Agnosticism - Zero modifications to worker files
✅ No Infrastructure Overhead - Uses existing cache-memory
✅ Workers Function Independently - No campaign coupling
✅ No Metadata Pollution - No HTML comments, tracker-ids, or labels in PR/issue bodies
✅ Fast Discovery - Memory read vs GitHub API search (no rate limiting)
✅ Backward Compatible - Falls back to API search if cache unavailable
✅ Scalable - Pattern works for any number of campaigns and workers
✅ Tested - All existing tests pass
Validation
On next campaign run:
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.