diff --git a/.github/aw/execute-agentic-campaign-workflow.md b/.github/aw/execute-agentic-campaign-workflow.md index 4a4247ce44..1af0f6df4e 100644 --- a/.github/aw/execute-agentic-campaign-workflow.md +++ b/.github/aw/execute-agentic-campaign-workflow.md @@ -1,200 +1,284 @@ # Workflow Execution -This campaign orchestrator can execute workflows as needed. Your role is to run the workflows listed in sequence, collect their outputs, and use those outputs to drive the campaign forward. +This campaign references the following campaign workers. These workers follow the first-class worker pattern: they are dispatch-only workflows with standardized input contracts. -**IMPORTANT: Workflow execution is an advanced capability. Exercise caution and follow all guidelines carefully.** +**IMPORTANT: Workers are orchestrated, not autonomous. They accept `campaign_id` and `payload` inputs via workflow_dispatch.** --- -## Workflows to Execute +## Campaign Workers {{ if .Workflows }} -The following workflows should be executed in order: +The following campaign workers are referenced by this campaign: {{ range $idx, $workflow := .Workflows }} {{ add1 $idx }}. `{{ $workflow }}` {{ end }} {{ end }} +**Worker Pattern**: All workers MUST: +- Use `workflow_dispatch` as the ONLY trigger (no schedule/push/pull_request) +- Accept `campaign_id` (string) and `payload` (string; JSON) inputs +- Implement idempotency via deterministic work item keys +- Label all created items with `campaign:{{ .CampaignID }}` + --- ## Workflow Creation Guardrails -### Before Creating Any Workflow, Ask: +### Before Creating Any Worker Workflow, Ask: 1. **Does this workflow already exist?** - Check `.github/workflows/` thoroughly -2. **Can an existing workflow be used?** - Even if not perfect, existing is safer +2. **Can an existing workflow be adapted?** - Even if not perfect, existing is safer 3. **Is the requirement clear?** - Can you articulate exactly what it should do? -4. 
**Is it testable?** - Can you verify it works before using it in the campaign? -5. **Is it reusable?** - Could other campaigns benefit from this workflow? +4. **Is it testable?** - Can you verify it works with test inputs? +5. **Is it reusable?** - Could other campaigns benefit from this worker? -### Only Create New Workflows When: +### Only Create New Workers When: ✅ **All these conditions are met:** - No existing workflow does the required task - The campaign objective explicitly requires this capability -- You have a clear, specific design for the workflow -- The workflow has a focused, single-purpose scope +- You have a clear, specific design for the worker +- The worker has a focused, single-purpose scope - You can test it independently before campaign use -❌ **Never create workflows when:** +❌ **Never create workers when:** - You're unsure about requirements - An existing workflow "mostly" works -- The workflow would be complex or multi-purpose +- The worker would be complex or multi-purpose - You haven't verified it doesn't already exist - You can't clearly explain what it does in one sentence --- -## Execution Process +## Worker Creation Template -For each workflow: +If you must create a new worker (only after checking ALL guardrails above), use this template: -1. **Check if workflow exists** - Look for `.github/workflows/.md` +**Create the workflow file at `.github/workflows/.md`:** -2. 
**Create workflow if needed** - Only if ALL guardrails above are satisfied: - - **Design requirements:** - - **Single purpose**: One clear task (e.g., "scan for outdated dependencies", not "scan and update") - - **Explicit trigger**: Must include `workflow_dispatch` for manual/programmatic execution - - **Minimal tools**: Only include tools actually needed (principle of least privilege) - - **Safe outputs only**: Use appropriate safe-output limits (max: 5 for first version) - - **Clear prompt**: Describe exactly what the workflow should do and return - - **Create the workflow file at `.github/workflows/.md`:** - ```yaml - --- - name: - description: - - on: - workflow_dispatch: # Required for execution - inputs: - priority: - description: 'Priority level for this execution' - required: false - type: choice - options: - - low - - medium - - high - default: medium - target: - description: 'Specific target or scope for this run' - required: false - type: string - - tools: - github: - toolsets: [default] # Adjust based on needs - # Only add other tools if absolutely necessary - - safe-outputs: - create-issue: - max: 3 # Start conservative - add-comment: - max: 2 - --- - - # - - You are a focused workflow that . - - Priority: \$\{\{ github.event.inputs.priority \}\} - Target: \$\{\{ github.event.inputs.target \}\} - - ## Task - - - - ## Output - - +```yaml +--- +name: +description: + +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload with work item details' + required: true + type: string + +tracker-id: + +tools: + github: + toolsets: [default] + # Add minimal additional tools as needed + +safe-outputs: + create-pull-request: + max: 1 # Start conservative + add-comment: + max: 2 +--- + +# + +You are a campaign worker that processes work items. 
+ +## Input Contract + +Parse inputs: +```javascript +const campaignId = context.payload.inputs.campaign_id; +const payload = JSON.parse(context.payload.inputs.payload); +``` + +Expected payload structure: +```jsonc +{ + "repository": "owner/repo", + "work_item_id": "unique-id", + "target_ref": "main" + // Additional worker-specific fields may follow +} +``` + +## Idempotency Requirements + +1. **Generate deterministic key**: + ```javascript + const workKey = `campaign-${campaignId}-${payload.repository}-${payload.work_item_id}`; ``` - - **Note**: Define `inputs` under `workflow_dispatch` to accept parameters from the orchestrator. Use `\$\{\{ github.event.inputs.INPUT_NAME \}\}` to reference input values in your workflow markdown. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input types and examples. - - - Compile it with `gh aw compile .md` - - **CRITICAL: Test before use** (see testing requirements below) -3. **Test newly created workflows** (MANDATORY): - - **Why test?** - Untested workflows may fail during campaign execution, blocking progress. Test first to catch issues early. - - **Testing steps:** - - Trigger test run: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - - Wait for completion: Poll until status is "completed" - - **Verify success**: Check that workflow succeeded and produced expected outputs - - **Review outputs**: Ensure results match expectations (check artifacts, issues created, etc.) - - **If test fails**: Revise the workflow, recompile, and test again - - **Only proceed** after successful test run +2. **Check for existing work**: + - Search for PRs/issues with `workKey` in title + - Filter by label: `campaign:${campaignId}` + - If found: Skip or update + - If not: Create new + +3. 
**Label all created items**: + - Apply `campaign:${campaignId}` label + - This enables discovery by orchestrator + +## Task + + + +## Output + +Report: +- Link to created/updated PR or issue +- Whether work was skipped (exists) or completed +- Any errors or blockers +``` + +**After creating:** +- Compile: `gh aw compile .md` +- **CRITICAL: Test with sample inputs** (see testing requirements below) + +--- + +## Worker Testing (MANDATORY) + +**Why test?** - Untested workers may fail during campaign execution. Test with sample inputs first to catch issues early. + +**Testing steps:** + +1. **Prepare test payload**: + ```json + { + "repository": "test-org/test-repo", + "work_item_id": "test-1", + "target_ref": "main" + } + ``` + +2. **Trigger test run**: + ```bash + gh workflow run .yml \ + -f campaign_id={{ .CampaignID }} \ + -f payload='{"repository":"test-org/test-repo","work_item_id":"test-1"}' + ``` - **Test failure actions:** - - DO NOT use the workflow in the campaign if testing fails - - Analyze the failure logs to understand what went wrong - - Make necessary corrections to the workflow - - Recompile and retest - - If you can't fix it after 2 attempts, report in status update and skip this workflow + Or via GitHub MCP: + ```javascript + mcp__github__run_workflow( + workflow_id: "", + ref: "main", + inputs: { + campaign_id: "{{ .CampaignID }}", + payload: JSON.stringify({repository: "test-org/test-repo", work_item_id: "test-1"}) + } + ) + ``` + +3. **Wait for completion**: Poll until status is "completed" -4. 
**Execute the workflow** (skip if just tested successfully): - - Trigger: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - - **Pass input parameters based on decisions**: If the workflow accepts inputs, provide them to guide execution (e.g., `inputs: {priority: "high", target: "security"}`) - - Wait for completion: Poll `mcp__github__get_workflow_run(run_id)` until status is "completed" - - Collect outputs: Check `mcp__github__download_workflow_run_artifact()` for any artifacts - - **Handle failures gracefully**: If execution fails, note it in status update but continue campaign +4. **Verify success**: + - Check that workflow succeeded + - Verify idempotency: Run again with same inputs, should skip/update + - Review created items have correct labels + - Confirm deterministic keys are used -5. **Use outputs for next steps** - Use information from workflow runs to: - - Inform subsequent workflow executions (e.g., scanner results → upgrader inputs) - - Pass contextual inputs to worker workflows based on campaign state and decisions - - Update project board items with relevant information - - Make decisions about campaign progress and next actions +5. **Test failure actions**: + - DO NOT use the worker if testing fails + - Analyze failure logs + - Make corrections + - Recompile and retest + - If unfixable after 2 attempts, report in status and skip **Note**: Workflows that accept `workflow_dispatch` inputs can receive parameters from the orchestrator. This enables the orchestrator to provide context, priorities, or targets based on its decisions. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input parameter examples. 
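The "Wait for completion" step in the testing list above could be sketched roughly as follows. This is only an illustration: `waitForRun` and `fetchRun` are hypothetical names that do not appear in the diff, and `fetchRun` is assumed to be a callback that resolves to an object carrying the `status` and `conclusion` fields of a GitHub Actions workflow run.

```javascript
// Sketch only: `fetchRun` is a hypothetical callback supplied by the caller.
// It is assumed to resolve to { status, conclusion } for one workflow run.
async function waitForRun(fetchRun, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const run = await fetchRun();
    if (run.status === "completed") {
      // conclusion is e.g. "success" or "failure" once the run has finished
      return run.conclusion;
    }
    // Not finished yet: pause before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Run did not complete after ${maxAttempts} attempts`);
}
```

Bounding the number of attempts keeps a stuck test run from blocking the orchestrator indefinitely; the interval and attempt count shown are arbitrary defaults.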
--- -## Guidelines +## Orchestration Guidelines + +**Execution pattern:** +- Workers are **orchestrated, not autonomous** +- Orchestrator discovers work items via discovery manifest +- Orchestrator decides which workers to run and with what inputs +- Workers receive `campaign_id` and `payload` via workflow_dispatch +- Sequential vs parallel execution is orchestrator's decision + +**Worker dispatch:** +- Parse discovery manifest (`./.gh-aw/campaign.discovery.json`) +- For each work item needing processing: + 1. Determine appropriate worker for this item type + 2. Construct payload with work item details + 3. Dispatch worker via workflow_dispatch with campaign_id and payload + 4. Track dispatch status + +**Input construction:** +```javascript +// Example: Dispatching security-fix worker +const workItem = discoveryManifest.items[0]; +const payload = { + repository: workItem.repo, + work_item_id: `alert-${workItem.number}`, + target_ref: "main", + alert_type: "sql-injection", + file_path: "src/db.go", + line_number: 42 +}; -**Execution order:** -- Execute workflows **sequentially** (one at a time) -- Wait for each workflow to complete before starting the next -- **Why sequential?** - Ensures dependencies between workflows are respected and reduces API load +await github.actions.createWorkflowDispatch({ + owner: context.repo.owner, + repo: context.repo.repo, + workflow_id: "security-fix-worker.yml", + ref: "main", + inputs: { + campaign_id: "{{ .CampaignID }}", + payload: JSON.stringify(payload) + } +}); +``` -**Workflow creation:** -- **Always test newly created workflows** before using them in the campaign -- **Why test first?** - Prevents campaign disruption from broken workflows -- Start with minimal, focused workflows (easier to test and debug) -- **Why minimal?** - Reduces complexity and points of failure -- Keep designs simple and aligned with campaign objective -- **Why simple?** - Easier to understand, test, and maintain +**Idempotency by design:** +- Workers 
implement their own idempotency checks +- Orchestrator doesn't need to track what's been processed +- Can safely re-dispatch work items across runs +- Workers will skip or update existing items **Failure handling:** -- If a workflow test fails, revise and retest before proceeding -- **Why retry?** - Initial failures often due to minor issues easily fixed -- If a workflow fails during campaign execution, note the failure and continue -- **Why continue?** - One workflow failure shouldn't block entire campaign progress -- Report all failures in the status update with context -- **Why report?** - Transparency helps humans intervene if needed - -**Workflow reusability:** -- Workflows you create should be reusable for future campaign runs -- **Why reusable?** - Reduces need to create workflows repeatedly, builds library of capabilities -- Avoid campaign-specific logic in workflows (keep them generic) -- **Why generic?** - Enables reuse across different campaigns - -**Permissions and safety:** -- Keep workflow permissions minimal (only what's needed) -- **Why minimal?** - Reduces risk and follows principle of least privilege -- Prefer draft PRs over direct merges for code changes -- **Why drafts?** - Requires human review before merging changes -- Escalate to humans when uncertain about decisions -- **Why escalate?** - Human oversight prevents risky autonomous actions +- If a worker dispatch fails, note it but continue +- Worker failures don't block entire campaign +- Report all failures in status update with context +- Humans can intervene if needed --- -## After Workflow Execution +## After Worker Orchestration + +Once workers have been dispatched (or new workers created and tested), proceed with normal orchestrator steps: + +1. **Discovery** - Read state from discovery manifest and project board +2. **Planning** - Determine what needs updating on project board +3. **Project Updates** - Write state changes to project board +4. 
**Status Reporting** - Report progress, worker dispatches, failures, next steps + +--- + +## Key Differences from Fusion Approach + +**Old fusion approach (REMOVED)**: +- Workers had mixed triggers (schedule + workflow_dispatch) +- Fusion dynamically added workflow_dispatch to existing workflows +- Workers stored in campaign-specific folders +- Ambiguous ownership and trigger precedence + +**New first-class worker approach**: +- Workers are dispatch-only (on: workflow_dispatch) +- Standardized input contract (campaign_id, payload) +- Explicit idempotency via deterministic keys +- Clear ownership: workers are orchestrated, not autonomous +- Workers stored with regular workflows (not campaign-specific folders) +- Orchestration policy kept explicit in orchestrator -Once all workflows have been executed (or created and executed), proceed with the normal orchestrator steps: -- Step 1: Discovery (read state from manifest and project board) -- Step 2: Planning (determine what needs updating) -- Step 3: Project Updates (write state to project board) -- Step 4: Status Reporting (report progress, failures, and next steps) +This eliminates duplicate execution problems and makes orchestration concerns explicit. diff --git a/.github/workflows/security-alert-burndown.campaign.lock.yml b/.github/workflows/security-alert-burndown.campaign.lock.yml index 7ac9e2fe6b..e14f709e6c 100644 --- a/.github/workflows/security-alert-burndown.campaign.lock.yml +++ b/.github/workflows/security-alert-burndown.campaign.lock.yml @@ -911,16 +911,16 @@ jobs: --- # Workflow Execution - This campaign orchestrator can execute workflows as needed. Your role is to run the workflows listed in sequence, collect their outputs, and use those outputs to drive the campaign forward. + This campaign references the following campaign workers. These workers follow the first-class worker pattern: they are dispatch-only workflows with standardized input contracts. 
- **IMPORTANT: Workflow execution is an advanced capability. Exercise caution and follow all guidelines carefully.** + **IMPORTANT: Workers are orchestrated, not autonomous. They accept `campaign_id` and `payload` inputs via workflow_dispatch.** --- - ## Workflows to Execute + ## Campaign Workers - The following workflows should be executed in order: + The following campaign workers are referenced by this campaign: 1. `code-scanning-fixer` @@ -930,189 +930,273 @@ jobs: + **Worker Pattern**: All workers MUST: + - Use `workflow_dispatch` as the ONLY trigger (no schedule/push/pull_request) + - Accept `campaign_id` (string) and `payload` (string; JSON) inputs + - Implement idempotency via deterministic work item keys + - Label all created items with `campaign:security-alert-burndown` + --- ## Workflow Creation Guardrails - ### Before Creating Any Workflow, Ask: + ### Before Creating Any Worker Workflow, Ask: 1. **Does this workflow already exist?** - Check `.github/workflows/` thoroughly - 2. **Can an existing workflow be used?** - Even if not perfect, existing is safer + 2. **Can an existing workflow be adapted?** - Even if not perfect, existing is safer 3. **Is the requirement clear?** - Can you articulate exactly what it should do? - 4. **Is it testable?** - Can you verify it works before using it in the campaign? - 5. **Is it reusable?** - Could other campaigns benefit from this workflow? + 4. **Is it testable?** - Can you verify it works with test inputs? + 5. **Is it reusable?** - Could other campaigns benefit from this worker? 
- ### Only Create New Workflows When: + ### Only Create New Workers When: ✅ **All these conditions are met:** - No existing workflow does the required task - The campaign objective explicitly requires this capability - - You have a clear, specific design for the workflow - - The workflow has a focused, single-purpose scope + - You have a clear, specific design for the worker + - The worker has a focused, single-purpose scope - You can test it independently before campaign use - ❌ **Never create workflows when:** + ❌ **Never create workers when:** - You're unsure about requirements - An existing workflow "mostly" works - - The workflow would be complex or multi-purpose + - The worker would be complex or multi-purpose - You haven't verified it doesn't already exist - You can't clearly explain what it does in one sentence --- - ## Execution Process + ## Worker Creation Template - For each workflow: + If you must create a new worker (only after checking ALL guardrails above), use this template: - 1. **Check if workflow exists** - Look for `.github/workflows/.md` + **Create the workflow file at `.github/workflows/.md`:** - 2. 
**Create workflow if needed** - Only if ALL guardrails above are satisfied: - - **Design requirements:** - - **Single purpose**: One clear task (e.g., "scan for outdated dependencies", not "scan and update") - - **Explicit trigger**: Must include `workflow_dispatch` for manual/programmatic execution - - **Minimal tools**: Only include tools actually needed (principle of least privilege) - - **Safe outputs only**: Use appropriate safe-output limits (max: 5 for first version) - - **Clear prompt**: Describe exactly what the workflow should do and return - - **Create the workflow file at `.github/workflows/.md`:** - ```yaml - --- - name: - description: - - on: - workflow_dispatch: # Required for execution - inputs: - priority: - description: 'Priority level for this execution' - required: false - type: choice - options: - - low - - medium - - high - default: medium - target: - description: 'Specific target or scope for this run' - required: false - type: string - - tools: - github: - toolsets: [default] # Adjust based on needs - # Only add other tools if absolutely necessary - - safe-outputs: - create-issue: - max: 3 # Start conservative - add-comment: - max: 2 - --- - - # - - You are a focused workflow that . - - Priority: \$\{\{ github.event.inputs.priority \}\} - Target: \$\{\{ github.event.inputs.target \}\} - - ## Task - - - - ## Output - - + ```yaml + --- + name: + description: + + on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload with work item details' + required: true + type: string + + tracker-id: + + tools: + github: + toolsets: [default] + # Add minimal additional tools as needed + + safe-outputs: + create-pull-request: + max: 1 # Start conservative + add-comment: + max: 2 + --- + + # + + You are a campaign worker that processes work items. 
+ + ## Input Contract + + Parse inputs: + ```javascript + const campaignId = context.payload.inputs.campaign_id; + const payload = JSON.parse(context.payload.inputs.payload); + ``` + + Expected payload structure: + ```jsonc + { + "repository": "owner/repo", + "work_item_id": "unique-id", + "target_ref": "main" + // Additional worker-specific fields may follow + } + ``` + + ## Idempotency Requirements + + 1. **Generate deterministic key**: + ```javascript + const workKey = `campaign-${campaignId}-${payload.repository}-${payload.work_item_id}`; ``` - - **Note**: Define `inputs` under `workflow_dispatch` to accept parameters from the orchestrator. Use `\$\{\{ github.event.inputs.INPUT_NAME \}\}` to reference input values in your workflow markdown. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input types and examples. - - - Compile it with `gh aw compile .md` - - **CRITICAL: Test before use** (see testing requirements below) - 3. **Test newly created workflows** (MANDATORY): - - **Why test?** - Untested workflows may fail during campaign execution, blocking progress. Test first to catch issues early. - - **Testing steps:** - - Trigger test run: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - - Wait for completion: Poll until status is "completed" - - **Verify success**: Check that workflow succeeded and produced expected outputs - - **Review outputs**: Ensure results match expectations (check artifacts, issues created, etc.) - - **If test fails**: Revise the workflow, recompile, and test again - - **Only proceed** after successful test run + 2. **Check for existing work**: + - Search for PRs/issues with `workKey` in title + - Filter by label: `campaign:${campaignId}` + - If found: Skip or update + - If not: Create new + + 3. 
**Label all created items**: + - Apply `campaign:${campaignId}` label + - This enables discovery by orchestrator + + ## Task + + + + ## Output + + Report: + - Link to created/updated PR or issue + - Whether work was skipped (exists) or completed + - Any errors or blockers + ``` + + **After creating:** + - Compile: `gh aw compile .md` + - **CRITICAL: Test with sample inputs** (see testing requirements below) + + --- + + ## Worker Testing (MANDATORY) + + **Why test?** - Untested workers may fail during campaign execution. Test with sample inputs first to catch issues early. + + **Testing steps:** + + 1. **Prepare test payload**: + ```json + { + "repository": "test-org/test-repo", + "work_item_id": "test-1", + "target_ref": "main" + } + ``` + + 2. **Trigger test run**: + ```bash + gh workflow run .yml \ + -f campaign_id=security-alert-burndown \ + -f payload='{"repository":"test-org/test-repo","work_item_id":"test-1"}' + ``` - **Test failure actions:** - - DO NOT use the workflow in the campaign if testing fails - - Analyze the failure logs to understand what went wrong - - Make necessary corrections to the workflow - - Recompile and retest - - If you can't fix it after 2 attempts, report in status update and skip this workflow + Or via GitHub MCP: + ```javascript + mcp__github__run_workflow( + workflow_id: "", + ref: "main", + inputs: { + campaign_id: "security-alert-burndown", + payload: JSON.stringify({repository: "test-org/test-repo", work_item_id: "test-1"}) + } + ) + ``` - 4. 
**Execute the workflow** (skip if just tested successfully): - - Trigger: `mcp__github__run_workflow(workflow_id: "", ref: "main")` - - **Pass input parameters based on decisions**: If the workflow accepts inputs, provide them to guide execution (e.g., `inputs: {priority: "high", target: "security"}`) - - Wait for completion: Poll `mcp__github__get_workflow_run(run_id)` until status is "completed" - - Collect outputs: Check `mcp__github__download_workflow_run_artifact()` for any artifacts - - **Handle failures gracefully**: If execution fails, note it in status update but continue campaign + 3. **Wait for completion**: Poll until status is "completed" - 5. **Use outputs for next steps** - Use information from workflow runs to: - - Inform subsequent workflow executions (e.g., scanner results → upgrader inputs) - - Pass contextual inputs to worker workflows based on campaign state and decisions - - Update project board items with relevant information - - Make decisions about campaign progress and next actions + 4. **Verify success**: + - Check that workflow succeeded + - Verify idempotency: Run again with same inputs, should skip/update + - Review created items have correct labels + - Confirm deterministic keys are used + + 5. **Test failure actions**: + - DO NOT use the worker if testing fails + - Analyze failure logs + - Make corrections + - Recompile and retest + - If unfixable after 2 attempts, report in status and skip **Note**: Workflows that accept `workflow_dispatch` inputs can receive parameters from the orchestrator. This enables the orchestrator to provide context, priorities, or targets based on its decisions. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input parameter examples. 
--- - ## Guidelines - - **Execution order:** - - Execute workflows **sequentially** (one at a time) - - Wait for each workflow to complete before starting the next - - **Why sequential?** - Ensures dependencies between workflows are respected and reduces API load + ## Orchestration Guidelines + + **Execution pattern:** + - Workers are **orchestrated, not autonomous** + - Orchestrator discovers work items via discovery manifest + - Orchestrator decides which workers to run and with what inputs + - Workers receive `campaign_id` and `payload` via workflow_dispatch + - Sequential vs parallel execution is orchestrator's decision + + **Worker dispatch:** + - Parse discovery manifest (`./.gh-aw/campaign.discovery.json`) + - For each work item needing processing: + 1. Determine appropriate worker for this item type + 2. Construct payload with work item details + 3. Dispatch worker via workflow_dispatch with campaign_id and payload + 4. Track dispatch status + + **Input construction:** + ```javascript + // Example: Dispatching security-fix worker + const workItem = discoveryManifest.items[0]; + const payload = { + repository: workItem.repo, + work_item_id: `alert-${workItem.number}`, + target_ref: "main", + alert_type: "sql-injection", + file_path: "src/db.go", + line_number: 42 + }; + + await github.actions.createWorkflowDispatch({ + owner: context.repo.owner, + repo: context.repo.repo, + workflow_id: "security-fix-worker.yml", + ref: "main", + inputs: { + campaign_id: "security-alert-burndown", + payload: JSON.stringify(payload) + } + }); + ``` - **Workflow creation:** - - **Always test newly created workflows** before using them in the campaign - - **Why test first?** - Prevents campaign disruption from broken workflows - - Start with minimal, focused workflows (easier to test and debug) - - **Why minimal?** - Reduces complexity and points of failure - - Keep designs simple and aligned with campaign objective - - **Why simple?** - Easier to understand, test, and maintain 
+ **Idempotency by design:** + - Workers implement their own idempotency checks + - Orchestrator doesn't need to track what's been processed + - Can safely re-dispatch work items across runs + - Workers will skip or update existing items **Failure handling:** - - If a workflow test fails, revise and retest before proceeding - - **Why retry?** - Initial failures often due to minor issues easily fixed - - If a workflow fails during campaign execution, note the failure and continue - - **Why continue?** - One workflow failure shouldn't block entire campaign progress - - Report all failures in the status update with context - - **Why report?** - Transparency helps humans intervene if needed - - **Workflow reusability:** - - Workflows you create should be reusable for future campaign runs - - **Why reusable?** - Reduces need to create workflows repeatedly, builds library of capabilities - - Avoid campaign-specific logic in workflows (keep them generic) - - **Why generic?** - Enables reuse across different campaigns - - **Permissions and safety:** - - Keep workflow permissions minimal (only what's needed) - - **Why minimal?** - Reduces risk and follows principle of least privilege - - Prefer draft PRs over direct merges for code changes - - **Why drafts?** - Requires human review before merging changes - - Escalate to humans when uncertain about decisions - - **Why escalate?** - Human oversight prevents risky autonomous actions + - If a worker dispatch fails, note it but continue + - Worker failures don't block entire campaign + - Report all failures in status update with context + - Humans can intervene if needed + + --- + + ## After Worker Orchestration + + Once workers have been dispatched (or new workers created and tested), proceed with normal orchestrator steps: + + 1. **Discovery** - Read state from discovery manifest and project board + 2. **Planning** - Determine what needs updating on project board + 3. 
**Project Updates** - Write state changes to project board + 4. **Status Reporting** - Report progress, worker dispatches, failures, next steps --- - ## After Workflow Execution + ## Key Differences from Fusion Approach + + **Old fusion approach (REMOVED)**: + - Workers had mixed triggers (schedule + workflow_dispatch) + - Fusion dynamically added workflow_dispatch to existing workflows + - Workers stored in campaign-specific folders + - Ambiguous ownership and trigger precedence + + **New first-class worker approach**: + - Workers are dispatch-only (on: workflow_dispatch) + - Standardized input contract (campaign_id, payload) + - Explicit idempotency via deterministic keys + - Clear ownership: workers are orchestrated, not autonomous + - Workers stored with regular workflows (not campaign-specific folders) + - Orchestration policy kept explicit in orchestrator - Once all workflows have been executed (or created and executed), proceed with the normal orchestrator steps: - - Step 1: Discovery (read state from manifest and project board) - - Step 2: Planning (determine what needs updating) - - Step 3: Project Updates (write state to project board) - - Step 4: Status Reporting (report progress, failures, and next steps) + This eliminates duplicate execution problems and makes orchestration concerns explicit. --- # ORCHESTRATOR INSTRUCTIONS --- @@ -1288,6 +1372,8 @@ jobs: --- + PROMPT_EOF + cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" ### Step 1 — Read State (Discovery) [NO WRITES] **IMPORTANT**: Discovery has been precomputed. Read the discovery manifest instead of performing GitHub-wide searches. @@ -1300,8 +1386,6 @@ jobs: 2) Read current GitHub Project board state (items + required fields). 
3) Parse discovered items from the manifest: - PROMPT_EOF - cat << 'PROMPT_EOF' >> "$GH_AW_PROMPT" - Each item has: url, content_type (issue/pull_request/discussion), number, repo, created_at, updated_at, state - Closed items have: closed_at (for issues) or merged_at (for PRs) - Items are pre-sorted by updated_at for deterministic processing diff --git a/docs/campaign-worker-fusion.md b/docs/campaign-worker-fusion.md deleted file mode 100644 index e8bb499913..0000000000 --- a/docs/campaign-worker-fusion.md +++ /dev/null @@ -1,5 +0,0 @@ -# Campaign Worker Workflow Fusion - -Campaign worker workflow fusion adapts existing workflows for campaign use by adding `workflow_dispatch` triggers and storing them in `.github/workflows/campaigns//` folders. This enables campaign orchestrators to dispatch workers on-demand using the `dispatch_workflow` safe output, while maintaining clear lineage through metadata (`campaign-worker: true`, `campaign-id`, `source-workflow`). The separate folder structure supports future pattern analysis to identify which workflow patterns work best for different campaign types. - -See [Campaign Examples](./docs/src/content/docs/examples/campaigns.md) for usage examples. diff --git a/docs/campaign-workers.md b/docs/campaign-workers.md new file mode 100644 index 0000000000..5a9774f293 --- /dev/null +++ b/docs/campaign-workers.md @@ -0,0 +1,474 @@ +# Campaign Workers + +Campaign workers are first-class workflows designed to be orchestrated by campaign orchestrators. This document describes the worker pattern, input contract, and idempotency requirements. + +## Overview + +Campaign workers follow these principles: + +1. **Dispatch-only**: Workers are triggered via `workflow_dispatch` by orchestrators +2. **Standardized contract**: All workers accept `campaign_id` and `payload` inputs +3. **Idempotent**: Workers use deterministic keys to avoid duplicate work +4. 
**Orchestration-agnostic**: Workers don't encode orchestration policy + +## Why Dispatch-Only? + +Making workers dispatch-only (no schedule/push/pull_request triggers) provides several benefits: + +- **Unambiguous ownership**: Workers are clearly orchestrated, not autonomous +- **Prevents duplicate execution**: Avoids conflicts between original triggers and orchestrator +- **Explicit orchestration**: Orchestrator controls when and how workers run +- **Clear responsibility**: Sequential vs parallel execution is orchestrator's concern + +## Input Contract + +All campaign workers MUST accept these inputs: + +```yaml +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload with work item details' + required: true + type: string +``` + +### Campaign ID + +The `campaign_id` identifies the campaign orchestrating this worker. Use it to: + +- Label created items: `campaign:${campaign_id}` +- Generate deterministic keys: `campaign-${campaign_id}-${repository}-${work_item_id}` +- Track work in repo-memory: `memory/campaigns/${campaign_id}/` + +### Payload + +The `payload` is a JSON string containing work-specific data. Parse it to extract: + +- `repository`: Target repository (owner/repo format) +- `work_item_id`: Unique identifier for this work item +- `target_ref`: Target branch/ref (e.g., "main") +- Additional context specific to the worker + +Example payload: +```json +{ + "repository": "owner/repo", + "work_item_id": "alert-123", + "target_ref": "main", + "alert_type": "sql-injection", + "severity": "high", + "file_path": "src/database/query.go", + "line_number": 42 +} +``` + +## Idempotency Requirements + +Workers MUST implement idempotency to prevent duplicate work across repeated orchestrator runs.
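To make the requirement concrete, here is a minimal sketch of how a worker might parse the `payload` input and derive a key purely from its inputs, so that repeated runs compute the same string. The helper names (`parsePayload`, `buildWorkKey`) are hypothetical, and replacing `/` with `-` in the repository name is an assumption inferred from the example keys in this document, not something the contract prescribes.

```javascript
// Hypothetical helpers; names and validation rules are illustrative assumptions.

// Parse the JSON `payload` input and check the fields the contract lists.
function parsePayload(rawPayload) {
  let payload;
  try {
    payload = JSON.parse(rawPayload);
  } catch (err) {
    throw new Error(`payload is not valid JSON: ${err.message}`);
  }
  if (payload === null || typeof payload !== "object" || Array.isArray(payload)) {
    throw new Error("payload must be a JSON object");
  }
  for (const field of ["repository", "work_item_id"]) {
    if (typeof payload[field] !== "string" || payload[field] === "") {
      throw new Error(`payload is missing required field: ${field}`);
    }
  }
  return payload;
}

// Build campaign-{campaign_id}-{repository}-{work_item_id}.
// Assumption: "/" in owner/repo becomes "-" so the key stays a single token.
function buildWorkKey(campaignId, payload) {
  const repoSlug = payload.repository.replace(/\//g, "-");
  return `campaign-${campaignId}-${repoSlug}-${payload.work_item_id}`;
}

const payload = parsePayload('{"repository":"myorg/myrepo","work_item_id":"alert-123"}');
console.log(buildWorkKey("security-q1-2025", payload));
// prints "campaign-security-q1-2025-myorg-myrepo-alert-123"
```

Because the key depends only on `campaign_id` and payload fields, never on timestamps or random values, a second dispatch with the same inputs yields the same branch name and title to search for.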
+ +### Deterministic Work Item Keys + +Compute a stable key for each work item: + +``` +campaign-{campaign_id}-{repository}-{work_item_id} +``` + +Example: `campaign-security-q1-2025-myorg-myrepo-alert-123` + +Use this key in: +- Branch names: `fix/campaign-security-q1-2025-myorg-myrepo-alert-123` +- PR titles: `[campaign-security-q1-2025-myorg-myrepo-alert-123] Fix SQL injection` +- Issue titles: `[alert-123] High severity: SQL injection vulnerability` + +### Check Before Create + +Before creating any GitHub resource: + +1. **Search for existing items** with the deterministic key +2. **Filter by campaign label**: `campaign:${campaign_id}` +3. **If found**: Skip or update existing item +4. **If not found**: Proceed with creation + +Example: +```javascript +// Replace "/" in owner/repo so the key matches the example format above +const workKey = `campaign-${campaignId}-${repository.replace('/', '-')}-${workItemId}`; +const searchQuery = `repo:${repository} is:pr is:open "${workKey}" in:title`; + +// Octokit wraps the response body in `data` +const { data: existingPRs } = await github.search.issuesAndPullRequests({ + q: searchQuery +}); + +if (existingPRs.total_count > 0) { + console.log(`PR already exists: ${existingPRs.items[0].html_url}`); + return; // Skip creation +} +``` + +### Label All Created Items + +Apply the campaign tracker label to all created items: + +- Label format: `campaign:${campaign_id}` +- Prevents interference from other workflows +- Enables discovery by orchestrator + +## Worker Template + +```yaml +--- +name: My Campaign Worker +description: Worker workflow for campaign orchestration + +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload with work item details' + required: true + type: string + +tracker-id: my-campaign-worker + +tools: + github: + toolsets: [default] + +safe-outputs: + create-pull-request: + max: 1 + add-comment: + max: 2 +--- + +# My Campaign Worker + +You are a campaign worker that processes work items.
+ +## Step 1: Parse Input + +Parse the workflow_dispatch inputs: + +```javascript +const campaignId = context.payload.inputs.campaign_id; +const payload = JSON.parse(context.payload.inputs.payload); +``` + +Extract work item details from payload: +- `repository`: Target repository +- `work_item_id`: Unique identifier +- Additional context fields + +## Step 2: Check for Existing Work + +Generate deterministic key: +```javascript +const workKey = `campaign-${campaignId}-${payload.repository}-${payload.work_item_id}`; +``` + +Search for existing PR/issue with this key in title. + +If found: +- Log that work already exists +- Optionally add a comment with status update +- Exit successfully + +## Step 3: Perform Work + +If no existing work found: +1. Create branch with deterministic name +2. Make required changes +3. Create PR with deterministic title +4. Apply labels: `campaign:${campaignId}`, [additional labels] + +## Step 4: Report Status + +Report completion: +- Link to created/updated PR or issue +- Whether work was skipped or completed +- Any errors or blockers encountered +``` + +## Idempotency Patterns + +### Pattern 1: Branch-based Idempotency + +```yaml +# In worker prompt +Branch naming pattern: `fix/campaign-${campaignId}-${repo}-${workItemId}` + +Before creating: +1. Check if branch exists in target repo +2. If exists: Checkout and update +3. If not: Create new branch +``` + +### Pattern 2: PR Title-based Idempotency + +```yaml +# In worker prompt +PR title pattern: `[${workKey}] ${description}` + +Before creating PR: +1. Search for PRs with `${workKey}` in title +2. Filter by `campaign:${campaignId}` label +3. If found: Update with comment or skip +4. If not: Create new PR +``` + +### Pattern 3: Cursor-based Tracking + +```yaml +# In worker prompt +Track processed items in repo-memory: +- Path: `memory/campaigns/${campaignId}/processed-${workerId}.json` +- Format: `{"processed": ["item-1", "item-2"]}` + +Before processing: +1. 
Load processed items from repo-memory +2. Check if current work_item_id is in list +3. If in list: Skip +4. If not: Process, add to list, save +``` + +### Pattern 4: Issue Title-based Idempotency + +```yaml +# In worker prompt +Issue title pattern: `[${workItemId}] ${description}` + +Before creating issue: +1. Search for issues with `[${workItemId}]` in title +2. Filter by `campaign:${campaignId}` label +3. If found: Update existing issue +4. If not: Create new issue +``` + +## Example: Security Fix Worker + +```yaml +--- +name: Security Fix Worker +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON with alert details' + required: true + type: string + +tracker-id: security-fix-worker + +tools: + github: + toolsets: [default, code_security] + bash: ["*"] + edit: true + +safe-outputs: + create-pull-request: + max: 1 +--- + +# Security Fix Worker + +Process a code scanning alert and create a fix PR. + +## Parse Input + +```javascript +const campaignId = context.payload.inputs.campaign_id; +const payload = JSON.parse(context.payload.inputs.payload); +// payload: { repository, work_item_id: "alert-123", alert_type, file_path, ... 
} +``` + +## Idempotency Check + +```javascript +// Key follows campaign-{campaign_id}-{repository}-{work_item_id}; work_item_id already carries the "alert-" prefix +const workKey = `campaign-${campaignId}-${payload.repository.replace('/', '-')}-${payload.work_item_id}`; +const branchName = `fix/${workKey}`; +const prTitle = `[${workKey}] Fix: ${payload.alert_type} in ${payload.file_path}`; + +// Search for existing PR (Octokit wraps the response body in `data`) +const { data: existingPRs } = await github.search.issuesAndPullRequests({ + q: `repo:${payload.repository} is:pr is:open "${workKey}" in:title` +}); + +if (existingPRs.total_count > 0) { + console.log(`PR already exists: ${existingPRs.items[0].html_url}`); + // Optionally add comment with update + await github.issues.createComment({ + owner: payload.repository.split('/')[0], + repo: payload.repository.split('/')[1], + issue_number: existingPRs.items[0].number, + body: `Still being tracked by campaign ${campaignId}` + }); + return; +} +``` + +## Create Fix + +```bash +# Clone repo and create branch +git clone https://github.com/${payload.repository}.git +cd $(basename ${payload.repository}) +git checkout -b ${branchName} + +# Make security fix +# ... fix code ... + +# Commit and push +git add . +git commit -m "Fix ${payload.alert_type} in ${payload.file_path}" +git push origin ${branchName} +``` + +## Create PR + +```javascript +// Destructure `data` so `pr` is the pull request object itself +const { data: pr } = await github.pulls.create({ + owner: payload.repository.split('/')[0], + repo: payload.repository.split('/')[1], + title: prTitle, + body: `Fixes security alert ${payload.work_item_id}\n\n**Campaign**: ${campaignId}\n**Alert Type**: ${payload.alert_type}`, + head: branchName, + base: payload.target_ref || 'main' +}); + +// Apply labels +await github.issues.addLabels({ + owner: payload.repository.split('/')[0], + repo: payload.repository.split('/')[1], + issue_number: pr.number, + labels: [`campaign:${campaignId}`, 'security', 'automated'] +}); + +console.log(`Created PR: ${pr.html_url}`); +``` + +## Report Status + +Output: +- PR URL +- Alert ID processed +- Fix applied +- Labels added +``` + +## Best Practices + +### 1.
Single Responsibility + +Each worker should have one clear purpose: +- ✅ "Create security fix PRs" +- ✅ "Update dependency versions" +- ❌ "Scan and fix and test and deploy" + +### 2. Deterministic Behavior + +Workers should produce the same output for the same input: +- Use deterministic keys based on input data +- Don't rely on timestamps or random values +- Make work idempotent via existence checks + +### 3. Explicit Errors + +Report errors clearly: +- Log what failed and why +- Include relevant context (repo, work item ID) +- Don't fail silently + +### 4. Minimal Permissions + +Request only needed permissions: +- Use specific GitHub toolsets +- Limit safe-output maxima (start with 1-3) +- Don't request wildcard permissions + +### 5. Clear Completion Status + +Always report what happened: +- "Created PR: [url]" +- "Skipped: PR already exists" +- "Failed: Missing required data" + +## Testing Workers + +Before using a worker in a campaign: + +1. **Test manually** with workflow_dispatch: + ```bash + gh workflow run my-worker.yml \ + -f campaign_id=test-campaign \ + -f payload='{"repository":"owner/repo","work_item_id":"test-1"}' + ``` + +2. **Verify idempotency** by running twice: + - First run should create resources + - Second run should skip/update without errors + +3. **Check labels** on created items: + - Verify `campaign:test-campaign` label is applied + - Confirm tracker-id is in description (if applicable) + +4. 
**Test error cases**: + - Invalid repository + - Missing payload fields + - Duplicate work items + +## Migration from Fusion Approach + +If you have workflows that used the old fusion approach: + +### Before (Fusion): +```yaml +on: + schedule: daily + push: + workflow_dispatch: # Added by fusion +``` + +### After (Dispatch-Only): +```yaml +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload' + required: true + type: string +``` + +### Migration Steps: + +1. **Remove autonomous triggers**: Delete schedule/push/pull_request +2. **Add input contract**: Add campaign_id and payload inputs +3. **Update prompt**: Parse inputs at the start +4. **Add idempotency**: Implement deterministic key checking +5. **Apply campaign label**: Label all created items +6. **Test**: Verify with manual dispatch + +## See Also + +- [Campaign Files Architecture](../specs/campaigns-files.md) +- [Campaign Examples](./src/content/docs/examples/campaigns.md) +- [Safe Outputs Documentation](./src/content/docs/reference/safe-outputs.md) diff --git a/docs/src/content/docs/examples/campaigns.md b/docs/src/content/docs/examples/campaigns.md index 55b72dd805..116d41f6ef 100644 --- a/docs/src/content/docs/examples/campaigns.md +++ b/docs/src/content/docs/examples/campaigns.md @@ -1,26 +1,27 @@ --- title: Campaign Examples -description: Example campaign workflows demonstrating worker orchestration and pattern analysis +description: Example campaign workflows demonstrating worker orchestration with standardized contracts sidebar: badge: { text: 'Examples', variant: 'note' } --- -This section contains example campaign workflows that demonstrate how to use campaign worker orchestration, workflow discovery, and the dispatch_workflow safe output. +This section contains example campaign workflows that demonstrate how to use first-class campaign workers with standardized input contracts and idempotency. 
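As a rough illustration of the standardized input contract from the orchestrator's side, the sketch below assembles the two dispatch inputs for one work item. The helper name `buildDispatchInputs` is hypothetical and not part of gh-aw; the only thing taken from the contract is that `campaign_id` and `payload` are both strings, with `payload` carrying JSON.

```javascript
// Hypothetical orchestrator-side helper; not an existing gh-aw API.
function buildDispatchInputs(campaignId, workItem) {
  if (typeof campaignId !== "string" || campaignId === "") {
    throw new Error("campaign_id must be a non-empty string");
  }
  if (!workItem || typeof workItem.repository !== "string" || typeof workItem.work_item_id !== "string") {
    throw new Error("work item needs repository and work_item_id");
  }
  // Both inputs are declared as `type: string`, so the work item is serialized to JSON.
  return {
    campaign_id: campaignId,
    payload: JSON.stringify(workItem),
  };
}

const inputs = buildDispatchInputs("test-campaign", {
  repository: "owner/repo",
  work_item_id: "test-1",
});
console.log(inputs.payload);
// prints {"repository":"owner/repo","work_item_id":"test-1"}
```

Keeping serialization in one place means every worker receives the same shape, whichever campaign dispatches it.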
## Security Audit Campaign [**Security Audit 2026**](/gh-aw/examples/campaigns/security-auditcampaign/) - A comprehensive security audit campaign that demonstrates: -- **Worker Discovery**: Finding existing security-related workflows -- **Workflow Fusion**: Adapting workflows with `workflow_dispatch` triggers -- **Orchestration**: Using `dispatch_workflow` to coordinate multiple workers +- **Worker Discovery**: Finding security-related issues and PRs via tracker labels +- **Dispatch-Only Workers**: Workers designed specifically for campaign orchestration +- **Standardized Contract**: All workers accept `campaign_id` and `payload` inputs +- **Idempotency**: Workers check for existing work before creating duplicates - **KPI Tracking**: Measuring vulnerability reduction over time -- **Pattern Analysis**: Organizing workers in campaign-specific folders ### Key Features -- 3 worker workflows (scanner, updater, reporter) +- 3 dispatch-only worker workflows (scanner, fixer, reviewer) - Governance policies for pacing and opt-out +- Deterministic work item keys to prevent duplicates - Quarterly timeline with weekly status updates - Executive sponsorship and risk management @@ -28,10 +29,11 @@ This section contains example campaign workflows that demonstrate how to use cam [**Security Scanner**](/gh-aw/examples/campaigns/security-scanner/) - An example security scanner workflow that: -- Runs on a schedule (weekly) -- Creates issues for vulnerabilities -- Uses tracker-id for campaign discovery -- Can be dispatched by campaign orchestrators +- Accepts `campaign_id` and `payload` inputs via workflow_dispatch +- Uses deterministic keys for branch names and PR titles +- Checks for existing PRs before creating new ones +- Labels all created items with `campaign:{id}` for tracking +- Reports completion status back to orchestrator ## Using These Examples @@ -39,43 +41,70 @@ This section contains example campaign workflows that demonstrate how to use cam Campaign specs 
(`.campaign.md` files) define: - Campaign goals and KPIs -- Worker workflows to orchestrate +- Worker workflows to reference (by name) +- Discovery scope (repos/orgs to search) - Memory paths for state persistence - Governance and pacing policies -### 2. Worker Workflow Pattern +### 2. Worker Workflow Pattern (Dispatch-Only) -Worker workflows should: -- Support workflow_dispatch for orchestration +Worker workflows MUST: +- Use `workflow_dispatch` as the ONLY trigger (no schedule/push/pull_request) +- Accept standardized inputs: `campaign_id` (string) and `payload` (string; JSON) +- Implement idempotency via deterministic work item keys +- Label all created items with `campaign:{campaign_id}` - Focus on specific, repeatable tasks -- Be campaign-agnostic (reusable) -- Optionally include tracker-id in frontmatter to add tracking metadata to created issues/PRs -### 3. Folder Organization +Example: +```yaml +on: + workflow_dispatch: + inputs: + campaign_id: + description: 'Campaign identifier' + required: true + type: string + payload: + description: 'JSON payload with work item details' + required: true + type: string +``` + +### 3. Idempotency Requirements + +Workers prevent duplicates by: +1. Computing deterministic keys: `campaign-{campaign_id}-{repository}-{work_item_id}` +2. Using keys in branch names, PR titles, issue titles +3. Checking for existing work with the key before creating +4. Skipping or updating existing items rather than creating duplicates + +### 4. Folder Organization ``` -docs/src/content/docs/examples/campaigns/ -├── security-audit.campaign.md # Campaign spec -└── security-scanner.md # Example worker workflow - -.github/workflows/campaigns/ -└── security-audit-2026/ # Fused workers at runtime - ├── security-scanner-worker.md - └── ... 
+.github/workflows/ +├── my-campaign.campaign.md # Campaign spec +├── my-worker.md # Worker workflow (dispatch-only) +└── my-campaign.campaign.lock.yml # Compiled orchestrator + +docs/ +└── campaign-workers.md # Worker pattern documentation ``` +Workers are stored alongside regular workflows, not in campaign-specific folders. The dispatch-only trigger makes ownership clear. + ## Learn More -- [Campaign Guides](/gh-aw/guides/campaigns/) - Complete campaign documentation +- [Campaign Guides](/gh-aw/guides/campaigns/) - Campaign setup and configuration - [Flow & lifecycle](/gh-aw/guides/campaigns/flow/) - How the orchestrator runs -- [Dispatch Workflow](/gh-aw/guides/dispatchops/) - Using workflow_dispatch - [Safe Outputs](/gh-aw/reference/safe-outputs/) - dispatch_workflow configuration ## Pattern Analysis -These examples are organized to enable future pattern analysis: -- Which workflows work best for security campaigns? -- What KPIs are most effective for different campaign types? -- How should workers be organized for optimal results? +These examples demonstrate best practices for campaign workers: +- **Explicit ownership**: Workers are dispatch-only, clearly orchestrated +- **Standardized contract**: All workers use the same input format +- **Idempotent behavior**: Workers avoid duplicate work across runs +- **Deterministic keys**: Enable reliable duplicate detection +- **Simple units**: Workers are focused, stateless, deterministic -The separate folder structure allows tracking and learning from campaign outcomes over time. +The dispatch-only pattern eliminates confusion about trigger precedence and makes orchestration explicit. 
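One way to check the dispatch-only rule mechanically is sketched below. It assumes the workflow frontmatter has already been parsed into a plain object and that its `on` value is passed in. The function names are hypothetical, and the handling of string, list, and map forms of `on` is an assumption about how triggers may be written, not a description of the gh-aw compiler.

```javascript
// Hypothetical validator; `on` is the already-parsed `on:` value from workflow frontmatter.

// List trigger names for the string, array, and map forms of `on`.
function triggerNames(on) {
  if (typeof on === "string") return [on.trim()];
  if (Array.isArray(on)) return on.map((name) => String(name).trim());
  if (on !== null && typeof on === "object") return Object.keys(on);
  return [];
}

// True only when workflow_dispatch is the single trigger.
function isDispatchOnly(on) {
  const names = triggerNames(on);
  return names.length === 1 && names[0] === "workflow_dispatch";
}

// True when both contract inputs are declared as required strings.
function hasWorkerInputs(on) {
  const inputs = on?.workflow_dispatch?.inputs;
  if (!inputs || typeof inputs !== "object") return false;
  return ["campaign_id", "payload"].every((name) => {
    const input = inputs[name];
    return Boolean(input) && input.required === true && input.type === "string";
  });
}

const worker = {
  workflow_dispatch: {
    inputs: {
      campaign_id: { description: "Campaign identifier", required: true, type: "string" },
      payload: { description: "JSON payload", required: true, type: "string" },
    },
  },
};
const mixedTriggers = { schedule: "daily", push: null, workflow_dispatch: null };

console.log(isDispatchOnly(worker), hasWorkerInputs(worker)); // prints "true true"
console.log(isDispatchOnly(mixedTriggers)); // prints "false"
```

A check like this could run before a campaign references a worker, so a workflow that still carries schedule or push triggers is rejected early.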
diff --git a/pkg/campaign/workflow_fusion.go b/pkg/campaign/workflow_fusion.go deleted file mode 100644 index 03e11fcbf5..0000000000 --- a/pkg/campaign/workflow_fusion.go +++ /dev/null @@ -1,158 +0,0 @@ -package campaign - -import ( - "fmt" - "os" - "path/filepath" - "strings" - - "github.com/githubnext/gh-aw/pkg/logger" - "github.com/githubnext/gh-aw/pkg/parser" - "github.com/goccy/go-yaml" -) - -var workflowFusionLog = logger.New("campaign:workflow_fusion") - -// FusionResult contains the result of fusing a workflow for campaign use -type FusionResult struct { - OriginalWorkflowID string // Original workflow ID - CampaignWorkflowID string // New workflow ID in campaign folder - OutputPath string // Path to the fused workflow file - WorkflowDispatch bool // Whether workflow_dispatch was added -} - -// FuseWorkflowForCampaign takes an existing workflow and adapts it for campaign use -// by adding workflow_dispatch trigger and storing it in a campaign-specific folder -func FuseWorkflowForCampaign(rootDir string, workflowID string, campaignID string) (*FusionResult, error) { - workflowFusionLog.Printf("Fusing workflow %s for campaign %s", workflowID, campaignID) - - // Read original workflow - originalPath := filepath.Join(rootDir, ".github", "workflows", workflowID+".md") - content, err := os.ReadFile(originalPath) - if err != nil { - return nil, fmt.Errorf("failed to read workflow file: %w", err) - } - - // Parse frontmatter - result, err := parser.ExtractFrontmatterFromContent(string(content)) - if err != nil { - return nil, fmt.Errorf("failed to parse workflow: %w", err) - } - - frontmatter := result.Frontmatter - bodyContent := result.Markdown - - // Check if workflow_dispatch already exists - hasWorkflowDispatch := checkWorkflowDispatch(frontmatter) - - // Add workflow_dispatch if not present - if !hasWorkflowDispatch { - workflowFusionLog.Printf("Adding workflow_dispatch trigger to %s", workflowID) - frontmatter = addWorkflowDispatch(frontmatter) - } - - // 
Add campaign metadata - frontmatter["campaign-worker"] = true - frontmatter["campaign-id"] = campaignID - frontmatter["source-workflow"] = workflowID - - // Marshal frontmatter back to YAML - frontmatterYAML, err := yaml.Marshal(frontmatter) - if err != nil { - return nil, fmt.Errorf("failed to marshal frontmatter: %w", err) - } - - // Reconstruct workflow content - newContent := fmt.Sprintf("---\n%s---\n%s", string(frontmatterYAML), bodyContent) - - // Create campaign folder structure - campaignDir := filepath.Join(rootDir, ".github", "workflows", "campaigns", campaignID) - if err := os.MkdirAll(campaignDir, 0755); err != nil { - return nil, fmt.Errorf("failed to create campaign directory: %w", err) - } - - // Write fused workflow to campaign folder - campaignWorkflowID := fmt.Sprintf("%s-worker", workflowID) - outputPath := filepath.Join(campaignDir, campaignWorkflowID+".md") - if err := os.WriteFile(outputPath, []byte(newContent), 0644); err != nil { - return nil, fmt.Errorf("failed to write fused workflow: %w", err) - } - - workflowFusionLog.Printf("Fused workflow written to %s", outputPath) - - return &FusionResult{ - OriginalWorkflowID: workflowID, - CampaignWorkflowID: campaignWorkflowID, - OutputPath: outputPath, - WorkflowDispatch: !hasWorkflowDispatch, - }, nil -} - -// checkWorkflowDispatch checks if the workflow already has workflow_dispatch trigger -func checkWorkflowDispatch(frontmatter map[string]any) bool { - onField, ok := frontmatter["on"] - if !ok { - return false - } - - // Handle string format: "on: workflow_dispatch" - if onStr, ok := onField.(string); ok { - return strings.Contains(onStr, "workflow_dispatch") - } - - // Handle map format - if onMap, ok := onField.(map[string]any); ok { - _, hasDispatch := onMap["workflow_dispatch"] - return hasDispatch - } - - return false -} - -// addWorkflowDispatch adds workflow_dispatch trigger to the frontmatter -func addWorkflowDispatch(frontmatter map[string]any) map[string]any { - onField, ok := 
frontmatter["on"] - if !ok { - // No trigger defined, add workflow_dispatch - frontmatter["on"] = "workflow_dispatch" - return frontmatter - } - - // Handle string format - if onStr, ok := onField.(string); ok { - // Parse existing triggers - triggers := strings.Fields(onStr) - triggers = append(triggers, "workflow_dispatch") - frontmatter["on"] = strings.Join(triggers, "\n ") - return frontmatter - } - - // Handle map format - if onMap, ok := onField.(map[string]any); ok { - onMap["workflow_dispatch"] = nil // Add workflow_dispatch - frontmatter["on"] = onMap - return frontmatter - } - - // Fallback: replace with workflow_dispatch - frontmatter["on"] = "workflow_dispatch" - return frontmatter -} - -// FuseMultipleWorkflows fuses multiple workflows for a campaign -func FuseMultipleWorkflows(rootDir string, workflowIDs []string, campaignID string) ([]FusionResult, error) { - workflowFusionLog.Printf("Fusing %d workflows for campaign %s", len(workflowIDs), campaignID) - - var results []FusionResult - for _, workflowID := range workflowIDs { - result, err := FuseWorkflowForCampaign(rootDir, workflowID, campaignID) - if err != nil { - workflowFusionLog.Printf("Failed to fuse workflow %s: %v", workflowID, err) - continue - } - results = append(results, *result) - } - - workflowFusionLog.Printf("Successfully fused %d workflows", len(results)) - return results, nil -} diff --git a/pkg/campaign/workflow_fusion_test.go b/pkg/campaign/workflow_fusion_test.go deleted file mode 100644 index 03aa3181c8..0000000000 --- a/pkg/campaign/workflow_fusion_test.go +++ /dev/null @@ -1,305 +0,0 @@ -package campaign - -import ( - "os" - "path/filepath" - "strings" - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -func TestFuseWorkflowForCampaign(t *testing.T) { - tests := []struct { - name string - workflowContent string - campaignID string - expectWorkflowDispatch bool - expectCampaignMetadata bool - expectError bool - }{ - { - name: "add 
workflow_dispatch to workflow without it", - workflowContent: `--- -name: Security Scanner -description: Scan for vulnerabilities -on: issues ---- -# Security Scanner -Scan repositories`, - campaignID: "security-q1-2025", - expectWorkflowDispatch: true, - expectCampaignMetadata: true, - }, - { - name: "preserve existing workflow_dispatch", - workflowContent: `--- -name: Dependency Updater -on: - workflow_dispatch: - schedule: - - cron: "0 0 * * *" ---- -# Updater`, - campaignID: "deps-update", - expectWorkflowDispatch: true, - expectCampaignMetadata: true, - }, - { - name: "handle string format trigger", - workflowContent: `--- -name: Test Workflow -on: workflow_dispatch ---- -# Test`, - campaignID: "test-campaign", - expectWorkflowDispatch: true, - expectCampaignMetadata: true, - }, - } - - for _, tt := range tests { - t.Run(tt.name, func(t *testing.T) { - // Create temporary directory - tmpDir := t.TempDir() - workflowsDir := filepath.Join(tmpDir, ".github", "workflows") - require.NoError(t, os.MkdirAll(workflowsDir, 0755)) - - // Create original workflow - workflowID := "test-workflow" - originalPath := filepath.Join(workflowsDir, workflowID+".md") - require.NoError(t, os.WriteFile(originalPath, []byte(tt.workflowContent), 0644)) - - // Fuse workflow - result, err := FuseWorkflowForCampaign(tmpDir, workflowID, tt.campaignID) - - if tt.expectError { - assert.Error(t, err) - return - } - - require.NoError(t, err) - require.NotNil(t, result) - - // Verify result - assert.Equal(t, workflowID, result.OriginalWorkflowID) - assert.Equal(t, workflowID+"-worker", result.CampaignWorkflowID) - - // Verify file was created - assert.FileExists(t, result.OutputPath) - - // Read fused workflow - fusedContent, err := os.ReadFile(result.OutputPath) - require.NoError(t, err) - - fusedStr := string(fusedContent) - - // Verify workflow_dispatch exists - if tt.expectWorkflowDispatch { - assert.Contains(t, fusedStr, "workflow_dispatch", "Expected workflow_dispatch in fused workflow") 
- } - - // Verify campaign metadata - if tt.expectCampaignMetadata { - assert.Contains(t, fusedStr, "campaign-worker: true", "Expected campaign-worker metadata") - assert.Contains(t, fusedStr, "campaign-id: "+tt.campaignID, "Expected campaign-id metadata") - assert.Contains(t, fusedStr, "source-workflow: "+workflowID, "Expected source-workflow metadata") - } - }) - } -} - -func TestCheckWorkflowDispatch(t *testing.T) { - tests := []struct { - name string - frontmatter map[string]any - expected bool - }{ - { - name: "has workflow_dispatch in map format", - frontmatter: map[string]any{ - "on": map[string]any{ - "workflow_dispatch": nil, - }, - }, - expected: true, - }, - { - name: "has workflow_dispatch in string format", - frontmatter: map[string]any{ - "on": "workflow_dispatch", - }, - expected: true, - }, - { - name: "no workflow_dispatch", - frontmatter: map[string]any{ - "on": map[string]any{ - "issues": nil, - }, - }, - expected: false, - }, - { - name: "no on field", - frontmatter: map[string]any{}, - expected: false, - }, - } - - for _, tt := range tests { - t.Run(tt.name, func(t *testing.T) { - result := checkWorkflowDispatch(tt.frontmatter) - assert.Equal(t, tt.expected, result) - }) - } -} - -func TestAddWorkflowDispatch(t *testing.T) { - tests := []struct { - name string - frontmatter map[string]any - verify func(t *testing.T, result map[string]any) - }{ - { - name: "add to empty frontmatter", - frontmatter: map[string]any{}, - verify: func(t *testing.T, result map[string]any) { - assert.Equal(t, "workflow_dispatch", result["on"]) - }, - }, - { - name: "add to existing map format", - frontmatter: map[string]any{ - "on": map[string]any{ - "issues": nil, - }, - }, - verify: func(t *testing.T, result map[string]any) { - onMap, ok := result["on"].(map[string]any) - require.True(t, ok) - _, hasDispatch := onMap["workflow_dispatch"] - assert.True(t, hasDispatch) - }, - }, - { - name: "add to existing string format", - frontmatter: map[string]any{ - "on": 
"issues", - }, - verify: func(t *testing.T, result map[string]any) { - onStr, ok := result["on"].(string) - require.True(t, ok) - assert.Contains(t, onStr, "workflow_dispatch") - }, - }, - } - - for _, tt := range tests { - t.Run(tt.name, func(t *testing.T) { - result := addWorkflowDispatch(tt.frontmatter) - tt.verify(t, result) - }) - } -} - -func TestFuseMultipleWorkflows(t *testing.T) { - // Create temporary directory - tmpDir := t.TempDir() - workflowsDir := filepath.Join(tmpDir, ".github", "workflows") - require.NoError(t, os.MkdirAll(workflowsDir, 0755)) - - // Create multiple workflows - workflows := map[string]string{ - "workflow1": `--- -name: Workflow 1 -on: issues ---- -# W1`, - "workflow2": `--- -name: Workflow 2 -on: pull_request ---- -# W2`, - } - - for id, content := range workflows { - path := filepath.Join(workflowsDir, id+".md") - require.NoError(t, os.WriteFile(path, []byte(content), 0644)) - } - - // Fuse multiple workflows - workflowIDs := []string{"workflow1", "workflow2"} - results, err := FuseMultipleWorkflows(tmpDir, workflowIDs, "test-campaign") - require.NoError(t, err) - - // Verify results - assert.Len(t, results, 2) - - for _, result := range results { - assert.True(t, strings.HasSuffix(result.CampaignWorkflowID, "-worker")) - assert.FileExists(t, result.OutputPath) - } -} - -func TestFuseWorkflowWithoutTrackerID(t *testing.T) { - // This test verifies that campaign worker workflows do NOT require tracker-id - // tracker-id is optional and campaigns can discover workers via labels instead - - // Create temporary directory - tmpDir := t.TempDir() - workflowsDir := filepath.Join(tmpDir, ".github", "workflows") - require.NoError(t, os.MkdirAll(workflowsDir, 0755)) - - // Create workflow WITHOUT tracker-id - this should work fine - workflowContent := `--- -name: Security Scanner -description: Scan for vulnerabilities -on: issues -permissions: - contents: read -safe-outputs: - create-issue: ---- -# Security Scanner -Scan repositories for 
security issues. -` - - workflowID := "security-scanner" - originalPath := filepath.Join(workflowsDir, workflowID+".md") - require.NoError(t, os.WriteFile(originalPath, []byte(workflowContent), 0644)) - - // Fuse workflow for campaign - should succeed even without tracker-id - campaignID := "security-audit-2025" - result, err := FuseWorkflowForCampaign(tmpDir, workflowID, campaignID) - - // Should succeed - tracker-id is NOT required - require.NoError(t, err, "Campaign worker fusion should succeed without tracker-id") - require.NotNil(t, result) - - // Verify result - assert.Equal(t, workflowID, result.OriginalWorkflowID) - assert.Equal(t, workflowID+"-worker", result.CampaignWorkflowID) - - // Verify file was created - assert.FileExists(t, result.OutputPath) - - // Read fused workflow - fusedContent, err := os.ReadFile(result.OutputPath) - require.NoError(t, err) - - fusedStr := string(fusedContent) - - // Verify campaign metadata was added - assert.Contains(t, fusedStr, "campaign-worker: true", "Expected campaign-worker metadata") - assert.Contains(t, fusedStr, "campaign-id: "+campaignID, "Expected campaign-id metadata") - assert.Contains(t, fusedStr, "source-workflow: "+workflowID, "Expected source-workflow metadata") - - // Verify workflow_dispatch was added - assert.Contains(t, fusedStr, "workflow_dispatch", "Expected workflow_dispatch trigger") - - // Verify that tracker-id is NOT present (it wasn't in the original) - // This is fine - campaigns can discover workers via labels - assert.NotContains(t, fusedStr, "tracker-id:", "tracker-id should not be added if not present in original") -} diff --git a/pkg/cli/templates/execute-agentic-campaign-workflow.md b/pkg/cli/templates/execute-agentic-campaign-workflow.md index 4a4247ce44..1af0f6df4e 100644 --- a/pkg/cli/templates/execute-agentic-campaign-workflow.md +++ b/pkg/cli/templates/execute-agentic-campaign-workflow.md @@ -1,200 +1,284 @@ # Workflow Execution -This campaign orchestrator can execute workflows as 
needed. Your role is to run the workflows listed in sequence, collect their outputs, and use those outputs to drive the campaign forward. +This campaign references the following campaign workers. These workers follow the first-class worker pattern: they are dispatch-only workflows with standardized input contracts. -**IMPORTANT: Workflow execution is an advanced capability. Exercise caution and follow all guidelines carefully.** +**IMPORTANT: Workers are orchestrated, not autonomous. They accept `campaign_id` and `payload` inputs via workflow_dispatch.** --- -## Workflows to Execute +## Campaign Workers {{ if .Workflows }} -The following workflows should be executed in order: +The following campaign workers are referenced by this campaign: {{ range $idx, $workflow := .Workflows }} {{ add1 $idx }}. `{{ $workflow }}` {{ end }} {{ end }} +**Worker Pattern**: All workers MUST: +- Use `workflow_dispatch` as the ONLY trigger (no schedule/push/pull_request) +- Accept `campaign_id` (string) and `payload` (string; JSON) inputs +- Implement idempotency via deterministic work item keys +- Label all created items with `campaign:{{ .CampaignID }}` + --- ## Workflow Creation Guardrails -### Before Creating Any Workflow, Ask: +### Before Creating Any Worker Workflow, Ask: 1. **Does this workflow already exist?** - Check `.github/workflows/` thoroughly -2. **Can an existing workflow be used?** - Even if not perfect, existing is safer +2. **Can an existing workflow be adapted?** - Even if not perfect, existing is safer 3. **Is the requirement clear?** - Can you articulate exactly what it should do? -4. **Is it testable?** - Can you verify it works before using it in the campaign? -5. **Is it reusable?** - Could other campaigns benefit from this workflow? +4. **Is it testable?** - Can you verify it works with test inputs? +5. **Is it reusable?** - Could other campaigns benefit from this worker? 
-### Only Create New Workflows When:
+### Only Create New Workers When:
 ✅ **All these conditions are met:**
 - No existing workflow does the required task
 - The campaign objective explicitly requires this capability
-- You have a clear, specific design for the workflow
-- The workflow has a focused, single-purpose scope
+- You have a clear, specific design for the worker
+- The worker has a focused, single-purpose scope
 - You can test it independently before campaign use
-❌ **Never create workflows when:**
+❌ **Never create workers when:**
 - You're unsure about requirements
 - An existing workflow "mostly" works
-- The workflow would be complex or multi-purpose
+- The worker would be complex or multi-purpose
 - You haven't verified it doesn't already exist
 - You can't clearly explain what it does in one sentence
 ---
-## Execution Process
+## Worker Creation Template
-For each workflow:
+If you must create a new worker (only after checking ALL guardrails above), use this template:
-1. **Check if workflow exists** - Look for `.github/workflows/.md`
+**Create the workflow file at `.github/workflows/.md`:**
-2.
**Create workflow if needed** - Only if ALL guardrails above are satisfied:
-
-   **Design requirements:**
-   - **Single purpose**: One clear task (e.g., "scan for outdated dependencies", not "scan and update")
-   - **Explicit trigger**: Must include `workflow_dispatch` for manual/programmatic execution
-   - **Minimal tools**: Only include tools actually needed (principle of least privilege)
-   - **Safe outputs only**: Use appropriate safe-output limits (max: 5 for first version)
-   - **Clear prompt**: Describe exactly what the workflow should do and return
-
-   **Create the workflow file at `.github/workflows/.md`:**
-   ```yaml
-   ---
-   name:
-   description:
-
-   on:
-     workflow_dispatch:  # Required for execution
-       inputs:
-         priority:
-           description: 'Priority level for this execution'
-           required: false
-           type: choice
-           options:
-             - low
-             - medium
-             - high
-           default: medium
-         target:
-           description: 'Specific target or scope for this run'
-           required: false
-           type: string
-
-   tools:
-     github:
-       toolsets: [default]  # Adjust based on needs
-     # Only add other tools if absolutely necessary
-
-   safe-outputs:
-     create-issue:
-       max: 3  # Start conservative
-     add-comment:
-       max: 2
-   ---
-
-   #
-
-   You are a focused workflow that .
-
-   Priority: \$\{\{ github.event.inputs.priority \}\}
-   Target: \$\{\{ github.event.inputs.target \}\}
-
-   ## Task
-
-
-
-   ## Output
-
-
+```yaml
+---
+name:
+description:
+
+on:
+  workflow_dispatch:
+    inputs:
+      campaign_id:
+        description: 'Campaign identifier'
+        required: true
+        type: string
+      payload:
+        description: 'JSON payload with work item details'
+        required: true
+        type: string
+
+tracker-id:
+
+tools:
+  github:
+    toolsets: [default]
+  # Add minimal additional tools as needed
+
+safe-outputs:
+  create-pull-request:
+    max: 1  # Start conservative
+  add-comment:
+    max: 2
+---
+
+#
+
+You are a campaign worker that processes work items.
+
+## Input Contract
+
+Parse inputs:
+```javascript
+const campaignId = context.payload.inputs.campaign_id;
+const payload = JSON.parse(context.payload.inputs.payload);
+```
+
+Expected payload structure:
+```json
+{
+  "repository": "owner/repo",
+  "work_item_id": "unique-id",
+  "target_ref": "main",
+  // Additional context...
+}
+```
+
+## Idempotency Requirements
+
+1. **Generate deterministic key**:
+   ```
+   const workKey = `campaign-${campaignId}-${payload.repository}-${payload.work_item_id}`;
    ```
-
-   **Note**: Define `inputs` under `workflow_dispatch` to accept parameters from the orchestrator. Use `\$\{\{ github.event.inputs.INPUT_NAME \}\}` to reference input values in your workflow markdown. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input types and examples.
-
-   - Compile it with `gh aw compile .md`
-   - **CRITICAL: Test before use** (see testing requirements below)
-3. **Test newly created workflows** (MANDATORY):
-
-   **Why test?** - Untested workflows may fail during campaign execution, blocking progress. Test first to catch issues early.
-
-   **Testing steps:**
-   - Trigger test run: `mcp__github__run_workflow(workflow_id: "", ref: "main")`
-   - Wait for completion: Poll until status is "completed"
-   - **Verify success**: Check that workflow succeeded and produced expected outputs
-   - **Review outputs**: Ensure results match expectations (check artifacts, issues created, etc.)
-   - **If test fails**: Revise the workflow, recompile, and test again
-   - **Only proceed** after successful test run
+2. **Check for existing work**:
+   - Search for PRs/issues with `workKey` in title
+   - Filter by label: `campaign:${campaignId}`
+   - If found: Skip or update
+   - If not: Create new
+
+3.
**Label all created items**:
+   - Apply `campaign:${campaignId}` label
+   - This enables discovery by orchestrator
+
+## Task
+
+
+
+## Output
+
+Report:
+- Link to created/updated PR or issue
+- Whether work was skipped (exists) or completed
+- Any errors or blockers
+```
+
+**After creating:**
+- Compile: `gh aw compile .md`
+- **CRITICAL: Test with sample inputs** (see testing requirements below)
+
+---
+
+## Worker Testing (MANDATORY)
+
+**Why test?** - Untested workers may fail during campaign execution. Test with sample inputs first to catch issues early.
+
+**Testing steps:**
+
+1. **Prepare test payload**:
+   ```json
+   {
+     "repository": "test-org/test-repo",
+     "work_item_id": "test-1",
+     "target_ref": "main"
+   }
+   ```
+
+2. **Trigger test run**:
+   ```bash
+   gh workflow run .yml \
+     -f campaign_id={{ .CampaignID }} \
+     -f payload='{"repository":"test-org/test-repo","work_item_id":"test-1"}'
+   ```
-   **Test failure actions:**
-   - DO NOT use the workflow in the campaign if testing fails
-   - Analyze the failure logs to understand what went wrong
-   - Make necessary corrections to the workflow
-   - Recompile and retest
-   - If you can't fix it after 2 attempts, report in status update and skip this workflow
+   Or via GitHub MCP:
+   ```javascript
+   mcp__github__run_workflow(
+     workflow_id: "",
+     ref: "main",
+     inputs: {
+       campaign_id: "{{ .CampaignID }}",
+       payload: JSON.stringify({repository: "test-org/test-repo", work_item_id: "test-1"})
+     }
+   )
+   ```
+
+3. **Wait for completion**: Poll until status is "completed"
-4.
**Execute the workflow** (skip if just tested successfully):
-   - Trigger: `mcp__github__run_workflow(workflow_id: "", ref: "main")`
-   - **Pass input parameters based on decisions**: If the workflow accepts inputs, provide them to guide execution (e.g., `inputs: {priority: "high", target: "security"}`)
-   - Wait for completion: Poll `mcp__github__get_workflow_run(run_id)` until status is "completed"
-   - Collect outputs: Check `mcp__github__download_workflow_run_artifact()` for any artifacts
-   - **Handle failures gracefully**: If execution fails, note it in status update but continue campaign
+4. **Verify success**:
+   - Check that workflow succeeded
+   - Verify idempotency: Run again with same inputs, should skip/update
+   - Review created items have correct labels
+   - Confirm deterministic keys are used
-5. **Use outputs for next steps** - Use information from workflow runs to:
-   - Inform subsequent workflow executions (e.g., scanner results → upgrader inputs)
-   - Pass contextual inputs to worker workflows based on campaign state and decisions
-   - Update project board items with relevant information
-   - Make decisions about campaign progress and next actions
+5. **Test failure actions**:
+   - DO NOT use the worker if testing fails
+   - Analyze failure logs
+   - Make corrections
+   - Recompile and retest
+   - If unfixable after 2 attempts, report in status and skip
 **Note**: Workflows that accept `workflow_dispatch` inputs can receive parameters from the orchestrator. This enables the orchestrator to provide context, priorities, or targets based on its decisions. See [DispatchOps documentation](https://githubnext.github.io/gh-aw/guides/dispatchops/#with-input-parameters) for input parameter examples.
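
The idempotency check from the verification step can be sketched in plain JavaScript. This is a minimal illustration and not code from this PR: `buildWorkKey`, `buildDispatchInputs`, and `findExisting` are hypothetical helpers, the campaign ID is a sample value, and the array of titles is an assumed stand-in for a real PR/issue search. It assumes the key format `campaign-${campaignId}-${repository}-${work_item_id}` given in the worker template.

```javascript
// Hypothetical helper: deterministic key in the format used by the worker template.
function buildWorkKey(campaignId, payload) {
  return `campaign-${campaignId}-${payload.repository}-${payload.work_item_id}`;
}

// Hypothetical helper: both workflow_dispatch inputs are strings,
// so the payload object is JSON-encoded before dispatch.
function buildDispatchInputs(campaignId, payload) {
  return { campaign_id: campaignId, payload: JSON.stringify(payload) };
}

// Hypothetical stand-in for a PR/issue search: returns the first
// existing title that contains the key, or null if none does.
function findExisting(existingTitles, workKey) {
  return existingTitles.find((title) => title.includes(workKey)) || null;
}

// Sample values for illustration only.
const campaignId = "security-q1-2025";
const payload = {
  repository: "test-org/test-repo",
  work_item_id: "test-1",
  target_ref: "main",
};

// Orchestrator side: encode the inputs. Worker side: decode them and derive the key.
const inputs = buildDispatchInputs(campaignId, payload);
const workKey = buildWorkKey(inputs.campaign_id, JSON.parse(inputs.payload));

// First run: nothing exists yet, so the worker would create an item.
const firstRun = findExisting([], workKey);
// Second run with the same inputs: the earlier item is found, so the worker skips.
const secondRun = findExisting([`[${workKey}] Example fix`], workKey);

console.log(workKey);
console.log(firstRun === null ? "create" : "skip");
console.log(secondRun === null ? "create" : "skip");
```

Because the key depends only on the campaign ID and payload fields, re-dispatching the same work item yields the same key, which is what lets a second run detect the first run's output.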
 ---
-## Guidelines
+## Orchestration Guidelines
+
+**Execution pattern:**
+- Workers are **orchestrated, not autonomous**
+- Orchestrator discovers work items via discovery manifest
+- Orchestrator decides which workers to run and with what inputs
+- Workers receive `campaign_id` and `payload` via workflow_dispatch
+- Sequential vs parallel execution is orchestrator's decision
+
+**Worker dispatch:**
+- Parse discovery manifest (`./.gh-aw/campaign.discovery.json`)
+- For each work item needing processing:
+  1. Determine appropriate worker for this item type
+  2. Construct payload with work item details
+  3. Dispatch worker via workflow_dispatch with campaign_id and payload
+  4. Track dispatch status
+
+**Input construction:**
+```javascript
+// Example: Dispatching security-fix worker
+const workItem = discoveryManifest.items[0];
+const payload = {
+  repository: workItem.repo,
+  work_item_id: `alert-${workItem.number}`,
+  target_ref: "main",
+  alert_type: "sql-injection",
+  file_path: "src/db.go",
+  line_number: 42
+};
-**Execution order:**
-- Execute workflows **sequentially** (one at a time)
-- Wait for each workflow to complete before starting the next
-- **Why sequential?** - Ensures dependencies between workflows are respected and reduces API load
+await github.actions.createWorkflowDispatch({
+  owner: context.repo.owner,
+  repo: context.repo.repo,
+  workflow_id: "security-fix-worker.yml",
+  ref: "main",
+  inputs: {
+    campaign_id: "{{ .CampaignID }}",
+    payload: JSON.stringify(payload)
+  }
+});
+```
-**Workflow creation:**
-- **Always test newly created workflows** before using them in the campaign
-- **Why test first?** - Prevents campaign disruption from broken workflows
-- Start with minimal, focused workflows (easier to test and debug)
-- **Why minimal?** - Reduces complexity and points of failure
-- Keep designs simple and aligned with campaign objective
-- **Why simple?** - Easier to understand, test, and maintain
+**Idempotency by design:**
+- Workers
implement their own idempotency checks
+- Orchestrator doesn't need to track what's been processed
+- Can safely re-dispatch work items across runs
+- Workers will skip or update existing items
 **Failure handling:**
-- If a workflow test fails, revise and retest before proceeding
-- **Why retry?** - Initial failures often due to minor issues easily fixed
-- If a workflow fails during campaign execution, note the failure and continue
-- **Why continue?** - One workflow failure shouldn't block entire campaign progress
-- Report all failures in the status update with context
-- **Why report?** - Transparency helps humans intervene if needed
-
-**Workflow reusability:**
-- Workflows you create should be reusable for future campaign runs
-- **Why reusable?** - Reduces need to create workflows repeatedly, builds library of capabilities
-- Avoid campaign-specific logic in workflows (keep them generic)
-- **Why generic?** - Enables reuse across different campaigns
-
-**Permissions and safety:**
-- Keep workflow permissions minimal (only what's needed)
-- **Why minimal?** - Reduces risk and follows principle of least privilege
-- Prefer draft PRs over direct merges for code changes
-- **Why drafts?** - Requires human review before merging changes
-- Escalate to humans when uncertain about decisions
-- **Why escalate?** - Human oversight prevents risky autonomous actions
+- If a worker dispatch fails, note it but continue
+- Worker failures don't block entire campaign
+- Report all failures in status update with context
+- Humans can intervene if needed
 ---
-## After Workflow Execution
+## After Worker Orchestration
+
+Once workers have been dispatched (or new workers created and tested), proceed with normal orchestrator steps:
+
+1. **Discovery** - Read state from discovery manifest and project board
+2. **Planning** - Determine what needs updating on project board
+3. **Project Updates** - Write state changes to project board
+4.
**Status Reporting** - Report progress, worker dispatches, failures, next steps
+
+---
+
+## Key Differences from Fusion Approach
+
+**Old fusion approach (REMOVED)**:
+- Workers had mixed triggers (schedule + workflow_dispatch)
+- Fusion dynamically added workflow_dispatch to existing workflows
+- Workers stored in campaign-specific folders
+- Ambiguous ownership and trigger precedence
+
+**New first-class worker approach**:
+- Workers are dispatch-only (on: workflow_dispatch)
+- Standardized input contract (campaign_id, payload)
+- Explicit idempotency via deterministic keys
+- Clear ownership: workers are orchestrated, not autonomous
+- Workers stored with regular workflows (not campaign-specific folders)
+- Orchestration policy kept explicit in orchestrator
-Once all workflows have been executed (or created and executed), proceed with the normal orchestrator steps:
-- Step 1: Discovery (read state from manifest and project board)
-- Step 2: Planning (determine what needs updating)
-- Step 3: Project Updates (write state to project board)
-- Step 4: Status Reporting (report progress, failures, and next steps)
+This eliminates duplicate execution problems and makes orchestration concerns explicit.
diff --git a/specs/campaigns-files.md b/specs/campaigns-files.md
index 7154d2245d..5b80f6e905 100644
--- a/specs/campaigns-files.md
+++ b/specs/campaigns-files.md
@@ -476,6 +476,259 @@ governance:
 This ensures that campaign items remain under the control of their respective campaign orchestrators and aren't interfered with by other automated workflows.
+## Campaign Workers
+
+Campaign workers are specialized workflows designed to be orchestrated by campaign orchestrators. They follow a first-class worker pattern with explicit contracts and idempotency.
+
+### Worker Design Principles
+
+1.
**Dispatch-only triggers**: Workers use `workflow_dispatch` as the primary/only trigger
+   - No schedule, push, or pull_request triggers
+   - Clear ownership: workers are orchestrated, not autonomous
+   - Prevents duplicate execution from multiple trigger sources
+
+2. **Standardized input contract**: All workers accept:
+   - `campaign_id` (string): The campaign identifier orchestrating this worker
+   - `payload` (string): JSON-encoded data specific to the work item
+
+3. **Idempotency**: Workers implement deterministic behavior:
+   - Compute deterministic work item keys (e.g., `campaign-{id}-{repo}-{alert-id}`)
+   - Use keys in branch names, PR titles, issue titles
+   - Check for existing PR/issue with key + tracker label before creating
+   - Skip or update existing items rather than creating duplicates
+
+4. **Orchestration agnostic**: Workers don't know about orchestration policy
+   - Sequential vs parallel execution is orchestrator's concern
+   - Workers are simple, focused, deterministic units
+
+### Worker Workflow Template
+
+```yaml
+---
+name: Campaign Worker Example
+description: Example worker workflow for campaign orchestration
+
+on:
+  workflow_dispatch:
+    inputs:
+      campaign_id:
+        description: 'Campaign identifier'
+        required: true
+        type: string
+      payload:
+        description: 'JSON payload with work item details'
+        required: true
+        type: string
+
+tracker-id: campaign-worker-example
+
+tools:
+  github:
+    toolsets: [default]
+
+safe-outputs:
+  create-pull-request:
+    max: 1
+  add-comment:
+    max: 2
+---
+
+# Campaign Worker Example
+
+You are a campaign worker that processes work items from a campaign orchestrator.
+
+## Input Contract
+
+The `payload` input contains JSON with the following structure:
+```json
+{
+  "repository": "owner/repo",
+  "work_item_id": "unique-identifier",
+  "target_ref": "main",
+  "additional_context": {}
+}
+```
+
+Parse the payload and extract the work item details.
+
+## Idempotency Requirements
+
+Before creating any GitHub resources:
+
+1.
**Generate deterministic key**:
+   - Format: `campaign-${campaign_id}-${repository}-${work_item_id}`
+   - Use this key in branch names, PR titles, issue titles
+
+2. **Check for existing work**:
+   - Search for PRs/issues with the deterministic key in the title
+   - Filter by tracker label: `campaign:${campaign_id}`
+   - If found: Skip creation or update existing item
+   - If not found: Proceed with creation
+
+3. **Label all created items**:
+   - Apply tracker label: `campaign:${campaign_id}`
+   - This enables discovery by the orchestrator
+   - Prevents interference from other workflows
+
+## Work to Perform
+
+[Specific task description for this worker]
+
+## Expected Output
+
+Report completion status including:
+- Whether work was skipped (already exists) or completed
+- Links to created/updated PRs or issues
+- Any errors or blockers encountered
+```
+
+### Idempotency Implementation Patterns
+
+#### Pattern 1: Deterministic Branch Names
+
+```yaml
+# In worker prompt
+Generate a deterministic branch name:
+- Format: `campaign-${campaign_id}-${repository.replace('/', '-')}-${work_item_id}`
+- Example: `campaign-security-q1-2025-myorg-myrepo-alert-123`
+
+Before creating a new branch:
+1. Check if the branch already exists
+2. If exists: checkout and update
+3. If not: create new branch
+```
+
+#### Pattern 2: PR Title Prefixing
+
+```yaml
+# In worker prompt
+Use a deterministic PR title prefix:
+- Format: `[campaign:${campaign_id}] ${work_item_description}`
+- Example: `[campaign:security-q1-2025] Fix SQL injection in user.go`
+
+Before creating a PR:
+1. Search for open PRs with this title prefix in the target repo
+2. If found: Add a comment with updates or close as duplicate
+3.
If not: Create new PR with title
+```
+
+#### Pattern 3: Issue Title Keying
+
+```yaml
+# In worker prompt
+Use a deterministic issue title with key:
+- Format: `[${work_item_id}] ${description}`
+- Example: `[alert-123] High severity: Path traversal vulnerability`
+
+Before creating an issue:
+1. Search for issues with `[${work_item_id}]` in title
+2. Filter by label: `campaign:${campaign_id}`
+3. If found: Update existing issue with new information
+4. If not: Create new issue
+```
+
+#### Pattern 4: Cursor-based Work Tracking
+
+```yaml
+# In worker prompt
+Track processed work items in repo-memory:
+- File: `memory/campaigns/${campaign_id}/processed-items.json`
+- Structure: `{"processed": ["item-1", "item-2", ...]}`
+
+Before processing a work item:
+1. Load the processed items list from repo-memory
+2. Check if current work_item_id is in the list
+3. If found: Skip processing
+4. If not: Process and add to list
+5. Save updated list back to repo-memory
+```
+
+### Worker Discovery
+
+Campaign orchestrators discover worker-created items via:
+
+1. **Tracker Label**: Items labeled with `campaign:${campaign_id}`
+2. **Tracker ID**: Items with `tracker-id: worker-name` in their description
+3.
**Discovery Script**: `campaign_discovery.cjs` searches for both
+
+Workers should:
+- Apply the campaign tracker label to all created items
+- Include the worker's tracker-id in issue/PR descriptions (optional)
+- This enables orchestrators to find and track worker output
+
+### Example: Security Fix Worker
+
+```yaml
+---
+name: Security Fix Worker
+description: Creates PRs with security fixes for code scanning alerts
+
+on:
+  workflow_dispatch:
+    inputs:
+      campaign_id:
+        description: 'Campaign identifier'
+        required: true
+        type: string
+      payload:
+        description: 'JSON with alert details'
+        required: true
+        type: string
+
+tracker-id: security-fix-worker
+
+tools:
+  github:
+    toolsets: [default, code_security]
+  bash: ["*"]
+  edit: true
+
+safe-outputs:
+  create-pull-request:
+    max: 1
+---
+
+# Security Fix Worker
+
+Process a code scanning alert and create a fix PR.
+
+## Idempotency Implementation
+
+```javascript
+const payload = JSON.parse(process.env.PAYLOAD);
+const campaignId = process.env.CAMPAIGN_ID;
+const alertId = payload.alert_id;
+const repository = payload.repository;
+
+// Deterministic key
+const workKey = `campaign-${campaignId}-alert-${alertId}`;
+const branchName = `fix/${workKey}`;
+const prTitle = `[${workKey}] Fix: ${payload.alert_title}`;
+
+// Check for existing PR
+const existingPRs = await searchPullRequests({
+  query: `repo:${repository} is:pr is:open "${workKey}" in:title`
+});
+
+if (existingPRs.length > 0) {
+  console.log(`PR already exists: ${existingPRs[0].url}`);
+  // Optionally update with new information
+  return;
+}
+
+// Proceed with fix and PR creation...
+```
+
+## Expected Behavior
+
+1. Parse payload to get alert details
+2. Check for existing PR with deterministic key
+3. If exists: Skip or update
+4. If not: Generate fix and create PR
+5. Apply labels: `campaign:${campaign_id}`, `security`, `automated`
+6. Report completion status
+```
+
 ## For Third-Party Users
 ### Using gh-aw Compiler Outside This Repository