Add deduplication logic to smoke tests - 15% of issues are duplicates #10789

## Problem Statement

Smoke test workflows create duplicate issues: **about 15% of all issues are duplicates** (9 duplicate patterns identified in the last 100 issues).

## Evidence

### Duplicate Patterns Identified

**Top 5 duplicate patterns:**
1. **"Smoke Test: Claude - XXXXXX"**: 15 instances
2. **"Smoke Test: Copilot - XXXXXX"**: 13 instances
3. **"[agentics] Smoke Copilot failed"**: 4 instances
4. **"[agentics] agentic workflows out of sync"**: 3 instances
5. **"Smoke Claude - Issue Group"**: 3 instances

### Impact
- **Noise in issue tracker** - harder to find signal in 248 open issues
- **Maintenance overhead** - need to manually close duplicates
- **Reduced credibility** - creates perception of low quality
- **15% duplicate rate** - approximately 37 duplicate issues out of 248

## Root Cause

**Smoke test workflows do not check for existing open issues:**
- They create a new issue for every test failure
- They have no deduplication logic
- They do not close resolved issues before creating new ones

**Workflow behavior:**
1. Test fails
2. Workflow creates issue immediately
3. No check for existing open issue with same title pattern
4. Result: Multiple issues for same problem

## Proposed Solution

### Add Deduplication Logic to Smoke Tests

**Implementation (2-4 hours):**

1. **Before creating a new issue, check for existing open issues:**
   ```javascript
   // Sketch for an actions/github-script step, where `github` and `context` are provided.
   // `testName` is assumed to be defined earlier in the step.
   const { owner, repo } = context.repo;
   const existingIssues = await github.rest.issues.listForRepo({
     owner, repo,
     state: 'open',
     labels: 'smoke-test', // the REST API expects a comma-separated string
     per_page: 100
   });

   const duplicatePattern = /^Smoke Test: (Copilot|Claude) -/;
   const duplicate = existingIssues.data.find(issue =>
     !issue.pull_request && // this endpoint also returns pull requests
     duplicatePattern.test(issue.title) &&
     issue.title.includes(testName)
   );

   if (duplicate) {
     // Update the existing issue instead of creating a new one
     await github.rest.issues.createComment({
       owner, repo,
       issue_number: duplicate.number,
       body: `Test still failing as of ${new Date().toISOString()}\n\n[Latest run](...)`
     });
     return; // Don't create a new issue
   }
   ```

2. **Close resolved issues before creating new ones:**
   - If test passes, close any open issues for that test
   - Add comment indicating test now passes

3. **Use consistent title patterns:**
   - `Smoke Test: [Engine] - [Test Name]`
   - Makes duplicate detection easier
   - Improves searchability
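
Steps 2 and 3 above could be sketched as follows. This is a minimal sketch, not existing workflow code: it assumes an Octokit-style `github` client such as the one `actions/github-script` provides, and the helper names `buildIssueTitle` and `closeResolvedIssues`, the `smoke-test` label, and the comment wording are all illustrative.

```javascript
// Hypothetical helper: builds a title in the `Smoke Test: [Engine] - [Test Name]` format.
function buildIssueTitle(engine, testName) {
  return `Smoke Test: ${engine} - ${testName}`;
}

// Hypothetical helper: closes every open issue whose title matches the passing test.
// `github` is assumed to be an Octokit-style client (as in actions/github-script).
async function closeResolvedIssues(github, { owner, repo, engine, testName }) {
  const title = buildIssueTitle(engine, testName);
  const { data } = await github.rest.issues.listForRepo({
    owner, repo,
    state: 'open',
    labels: 'smoke-test', // assumed label, taken from the proposal above
    per_page: 100
  });

  const closed = [];
  for (const issue of data) {
    // The list endpoint also returns pull requests; skip them and non-matching titles.
    if (issue.pull_request || issue.title !== title) continue;
    await github.rest.issues.createComment({
      owner, repo,
      issue_number: issue.number,
      body: `Test now passes as of ${new Date().toISOString()}. Closing.`
    });
    await github.rest.issues.update({
      owner, repo,
      issue_number: issue.number,
      state: 'closed'
    });
    closed.push(issue.number);
  }
  return closed; // issue numbers that were closed
}
```

Building the title in one place means the create path and the close path cannot drift apart, which is what makes exact-title matching workable here.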

### Alternative: Issue Groups

Instead of individual issues per run, create **issue groups** that get updated:
- One issue per failing test that stays open
- Add comments for each failure occurrence
- Close when test passes consistently
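
The issue-group lifecycle can be expressed as a small decision function. The sketch below is illustrative only: `nextIssueAction` is a hypothetical name, and the `passesToClose` threshold is an assumption, since the proposal says only that the test should pass "consistently".

```javascript
// Hypothetical policy: decide what to do with the single tracking issue for one test.
// `passesToClose` (consecutive passes required before closing) is an assumed threshold.
function nextIssueAction({ testPassed, openIssue, consecutivePasses, passesToClose = 3 }) {
  if (!testPassed) {
    // Failure: reuse the tracking issue if it exists, otherwise open one.
    return openIssue
      ? { action: 'comment', issueNumber: openIssue.number }
      : { action: 'create' };
  }
  if (openIssue && consecutivePasses >= passesToClose) {
    // Enough consecutive passes: the tracking issue can be closed.
    return { action: 'close', issueNumber: openIssue.number };
  }
  // Passing, but either nothing is open or the pass streak is still too short.
  return { action: 'none' };
}
```

Keeping the decision separate from the API calls makes the policy easy to unit test without touching the issue tracker.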

## Expected Outcomes

**Metrics:**
- **Duplicate rate:** From 15% to <5%
- **Issue clarity:** Easier to identify unique problems
- **Maintenance:** Less time closing duplicates

**User experience:**
- Cleaner issue tracker
- Easier to find real problems
- Better signal-to-noise ratio

## Implementation Plan

### Phase 1: Add Deduplication (2-4 hours)

1. **Update smoke-copilot.md workflow:**
   - Add GitHub API query for existing issues
   - Implement duplicate detection logic
   - Add comment-instead-of-create behavior

2. **Update smoke-claude.md workflow:**
   - Same changes as Copilot
   - Ensure consistent behavior

3. **Test changes:**
   - Trigger smoke tests manually
   - Verify no duplicates created
   - Verify existing issues get updated

### Phase 2: Clean Up Existing Duplicates (1 hour)

4. **Identify and close duplicates:**
   - Query for duplicate patterns
   - Keep most recent issue open
   - Close older duplicates with "duplicate of #XXX" comment
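
A possible way to plan the cleanup before touching any issue is sketched below. `planDuplicateCleanup` and its `keyOf` parameter are hypothetical; the input is assumed to be the issue objects returned by the REST API, which carry `number`, `title`, and `created_at`.

```javascript
// Hypothetical helper: groups open issues by a key and keeps the newest in each group.
// `keyOf` defaults to the exact title; pass a custom function to group looser patterns.
function planDuplicateCleanup(issues, keyOf = issue => issue.title) {
  const groups = new Map();
  for (const issue of issues) {
    const key = keyOf(issue);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(issue);
  }

  const plan = [];
  for (const group of groups.values()) {
    if (group.length < 2) continue; // nothing to deduplicate
    // Newest first, using the `created_at` ISO timestamp.
    group.sort((a, b) => new Date(b.created_at) - new Date(a.created_at));
    const [keep, ...older] = group;
    for (const issue of older) {
      plan.push({ close: issue.number, comment: `Duplicate of #${keep.number}` });
    }
  }
  return plan; // list of { close, comment } entries to apply
}
```

Returning a plan instead of closing issues directly allows a dry run and a manual review before anything is closed.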

### Phase 3: Monitor (Ongoing)

5. **Track duplicate rate:**
   - Agent Performance Analyzer monitors rate
   - Alert if rate exceeds 5% for 2 consecutive weeks
   - Iterate on detection logic
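
One way the duplicate rate could be computed is sketched here. `duplicateRate` is a hypothetical helper, and its definition is an assumption: it counts every issue after the first in a group as a duplicate, which may differ from how the figures in this issue were derived.

```javascript
// Hypothetical metric: share of issues that repeat an earlier issue's key.
// The first issue in each group counts as the original; the rest count as duplicates.
function duplicateRate(issues, keyOf = issue => issue.title) {
  if (issues.length === 0) return 0;
  const seen = new Set();
  let duplicates = 0;
  for (const issue of issues) {
    const key = keyOf(issue);
    if (seen.has(key)) duplicates += 1;
    else seen.add(key);
  }
  return duplicates / issues.length; // fraction between 0 and 1
}
```

A custom `keyOf` can normalize titles (for example, by stripping a per-run suffix) so that issues differing only in that suffix are grouped together.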

## Testing

**Manual testing:**
1. Trigger smoke test that fails
2. Verify no existing open issue → creates new issue ✅
3. Trigger same failing test again
4. Verify existing open issue → adds comment ✅
5. Fix test, trigger passing test
6. Verify existing issue → gets closed ✅

**Automated verification:**
- Agent Performance Analyzer tracks duplicate rate
- Alert if rate exceeds 5% for 2 consecutive weeks
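
The alert condition could look roughly like this. `shouldAlert` is a hypothetical function, and it assumes the analyzer already produces one duplicate rate per week, ordered oldest to newest, as fractions between 0 and 1.

```javascript
// Hypothetical alert rule: fire when the most recent `weeks` weekly rates
// all exceed `threshold` (defaults: 5% for 2 consecutive weeks).
function shouldAlert(weeklyRates, threshold = 0.05, weeks = 2) {
  if (weeklyRates.length < weeks) return false; // not enough history yet
  return weeklyRates.slice(-weeks).every(rate => rate > threshold);
}
```

Requiring consecutive weeks above the threshold avoids alerting on a single noisy week.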

## Success Metrics

- **Duplicate rate:** Target <5% (from 15%)
- **Issue count:** Expect 15-20% reduction in total open issues
- **Signal quality:** Easier to identify unique problems
- **Maintenance time:** Less time closing duplicates

## Timeline

- **Phase 1 (Implementation):** 2-4 hours
- **Phase 2 (Cleanup):** 1 hour
- **Phase 3 (Monitoring):** Ongoing
- **Total:** 3-5 hours initial, then automated

## Priority

**P1 (High)** because:
- Affects 15% of issues (significant noise)
- Reduces issue tracker credibility
- Easy to fix (2-4 hours)
- High impact on user experience
- Not blocking, so it ranks below the PR merge crisis

## Related Workflows

- `.github/workflows/smoke-copilot.md`
- `.github/workflows/smoke-claude.md`
- Any workflow creating issues for test failures

## Recommended Owner

- Smoke test workflow maintainers
- Testing team




> AI generated by [Agent Performance Analyzer - Meta-Orchestrator](https://github.com/githubnext/gh-aw/actions/runs/21160076691)

