Add deduplication logic to smoke tests - 15% of issues are duplicates #10789

@github-actions

Description

Problem Statement

Smoke test workflows create a new issue on every failure without checking whether one already exists. As a result, about 15% of all issues are duplicates (9 duplicate patterns identified in the last 100 issues).

Evidence

Duplicate Patterns Identified

Top 5 duplicate patterns:

  1. "Smoke Test: Claude - XXXXXX": 15 instances
  2. "Smoke Test: Copilot - XXXXXX": 13 instances
  3. "[agentics] Smoke Copilot failed": 4 instances
  4. "[agentics] agentic workflows out of sync": 3 instances
  5. "Smoke Claude - Issue Group": 3 instances

Impact

  • Noise in issue tracker - harder to find signal in 248 open issues
  • Maintenance overhead - need to manually close duplicates
  • Reduced credibility - creates perception of low quality
  • 15% duplicate rate - approximately 37 duplicate issues out of 248

Root Cause

Smoke tests do not check for existing open issues:

  • Creating new issue for each test failure
  • No deduplication logic
  • Not closing resolved issues before creating new ones

Workflow behavior:

  1. Test fails
  2. Workflow creates issue immediately
  3. No check for existing open issue with same title pattern
  4. Result: Multiple issues for same problem

Proposed Solution

Add Deduplication Logic to Smoke Tests

Implementation (2-4 hours):

  1. Before creating new issue, check for existing open issues:

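    // Sketch for actions/github-script, which provides `github` and `context`.
    // `testName` is assumed to be set by the workflow before this step runs.
    const { owner, repo } = context.repo;

    // Paginate so that more than 100 open issues are still covered.
    const existingIssues = await github.paginate(github.rest.issues.listForRepo, {
      owner, repo,
      state: 'open',
      labels: 'smoke-test', // the API expects a comma-separated string of label names
      per_page: 100
    });

    const duplicatePattern = /^Smoke Test: (Copilot|Claude) -/;
    const duplicate = existingIssues.find(issue =>
      !issue.pull_request && // this endpoint also returns pull requests
      duplicatePattern.test(issue.title) &&
      issue.title.includes(testName)
    );

    if (duplicate) {
      // Update existing issue instead of creating new one
      await github.rest.issues.createComment({
        owner, repo,
        issue_number: duplicate.number,
        body: `Test still failing as of ${new Date().toISOString()}\n\n[Latest run](...)`
      });
      return; // Don't create new issue
    }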
  2. Close resolved issues before creating new ones:

    • If test passes, close any open issues for that test
    • Add comment indicating test now passes
  3. Use consistent title patterns:

    • Smoke Test: [Engine] - [Test Name]
    • Makes duplicate detection easier
    • Improves searchability
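
Steps 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the workflows' actual code: the helper names (buildTitle, issuesToClose, closeResolved) are hypothetical, and `github` is assumed to be the Octokit client that actions/github-script passes in.

```javascript
// Hypothetical helpers; the engine names follow the title pattern proposed above.
const ENGINES = ['Copilot', 'Claude'];

// Build the canonical title: "Smoke Test: [Engine] - [Test Name]".
function buildTitle(engine, testName) {
  if (!ENGINES.includes(engine)) {
    throw new Error(`Unknown engine: ${engine}`);
  }
  return `Smoke Test: ${engine} - ${testName.trim()}`;
}

// Return the numbers of open issues that track exactly this engine/test pair.
// Pull requests are skipped because the list-issues endpoint returns them too.
function issuesToClose(openIssues, engine, testName) {
  const title = buildTitle(engine, testName);
  return openIssues
    .filter((issue) => !issue.pull_request && issue.title === title)
    .map((issue) => issue.number);
}

// After a passing run: comment on each matching issue, then close it.
async function closeResolved(github, { owner, repo }, openIssues, engine, testName) {
  for (const issue_number of issuesToClose(openIssues, engine, testName)) {
    await github.rest.issues.createComment({
      owner, repo, issue_number,
      body: `Test now passes as of ${new Date().toISOString()}. Closing.`,
    });
    await github.rest.issues.update({
      owner, repo, issue_number,
      state: 'closed',
      state_reason: 'completed',
    });
  }
}
```

Matching on the exact canonical title only works once every workflow builds its titles through the same function, which is the main reason to settle the title pattern first.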

Alternative: Issue Groups

Instead of individual issues per run, create issue groups that get updated:

  • One issue per failing test that stays open
  • Add comments for each failure occurrence
  • Close when test passes consistently
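
The issue-group lifecycle can be expressed as a small decision function. The sketch below is illustrative only: decideAction and its parameters are hypothetical names, and the default of three consecutive passes is an assumed reading of "passes consistently", which the proposal does not quantify.

```javascript
// Decide what to do with the single tracking issue for one test.
// recentResults: array of 'pass' | 'fail', oldest first, latest last.
// requiredPasses: assumed threshold for "passes consistently".
function decideAction({ hasOpenIssue, recentResults, requiredPasses = 3 }) {
  const latest = recentResults[recentResults.length - 1];

  if (latest === 'fail') {
    // Failing: reuse the open issue if there is one, otherwise open it.
    return hasOpenIssue ? 'comment' : 'create';
  }

  if (!hasOpenIssue) {
    return 'none'; // nothing to close
  }

  // Close only after the last `requiredPasses` runs all passed.
  const tail = recentResults.slice(-requiredPasses);
  const stable = tail.length === requiredPasses && tail.every((r) => r === 'pass');
  return stable ? 'close' : 'none';
}
```

Requiring several passes before closing avoids reopening churn for flaky tests, at the cost of keeping an issue open slightly longer after a real fix.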

Expected Outcomes

Metrics:

  • Duplicate rate: From 15% to <5%
  • Issue clarity: Easier to identify unique problems
  • Maintenance: Less time closing duplicates

User experience:

  • Cleaner issue tracker
  • Easier to find real problems
  • Better signal-to-noise ratio

Implementation Plan

Phase 1: Add Deduplication (2-4 hours)

  1. Update smoke-copilot.md workflow:

    • Add GitHub API query for existing issues
    • Implement duplicate detection logic
    • Add comment-instead-of-create behavior
  2. Update smoke-claude.md workflow:

    • Same changes as Copilot
    • Ensure consistent behavior
  3. Test changes:

    • Trigger smoke tests manually
    • Verify no duplicates created
    • Verify existing issues get updated

Phase 2: Clean Up Existing Duplicates (1 hour)

  1. Identify and close duplicates:
    • Query for duplicate patterns
    • Keep most recent issue open
    • Close older duplicates with "duplicate of #XXX" comment
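
The cleanup could start from a dry-run plan before anything is closed. In this sketch the regular expressions are assumptions derived from the five title patterns listed under Evidence, and planCleanup and duplicateComment are hypothetical names. It also assumes every open issue matching one pattern is a duplicate of the others, which should be checked by hand before applying the plan.

```javascript
// Assumed regexes for the "Top 5 duplicate patterns" listed in this issue.
const DUPLICATE_PATTERNS = [
  /^Smoke Test: Claude - /,
  /^Smoke Test: Copilot - /,
  /^\[agentics\] Smoke Copilot failed/,
  /^\[agentics\] agentic workflows out of sync/,
  /^Smoke Claude - Issue Group/,
];

// For each pattern with two or more open issues, keep the most recent
// and list the older ones for closing. Pull requests are ignored.
function planCleanup(openIssues, patterns = DUPLICATE_PATTERNS) {
  const plan = [];
  for (const pattern of patterns) {
    const matches = openIssues
      .filter((issue) => !issue.pull_request && pattern.test(issue.title))
      .sort((a, b) => new Date(b.created_at) - new Date(a.created_at)); // newest first
    if (matches.length < 2) continue;
    const [keep, ...older] = matches;
    plan.push({ keep: keep.number, close: older.map((issue) => issue.number) });
  }
  return plan;
}

// Comment body to post on each older duplicate before closing it.
function duplicateComment(keepNumber) {
  return `Duplicate of #${keepNumber}`;
}
```

Printing the plan first and reviewing it makes the one-off cleanup reversible in practice: nothing is closed until someone has confirmed the groupings.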

Phase 3: Monitor (Ongoing)

  1. Track duplicate rate:
    • Agent Performance Analyzer monitors rate
    • Alert if rate exceeds 5%
    • Iterate on detection logic

Testing

Manual testing:

  1. Trigger smoke test that fails
  2. Verify no existing open issue → creates new issue ✅
  3. Trigger same failing test again
  4. Verify existing open issue → adds comment ✅
  5. Fix test, trigger passing test
  6. Verify existing issue → gets closed ✅

Automated verification:

  • Agent Performance Analyzer tracks duplicate rate
  • Alert if rate exceeds 5% for 2 consecutive weeks
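
The alert rule can be stated precisely in a few lines. This is a sketch under assumptions: duplicateRate and shouldAlert are hypothetical names, and how the Agent Performance Analyzer actually counts duplicates is not specified in this issue.

```javascript
// Duplicate rate as a percentage of all issues in the sample.
function duplicateRate(duplicateCount, totalCount) {
  if (totalCount === 0) return 0;
  return (duplicateCount / totalCount) * 100;
}

// Alert when each of the last `weeks` weekly rates exceeds the threshold.
// weeklyRates: percentages, oldest first.
function shouldAlert(weeklyRates, thresholdPct = 5, weeks = 2) {
  if (weeklyRates.length < weeks) return false;
  return weeklyRates.slice(-weeks).every((rate) => rate > thresholdPct);
}
```

With the figures quoted in this issue, duplicateRate(37, 248) is just under 15, which matches the reported 15% once rounded.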

Success Metrics

  • Duplicate rate: Target <5% (from 15%)
  • Issue count: Expect 15-20% reduction in total open issues
  • Signal quality: Easier to identify unique problems
  • Maintenance time: Less time closing duplicates

Timeline

  • Phase 1 (Implementation): 2-4 hours
  • Phase 2 (Cleanup): 1 hour
  • Phase 3 (Monitoring): Ongoing
  • Total: 3-5 hours initial, then automated

Priority

P1 (High) because:

  • Affects 15% of issues (significant noise)
  • Reduces issue tracker credibility
  • Easy to fix (2-4 hours)
  • High impact on user experience
  • Not blocking, so it ranks below the PR merge crisis

Related Workflows

  • .github/workflows/smoke-copilot.md
  • .github/workflows/smoke-claude.md
  • Any workflow creating issues for test failures

Recommended Owner

  • Smoke test workflow maintainers
  • Testing team

AI generated by Agent Performance Analyzer - Meta-Orchestrator
