Skip to content

Add semantic function refactoring workflow for Go code analysis#2109

Merged
pelikhan merged 4 commits intomainfrom
copilot/refactor-go-functions-clustering
Oct 22, 2025
Merged

Add semantic function refactoring workflow for Go code analysis#2109
pelikhan merged 4 commits intomainfrom
copilot/refactor-go-functions-clustering

Conversation

Copy link
Contributor

Copilot AI commented Oct 22, 2025

This PR adds a new agentic workflow that performs semantic analysis of Go source files to identify refactoring opportunities through function clustering and duplicate detection.

Overview

The workflow analyzes all non-test Go files in the pkg/ directory to:

  • Collect and catalog function names and signatures per file
  • Cluster functions semantically by naming patterns and purpose
  • Identify outliers (functions that don't match their file's primary purpose)
  • Detect duplicate or near-duplicate implementations using semantic code analysis
  • Generate detailed refactoring recommendations with priority levels

Implementation Details

Workflow Configuration:

  • Engine: Claude AI for semantic reasoning capabilities
  • MCP Server: Serena for advanced code analysis (imported from shared/mcp/serena.md)
  • Tools: GitHub API (file reading, code search), edit tool, and specific bash commands
  • Mode: Strict mode with explicitly allowed bash commands
  • Trigger: Daily scheduled (8 AM UTC) + manual dispatch
  • Output: Creates GitHub issue with [refactor] prefix containing detailed findings

Analysis Steps:

  1. Activate Serena project for semantic analysis
  2. Discover all .go files (excluding *_test.go)
  3. Use Serena's get_symbols_overview to catalog functions
  4. Cluster functions by naming patterns (prefixes, suffixes, data types)
  5. Identify outliers based on "1 file per feature" organizational principle
  6. Detect duplicates using Serena's semantic pattern matching
  7. Generate comprehensive issue with prioritized recommendations

Serena Tools Utilized:

  • activate_project - Initialize analysis environment
  • get_symbols_overview - Extract function signatures
  • find_symbol - Locate similarly named functions
  • search_for_pattern - Find duplicate code patterns
  • find_referencing_symbols - Understand usage patterns
  • read_file - Examine implementations

Expected Output

The workflow creates a GitHub issue containing:

  • Executive Summary: Overview of total files analyzed and findings count
  • Function Inventory: Organized by package with clustering results
  • Identified Issues:
    • Outlier functions (validation in compiler files, parsers in network files, etc.)
    • Duplicates (similar implementations with >80% code similarity)
    • Scattered helpers (common utilities spread across multiple files)
    • Generic opportunities (type-specific functions that could use generics)
  • Refactoring Recommendations: Prioritized list with effort estimates
    • Priority 1: Move outlier functions to appropriate files
    • Priority 2: Consolidate duplicate functions
    • Priority 3: Centralize scattered helpers

Benefits

This workflow helps maintain code quality by:

  • Ensuring consistent file organization following the "1 file per feature" pattern
  • Reducing code duplication through automated detection
  • Identifying refactoring opportunities before they become technical debt
  • Providing actionable recommendations with specific file paths and examples

Testing

  • ✅ Workflow compiles successfully (183.2 KB lock file generated)
  • ✅ All unit tests pass
  • ✅ Strict mode validation passed (no wildcard bash commands)
  • ✅ No compilation errors or warnings

The workflow is ready to be executed and will provide valuable insights into code organization and refactoring opportunities in the Go codebase.

Original prompt

Generate an agentic workflow that clusters functions semantically by name and tries to refactor similar functionalities.

  • imports Serena.md

  • enables github tool

  • uses Claude

  • has edit tool

  • bash *

  • only works on.go file.

The agent starts by collecting all the function names per file, then it tries to cluster them.

The functions should have been organized as 1 file per feature (file is named after the gesture). Look for outliers of that rule.

After that, use Serena and reasoning to detect potential duplicates and suggest fixes.


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
on:
workflow_dispatch:
schedule:
- cron: "0 9 * * 1" # Weekly on Monday at 9 AM UTC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot runs daily, find a time slot not used yet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to daily at 8 AM UTC (unused time slot). Commit: 561b53e

@pelikhan pelikhan marked this pull request as ready for review October 22, 2025 09:12
@github-actions
Copy link
Contributor

Agentic Changeset Generator triggered by this pull request.

Copilot AI changed the title [WIP] Cluster and refactor Go functions by semantic similarity Add semantic function refactoring workflow for Go code analysis Oct 22, 2025
Copilot AI requested a review from pelikhan October 22, 2025 09:13
github-actions bot and others added 2 commits October 22, 2025 09:15
@pelikhan pelikhan merged commit f67ff20 into main Oct 22, 2025
4 checks passed
@pelikhan pelikhan deleted the copilot/refactor-go-functions-clustering branch October 22, 2025 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants