Skip to content

[refactor] Semantic Function Clustering Analysis - Refactoring Opportunities Identified #1003

@github-actions

Description

@github-actions

🔧 Semantic Function Clustering Analysis

Analysis of repository: github/gh-aw-mcpg
Date: 2026-02-16


Executive Summary

Analysis of 65 non-test Go files across 16 internal packages identified significant refactoring opportunities:

  • 3 critical duplicate patterns affecting 12+ functions across multiple files
  • File organization issues in 4 packages (logger, server, config, mcp)
  • Scattered helper functions creating maintenance burden
  • Estimated 20-30% code reduction potential through consolidation

Key Findings:

  • internal/logger/ package has the most critical duplicates (3 separate implementations of log-level wrappers)
  • internal/config/ has validation functions scattered across 3 files with overlapping logic
  • internal/server/ has middleware and handler logic fragmented
  • Variable expansion logic duplicated with 90% code similarity

Detailed Findings

🔴 CRITICAL: Logger Package Duplicates

1. Log Level Wrapper Functions (98% Identical Code)

Issue: Three sets of nearly identical log wrapper functions across different logger implementations.

Locations:

  • internal/logger/file_logger.go (lines 125-143): LogInfo(), LogWarn(), LogError(), LogDebug()
  • internal/logger/server_file_logger.go (lines 188-205): LogInfoWithServer(), LogWarnWithServer(), LogErrorWithServer(), LogDebugWithServer()
  • internal/logger/markdown_logger.go (lines 206-224): LogInfoMd(), LogWarnMd(), LogErrorMd(), LogDebugMd()

Code Example:

// Pattern repeated in 3 files:
func LogInfo(category, format string, args ...interface{}) {
    logWithLevel(LogLevelInfo, category, format, args...)
}
func LogWarn(category, format string, args ...interface{}) {
    logWithLevel(LogLevelWarn, category, format, args...)
}
func LogError(category, format string, args ...interface{}) {
    logWithLevel(LogLevelError, category, format, args...)
}
func LogDebug(category, format string, args ...interface{}) {
    logWithLevel(LogLevelDebug, category, format, args...)
}

Impact:

  • 12 duplicate functions (4 wrappers × 3 files)
  • Maintenance burden: changes must be replicated 3 times
  • Inconsistency risk if implementations diverge

Recommendation:
Create a shared abstraction using a logger interface:

// internal/logger/base.go
type LevelLogger interface {
    Log(level LogLevel, category, format string, args ...interface{})
}

// Then each logger (FileLogger, ServerFileLogger, MarkdownLogger) 
// implements Log() once, and shared wrappers call it
func LogInfo(logger LevelLogger, category, format string, args ...interface{}) {
    logger.Log(LogLevelInfo, category, format, args...)
}

Estimated Impact: Reduce ~36 lines to ~12 lines (66% reduction)


2. RPC Logging Fragmentation

Issue: RPC logging functionality split across 3 separate files with unclear boundaries.

Files:

  • internal/logger/rpc_logger.go (3 public functions): LogRPCRequest(), LogRPCResponse(), LogRPCMessage()
  • internal/logger/rpc_formatter.go (3 formatting functions): formatRPCMessage(), formatRPCMessageMarkdown(), formatJSONWithoutFields()
  • internal/logger/rpc_helpers.go (5 helper functions): truncateAndSanitize(), extractEssentialFields(), isEffectivelyEmpty(), etc.

Pattern Analysis:
All three files work together for RPC logging, but the separation isn't clear:

  • rpc_logger.go dispatches to formatters
  • rpc_formatter.go formats messages
  • rpc_helpers.go provides utilities used by formatter

Recommendation:
Consolidate into a single file or create a sub-package internal/logger/rpc/ with clear separation:

  • rpc.go - Public API
  • format.go - Formatting logic
  • helpers.go - Shared utilities

Estimated Impact: Improved maintainability, clearer code organization


🟡 MEDIUM: Config Package Validation Sprawl

3. Variable Expansion Logic Duplication (90% Similar)

Issue: Two functions with nearly identical logic for expanding environment variables.

Location: internal/config/validation.go

Functions:

  • expandVariables() (lines 22-48) - Expands variables in strings
  • ExpandRawJSONVariables() (lines 53-79) - Expands variables in JSON bytes

Code Comparison:

// expandVariables (string version)
func expandVariables(input string) (string, error) {
    result := varExprPattern.ReplaceAllStringFunc(input, func(match string) string {
        varName := match[2 : len(match)-1] // Extract \$\{VAR}
        val, exists := os.LookupEnv(varName)
        if !exists {
            errors = append(errors, UndefinedVariable(varName))
            return match
        }
        return val
    })
    // ... error handling
}

// ExpandRawJSONVariables (byte version) - NEARLY IDENTICAL LOGIC
func ExpandRawJSONVariables(data []byte) ([]byte, error) {
    result := varExprPattern.ReplaceAllFunc(data, func(match []byte) []byte {
        varName := string(match[2 : len(match)-1]) // Extract \$\{VAR}
        val, exists := os.LookupEnv(varName)
        if !exists {
            errors = append(errors, UndefinedVariable(varName))
            return match
        }
        return []byte(val)
    })
    // ... identical error handling
}

Similarity: 90% identical logic, only differs in type conversion (string vs []byte).

Recommendation:
Consolidate using generics (Go 1.18+) or a shared implementation:

// Shared expansion logic
func expandVariablesInBytes(data []byte) ([]byte, error) {
    // Core logic here
}

func expandVariables(input string) (string, error) {
    result, err := expandVariablesInBytes([]byte(input))
    return string(result), err
}

func ExpandRawJSONVariables(data []byte) ([]byte, error) {
    return expandVariablesInBytes(data)
}

Estimated Impact: Reduce ~60 lines to ~40 lines, single source of truth for expansion logic


4. Validation Functions Scattered Across Multiple Files

Issue: Validation logic distributed across 3 files without clear organization.

Files & Function Counts:

  • validation.go (11 functions): General validation, variable expansion, mount validation, server config validation
  • validation_env.go (13 functions): Environment validation, Docker checks, execution environment validation
  • validation_schema.go (7 functions): JSON schema validation, string pattern validation

Overlap Examples:

  • validation.go:validateServerConfig() calls functions from both validation_env.go and validation_schema.go
  • Environment variable validation appears in both validation.go (via expandVariables) and validation_env.go (via ValidateExecutionEnvironment)

Recommendation:
Option A: Consolidate into a single validation.go with clear section comments
Option B: Create sub-package internal/config/validation/ with:

  • env.go - Environment-specific validation
  • schema.go - JSON schema validation
  • variables.go - Variable expansion
  • config.go - High-level config validation orchestration

Estimated Impact: Improved code discoverability and reduced cognitive load


🟡 MEDIUM: Server Package Handler Fragmentation

5. Middleware and Handler Logic Scattered

Issue: HTTP handler and middleware logic fragmented across multiple files.

Files:

  • handlers.go (6 functions): High-level endpoint handlers (handleOAuthDiscovery, handleClose, handleUnified, handleRouted, etc.)
  • auth.go (3 functions): Authentication middleware (authMiddleware, extractAuthHeader, error logging)
  • routed.go (~400 lines): Routed mode implementation with inline middleware (rejectIfShutdown)
  • unified.go (~200 lines): Unified mode implementation
  • http_helpers.go (3 functions): Helper utilities (extractAndValidateSession, logHTTPRequestBody, injectSessionContext)
  • response_writer.go (1 struct + 4 methods): Custom response writer

Pattern:

  • handlers.go creates endpoint handlers
  • auth.go provides authentication middleware wrapper
  • routed.go has inline shutdown check middleware (rejectIfShutdown)
  • Middleware composition is implicit, not explicit

Recommendation:
Consolidate middleware chain composition:

// internal/server/middleware.go (new file)
type Middleware func(http.Handler) http.Handler

func Chain(middlewares ...Middleware) Middleware {
    return func(final http.Handler) http.Handler {
        for i := len(middlewares) - 1; i >= 0; i-- {
            final = middlewares[i](final)
        }
        return final
    }
}

// Then in handlers.go:
handler := Chain(
    authMiddleware(apiKey),
    shutdownMiddleware(server),
)(finalHandler)

Estimated Impact: Clearer request processing flow, easier to add/remove middleware


🟢 LOW: HTTP Response Writing Pattern Duplication

6. JSON Response Writing Pattern Repeated 50+ Times

Issue: The same 3-line pattern for JSON responses repeated throughout codebase.

Pattern:

w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK) // or other status
json.NewEncoder(w).Encode(response)

Locations:

  • internal/server/handlers.go (multiple locations)
  • internal/server/auth.go
  • Test files throughout internal/server/

Recommendation:
Extract helper function:

// internal/server/http_helpers.go
func WriteJSONResponse(w http.ResponseWriter, status int, v interface{}) error {
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(status)
    return json.NewEncoder(w).Encode(v)
}

// Usage:
WriteJSONResponse(w, http.StatusOK, response)

Estimated Impact: Reduce ~150 lines to ~50 lines across codebase


🟢 LOW: String Validation Helpers

7. Repeated Empty String Checks

Issue: No centralized string validation utilities.

Patterns found 50+ times:

  • len(x) == 0
  • len(strings.TrimSpace(x)) == 0
  • Manual port number validation
  • Manual path validation

Example Locations:

  • internal/config/validation_schema.go:419: len(strings.TrimSpace(server.Entrypoint)) == 0
  • internal/server/server.go:139: len(serverIDs) == 0
  • Various files throughout codebase

Recommendation:
Create utility package or add to existing internal/sys/:

// internal/sys/validation.go (new file) or internal/sys/sys.go
func IsEmpty(s string) bool {
    return len(strings.TrimSpace(s)) == 0
}

func IsValidPort(port int) bool {
    return port > 0 && port <= 65535
}

func IsAbsolutePath(path string) bool {
    return filepath.IsAbs(path)
}
````

**Estimated Impact:** Minor code reduction, improved readability

---

## File Organization Issues

### Package: internal/logger/ (20 files)

**Issue:** Largest package with diverse responsibilities, difficult to navigate.

**Current Structure:**
- Core logging: `logger.go`, `constants.go`, `common.go`
- Format-specific loggers: `file_logger.go`, `jsonl_logger.go`, `markdown_logger.go`, `tools_logger.go`, `server_file_logger.go`
- RPC logging: `rpc_logger.go`, `rpc_formatter.go`, `rpc_helpers.go`
- Integration: `slog_adapter.go`
- Global state: `global_helpers.go`
- Sub-package: `sanitize/`

**Recommendation:**
Create sub-packages for better organization:

````
internal/logger/
├── logger.go (core debug logger)
├── constants.go
├── common.go
├── formatters/
│   ├── file.go (file_logger.go)
│   ├── jsonl.go (jsonl_logger.go)
│   ├── markdown.go (markdown_logger.go)
│   ├── tools.go (tools_logger.go)
│   └── server.go (server_file_logger.go)
├── rpc/
│   ├── rpc.go (rpc_logger.go)
│   ├── format.go (rpc_formatter.go)
│   └── helpers.go (rpc_helpers.go)
├── adapters/
│   └── slog.go (slog_adapter.go)
├── sanitize/
│   └── sanitize.go
└── global_helpers.go
````

---

### Package: internal/mcp/ (minimal files, test-heavy)

**Issue:** Core implementation minimal, test files suggest missing structure.

**Current Structure:**
- `connection.go` - Connection interface and implementation
- `types.go` - MCP type definitions
- Multiple test files suggesting separate concerns

**Test File Analysis:**
- `http_connection_test.go` - HTTP-specific connection
- `connection_stderr_test.go` - stderr handling
- `connection_arguments_test.go` - argument handling

**Recommendation:**
Extract HTTP connection into separate file:

````
internal/mcp/
├── connection.go (base interface)
├── stdio_connection.go (stdio implementation)
├── http_connection.go (HTTP implementation)
└── types.go

Summary Statistics

Category Count Files Affected
Go Files Analyzed 65 internal/*
Packages Analyzed 16 internal/*
Critical Duplicates 3 patterns logger, config
Medium Issues 4 patterns server, config
Low-Priority Issues 2 patterns Various
Total Functions Reviewed 200+ All packages

Refactoring Recommendations

Priority 1: High Impact (Estimated 4-6 hours)

  1. ✅ Consolidate Log Level Wrappers

    • Create shared logger interface
    • Extract common log-level wrapper functions
    • Impact: 66% code reduction in wrappers, single source of truth
    • Files: internal/logger/file_logger.go, server_file_logger.go, markdown_logger.go
  2. ✅ Merge Variable Expansion Functions

    • Consolidate expandVariables and ExpandRawJSONVariables
    • Use shared implementation with type-specific wrappers
    • Impact: 33% code reduction, eliminates duplicate logic
    • File: internal/config/validation.go

Priority 2: Medium Impact (Estimated 6-8 hours)

  1. 📦 Restructure Logger Package

    • Create sub-packages: formatters/, rpc/, adapters/
    • Move files to appropriate sub-packages
    • Impact: Improved discoverability, clearer organization
    • Package: internal/logger/
  2. 🔧 Consolidate Validation Files

    • Choose between: single file with sections OR sub-package
    • Reorganize validation functions by concern
    • Impact: Reduced cognitive load, clearer validation flow
    • Package: internal/config/
  3. 🌐 Standardize Middleware Composition

    • Create explicit middleware chain in internal/server/middleware.go
    • Extract inline middleware from routed.go
    • Impact: Explicit request processing flow, easier to modify
    • Package: internal/server/

Priority 3: Long-term Improvements (Estimated 8-10 hours)

  1. Extract HTTP Response Helper

    • Create WriteJSONResponse() utility
    • Replace 50+ instances of duplicate pattern
    • Impact: Minor code reduction, improved consistency
  2. Create String Validation Utilities

    • Add IsEmpty(), IsValidPort(), IsAbsolutePath() helpers
    • Replace scattered validation patterns
    • Impact: Improved readability, consistent validation
  3. Extract MCP Connection Types

    • Split connection.go into stdio_connection.go and http_connection.go
    • Impact: Clearer separation of concerns

Implementation Checklist


Analysis Methodology

Tools Used:

  • Manual code review of 65 Go files
  • Pattern analysis for function naming and structure
  • Semantic analysis for duplicate detection
  • Directory structure analysis for organization issues

Detection Criteria:

  • Code similarity >80% for duplicates
  • Function naming patterns and clustering
  • File organization by feature/functionality
  • Go best practices and idioms

Exclusions:

  • Test files (*_test.go) - not analyzed
  • Generated code - none found
  • Vendor dependencies - not applicable

Next Steps

  1. Review and Prioritize: Team reviews findings and confirms priority order
  2. Create Sub-Issues: Break down Priority 1 items into actionable issues
  3. Implement Incrementally: Tackle one refactoring at a time with full test coverage
  4. Verify Quality: Run make agent-finished after each change
  5. Document Changes: Update AGENTS.md if refactoring affects workflow

Analysis Complete
This analysis provides a roadmap for systematic code improvement through consolidation and better organization.

Generated by Semantic Function Refactoring

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions