Skip to content

[refactor] Semantic Function Clustering - High-Priority Refactoring Opportunities #728

@github-actions

Description

@github-actions

🔧 Semantic Function Clustering Analysis

Analysis performed on repository: github/gh-aw-mcpg
Analysis date: 2026-02-05
Total Go files analyzed: 62 files (39 non-test files)

Executive Summary

The codebase demonstrates excellent overall organization with clear package boundaries and minimal code duplication. The analysis identified 2 high-priority refactoring opportunities and several minor organizational improvements. Most patterns follow Go idioms and domain-driven design principles.

Key Findings:

  • Strong package cohesion - Most files serve a single, clear purpose
  • Minimal duplication - Most "similar" functions serve different domains
  • ⚠️ 1 misplaced function - TruncateSessionID in auth package (used only by logger)
  • ⚠️ 1 outlier function - runDockerInspect in validation package (Docker operation)
  • Well-organized helpers - Domain-specific helpers properly placed

Function Inventory Summary

Packages Analyzed

Package Files Primary Purpose Organization Quality
internal/auth 1 Authentication header parsing ✅ Good (1 outlier)
internal/cmd 8 CLI commands and flags ✅ Excellent
internal/config 8 Configuration loading/validation ⚠️ Good (1 outlier)
internal/difc 5 DIFC security labels ✅ Excellent
internal/guard 4 Security guards ✅ Excellent
internal/launcher 3 Backend process management ✅ Excellent
internal/logger 13 Logging framework ✅ Excellent
internal/mcp 2 MCP protocol types ✅ Excellent
internal/middleware 1 Request middleware ✅ Excellent
internal/server 10 HTTP server implementation ✅ Excellent
internal/sys 1 System utilities ✅ Good
internal/testutil 4 Testing utilities ✅ Excellent
internal/timeutil 1 Time formatting ✅ Excellent
internal/tty 2 TTY detection ✅ Excellent
internal/version 1 Version management ✅ Excellent

Total Functions Cataloged: ~350 functions and methods
Semantic Clusters Identified: 9 major patterns
High-Priority Issues Found: 2
Duplicates Detected: 0 (all similar functions serve different domains)


Identified Issues

1. ⚠️ Misplaced Function: TruncateSessionID

Priority: 🔴 HIGH
Impact: Medium - Code organization and discoverability

Problem

TruncateSessionID() function is defined in internal/auth/header.go but is exclusively used by the logger package (internal/server/sdk_logging.go, 7 occurrences). This violates the principle of placing functions close to their usage.

Current Location

// File: internal/auth/header.go:172
func TruncateSessionID(sessionID string) string {
    if len(sessionID) <= 8 {
        return sessionID
    }
    return sessionID[:8] + "..."
}

Usage Analysis

All 7 calls are in internal/server/sdk_logging.go:

  • Line 54: Logging session comparison
  • Line 70: JSON-RPC request logging
  • Line 111: Error response logging
  • Lines 119-120: Session ID diagnostics
  • Lines 125-126: Session mismatch logging

Evidence

$ grep -r "TruncateSessionID" internal/ --include="*.go" | grep -v "_test.go"
internal/server/sdk_logging.go:54:     mode, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:70:     mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID))
internal/server/sdk_logging.go:111:    mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:119:    logSDK.Printf("    Session ID: %s", auth.TruncateSessionID(sessionID))
internal/server/sdk_logging.go:120:    logSDK.Printf("    MCP-Session-Id header: %s", auth.TruncateSessionID(mcpSessionID))
internal/server/sdk_logging.go:125:    mode, jsonrpcReq.Method, auth.TruncateSessionID(sessionID), ...
internal/server/sdk_logging.go:126:    auth.TruncateSessionID(mcpSessionID), errorMsg)
internal/auth/header.go:172:func TruncateSessionID(sessionID string) string {

Recommendation

Move TruncateSessionID to internal/logger/sanitize/sanitize.go

Benefits:

  1. ✅ Collocates with similar function TruncateSecret() (same purpose, different length)
  2. ✅ Places function near its only usage (logger package)
  3. ✅ Improves discoverability - all truncation helpers in one place
  4. ✅ Maintains consistency with existing sanitize package purpose

Implementation Steps:

  1. Move function from internal/auth/header.go to internal/logger/sanitize/sanitize.go
  2. Update imports in internal/server/sdk_logging.go: auth.TruncateSessionIDsanitize.TruncateSessionID
  3. Optionally: Parameterize truncation length for reusability
    func TruncateString(s string, maxLen int) string {
        if len(s) <= maxLen {
            return s
        }
        return s[:maxLen] + "..."
    }

Estimated Effort: 30-45 minutes
Risk: Low - Simple move with clear usage pattern


2. ⚠️ Outlier Function: runDockerInspect in validation_env.go

Priority: 🟡 MEDIUM
Impact: Low-Medium - Single-responsibility principle violation

Problem

runDockerInspect() is a Docker operation helper located in a validation package. This function executes docker inspect commands but doesn't perform validation logic itself.

Current Location

// File: internal/config/validation_env.go:156
func runDockerInspect(containerID, formatTemplate string) (string, error) {
    cmd := exec.Command("docker", "inspect", 
        "--format", formatTemplate, containerID)
    output, err := cmd.Output()
    // ... error handling ...
    return strings.TrimSpace(string(output)), nil
}

Analysis

Why it's an outlier:

  • ❌ Executes Docker commands (infrastructure operation)
  • ❌ Not validation logic (just a Docker client wrapper)
  • ❌ Breaks single-responsibility principle for validation_env.go
  • ✅ Used by validation functions (5 callers in same file)

Why current placement is acceptable:

  • ✅ Only used within validation_env.go (unexported helper)
  • ✅ Tightly coupled to containerized validation checks
  • ✅ Small function (~20 lines)

Recommendation

Option A (Recommended): Keep as-is with improved documentation

  • Add comment explaining it's a Docker operation helper for validation
  • Document that it's intentionally kept in-file due to tight coupling
  • Estimated Effort: 5 minutes

Option B (Future Enhancement): Create internal/docker/ package

  • Extract Docker operations if more Docker helpers are added
  • Consolidate Docker client interactions
  • Estimated Effort: 2-3 hours (requires careful dependency analysis)
  • Trigger: If 3+ Docker helper functions emerge

Current Recommendation: Option A - The function is an acceptable outlier given its tight coupling and single-file usage.


Detailed Function Clusters

Cluster 1: Validation Functions ✅

Pattern: validate*, check*, ensure*
Files: 8 files across config/, auth/, server/ packages
Organization:Excellent - Well-distributed by domain

Examples:

  • config/validation.go: Config structure validation
  • config/validation_env.go: Environment validation
  • config/validation_schema.go: JSON schema validation
  • config/rules/rules.go: Validation rule constructors
  • auth/header.go: API key validation
  • server/http_helpers.go: Session validation

Analysis: Each validation function is correctly placed in its domain package. No consolidation needed.


Cluster 2: Logging Functions ✅

Pattern: Log*, log*, format*, sanitize*
Files: 13 files in logger/ package + domain-specific helpers
Organization:Excellent - Centralized core with domain helpers

Core Logger Files:

  • logger/file_logger.go - File-based logging
  • logger/jsonl_logger.go - JSONL format logging
  • logger/markdown_logger.go - Markdown format logging
  • logger/server_file_logger.go - Per-server log files
  • logger/sanitize/sanitize.go - Secret sanitization

Domain-Specific Helpers:

  • launcher/log_helpers.go - Launcher-specific logging (✅ Correctly placed)
  • server/auth.go: logRuntimeError - Runtime error logging (✅ Correctly placed)
  • server/http_helpers.go: logHTTPRequestBody - HTTP debugging (✅ Correctly placed)

Analysis: The separation between core logging (logger package) and domain helpers (in domain packages) follows Go best practices. No changes needed.


Cluster 3: HTTP/Transport Functions ✅

Pattern: handle*, create*, setup*, http*, transport*
Files: 10 files in server/ and mcp/ packages
Organization:Excellent - Clear separation by mode

Files:

  • server/handlers.go - Generic HTTP handlers
  • server/health.go - Health endpoint
  • server/routed.go - Routed mode implementation
  • server/unified.go - Unified mode implementation
  • server/transport.go - Transport layer
  • mcp/connection.go - MCP protocol connections

Analysis: Clear functional separation. Each file serves a distinct purpose. No refactoring needed.


Cluster 4: Initialization Functions ✅

Pattern: New*, Init*, Create*, Register*
Files: Distributed across all packages
Organization:Excellent - Idiomatic Go constructors

Patterns Identified:

  • New* - Constructor for structs (e.g., NewLauncher, NewUnified)
  • Init* - Initialize global state (e.g., InitFileLogger, InitServerFileLogger)
  • Create* - Factory functions (e.g., CreateHTTPServerForMCP)
  • Register* - Registration pattern (e.g., RegisterGuardType, RegisterDefaults)

Analysis: Naming conventions are consistent and idiomatic. The generic helper logger/common.go:initLogger[T] demonstrates good use of generics to reduce duplication.


Cluster 5: Formatting Functions ✅

Pattern: format*, Format*, truncate*, Truncate*
Files: 7 files across logger/, config/, auth/, difc/ packages
Organization: ⚠️ Good (1 outlier: TruncateSessionID)

Core Formatting:

  • logger/rpc_formatter.go - RPC message formatting
  • logger/sanitize/sanitize.go - Secret truncation/sanitization
  • timeutil/format.go - Duration formatting

Domain-Specific Formatting:

  • config/validation_schema.go: formatSchemaError - JSON schema errors (✅ Correctly placed)
  • difc/evaluator.go: FormatViolationError - DIFC violations (✅ Correctly placed)
  • auth/header.go: TruncateSessionID - Session ID truncation (⚠️ Outlier - see Issue Configure as a Go CLI tool #1)

Analysis: Domain-specific formatters are correctly placed except for TruncateSessionID (see Issue #1).


Cluster 6: Parsing/Extraction Functions ✅

Pattern: parse*, extract*, Parse*, Extract*
Files: 5 files across auth/, mcp/, logger/, server/ packages
Organization:Excellent

Examples:

  • auth/header.go: ParseAuthHeader, ExtractSessionID, ExtractAgentID
  • mcp/connection.go: parseSSEResponse
  • logger/rpc_helpers.go: extractEssentialFields, ExtractErrorMessage
  • server/unified.go: parseToolArguments

Analysis: All parsing functions are correctly placed in their domain packages. No consolidation needed.


Cluster 7: Connection/Session Management ✅

Pattern: Get*, Set*, Delete*, connection*, session*
Files: 4 files in launcher/, mcp/, server/ packages
Organization:Excellent - Clear ownership boundaries

Key Files:

  • launcher/connection_pool.go - Connection pool CRUD operations
  • launcher/launcher.go - Connection lifecycle (GetOrLaunch, GetOrLaunchForSession)
  • mcp/connection.go - MCP protocol connections
  • server/unified.go - Session management

Analysis: Well-defined interfaces with clear responsibilities. No refactoring needed.


Cluster 8: Cleanup/Close Functions ✅

Pattern: Close, close*, Stop, cleanup*
Files: Distributed across multiple packages
Organization:Excellent - Following Go interface patterns

Examples:

  • launcher/connection_pool.go: Stop, cleanupIdleConnections
  • launcher/launcher.go: Close
  • logger/*: Close methods on all logger types
  • mcp/connection.go: Close
  • server/unified.go: Close

Analysis: Consistent use of Close() method for resource cleanup following Go's io.Closer pattern. No changes needed.


Cluster 9: DIFC Security Functions ✅

Pattern: DIFC-specific operations (labels, evaluation, guards)
Files: 5 files in difc/ package
Organization:Excellent - Cohesive security domain

Files:

  • difc/agent.go - Agent label management
  • difc/capabilities.go - Capabilities tracking
  • difc/evaluator.go - Access control evaluation
  • difc/labels.go - Label types and operations
  • difc/resource.go - Resource labeling

Analysis: DIFC implementation is well-isolated in a dedicated package with clear interfaces. Excellent domain-driven design.


Duplicate Analysis: No True Duplicates Found

Investigated Candidates

❌ Not Duplicates: Environment Variable Expansion

Functions:

  • config/validation.go: expandVariables - Runtime config value expansion
  • cmd/root.go: loadEnvFile - .env file parsing
  • mcp/connection.go: expandDockerEnvArgs - Docker argument expansion

Analysis: These serve different contexts and have different error handling requirements. Consolidation would create artificial coupling.


❌ Not Duplicates: Empty Checking

Functions:

  • logger/rpc_helpers.go: isEffectivelyEmpty - JSON-RPC data validation
  • difc/labels.go: IsEmpty - DIFC label checking

Analysis: Type-specific logic operating on different data structures. No duplication.


✅ Acceptable Pattern: Error Formatting

Functions:

  • config/validation_schema.go: formatSchemaError - JSON schema errors
  • difc/evaluator.go: FormatViolationError - DIFC violations
  • config/rules/rules.go: Error constructors - Config validation errors

Analysis: Domain-specific error formatting is idiomatic Go. Each formatter handles its domain's error context. Consolidation would lose domain specificity.


✅ Acceptable Pattern: Logging Initialization

Functions:

  • logger/file_logger.go: InitFileLogger
  • logger/jsonl_logger.go: InitJSONLLogger
  • logger/markdown_logger.go: InitMarkdownLogger
  • logger/server_file_logger.go: InitServerFileLogger

Analysis: All use the generic helper logger/common.go:initLogger[T]. Type-specific wrappers are needed for their return types. No further consolidation possible.


Summary of Recommendations

Priority 1: High Impact (Implement Soon)

# Issue Effort Risk Impact
1 Move TruncateSessionID to logger/sanitize/ 30-45 min Low Medium

Implementation Checklist:

  • Move TruncateSessionID from internal/auth/header.go to internal/logger/sanitize/sanitize.go
  • Update import in internal/server/sdk_logging.go
  • Run tests: make test
  • Verify no functionality broken

Priority 2: Medium Impact (Consider for Future)

# Issue Effort Risk Impact
2 Document runDockerInspect placement 5 min None Low
3 Consider internal/docker/ package if more Docker helpers emerge 2-3 hours Medium Medium

Architecture Insights

✅ Strengths Identified

  1. Excellent Package Cohesion

    • Each package has a clear, single purpose
    • Dependencies flow in correct direction (no circular dependencies detected)
    • Clear separation between core logic and domain helpers
  2. Minimal Code Duplication

    • Functions with similar names serve different domains
    • Generic helpers exist where appropriate (e.g., logger/common.go:initLogger[T])
    • Type-specific wrappers are kept minimal
  3. Idiomatic Go Patterns

    • Consistent constructor naming (New*, Init*, Create*)
    • Proper use of interfaces (io.Closer, Guard, BackendCaller)
    • Domain-driven design with clear boundaries
  4. Good Use of Generics

    • logger/common.go:initLogger[T] eliminates logger initialization duplication
    • Type constraints properly used (closableLogger constraint)
  5. Security-First Design

    • Dedicated logger/sanitize/ package for secret handling
    • DIFC implementation isolated in dedicated package
    • Consistent truncation for safe logging

⚠️ Minor Observations

  1. Naming Convention Inconsistency (Minor)

    • Mix of New* vs Create* for constructors
    • Mix of Init* vs New* for initialization
    • Recommendation: Document conventions in CONTRIBUTING.md
  2. Large Files

    • internal/server/unified.go - 1008 lines
    • internal/mcp/connection.go - 999 lines
    • Status: Acceptable - Each implements a cohesive feature set

Analysis Metadata

  • Total Go Files Analyzed: 62 files (39 non-test + 23 test files)
  • Total Functions/Methods Cataloged: ~350
  • Semantic Clusters Identified: 9 major patterns
  • High-Priority Issues: 1 (TruncateSessionID misplacement)
  • Medium-Priority Issues: 1 (runDockerInspect outlier)
  • True Duplicates Found: 0
  • Detection Method: Manual semantic analysis + grep pattern matching
  • Time to Analyze: ~45 minutes
  • Analysis Tool: GitHub Copilot CLI Agent

Conclusion

The gh-aw-mcpg codebase demonstrates excellent code organization with minimal refactoring needs. The single high-priority issue (TruncateSessionID misplacement) is straightforward to fix. Most functions are correctly placed following Go best practices and domain-driven design principles.

Recommended Action: Implement Priority 1 refactoring (TruncateSessionID move) in the next maintenance cycle. Priority 2 items are optional enhancements.

Overall Code Quality Rating: ⭐⭐⭐⭐⭐ (5/5) - Exceptionally well-organized codebase

AI generated by Semantic Function Refactoring

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions