Skip to content

[refactor] Semantic Function Clustering Analysis - Code Organization Opportunities #13126

@github-actions

Description

@github-actions

Executive Summary

Comprehensive semantic function clustering analysis of the Go codebase identified high-impact refactoring opportunities focused on reducing code duplication and improving organization. The analysis examined 487 non-test Go files with 312 exported functions in the pkg/workflow package.

Key Findings:

  • 55 parse*Config functions with 95%+ boilerplate similarity representing ~400-600 lines of duplicated code
  • 12 helper files with inconsistent usage patterns and 5 files containing no exported functions
  • 27 validation files showing good organization, but with scattered validation functions in non-validation files
  • 19 files with mixed responsibilities (3+ different function prefixes indicating semantic drift)

Analysis Overview

Codebase Statistics:

  • Total Go files analyzed: 487 files (excluding tests)
  • Primary focus: pkg/workflow (250 files, 51% of codebase)
  • Total exported functions: 312 in pkg/workflow
  • Unique function prefixes: 56 different semantic patterns
  • Helper files identified: 12 files
  • Validation files: 27 files
  • Config files: 8 files

Function Prefix Distribution (Top 10):

Prefix Count Percentage Category
Get 66 21.2% Accessor
New 47 15.1% Constructor
Build 34 10.9% Builder
Generate 20 6.4% Generator
Parse 17 5.4% Parser
Extract 14 4.5% Extractor
Validate 14 4.5% Validator
With 12 3.8% Configuration
Write 6 1.9% Writer
Resolve 4 1.3% Resolver

High-Impact Issues Identified

1. Repetitive parse*Config Pattern - Major Duplication

Issue: The codebase contains 55 parse*Config functions across 43 files with nearly identical implementations, representing the single largest code duplication opportunity.

Pattern Structure:

func (c *Compiler) parseXYZConfig(outputMap map[string]any) *XYZConfig {
    // 1. Check if key exists in map (100% identical across all functions)
    if _, exists := outputMap["config-key"]; !exists {
        return nil
    }
    
    // 2. Log the parsing operation (100% identical pattern)
    log.Print("Parsing configuration")
    
    // 3. Extract and unmarshal config (95% identical)
    var config XYZConfig
    if err := unmarshalConfig(outputMap, "config-key", &config, log); err != nil {
        // Error handling
    }
    
    // 4. Set defaults (90% identical pattern)
    if config.Max == 0 {
        config.Max = 1
    }
    
    // 5. Return config struct (100% identical)
    return &config
}
View Concrete Examples with File Paths

Example 1: Simple Pattern (add_labels.go)

File: pkg/workflow/add_labels.go:19-39

func (c *Compiler) parseAddLabelsConfig(outputMap map[string]any) *AddLabelsConfig {
    if _, exists := outputMap["add-labels"]; !exists {
        return nil
    }
    
    addLabelsLog.Print("Parsing add-labels configuration")
    
    var config AddLabelsConfig
    if err := unmarshalConfig(outputMap, "add-labels", &config, addLabelsLog); err != nil {
        // Error handling
    }
    
    addLabelsLog.Printf("Parsed configuration: allowed_count=%d, target=%s", ...)
    return &config
}

Example 2: Pattern with Defaults (assign_to_user.go)

File: pkg/workflow/assign_to_user.go:17-44

func (c *Compiler) parseAssignToUserConfig(outputMap map[string]any) *AssignToUserConfig {
    if _, exists := outputMap["assign-to-user"]; !exists {
        return nil
    }
    
    assignToUserLog.Print("Parsing assign-to-user configuration")
    
    var config AssignToUserConfig
    if err := unmarshalConfig(outputMap, "assign-to-user", &config, assignToUserLog); err != nil {
        // Error handling
    }
    
    // Default setting pattern (90% identical)
    if config.Max == 0 {
        config.Max = 1
    }
    
    return &config
}

Example 3: Generic Wrapper (close_entity_helpers.go)

File: pkg/workflow/close_entity_helpers.go:185-215

This file shows a good refactoring approach - a generic function with specialized wrappers:

// Generic function handling the common pattern
func (c *Compiler) parseCloseEntityConfig(outputMap map[string]any, 
    params CloseEntityJobParams, logger *logger.Logger) *CloseEntityConfig {
    
    if _, exists := outputMap[params.ConfigKey]; !exists {
        return nil
    }
    
    logger.Printf("Parsing %s configuration", params.ConfigKey)
    
    var config CloseEntityConfig
    if err := unmarshalConfig(outputMap, params.ConfigKey, &config, logger); err != nil {
        // Error handling
    }
    
    if config.Max == 0 {
        config.Max = 1
    }
    
    return &config
}

// Specialized wrappers calling the generic function
func (c *Compiler) parseCloseIssuesConfig(outputMap map[string]any) *CloseIssuesConfig {
    def := closeEntityRegistry[0]
    params := CloseEntityJobParams{EntityType: def.EntityType, ConfigKey: def.ConfigKey}
    return c.parseCloseEntityConfig(outputMap, params, def.Logger)
}

This is the pattern that should be applied to all 55 parse*Config functions.

All 55 Affected Functions:

Complete List of parse*Config Functions

Compiler Methods (40 functions):

  1. pkg/workflow/add_comment.go:135 - parseCommentsConfig
  2. pkg/workflow/add_labels.go:19 - parseAddLabelsConfig
  3. pkg/workflow/add_reviewer.go:17 - parseAddReviewerConfig
  4. pkg/workflow/assign_milestone.go:17 - parseAssignMilestoneConfig
  5. pkg/workflow/assign_to_agent.go:19 - parseAssignToAgentConfig
  6. pkg/workflow/assign_to_user.go:17 - parseAssignToUserConfig
  7. pkg/workflow/autofix_code_scanning_alert.go:15 - parseAutofixCodeScanningAlertConfig
  8. pkg/workflow/close_entity_helpers.go:89 - parseCloseEntityConfig
  9. pkg/workflow/close_entity_helpers.go:185 - parseCloseIssuesConfig
  10. pkg/workflow/close_entity_helpers.go:195 - parseClosePullRequestsConfig
  11. pkg/workflow/close_entity_helpers.go:205 - parseCloseDiscussionsConfig
  12. pkg/workflow/copy_project.go:16 - parseCopyProjectsConfig
  13. pkg/workflow/create_agent_session.go:20 - parseAgentSessionConfig
  14. pkg/workflow/create_code_scanning_alert.go:89 - parseCodeScanningAlertsConfig
  15. pkg/workflow/create_discussion.go:26 - parseDiscussionsConfig
  16. pkg/workflow/create_issue.go:26 - parseIssuesConfig
  17. pkg/workflow/create_pr_review_comment.go:86 - parsePullRequestReviewCommentsConfig
  18. pkg/workflow/create_project.go:18 - parseCreateProjectsConfig
  19. pkg/workflow/create_project_status_update.go:17 - parseCreateProjectStatusUpdateConfig
  20. pkg/workflow/create_pull_request.go:160 - parsePullRequestsConfig
  21. pkg/workflow/dispatch_workflow.go:17 - parseDispatchWorkflowConfig
  22. pkg/workflow/hide_comment.go:17 - parseHideCommentConfig
  23. pkg/workflow/link_sub_issue.go:20 - parseLinkSubIssueConfig
  24. pkg/workflow/mark_pull_request_as_ready_for_review.go:17 - parseMarkPullRequestAsReadyForReviewConfig
  25. pkg/workflow/missing_data.go:92 - parseMissingDataConfig
  26. pkg/workflow/missing_tool.go:92 - parseMissingToolConfig
  27. pkg/workflow/noop.go:9 - parseNoOpConfig
  28. pkg/workflow/publish_assets.go:22 - parseUploadAssetConfig
  29. pkg/workflow/push_to_pull_request_branch.go:42 - parsePushToPullRequestBranchConfig
  30. pkg/workflow/remove_labels.go:17 - parseRemoveLabelsConfig
  31. pkg/workflow/safe_jobs.go:40 - parseSafeJobsConfig
  32. pkg/workflow/safe_output_config.go:6 - parseBaseSafeOutputConfig
  33. pkg/workflow/threat_detection.go:21 - parseThreatDetectionConfig
  34. pkg/workflow/update_discussion.go:19 - parseUpdateDiscussionsConfig
  35. pkg/workflow/update_entity_helpers.go:114 - parseUpdateEntityConfig
  36. pkg/workflow/update_entity_helpers.go:276 - parseUpdateEntityConfigWithFields
  37. pkg/workflow/update_issue.go:18 - parseUpdateIssuesConfig
  38. pkg/workflow/update_project.go:34 - parseUpdateProjectConfig
  39. pkg/workflow/update_pull_request.go:18 - parseUpdatePullRequestsConfig
  40. pkg/workflow/update_release.go:15 - parseUpdateReleaseConfig

Standalone Functions (15 functions):

  1. pkg/workflow/config_helpers.go:80 - parseLabelsFromConfig
  2. pkg/workflow/config_helpers.go:101 - parseTitlePrefixFromConfig
  3. pkg/workflow/config_helpers.go:109 - parseTargetRepoFromConfig
  4. pkg/workflow/config_helpers.go:144 - parseParticipantsFromConfig
  5. pkg/workflow/config_helpers.go:176 - parseAllowedLabelsFromConfig
  6. pkg/workflow/frontmatter_types.go:315 - parseRuntimesConfig
  7. pkg/workflow/frontmatter_types.go:369 - parsePermissionsConfig
  8. pkg/workflow/safe_output_builder.go:98 - parseRequiredLabelsFromConfig
  9. pkg/workflow/safe_output_builder.go:104 - parseRequiredTitlePrefixFromConfig
  10. pkg/workflow/safe_outputs_app.go:31 - parseAppConfig
  11. pkg/workflow/safe_outputs_config_messages.go:13 - parseMessagesConfig
  12. pkg/workflow/safe_outputs_config_messages.go:84 - parseMentionsConfig
  13. pkg/workflow/time_delta.go:378 - parseExpiresFromConfig
  14. pkg/workflow/tools_parser.go:469 - parseMCPServerConfig
  15. pkg/workflow/update_entity_helpers.go:343 - parseUpdateEntityConfigTyped

Impact Quantification:

  • 400-600 lines of duplicated boilerplate code across the 55 functions
  • Each function averages 10-30 lines with 95%+ structural similarity
  • Estimated reduction: Consolidating into a generic parser could reduce to a single ~50-100 line generic function with type parameters

Recommendation:

Follow the pattern already established in close_entity_helpers.go - create a generic config parser that all 40 Compiler methods delegate to:

// Generic config parser with type constraints
func parseConfigGeneric[T any](
    outputMap map[string]any,
    configKey string,
    logger *logger.Logger,
    defaultsFunc func(*T),
    validationFunc func(*T) error,
) *T {
    if _, exists := outputMap[configKey]; !exists {
        return nil
    }
    
    logger.Printf("Parsing %s configuration", configKey)
    
    var config T
    if err := unmarshalConfig(outputMap, configKey, &config, logger); err != nil {
        logger.Printf("Error parsing config: %v", err)
        return nil
    }
    
    if defaultsFunc != nil {
        defaultsFunc(&config)
    }
    
    if validationFunc != nil {
        if err := validationFunc(&config); err != nil {
            logger.Printf("Validation error: %v", err)
            return nil
        }
    }
    
    return &config
}

Then each specific parser becomes a simple wrapper:

func (c *Compiler) parseAddLabelsConfig(outputMap map[string]any) *AddLabelsConfig {
    return parseConfigGeneric[AddLabelsConfig](
        outputMap,
        "add-labels",
        addLabelsLog,
        nil, // no defaults needed
        nil, // no custom validation
    )
}

Estimated Effort: 8-12 hours
Estimated Impact: Reduce 400-600 lines of boilerplate by 70-80%, improve maintainability, ensure consistent parsing behavior


2. Helper File Inconsistency

Issue: The codebase has 12 files with _helpers.go suffix, but 5 of them contain no exported functions, and the remaining files have inconsistent usage patterns.

Helper Files Analysis:

File Exported Functions Status Issue
close_entity_helpers.go 0 ❌ Empty No exported functions despite "helpers" name
compiler_test_helpers.go 0 ✓ OK Test utilities (appropriately named)
compiler_yaml_helpers.go 1 ⚠️ Minimal Only GetWorkflowIDFromPath
config_helpers.go 3 ✓ OK ParseBoolFromConfig, ParseIntFromConfig, ParseStringArrayFromConfig
engine_helpers.go 11 ❌ Mixed 6 different prefixes - substantial business logic, not just helpers
error_helpers.go 5 ✓ OK Error creation and wrapping utilities
git_helpers.go 1 ⚠️ Single-function Only GetCurrentGitTag - could be relocated
map_helpers.go 0 ❌ Empty No exported functions
safe_outputs_config_generation_helpers.go 0 ❌ Empty No exported functions
safe_outputs_config_helpers.go 2 ✓ OK GetEnabledSafeOutputToolNames, HasSafeOutputsEnabled
update_entity_helpers.go 0 ❌ Empty No exported functions
validation_helpers.go 6 ✓ OK Core validation utilities

Specific Issues:

A. engine_helpers.go - Misnamed File

File: pkg/workflow/engine_helpers.go (11 exported functions)

Function Prefixes: Build, Extract, Filter, Format, Get, Inject, Resolve (6 different semantic categories)

Problem: This file is called "helpers" but contains substantial business logic:

  • BuildNpmInstallStep - Complex npm installation step generation
  • InjectCustomSteps - Custom step injection logic
  • ResolveCustomImageRegistry - Image registry resolution

Impact: The "helpers" name suggests lightweight utilities, but the file contains core domain logic with multiple responsibilities.

Recommendation: Rename to engine_installation.go or split into domain-specific files:

  • engine_npm_setup.go - npm installation logic
  • engine_custom_steps.go - custom step injection
  • engine_registry.go - registry resolution

B. Single-Function Helper Files

Three helper files contain only 1 exported function:

  • git_helpers.go:GetCurrentGitTag (1 function)
  • compiler_yaml_helpers.go:GetWorkflowIDFromPath (1 function)

Recommendation: Move these functions to more appropriate domain files:

  • GetCurrentGitTag → Move to pkg/gitutil/ or a git operations file
  • GetWorkflowIDFromPath → Move to compiler_yaml.go (already handles YAML operations)

C. Empty Helper Files

Five helper files contain no exported functions:

  • close_entity_helpers.go
  • map_helpers.go
  • safe_outputs_config_generation_helpers.go
  • update_entity_helpers.go

Recommendation:

  • If these files contain only private helper functions used within the package, consider merging them into the files that use them
  • If they're truly unused, consider removing them
  • Document why they exist if they serve a specific architectural purpose

Estimated Effort: 3-5 hours
Estimated Impact: Improved code organization, clearer file purposes, reduced cognitive load


3. Scattered Validation Functions

Issue: While the codebase has excellent validation organization (27 dedicated *_validation.go files), some validation functions are scattered in non-validation files, creating inconsistent patterns.

Validation Architecture:

Well-Organized: 27 dedicated validation files for domain-specific validation:

  • agent_validation.go, docker_validation.go, firewall_validation.go, etc.
  • validation_helpers.go with 6 core validation utilities

Scattered Validation Functions found in non-validation files:

File Validation Function Line Issue
action_sha_checker.go ValidateActionSHAsInLockFile (function) Should be in action_validation.go
github_tool_to_toolset.go ValidateGitHubToolsAgainstToolsets (function) Should be in github_validation.go
artifact_manager.go ValidateDownload, ValidateAllDownloads (methods) Should be in artifact_validation.go

Example: action_sha_checker.go Mixed Concerns

File: pkg/workflow/action_sha_checker.go

// Three different semantic concerns in one file:
func ExtractActionsFromLockFile(...) []string { ... }  // Extract prefix
func CheckActionSHAUpdates(...) (bool, error) { ... }  // Check prefix
func ValidateActionSHAsInLockFile(...) error { ... }   // Validate prefix
``````

**Problem**: This file mixes extraction, checking, and validation logic. The `Validate*` function should be in a dedicated validation file.

**Recommendation:**

1. **Create missing validation files**:
   - `pkg/workflow/action_validation.go` for action-related validation
   - `pkg/workflow/artifact_validation.go` for artifact validation
   - `pkg/workflow/github_validation.go` for GitHub-specific validation

2. **Move scattered validation functions** to their appropriate validation files

3. **Adopt consistent validation placement**: All `Validate*` functions should live in `*_validation.go` files unless they're one-line inline validations

**Estimated Effort**: 4-6 hours
**Estimated Impact**: Improved consistency, easier discovery of validation logic, clearer separation of concerns

---

### 4. Files with Mixed Responsibilities

**Issue**: 19 files contain 3+ different function prefixes, indicating mixed semantic concerns and potential for further modularization.

**High-Diversity Files:**

<details>
<summary><b>Files with Mixed Responsibilities (3+ Function Prefix Types)</b></summary>

| File | Functions | Prefixes | Assessment |
|------|-----------|----------|------------|
| **expression_builder.go** | 27 | Add, Build (25), Render | ⚠️ Mostly focused on Build, but has tangential Add/Render |
| **js.go** | 25 | Format, Get (22), Write | ⚠️ Mostly Get operations, but has Write/Format concerns |
| **compiler_types.go** | 16 | Get, New, Set, With | ✓ OK - Type accessors naturally have multiple prefixes |
| **engine_helpers.go** | 11 | Build, Extract, Filter, Format, Get, Inject, Resolve | ❌ Too diverse - should be split |
| **metrics.go** | 10 | Convert, Extract, Finalize, Prettify | ⚠️ Moderate diversity - consider splitting |
| **bundler_file_mode.go** | 9 | Collect, Generate, Get, Prepare, Rewrite, Transform | ❌ Too diverse - consider splitting |
| **comment.go** | 7 | Filter, Get, Merge, Parse | ⚠️ Moderate diversity |
| **agentic_engine.go** | 6 | Convert, Generate, Get, New | ✓ OK - Reasonable diversity for engine file |
| **error_helpers.go** | 5 | Enhance, New, Wrap | ✓ OK - Error utilities naturally group these |
| **version.go** | 5 | Get, Is, Set | ✓ OK - Version accessors and predicates |
| **mcp_renderer.go** | 5 | Handle, New, Render | ✓ OK - Renderer lifecycle methods |
| **expression_parser.go** | 5 | Break, Normalize, Parse, Visit | ✓ OK - Parser operations |
| **strings.go** | 5 | Sanitize, Shorten, Sort | ✓ OK - String utilities |
| **safe_inputs_parser.go** | 4 | Has, Is, Parse | ✓ OK - Parser predicates and operations |
| **yaml.go** | 4 | Clean, Marshal, Order, Unquote | ✓ OK - YAML manipulation utilities |
| **permissions_validation.go** | 3 | Format, Get, Validate | ✓ OK - Validation file with helpers |
| **action_sha_checker.go** | 3 | Check, Extract, Validate | ❌ Mixed concerns - should split |
| **step_types.go** | 3 | Map, Slice, Steps | ✓ OK - Type conversion utilities |
| **error_aggregation.go** | 3 | Format, New, Split | ✓ OK - Error aggregation utilities |

</details>

**Critical Cases:**

#### A. engine_helpers.go (11 functions, 6 prefixes)

Already discussed in Issue #2 - this file should be renamed or split.

#### B. bundler_file_mode.go (9 functions, 6 prefixes)

**File**: `pkg/workflow/bundler_file_mode.go`

**Prefixes**: Collect, Generate, Get, Prepare, Rewrite, Transform

**Recommendation**: Consider splitting into:
- `bundler_collection.go` - Collection operations
- `bundler_transformation.go` - Transformation and rewriting operations
- `bundler_generation.go` - Generation operations

#### C. action_sha_checker.go (3 functions, 3 different prefixes)

**File**: `pkg/workflow/action_sha_checker.go`

**Functions**:
- `ExtractActionsFromLockFile` - Extraction concern
- `CheckActionSHAUpdates` - Checking/comparison concern
- `ValidateActionSHAsInLockFile` - Validation concern

**Recommendation**: Split into separate concerns:
- Move Extract function to extraction utilities
- Move Validate function to validation file
- Keep Check function in this file or rename file to `action_sha_updater.go`

**Estimated Effort**: 6-8 hours (for critical cases)
**Estimated Impact**: Medium - primarily affects developer navigation and code organization

---

## Positive Findings

### Excellent Organization Patterns

The codebase demonstrates many strengths that should be maintained:

#### 1. MCP Configuration Files (16 files)

✅ **Exemplary Organization** - Clear naming, logical separation, consistent prefix pattern:

- `mcp_config_builtin.go` - Built-in configurations
- `mcp_config_custom.go` - Custom server handling
- `mcp_config_playwright_renderer.go` - Playwright rendering
- `mcp_config_serena_renderer.go` - Serena rendering
- `mcp_config_types.go` - Type definitions
- `mcp_config_utils.go` - Utility functions
- `mcp_config_validation.go` - Validation logic
- (9 more related files)

**Why This Works:**
- Clear file naming with consistent `mcp_` prefix
- Logical separation by concern (types, utils, validation, renderers)
- Feature-based grouping (github, playwright, serena)
- Each file has a specific, descriptive purpose

**Recommendation**: Use MCP file organization as a reference pattern for other subsystems.

#### 2. Create/Update Entity Pattern (15 files)

✅ **Consistent Entity Operations** - One entity type per file:

**Create Files** (8):
- `create_agent_session.go`
- `create_code_scanning_alert.go`
- `create_discussion.go`
- `create_issue.go`
- `create_pr_review_comment.go`
- `create_project.go`
- `create_project_status_update.go`
- `create_pull_request.go`

**Update Files** (7):
- `update_discussion.go`
- `update_issue.go`
- `update_project.go`
- `update_project_job.go`
- `update_pull_request.go`
- `update_release.go`

**Why This Works:**
- Consistent naming: `{action}_{entity}.go`
- One entity per file
- Predictable structure: parse configbuild jobgenerate steps

#### 3. Validation Architecture (27 dedicated files)

✅ **Well-Modularized Validation** - Each domain has dedicated validation:

- `agent_validation.go`, `docker_validation.go`, `firewall_validation.go`
- `bundler_runtime_validation.go`, `bundler_safety_validation.go`, `bundler_script_validation.go`
- `compiler_filters_validation.go`, `expression_validation.go`, `schema_validation.go`
- (24 more validation files)

**Why This Works:**
- Clear `*_validation.go` naming pattern
- Domain-specific validation concerns are isolated
- Easy to locate validation logic
- `validation_helpers.go` provides common utilities

---

## Recommendations Summary

### Priority 1: High Impact, Low Risk (Weeks 1-2)

#### 1.1 Introduce Generic Config Parser

**Action**: Create a generic `parseConfigGeneric[T]` function following the pattern in `close_entity_helpers.go`

**Files to Create/Modify:**
- Create `pkg/workflow/config_parser_generic.go` with generic parser
- Refactor 40 Compiler parse*Config methods to use the generic parser
- Consolidate 15 standalone parse functions

**Estimated Effort**: 8-12 hours
**Estimated Benefit**: Reduce 400-600 lines of boilerplate code by 70-80%

#### 1.2 Relocate Single-Function Helper Files

**Action**: Move functions from single-function helper files to appropriate domain files

**Changes:**
- `git_helpers.go::GetCurrentGitTag`Move to `pkg/gitutil/` or git operations file
- `compiler_yaml_helpers.go::GetWorkflowIDFromPath` → Move to `compiler_yaml.go`

**Estimated Effort**: 1-2 hours
**Estimated Benefit**: Reduced file count, clearer organization

### Priority 2: Medium Impact, Medium Effort (Weeks 3-4)

#### 2.1 Rename or Split engine_helpers.go

**Action**: Address the misnamed `engine_helpers.go` file with substantial business logic

**Option A**: Rename to `engine_installation.go` (quick fix)
**Option B**: Split into domain-specific files:
- `engine_npm_setup.go` - npm installation logic
- `engine_custom_steps.go` - custom step injection
- `engine_registry.go` - registry resolution

**Estimated Effort**: 4-6 hours
**Estimated Benefit**: Clearer file purpose, better code organization

#### 2.2 Standardize Validation File Organization

**Action**: Move scattered validation functions to dedicated `*_validation.go` files

**Changes:**
- Create `action_validation.go` and move `ValidateActionSHAsInLockFile`
- Create `artifact_validation.go` and move artifact validation methods
- Create `github_validation.go` and move GitHub validation functions

**Estimated Effort**: 4-6 hours
**Estimated Benefit**: Consistent validation patterns, easier discovery

#### 2.3 Review and Consolidate Empty Helper Files

**Action**: Evaluate the 5 helper files with no exported functions

**Files to Review:**
- `close_entity_helpers.go`
- `map_helpers.go`
- `safe_outputs_config_generation_helpers.go`
- `update_entity_helpers.go`

**Estimated Effort**: 2-3 hours
**Estimated Benefit**: Reduced file clutter, clearer purpose

### Priority 3: Long-term Improvements (Future)

#### 3.1 Split High-Diversity Files

**Action**: Evaluate and potentially split files with 6+ different function prefix types

**Candidates:**
- `bundler_file_mode.go` (9 functions, 6 prefixes)
- `action_sha_checker.go` (3 functions, 3 distinct concerns)

**Estimated Effort**: 6-8 hours
**Estimated Benefit**: Improved semantic cohesion

#### 3.2 Consider Package Subdivision

**Action**: Evaluate subdividing the `pkg/workflow` package (250 files) into subpackages

**Potential Structure:**
``````
pkg/workflow/
  compiler/          # Compiler-related files (21 files)
    orchestrator/    # Orchestrator files
    safeoutputs/     # Safe outputs files (9 files)
    yaml/            # YAML generation
  engines/           # Engine implementations (12 files)
  validation/        # Validation functions (27 files)
  entities/          # Entity operations (create/update files)
  mcp/              # MCP configuration (16 files)

Estimated Effort: 40-60 hours (large refactoring)
Estimated Benefit: Long-term maintainability for growing codebase
Risk: High - requires updating all imports across codebase

Recommendation: Defer until codebase grows beyond 300 files or clear pain points emerge.


Implementation Checklist

Phase 1: Quick Wins (Week 1)

  • Create generic config parser (config_parser_generic.go)
  • Migrate 5-10 parse*Config functions as pilot
  • Validate pilot results and test coverage
  • Relocate single-function helper files (GetCurrentGitTag, GetWorkflowIDFromPath)
  • Run full test suite to verify no regressions

Phase 2: Medium Refactoring (Weeks 2-3)

  • Complete migration of all 55 parse*Config functions to generic parser
  • Rename or split engine_helpers.go
  • Create missing validation files (action_validation.go, artifact_validation.go, github_validation.go)
  • Move scattered validation functions to appropriate files
  • Review and consolidate empty helper files

Phase 3: Validation and Documentation (Week 4)

  • Run full test suite (make test-unit)
  • Run linting (make lint)
  • Verify build succeeds (make build)
  • Update documentation on config parsing patterns
  • Document file organization guidelines based on MCP example

Phase 4: Long-term Considerations (Future)

  • Review files with high prefix diversity (bundler_file_mode.go, action_sha_checker.go)
  • Evaluate need for package subdivision if codebase continues to grow
  • Consider code generation for repetitive patterns
  • Document organization principles for future contributors

Analysis Metadata

  • Analysis Date: 2026-02-01
  • Total Files Analyzed: 487 Go files (excluding tests)
  • Primary Focus: pkg/workflow (250 files, 51% of codebase)
  • Functions Cataloged: 312 exported functions
  • Function Prefixes Identified: 56 unique semantic patterns
  • Helper Files: 12 files
  • Validation Files: 27 files
  • Config Files: 8 files
  • Detection Method: Semantic code analysis via pattern matching and function prefix clustering
  • Tools Used: grep, find, manual code review

References:

AI generated by Semantic Function Refactoring

  • expires on Feb 3, 2026, 3:08 PM UTC

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions