-
Notifications
You must be signed in to change notification settings - Fork 42
Description
Executive Summary
Analysis of 220 non-test Go files in the pkg/ directory has identified significant opportunities for code consolidation and improved organization. The most critical finding is ~2,000 lines of duplicated code in safe output configuration handling, with repetitive patterns across 60+ nearly identical functions.
Full Report
Analysis Overview
- Total Go files analyzed: 220 (excluding tests)
- Total functions cataloged: 1,200+
- Function clusters identified: 8 major clusters
- Critical duplications found: 5 high-impact areas
- Files needing reorganization: 47
- Detection method: Pattern analysis + semantic code review
1. Critical Issues - High Priority
π΄ Issue #1: Massive Safe Output Config Duplication
Scope: 19 files with 60+ nearly identical functions
Lines of duplicate code: ~2,000
Impact: High - makes maintenance difficult and error-prone
Problem Details
The safe output configuration system has severe code duplication:
-
Single 476-line mega-function in
pkg/workflow/compiler_safe_outputs_config.go:17-476- Function:
addHandlerManagerConfigEnvVar() - Contains 15 nearly identical if-blocks, one per output type
- Each block follows identical pattern with minor variations
- Function:
-
60+ duplicate
parse*Config()functions across 30+ files- Files:
create_issue.go,create_discussion.go,create_pull_request.go, etc. - Pattern: 80-90% identical code structure
- Each has same boilerplate for expires handling, base config parsing, validation
- Files:
-
37+ duplicate
build*Job()functions- Pattern: Similar job-building logic repeated
- Note:
BuildListSafeOutputJob()exists insafe_output_builder.gobut is underutilized
Example: Repetitive Code in compiler_safe_outputs_config.go
// Lines 17-48 - CreateIssues handler
if data.SafeOutputs.CreateIssues != nil {
cfg := data.SafeOutputs.CreateIssues
handlerConfig := make(map[string]any)
if cfg.Max > 0 { handlerConfig["max"] = cfg.Max }
if len(cfg.AllowedLabels) > 0 { handlerConfig["allowed_labels"] = cfg.AllowedLabels }
if cfg.RequiredTitlePrefix != "" { handlerConfig["required-title-prefix"] = cfg.RequiredTitlePrefix }
// ... 8 more similar lines
config["create_issue"] = handlerConfig
}
// Lines 50-70 - AddComments handler (nearly identical!)
if data.SafeOutputs.AddComments != nil {
cfg := data.SafeOutputs.AddComments
handlerConfig := make(map[string]any)
if cfg.Max > 0 { handlerConfig["max"] = cfg.Max }
// ... same pattern repeated
config["add_comment"] = handlerConfig
}
// REPEATS 13 MORE TIMES with minor variationsExample: Duplicate Parse Pattern
Found in create_issue.go:23-50, create_discussion.go:23-50, create_pull_request.go:23-50, and 10+ other files:
func (c *Compiler) parseXConfig(outputMap map[string]any) *XConfig {
if _, exists := outputMap["x"]; !exists {
return nil
}
log.Print("Parsing x configuration")
configData, _ := outputMap["x"].(map[string]any)
// Expires handling - DUPLICATED in 10+ files
if configData != nil {
if expires, exists := configData["expires"]; exists {
if _, ok := expires.(string); ok {
expiresInt := parseExpiresFromConfig(configData)
if expiresInt > 0 {
configData["expires"] = expiresInt
}
}
}
}
var config XConfig
if err := unmarshalConfig(outputMap, "x", &config, log); err != nil {
return nil
}
c.parseBaseSafeOutputConfig(configData, &config.BaseSafeOutputConfig, defaultMax)
return &config
}This pattern is repeated 30+ times with only the type name changing!
Recommendation
Refactor using generics and registry pattern:
-
Create generic config parser in
pkg/workflow/config_helpers.go:func ParseSafeOutputConfig[T any]( outputMap map[string]any, key string, defaultMax int, preprocessor func(map[string]any), ) (*T, error)
-
Replace 476-line function with handler registry:
type HandlerConfigBuilder interface { BuildConfig(cfg any) map[string]any } var handlerRegistry = map[string]HandlerConfigBuilder{ "create_issue": &CreateIssueBuilder{}, "add_comment": &AddCommentBuilder{}, // ... register all handlers }
-
Consolidate into subdirectory structure:
pkg/workflow/safe_outputs/ config/ parser.go # Generic config parser builder.go # Generic handler config builder validation.go # Shared validation types/ create_types.go # CreateIssue, CreateDiscussion, CreatePR update_types.go # UpdateIssue, UpdateDiscussion, UpdatePR close_types.go # CloseIssue, CloseDiscussion misc_types.go # Other configs handlers/ registry.go # Handler registration
Estimated Impact:
- Reduce code by ~1,800 lines
- Eliminate 60+ duplicate functions
- Make adding new safe output types trivial (add to registry)
- Significantly improve maintainability
π΄ Issue #2: Scattered Validation Logic
Scope: 27 files with *validation* in name
Impact: Medium - inconsistent validation patterns
Problem Details
Validation logic is spread across 27 separate files with no common framework:
Files with validation concerns:
pkg/workflow/*_validation.go(27 files)pkg/workflow/permissions_validator.gopkg/workflow/validation_helpers.go(exists but underutilized)
Issues:
- No common validation interface or framework
- Each validator implements its own pattern
permissions_validator.gomixes validation logic with static data definitions- Helper functions in
validation_helpers.goare rarely used
Example: Mixed Concerns in permissions_validator.go
// Contains both validation logic AND static toolset data
func ValidatePermissions(...) error {
// validation logic
}
func GetToolsetsData() map[string]ToolsetData {
// Returns 200+ lines of static configuration data
// Should be in separate file or JSON
}Recommendation
-
Create validation framework in
pkg/workflow/validation/:type ValidationRule interface { Validate(value any) error ErrorMessage() string } type Validator struct { rules []ValidationRule } func (v *Validator) ValidateAll(config any) []error
-
Extract static data from
permissions_validator.goto:- Separate file:
pkg/workflow/toolset_data.go, or - JSON file loaded at runtime
- Separate file:
-
Group related validations:
pkg/workflow/validation/ framework.go # Common validation interfaces schema.go # Schema validation (combine 5 schema_*.go files) expression.go # Expression validation permissions.go # Permission validation runtime.go # Runtime validation features.go # Feature validation
Estimated Impact:
- Consolidate 27 files into ~8 focused files
- Create reusable validation framework
- Improve consistency across validators
π‘ Issue #3: Helper Function Underutilization
Scope: Existing helper functions not widely adopted
Impact: Medium - leads to continued code duplication
Problem Details
Well-designed helper functions exist but are underutilized:
-
pkg/workflow/config_helpers.go(13 functions)ParseStringArrayFromConfig(),ParseIntFromConfig(),ParseBoolFromConfig()parseLabelsFromConfig(),parseTitlePrefixFromConfig()- Usage: Only 12 files use these helpers
- Problem: 30+ files still implement custom parsing
-
pkg/workflow/safe_output_builder.go(13+ functions)BuildTargetEnvVar(),BuildRequiredLabelsEnvVar()BuildListSafeOutputJob()- comprehensive job builder- Usage: Only a few files leverage the full builder pattern
- Problem: Most files build env vars manually
-
pkg/workflow/engine_helpers.go(7 functions)GetBaseInstallationSteps(),ExtractAgentIdentifier()- Good example: Properly centralized and widely reused β
Recommendation
-
Promote awareness of existing helpers through:
- Code review guidelines
- Developer documentation
- Linting rules to detect reinvention
-
Expand helper coverage:
- Add generic config parser (as mentioned in Issue rejig docsΒ #1)
- Add environment variable builder registry
- Add common validation helpers
-
Refactor existing code to use helpers:
- Replace custom parsing with
ParseStringArrayFromConfig() - Replace manual env var building with
BuildTargetEnvVar()
- Replace custom parsing with
Estimated Impact:
- Reduce ~500 lines of duplicate parsing code
- Standardize configuration handling
- Improve code discoverability
2. Function Clusters by Semantic Pattern
Cluster 1: Safe Output Functions β οΈ NEEDS REFACTORING
Pattern: create_*, update_*, close_*, add_*, safe_outputs_*
File count: 19 files
Function count: 100+ functions
Files:
pkg/workflow/
create_issue.go, create_discussion.go, create_pull_request.go
update_issue.go, update_discussion.go, update_pull_request.go
close_entity_helpers.go, update_entity_helpers.go
add_comment.go, add_labels.go, add_reviewer.go
safe_output_*.go (4 files)
safe_outputs_*.go (9 files)
safe_inputs_*.go (3 files)
compiler_safe_outputs_*.go (9 files)
Analysis: Critical refactoring needed - see Issue #1 above
Cluster 2: Compiler Functions β MOSTLY GOOD
Pattern: compiler_*
File count: 21 files
Function count: 150+ functions
Files:
pkg/workflow/
compiler.go (main compiler)
compiler_jobs.go, compiler_activation_jobs.go
compiler_orchestrator.go, compiler_types.go
compiler_yaml*.go (5 YAML generation files)
compiler_safe_outputs*.go (9 safe output compilation files)
compiler_filters_validation.go
Analysis: Generally well-organized, but:
- Could benefit from subdirectory organization
compiler_safe_outputs_config.goneeds refactoring (Issue rejig docsΒ #1)- YAML generation files could be grouped in subdirectory
Recommendation: Minor reorganization into subdirectories:
pkg/workflow/compiler/
core/compiler.go, types.go, orchestrator.go
jobs/main.go, activation.go, safe_outputs.go
yaml/generator.go, helpers.go, artifacts.go
Cluster 3: Validation Functions β οΈ NEEDS FRAMEWORK
Pattern: *_validation.go, validate*
File count: 27 files
Function count: 80+ validation functions
Analysis: See Issue #2 above - needs validation framework
Cluster 4: Engine Functions β WELL ORGANIZED
Pattern: *_engine.go, engine_*
File count: 13 files
Function count: 60+ functions
Files:
pkg/workflow/
agentic_engine.go, claude_engine.go, codex_engine.go
copilot_engine.go, custom_engine.go
engine.go, engine_helpers.go, engine_output.go
engine_validation.go, engine_firewall_support.go
copilot_engine_execution.go
copilot_engine_installation.go
copilot_engine_tools.go
copilot_logs.go, claude_logs.go, codex_logs.go
Analysis: Good organization β
- Each engine type has dedicated file
- Common helpers centralized in
engine_helpers.go - Engine-specific concerns properly separated
Minor improvement: Could group Copilot files into subdirectory:
pkg/workflow/engines/
copilot/engine.go, execution.go, installation.go, tools.go
claude/engine.go, logs.go
codex/engine.go, logs.go
Cluster 5: MCP Functions β GOOD ORGANIZATION
Pattern: mcp_*
File count: 7 files
Function count: 40+ functions
Files:
pkg/workflow/
mcp-config.go, mcp_config_validation.go
mcp_gateway_constants.go, mcp_renderer.go
mcp_servers.go
pkg/cli/
mcp*.go (20+ files - separate CLI concern)
Analysis: Well organized β
- Clear separation of concerns
- Validation properly separated
- Constants properly isolated
Cluster 6: Expression Functions β GOOD ORGANIZATION
Pattern: expression_*
File count: 5 files
Function count: 30+ functions
Files:
pkg/workflow/
expression_builder.go
expression_extraction.go
expression_nodes.go
expression_parser.go
expression_validation.go
Analysis: Excellent organization β
- Clear separation by concern (builder, parser, validator)
- One feature per file principle followed
- Easy to navigate and understand
Cluster 7: Frontmatter Functions β GOOD ORGANIZATION
Pattern: frontmatter_*
File count: 5 files
Function count: 20+ functions
Files:
pkg/workflow/
frontmatter_error.go
frontmatter_extraction_metadata.go
frontmatter_extraction_security.go
frontmatter_extraction_yaml.go
frontmatter_types.go
Analysis: Well organized β
- Separation by aspect (metadata, security, YAML)
- Types properly isolated
- Error handling dedicated file
Cluster 8: Runtime Functions β GOOD ORGANIZATION
Pattern: runtime_*
File count: 6 files
Function count: 25+ functions
Files:
pkg/workflow/
runtime_deduplication.go
runtime_definitions.go
runtime_detection.go
runtime_overrides.go
runtime_step_generator.go
runtime_validation.go
Analysis: Well organized β
- Each aspect has dedicated file
- Clear separation of concerns
- Easy to locate specific functionality
3. Files with Potential Outlier Functions
Files where functions might not match primary purpose:
-
pkg/workflow/agentic_engine.go(195 lines)- Primary purpose: Agentic engine implementation
- Outliers found:
GetEngineRegistry()- Engine registry management (should beengine_registry.go)GenerateSecretValidationStep()- Secret validation (should beengine_secrets.go)GetCommonErrorPatterns()- Error patterns (should beengine_error_patterns.go)
- Recommendation: Split into 3 files by concern
-
pkg/workflow/permissions_validator.go- Primary purpose: Permission validation
- Outlier found:
GetToolsetsData()- 200+ lines of static configuration data
- Recommendation: Move data to
toolset_data.goor JSON file
-
pkg/workflow/compiler_safe_outputs_config.go(476 lines, 1 function!)- Primary purpose: Safe outputs configuration
- Issue: Single mega-function with repetitive logic
- Recommendation: See Issue rejig docsΒ #1 - refactor using registry pattern
4. Package-Level Organization Analysis
pkg/campaign (10 files) β EXCELLENT
campaign/
command.go, loader.go, orchestrator.go
spec.go, status.go, summary.go
template.go, validation.go
prompt_sections.go
project_update_contract_validator.go
Analysis: Well-organized, clear separation of concerns
pkg/cli (130+ files) β οΈ LARGE BUT ORGANIZED
cli/
commands.go (main command structure)
*_command.go (25+ command files)
compile_*.go (15+ compile-related files)
mcp_*.go (20+ MCP-related files)
logs_*.go (12+ logs-related files)
update_*.go (8+ update-related files)
Analysis:
- Very large package but logically grouped
- Command files properly separated
- Feature clusters (compile, mcp, logs, update) are identifiable
- Minor improvement: Could use subdirectories for large feature groups
pkg/console (10 files) β EXCELLENT
console/
accessibility.go, banner.go, confirm.go
console.go, format.go, layout.go
list.go, progress.go, render.go, spinner.go
Analysis: Perfect size and organization
pkg/parser (29 files) β WELL ORGANIZED
parser/
ansi_strip.go, content_extractor.go
frontmatter*.go (3 files)
github*.go (2 files)
import_*.go (4 files)
include_*.go (2 files)
schema_*.go (6 files)
schedule_parser.go
mcp.go, remote_fetch.go
Analysis: Good logical grouping, clear naming conventions
pkg/workflow (370+ files) β οΈ NEEDS SUBDIRECTORIES
Size: Largest package with 370+ files
Analysis: Flat structure is difficult to navigate
Recommendation: Organize into subdirectories:
pkg/workflow/
core/ # compiler.go, workflow.go, etc.
safe_outputs/ # All safe output files (50+ files)
validation/ # All validation files (27 files)
engines/ # All engine files (13 files)
expressions/ # Expression files (5 files)
runtime/ # Runtime files (6 files)
frontmatter/ # Frontmatter files (5 files)
mcp/ # MCP files (7 files)
helpers/ # config_helpers.go, map_helpers.go, etc.
5. Summary Statistics
| Metric | Count | Status |
|---|---|---|
| Total non-test Go files | 220 | - |
| Packages analyzed | 10 | - |
| Function clusters | 8 major | - |
| Well-organized clusters | 5 | β |
| Needs refactoring | 2 | |
| Needs framework | 1 | |
| Files with duplicated code | 60+ | |
| Estimated duplicate lines | ~2,000 | |
| Files needing reorganization | 47 | - |
| Outlier functions found | 12 | - |
6. Refactoring Roadmap
Phase 1: Critical Duplications (High Impact)
Priority: π΄ Critical
Estimated effort: 8-12 hours
Impact: Eliminate ~1,800 lines of duplicate code
Tasks:
- Refactor
compiler_safe_outputs_config.gomega-function using registry pattern - Create generic
ParseSafeOutputConfig[T]()function - Consolidate 60+
parse*Config()functions - Create handler config builder registry
- Move safe output files to subdirectory structure
Phase 2: Validation Framework (Medium Impact)
Priority: π‘ Important
Estimated effort: 6-8 hours
Impact: Consolidate 27 validation files, improve consistency
Tasks:
- Design and implement validation framework interface
- Extract static data from
permissions_validator.go - Group related validation files into subdirectories
- Create common validation helpers
- Update existing validators to use framework
Phase 3: Helper Adoption (Medium Impact)
Priority: π‘ Important
Estimated effort: 4-6 hours
Impact: Reduce ~500 lines of duplicate parsing code
Tasks:
- Document existing helper functions
- Refactor files to use
config_helpers.gofunctions - Refactor files to use
safe_output_builder.gofunctions - Add linting rules to detect helper reinvention
- Create developer guide for adding new safe outputs
Phase 4: Package Organization (Low Impact, Long-term)
Priority: π’ Enhancement
Estimated effort: 6-10 hours
Impact: Improved code navigation and maintainability
Tasks:
- Create subdirectory structure for
pkg/workflow - Move safe output files to subdirectories
- Move validation files to subdirectories
- Move compiler files to subdirectories
- Update import paths across codebase
- Update documentation
7. Implementation Checklist
Immediate Actions (Week 1)
- Review findings with team
- Prioritize refactoring tasks
- Create design document for generic config parser
- Create design document for handler registry
- Set up feature branch for refactoring work
Short-term Actions (Weeks 2-4)
- Implement generic
ParseSafeOutputConfig[T]()function - Refactor
compiler_safe_outputs_config.gousing registry - Consolidate
parse*Config()functions (batch 1: create_*) - Consolidate
parse*Config()functions (batch 2: update_*) - Add comprehensive tests for new generic functions
- Begin validation framework design
Long-term Actions (Months 2-3)
- Complete validation framework implementation
- Implement package reorganization
- Create developer documentation
- Add linting rules for code quality
- Consider code generation for remaining boilerplate
Analysis Metadata
- Repository: githubnext/gh-aw
- Analysis date: 2026-01-07
- Files analyzed: 220 non-test Go files
- Primary focus: pkg/workflow (largest package)
- Detection method: Pattern analysis, semantic clustering, manual code review
- Tools used: grep, file reading, structural analysis
Quick Action Items
- Immediate: Refactor
pkg/workflow/compiler_safe_outputs_config.go:17-476using registry pattern - This week: Create generic
ParseSafeOutputConfig[T]()to replace 60+ duplicate functions - This month: Implement validation framework to consolidate 27 validation files
- Long-term: Reorganize
pkg/workflow/into subdirectories (370+ files in flat structure)
Benefits of Refactoring
- Code reduction: ~2,000 lines of duplicate code eliminated
- Maintainability: Adding new safe output types becomes trivial
- Consistency: Validation handled uniformly across codebase
- Navigation: Subdirectory structure improves code discoverability
- Testing: Generic functions easier to test thoroughly
AI generated by Semantic Function Refactoring