Description
A comprehensive semantic analysis of 462 non-test Go files across the pkg/ directory identified significant opportunities for improving code organization through function clustering, consolidation, and refactoring.
Executive Summary
Analysis Scope:
- 462 Go files analyzed (454 initially discovered + 8 additional)
- 3,500+ total functions cataloged (2,000+ exported, 1,500+ unexported)
- Primary packages:
  - pkg/workflow/: 241 files (52% of codebase)
  - pkg/cli/: 151 files (33%)
  - pkg/parser/: 29 files (6%)
  - pkg/campaign/: 11 files (2%)
  - Utility packages: 30 files (7%)
Key Findings:
- ✅ Excellent patterns: Parser, console, and campaign packages show strong organization
- ⚠️ Fragmentation issues: Workflow and CLI packages have 100+ small files that could be consolidated
- 🔄 Duplicate code: Multiple implementations of parsing and validation logic detected
- 📁 Helper scatter: Format/parse/validate helpers spread across 40+ files
Critical Issues Identified
Issue #1: Compiler Safe Outputs Fragmentation (14 files → 4 files)
Current structure: 14 separate files for safe outputs compilation
- compiler_safe_outputs.go, compiler_safe_outputs_core.go
- compiler_safe_outputs_config.go, compiler_safe_outputs_env.go
- compiler_safe_outputs_job.go, compiler_safe_outputs_jobs.go, compiler_safe_outputs_steps.go
- compiler_safe_outputs_shared.go, compiler_safe_outputs_specialized.go
- compiler_safe_outputs_discussions.go
- Plus 4 additional config files
Issue: Related functions split across too many files (6-12 functions per file), making code navigation difficult.
Recommendation:
``````
Consolidate to 4 focused files:
1. compiler_safe_outputs.go - Main orchestration
2. compiler_safe_outputs_config.go - Config generation (merge 5 config files)
3. compiler_safe_outputs_env.go - Environment setup
4. compiler_safe_outputs_jobs.go - Job/step generation (merge 6 job-related files)
``````
**Estimated Impact:** Reduced file count, improved discoverability, easier maintenance
---
#### Issue #2: Safe Outputs Config Generation Split (8 files → 2 files)
**Current structure:** Configuration generation fragmented across:
- `safe_outputs_config.go` (6 functions)
- `safe_outputs_config_generation.go` (8 functions)
- `safe_outputs_config_generation_helpers.go` (10 functions)
- `safe_outputs_config_helpers.go` (9 functions)
- `safe_outputs_config_helpers_reflection.go` (6 functions)
- `safe_outputs_config_messages.go` (4 functions)
**Issue:** Semantically related config generation logic scattered across 6 files.
**Recommendation:**
``````
Consolidate to 2 files:
1. safe_outputs_config.go - Main types and core functions
2. safe_outputs_config_generation.go - All generation logic + helpers
``````
**Files to merge:** Generation, generation_helpers, config_helpers, helpers_reflection, messages
---
#### Issue #3: Permissions.go Overly Large (36 functions → 2-3 files)
**File:** `pkg/workflow/permissions.go` (36 functions - 800+ lines)
**Current mix:**
- **Parsing:** `NewPermissionsParser()`, `parse()`, `IsShorthand()`, `GetPermissions()`
- **Building:** `NewPermissions()`, `NewPermissionsReadAll()`, 10+ builder constructors
- **Utilities:** `ContainsCheckout()`, `GetAllPermissionScopes()`, conversion functions
**Issue:** Single file handles parsing, building, and utility functions - violates Single Responsibility Principle.
**Recommendation:**
``````
Split into focused files:
1. permissions_parser.go - Parsing logic (NewPermissionsParser, parse, getters)
2. permissions_builder.go - Builder/constructor functions (New* functions)
3. permissions_utilities.go - Utility functions (scope getters, converters)
Keep: permissions_validator.go (already exists)
``````
**Location:** pkg/workflow/permissions.go:1
---
#### Issue #4: Duplicate GitHub URL/Repo Parsing Functions
**Duplicate implementations detected:**
<details>
<summary><b>Locations with similar parsing logic</b></summary>
1. **extractBaseRepo()** - appears in multiple locations:
- pkg/workflow/action_pins.go (~line 150)
- pkg/workflow/action_resolver.go (~line 80)
- Both parse GitHub action repo strings identically
2. **GitHub URL parsing** scattered across:
- pkg/parser/github.go - `ParseGitHubURL()`
- pkg/parser/github_urls.go - `ParseGitHubURL()` variants, `ParsePRURL()`, `ParseRunURL()`
- pkg/workflow/github_tool_to_toolset.go - Custom GitHub parsing
- pkg/repoutil/repoutil.go - Repository slug parsing
3. **Target repo parsing** duplicated:
- `parseTargetRepo()` functions in 3+ different files
- Similar logic, different contexts
</details>
**Recommendation:**
``````
1. Consolidate extractBaseRepo() into single location (action_resolver.go)
2. Create unified GitHub URL parsing in pkg/parser/github_urls.go
3. Eliminate duplicate parseTargetRepo() implementations
``````
**Estimated Impact:** Reduced duplication, single source of truth for parsing logic
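
A minimal sketch of what a single shared `extractBaseRepo()` could look like, assuming the input is a GitHub Actions reference of the form `owner/repo[/path]@ref`. The signature and behavior shown here are assumptions for illustration only; the existing implementations in action_pins.go and action_resolver.go may differ and should be diffed before one replaces the other.

```go
package main

import "strings"

// extractBaseRepo is an illustrative (assumed) consolidated helper. Given an
// action reference such as "owner/repo/sub/path@v1", it returns "owner/repo".
// Inputs without an "owner/repo" prefix are returned with only the ref stripped.
func extractBaseRepo(uses string) string {
	// Strip the "@ref" suffix (tag, branch, or SHA) if present.
	if i := strings.Index(uses, "@"); i >= 0 {
		uses = uses[:i]
	}
	// Keep only the first two path segments: owner and repo.
	parts := strings.SplitN(uses, "/", 3)
	if len(parts) < 2 {
		return uses
	}
	return parts[0] + "/" + parts[1]
}
```

Under these assumptions, `extractBaseRepo("github/codeql-action/init@v3")` yields `github/codeql-action`. Keeping one unexported copy in a single file means both call sites share the same edge-case handling.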
---
#### Issue #5: Validation Functions Scattered Across 40+ Files
**Pattern:** Excellent validation architecture documented in `pkg/workflow/validation.go`, but implementation incomplete.
**Properly consolidated (✅ good examples):**
- `runtime_validation.go` (6 functions)
- `repository_features_validation.go` (5 functions)
- `agent_validation.go` (5 functions)
- `expression_validation.go` (13 functions)
- `strict_mode_validation.go` (7 functions)
**Improperly scattered (⚠️ needs consolidation):**
- Validation in entity files: `create_issue.go`, `create_discussion.go`, `add_comment.go` (1-2 validate functions each)
- 3 separate bundler validation files (safety, script, runtime)
- Small validation files with 2-3 functions: `firewall_validation.go`, `sandbox_validation.go`
**Recommendation:**
``````
1. Create entity_validation.go for entity-specific validations currently in create*/update* files
2. Merge bundler validations: bundler_safety_validation.go + bundler_script_validation.go + bundler_runtime_validation.go → bundler_validation.go
3. Merge small files:
- firewall_validation.go (2-3 functions) → engine_validation.go
- sandbox_validation.go (2-3 functions) → strict_mode_validation.go
- dangerous_permissions_validation.go → permissions_validator.go
``````
**Impact:** Reduce 40+ validation files to ~30, improve consistency with documented architecture
---
#### Issue #6: Helper Functions Fragmentation
**Problem:** Format, parse, and utility helpers scattered across 40+ files.
<details>
<summary><b>Examples of scattered helpers</b></summary>
**String/Format helpers:**
- `formatYAMLValue()` in one location
- `formatValidationOutput()` in validation_helpers.go
- `formatTemplateInjectionError()` in template_injection_validation.go
- `formatDangerousPermissionsError()` in dangerous_permissions_validation.go
- `formatBlockedDomains()` in safe_outputs_domains_validation.go
- `formatNetworkAccess()` in a scattered location
- `formatMissingPermissionsMessage()` in a scattered location
**Parse/Extract helpers:**
- `extractBaseRepo()` in action_resolver.go and action_pins.go
- `parseTargetRepo()` in 3 different files
- `parseTimeoutTool()` in isolated location
- `parseTimeDelta()` in multiple locations
**Validation helpers:**
- `validateDomainPattern()` in multiple validation files
- `validateTargetRepoSlug()` in multiple places
- `validateNoTemplateInjection()` pattern repeated
- `validateNoExecSync()`, `validateNoModuleReferences()` - similar patterns
</details>
**Recommendation:**
``````
Create unified helper modules:
1. pkg/workflow/helpers_format.go - All format* functions (~20 functions)
2. pkg/workflow/helpers_parse.go - All parse/extract helpers (~15 functions)
3. pkg/workflow/helpers_build.go - All build* generators (~10 functions)
4. pkg/workflow/helpers_validate.go - Standalone validators (~10 functions)
``````
**Benefit:** Centralized, discoverable utilities; eliminate duplication
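
One rough way to triage which helper belongs in which of the proposed files is to route by verb prefix. The sketch below is an assumption about how such a triage could work, not a description of the analysis that produced this report; `suggestHelperFile` and `helperTargets` are hypothetical names, and only unexported (lowercase-prefixed) helpers are matched.

```go
package main

import (
	"strings"
	"unicode"
)

// helperTargets maps a lowercase verb prefix to one of the helper files
// proposed in the recommendation. "extract" shares helpers_parse.go because
// the proposal groups parse and extract helpers together.
var helperTargets = []struct{ prefix, file string }{
	{"format", "helpers_format.go"},
	{"parse", "helpers_parse.go"},
	{"extract", "helpers_parse.go"},
	{"build", "helpers_build.go"},
	{"validate", "helpers_validate.go"},
}

// suggestHelperFile returns the proposed destination file for a helper
// function name, or "" if the name does not start with a known verb prefix.
func suggestHelperFile(funcName string) string {
	for _, t := range helperTargets {
		if !strings.HasPrefix(funcName, t.prefix) {
			continue
		}
		rest := funcName[len(t.prefix):]
		// Require a camelCase boundary so "formatter" does not match "format".
		if rest == "" || unicode.IsUpper(rune(rest[0])) {
			return t.file
		}
	}
	return ""
}
```

A name-based pass like this only produces candidates; helpers that are tightly coupled to a single validator or entity may still be better left next to their only caller.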
---
#### Issue #7: CLI Compile Subsystem Fragmentation (14 files → 7 files)
**Current structure:** 14 compile_* files in pkg/cli/
- `compile_command.go`, `compile_config.go`, `compile_compiler_setup.go`
- `compile_batch_operations.go`, `compile_campaign.go`
- `compile_helpers.go`, `compile_orchestration.go`, `compile_orchestrator.go`
- `compile_output_formatter.go`, `compile_post_processing.go`, `compile_stats.go`
- `compile_validation.go`, `compile_watch.go`, `compile_workflow_processor.go`
**Issue:** Overlapping concerns - multiple files handle orchestration, multiple handle output.
**Recommendation:**
``````
Consolidate to 7 files:
1. compile_command.go - Main entry point (keep as-is)
2. compile_config.go - Configuration (merge compile_compiler_setup.go)
3. compile_processor.go - Processing (merge orchestration + orchestrator + workflow_processor)
4. compile_campaign.go - Campaigns (keep as-is)
5. compile_validation.go - Validation (merge validation + relevant helpers)
6. compile_output.go - Output (merge output_formatter + stats + post_processing)
7. compile_watch.go - Watch mode (keep as-is)
``````
**Impact:** 14 files → 7 files, clearer concern separation
---
#### Issue #8: CLI Codemod Subsystem Over-Fragmentation (15 files → 6 files)
**Current structure:** 15 separate codemod files, one per operation
- `codemod_agent_session.go`, `codemod_discussion_flag.go`, `codemod_grep_tool.go`
- `codemod_mcp_network.go`, `codemod_network_firewall.go`, `codemod_permissions.go`
- `codemod_safe_inputs.go`, `codemod_sandbox_agent.go`, `codemod_schedule.go`
- `codemod_schema_file.go`, `codemod_slash_command.go`, `codemod_timeout_minutes.go`
- `codemod_upload_assets.go`, `codemod_yaml_utils.go`
**Issue:** Each codemod in separate file, even when semantically related.
**Recommendation:**
``````
Group by functional area (15 → 6 files):
1. codemod_core.go - Core modifications (permissions, schedule, timeout)
2. codemod_infrastructure.go - Infrastructure (firewall, network, sandbox)
3. codemod_features.go - Features (safe_inputs, upload_assets)
4. codemod_tools.go - Tooling (grep_tool, yaml_utils)
5. codemod_integration.go - Integration (agent_session, discussion_flag)
6. codemod_schema.go - Schema operations (keep focused)
``````
Impact: Better semantic grouping, reduced file proliferation
Detailed Recommendations by Priority
High Priority (Immediate Impact)
1. Extract Duplicate Functions
Action: Eliminate duplicate parseGitHub/parseRepo functions
- Create pkg/parser/github_url_utils.go or consolidate in existing parser files
- Remove duplicates in workflow and repoutil packages
- Files affected: action_pins.go, action_resolver.go, github_tool_to_toolset.go, repoutil.go
Benefit: Single source of truth for URL parsing, reduced maintenance burden
2. Consolidate Bundler Validation (3 → 1 file)
Action: Merge bundler validation files
- Combine: bundler_safety_validation.go, bundler_script_validation.go, bundler_runtime_validation.go
- Create: pkg/workflow/bundler_validation.go with organized sections
Files to consolidate:
- pkg/workflow/bundler_safety_validation.go
- pkg/workflow/bundler_script_validation.go
- pkg/workflow/bundler_runtime_validation.go
3. Split Permissions.go (1 → 3 files)
Action: Refactor oversized permissions.go (36 functions)
- Extract: permissions_parser.go (parsing logic)
- Extract: permissions_builder.go (constructor functions)
- Keep utilities in focused file
Location: pkg/workflow/permissions.go:1
4. Create Unified Helpers Module
Action: Extract scattered helper functions to dedicated files
- Create: pkg/workflow/helpers_format.go (~20 format functions)
- Create: pkg/workflow/helpers_parse.go (~15 parse functions)
- Create: pkg/workflow/helpers_build.go (~10 build functions)
- Create: pkg/workflow/helpers_validate.go (~10 validation utilities)
Benefit: Centralized, discoverable utilities
Medium Priority (Next Phase)
5. Consolidate Safe Outputs Config (8 → 2 files)
Action: Merge safe_outputs_config_* files
- Keep: safe_outputs_config.go (types + core)
- Merge into safe_outputs_config_generation.go: generation, helpers, reflection, messages files
Files affected: 8 files in pkg/workflow/safe_outputs_config*.go
6. Reorganize CLI Codemod (15 → 6 files)
Action: Group codemods by functional area
- See Issue #8 above for detailed breakdown
Impact: Better semantic organization, easier to find related operations
7. Reduce Compiler Safe Outputs Files (14 → 4)
Action: Consolidate compiler_safe_outputs_* files
- See Issue #1 above for detailed breakdown
Impact: Clearer structure, easier navigation
8. Move Entity Helpers to Proper Locations
Action: Consolidate or distribute helper files
- Option A: Merge close_entity_helpers.go and update_entity_helpers.go into entity_operations_helpers.go
- Option B: Distribute functions into respective entity files as unexported helpers
Files affected:
- pkg/workflow/close_entity_helpers.go
- pkg/workflow/update_entity_helpers.go
Low Priority (Future Improvements)
9. Split Agentic Engine (1 → 2-3 files)
File: pkg/workflow/agentic_engine.go (29 functions)
Recommendation: Separate core logic from tools/execution
- agentic_engine_core.go - Main engine logic
- agentic_engine_tools.go - Tool handling
- agentic_engine_execution.go - Execution logic (if needed)
10. Consolidate Prompt Files (3 → 2)
Files:
- pkg/workflow/prompt_step.go
- pkg/workflow/prompt_step_helper.go
- pkg/workflow/unified_prompt_step.go
Recommendation: Merge helpers and unified into main prompt_step files
11. Reduce Frontmatter Fragmentation (10 → 6 files)
Action: Consolidate frontmatter_extraction_* files
- Merge metadata, security, YAML extraction into 2 focused files
Summary Statistics
File Metrics:
- Total analyzed: 462 non-test Go files
- Exported functions: 2,000+
- Unexported functions: 1,500+
- Files with 20+ functions: 15 (potential split candidates)
- Files with 1-5 functions: 200+ (consolidation candidates)
Largest files by function count:
- js.go - 46 functions
- permissions.go - 36 functions
- agentic_engine.go - 29 functions
- compiler_types.go - 28 functions
- expression_builder.go - 27 functions
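
The report does not say how the per-file function counts were gathered. As a rough sketch, counts of this kind can be reproduced with the standard `go/parser` and `go/ast` packages. The names `countFuncs` and `classify` are hypothetical; the thresholds (20+ functions as split candidates, 1-5 as consolidation candidates) are taken from the file metrics above.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
)

// countFuncs parses Go source text and counts top-level function and method
// declarations, split by whether the declared name is exported.
func countFuncs(src string) (exported, unexported int, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return 0, 0, err
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // skip type, var, const, and import declarations
		}
		if fn.Name.IsExported() {
			exported++
		} else {
			unexported++
		}
	}
	return exported, unexported, nil
}

// classify applies the report's thresholds to a file's total function count.
func classify(total int) string {
	switch {
	case total >= 20:
		return "split candidate"
	case total >= 1 && total <= 5:
		return "consolidation candidate"
	default:
		return "ok"
	}
}
```

Note that this counts a method as exported based on its own name, even when its receiver type is unexported, so the totals may not match the report's exactly.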
Most fragmented areas:
- Safe outputs subsystem (37 files)
- Compiler subsystem (60 files)
- CLI utilities (60+ scattered files)
- Validation functions (40+ files)
- Create/Update entity operations (41 files)
Best organized areas (✅ good patterns to follow):
- Parser package (29 files) - Clear semantic groups (frontmatter, imports, schedule, schema)
- Console package (10 files) - Focused concerns (format, render, layout, progress, spinner)
- Campaign package (11 files) - Clear module separation
- Engine subsystems - Consistent naming patterns (engine.go, _logs.go, _mcp.go, _tools.go)
Implementation Checklist
Phase 1: High-Priority Refactoring
- Extract duplicate GitHub URL parsing functions
- Consolidate bundler validation files (3 → 1)
- Split permissions.go (1 → 3)
- Create unified helpers module (4 new files)
Phase 2: Medium-Priority Consolidation
- Consolidate safe_outputs_config files (8 → 2)
- Reorganize CLI codemod subsystem (15 → 6)
- Reduce compiler_safe_outputs files (14 → 4)
- Move entity helpers to proper locations
Phase 3: Long-term Improvements
- Split oversized files (agentic_engine.go)
- Consolidate prompt files (3 → 2)
- Reduce frontmatter fragmentation (10 → 6)
- Systematic CLI package cleanup (150 → 100 files)
Phase 4: Testing & Validation
- Verify no functionality broken after refactoring
- Update tests to reflect new file structure
- Update documentation with new organization
- Validate all imports and references updated
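
Because moving functions between files of the same Go package should not change which functions the package declares, one cheap sanity check for a pure file reshuffle is to compare the set of declared function names before and after. The sketch below assumes the file contents are available as strings; `declaredFuncs` and `sameFuncs` are hypothetical helpers, and the check complements building and running the test suite rather than replacing them.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// declaredFuncs parses every file (filename -> source text) and returns the
// sorted names of all top-level functions. Methods are skipped for brevity.
func declaredFuncs(files map[string]string) ([]string, error) {
	fset := token.NewFileSet()
	var names []string
	for filename, src := range files {
		f, err := parser.ParseFile(fset, filename, src, 0)
		if err != nil {
			return nil, err
		}
		for _, decl := range f.Decls {
			if fn, ok := decl.(*ast.FuncDecl); ok && fn.Recv == nil {
				names = append(names, fn.Name.Name)
			}
		}
	}
	sort.Strings(names)
	return names, nil
}

// sameFuncs reports whether two file layouts declare exactly the same
// functions. It compares names only, not signatures or bodies.
func sameFuncs(before, after map[string]string) (bool, error) {
	a, err := declaredFuncs(before)
	if err != nil {
		return false, err
	}
	b, err := declaredFuncs(after)
	if err != nil {
		return false, err
	}
	if len(a) != len(b) {
		return false, nil
	}
	for i := range a {
		if a[i] != b[i] {
			return false, nil
		}
	}
	return true, nil
}
```

A `false` result after a split or merge suggests a function was dropped, duplicated, or renamed along the way and is worth investigating before the change is merged.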
Analysis Metadata
- Analysis Date: 2026-01-24
- Repository: githubnext/gh-aw
- Total Files Analyzed: 462 Go files (excluding tests)
- Detection Method: Semantic code analysis using naming patterns, function clustering, and code organization assessment
- Primary Focus: pkg/workflow (52% of files), pkg/cli (33% of files)
- Workflow Run: §21316929471
AI generated by Semantic Function Refactoring