[Code Quality] Improve comment density in complex files to meet 15% target #13881

@github-actions

Description

The codebase has a comment ratio of 10.39% across 503,851 lines of code, below the target of 15%. With 190 large files (>500 LOC) comprising 10.3% of the codebase, complex algorithmic sections and data structures need better inline documentation.

Problem

Current Metrics:

  • Comment lines: 40,418
  • Total LOC: 503,851
  • Comment ratio: 10.39% (actual) vs 15% (target)
  • Large files (>500 LOC): 190 files
  • Comment density score: 6.9/10 (69%)

Impact:

  • Complex code sections are harder to understand
  • New contributors face a steeper learning curve
  • Maintenance burden increases over time
  • Code reviews require more back-and-forth

Most Critical Files (likely candidates, pending analysis):

  • pkg/workflow/compiler.go (workflow compilation logic)
  • pkg/workflow/parser.go (frontmatter parsing)
  • pkg/workflow/*_validation.go (validation rules)
  • pkg/cli/*_command.go (CLI command logic)
  • pkg/mcp/server.go (MCP server implementation)

Suggested Changes

Step 1: Identify Target Files

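Create a script that lists large files with the lowest comment ratios (file size stands in for complexity here):

# Find large files (>500 lines) with the lowest comment density
find pkg/ -name "*.go" -not -name "*_test.go" -exec wc -l {} \; |
  awk '$1 > 500 {print $2}' |
  while read -r f; do
    # [[:space:]] matches the tabs gofmt indents with; grep -c prints 0 itself on no match
    c=$(grep -c '^[[:space:]]*//' "$f"); n=$(wc -l < "$f")
    echo "$f:$((100 * c / n))"   # comment lines as a percentage of all lines
  done |
  sort -t: -k2 -n | head -20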
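
The pipeline above matches only `//` line comments, so `/* ... */` blocks are not counted. A rough sketch of a per-file measurement in Go follows. The `countLines` helper is hypothetical, and it assumes the ratio is comment lines divided by non-blank lines, since the metrics report does not say which denominator it uses.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countLines returns the number of comment lines and non-blank lines in src.
// A line counts as a comment when it starts with "//" or "/*", or lies inside
// a "/* ... */" block. Trailing comments after code are not counted.
// This is a line-based heuristic, not a Go parser.
func countLines(src string) (comments, nonBlank int) {
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		nonBlank++
		switch {
		case inBlock:
			comments++
			if strings.Contains(line, "*/") {
				inBlock = false
			}
		case strings.HasPrefix(line, "//"):
			comments++
		case strings.HasPrefix(line, "/*"):
			comments++
			// Stay in block mode unless the comment closes on the same line.
			inBlock = !strings.Contains(line, "*/")
		}
	}
	return comments, nonBlank
}

func main() {
	src := "// Package demo is an example.\npackage demo\n\n/* block\n   comment */\nfunc F() {} // trailing\n"
	c, n := countLines(src)
	// Prints: 3/5 comment lines (60.0%)
	fmt.Printf("%d/%d comment lines (%.1f%%)\n", c, n, 100*float64(c)/float64(n))
}
```

Because it is line-based, a `/* ... */` that appears inside a string literal would be miscounted. A real tool would more likely build on the standard `go/parser` package.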

Step 2: Comment Guidelines

Focus on these areas in complex files:

Algorithm Explanations:

// ❌ Bad - states the obvious
// Loop through items
for _, item := range items {

// ✅ Good - explains why and algorithm choice
// Use a two-pass algorithm to avoid the O(n²) cost of comparing every pair
// when merging overlapping ranges.
// First pass: sort by start position. Second pass: merge adjacent ranges.
for _, item := range items {
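
To show how such a comment reads next to real code, here is a minimal runnable sketch of the sort-then-merge approach the "good" comment describes. The `Range` type and `mergeRanges` function are hypothetical and not taken from the codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// Range is a closed interval [Start, End]. Hypothetical type for illustration.
type Range struct{ Start, End int }

// mergeRanges returns the union of rs as a sorted list of non-overlapping ranges.
func mergeRanges(rs []Range) []Range {
	if len(rs) == 0 {
		return nil
	}
	// First pass: sort a copy by start position so that overlapping ranges
	// become neighbours. Comparing every pair instead would cost O(n²).
	sorted := append([]Range(nil), rs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })

	// Second pass: walk once, extending the last merged range whenever the
	// next one starts at or before its end.
	merged := []Range{sorted[0]}
	for _, r := range sorted[1:] {
		last := &merged[len(merged)-1]
		if r.Start <= last.End {
			if r.End > last.End {
				last.End = r.End
			}
			continue
		}
		merged = append(merged, r)
	}
	return merged
}

func main() {
	fmt.Println(mergeRanges([]Range{{8, 10}, {1, 3}, {2, 6}})) // [{1 6} {8 10}]
}
```

The comments name the reason for the sort and the invariant the loop relies on, which is what a reader cannot recover from the code alone.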

Complex Data Structures:

// ❌ Bad - no explanation
type ValidationContext struct {
    rules []Rule
    cache map[string]bool
    depth int
}

// ✅ Good - explains purpose and invariants
// ValidationContext tracks validation state across recursive validation passes.
// 
// Fields:
// - rules: Active validation rules (immutable after initialization)
// - cache: Memoization cache for expensive validations (key: validation ID)
// - depth: Current recursion depth (used to prevent infinite loops, max: 10)
type ValidationContext struct {
    rules []Rule
    cache map[string]bool
    depth int
}
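
A documented invariant is only useful if the code keeps it true. The sketch below shows one way the depth limit and cache described in the comment might be enforced. `Rule`, `maxDepth`, `newContext`, `descend`, and `check` are all hypothetical names introduced for illustration and are not the project's API.

```go
package main

import (
	"errors"
	"fmt"
)

// maxDepth is the recursion limit mentioned in the ValidationContext comment.
// A named constant lets comments refer to it instead of repeating the number.
const maxDepth = 10

var errTooDeep = errors.New("validation: maximum recursion depth exceeded")

// Rule is a hypothetical validation rule: an ID plus a check function.
type Rule struct {
	ID    string
	Check func() bool
}

// ValidationContext tracks validation state across recursive validation passes.
type ValidationContext struct {
	rules []Rule          // active rules; not modified after newContext returns
	cache map[string]bool // memoized results, keyed by validation (rule) ID
	depth int             // current recursion depth; never exceeds maxDepth
}

func newContext(rules []Rule) *ValidationContext {
	return &ValidationContext{rules: rules, cache: map[string]bool{}}
}

// descend runs fn one recursion level deeper and restores depth afterwards.
// It refuses to go past maxDepth, which is what prevents infinite loops.
func (c *ValidationContext) descend(fn func() error) error {
	if c.depth >= maxDepth {
		return errTooDeep
	}
	c.depth++
	defer func() { c.depth-- }()
	return fn()
}

// check evaluates r at most once per context, reusing the cached result.
func (c *ValidationContext) check(r Rule) bool {
	if ok, seen := c.cache[r.ID]; seen {
		return ok
	}
	ok := r.Check()
	c.cache[r.ID] = ok
	return ok
}

func main() {
	ctx := newContext(nil)
	var recurse func() error
	recurse = func() error { return ctx.descend(recurse) }
	fmt.Println(recurse()) // validation: maximum recursion depth exceeded
}
```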

Edge Cases:

// ❌ Bad - magic check without explanation
if len(items) == 0 || items[0].ID == "" {
    return nil
}

// ✅ Good - explains the edge case
// Handle an empty slice or an uninitialized first item (occurs when the workflow has no jobs).
// Return nil to signal "no validation needed" rather than an error.
if len(items) == 0 || items[0].ID == "" {
    return nil
}

Step 3: Target Comment Ratio

Aim for these ratios by file type:

  • Validators: 15-20% (complex logic, many edge cases)
  • Parsers: 12-15% (data transformation logic)
  • CLI commands: 10-12% (mostly straightforward)
  • Utilities: 8-10% (simple helpers)

Step 4: Review Process

  • Identify top 20 complex files with low comment ratios
  • Add explanatory comments for algorithmic sections
  • Document complex data structures with field explanations
  • Explain edge case handling with "why" comments
  • Review with team to ensure comments add value
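  • Run go doc (with -u to include unexported symbols) to verify godoc output quality; comments inside function bodies do not appear there and need to be checked in review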

Success Criteria

  • Overall comment ratio improves from 10.39% to at least 12%, as a first step toward the 15% target
  • Top 20 most complex files have at least 15% comment ratio
  • Algorithm choices documented with rationale
  • Complex data structures have field-level documentation
  • Edge cases explained with "why" comments
  • All tests pass after adding comments
  • No "obvious" comments that state what code already shows

Files Affected

Focus on these categories (exact files TBD after analysis):

  • pkg/workflow/compiler*.go (compilation logic)
  • pkg/workflow/*_validation.go (validation rules)
  • pkg/workflow/parser*.go (parsing logic)
  • pkg/cli/*_command.go (complex CLI commands)
  • pkg/mcp/*.go (MCP server implementation)

Source

Extracted from Daily Code Metrics Report discussion #13148

Discussion finding: "Comment Ratio: 10.39% (actual) vs 15% (target). With 190 large files (>500 LOC), complex code sections need better explanation."

Priority

Low - Code maintainability improvement, not blocking functionality.

AI generated by Discussion Task Miner - Code Quality Improvement Agent

  • expires on Feb 6, 2026, 9:14 AM UTC
