[file-diet] Refactor Large Go File: pkg/parser/schedule_parser.go (1084 lines) #11285

@agentic-workflows-dev

Overview

The file pkg/parser/schedule_parser.go has grown to 1084 lines, roughly 35% over the 800-line threshold considered healthy. Test coverage is strong (2031 lines of tests, a test-to-code ratio of ~1.87), but the file's size makes it hard to navigate and maintain. This task splits it into smaller, focused modules with clear boundaries.

Current State

  • File: pkg/parser/schedule_parser.go
  • Size: 1084 lines
  • Test Coverage: 2031 lines in schedule_parser_test.go (~1.87 ratio)
  • Complexity: High - contains 5 distinct functional areas mixed together

Problem Analysis

The file mixes multiple concerns:

  1. Cron detection - Pattern matching for different cron types
  2. Fuzzy scattering - Complex deterministic time distribution (290+ lines in ScatterSchedule)
  3. Parser orchestration - Tokenization and high-level parsing
  4. Expression parsing - Interval and base schedule parsing
  5. Time utilities - Time parsing and conversion helpers

Refactoring Strategy

Split the file into focused modules aligned with functional boundaries:

1. schedule_cron_detection.go (~150 lines)

Functions to move:

  • IsDailyCron(cron string) bool
  • IsHourlyCron(cron string) bool
  • IsWeeklyCron(cron string) bool
  • IsFuzzyCron(cron string) bool
  • IsCronExpression(input string) bool

Responsibility: Cron pattern detection and classification

Rationale: These are pure functions that analyze cron strings. They have no dependencies on the parser state and can be tested independently.
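
To illustrate why such functions are easy to test in isolation, here is a minimal sketch of a pure detection function. The name looksLikeDailyCron and its rules are hypothetical; the actual IsDailyCron logic in schedule_parser.go is not shown in this issue and may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// looksLikeDailyCron is a hypothetical stand-in for a detection function such
// as IsDailyCron. Assumption: a "daily" cron has a fixed numeric minute and
// hour, and wildcards for day-of-month, month, and day-of-week.
func looksLikeDailyCron(cron string) bool {
	fields := strings.Fields(cron)
	if len(fields) != 5 {
		return false
	}
	minute, errM := strconv.Atoi(fields[0])
	hour, errH := strconv.Atoi(fields[1])
	if errM != nil || errH != nil {
		return false
	}
	if minute < 0 || minute > 59 || hour < 0 || hour > 23 {
		return false
	}
	return fields[2] == "*" && fields[3] == "*" && fields[4] == "*"
}

func main() {
	// The function reads only its argument, so it needs no parser state.
	for _, c := range []string{"30 9 * * *", "0 * * * *", "30 9 * * 1", ""} {
		fmt.Printf("%q -> %v\n", c, looksLikeDailyCron(c))
	}
}
```

Because the function depends only on its input string, it can move to schedule_cron_detection.go without touching ScheduleParser.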

2. schedule_fuzzy_scatter.go (~320 lines)

Functions to move:

  • ScatterSchedule(fuzzyCron, workflowIdentifier string) (string, error) (290+ lines)
  • stableHash(s string, modulo int) int

Responsibility: Deterministic fuzzy schedule time distribution

Rationale: The ScatterSchedule function is self-contained with complex logic for different fuzzy patterns (DAILY, HOURLY, WEEKLY, etc.). Extracting it reduces the main file by ~30%.
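
The idea behind deterministic scattering can be sketched as follows. The stableHash signature is taken from the list above, but its FNV-1a body is an assumption, and scatterDaily is a hypothetical, much simplified stand-in for ScatterSchedule that handles only a daily pattern.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// stableHash maps s to a value in [0, modulo). The signature matches the one
// listed in this issue; the FNV-1a implementation is an assumption.
func stableHash(s string, modulo int) int {
	if modulo <= 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return int(uint64(h.Sum32()) % uint64(modulo))
}

// scatterDaily is a hypothetical helper: it picks a minute of the day from
// the workflow identifier and renders it as a five-field cron expression.
func scatterDaily(workflowIdentifier string) string {
	minuteOfDay := stableHash(workflowIdentifier, 24*60)
	return fmt.Sprintf("%d %d * * *", minuteOfDay%60, minuteOfDay/60)
}

func main() {
	// The same identifier always yields the same schedule.
	for _, id := range []string{"workflow-a", "workflow-b", "workflow-a"} {
		fmt.Printf("%s -> %s\n", id, scatterDaily(id))
	}
}
```

The property worth preserving through the refactor is that the output depends only on the inputs, so repeated runs for the same workflow produce the same cron expression.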

3. schedule_time_utils.go (~150 lines)

Functions to move:

  • parseTime(timeStr string) (minute string, hour string) (120 lines)
  • parseTimeToMinutes(hourStr, minuteStr string) int
  • mapWeekday(day string) string

Responsibility: Time parsing and conversion utilities

Rationale: These are helper functions used by the parser but don't depend on parser state. Can be reused by other modules.
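
As an example of a helper that has no dependency on parser state, here is a minimal sketch of parseTimeToMinutes. The signature comes from the list above; the body, including the choice to return -1 for invalid input, is an assumption and may not match the existing implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseTimeToMinutes converts an hour and a minute, both given as strings,
// into minutes since midnight. Assumption: invalid or out-of-range input
// yields -1; the real function may handle errors differently.
func parseTimeToMinutes(hourStr, minuteStr string) int {
	hour, err := strconv.Atoi(hourStr)
	if err != nil || hour < 0 || hour > 23 {
		return -1
	}
	minute, err := strconv.Atoi(minuteStr)
	if err != nil || minute < 0 || minute > 59 {
		return -1
	}
	return hour*60 + minute
}

func main() {
	fmt.Println(parseTimeToMinutes("9", "30"))
	fmt.Println(parseTimeToMinutes("24", "00"))
}
```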

4. schedule_parser_core.go (~250 lines)

Keep in this file:

  • ScheduleParser struct
  • ParseSchedule(input string) (cron, original, error) (entry point)
  • (p *ScheduleParser) tokenize() error
  • (p *ScheduleParser) parse() (string, error)

Responsibility: High-level parsing orchestration

Rationale: This is the main API surface. Keeping it focused on orchestration makes the flow clear.

5. schedule_parser_expressions.go (~250 lines)

Functions to move:

  • (p *ScheduleParser) parseInterval() (string, error) (150+ lines)
  • (p *ScheduleParser) parseBase() (string, error) (180+ lines)
  • (p *ScheduleParser) extractTime(startPos int) (string, error)
  • (p *ScheduleParser) extractTimeBetween(startPos, endPos int) (string, error)
  • (p *ScheduleParser) extractTimeAfter(startPos int) (string, error)

Responsibility: Schedule expression parsing (interval, base, time extraction)

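Rationale: These are the detailed parsing methods that consume tokens. They depend on parser state, but Go allows methods to be defined in any file of the same package, so they can live apart from the ScheduleParser struct in schedule_parser_core.go. Note that parseInterval (150+ lines) and parseBase (180+ lines) alone total 330+ lines, so the ~250-line estimate for this file is likely low.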
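
The split works because a Go type's methods may be spread across files in one package. The sketch below shows this in a single file, with comments marking where each part would live. The struct fields and the firstToken method are hypothetical; the real ScheduleParser layout is not shown in this issue.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// --- would live in schedule_parser_core.go ---

// ScheduleParser is a simplified stand-in; the field names are assumptions.
type ScheduleParser struct {
	input  string
	tokens []string
}

// tokenize splits the input into lowercase whitespace-separated tokens.
func (p *ScheduleParser) tokenize() error {
	p.tokens = strings.Fields(strings.ToLower(p.input))
	if len(p.tokens) == 0 {
		return errors.New("empty schedule expression")
	}
	return nil
}

// --- would live in schedule_parser_expressions.go ---

// firstToken is a hypothetical method. It reads p.tokens directly even though
// the struct is declared in another file of the same package.
func (p *ScheduleParser) firstToken() (string, error) {
	if len(p.tokens) == 0 {
		return "", errors.New("no tokens; call tokenize first")
	}
	return p.tokens[0], nil
}

func main() {
	p := &ScheduleParser{input: "Daily at 9am"}
	if err := p.tokenize(); err != nil {
		fmt.Println("error:", err)
		return
	}
	tok, err := p.firstToken()
	fmt.Println(tok, err)
}
```

Unexported fields and methods stay accessible after the move, so no visibility changes are needed.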

Implementation Plan

Step-by-step refactoring approach
  1. Establish a test baseline first

    • Run make test-unit to establish baseline
    • Ensure all existing tests pass before changes
  2. Extract utilities (no dependencies)

    • Create schedule_time_utils.go with time parsing functions
    • Create schedule_time_utils_test.go with relevant tests
    • Run tests to verify behavior unchanged
  3. Extract detection functions

    • Create schedule_cron_detection.go with Is*Cron functions
    • Create schedule_cron_detection_test.go with relevant tests
    • Run tests to verify behavior unchanged
  4. Extract fuzzy scattering

    • Create schedule_fuzzy_scatter.go with ScatterSchedule and stableHash
    • Create schedule_fuzzy_scatter_test.go with relevant tests
    • Run tests to verify behavior unchanged
  5. Split parser methods

    • Create schedule_parser_expressions.go with parsing methods
    • Move parseInterval, parseBase, extractTime* methods
    • Ensure methods can still access ScheduleParser receiver
    • Run tests to verify behavior unchanged
  6. Rename original file

    • Rename schedule_parser.go to schedule_parser_core.go
    • Keep only core orchestration (struct, ParseSchedule, tokenize, parse)
    • Run tests to verify behavior unchanged
  7. Update package documentation

    • Add file-level comments to each new file
    • Update any package-level documentation
  8. Final validation

    • Run make agent-finish to ensure all checks pass
    • Verify no breaking changes to public API
    • Review test coverage with go test -cover

Test Coverage Plan

Each new file should have a corresponding test file with at least 80% coverage:

schedule_cron_detection_test.go

  • Test each Is*Cron function with valid/invalid patterns
  • Edge cases: empty strings, malformed cron, boundary values
  • Target coverage: >85%

schedule_fuzzy_scatter_test.go

  • Test ScatterSchedule with all fuzzy patterns (DAILY, HOURLY, WEEKLY, etc.)
  • Test stableHash determinism and distribution
  • Test error cases (invalid patterns, out-of-range values)
  • Target coverage: >80%

schedule_time_utils_test.go

  • Test parseTime with various time formats
  • Test parseTimeToMinutes edge cases
  • Test mapWeekday with all weekday variations
  • Target coverage: >85%

schedule_parser_expressions_test.go

  • Test parseInterval with short/long duration formats
  • Test parseBase with daily/weekly/monthly patterns
  • Test extractTime* methods with various input positions
  • Target coverage: >80%

schedule_parser_core_test.go

  • Integration tests ensuring all modules work together
  • Test ParseSchedule end-to-end with complex expressions
  • Target coverage: >85%
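
One way to structure the test files above is a table of cases, sketched here for mapWeekday. The signature comes from this issue, but the body is an assumption (standard cron day-of-week numbers with Sunday as 0, and an empty string for unknown input). To keep the sketch runnable on its own, the loop collects failure messages instead of using the testing package.

```go
package main

import (
	"fmt"
	"strings"
)

// mapWeekday uses the signature listed in this issue. The body is an
// assumption: it returns the cron day-of-week number (Sunday = 0) and ""
// for anything it does not recognize.
func mapWeekday(day string) string {
	switch strings.ToLower(strings.TrimSpace(day)) {
	case "sunday", "sun":
		return "0"
	case "monday", "mon":
		return "1"
	case "tuesday", "tue":
		return "2"
	case "wednesday", "wed":
		return "3"
	case "thursday", "thu":
		return "4"
	case "friday", "fri":
		return "5"
	case "saturday", "sat":
		return "6"
	}
	return ""
}

// weekdayCase is one row of the test table.
type weekdayCase struct {
	name string
	in   string
	want string
}

// runWeekdayCases returns one message per failing case. In a real _test.go
// file the loop body would use t.Run(tc.name, ...) and t.Errorf instead.
func runWeekdayCases(cases []weekdayCase) []string {
	var failures []string
	for _, tc := range cases {
		if got := mapWeekday(tc.in); got != tc.want {
			failures = append(failures,
				fmt.Sprintf("%s: mapWeekday(%q) = %q, want %q", tc.name, tc.in, got, tc.want))
		}
	}
	return failures
}

func main() {
	cases := []weekdayCase{
		{name: "full name", in: "monday", want: "1"},
		{name: "abbreviation", in: "Fri", want: "5"},
		{name: "surrounding space", in: " sunday ", want: "0"},
		{name: "empty string", in: "", want: ""},
		{name: "unknown word", in: "someday", want: ""},
	}
	failures := runWeekdayCases(cases)
	for _, f := range failures {
		fmt.Println(f)
	}
	fmt.Printf("%d of %d cases failed\n", len(failures), len(cases))
}
```

Adding an edge case then means adding one row to the table, which helps when existing cases are moved from schedule_parser_test.go into the new files.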

Implementation Guidelines

  1. Preserve Behavior: All existing test cases must pass with their assertions unchanged; tests may be moved into the new test files but not rewritten
  2. Maintain Exports: Public API (ParseSchedule, IsCronExpression, etc.) unchanged
  3. Add Tests First: Create test files for new modules before moving code
  4. Incremental Changes: Split one module at a time, commit after each successful split
  5. Run Tests Frequently: Verify make test-unit passes after each module extraction
  6. Imports: The new files stay in the same parser package, so callers need no import changes; each new file only needs to import the packages it uses itself
  7. Document Boundaries: Add clear file-level comments explaining each module's responsibility

Acceptance Criteria

  • Original file split into 5 focused files
  • Each new file is under 350 lines
  • All existing tests pass (make test-unit)
  • New test files created with ≥80% coverage per module
  • No breaking changes to public API (ParseSchedule, Is*Cron functions)
  • Code passes linting (make lint)
  • Code passes formatting (make fmt)
  • Build succeeds (make build)
  • Integration tests verify end-to-end functionality

Additional Context

  • Repository Guidelines: Follow patterns in AGENTS.md
  • Code Organization: Prefer many small files grouped by functionality (per AGENTS.md)
  • Testing: Match existing test patterns in pkg/parser/*_test.go
  • Logger: Each new file should have its own logger (e.g., logger.New("parser:schedule_fuzzy_scatter"))

Estimated Impact

Priority: Medium
Effort: Medium (5-8 hours estimated)
Benefits:

  • Maintainability: Easier to locate and modify specific functionality
  • Testing: Clearer test organization, with each module's tests in its own file
  • Complexity: Reduced cognitive load when reading/debugging
  • Reusability: Utility functions can be used by other parser components

References

AI generated by Daily File Diet
