perf: Move regex compilation to package level to eliminate hot-path overhead #11885

Merged
pelikhan merged 2 commits into main from copilot/move-regex-compilation
Jan 26, 2026

Conversation

Copilot AI commented Jan 26, 2026

Regex patterns were being recompiled on every function call in workflow compilation hot paths. For workflows with 100+ jobs, this caused O(n) compilation overhead.

Changes

Moved regex compilation to package-level variables in:

  • compiler_jobs.go - runtimeImportMacroRe (called per-job during compilation)
  • expression_extraction.go - expressionExtractionRegex (called during expression mapping)
  • template_validation.go - templateRegionPattern (called during template validation)
  • repo_memory.go - branchPrefixValidPattern (called during repo-memory validation)

Pattern:

// Before: Compiled on every function call
func containsRuntimeImports(markdownContent string) bool {
    macroRe := regexp.MustCompile(`\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`)
    matches := macroRe.FindAllStringSubmatch(markdownContent, -1)
    // ...
}

// After: Compiled once at package initialization
var runtimeImportMacroRe = regexp.MustCompile(`\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`)

func containsRuntimeImports(markdownContent string) bool {
    matches := runtimeImportMacroRe.FindAllStringSubmatch(markdownContent, -1)
    // ...
}

Follows existing pattern from expression_validation.go. Estimated ~99ms savings on 100-job workflows.
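The pattern above can be exercised as a small standalone program. This is a minimal sketch, not the code merged in this PR: the sample inputs are hypothetical, and it swaps `FindAllStringSubmatch` for `MatchString` on the assumption that only a yes/no answer is needed.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once at package initialization. MustCompile panics on an invalid
// pattern, so a typo surfaces at program start rather than on first use.
var runtimeImportMacroRe = regexp.MustCompile(`\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`)

// containsRuntimeImports reports whether the content holds at least one
// runtime-import macro. MatchString is an assumption of this sketch; it
// avoids building the slice of submatches when only a boolean is needed.
func containsRuntimeImports(markdownContent string) bool {
	return runtimeImportMacroRe.MatchString(markdownContent)
}

func main() {
	// Hypothetical inputs, for illustration only.
	fmt.Println(containsRuntimeImports("{{#runtime-import shared/setup.md}}"))
	fmt.Println(containsRuntimeImports("{{#runtime-import? optional.md}}"))
	fmt.Println(containsRuntimeImports("no macros here"))
}
```

A compiled `*regexp.Regexp` is safe for concurrent use by multiple goroutines, so sharing one package-level value across jobs needs no extra locking.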

Original prompt

This section details the original issue you should resolve

<issue_title>[Code Quality] Move Regex Compilation to Package Level to Avoid Hot-Path Recompilation</issue_title>
<issue_description>## Description

The workflow compilation pipeline recompiles identical regex patterns on every function call instead of compiling once at package initialization, causing O(n) compilation overhead during workflow builds.

Problem

In pkg/workflow/compiler_jobs.go:455, the containsRuntimeImports() function calls regexp.MustCompile() inside the function:

func containsRuntimeImports(markdownContent string) bool {
    macroPattern := `\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`
    macroRe := regexp.MustCompile(macroPattern)  // ❌ Compiled EVERY call
    matches := macroRe.FindAllStringSubmatch(markdownContent, -1)
    return len(matches) > 0
}

This function is called per-job during workflow compilation. For workflows with 100+ jobs, the same regex is compiled 100+ times.

Impact

  • Performance: O(n) regex compilation overhead scales with workflow size
  • CPU waste: Identical pattern compiled repeatedly
  • Latency: Slower workflow builds for complex workflows

Benchmark Impact (Estimated)

  • Current: ~1ms per regex compilation × 100 jobs = 100ms overhead
  • After fix: 1ms one-time compilation + 0ms per job = 1ms total
  • Improvement: ~99ms saved on 100-job workflows
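The figures above are estimates, and the real per-compilation cost depends on the pattern and the machine. One rough way to check them is to time both shapes side by side. The sketch below is an assumption-laden illustration: the helper names (`matchRecompiling`, `matchPrecompiled`, `timeCalls`) and the job count are hypothetical, and wall-clock timing of 100 calls is noisy, so a `testing.B` benchmark would be the more reliable measurement.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

const macroPattern = `\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`

// Package-level pattern, compiled once at initialization.
var precompiled = regexp.MustCompile(macroPattern)

// matchRecompiling compiles the pattern on each call (the "before" shape).
func matchRecompiling(s string) bool {
	return regexp.MustCompile(macroPattern).MatchString(s)
}

// matchPrecompiled reuses the package-level pattern (the "after" shape).
func matchPrecompiled(s string) bool {
	return precompiled.MatchString(s)
}

// timeCalls runs f n times over the same input and returns the elapsed wall time.
func timeCalls(n int, f func(string) bool, input string) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		f(input)
	}
	return time.Since(start)
}

func main() {
	const jobs = 100 // hypothetical job count, matching the estimate above
	input := "{{#runtime-import shared/setup.md}}"
	fmt.Println("recompiling:", timeCalls(jobs, matchRecompiling, input))
	fmt.Println("precompiled:", timeCalls(jobs, matchPrecompiled, input))
}
```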

Additional Instances to Fix

Found in performance analysis:

  • pkg/workflow/compiler_jobs.go:455 - CONFIRMED CRITICAL (per-job hot path)
  • pkg/workflow/expression_extraction.go:45 - expressionRegex (needs review)
  • pkg/workflow/template_validation.go:50 - templateRegionPattern (needs review)
  • pkg/workflow/repo_memory.go:77 - validPattern (needs review)

Good Example Already in Codebase

pkg/workflow/expression_validation.go:65-69 already follows best practice:

// ✅ CORRECT - Compiled once at package level
var (
    expressionLog = logging.MustGetLogger("workflow.expression")
    
    // Regex patterns compiled at init
    githubContextRegex     = regexp.MustCompile(`\$\{\{ *github\.[a-zA-Z0-9_]+ *\}\}`)
    inputsRegex            = regexp.MustCompile(`\$\{\{ *inputs\.[a-zA-Z0-9_-]+ *\}\}`)
    secretsRegex           = regexp.MustCompile(`\$\{\{ *secrets\.[A-Z0-9_]+ *\}\}`)
)

Suggested Changes

Move regex compilation to package-level variable declarations:

// At package level (top of compiler_jobs.go):
var (
    compilerJobsLog = logging.MustGetLogger("workflow.compiler.jobs")
    
    // ✅ Regex patterns compiled once at init
    runtimeImportMacroRe = regexp.MustCompile(`\{\{#runtime-import\??[ \t]+([^\}]+)\}\}`)
)

// In containsRuntimeImports function:
func containsRuntimeImports(markdownContent string) bool {
    matches := runtimeImportMacroRe.FindAllStringSubmatch(markdownContent, -1)
    return len(matches) > 0
}

Files to Update

Confirmed Critical:

  • pkg/workflow/compiler_jobs.go:455 - Hot path per-job compilation

To Review:

  • pkg/workflow/expression_extraction.go:45
  • pkg/workflow/template_validation.go:50
  • pkg/workflow/repo_memory.go:77

Reference (already correct):

  • pkg/workflow/expression_validation.go - Follow this pattern ✅

Success Criteria

  • All in-function regexp.MustCompile() calls moved to package level
  • Benchmark shows improved compilation time: go test -bench=BenchmarkCompile -benchmem
  • No allocation increase: verify with -benchmem (should decrease)
  • All tests pass: go test ./pkg/workflow/... -v
  • Verify no in-function compilation remains: grep -rn "regexp.MustCompile" pkg/workflow/*.go | grep -v "var.*="
  • Large workflow compilation (100+ jobs) shows measurable improvement
  • make agent-finish passes
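The grep check in the list above filters on lines containing `var`, so entries inside a `var ( ... )` block (which do not repeat the keyword) would still be reported. A syntax-aware check avoids that. The following is a minimal sketch using the standard `go/parser` and `go/ast` packages; the function name `findInFunctionCompiles` and the demo source are hypothetical, and it assumes the `regexp` package is imported under its default name. It does not look inside function literals assigned at package level.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// findInFunctionCompiles returns the positions of regexp.MustCompile calls
// that appear inside function bodies in the given source.
func findInFunctionCompiles(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue // package-level var blocks are GenDecls and are skipped
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			sel, ok := call.Fun.(*ast.SelectorExpr)
			if !ok {
				return true
			}
			// Match the selector expression regexp.MustCompile.
			if pkg, ok := sel.X.(*ast.Ident); ok && pkg.Name == "regexp" && sel.Sel.Name == "MustCompile" {
				hits = append(hits, fset.Position(call.Pos()).String())
			}
			return true
		})
	}
	return hits, nil
}

func main() {
	// Hypothetical source: one package-level compile, one in-function compile.
	src := `package demo

import "regexp"

var ok = regexp.MustCompile("a+")

func bad(s string) bool {
	return regexp.MustCompile("b+").MatchString(s)
}
`
	hits, err := findInFunctionCompiles("demo.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(hits)
}
```

Running this over each file in pkg/workflow would report only the calls that still sit inside a function body.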

Priority

High - Performance optimization with clear benefit and low risk

Estimated Effort: Small (2-3 hours including verification across all files)

Source

Extracted from Sergo Performance Optimization Analysis - Discussion githubnext/gh-aw#11840

Analysis Quote:

"Regex Compilation in Hot Paths: pkg/workflow/compiler_jobs.go:455 - regexp.MustCompile() inside containsRuntimeImports() function. Function called per-job during workflow compilation (potentially 100+ times for complex workflows). Each call recompiles identical regex pattern."

Impact Assessment:

"Impact: Eliminate O(n) regex compilation overhead in workflow compilation"

References:

AI generated by Discussion Task Miner - Code Quality Improvement Agent

  • expires ...

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor regex compilation to package level" to "perf: Move regex compilation to package level to eliminate hot-path overhead" Jan 26, 2026
Copilot AI requested a review from pelikhan January 26, 2026 15:20
@pelikhan pelikhan marked this pull request as ready for review January 26, 2026 15:22
@pelikhan pelikhan merged commit 498f2ef into main Jan 26, 2026
138 checks passed
@pelikhan pelikhan deleted the copilot/move-regex-compilation branch January 26, 2026 15:28

Development

Successfully merging this pull request may close these issues.

[Code Quality] Move Regex Compilation to Package Level to Avoid Hot-Path Recompilation

2 participants