Skip to content

[Code Quality] Fix TOCTOU Race Condition in Docker Image Download Logic #11880

@github-actions

Description

@github-actions

Description

The StartDockerImageDownload function in pkg/cli/docker_images.go has a time-of-check-time-of-use (TOCTOU) race condition that can cause duplicate goroutines and resource leaks in concurrent Docker image download operations.

Problem

The availability check (IsDockerImageAvailable) happens outside the mutex lock, creating a race window:

func StartDockerImageDownload(image string) bool {
    // ❌ Check availability WITHOUT lock
    if IsDockerImageAvailable(image) {
        return false
    }

    // ✅ Lock acquired HERE - too late!
    pullState.mu.Lock()
    if pullState.downloading[image] {
        pullState.mu.Unlock()
        return false
    }
    pullState.downloading[image] = true
    pullState.mu.Unlock()
    
    go func() { /* download */ }()
    return true
}

Race Scenario

  1. Thread A checks IsDockerImageAvailable(image) → returns false
  2. Thread B checks IsDockerImageAvailable(image) → returns false
  3. Image becomes available externally (e.g., pulled by another process)
  4. Thread A acquires lock, marks downloading, spawns goroutine
  5. Thread B sees downloading flag, returns false
  6. Result: Unnecessary goroutine wastes resources pulling already-available image

Impact

  • Severity: Critical (concurrency bug)
  • Risk: Duplicate download attempts, resource waste, confusing logs
  • Hot path: Called by CheckAndPrepareDockerImages (line 167)

Suggested Changes

Move the availability check inside the lock for atomic check-and-set:

func StartDockerImageDownload(image string) bool {
    // ✅ Atomic check-and-set under lock
    pullState.mu.Lock()
    defer pullState.mu.Unlock()
    
    // Check availability (now atomic with downloading check)
    if IsDockerImageAvailable(image) {
        dockerImagesLog.Printf("Image %s is already available", image)
        return false
    }
    
    // Check if already downloading
    if pullState.downloading[image] {
        dockerImagesLog.Printf("Image %s is already downloading", image)
        return false
    }
    
    // Mark as downloading and spawn goroutine
    pullState.downloading[image] = true
    
    go func() {
        defer func() {
            pullState.mu.Lock()
            delete(pullState.downloading, image)
            pullState.mu.Unlock()
        }()
        
        // Download logic (may need RLock optimization if expensive)
        /* ... */
    }()
    
    return true
}

Alternative: Double-Checked Locking

If IsDockerImageAvailable is expensive (calls docker image inspect):

  1. Check availability under RLock first (fast path)
  2. Upgrade to exclusive Lock only if download needed
  3. Re-check availability after upgrade (prevents race)

Files Affected

  • pkg/cli/docker_images.go:68-83 - Main race condition
  • pkg/cli/docker_images_test.go - Add concurrency test

Success Criteria

  • Availability check moved inside mutex lock
  • No race conditions detected: go test -race ./pkg/cli/...
  • Existing tests pass: go test ./pkg/cli/docker_images_test.go
  • New concurrency test added (multiple goroutines calling StartDockerImageDownload)
  • No performance regression (verify lock is not held excessively)
  • Check for deadlocks if IsDockerImageAvailable also takes locks
  • make agent-finish passes

Priority

Critical - Concurrency bug that can cause resource leaks and duplicate operations

Estimated Effort: Medium (4-6 hours including testing and lock analysis)

Source

Extracted from Sergo Concurrency & Error Flow Analysis - Discussion #10665

Analysis Quote:

"Critical Finding: Discovered a time-of-check-time-of-use (TOCTOU) race condition in pkg/cli/docker_images.go:68-83 that could cause duplicate goroutines and resource leaks in Docker image download operations."

References:

AI generated by Discussion Task Miner - Code Quality Improvement Agent

  • expires on Feb 9, 2026, 2:09 PM UTC

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions