Copilot AI commented Jan 19, 2026

Fix TOCTOU race condition in Docker image download logic

Issue: The StartDockerImageDownload function has a time-of-check-to-time-of-use (TOCTOU) race condition: the availability check happens outside the lock, leaving a window in which concurrent callers can start duplicate downloads.

Changes Made:

  1. Created isDockerImageAvailableUnlocked helper function - Internal function that checks image availability without taking locks, allowing it to be called while holding a write lock
  2. Refactored IsDockerImageAvailable - Now uses the unlocked helper with proper RLock for external callers
  3. Fixed StartDockerImageDownload - Moved both the availability and downloading checks inside a single lock scope using defer, eliminating the TOCTOU race condition (see the sketch after this list)
  4. Added comprehensive concurrency tests:
    • TestStartDockerImageDownload_ConcurrentCalls - Verifies only one goroutine starts download
    • TestStartDockerImageDownload_ConcurrentCallsWithAvailableImage - Verifies no download starts for available images
    • TestStartDockerImageDownload_RaceWithExternalDownload - Verifies race condition handling
  5. Merged main branch and recompiled all workflows - Incorporated latest changes from main and regenerated workflow lock files
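
A minimal sketch of the refactor described above. The pullState structure, its available map, and the body of the download goroutine are hypothetical stand-ins; the real pkg/cli/docker_images.go may track availability differently (e.g. by querying the Docker daemon), but the locking pattern is the one this PR describes.

// Sketch only: everything except the three function names quoted above is assumed.
package cli

import "sync"

type dockerPullState struct {
	mu          sync.RWMutex
	available   map[string]bool // hypothetical cache of locally available images
	downloading map[string]bool
}

var pullState = &dockerPullState{
	available:   make(map[string]bool),
	downloading: make(map[string]bool),
}

// isDockerImageAvailableUnlocked checks availability without taking the lock.
// Callers must already hold pullState.mu (read or write).
func isDockerImageAvailableUnlocked(image string) bool {
	return pullState.available[image]
}

// IsDockerImageAvailable is the public wrapper for external callers; it takes RLock.
func IsDockerImageAvailable(image string) bool {
	pullState.mu.RLock()
	defer pullState.mu.RUnlock()
	return isDockerImageAvailableUnlocked(image)
}

// StartDockerImageDownload performs both checks under one write lock, closing the
// TOCTOU window between "is it available?" and "is it already downloading?".
func StartDockerImageDownload(image string) bool {
	pullState.mu.Lock()
	defer pullState.mu.Unlock()

	if isDockerImageAvailableUnlocked(image) {
		return false // already present locally; nothing to do
	}
	if pullState.downloading[image] {
		return false // another caller already started the pull
	}
	pullState.downloading[image] = true

	go func() {
		// download logic elided; on completion it would re-acquire the lock,
		// clear downloading[image], and mark the image available
	}()
	return true
}

Calling the unlocked helper from inside StartDockerImageDownload is what avoids the nested RLock/Lock problem noted in the checklist below.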

Testing:

  • Analyze the current implementation
  • Understand the locking mechanism and potential deadlock risks
  • Check whether IsDockerImageAvailable takes RLock (if so, calling it while holding the write lock on the same RWMutex would deadlock)
  • Modify StartDockerImageDownload to move availability check inside the lock
  • Handle the fact that IsDockerImageAvailable takes RLock by creating unlocked helper
  • Run existing unit tests to verify no regressions (all passing)
  • Add concurrency test with multiple goroutines (all passing); a sketch of this style of test follows the checklist
  • Run race detector to verify no race conditions (no issues found)
  • Merge main branch and recompile workflows
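
An illustrative sketch of the concurrency test referenced above. It assumes imports of "sync", "sync/atomic", and "testing", a fresh pullState, and an image that is not locally available; the goroutine count and image name are placeholders, and the actual test in pkg/cli/docker_images_test.go may be structured differently.

func TestStartDockerImageDownload_ConcurrentCalls(t *testing.T) {
	const goroutines = 50
	var wg sync.WaitGroup
	var started int64

	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Every goroutine races to start the same download.
			if StartDockerImageDownload("example/image:latest") {
				atomic.AddInt64(&started, 1)
			}
		}()
	}
	wg.Wait()

	// With both checks under a single lock, exactly one caller should win.
	if got := atomic.LoadInt64(&started); got != 1 {
		t.Fatalf("expected exactly 1 download to start, got %d", got)
	}
}

Tests of this kind are typically run under the race detector (e.g. go test -race) to surface unsynchronized access, which is what the race-detector item above refers to.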

Success Criteria Met:

✅ Availability check moved inside mutex lock
✅ All tests in pkg/cli/docker_images_test.go pass
✅ Added concurrency test with multiple goroutines
✅ No performance regression (same lock behavior, just better ordering)
✅ No deadlocks (verified by avoiding nested RLock/Lock calls)
✅ Merged main branch and recompiled workflows

Original prompt

This section describes the original issue you should resolve

<issue_title>[Code Quality] Fix TOCTOU race condition in Docker image download logic</issue_title>
<issue_description>

Description

The StartDockerImageDownload function in pkg/cli/docker_images.go has a time-of-check-time-of-use (TOCTOU) race condition. The availability check happens outside the lock, creating a race window where multiple threads could initiate duplicate downloads or miss externally-downloaded images.

Problem

func StartDockerImageDownload(image string) bool {
    // First check if already available (NO LOCK)
    if IsDockerImageAvailable(image) {
        return false
    }

    // Check if already downloading (LOCK ACQUIRED HERE)
    pullState.mu.Lock()
    if pullState.downloading[image] {
        pullState.mu.Unlock()
        return false
    }
    pullState.downloading[image] = true
    pullState.mu.Unlock()
    
    go func() { /* download logic */ }()
    return true
}

Impact

  • Severity: Critical
  • Risk: Duplicate goroutines, resource waste, potential deadlocks
  • Files Affected: pkg/cli/docker_images.go:68-83

Suggested Fix

Move the availability check inside the lock to make it atomic:

func StartDockerImageDownload(image string) bool {
    pullState.mu.Lock()
    defer pullState.mu.Unlock()
    
    // Check if already available (inside lock for atomicity)
    if IsDockerImageAvailable(image) {
        dockerImagesLog.Printf("Image %s is already available", image)
        return false
    }
    
    // Check if already downloading
    if pullState.downloading[image] {
        dockerImagesLog.Printf("Image %s is already downloading", image)
        return false
    }
    
    pullState.downloading[image] = true
    
    go func() { /* download logic */ }()
    return true
}

Success Criteria

  • Availability check moved inside mutex lock
  • All tests in pkg/cli/docker_images_test.go pass
  • Add concurrency test with multiple goroutines
  • Verify no performance regression
  • Check for deadlocks if IsDockerImageAvailable also takes locks

Source

Extracted from Sergo Concurrency Analysis discussion githubnext/gh-aw#10665

Priority

Critical - Could cause resource leaks and duplicate downloads in production

AI generated by Discussion Task Miner - Code Quality Improvement Agent

  • expires on Feb 2, 2026, 2:08 PM UTC

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits January 19, 2026 19:05
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix TOCTOU race condition in Docker image download logic" to "Fix TOCTOU race condition in Docker image download logic" Jan 19, 2026
Copilot AI requested a review from mnkiefer January 19, 2026 19:19
pelikhan marked this pull request as ready for review January 19, 2026 19:22
@pelikhan

@copilot merge main and recompile

…ce-condition

# Conflicts:
#	.github/workflows/agentic-campaign-generator.lock.yml
Copilot AI commented Jan 19, 2026

@copilot merge main and recompile

Merged main branch and recompiled workflows in commit 22ddca7. All conflicts resolved and workflow lock files regenerated.

Copilot AI requested a review from pelikhan January 19, 2026 20:02
