Fix Redis import race condition in Celery executor#61362
Merged
shahar1 merged 2 commits into apache:main on Feb 10, 2026
9e07313 to
fbaaefb
Compare
Member
hussein-awala
left a comment
This looks good to me. @avolant, could you check the unit tests? It looks like there are some failures in the CI.
fbaaefb to
550c5a5
Compare
Contributor
Author
I removed 4 broken unit tests for the redis import in c57aaef.
Why the tests failed:
Why we don't need them:
Pre-import redis.client before timeout context to prevent SIGALRM from interrupting module initialization and leaving redis module partially cached in sys.modules without client submodule properly bound. Fixes apache#41359
c57aaef to
4b0df79
Compare
shahar1
approved these changes
Feb 10, 2026
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
81 tasks
Alok-kumar-priyadarshi
pushed a commit
to Alok-kumar-priyadarshi/airflow
that referenced
this pull request
Feb 11, 2026
Ratasa143
pushed a commit
to Ratasa143/airflow
that referenced
this pull request
Feb 15, 2026
Problem
The Airflow Celery executor experienced sporadic "module 'redis' has no attribute 'client'" errors in production. This occurred when the 1-second POSIX signal-based timeout (SIGALRM) interrupted Redis module initialization, leaving the module partially cached in sys.modules without the client submodule properly bound to the parent namespace.
Root Cause: The timeout() context manager in send_task_to_executor() and fetch_celery_task_state() could fire during redis module import (triggered by Celery's apply_async() or state access), interrupting the import before redis.client was fully initialized. Python would cache the incomplete module, causing all subsequent attempts to access redis.client to fail with AttributeError until the scheduler pod was restarted.
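One way such a half-bound state can arise is sketched below with a throwaway package instead of redis itself. The package name slowpkg_demo, the sleep that stands in for slow initialization, and the 0.2-second alarm are all assumptions made for illustration; the exact interruption point in production may differ. The sketch is POSIX-only and must run in the main thread.

```python
import importlib
import signal
import sys
import tempfile
import textwrap
from pathlib import Path

PKG = "slowpkg_demo"  # hypothetical stand-in for the redis package


def _write_package(root: Path) -> None:
    """Create a package whose __init__ loads a submodule, then stalls."""
    pkg_dir = root / PKG
    pkg_dir.mkdir()
    (pkg_dir / "client.py").write_text("class Client:\n    pass\n")
    (pkg_dir / "__init__.py").write_text(textwrap.dedent(f"""
        import time
        from {PKG}.client import Client  # submodule is loaded and cached here
        time.sleep(0.5)  # stands in for the rest of a slow package init
    """))


def _raise_timeout(signum, frame):
    raise TimeoutError("timed out")


def reproduce() -> str:
    """Interrupt the first import with SIGALRM, retry, then touch pkg.client."""
    with tempfile.TemporaryDirectory() as tmp:
        _write_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        old_handler = signal.signal(signal.SIGALRM, _raise_timeout)
        try:
            # First import: the alarm fires while __init__ is still sleeping.
            signal.setitimer(signal.ITIMER_REAL, 0.2)
            try:
                importlib.import_module(PKG)
            except TimeoutError:
                pass
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)

            # Second import: no alarm, so __init__ runs to completion.
            pkg = importlib.import_module(PKG)
            try:
                pkg.client
            except AttributeError as exc:
                return str(exc)
            return "client is bound"
        finally:
            # Restore global state so the demo leaves nothing behind.
            signal.signal(signal.SIGALRM, old_handler)
            sys.path.remove(tmp)
            for name in [n for n in sys.modules
                         if n == PKG or n.startswith(PKG + ".")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(reproduce())
```

When the first import is interrupted, Python removes the package from sys.modules but leaves the already-loaded submodule cached. The retry finds the submodule in the cache, skips loading it, and so never binds it as an attribute of the new package object.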
Production Impact
Solution
Pre-import redis.client before entering the timeout context in both critical functions. This ensures modules are fully loaded before any signal interruptions can occur, completely eliminating the race condition.
Implementation
redis.client before with timeout(...) in send_task_to_executor() (line 274-281)
redis.client before with timeout(...) in fetch_celery_task_state() (line 306-311)
Design Decisions
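The pre-import pattern described under Implementation can be sketched roughly as follows. This is not the code from the PR: the timeout() below is a simplified stand-in for Airflow's context manager, and call_with_preimport() and its ImportError handling are hypothetical names and choices introduced only to illustrate the ordering (import first, arm the alarm second).

```python
import contextlib
import importlib
import signal


@contextlib.contextmanager
def timeout(seconds: float):
    """Simplified SIGALRM-based timeout (POSIX only, main thread only)."""
    def _handler(signum, frame):
        raise TimeoutError(f"timed out after {seconds}s")

    old_handler = signal.signal(signal.SIGALRM, _handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)


def call_with_preimport(module_names, func, seconds=1.0):
    """Hypothetical helper: import modules fully, then run func under a timeout."""
    for name in module_names:
        try:
            # Runs outside the timeout, so the alarm cannot interrupt it.
            importlib.import_module(name)
        except ImportError:
            # Assumption: the dependency is optional (e.g. a non-Redis backend).
            pass
    with timeout(seconds):
        return func()


if __name__ == "__main__":
    # "json.decoder" is used only because it ships with Python; in the
    # scenario above the module name would be "redis.client".
    print(call_with_preimport(["json.decoder"], lambda: "task sent"))
```

Because the import completes before the timer is armed, a later import of the same module inside the timed block becomes a sys.modules lookup rather than a full module initialization.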
Testing
Performance Impact
References