Fix memory leak in remote logging connection cache #56695
The remote logging connection cache was using `@lru_cache` with the API client instance as a parameter. This caused client references to be retained in the cache indefinitely, preventing garbage collection and causing memory leaks when tasks created multiple client instances. The new implementation ensures connection details are cached for performance while allowing client instances to be properly garbage collected after use.
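The retention mechanism described above can be reproduced in a few lines. This is a minimal sketch, not the Airflow code: `FakeClient`, `fetch_connection`, `leaky_get_connection`, `get_connection`, and the connection ID `remote_logs` are hypothetical names chosen for illustration, and the dict-based cache is only one possible way to cache connection details without keying on the client.

```python
import gc
import weakref
from functools import lru_cache


class FakeClient:
    """Hypothetical stand-in for the API client (not the real Airflow class)."""

    def fetch_connection(self, conn_id):
        # Returns plain data that holds no reference back to the client.
        return {"conn_id": conn_id, "host": "example.invalid"}


@lru_cache(maxsize=None)
def leaky_get_connection(client, conn_id):
    # The client is part of the cache key, so the cache keeps a strong
    # reference to every client instance it has ever been called with.
    return client.fetch_connection(conn_id)


# Assumed alternative: cache only the connection details, keyed by conn_id.
_connection_cache = {}


def get_connection(client, conn_id):
    # The client is used transiently and is never stored in the cache.
    if conn_id not in _connection_cache:
        _connection_cache[conn_id] = client.fetch_connection(conn_id)
    return _connection_cache[conn_id]


def client_survives(getter):
    """Return True if the client is still alive after its last reference is dropped."""
    client = FakeClient()
    ref = weakref.ref(client)
    getter(client, "remote_logs")
    del client
    gc.collect()
    return ref() is not None


print(client_survives(leaky_get_connection))  # True: retained by the lru_cache key
print(client_survives(get_connection))        # False: client was garbage collected
```

With the `@lru_cache` variant, each new client instance adds another cache entry that is never released, which matches the steady memory growth reported for tasks that create multiple clients.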
amoghrajesh left a comment:
Good investigation!
Backport failed to create: v3-1-test.
You can attempt to backport this manually by running: `cherry_picker 416c73e v3-1-test`
This should apply the commit to the v3-1-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: `cherry_picker --continue`
Looks like some earlier change needs to be cherry-picked first.
Fix memory leak in remote logging connection cache (cherry picked from commit 416c73e)
* Revert "Fix memory leak in remote logging connection cache (apache#56695)". This reverts commit 416c73e.
* enable e2e ui test to install pnpm if not installed
In Airflow 3.0.6, various tasks running on Celery failed with out-of-memory (OOM) errors because the memory leak was significant. After applying the changes in this PR, memory usage stayed mostly flat and there were no task failures.
Celery Worker with 4 GB memory on Airflow 3.0.6

Celery Worker with 4 GB memory with the changes in this PR

Part of #56641