Fix LocalExecutor memory spike by applying gc.freeze #58365
Conversation
potiuk
left a comment
LGTM, but there is a comment about the immediate ramp-up of the remaining workers, and we definitely need more than one pair of eyes on this.
I also marked it as backportable to 3-1-test. That would be a fantastic bugfix for 3.1.4 Local Executor memory usage.
@ashb I’ve incorporated the feedback!
@ashb, any comments? I would love to merge this one today in preparation for the upcoming 3.1.4.
I merged this for 3.1.4. We can always iterate in the future, and the test results sound very plausible. Thanks @wjddn279; I think we can work on other memory optimisations after that.
)
* fix local executor issue caused by cow
* fix test
* fix test
* remove gc utils
* fix test to prevent timeout
* fix tests
* fix tests
* fix tests
(cherry picked from commit baec49a)
Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>
* fix local executor issue caused by cow
* fix test
* fix test
* remove gc utils
* fix test to prevent timeout
* fix tests
* fix tests
* fix tests
#protm self nominating
related: #58143
Body
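As discussed (not yet confirmed), this resolves the sudden memory usage spikes in worker processes when using LocalExecutor. Memory grows because garbage collection triggers copy-on-write (COW): when gc runs in a forked worker, it writes to the GC bookkeeping of objects inherited from the parent, so memory pages that are otherwise read-only and shared get copied unnecessarily. By applying gc.freeze and moving existing objects to the permanent generation, which the collector ignores, we prevent this COW from occurring.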
In fork mode, we create many worker processes at once to minimize the number of gc.freeze and gc.unfreeze calls. In spawn mode, we keep the existing approach to ensure stability.
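The fork-mode idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `start_workers_frozen` and `_worker` are hypothetical names, the doubling "task" is a placeholder, and it assumes a POSIX platform where the `fork` start method is available.

```python
import gc
import multiprocessing as mp


def _worker(task_queue, result_queue):
    # Hypothetical worker loop: handle items until a None sentinel arrives.
    for item in iter(task_queue.get, None):
        result_queue.put(item * 2)  # placeholder for real task execution


def start_workers_frozen(num_workers):
    """Fork all workers in one batch inside a single freeze/unfreeze pair."""
    ctx = mp.get_context("fork")  # assumption: POSIX, fork start method
    task_queue, result_queue = ctx.Queue(), ctx.Queue()

    # Move every object currently tracked by the collector into the
    # permanent generation, so collections in the children skip them and
    # do not dirty the pages shared with the parent.
    gc.freeze()
    try:
        workers = [
            ctx.Process(target=_worker, args=(task_queue, result_queue))
            for _ in range(num_workers)
        ]
        for worker in workers:
            worker.start()  # each child inherits the frozen state
    finally:
        # The parent goes back to normal collection once the batch is forked.
        gc.unfreeze()
    return workers, task_queue, result_queue
```

Starting the whole batch inside one `gc.freeze()` / `gc.unfreeze()` pair is what keeps the number of calls low; forking workers one at a time would need a pair per worker.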
Benchmark
memory usage
Comparison of per-process memory usage in LocalExecutor before and after applying this PR. Measured in the same environment running 500 tasks per minute for 12 hours.
gc.freeze / unfreeze performance (elapsed time)
We measured the elapsed time of gc.freeze and gc.unfreeze for each scheduler loop iteration. Most operations took microseconds, confirming virtually no impact. The operation itself is very lightweight: it simply moves the objects in the current generations to the permanent generation without any memory copying. See https://github.com/python/cpython/pull/3705/files
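A measurement of this kind could be taken along the following lines. This is only a sketch, not the benchmark code used for the numbers above: `time_freeze_unfreeze` is a hypothetical helper, and the timings it returns depend entirely on the process it runs in.

```python
import gc
import time


def time_freeze_unfreeze():
    """Return (freeze_seconds, unfreeze_seconds, frozen_object_count)."""
    start = time.perf_counter()
    gc.freeze()
    after_freeze = time.perf_counter()

    # Number of objects now sitting in the permanent generation.
    frozen = gc.get_freeze_count()

    before_unfreeze = time.perf_counter()
    gc.unfreeze()
    end = time.perf_counter()
    return after_freeze - start, end - before_unfreeze, frozen


if __name__ == "__main__":
    freeze_s, unfreeze_s, frozen = time_freeze_unfreeze()
    print(f"freeze: {freeze_s * 1e6:.1f} us, "
          f"unfreeze: {unfreeze_s * 1e6:.1f} us, objects frozen: {frozen}")
```

Calling the helper once per loop iteration and logging the two durations would show whether the overhead stays in the microsecond range in a given deployment.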