apply gc.freeze in dag-processor to improve memory performance #60505

wjddn279 · 2026-01-14T08:54:22Z

Motivation

discussed: https://lists.apache.org/thread/33hdp3hm705mzgrltv7o3468wvwbjsr3
closed: #56879

Insights

trying to apply gc.freeze / unfreeze cycle

First, to apply it in the same way as implemented in LocalExecutor, I perform gc.freeze and gc.unfreeze immediately before and after forking:

os.register_at_fork(before=gc.freeze, after_in_parent=gc.unfreeze)

However, after applying this, memory inspection revealed excessive memory leaks.

This is the existing (v3.1.5) memory graph pattern.

Looking at the graph shape, you can see heap memory dropping at specific intervals, which appears to be a typical pattern of old gc, so I inferred there might be a connection.
I believe objects that should be cleaned up when old gc (generation 2 gc) occurs are frozen and thus escape gc, continuing to accumulate. As shown below, if you forcibly collect gc before freezing or reduce the generation 2 gc threshold to an extreme low, memory doesn't increase:

def freeze():
	gc.collect()
	gc.freeze()
os.register_at_fork(before=freeze, after_in_parent=gc.unfreeze)

or

gc.set_threshold(700, 10, 1)

However, I judged that forcibly changing the gc flow would have very significant side effects, so I didn't apply this cycle.

apply it before parsing start

Instead, I inferred that simply freezing existing objects would be sufficient to help prevent COW.
There was a debate in the Python community about gc.freeze, and the main points are as follows:
https://discuss.python.org/t/it-seems-to-me-that-gc-freeze-is-pointless-and-the-documentation-misleading/71775

Even with gc.freeze applied, COW occurs when the ref count of actual objects changes.
Therefore, even if frozen, COW occurs for objects actually used in the fork process.
However, the COW prevention effect for unused objects is clear.

Since Airflow loads the same modules for all components and much of it goes unused, I judged that simply freezing these would be sufficient to prevent COW, and I froze objects created before the dag parsing loop runs.

Performance

I deployed both the existing 3.1.5 version image and an image with gc.freeze applied to k8s. I deployed the same plugins and dags to the dag-processor. The parsing stats are as follows (dag name is masked):

Bundle       File Path                                 PID  Current Duration      # DAGs    # Errors  Last Duration    Last Run At
-----------  --------------------------------------  -----  ------------------  --------  ----------  ---------------  -------------------
dags-folder  **/dynamic_dags_**_****.py              54374  7.01s                     47           0  27.83s           2026-01-14T08:21:15
dags-folder  **/dynamic_dags_**_*******.py                                            54           0  20.90s           2026-01-14T08:21:41
dags-folder  **/dynamic_dags_**.py                   54325  8.04s                     38           0  19.06s           2026-01-14T08:21:13
dags-folder  **/dynamic_dags_**_*******_**_*.py                                       13           0  18.43s           2026-01-14T08:21:35
dags-folder  **/dynamic_dags_**_*******_**_*.py                                        5           0  4.64s            2026-01-14T08:21:19

After monitoring memory usage for about two days, the results are as follows (x axis is time with KST):

I confirmed that the overall average memory usage is lower with gc.freeze, and the memory peak is also lower in the applied version. This difference can be attributed to improved memory usage due to COW prevention in the fork process when dag file parsing time is long. Looking broadly, both show a slight upward trend in memory usage, which I judge is ultimately a problem that needs to be resolved.

Was generative AI tooling used to co-author this PR?

Yes (please specify the tool below)

Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
When adding dependency, check compliance with the ASF 3rd Party License Policy.
For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

potiuk

Looks great - thanks for the thorough investigation. I am fine with early freezing, as this indeed should help.

What I think about the remaining memory growth - this might be connected with the imports initialized even before that - basically when airflow imports happen and I hope we will be able to get rid of it eventually when we implement explicit initialization rather than having all the import airlfow side effects we still have - so I would rather come back to the memory exercise after we do it.

potiuk · 2026-01-14T16:19:24Z

I added the usual suspects for reviews -> if there will be no more comments, we can merge it and backport for 3.1.7

potiuk · 2026-01-21T00:05:22Z

Merging.

potiuk · 2026-01-21T00:05:36Z

Thanks @wjddn279 !

(cherry picked from commit 9d31db3) Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>

github-actions · 2026-01-21T00:06:32Z

Backport successfully created: v3-1-test

Status	Branch	Result
✅	v3-1-test

(cherry picked from commit 9d31db3) Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>

apply gc.freeze in dag-processor

29cd4d4

wjddn279 requested review from ephraimbuddy and jedcunningham as code owners January 14, 2026 08:54

boring-cyborg bot added the area:DAG-processing label Jan 14, 2026

potiuk approved these changes Jan 14, 2026

View reviewed changes

potiuk added the backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch label Jan 14, 2026

potiuk added this to the Airflow 3.1.7 milestone Jan 14, 2026

potiuk requested review from amoghrajesh and kaxil January 14, 2026 16:18

wjddn279 mentioned this pull request Jan 16, 2026

Dag processor loses connection to db very often #56497

Closed

2 tasks

potiuk merged commit 9d31db3 into apache:main Jan 21, 2026
73 checks passed

github-actions bot pushed a commit that referenced this pull request Jan 21, 2026

[v3-1-test] apply gc.freeze in dag-processor (#60505)

e604a7b

(cherry picked from commit 9d31db3) Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>

potiuk pushed a commit that referenced this pull request Jan 21, 2026

[v3-1-test] apply gc.freeze in dag-processor (#60505) (#60845)

aa566c3

(cherry picked from commit 9d31db3) Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>

potiuk mentioned this pull request Jan 21, 2026

Fix DAG processor crash on MySQL connection failure during import error recording #59167

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

apply gc.freeze in dag-processor to improve memory performance #60505

apply gc.freeze in dag-processor to improve memory performance #60505

wjddn279 commented Jan 14, 2026 •

edited

Loading

Uh oh!

potiuk left a comment

Uh oh!

potiuk commented Jan 14, 2026

Uh oh!

potiuk commented Jan 21, 2026

Uh oh!

Uh oh!

potiuk commented Jan 21, 2026

Uh oh!

github-actions bot commented Jan 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

apply gc.freeze in dag-processor to improve memory performance #60505

apply gc.freeze in dag-processor to improve memory performance #60505

Conversation

wjddn279 commented Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Insights

trying to apply gc.freeze / unfreeze cycle

apply it before parsing start

Performance

Was generative AI tooling used to co-author this PR?

Uh oh!

potiuk left a comment

Choose a reason for hiding this comment

Uh oh!

potiuk commented Jan 14, 2026

Uh oh!

potiuk commented Jan 21, 2026

Uh oh!

Uh oh!

potiuk commented Jan 21, 2026

Uh oh!

github-actions bot commented Jan 21, 2026

Backport successfully created: v3-1-test

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wjddn279 commented Jan 14, 2026 •

edited

Loading