fix get latest serialized_dag model query to prevent "Out of sort memory" error #55589
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
potiuk
left a comment
Nice!
ca8869b to 7aea8e0
@potiuk
Could you please rebase and fix the static checks too? A MySQL-specific query would be trivial to add:

    @classmethod
    def latest_item_select_object(cls, dag_id):
        from airflow.settings import engine
        if engine.dialect.name == 'mysql':
            # Prevent "Out of sort memory" caused by large values in cls.data column for MySQL.
            # Details in https://github.com/apache/airflow/pull/55589
            latest_item_id = (
                select(cls.id).where(cls.dag_id == dag_id)
                .order_by(cls.created_at.desc()).limit(1).scalar_subquery()
            )
            return select(cls).where(cls.id == latest_item_id)
        else:
            return select(cls).where(cls.dag_id == dag_id).order_by(cls.created_at.desc()).limit(1)

or use
Apologies for the delay in review, @wjddn279.
7aea8e0 to 0454907
@kaxil
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
… of sort memory" error (#55589) (#57042)
* [v3-1-test] Fix Outlet Event Extra Data is Empty in Task Instance Success Listener (#54568) (#57031)
  Co-authored-by: Kevin Yang <85313829+sjyangkevin@users.noreply.github.com>
* [v3-1-test] fix get latest serialized_dag model query to prevent "Out of sort memory" error (#55589)
* fix get latest serialized_dag model query
* fix get latest serialized_dag model query
* add db type check logic
(cherry picked from commit 757db27)
Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Kevin Yang <85313829+sjyangkevin@users.noreply.github.com>
Co-authored-by: Jeongwoo Do <48639483+wjddn279@users.noreply.github.com>
Description
Hello,
While testing Airflow 3.0.6 on Kubernetes, I observed that the dag-processor keeps restarting.
Upon checking the pod logs, I found the following error:
After investigating the root cause, I found that some rows in serialized_dag contained data values exceeding 1MB.

MySQL's default sort_buffer_size is 256KB, so any row whose serialized_dag.data value exceeds this size cannot fit into the sort buffer. As a result, the query fails with the "Out of sort memory" error, which causes the pod to restart.
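To make the size mismatch concrete, here is a minimal sketch in plain Python. The payloads are hypothetical stand-ins, not the rows from the affected deployment, and the helper name `oversized` is an assumption made for illustration. It only compares serialized sizes against the 256KB default mentioned above; it does not talk to MySQL.

```python
import json

# MySQL's default sort_buffer_size, in bytes (256KB).
SORT_BUFFER_SIZE = 256 * 1024


def oversized(payloads, limit=SORT_BUFFER_SIZE):
    """Return {dag_id: size_in_bytes} for payloads whose JSON form exceeds limit."""
    sizes = {
        dag_id: len(json.dumps(data).encode("utf-8"))
        for dag_id, data in payloads.items()
    }
    return {dag_id: size for dag_id, size in sizes.items() if size > limit}


# Hypothetical serialized DAG payloads: one tiny, one larger than 1MB.
payloads = {
    "small_dag": {"tasks": ["t1", "t2"]},
    "large_dag": {"tasks": ["x" * 100] * 12_000},
}

result = oversized(payloads)
for dag_id, size in result.items():
    print(f"{dag_id}: {size} bytes exceeds the {SORT_BUFFER_SIZE}-byte sort buffer")
```

Only the large payload is reported, which mirrors the situation described here: a single row of that size cannot be sorted within the default buffer.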
There are three ways to solve this problem.
No. 1 requires each user to make the change themselves, and the side effects of No. 2 on the system are hard to predict. We therefore propose changing the existing query in the following way.
This PR contains that change.
Since the modified query produces the same result as the original one, no additional test cases or changes to existing tests are required.
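The equivalence claim can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, a simplified stand-in for the serialized_dag table, and hand-written SQL that approximates the two query shapes from the review snippet (order and limit on the full row versus selecting the id in a scalar subquery first). The table layout, sample rows, and the `latest` helper are assumptions for illustration, and SQLite does not reproduce the sort-memory behaviour itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE serialized_dag ("
    "id TEXT PRIMARY KEY, dag_id TEXT, created_at TEXT, data TEXT)"
)
# Hypothetical sample rows: two versions of dag_a and one of dag_b.
rows = [
    ("a1", "dag_a", "2025-01-01T00:00:00", "v1"),
    ("a2", "dag_a", "2025-02-01T00:00:00", "v2"),
    ("b1", "dag_b", "2025-03-01T00:00:00", "other"),
]
conn.executemany("INSERT INTO serialized_dag VALUES (?, ?, ?, ?)", rows)

# Original shape: sort whole rows (including the large data column).
ORIGINAL = """
    SELECT id, dag_id, created_at, data FROM serialized_dag
    WHERE dag_id = ? ORDER BY created_at DESC LIMIT 1
"""

# Rewritten shape: sort only ids in a subquery, then fetch the row by id.
REWRITTEN = """
    SELECT id, dag_id, created_at, data FROM serialized_dag
    WHERE id = (
        SELECT id FROM serialized_dag
        WHERE dag_id = ? ORDER BY created_at DESC LIMIT 1
    )
"""


def latest(sql, dag_id):
    """Run one of the two query shapes and return the single row, or None."""
    return conn.execute(sql, (dag_id,)).fetchone()


print(latest(ORIGINAL, "dag_a") == latest(REWRITTEN, "dag_a"))
```

Both shapes return the newest row for a given dag_id and no row when the dag_id is unknown. If two rows share the same created_at, either shape may pick either row, so the comparison assumes distinct timestamps per dag_id.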