Conversation

@MaksYermak
Contributor

In this PR I have added a try-except block to prevent the scheduler from crashing due to a RecursionError when making a SQL query. This error happens when a session tries to commit a transaction with non-JSON-serializable parameters (task instances, DAG runs, context, etc.).
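To illustrate the idea, here is a minimal sketch of guarding a commit against RecursionError. It is not the code from this PR: `FakeSession` and `commit_or_rollback` are hypothetical names, and the fake session simply raises the error instead of talking to a database.

```python
import logging

log = logging.getLogger(__name__)


class FakeSession:
    """Hypothetical stand-in for a database session (assumption, not Airflow code)."""

    def __init__(self, fail):
        self.fail = fail
        self.rolled_back = False

    def commit(self):
        # Simulate the failure mode described above.
        if self.fail:
            raise RecursionError("maximum recursion depth exceeded")

    def rollback(self):
        self.rolled_back = True


def commit_or_rollback(session):
    """Commit the session; on RecursionError, log it and roll back instead of crashing."""
    try:
        session.commit()
    except RecursionError:
        log.exception("Commit failed with RecursionError; rolling back")
        session.rollback()
        return False
    return True
```

With this shape the caller can decide what to do with the failed unit of work (for example, mark it as failed) while the surrounding loop keeps running.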

The DAG which can reproduce this error:

from airflow.models.dag import DAG
from datetime import datetime
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="circular_conf_error",
    start_date=datetime(2021, 1, 1),
    schedule=None,
    catchup=False,
    max_active_runs=1,
    default_args={"retries": 0},
    description="Example of a DAG which can cause Airflow Scheduler crash.",
) as dag:

    start = EmptyOperator(task_id="start")

    def _create_circular_conf(**context):
        context["dag_run"].conf["steps"] = context["dag_run"].conf

    def _downstream_task(**context):
        print("Help!")

    create_circular_conf = PythonOperator(
        task_id="create_circular_conf", python_callable=_create_circular_conf
    )

    downstream_task = PythonOperator(
        task_id="downstream_task", python_callable=_downstream_task
    )

    end = EmptyOperator(task_id="end")

    # Define task dependencies
    start >> create_circular_conf >> downstream_task >> end

    # If you run the create_circular_conf task without a real downstream task
    # (note that the "end" task is an EmptyOperator, which doesn't actually get run),
    # then the scheduler doesn't run into an infinite SIGTERM loop.
    # start >> create_circular_conf >> end

This problem doesn't exist in AF3, because there the code raises an exception and fails immediately at the point where it discovers the non-JSON-serializable data. In AF2, the code only shows a warning that using non-JSON-serializable parameters will be deprecated in AF3, which does not prevent Airflow from crashing later with more dramatic consequences.
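The fail-fast behaviour described above can be sketched roughly as follows. This is only an illustration using the standard `json` module; `validate_conf` is a hypothetical helper, not the actual AF3 implementation.

```python
import json


def validate_conf(conf):
    """Raise ValueError right away if conf cannot be serialized to JSON."""
    try:
        json.dumps(conf)
    except (TypeError, ValueError, RecursionError) as exc:
        # json.dumps raises ValueError for circular references and
        # TypeError for objects it does not know how to encode.
        raise ValueError("conf must be JSON-serializable") from exc


# The same kind of self-referencing dict that the DAG above creates.
circular_conf = {"key": "value"}
circular_conf["steps"] = circular_conf
```

Calling `validate_conf(circular_conf)` raises a ValueError at the point of assignment, rather than leaving the bad value to surface later during a commit.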


@MaksYermak MaksYermak force-pushed the prevent-scheduler-crash branch from 86ec93e to 2a4f3ba Compare September 18, 2025 12:42
@eladkal eladkal added this to the Airflow 2.11.1 milestone Sep 18, 2025
@MaksYermak MaksYermak force-pushed the prevent-scheduler-crash branch from 2a4f3ba to 99d1123 Compare September 19, 2025 08:28
@ashb
Member

ashb commented Sep 19, 2025

Generally you should target main with any bug fixes, unless there is a specific reason why it only applies to the 2.x branch.

@MaksYermak
Contributor Author

Generally you should target main with any bug fixes, unless there is a specific reason why it only applies to the 2.x branch.

@ashb as I mentioned in the PR description, this problem doesn't exist in AF3 (main branch). That is why I prepared this fix for 2.11 only.

This problem doesn't exist in AF3, because there the code raises an exception and fails immediately at the point where it discovers the non-JSON-serializable data. In AF2, the code only shows a warning that using non-JSON-serializable parameters will be deprecated in AF3, which does not prevent Airflow from crashing later with more dramatic consequences.

@ashb
Member

ashb commented Sep 19, 2025

Whoops sorry. That'll teach me for doing PR reviews before the coffee has kicked in

@MaksYermak MaksYermak closed this Oct 23, 2025
@MaksYermak MaksYermak reopened this Oct 23, 2025
@MaksYermak
Contributor Author

Hello @ashb , @kaxil !
Could you please check this PR?

@MaksYermak MaksYermak force-pushed the prevent-scheduler-crash branch from 99d1123 to 3250c43 Compare November 19, 2025 15:39
@MaksYermak
Contributor Author

Hello @potiuk,
Could you please help me fix the CI here? I am out of ideas and do not know what the problem is.

@potiuk
Member

potiuk commented Nov 22, 2025

This is going to be fixed once I finally merge #51681, which I will be working on much more in the next few days.

@potiuk
Member

potiuk commented Nov 22, 2025

3.1.3 out -> so we can focus now on 2.11.1 :)

@VladaZakharova
Contributor

Hi @potiuk! Thank you for checking this one :)
The question is, can it still be merged to 2.x? If I understand correctly, the support deadline for AF2 has already passed. Are you sure we can merge it?

@potiuk
Member

potiuk commented Nov 25, 2025

It has not passed. I will be working on 2.11.1 soon, and I will decide then what to merge, @VladaZakharova.

We are committed to releasing critical and security fixes for Airflow 2 until May 2026.
