
Task does not retry when worker is killed due to OOM #55753

@rawwar

Description

Apache Airflow version

main

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When using the CeleryExecutor, a task that is killed due to OOM goes directly to the failed state instead of retrying.

What you think should happen instead?

If retries are configured, the task should go into the "up_for_retry" state instead of "failed".
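
For context, a minimal sketch of the retry decision I would expect on a failed attempt (plain Python, not Airflow's actual scheduler code; `try_number` and `max_tries` mirror the names on Airflow's `TaskInstance`):

```python
# Sketch of the expected state transition when a task attempt fails.
# Names mirror Airflow's TaskInstance fields; this illustrates the
# expected behavior, not the real failure-handling implementation.

def next_state(try_number: int, max_tries: int) -> str:
    """Return the state a failed attempt should land in."""
    if try_number <= max_tries:
        return "up_for_retry"  # retries remain, schedule another attempt
    return "failed"            # retries exhausted

# With default_args={"retries": 3} (so max_tries=3), the first OOM kill
# should yield "up_for_retry", not "failed":
assert next_state(try_number=1, max_tries=3) == "up_for_retry"
assert next_state(try_number=4, max_tries=3) == "failed"
```

An OOM kill should be no different from any other crash of the task process in this respect.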

How to reproduce

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def func():
    # Grow a string without bound so the kernel OOM killer eventually
    # terminates the task process with SIGKILL.
    a = "asd"
    while True:
        a += a * 100000


with DAG(
    dag_id="oom_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    doc_md=__doc__,
    default_args={"retries": 3},
    tags=["oom"],
) as dag:
    hello_task = PythonOperator(
        task_id="oom_task",
        python_callable=func,
    )
```
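
With the DAG unpaused, a run can be started with `airflow dags trigger oom_example` (or from the UI); the task's memory then grows until the OOM killer terminates the process.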

Operating System

macOS

Versions of Apache Airflow Providers

No response

Deployment

Astronomer

Deployment details

I tested this on an Astronomer deployment as well as locally with breeze (on the main branch) and was able to replicate it.

The breeze command I used: `breeze start-airflow -b postgres -P 17 --executor CeleryExecutor`

Anything else?

On the worker, I noticed this log:

```
2025-09-17 05:48:03.348301 [info     ] Task finished                  [supervisor] duration=3.201495791989146 exit_code=-9 final_state=failed
```
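
For context, `exit_code=-9` follows the POSIX convention of reporting death-by-signal as the negative signal number: 9 is SIGKILL, which is what the kernel OOM killer sends. A stand-alone sketch of that convention (not Airflow code):

```python
import signal
import subprocess

# POSIX reports a process killed by a signal as a negative return
# code equal to the signal number. The kernel OOM killer sends
# SIGKILL (9), so an OOM-killed task surfaces as exit code -9.
proc = subprocess.Popen(["sleep", "60"])
proc.send_signal(signal.SIGKILL)
proc.wait()
assert proc.returncode == -signal.SIGKILL  # -9, as in the log above
```

Which suggests the supervisor sees the SIGKILL but maps it straight to `final_state=failed` without consulting the task's remaining retries.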

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
