@romsharon98
Contributor

When running Airflow with the KubernetesExecutor (4 schedulers, kubernetes.worker_pods_creation_batch_size=16), pod creation occasionally fails with:

Pod creation failed with reason 'Conflict'

This happens because the namespace has a ResourceQuota. When multiple pods are created simultaneously, the quota controller must update the ResourceQuota object to reflect new usage. Concurrent updates cause a 409 Conflict, and the admission of some pods fails. As a result, the corresponding Airflow tasks are marked as failed even though there are still available resources under the quota.
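To illustrate why simultaneous updates collide, here is a toy model of optimistic concurrency on a versioned object. It is not Kubernetes code: `QuotaStore`, `Conflict`, and the field names are hypothetical stand-ins, and the real quota admission path is more involved. The sketch only shows how a write based on a stale version is rejected even though the quota still has room.

```python
class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 Conflict response."""


class QuotaStore:
    """Toy versioned object: a write must name the version it was based on."""

    def __init__(self, hard_limit):
        self.hard_limit = hard_limit
        self.used = 0
        self.resource_version = 1

    def read(self):
        # Return the current usage together with the version that was read.
        return self.used, self.resource_version

    def update(self, new_used, expected_version):
        # Reject the write if someone else updated the object in between.
        if expected_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.used = new_used
        self.resource_version += 1


store = QuotaStore(hard_limit=10)

# Two pod admissions read the quota at the same version.
used_a, version_a = store.read()
used_b, version_b = store.read()

# The first writer succeeds and bumps the version.
store.update(used_a + 1, version_a)

# The second writer still holds the old version, so its update is rejected.
try:
    store.update(used_b + 1, version_b)
    second_result = "ok"
except Conflict:
    second_result = "conflict"
```

In this model the second admission fails with usage at 1 out of 10, which mirrors the situation described above: the failure comes from the stale write, not from an exhausted quota.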

This PR addresses the issue by adding a retry mechanism for pod creation when a 409 Conflict is encountered, ensuring tasks are not marked as failed due to transient quota update conflicts.
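The general shape of such a retry can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `ApiException` here is a local stand-in for a client error that carries an HTTP `status`, and `create_with_retry`, `max_attempts`, and `base_delay` are hypothetical names chosen for the example.

```python
import random
import time


class ApiException(Exception):
    """Hypothetical stand-in for a client error carrying an HTTP status."""

    def __init__(self, status, reason=""):
        super().__init__(f"({status}) {reason}")
        self.status = status
        self.reason = reason


def create_with_retry(create_pod, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call create_pod(), retrying only when it fails with 409 Conflict.

    Any other error, or a 409 on the final attempt, is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return create_pod()
        except ApiException as exc:
            if exc.status != 409 or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so that concurrent creators
            # do not all retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

Only 409 is treated as transient here; other statuses (for example 403 when the quota is genuinely exceeded) still surface immediately, so real failures are not masked.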



@boring-cyborg boring-cyborg bot added area:providers provider:cncf-kubernetes Kubernetes (k8s) provider related issues labels Aug 21, 2025
@romsharon98 romsharon98 requested a review from eladkal August 21, 2025 12:20
@potiuk
Member

potiuk commented Aug 21, 2025

Some tests fail :(

@eladkal eladkal merged commit 7162153 into apache:main Aug 23, 2025
86 checks passed
mangal-vairalkar pushed a commit to mangal-vairalkar/airflow that referenced this pull request Aug 30, 2025
* Retry on 409 conflict

* Change comment

* linting

* change or and hirarchy

---------

Co-authored-by: rom-impala <rom@getimpala.ai>
nothingmin pushed a commit to nothingmin/airflow that referenced this pull request Sep 2, 2025