Conversation

@AutomationDev85
Contributor

Overview

We found an issue with how deferred mode handles ErrImagePull. When the Triggerer emits an error event and the pod has entered ErrImagePull, the KubernetesPodOperator keeps waiting in await_pod_completion even though the pod will never start (the image does not exist). This PR applies the same fast-fail logic used during startup to await_pod_completion, aborting the wait as soon as an image pull error is detected. It also treats ImagePullBackOff as a fast-fail condition, so that long check intervals don’t cause unnecessary retries and timeouts.

Change Summary

  • Add ImagePullBackOff to fast-fail detection.
  • Apply fast-fail detection in await_pod_completion to stop waiting on pods that cannot start.
  • Add unit tests covering await_pod_completion image pull error paths.
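As a rough illustration of the fast-fail idea summarized above, here is a minimal sketch, not the provider's actual code. It assumes the pod is available as a plain dict shaped like the Kubernetes JSON API (status.containerStatuses[].state.waiting.reason); the names FATAL_WAITING_REASONS, PodImagePullError, find_image_pull_error, and await_pod_completion_sketch are hypothetical and chosen only for this example.

```python
import time
from typing import Callable, Optional

# Assumption: the two waiting reasons the PR description names as fast-fail conditions.
FATAL_WAITING_REASONS = frozenset({"ErrImagePull", "ImagePullBackOff"})


class PodImagePullError(RuntimeError):
    """Hypothetical exception for a pod that cannot start because its image cannot be pulled."""


def find_image_pull_error(pod: dict) -> Optional[str]:
    """Return the first fatal waiting reason in the pod's container statuses, or None."""
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    for status in statuses:
        waiting = (status.get("state") or {}).get("waiting") or {}
        reason = waiting.get("reason")
        if reason in FATAL_WAITING_REASONS:
            return reason
    return None


def await_pod_completion_sketch(
    read_pod: Callable[[], dict],
    poll_interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the pod reaches a terminal phase, aborting early on image pull errors.

    read_pod is a hypothetical callable returning the current pod as a dict.
    """
    while True:
        pod = read_pod()
        phase = (pod.get("status") or {}).get("phase")
        if phase in ("Succeeded", "Failed"):
            return pod
        # Fast-fail: a pod stuck on an image pull error is not worth waiting for.
        reason = find_image_pull_error(pod)
        if reason is not None:
            raise PodImagePullError(f"Pod cannot start: {reason}")
        sleep(poll_interval)
```

The sleep function is injected so that the loop can be exercised in a unit test without real delays. A real implementation would read typed pod objects from the Kubernetes client instead of dicts.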

@jscheffl
Contributor

jscheffl commented Dec 7, 2025

@AutomationDev85 seems a small mypy glitch blocks CI, can you fix this?

AutomationDev85 force-pushed the bugfix/handle-image-not-exist-error-deferred branch from 55d257d to a77d0f4 on December 12, 2025 09:41
@AutomationDev85
Contributor Author

AutomationDev85 commented Dec 12, 2025

@jscheffl Thanks! I fixed my glitch and rebased to fix the merge conflict.

@potiuk
Member

potiuk commented Dec 13, 2025

Restarted failed checks - seems GitHub had hiccup

Contributor

jscheffl left a comment

Thanks for the fix. Looks good in my view.

jscheffl merged commit 2b92c97 into apache:main on Dec 14, 2025
173 of 176 checks passed
TempestShaw pushed a commit to TempestShaw/airflow that referenced this pull request Dec 24, 2025
…ErrImagePull/ImagePullBackOff) (apache#59010)

* Abort await complete function if a fail fast error occurred in the pod

* Fix type issue in unit test

---------

Co-authored-by: AutomationDev85 <AutomationDev85>

Labels

area:providers provider:cncf-kubernetes Kubernetes (k8s) provider related issues

3 participants