Apache Airflow Provider(s)
cncf-kubernetes
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes 8.3.1
Apache Airflow version
2.9.2
Operating System
Debian GNU/Linux 12 (bookworm)
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
When the reattach_on_restart option is enabled, SparkKubernetesOperator tries to find an already launched driver pod by the labels dag_id, task_id, and run_id. However, the operator does not add these labels to the driver pod, so there is no guarantee that it can find the driver even when one exists. The operator can find an already launched driver only if those labels were explicitly set on the driver in the operator's own parameters.
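To illustrate, here is a simplified sketch of the label-based driver lookup described above, using the official kubernetes Python client. The label keys mirror the ones the operator searches for; find_driver_pod is a hypothetical helper for illustration, not the provider's actual method.

```python
from kubernetes import client, config


def find_driver_pod(namespace: str, dag_id: str, task_id: str, run_id: str):
    """Look up a driver pod by the Airflow-identifying labels (illustration only)."""
    config.load_kube_config()
    selector = f"dag_id={dag_id},task_id={task_id},run_id={run_id}"
    pods = client.CoreV1Api().list_namespaced_pod(
        namespace=namespace, label_selector=selector
    ).items
    # If the driver pod was created without these labels, this list is empty,
    # so the reattach logic cannot find the running application.
    return pods[0] if pods else None
```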
What you think should happen instead
SparkKubernetesOperator should add the dag_id, task_id, and run_id labels to the driver and executor sections of the SparkApplication specification. The specification comes from the application_file or template_spec parameter and then becomes the template_body parameter. Adding the labels to template_body is straightforward, because the operator has a context that holds the values for all of these labels. A possible approach is sketched below.
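A minimal sketch of the proposed fix, assuming template_body is the parsed SparkApplication manifest as a dict and context is the Airflow task context; _inject_airflow_labels is a hypothetical helper name, and the spec.driver.labels / spec.executor.labels fields follow the spark-operator CRD.

```python
def _inject_airflow_labels(template_body: dict, context: dict) -> dict:
    """Add Airflow-identifying labels to the driver and executor specs (sketch)."""
    labels = {
        "dag_id": context["dag"].dag_id,
        "task_id": context["task"].task_id,
        # NOTE: a real implementation must sanitize run_id, since values like
        # "scheduled__2024-01-01T00:00:00+00:00" are not valid Kubernetes labels.
        "run_id": context["run_id"],
    }
    spec = template_body.setdefault("spec", {})
    for role in ("driver", "executor"):
        role_spec = spec.setdefault(role, {})
        role_spec.setdefault("labels", {}).update(labels)
    return template_body
```

With this in place, the labels the reattach logic searches for would always be present on the driver and executor pods, regardless of what the user puts in application_file or template_spec.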
How to reproduce
Start a SparkApplication using SparkKubernetesOperator without specifying the dag_id, task_id, and run_id labels for the driver and executor (for example, in the application_file parameter). The task pod that submits the SparkApplication will carry those labels, but the driver and executor pods will not.
This is a problem for the reattach_on_restart logic, because it searches for the driver by those labels. A minimal reproduction sketch follows.
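A minimal DAG sketch for reproducing the issue; spark_pi.yaml is a hypothetical SparkApplication manifest that does not set dag_id/task_id/run_id labels on the driver or executor.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
    SparkKubernetesOperator,
)

with DAG(
    dag_id="spark_reattach_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    # Submit the SparkApplication; reattach_on_restart is the option whose
    # label-based lookup fails because the driver pod lacks the labels.
    SparkKubernetesOperator(
        task_id="submit_spark_app",
        namespace="spark",
        application_file="spark_pi.yaml",
        reattach_on_restart=True,
    )
```

After the application starts, inspect the pods: the operator's task pod has the dag_id, task_id, and run_id labels, while the driver and executor pods do not.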
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct