EmrContainerOperator in Async mode doesn't respect default "infinite" polling number #40483
Closed
2 tasks done
Labels
area:providers
good first issue
kind:bug
This is clearly a bug
provider:amazon-aws
AWS/Amazon - related issues
Apache Airflow Provider(s)
amazon
Versions of Apache Airflow Providers
apache-airflow-providers-amazon[aiobotocore]==8.24.0
Apache Airflow version
2.7.3
Operating System
"Debian GNU/Linux 11 (bullseye)"
Deployment
Official Apache Airflow Helm Chart
Deployment details
Deployment to EKS
What happened
An EMR on EKS job run with EmrContainerOperator in deferred mode timed out unexpectedly with an error,
even though no max_attempts value was provided.
What you think should happen instead
The job should be polled until it either fails or completes successfully.
How to reproduce
Trigger a long-running job (over 5 hours) using EmrContainerOperator in async/deferred mode.
Anything else
I believe it's caused by the defaults defined here:
airflow/airflow/providers/amazon/aws/triggers/emr.py
Lines 185 to 186 in 6c12744
This contradicts the documentation: https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/_api/airflow/providers/amazon/aws/operators/emr/index.html#airflow.providers.amazon.aws.operators.emr.EmrContainerOperator
which states:
max_polling_attempts (int | None) – Maximum number of times to wait for the job run to finish. Defaults to None, which will poll until the job is not in a pending, submitted, or running state.
That does not seem to be the case in deferred mode, which is why I am raising this issue.
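To make the expected behaviour concrete, here is a minimal sketch of the polling semantics the documentation describes. It is not the provider's code: the function `poll_until_terminal`, the callable `get_state`, and the state names used in the example are hypothetical and only illustrate that `max_polling_attempts=None` should mean "no upper bound".

```python
import itertools
from typing import Callable, Optional

# States the quoted documentation treats as "still in progress".
NON_TERMINAL_STATES = {"PENDING", "SUBMITTED", "RUNNING"}


def poll_until_terminal(
    get_state: Callable[[], str],
    max_polling_attempts: Optional[int] = None,
) -> str:
    """Call get_state() until it returns a state outside NON_TERMINAL_STATES.

    max_polling_attempts=None means there is no cap on the number of polls.
    (Hypothetical helper, not part of the Amazon provider.)
    """
    if max_polling_attempts is None:
        attempts = itertools.count(1)  # unbounded counter
    else:
        attempts = range(1, max_polling_attempts + 1)

    for _ in attempts:
        state = get_state()
        if state not in NON_TERMINAL_STATES:
            return state
        # A real poller would sleep for its poll interval here.

    raise TimeoutError(
        f"Job still in progress after {max_polling_attempts} polling attempts"
    )


# A job that stays RUNNING for far more polls than any fixed default would allow.
simulated_states = iter(["PENDING"] + ["RUNNING"] * 1000 + ["COMPLETED"])
final_state = poll_until_terminal(lambda: next(simulated_states))
```

With the default of `None`, the loop above keeps polling through all 1001 in-progress states and returns the terminal one. A trigger that silently substitutes a finite number of attempts would raise a timeout instead, which matches the behaviour reported here.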
Are you willing to submit PR?
Code of Conduct