add wait_policy in EmrCreateJobFlowOperator #61195
I think the logic might be less confusing for users if you allow them to set both. If you want to use the current logic, I think you should clarify a few things in the param descriptions:
Are we "un-deprecating"
@jroachgolf84 do you know why it was deprecated? Imo we should allow users to configure
It looks to me like there was a bit of confusion maybe before? There was an issue created here that requested that
It looks to me like the Tagging @vincbeck who created the original issue to remove
Let's try to untangle things here :) As per my understanding, before the operator had only Now, it seems you also want to have different options for waiting, which is fine. I do not think it is a problem to un-deprecate a parameter either. However, we need to be careful to stay backward compatible and handle all the different possibilities:
What do you think?
Thanks for untangling this! The logic you proposed makes perfect sense. If everyone agrees with this proposal, I will proceed with the changes and add the deprecation warning
The operator was incorrectly ignoring the `wait_policy` argument and always defaulting to waiting for cluster completion. This change ensures the `wait_policy` is correctly persisted and used to select the appropriate waiter (e.g., for step completion), fixing the hardcoded behavior.
What

This PR fixes a bug in `EmrCreateJobFlowOperator` where the `wait_policy` parameter was ignored, causing the operator to always default to the JobFlowWaiting waiter (waiting for the cluster to start) regardless of the user's input.

The changes include:

- Persisting Parameters: Ensuring `wait_policy` is correctly persisted in the operator instance rather than being converted to a boolean and lost.
- Smart Initialization Logic: Updating `__init__` to allow users to provide both `wait_for_completion` and `wait_policy` without conflict.
  - If `wait_policy` is provided, `wait_for_completion` is implied to be True (unless explicitly set to False).
  - If only `wait_for_completion=True` is provided, `wait_policy` defaults to `WAIT_FOR_COMPLETION` for backward compatibility.
- Execution Update: Updating the `execute` method to select the correct boto3 waiter based on the stored `wait_policy`.
- Deferrable Support: Passing the specific waiter name to `EmrCreateJobFlowTrigger` to support this logic in deferrable mode.

Why

Currently, if a user wants the operator to wait until the EMR cluster finishes all steps and terminates (using `WaitPolicy.WAIT_FOR_STEPS_COMPLETION`), the operator fails to do so.

It converts the policy to a boolean `wait_for_completion = True` in `__init__` and discards the specific policy type. Consequently, the `execute` method hardcodes the waiter to `WAITER_POLICY_NAME_MAPPING[WaitPolicy.WAIT_FOR_COMPLETION]`.

This behavior causes the task to be marked as "Success" as soon as the cluster enters the WAITING state. If the cluster subsequently fails during a step execution, Airflow does not catch the failure, leading to false positives in DAG runs.
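For reviewers, the parameter resolution described above can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: the `WaitPolicy` enum and `WAITER_POLICY_NAME_MAPPING` here are local stand-ins for the provider's objects, the waiter name strings are assumed values, and the helpers `resolve_wait_settings` and `select_waiter_name` are hypothetical names. The behavior when neither argument is given is also an assumption.

```python
from __future__ import annotations

from enum import Enum


class WaitPolicy(str, Enum):
    # Local stand-in for the provider's WaitPolicy; string values are assumed.
    WAIT_FOR_COMPLETION = "wait_for_completion"
    WAIT_FOR_STEPS_COMPLETION = "wait_for_steps_completion"


# Local stand-in for WAITER_POLICY_NAME_MAPPING; waiter names are assumed.
WAITER_POLICY_NAME_MAPPING = {
    WaitPolicy.WAIT_FOR_COMPLETION: "job_flow_waiting",
    WaitPolicy.WAIT_FOR_STEPS_COMPLETION: "job_flow_terminated",
}


def resolve_wait_settings(
    wait_for_completion: bool | None = None,
    wait_policy: WaitPolicy | None = None,
) -> tuple[bool, WaitPolicy | None]:
    """Hypothetical helper mirroring the __init__ rules in the description."""
    if wait_policy is not None:
        # A policy implies waiting unless the user explicitly passed False.
        return wait_for_completion is not False, wait_policy
    if wait_for_completion:
        # Backward compatibility: a bare wait_for_completion=True keeps
        # the previous behavior.
        return True, WaitPolicy.WAIT_FOR_COMPLETION
    # Assumption: with neither argument set, the operator does not wait.
    return False, None


def select_waiter_name(
    wait_for_completion: bool, wait_policy: WaitPolicy | None
) -> str | None:
    """Hypothetical helper: pick the waiter from the stored policy
    instead of hardcoding WAIT_FOR_COMPLETION."""
    if not wait_for_completion or wait_policy is None:
        return None
    return WAITER_POLICY_NAME_MAPPING[wait_policy]


if __name__ == "__main__":
    settings = resolve_wait_settings(
        wait_policy=WaitPolicy.WAIT_FOR_STEPS_COMPLETION
    )
    print(settings, select_waiter_name(*settings))
```

The point of the sketch is that the policy survives initialization, so both the synchronous path and the trigger used in deferrable mode can be handed the same waiter name.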
closes: #61180