Deferrable Operators don't respect execution_timeout after being deferred
#19382
Closed
2 tasks done
Labels
affected_version:2.2
Issues Reported for 2.2
area:async-operators
AIP-40: Deferrable ("Async") Operators
area:core
kind:bug
This is clearly a bug
Apache Airflow version
2.2.0
Operating System
Debian GNU/Linux 10 (buster)
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
What happened
When a task is resumed after being deferred, its start_time is not equal to the original start_time, but to the timestamp when the task resumed.
If a task has execution_timeout set and runs longer than that, it might not raise a timeout error, because technically a brand new task instance starts after being deferred.
I know it's expected that it'd be a brand new task instance, but the documentation describes the behaviour with execution_timeout set differently (see "What you expected to happen" below).
This matters even more if an Operator needs to be deferred multiple times: every time it continues after defer, the time starts counting again.
Some task instance details after an example task has completed:
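As a rough illustration of the behaviour described in this section, the toy model below imitates a defer/resume cycle without any Airflow dependency. It is not Airflow code and not the data from the example task: `Deferred`, `ToySensor` and `run` are hypothetical names, and the durations are shortened to fractions of a second. It only shows that a runtime measured from the resume timestamp can stay under the timeout while the total runtime exceeds it.

```python
import time
from datetime import datetime, timedelta, timezone


class Deferred(Exception):
    """Hypothetical signal that the task is handing over to a trigger."""


class ToySensor:
    """Toy stand-in for a deferrable operator; not Airflow code."""

    def __init__(self, post_processing_seconds: float) -> None:
        self.post_processing_seconds = post_processing_seconds

    def execute(self) -> None:
        # First run: do nothing except ask to be deferred.
        raise Deferred()

    def execute_complete(self) -> None:
        # Work done after the trigger has fired.
        time.sleep(self.post_processing_seconds)


def run(sensor: ToySensor, trigger_wait_seconds: float):
    """Run the toy task once; return (total runtime, runtime since resume)."""
    first_start = datetime.now(timezone.utc)
    try:
        sensor.execute()
    except Deferred:
        time.sleep(trigger_wait_seconds)  # time spent in the deferred state
    resumed_at = datetime.now(timezone.utc)  # the "new" start time after resuming
    sensor.execute_complete()
    finished = datetime.now(timezone.utc)
    return finished - first_start, finished - resumed_at


execution_timeout = timedelta(seconds=0.25)
total, since_resume = run(
    ToySensor(post_processing_seconds=0.1), trigger_wait_seconds=0.3
)

# Measured from the first start, the timeout is exceeded ...
exceeded_from_first_start = total > execution_timeout
# ... but measured from the resume timestamp it is not.
exceeded_from_resume = since_resume > execution_timeout
print(exceeded_from_first_start, exceeded_from_resume)
```

With more than one deferral, the second measurement would restart on every resume, so it would never accumulate.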
What you expected to happen
Task failure with Timeout Exception.
Documentation says:
execution_timeout on Operators is considered over the total runtime, not individual executions in-between deferrals - this means that if execution_timeout is set, an Operator may fail while it's deferred or while it's running after a deferral, even if it's only been resumed for a few seconds.
Also, I see the following code trying to check the timeout value after the task comes back from the deferred state:
But the issue is that self.start_date isn't equal to the original task's start_date.
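To make the difference concrete, here is a minimal sketch of the arithmetic, assuming the remaining budget is computed as the timeout minus the time elapsed since some start date. It is not the Airflow code referred to above: `TimeoutExceeded`, `remaining_timeout` and the timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


class TimeoutExceeded(Exception):
    """Hypothetical stand-in for a task timeout error."""


def remaining_timeout(
    execution_timeout: timedelta, start_date: datetime, now: datetime
) -> timedelta:
    """Return the budget left when the runtime is counted from start_date."""
    remaining = execution_timeout - (now - start_date)
    if remaining <= timedelta(0):
        raise TimeoutExceeded(f"no time left, over budget by {-remaining}")
    return remaining


timeout = timedelta(seconds=60)
# Arbitrary example timestamps: the task resumes 45 s after its first start.
first_start = datetime(2021, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
resumed_at = first_start + timedelta(seconds=45)

# Counted from the original start, only 15 s of the budget remain.
left_from_first_start = remaining_timeout(timeout, first_start, resumed_at)
# Counted from the resume timestamp, the full 60 s are available again.
left_from_resume = remaining_timeout(timeout, resumed_at, resumed_at)
print(left_from_first_start, left_from_resume)
```

If the start date passed to such a check is the resume timestamp, the elapsed time is always close to zero at that point, so the check cannot fail regardless of how long the task ran before it was deferred.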
How to reproduce
DAG:
Since there are not many async Operators at the moment, I slightly modified TimeDeltaSensorAsync in order to simulate task work after the deferral.
Here is the full code for the TimeDeltaSensorAsync class I used to reproduce the issue; the only difference is the line with time.sleep(30), which simulates post-processing after a trigger has completed.
Anything else
I've ticked the "I'm willing to submit a PR" checkbox, but I'm not sure where to start. I would be happy if someone could give me some guidance on where to look.
Are you willing to submit PR?
Code of Conduct