Fix infinite retries with pools #1299
`SchedulerJob` contains a `set` of `TaskInstance`s called `queued_tis`. `SchedulerJob.process_events` loops through `queued_tis` and tries to remove completed tasks. However, without customizing `__eq__` and `__hash__`, the following two lines have no effect: elements are never removed from `queued_tis`, which leads to infinite retries on failure. This is related to my comment on apache#216. The following code was introduced in the fix to apache#1225.

```
elif ti in self.queued_tis:
    self.queued_tis.remove(ti)
```
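The behaviour described above can be reproduced outside Airflow. The sketch below uses a hypothetical `FakeTI` class (an illustrative stand-in, not Airflow's `TaskInstance`). Like any Python class without custom `__eq__` and `__hash__`, it compares and hashes by object identity, so a second object built from the same fields is not found in the set.

```python
class FakeTI:
    """Hypothetical stand-in for a task instance; not Airflow's class."""

    def __init__(self, dag_id, task_id, execution_date):
        self.dag_id = dag_id
        self.task_id = task_id
        self.execution_date = execution_date


queued_tis = set()
queued_tis.add(FakeTI("dag", "op", "2016-01-01"))

# A fresh object describing the same task, e.g. one rebuilt later.
same_task = FakeTI("dag", "op", "2016-01-01")

# Default equality is identity-based, so the membership test fails
# and the guarded remove() is never reached.
found = same_task in queued_tis
if found:
    queued_tis.remove(same_task)

remaining = len(queued_tis)
print(found, remaining)
```

Whether this is what happens in the scheduler depends on whether the object being tested is the same instance that was added to the set.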
@pradhanpk task instances are hashable and added/removed from sets and dicts in many places in Airflow's code. Does this not work for you?

```
import airflow
import datetime
from airflow.models import DAG, TaskInstance as TI
from airflow.operators import DummyOperator

start_date = datetime.datetime(2016, 1, 1)
dag = DAG('dag')
op = DummyOperator(task_id='op', dag=dag, start_date=start_date, owner='airflow')
ti = TI(op, start_date)
ti.__hash__()  # [hash]

test_set = set()
test_set.add(ti)
len(test_set)  # 1
test_set.remove(ti)
len(test_set)  # 0
```

It looks like the problem you're seeing is simply because
I've added the fix to #1290 with a test case; it will be available once that one is merged.
@jlowin I think the issue is that you want two
Isn't |
Ah, I understand what you mean. I don't think we want to force TI equality, though. If I create two TIs, they are not the same thing. However, there is already a The issue you're trying to correct, where TIs are removed from
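One way to read the point above is that membership can be tracked by an identifying tuple instead of changing equality on the objects themselves. The following is a minimal sketch under that assumption. `FakeTI` and its `key` property are hypothetical names used for illustration only, and this is not presented as the fix that was merged.

```python
class FakeTI:
    """Hypothetical stand-in for a task instance; not Airflow's class."""

    def __init__(self, dag_id, task_id, execution_date):
        self.dag_id = dag_id
        self.task_id = task_id
        self.execution_date = execution_date

    @property
    def key(self):
        # Assumed identifying tuple for this sketch.
        return (self.dag_id, self.task_id, self.execution_date)


# Track queued instances by key rather than by object identity.
queued = {}
first = FakeTI("dag", "op", "2016-01-01")
queued[first.key] = first

# A distinct object for the same task: still not equal to `first`...
second = FakeTI("dag", "op", "2016-01-01")
still_distinct = first != second

# ...but its key matches, so the queued entry can be removed.
removed = queued.pop(second.key, None)
print(still_distinct, removed is first, len(queued))
```

The two objects stay unequal, which keeps the semantics that separately created instances are different things, while removal by key still succeeds.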
@jlowin Heh, should have looked harder for |
Addresses the issue raised in apache#1299
`SchedulerJob` contains a `set` of `TaskInstance`s called `queued_tis`. `SchedulerJob.process_events` loops through `queued_tis` and tries to remove completed tasks. However, without customizing `__eq__` and `__hash__`, the following two lines in `jobs.py`'s `SchedulerJob.process_events` have no effect, never removing elements from `queued_tis` leading to infinite retries on failure. This is related to my comment on #216. The following code was introduced in the fix to #1225.