-
Notifications
You must be signed in to change notification settings - Fork 14.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increment try_number while clearing deferred tasks. #38984
Conversation
# When the task is deferred the try_number is decremented so that the same try | ||
# number is used when the task resumes execution to process the event. But in case of | ||
# clearing the try number should be incremented so that the next run doesn't reuse the same try_number | ||
if ti.state == TaskInstanceState.DEFERRED: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe clear action is meant to quickly restart things. However, here we are increasing the try_number nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirrao Sorry, I don't understand your comment. When the task moves from running to deferred the try number is decremented so that when task resumes from the trigger to running the try number is incremented so that same try number is used and helps with logging etc. When the task is cleared then the try number is incremented but since it's in deferred state with try number decremented it causes issues where same log file is used, retry count mismatch etc. for different attempts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but we should definitely get a second approval on this one.
cc: @dstandish |
this should be obsoleted by this PR #39336 i'm just working through bunch of test fixes cus we make a lot of assertions about try_number because it's always been so problematic i would suggest we close this one |
@dstandish I agree the linked PR is a good long term solution but this PR is good enough for a bugfix in 2.9.x series with 2.10 release few months away. |
Sure no problem |
46f3b89
to
f450542
Compare
@tirkarthi @dstandish do we want to merge this PR for 2.9.2? |
@eladkal I think we cannot merge this anymore now that #39336 is merged. At least not into main. I think the only way this could be released is if it's patched on a 2.9.2 branch specifically. WDYT? |
Then its up to @ephraimbuddy |
We can only PR it to v2-9-test but not stable. However, we don't usually do that. We only PR against the test branch if we have issues during the release process and it's mostly CI changes. cc @potiuk |
I am okay with closing this since main has a long term fix and 2.10 will be released in few months hopefully. We had this patch since we found this during our upgrade to 2.7.0 using deferrable operators and the patch has been working well. |
Yeah. 2.10 is not that far away - and until then @tirkarthi can simply run their forked Airlfow until they upgrade to 2.10. |
closes: #38735
related: #38735
This is already done while marking tasks as success/failure. Should this be done for up_for_reschedule tasks too? See also #30669
airflow/airflow/api/common/mark_tasks.py
Lines 161 to 164 in d03ba59