Pre-SoD, stall was pragmatically rather than conceptually grounded: the scheduler had literally got stuck and didn't know what to do about it. There were no more tasks to run, and one or more failed or unsatisfied waiting tasks remained in the pool.
The unsatisfied-waiting-tasks condition could cause normal workflow completion to be misidentified as a stall, because wholly-unsatisfied waiting tasks were spawned ahead even though they might never be needed.
Post-SoD there are no wholly-unsatisfied waiting tasks and there will soon (#3822) be no partially-satisfied ones either
(just partially-satisfied prerequisites in a hidden pool, and as the example in the doc section linked to above shows they can't be used to reliably identify a stall).
What stall should mean: the scheduler can't do anything more, but it knows that the flow is not finished.
The only valid way to make that determination now is if there are unhandled failed tasks in the pool. They are, by definition, task outcomes that were not meant to happen.
So:
    if the active pool is empty:
        completed
    else if the active pool contains only unhandled failed tasks:
        stalled
    else:
        still running
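For concreteness, the rule above could be sketched in Python roughly as follows. This is only an illustration of the proposed logic, not scheduler code: the `Task` class, its `status` and `failure_handled` fields, and the `classify` function are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class FlowState(Enum):
    COMPLETED = "completed"
    STALLED = "stalled"
    RUNNING = "still running"


@dataclass
class Task:
    # Hypothetical stand-in for a task proxy in the active pool.
    name: str
    status: str                    # e.g. "running", "succeeded", "failed"
    failure_handled: bool = False  # assumed flag: True if the failure was handled


def is_unhandled_failure(task: Task) -> bool:
    """A failed task whose failure was not handled."""
    return task.status == "failed" and not task.failure_handled


def classify(active_pool: list) -> FlowState:
    """Apply the proposed completed / stalled / still-running rule."""
    if not active_pool:
        # Nothing left in the pool: the flow finished normally.
        return FlowState.COMPLETED
    if all(is_unhandled_failure(t) for t in active_pool):
        # Only unexpected outcomes remain: nothing more can be done.
        return FlowState.STALLED
    return FlowState.RUNNING
```

For example, `classify([Task("foo.1", "failed")])` gives `FlowState.STALLED`, while adding a running task to the same pool gives `FlowState.RUNNING`.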
At normal shutdown or stall, log any partially satisfied prerequisites in case they point to a flow design error, but in general we can't assume they were "meant" to be completed.
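A rough sketch of that logging step, assuming the hidden pool can be viewed as a mapping from task ID to a dict of prerequisite outputs and whether each is satisfied. That data layout, the logger name, the function names, and the choice of warning level are all assumptions made for illustration.

```python
import logging

LOG = logging.getLogger("scheduler")  # hypothetical logger name


def partially_satisfied(prereqs: dict) -> dict:
    """Return {task_id: sorted unsatisfied outputs} for tasks that have
    at least one satisfied and at least one unsatisfied prerequisite."""
    report = {}
    for task_id, outputs in prereqs.items():
        satisfied = [o for o, ok in outputs.items() if ok]
        missing = sorted(o for o, ok in outputs.items() if not ok)
        if satisfied and missing:
            report[task_id] = missing
    return report


def log_partially_satisfied(prereqs: dict) -> dict:
    """Log each partially satisfied task; call at shutdown or stall."""
    report = partially_satisfied(prereqs)
    for task_id, missing in sorted(report.items()):
        # Reported only as a hint: these may or may not indicate a design error.
        LOG.warning(
            "%s has partially satisfied prerequisites; unsatisfied: %s",
            task_id, ", ".join(missing),
        )
    return report
```

Tasks with every prerequisite satisfied, or none satisfied, are left out of the report on purpose: only the partially satisfied case is mentioned above.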
(Note: the special treatment of unhandled failed tasks is still under discussion; if it is revoked, there will be no stall concept at all.)
Better workflow completion handling (SoD Proposal)