Would it be possible to add backoffLimit to DaskJobs? Kubernetes Jobs have this field so that the Job is reported as failed only if the pod fails a certain number of times (see below). Could we add this to DaskJobs as well? I have been using it in Jobs because Dask sometimes "just hangs/crashes" in very long jobs, and restarting the job fixes that.
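For reference, here is a minimal sketch of the plain Kubernetes Job behaviour described above. The `backoffLimit` field under `spec` is part of the standard `batch/v1` Job API; the Job name, container name, image, and command are placeholders chosen for illustration, not taken from this issue.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: long-running-dask-script   # hypothetical name
spec:
  # Number of pod failures tolerated before the Job is marked as failed.
  # Kubernetes defaults this to 6 when it is not set.
  backoffLimit: 4
  template:
    spec:
      # Jobs only allow Never or OnFailure here.
      restartPolicy: Never
      containers:
        - name: main                        # hypothetical container name
          image: ghcr.io/dask/dask:latest   # placeholder image
          command: ["python", "-c", "print('do some long-running work')"]
```

With `restartPolicy: Never`, each failure creates a new pod, so a script that hangs and then crashes gets a fresh start until the limit is reached.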
I agree that this would be a good improvement. Perhaps, instead of making the DaskJob reimplement the behaviour of a Job, we should replace the internal Pod in the DaskJob with a Job so that we can leverage the existing functionality.
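To make the idea concrete, here is a rough sketch of how the user-facing spec might look if the operator created a Job from the `job` section. This is purely illustrative: the `backoffLimit` field on a DaskJob is hypothetical and does not exist today, its placement is an assumption, and the resource name and image are placeholders.

```yaml
apiVersion: kubernetes.dask.org/v1
kind: DaskJob
metadata:
  name: simple-job                  # hypothetical name
spec:
  job:
    # HYPOTHETICAL field: would be passed through to the batch/v1 Job
    # that the operator creates in place of the bare Pod.
    backoffLimit: 4
    spec:
      restartPolicy: Never
      containers:
        - name: job
          image: ghcr.io/dask/dask:latest   # placeholder image
          args: ["python", "-c", "from dask.distributed import Client; client = Client()"]
  # The cluster section (scheduler and worker specs) is left out of this
  # sketch; it would be unchanged by the proposal.
```

The retry counting and failure reporting would then come from the Kubernetes Job controller rather than from new logic in the operator.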