[Core] Spot preemption related retries do not count towards the max retries #50640
Labels
core
Issues that should be addressed in Ray Core
enhancement
Request for new feature and/or capability
P1
Issue that should be fixed within a few weeks
Description
A user is seeing spot instance preemptions that cause their jobs to fail, because the failures triggered by preemption exhaust max_restarts and max_task_retries. Ideally, failures caused by spot preemption would be retried indefinitely and would not count toward these limits.
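To make the requested behavior concrete, here is a minimal plain-Python sketch of a retry loop with two separate budgets. It does not use Ray and is not how Ray Core implements retries. The names `PreemptionError`, `run_with_retries`, `max_preemption_retries`, and `make_flaky` are hypothetical and exist only for illustration. The sketch assumes the runtime can tell a preemption-caused failure apart from an ordinary one.

```python
class PreemptionError(Exception):
    """Hypothetical marker for a failure caused by spot preemption."""


def run_with_retries(fn, max_retries, max_preemption_retries=None):
    """Call fn until it succeeds or a retry budget is exhausted.

    Ordinary failures consume the max_retries budget.
    PreemptionError failures are tracked separately and, by default
    (max_preemption_retries=None), are retried without limit.
    """
    failures = 0
    preemptions = 0
    while True:
        try:
            return fn()
        except PreemptionError:
            preemptions += 1
            # Only give up if an explicit preemption budget was set and exceeded.
            if (max_preemption_retries is not None
                    and preemptions > max_preemption_retries):
                raise
        except Exception:
            failures += 1
            # Ordinary failures still respect the user's retry limit.
            if failures > max_retries:
                raise


def make_flaky(errors):
    """Return a callable that raises each queued error in turn, then returns "ok"."""
    pending = list(errors)

    def call():
        if pending:
            raise pending.pop(0)
        return "ok"

    return call


# Five preemptions plus one application error still succeed with max_retries=1,
# because only the application error counts against the budget.
task = make_flaky([PreemptionError()] * 5 + [ValueError("app bug")])
print(run_with_retries(task, max_retries=1))
```

Keeping the two counters separate means a user can leave a small limit in place for application bugs while still surviving an arbitrary number of preemptions.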
Use case
No response