Backoff Limit Per Job #3774
Add a new Indexed Job backoff limit mode that is counted per index rather than per job, allowing all indices to run to completion.
Issue link: kubernetes/kubernetes#109712
Welcome @jensentanlo!
From my understanding, you create an issue in the Kubernetes Enhancement repo and use that issue number as the KEP number. Also, don't forget to sign the CLA.
indexed jobs were used as the basis for a suite of long-running integration tests,
then each test run would only be able to find a single test failure.

More generally, this use case is for any situation when all the indices in an indexed job
I find this comment a bit confusing. You mention above that the implementation does not cover the situation where the workload is truly embarrassingly parallel.
But then you also mention that the use case is for when all indices are independent. I always thought that was what embarrassingly parallel meant: that all the cases are independent?
Thank you for the feedback. I absolutely agree that it's confusing. I meant that the current implementation does not cover the use case where all indices are independent/embarrassingly parallel, and that this KEP would cover it.
I've pushed a revised version that hopefully clears it up.
@jensentanlo I'd also suggest bringing this topic to a sig-apps meeting before submitting a KEP, to present the idea and get feedback from the SIG about the direction and feasibility.
Hi soltysh, I've been discussing this on and off for a while with some members of sig-apps on this GitHub issue: kubernetes/kubernetes#109712. I'd be happy to bring it to a sig-apps meeting if that would be helpful; what's the procedure to get on the agenda?
Just add your topic with your name/GitHub handle to our agenda for the next one, which will be Feb 20th. The meetings are open to everyone to join and discuss these topics in a more interactive form 😄
One-line PR description: Add New Indexed Job backoff limit mode that is counted per index rather than per job, allowing all indices to run to completion
Issue link: An Indexed Job mode that allows every index to execute kubernetes#109712
Other comments: I couldn't figure out which number to assign for this KEP, so I left it blank for the moment.
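To make the one-line description concrete, here is a minimal Python sketch contrasting a backoff limit counted per job with one counted per index. It is only an illustration of the idea, not the KEP's design or the Job controller's implementation. The names `simulate`, `Outcome`, and `failures_before_success` are hypothetical, indices are processed one at a time for simplicity (real Jobs can run pods in parallel), and the limit is assumed to be exceeded once the relevant failure count is greater than `backoff_limit`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Outcome:
    """Hypothetical result record for the simulation."""
    succeeded: List[int] = field(default_factory=list)
    failed: List[int] = field(default_factory=list)
    not_run: List[int] = field(default_factory=list)


def simulate(failures_before_success: List[int],
             backoff_limit: int,
             per_index: bool) -> Outcome:
    """Simulate an indexed job, one index at a time.

    failures_before_success[i] is how many attempts of index i fail
    before one succeeds (an assumption made up for this sketch).
    """
    outcome = Outcome()
    job_failures = 0  # shared counter used in per-job mode

    for index, needed in enumerate(failures_before_success):
        index_failures = 0  # counter used in per-index mode
        while index_failures < needed:
            # This attempt fails; charge it to both counters.
            index_failures += 1
            job_failures += 1
            count = index_failures if per_index else job_failures
            if count > backoff_limit:
                outcome.failed.append(index)
                if not per_index:
                    # Per-job mode: the whole job stops here, so the
                    # remaining indices never get a chance to run.
                    outcome.not_run = list(
                        range(index + 1, len(failures_before_success)))
                    return outcome
                break  # Per-index mode: give up on this index only.
        else:
            # Loop ended without exceeding the limit: the index succeeded.
            outcome.succeeded.append(index)

    return outcome


if __name__ == "__main__":
    plan = [0, 5, 0, 1]  # index 1 keeps failing, index 3 fails once
    print(simulate(plan, backoff_limit=2, per_index=False))
    print(simulate(plan, backoff_limit=2, per_index=True))
```

With this made-up plan and a limit of 2, the per-job run stops at index 1 and never starts indices 2 and 3, while the per-index run marks only index 1 as failed and lets indices 0, 2, and 3 succeed. This mirrors the integration-test motivation quoted in the review thread: one failing index should not hide the results of the others.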