Backoff Limit Per Job #3774

jensentanlo · 2023-01-23T23:42:31Z

One-line PR description: Add a new Indexed Job backoff limit mode that is counted per index rather than per job, allowing all indices to run to completion.
Issue link: An Indexed Job mode that allows every index to execute kubernetes#109712
Other comments: I couldn't figure out which number to assign for this KEP, so I left it blank for the moment.


linux-foundation-easycla · 2023-01-23T23:42:34Z

❌ - login: @jensentanlo / name: Jensen Lo. The commits (7cbd213, 6113389, 39d51ad) are not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.

k8s-ci-robot · 2023-01-23T23:42:39Z

Welcome @jensentanlo!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2023-01-23T23:42:40Z

Hi @jensentanlo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2023-01-23T23:42:40Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jensentanlo
Once this PR has been reviewed and has the lgtm label, please assign kow3ns for approval by writing /assign @kow3ns in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

keps/sig-apps/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kannon92 · 2023-02-07T06:37:21Z

> Other comments: I couldn't figure out which number to assign for this KEP so I left it blank for the moment

From my understanding, you create an issue in the kubernetes/enhancements repo and use that issue number as the KEP number.

Also don't forget to sign the CLA.

kannon92 · 2023-02-07T06:45:22Z

keps/sig-apps/NNNN-backoff-limits-per-index-for-indexed-jobs/README.md

+indexed jobs were used as the basis for a suite of long-running integration tests,
+then each test run would only be able to find a single test failure.
+
+More generally, this use case is for any situation when all the indices in an indexed job 


I find this comment a bit confusing. You mention above that the implementation does not cover the situation where the workload is truly embarrassingly parallel.

But then you also mention that the use case is for when all indices are independent. I always thought that was what embarrassingly parallel meant: that all the cases are independent.

Thank you for the feedback, I absolutely agree that it's confusing. I meant that the current implementation does not cover the use case where all indices are independent (embarrassingly parallel), and that this KEP would cover it.

I've pushed a revised version that hopefully clears it up.
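
To make the distinction in this review thread concrete, here is a minimal sketch of the two counting modes. It is not the Job controller's implementation and not the API proposed in this KEP: the `simulate` function, its parameters, and the sequential one-index-at-a-time execution model are all assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical marker for an index that never succeeds (illustration only).
ALWAYS_FAILS = math.inf


def simulate(
    failures_before_success: List[float],
    backoff_limit: int,
    per_index: bool,
) -> Tuple[List[int], List[int]]:
    """Return (succeeded_indices, failed_indices) for a toy Indexed Job.

    Assumptions: indices run one at a time in order, and a counter is
    exhausted once it exceeds backoff_limit.
    """
    succeeded: List[int] = []
    failed: List[int] = []
    job_failures = 0  # shared counter used in per-job mode

    for index, needed in enumerate(failures_before_success):
        index_failures = 0  # counter used in per-index mode
        while True:
            if index_failures < needed:
                # This attempt fails; charge both counters.
                index_failures += 1
                job_failures += 1
                count = index_failures if per_index else job_failures
                if count > backoff_limit:
                    failed.append(index)
                    if not per_index:
                        # Per-job mode: the whole Job fails and the
                        # remaining indices never get to run.
                        return succeeded, failed
                    break  # per-index mode: move on to the next index
            else:
                succeeded.append(index)
                break

    return succeeded, failed


workload = [0, ALWAYS_FAILS, 0, 1]
print(simulate(workload, backoff_limit=2, per_index=False))
print(simulate(workload, backoff_limit=2, per_index=True))
```

In this toy model, the per-job counter stops the run as soon as index 1 exhausts the shared limit, so indices 2 and 3 never execute. With a per-index counter, only index 1 is marked failed and the remaining independent indices still run to completion, which matches the integration-test use case from the KEP text.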

soltysh · 2023-02-07T09:35:33Z

@jensentanlo I'd also suggest bringing this topic to a sig-apps meeting before submitting a KEP, to present the idea and get feedback from the SIG about the direction and feasibility.

alculquicondor · 2023-02-07T13:05:09Z

/cc

jensentanlo · 2023-02-07T14:08:23Z

> @jensentanlo I'd also suggest bringing this topic to a sig-apps meeting before submitting a KEP, to present the idea and get feedback from the SIG about the direction and feasibility.

Hi soltysh, I've been discussing this on and off for a while with some members of sig-apps on this GitHub issue: kubernetes/kubernetes#109712

I'd be happy to bring it to a sig-apps meeting if that would be helpful; what's the procedure to get on the agenda?

soltysh · 2023-02-08T09:58:00Z

> I'd be happy to bring it to a sig-apps meeting if that would be helpful; what's the procedure to get on the agenda?

Just add your topic with your name/github handle to our agenda for the next one, which will be Feb 20th. The meetings are open to everyone to join and discuss these topics in a more interactive form 😄

Backoff Limit Per Job

7cbd213

Add New Indexed Job backoff limit mode that is counted per index rather than per job, allowing all indices to run to completion Issue Link: kubernetes/kubernetes#109712

k8s-ci-robot added the cncf-cla: no, kind/kep, and sig/apps labels Jan 23, 2023

k8s-ci-robot requested review from kow3ns and soltysh January 23, 2023 23:42

k8s-ci-robot added the needs-ok-to-test label Jan 23, 2023

k8s-ci-robot added the size/XL label Jan 23, 2023

Add API spec

6113389

kannon92 reviewed Feb 7, 2023


k8s-ci-robot requested a review from alculquicondor February 7, 2023 13:05

jensentanlo mentioned this pull request Feb 7, 2023

Backoff Limit Per Index For Indexed Jobs #3850

Closed

12 tasks

Add KEP number, clean-up motivation section

39d51ad

jensentanlo closed this May 4, 2023

mimowo mentioned this pull request May 17, 2023

Backoff limit per Job Index #3967

Merged
