
Support minSuccess #1384

Merged
merged 6 commits into from
Mar 31, 2021

Conversation

zen-xu
Contributor

@zen-xu zen-xu commented Mar 26, 2021

ref: #1379

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 26, 2021
@Thor-wl
Contributor

Thor-wl commented Mar 26, 2021

/assign @william-wang @wpeng102 @shinytang6

@volcano-sh-bot
Contributor

@Thor-wl: GitHub didn't allow me to assign the following users: wpeng102.

Note that only volcano-sh members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @william-wang @wpeng102 @shinytang6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -50,6 +50,10 @@ spec:
to the summary of tasks' replicas
format: int32
type: integer
minSuccess:
Contributor

It's better to set the replicas as the default value.

config/crd/v1beta1/batch.volcano.sh_jobs.yaml

// The minimal success pods to run for this Job
// +optional
MinSuccess *int32 `json:"minSuccess,omitempty" protobuf:"varint,11,opt,name=minSuccess"`
Contributor

Why set this field as *int, not int?

@@ -56,6 +56,13 @@ func (ps *runningState) Execute(action v1alpha1.Action) error {
// when scale down to zero, keep the current job phase
return false
}

minSuccess := ps.job.Job.Spec.MinSuccess
Contributor

Got the reason why you set it as *int instead of int. I would prefer int, with the default value set to replicas. That may be better. What do you think?

Contributor Author

@zen-xu zen-xu Mar 26, 2021

I still think this field should be *int.

MinSuccess is an optional feature. If the user has not set this field, the logic should be the same as the original.

Contributor

I still think this field should be *int.

MinSuccess is an optional feature. If the user has not set this field, the logic should be the same as the original.

Yes, it's reasonable.
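
The distinction argued in this thread can be sketched in a few lines of Go. This is only an illustration, not the code from the PR: JobSpec here is a hypothetical, trimmed-down struct that keeps just the field under discussion, and describe is a helper invented for the example. It shows that a pointer field lets the controller tell "not set" apart from an explicit value (including 0), which a plain int32 cannot do.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobSpec is a hypothetical stand-in for the real Job spec; only the
// field discussed in this thread is kept.
type JobSpec struct {
	// MinSuccess is optional. A nil pointer means the user did not set it.
	MinSuccess *int32 `json:"minSuccess,omitempty"`
}

// describe (hypothetical helper) decodes a JSON spec and reports whether
// minSuccess was set by the user.
func describe(raw string) string {
	var spec JobSpec
	if err := json.Unmarshal([]byte(raw), &spec); err != nil {
		return "invalid: " + err.Error()
	}
	if spec.MinSuccess == nil {
		return "unset"
	}
	return fmt.Sprintf("set to %d", *spec.MinSuccess)
}

func main() {
	fmt.Println(describe(`{}`))                // field omitted
	fmt.Println(describe(`{"minSuccess": 0}`)) // explicit zero
	fmt.Println(describe(`{"minSuccess": 3}`)) // explicit value
}
```

With a plain int32 field, the first two inputs would both decode to 0, so the controller could not fall back to the original behaviour only when the field is absent.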

status.State.Phase = vcbatch.Completed
return true
}

if status.Succeeded+status.Failed == jobReplicas {
if status.Succeeded >= ps.job.Job.Spec.MinAvailable {
status.State.Phase = vcbatch.Completed
Member

If we use minSuccess, do we still need MinAvailable to determine job status?

Contributor Author

We can use MinAvailable to determine job status when users don't want to use MinSuccess.

Member

@shinytang6 shinytang6 Mar 26, 2021

Yes, I agree. So maybe we can add another layer of logic inside if status.Succeeded+status.Failed == jobReplicas? If minSuccess is specified, we use minSuccess instead of minAvailable.

Contributor Author

@zen-xu zen-xu Mar 26, 2021

If minSuccess has been met, the Job will be marked Completed, so it's not necessary to handle the next comparison.

Member

@shinytang6 shinytang6 Mar 26, 2021

I mean: if minSuccess is specified but status.Succeeded < *minSuccess the whole time, then after all the pods have completed it will enter the if status.Succeeded+status.Failed == jobReplicas branch. In that case we should replace if status.Succeeded >= ps.job.Job.Spec.MinAvailable with if status.Succeeded >= minSuccess. WDYT?

Contributor Author

Agree
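
The outcome agreed in this thread can be sketched as a small standalone function. This is a hedged illustration rather than the merged code: Phase, nextPhase, and the flat parameter list are hypothetical names made up for the example, and returning Failed when the threshold is missed is an assumption, since the else branch is not visible in the diff above. The scale-down-to-zero case mentioned in the diff is not modelled.

```go
package main

import "fmt"

// Phase is a hypothetical stand-in for the job phase type.
type Phase string

const (
	Running   Phase = "Running"
	Completed Phase = "Completed"
	Failed    Phase = "Failed"
)

// nextPhase (hypothetical) sketches the decision discussed above.
func nextPhase(succeeded, failed, replicas, minAvailable int32, minSuccess *int32) Phase {
	// Early exit: enough pods have succeeded, even if others are unfinished.
	if minSuccess != nil && succeeded >= *minSuccess {
		return Completed
	}
	// Some pods have not finished yet: keep the current phase.
	if succeeded+failed != replicas {
		return Running
	}
	// All pods finished. Use minSuccess as the threshold when it is set,
	// otherwise fall back to minAvailable as before.
	threshold := minAvailable
	if minSuccess != nil {
		threshold = *minSuccess
	}
	if succeeded >= threshold {
		return Completed
	}
	// Assumption: missing the threshold marks the job Failed.
	return Failed
}

func main() {
	three := int32(3)
	fmt.Println(nextPhase(3, 0, 5, 2, &three)) // minSuccess reached early
	fmt.Println(nextPhase(2, 3, 5, 2, &three)) // all done, minSuccess missed
	fmt.Println(nextPhase(2, 3, 5, 2, nil))    // all done, minAvailable met
	fmt.Println(nextPhase(1, 1, 5, 2, nil))    // pods still running
}
```

The second and third calls differ only in whether minSuccess is set, which is the case shinytang6 raised: once every pod has finished, the threshold has to be minSuccess when the user specified it.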

zen-xu added 2 commits March 26, 2021 14:02
Signed-off-by: ZhengYu, Xu <zen-xu@outlook.com>
Signed-off-by: ZhengYu, Xu <zen-xu@outlook.com>
@Thor-wl
Contributor

Thor-wl commented Mar 26, 2021

related issue: #1379

Signed-off-by: ZhengYu, Xu <zen-xu@outlook.com>
@Thor-wl
Contributor

Thor-wl commented Mar 26, 2021

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 26, 2021
@zen-xu
Contributor Author

zen-xu commented Mar 26, 2021

@Thor-wl needs approved label

@Thor-wl
Contributor

Thor-wl commented Mar 27, 2021

/approve

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 27, 2021
@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 27, 2021
Signed-off-by: ZhengYu, Xu <zen-xu@outlook.com>
@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 27, 2021
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shinytang6, Thor-wl, zen-xu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 27, 2021
@zen-xu zen-xu mentioned this pull request Mar 31, 2021
@Thor-wl
Contributor

Thor-wl commented Mar 31, 2021

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2021
@volcano-sh-bot volcano-sh-bot merged commit fd7a7f2 into volcano-sh:master Mar 31, 2021
@zen-xu zen-xu deleted the min-success branch April 6, 2021 07:23
@Thor-wl
Contributor

Thor-wl commented May 26, 2021

This feature seems to be missing a design doc. @zen-xu

@zen-xu
Contributor Author

zen-xu commented May 26, 2021

A little busy these weeks, I will complete it later.

@Thor-wl
Contributor

Thor-wl commented May 26, 2021

A little busy these weeks, I will complete it later.

/assign @zen-xu

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants