controllers: fix miscalculation of RunningDuration when killing job #3719
Welcome @matbme!
7aecf9a to 2fb4d7a
/assign @Monokaix
2fb4d7a to 04870f9
/ok-to-test
/lgtm
/lgtm
Hi, please rebase onto master.
/assign @hwdef
Could you please paste a test result or add a unit test?
Here's a manual test I did with and without this PR applied:
# ---------- Without PR ----------
$ k describe jobs.batch.volcano.sh test-workload-1 | grep -B 11 'Running Duration'
Status:
Conditions:
Last Transition Time: 2024-09-17T19:25:22Z
Status: Pending
Last Transition Time: 2024-09-17T19:25:54Z
Status: Running
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completing
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completed
Min Available: 1
Running Duration: 6m8.810943003s
$ k -n volcano-system rollout restart deployment volcano-controllers
deployment.apps/volcano-controllers restarted
$ k describe jobs.batch.volcano.sh test-workload-1 | grep -B 11 'Running Duration'
Status:
Conditions:
Last Transition Time: 2024-09-17T19:25:22Z
Status: Pending
Last Transition Time: 2024-09-17T19:25:54Z
Status: Running
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completing
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completed
Min Available: 1
Running Duration: 8m36.103534833s
# ---------- With PR ----------
$ k describe jobs.batch.volcano.sh test-workload-1 | grep -B 11 'Running Duration'
Status:
Conditions:
Last Transition Time: 2024-09-17T19:25:22Z
Status: Pending
Last Transition Time: 2024-09-17T19:25:54Z
Status: Running
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completing
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completed
Min Available: 1
Running Duration: 51s
$ k -n volcano-system rollout restart deployment volcano-controllers
deployment.apps/volcano-controllers restarted
$ k describe jobs.batch.volcano.sh test-workload-1 | grep -B 11 'Running Duration'
Status:
Conditions:
Last Transition Time: 2024-09-17T19:25:22Z
Status: Pending
Last Transition Time: 2024-09-17T19:25:54Z
Status: Running
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completing
Last Transition Time: 2024-09-17T19:26:12Z
Status: Completed
Min Available: 1
Running Duration: 51s
Note that [...] Regarding unit tests, I originally intended to add a test for this scenario, but had trouble replicating this behavior inside a unit test. Please let me know if you have any pointers on how I could achieve this (maybe an e2e test would be more fitting).
Please squash to one commit :)
6878089 to 72d3140
Uses the last transition time instead of current time to determine total running duration.
Signed-off-by: Mateus Melchiades <mateus.melchiades@sap.com>
2646e56 to c700b70
Done!
@william-wang @Thor-wl @shinytang6
Hi! Any updates on the CI?
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Monokaix
When a controller pod restarts, it calls killJob on completed jobs, which erroneously changes the RunningDuration value for said job, even though it had already completed. This PR fixes the issue by using the last transition time instead of the current time to determine the total running duration inside killJob.
Closes #3715.