JobSet TTL to clean up completed workloads #279
Comments
I am a bit confused by this. Are you saying we should have a field in JobSet that purges the JobSet from etcd after a given TTL? Or do you want someone to test what happens when a TTL is set on a Job: will the Job finish and then get deleted? And even if the JobSet was marked as successful (when all Jobs finish), will the JobSet recreate the finished Job because it was deleted?
So individual jobs can use ttl without any issue.
Jobs are cleaned up and deleted from etcd, but the JobSet is still marked as successful. I think we could add a ttl for JobSet, since the JobSet itself will not be deleted. IMO this would be a nice feature for any CRD, though.
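For reference, the per-Job TTL mentioned above is the `ttlSecondsAfterFinished` field in the `batch/v1` Job spec. Below is a minimal sketch of such a manifest built as a Python dict. The Job name, container image, command, and the 100-second value are illustrative assumptions, not taken from the test described in this thread.

```python
import json


def job_with_ttl(name: str, ttl_seconds: int) -> dict:
    """Build a batch/v1 Job manifest that becomes eligible for automatic
    deletion ttl_seconds after it finishes (Complete or Failed)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},  # hypothetical name
        "spec": {
            # Real Job field: seconds to keep the Job after it finishes.
            "ttlSecondsAfterFinished": ttl_seconds,
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "main",
                            "image": "busybox",  # illustrative image
                            "command": ["sh", "-c", "echo done"],
                        }
                    ],
                    "restartPolicy": "Never",
                }
            },
        },
    }


manifest = job_with_ttl("ttl-demo", 100)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.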
Thanks for testing
Yea I didn’t see jobs recreated in my case.
Sounds good, thanks for taking a look at this.
I think we actually need a TTL in JobSet to improve the user experience with the Kueue integration. Right now, since JobSet has no TTL, users have to delete all Workloads themselves, either manually or with some process that does it periodically. If JobSet had a TTL, the JobSet and its Workload object would be cleaned up automatically.
Can you clean up the description? I think this could be a good first issue if we clarify the objectives: add a ttlafterFinished field and delete finished JobSets after that time. The question I would have is: do we delete the jobs and pods of finished JobSets? That would make debugging difficult, but if they finished I guess we don’t need them.
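The proposed behavior (delete a finished JobSet once its TTL has elapsed) comes down to a small time comparison. Here is a rough sketch of that check, assuming the JobSet records a finish timestamp and carries an optional TTL in seconds. The function and parameter names are hypothetical and do not reflect the actual JobSet API or controller code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_deletion(
    finished_at: Optional[datetime],
    ttl_seconds: Optional[int],
    now: datetime,
) -> Optional[float]:
    """Return None if the object is not eligible for TTL cleanup
    (still running, or no TTL set), 0.0 if the TTL has already expired,
    otherwise the number of seconds left before it may be deleted."""
    if finished_at is None or ttl_seconds is None:
        return None
    expires_at = finished_at + timedelta(seconds=ttl_seconds)
    remaining = (expires_at - now).total_seconds()
    return max(remaining, 0.0)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
finished = now - timedelta(seconds=30)

print(seconds_until_deletion(finished, 100, now))  # TTL not yet reached
print(seconds_until_deletion(finished, 10, now))   # TTL already expired
print(seconds_until_deletion(None, 100, now))      # not finished yet
```

A controller built on this idea would delete the object when the result is `0.0` and otherwise requeue it after the returned number of seconds.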
/help
I have capacity to help with this issue if no one else is looking at it.
I guess we only need to agree on the final question from @kannon92, and then it should be a clear path to the finish line.
/assign
@danielvegamyhre what is your opinion on what Kevin asked about cleaning up jobs and pods after a jobset finishes?
After the jobset finishes and the TTL has been reached, we want the jobset and its child resources (jobs, pods, service, etc.) to be deleted.
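To make "the jobset and its child resources" concrete: assuming each child carries an owner reference pointing back up the chain (pod to job, job to jobset), deleting the JobSet lets Kubernetes garbage collection remove everything beneath it. The sketch below only models that traversal in plain Python. The object names and the `owners` mapping are hypothetical, and this is not how the JobSet controller itself is implemented.

```python
from collections import defaultdict, deque
from typing import Dict, Optional, Set


def objects_to_delete(owners: Dict[str, Optional[str]], root: str) -> Set[str]:
    """Given a mapping of object -> owner (None for top-level objects),
    return the root plus every object that transitively depends on it."""
    children = defaultdict(list)
    for name, owner in owners.items():
        if owner is not None:
            children[owner].append(name)

    doomed: Set[str] = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        if current in doomed:
            continue
        doomed.add(current)
        queue.extend(children[current])
    return doomed


# Hypothetical cluster state: two JobSets with their dependents.
owners = {
    "jobset/demo": None,
    "job/demo-workers-0": "jobset/demo",
    "pod/demo-workers-0-0-abc": "job/demo-workers-0",
    "service/demo": "jobset/demo",
    "jobset/other": None,
    "job/other-workers-0": "jobset/other",
}

print(sorted(objects_to_delete(owners, "jobset/demo")))
```

Objects owned by the unrelated `jobset/other` are left untouched, which is the property that makes TTL-driven cleanup safe to run per JobSet.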
Thanks @danielvegamyhre, I am good to proceed :)
/retitle JobSet TTL to clean up completed workloads
I was testing this feature. I noticed that pods and jobs were getting deleted with ttl, but I did not see jobsets getting deleted.
That is not good. Do you want to open an issue with any logs you have noticed?
@dejanzele can we add an e2e test as well?
Sorry, I was mistaken. I created the example ttl job and noticed that the jobset was still around but the jobs/pods were deleted. But I tested again and did actually see the jobset being deleted eventually. I'm not sure they happen at the same time, but I do see ttl working with JobSet/Jobs/Pods.
@ahg-g ok, I'll create a PR
We should look into having JobSet respect TTL so that completed JobSets can be deleted from etcd and do not take up space.
https://kubernetes.io/docs/concepts/workloads/controllers/ttlafterfinished/