Add option to keep Ray clusters after a job has finished #794

Merged
psschwei merged 7 commits into Qiskit:main from shutdown-delay on Jul 19, 2023

Conversation

psschwei
Collaborator

@psschwei psschwei commented Jul 18, 2023

Summary

Fixes #783
Add option to keep Ray clusters after a job has finished

Details and comments

Creates a new environment variable, RAY_CLUSTER_NO_DELETE_ON_COMPLETE, for the scheduler, which allows operators to preserve Ray clusters after a job has completed.

Also adds a nodelete=true label to the Ray cluster, so that a simple kubectl delete raycluster -l nodelete=true removes all clusters that won't be cleaned up automatically.
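As a rough illustration of the behaviour described above, here is a minimal Python sketch, not the code from this PR. Only the variable name RAY_CLUSTER_NO_DELETE_ON_COMPLETE and the nodelete=true label come from the description; the function names, the set of accepted truthy values, and the delete_cluster callback are hypothetical.

```python
import os

# Assumption: the PR does not say how the value is parsed, so this sketch
# accepts a few common truthy spellings.
_TRUTHY = {"1", "true", "yes", "on"}


def keep_cluster_after_job(env=None):
    """Return True when RAY_CLUSTER_NO_DELETE_ON_COMPLETE holds a truthy value."""
    env = os.environ if env is None else env
    raw = env.get("RAY_CLUSTER_NO_DELETE_ON_COMPLETE", "")
    return raw.strip().lower() in _TRUTHY


def cluster_labels(base_labels, keep):
    """Return a copy of the labels, adding nodelete=true for preserved clusters."""
    labels = dict(base_labels)
    if keep:
        labels["nodelete"] = "true"
    return labels


def on_job_finished(cluster_name, delete_cluster, env=None):
    """Delete the cluster unless the operator asked to keep it.

    delete_cluster is a hypothetical callback standing in for whatever the
    scheduler uses to remove a Ray cluster. Returns True if it was called.
    """
    if keep_cluster_after_job(env):
        return False
    delete_cluster(cluster_name)
    return True
```

With the variable unset, on_job_finished calls the callback as before. With it set, the cluster is left in place and carries the label, so the kubectl selector mentioned above can find it later.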

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
psschwei added 2 commits July 18, 2023 13:52
@psschwei psschwei added the on-hold label (On hold due to any reason) Jul 18, 2023
@psschwei psschwei changed the title from "Add shutdown delay for Ray clusters to gateway" to "Add option to keep Ray clusters after a job has finished" Jul 18, 2023
psschwei added 4 commits July 19, 2023 08:21
Collaborator

@akihikokuroda akihikokuroda left a comment

LGTM Thanks!

@psschwei psschwei merged commit 5513b55 into Qiskit:main Jul 19, 2023
@psschwei psschwei deleted the shutdown-delay branch July 19, 2023 14:12
Labels
on-hold On hold due to any reason
Development

Successfully merging this pull request may close these issues.

Add shut-down-delay parameter for Ray clusters
2 participants