[Core/UX] Add Job API and support managed job (both on-demand and spot) #3419
LGTM!
Thanks for this great work @Michaelvll !! It looks awesome to me. Left some nits and it should be ready to go! 🚀
docs/source/images/job-dashboard.png
Why don't jobs 13 and 14 have any [spot]/[On-demand] suffix? Does that mean they do not require any special resources?
For simplicity, we do not add the suffix for On-Demand-only requests, and add the suffix in all other cases. Wdyt?
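The rule described in this reply can be sketched as a small helper. This is only an illustration: the function name `display_suffix` and the exact suffix strings are assumptions and are not taken from the PR's code.

```python
from typing import Iterable


def display_suffix(use_spot_flags: Iterable[bool]) -> str:
    """Hypothetical helper: suffix shown next to a job's requested resources.

    Each flag says whether one candidate resource is a spot instance.
    The suffix strings below are illustrative assumptions.
    """
    kinds = set(use_spot_flags)
    if not kinds:
        raise ValueError('at least one resource request is required')
    if kinds == {False}:
        # On-demand-only request: no suffix, for simplicity.
        return ''
    if kinds == {True}:
        # Spot-only request.
        return '[Spot]'
    # Mixed request: both spot and on-demand candidates.
    return '[Spot+On-demand]'
```

Under this sketch, a job that lists no suffix (such as jobs 13 and 14 in the screenshot) would simply be an on-demand-only request.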
sky/utils/controller_utils.py
# TODO(zhwu): Backward compatibility for the old config for managed
# job controller.
this is done iiuc?
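For context, a backward-compatible lookup of the kind the TODO describes might look roughly like the sketch below. It works on plain dictionaries; the function name and the `jobs` / `spot` section names are assumptions made for illustration and are not confirmed by the diff shown here.

```python
from typing import Any, Dict


def controller_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical lookup: prefer the new section, fall back to the old one.

    The section names 'jobs' (new) and 'spot' (old) are assumptions.
    """
    for section in ('jobs', 'spot'):
        # `or {}` guards against a section that is present but set to None.
        controller = (config.get(section) or {}).get('controller')
        if controller is not None:
            return controller
    # Neither section configures the controller.
    return {}
```

With this shape, a user who still has only the old section keeps working, and the new section wins when both are present.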
Co-authored-by: Tian Xia <cblmemo@gmail.com>
Note: this PR requires a sky start -f sky-spot-controller-xx to make commands like sky spot logs and sky spot queue work. (We found a way to handle this.)
Refer to the latest doc: https://skypilot.readthedocs.io/en/job-api/examples/managed-jobs.html#using-both-spot-and-on-demand-instances
TODO:
sky job queue runs on an old controller
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh
sky spot launch -n pipeline tests/test_yamls/pipeline.yaml; this PR: sky job queue; sky job logs; sky job dashboard; sky job cancel; sky job queue
bash -i tests/backward_comaptibility_tests.sh 0 7