[Core/UX] Add Job API and support managed job (both on-demand and spot) #3419

Merged
merged 174 commits into from
May 5, 2024
Changes from 153 commits
Commits
f1c5176
Refactor spot core APIs to `sky.spot.core`
Michaelvll Apr 21, 2024
120cf45
Add comment
Michaelvll Apr 21, 2024
5f947fb
fix
Michaelvll Apr 21, 2024
deb4781
format
Michaelvll Apr 21, 2024
ef21314
change to spot_lib instead
Michaelvll Apr 4, 2024
8a6d4b8
change spot to job
Michaelvll Apr 4, 2024
3df1c13
rename modules
Michaelvll Apr 4, 2024
4b81f5d
rename to managed job
Michaelvll Apr 4, 2024
925281e
fix
Michaelvll Apr 4, 2024
11f931d
Allow on-demand for managed job
Michaelvll Apr 4, 2024
b6781aa
fix launch
Michaelvll Apr 4, 2024
20b067c
Fixes names
Michaelvll Apr 4, 2024
6c85015
rename to job controller
Michaelvll Apr 4, 2024
18bcead
rename to job controller
Michaelvll Apr 4, 2024
cf248d4
Fix job recovery
Michaelvll Apr 4, 2024
0cb98d0
format
Michaelvll Apr 4, 2024
7b70d32
Add CLI alias
Michaelvll Apr 4, 2024
70fb42b
format
Michaelvll Apr 4, 2024
934721f
rename
Michaelvll Apr 4, 2024
c63eecd
improve resources
Michaelvll Apr 4, 2024
fa3f2ea
fix doc
Michaelvll Apr 4, 2024
0db05d8
fix test
Michaelvll Apr 4, 2024
2d8ca95
fix _cpus
Michaelvll Apr 4, 2024
cd63b63
fallback to old controller
Michaelvll Apr 4, 2024
6fdd8c6
fix unit test
Michaelvll Apr 4, 2024
9737593
backward compat job
Michaelvll Apr 4, 2024
941f096
change to --managed-job
Michaelvll Apr 4, 2024
66f7e2b
fix job_recovery
Michaelvll Apr 4, 2024
7ef2fb9
refactor schemas
Michaelvll Apr 4, 2024
b6db72f
remove resources not having price
Michaelvll Apr 21, 2024
afe62dd
format
Michaelvll Apr 5, 2024
4cef17c
fix
Michaelvll Apr 21, 2024
dbea520
fix managed job
Michaelvll Apr 21, 2024
85645ae
format
Michaelvll Apr 21, 2024
25dea6d
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 21, 2024
66241cb
format
Michaelvll Apr 21, 2024
c0c9693
fix
Michaelvll Apr 22, 2024
119121f
fix type for resource str
Michaelvll Apr 22, 2024
ae619f8
Fix test smoke
Michaelvll Apr 22, 2024
572c1ca
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 22, 2024
91ad666
add request output
Michaelvll Apr 22, 2024
839325d
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 24, 2024
a043887
Merge and format
Michaelvll Apr 24, 2024
a1415f7
merge error fix
Michaelvll Apr 24, 2024
8446250
fix merge issue
Michaelvll Apr 24, 2024
ccf9a44
fix output
Michaelvll Apr 24, 2024
e8abd9a
fix test
Michaelvll Apr 24, 2024
fa2f0ff
rename to jobs
Michaelvll Apr 24, 2024
187df84
Replace spot_recovery to job_recovery
Michaelvll Apr 24, 2024
04dcc6b
rename spot_ to jobs_
Michaelvll Apr 24, 2024
db12120
format
Michaelvll Apr 24, 2024
8abcdca
add legacy job signal
Michaelvll Apr 24, 2024
04f5207
address comments
Michaelvll Apr 24, 2024
bda6ddd
renames
Michaelvll Apr 25, 2024
7a52581
Fix controller type
Michaelvll Apr 25, 2024
0573341
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 25, 2024
951c048
incorporate #2080
Michaelvll Apr 25, 2024
e67347d
fix managed jobs
Michaelvll Apr 25, 2024
50849c5
fix dashboard
Michaelvll Apr 25, 2024
81e7d8c
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 25, 2024
a4dd6ed
Fix test
Michaelvll Apr 25, 2024
df02544
remove old code
Michaelvll Apr 25, 2024
6547077
format
Michaelvll Apr 25, 2024
2b2cab6
update doc
Michaelvll Apr 26, 2024
0af2115
fix managed jobs
Michaelvll Apr 26, 2024
a57c8f7
Address comments
Michaelvll Apr 26, 2024
1fac5fb
address comments
Michaelvll Apr 26, 2024
9feb2ad
Add raw file
Michaelvll Apr 26, 2024
eec4760
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 26, 2024
493e828
only use aws and gcp for pipeline
Michaelvll Apr 26, 2024
aaf91c2
Fix doc
Michaelvll Apr 28, 2024
5682559
use managed-jobs
Michaelvll Apr 28, 2024
5cb9c4a
Fix back
Michaelvll Apr 28, 2024
29e6baa
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 29, 2024
28358b2
fix
Michaelvll Apr 29, 2024
708c0c6
address comments
Michaelvll Apr 29, 2024
0fc5043
Fix
Michaelvll Apr 29, 2024
4d6c6ee
format
Michaelvll Apr 29, 2024
3404f2a
fix
Michaelvll Apr 29, 2024
23e979b
Fix
Michaelvll Apr 29, 2024
431f567
minor
Michaelvll Apr 29, 2024
97e5dfa
Update sky/cli.py
Michaelvll Apr 30, 2024
69331ff
Update sky/job/utils.py
Michaelvll Apr 30, 2024
38212e3
Update sky/cli.py
Michaelvll Apr 30, 2024
238bc8b
Update sky/job/state.py
Michaelvll Apr 30, 2024
0580d1a
Update sky/job/core.py
Michaelvll Apr 30, 2024
04d4d6e
Update sky/job/utils.py
Michaelvll Apr 30, 2024
e02e2cf
Update sky/job/core.py
Michaelvll Apr 30, 2024
00b7259
format
Michaelvll Apr 30, 2024
cae05f1
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll Apr 30, 2024
affe273
address comments
Michaelvll Apr 30, 2024
b6ebf9a
Fix optimizer table
Michaelvll Apr 30, 2024
9e1a952
Fix best plan
Michaelvll Apr 30, 2024
c07d0e3
revert version 3
Michaelvll Apr 30, 2024
5936cc9
fix
Michaelvll Apr 30, 2024
5d18db3
add backward compat
Michaelvll Apr 30, 2024
552a62c
fix back compat
Michaelvll Apr 30, 2024
fec69d3
fix docs
Michaelvll Apr 30, 2024
7f80c8a
fix PR template
Michaelvll Apr 30, 2024
c524e06
Fix docs
Michaelvll Apr 30, 2024
faa3ab5
format
Michaelvll Apr 30, 2024
525444f
fix job logs
Michaelvll Apr 30, 2024
63c62bc
fix
Michaelvll Apr 30, 2024
e66d5a5
fix
Michaelvll Apr 30, 2024
87e0bcc
fix docs
Michaelvll Apr 30, 2024
0937799
fix backward
Michaelvll Apr 30, 2024
c78077a
fix backward
Michaelvll Apr 30, 2024
dbd7779
fix localstorage
Michaelvll May 1, 2024
65fce74
fix job name in optimizer table
Michaelvll May 1, 2024
0299942
Update managed-jobs.rst
concretevitamin May 1, 2024
97fd51d
Update docs/source/reference/cli.rst
Michaelvll May 1, 2024
67d58ec
address comments
Michaelvll May 1, 2024
1564583
Fix job controller name
Michaelvll May 1, 2024
7a987f0
Update sky/job/controller.py
Michaelvll May 1, 2024
6f469e4
Update sky/job/utils.py
Michaelvll May 1, 2024
1387e9f
add ress comment
Michaelvll May 1, 2024
b5a807a
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 1, 2024
d6443f8
format
Michaelvll May 1, 2024
da028ad
Add comment
Michaelvll May 1, 2024
f3b609d
Add comments
Michaelvll May 1, 2024
81d7430
check status again
Michaelvll May 1, 2024
75756dd
check again
Michaelvll May 1, 2024
94a2a4a
avoid lock
Michaelvll May 1, 2024
c855685
format
Michaelvll May 1, 2024
8065ad3
Update sky/backends/cloud_vm_ray_backend.py
Michaelvll May 2, 2024
19534fe
address comments
Michaelvll May 2, 2024
7cd7d9b
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 2, 2024
67c2ad6
rename to jobs and add CLI alias to job
Michaelvll May 2, 2024
f5c61cc
Add depdencies for all on-demand clouds
Michaelvll May 2, 2024
16023c6
fix
Michaelvll May 2, 2024
1847259
Fix
Michaelvll May 2, 2024
5d79bdf
fix test smoke
Michaelvll May 2, 2024
c5a3390
Update sky/utils/schemas.py
Michaelvll May 2, 2024
0892e0a
fix names
Michaelvll May 2, 2024
848194d
format
Michaelvll May 2, 2024
a5e8cd0
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 2, 2024
73ff8bf
fix
Michaelvll May 2, 2024
3e89954
0.8.0 instead
Michaelvll May 2, 2024
15f50fd
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 2, 2024
0569a1d
format
Michaelvll May 2, 2024
cded6bb
fix doc
Michaelvll May 2, 2024
cf6b849
fix cloudflare
Michaelvll May 2, 2024
eedfb19
Fix job dashboard
Michaelvll May 2, 2024
b855c66
Merge branch 'master' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 2, 2024
7cf0c71
fix smoke
Michaelvll May 2, 2024
47c1a83
update
Michaelvll May 2, 2024
ec3b9ee
add comment for deprecation
Michaelvll May 2, 2024
914ac91
rename to jobs
Michaelvll May 3, 2024
40868cd
Update sky/clouds/cloud.py
Michaelvll May 3, 2024
a53f411
Update sky/jobs/utils.py
Michaelvll May 3, 2024
a57026a
Fix jobs
Michaelvll May 3, 2024
00edb54
format
Michaelvll May 3, 2024
c7c740b
Update sky/jobs/utils.py
Michaelvll May 3, 2024
fbbf489
minor
Michaelvll May 3, 2024
b540017
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 3, 2024
2b9af1c
separate boto3 and awscli
Michaelvll May 4, 2024
24e45f2
fix
Michaelvll May 4, 2024
b2fbcbf
Update docs/source/reference/config.rst
Michaelvll May 4, 2024
b81884d
rename to jobs controller
Michaelvll May 4, 2024
a07be60
format
Michaelvll May 4, 2024
33b644c
Rename to JobsController
Michaelvll May 4, 2024
9db4e47
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 4, 2024
81c6794
fix
Michaelvll May 4, 2024
4357685
renames
Michaelvll May 4, 2024
2610995
format
Michaelvll May 4, 2024
beaa5b7
format
Michaelvll May 4, 2024
fd7f97a
fix name in test
Michaelvll May 4, 2024
897c048
address comments
Michaelvll May 4, 2024
e683cfd
Fix docs
Michaelvll May 4, 2024
1155a2c
Add managed job yaml
Michaelvll May 4, 2024
f9cb3b7
fix
Michaelvll May 4, 2024
69ec86f
Update sky/cli.py
Michaelvll May 4, 2024
b69eee2
fix comment
Michaelvll May 4, 2024
5ca2f52
Merge branch 'job-api' of github.com:skypilot-org/skypilot into job-api
Michaelvll May 4, 2024
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -11,4 +11,4 @@ Tested (run the relevant ones):
 - [ ] Any manual or new tests for this PR (please specify below)
 - [ ] All smoke tests: `pytest tests/test_smoke.py`
 - [ ] Relevant individual smoke tests: `pytest tests/test_smoke.py::test_fill_in_the_name`
-- [ ] Backward compatibility tests: `bash tests/backward_comaptibility_tests.sh`
+- [ ] Backward compatibility tests: `conda deactivate; bash -i tests/backward_compatibility_tests.sh`
2 changes: 1 addition & 1 deletion .github/workflows/pytest.yml
@@ -27,7 +27,7 @@ jobs:
 - tests/test_optimizer_random_dag.py
 - tests/test_storage.py
 - tests/test_wheels.py
-- tests/test_spot_serve.py
+- tests/test_jobs_and_serve.py
 - tests/test_yaml_parser.py
 runs-on: ubuntu-latest
 steps:
1 change: 1 addition & 0 deletions docs/source/_static/custom.js
@@ -26,6 +26,7 @@ document.addEventListener('DOMContentLoaded', () => {
 // New items:
 const newItems = [
 { selector: '.caption-text', text: 'SkyServe: Model Serving' },
+{ selector: '.toctree-l1 > a', text: 'Managed Jobs' },
 { selector: '.toctree-l1 > a', text: 'Running on Kubernetes' },
 { selector: '.toctree-l1 > a', text: 'DBRX (Databricks)' },
 { selector: '.toctree-l1 > a', text: 'Ollama' },
4 changes: 2 additions & 2 deletions docs/source/docs/index.rst
@@ -121,7 +121,7 @@ Contents
 :maxdepth: 1
 :caption: Running Jobs

-../examples/spot-jobs
+../examples/managed-jobs
 ../reference/job-queue
 ../examples/auto-failover
 ../reference/kubernetes/index
@@ -139,7 +139,7 @@ Contents
 :maxdepth: 1
 :caption: Cutting Cloud Costs

-../examples/spot-jobs
+Managed Spot Jobs <../examples/spot-jobs>
 ../reference/auto-stop
 ../reference/benchmark/index
