Commit ffc3d62
feat: Implement TrainerClient Backends & Local Process (#33)
* Implement TrainerClient Backends & Local Process
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Implement Job Cancellation
Signed-off-by: Saad Zaher <szaher@redhat.com>
* update local job to add resource limitation in k8s style
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Update python/kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Fix linting issues
Signed-off-by: Saad Zaher <eng.szaher@gmail.com>
* fix unit tests
Signed-off-by: Saad Zaher <eng.szaher@gmail.com>
* add support for wait_for_job_status
Signed-off-by: Saad Zaher <eng.szaher@gmail.com>
* Update data types
Signed-off-by: Saad Zaher <szaher@redhat.com>
* fix merge conflict
Signed-off-by: Saad Zaher <szaher@redhat.com>
* fix unit tests
Signed-off-by: Saad Zaher <szaher@redhat.com>
* remove TypeAlias
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Replace TRAINER_BACKEND_REGISTRY with TRAINER_BACKEND
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Restructure training backends into separate dirs
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Signed-off-by: Saad Zaher <szaher@redhat.com>
* add get_runtime_packages as not supported by local-exec
Signed-off-by: Saad Zaher <szaher@redhat.com>
* move backends and its configs to kubeflow.trainer
Signed-off-by: Saad Zaher <szaher@redhat.com>
* fix typo in delete_job
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Move local_runtimes to constants
* allow list_jobs to filter by runtime
* keep runtime ref in __local_jobs
Signed-off-by: Saad Zaher <szaher@redhat.com>
* use google style docstring for LocalJob
Signed-off-by: Saad Zaher <szaher@redhat.com>
* remove debug opt from LocalProcessConfig
Signed-off-by: Saad Zaher <szaher@redhat.com>
* only use imports from kubeflow.trainer for backends
Signed-off-by: Saad Zaher <szaher@redhat.com>
* update local-exec to use only one step
While I believe in simplicity, and that dividing this into steps makes
debugging and extensibility easier, this change addresses comments on
this PR by consolidating all train job scripts into one and running it
as a single step to match k8s.
Signed-off-by: Saad Zaher <szaher@redhat.com>
* optimize loops when getting runtime
Signed-off-by: Saad Zaher <szaher@redhat.com>
* add LocalRuntimeTrainer
Signed-off-by: Saad Zaher <szaher@redhat.com>
* rename cleanup config item to cleanup_venv
Signed-off-by: Saad Zaher <szaher@redhat.com>
* convert local runtime to runtime
Signed-off-by: Saad Zaher <szaher@redhat.com>
* convert runtimes before returning
Signed-off-by: Saad Zaher <szaher@redhat.com>
* fix get_job_logs to align with parent interface
Signed-off-by: Saad Zaher <szaher@redhat.com>
* rename get_runtime_trainer func
Signed-off-by: Saad Zaher <szaher@redhat.com>
* rename get_training_job_command to get_local_train_job_script
Signed-off-by: Saad Zaher <szaher@redhat.com>
* Ignore failures in Coveralls action
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
---------
Signed-off-by: Saad Zaher <szaher@redhat.com>
Signed-off-by: Saad Zaher <eng.szaher@gmail.com>
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
Co-authored-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
1 parent 6709dcf commit ffc3d62
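The "use only one step" item in the commit message describes consolidating all train job scripts into one script that runs as a single step. The sketch below illustrates that idea only; it is not the code from this commit. The function names `build_single_step_script` and `run_single_step` are hypothetical, and it assumes a POSIX `sh` is available on the machine.

```python
import subprocess


def build_single_step_script(steps):
    """Join separate shell snippets into one script (hypothetical helper).

    `set -e` makes the combined script stop at the first failing command,
    which roughly mirrors how a chain of separate steps would abort.
    """
    return "\n".join(["set -e", *steps]) + "\n"


def run_single_step(steps):
    """Run all snippets as one process and return (exit code, stdout)."""
    script = build_single_step_script(steps)
    proc = subprocess.run(
        ["sh", "-c", script],  # assumption: a POSIX shell is on PATH
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    code, out = run_single_step(["echo setup", "echo train"])
    print(code, out, end="")
```

The trade-off noted in the commit message still applies: a single script is simpler to launch and matches the one-step model, while separate steps are easier to debug individually.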
File tree
9 files changed: +891 -2 lines changed
- .github/workflows
- kubeflow/trainer
  - api
  - backends/localprocess
Whitespace-only changes.