feat: support data generation #715

LMMilliken · 2023-04-14T12:30:16Z

This PR adds support for data generation jobs through the finetuner.synthesize function.

This PR references an open issue
I have added a line about this change to CHANGELOG

bwanglzu

Let's rename every "generation" to "synthesis" to reserve the name for a real generation job.

finetuner/__init__.py

tests/unit/test_client.py

finetuner/hubble.py

finetuner/data.py

finetuner/experiment.py

finetuner/hubble.py

bwanglzu

left some minor comments

bwanglzu

LGTM!

guenthermi

Nice work, added some comments

finetuner/experiment.py

finetuner/hubble.py

tests/integration/test_runs.py

guenthermi

Added some more comments

finetuner/__init__.py

finetuner/data.py

finetuner/experiment.py

finetuner/run.py

guenthermi · 2023-04-19T12:33:27Z

tests/integration/test_runs.py

+    status = run.status()[STATUS]
+    while status not in [FAILED, FINISHED]:
+        time.sleep(10)
+        status = run.status()[STATUS]


If something really goes wrong here, this could end up in an infinite loop. We should not do things like this. Is there a reason why it could take more than 10 seconds if it is using a mocker?

I believe an actual run is created in staging for these tests, and they will time out and fail if something goes wrong.

Yes, in the CI it should time out at some point, but I don't think it is good to rely only on that: you might want to see whether the other tests still run, and you might also want to run those tests locally. Maybe replace it with a for loop if it really needs multiple tries.
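
A bounded version of the polling loop, as suggested above, could look roughly like the sketch below. This is only an illustration and not the code that was merged: wait_for_run, max_tries, interval and FakeRun are hypothetical names, and the values given to STATUS, FAILED and FINISHED are placeholders for the constants the test imports.

```python
import time

# Placeholder values; the real test imports these constants (assumption).
STATUS = 'status'
FAILED = 'FAILED'
FINISHED = 'FINISHED'


def wait_for_run(run, max_tries=30, interval=10.0, sleep=time.sleep):
    """Poll run.status() at most max_tries times instead of looping forever."""
    for _ in range(max_tries):
        status = run.status()[STATUS]
        if status in (FAILED, FINISHED):
            return status
        sleep(interval)
    # Fail loudly so a stuck run cannot hang a local test session.
    raise TimeoutError(f'run did not finish after {max_tries} status checks')


class FakeRun:
    """Hypothetical stub that reports the given statuses one by one."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def status(self):
        return {STATUS: next(self._statuses)}
```

With this shape the test fails with a clear error after max_tries checks, whether it runs in the CI or locally, and the sleep parameter lets a unit test skip the real waiting.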

tests/unit/test_experiment.py

finetuner/data.py

finetuner/hubble.py

guenthermi

Added some more small comments

finetuner/experiment.py

finetuner/hubble.py

guenthermi · 2023-04-24T07:28:56Z

lmmilliken added 2 commits April 14, 2023 10:16

feat: support generation

31f9bb9

test: add tests for generation jobs

d814724

LMMilliken linked an issue Apr 14, 2023 that may be closed by this pull request

Support data generation jobs in the client #712

Closed

github-actions bot added the size/l label Apr 14, 2023

LMMilliken changed the title ~~Feat support generation~~ feat: support generation Apr 14, 2023

github-actions bot added the area/core, area/entrypoint and area/testing labels Apr 14, 2023

lmmilliken added 2 commits April 14, 2023 14:31

chore: remove print statements

0283c4f

refactor: rename generate to synthesize

4953e72

github-actions bot added the area/client label Apr 14, 2023

LMMilliken changed the title ~~feat: support generation~~ feat: support data generation Apr 14, 2023

fix: fix failing tests

4d73c68

LMMilliken marked this pull request as ready for review April 14, 2023 13:17

LMMilliken requested review from guenthermi, bwanglzu, alaeddine-13 and gmastrapas April 14, 2023 13:17

lmmilliken added 3 commits April 14, 2023 15:18

Merge branch 'main' into feat-support-generation

5e6b40e

chore: update changelog

d85f2d4

feat: update list_models function

4d627dd

LMMilliken force-pushed the feat-support-generation branch from fa1bb76 to 4d627dd Compare April 14, 2023 13:54

bwanglzu reviewed Apr 14, 2023

finetuner/__init__.py

tests/unit/test_client.py

finetuner/hubble.py

chore: implement review notes

88b3fab

LMMilliken force-pushed the feat-support-generation branch from b0561d0 to 88b3fab Compare April 14, 2023 15:37

chore: implement review notes

117ef74