This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

feat: support data generation #715

Merged
merged 24 commits into from
Apr 24, 2023

Conversation

LMMilliken
Contributor

@LMMilliken LMMilliken commented Apr 14, 2023

This PR adds support for data generation jobs through the `finetuner.synthesize` function.


  • This PR references an open issue
  • I have added a line about this change to CHANGELOG
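
A rough sketch of the client-side pattern such a job entry point typically follows (submit a job, get back a run handle whose status can be polled). Everything below is illustrative: `DataSynthesisClient`, `Run`, and the `query_data`/`corpus_data` parameter names are hypothetical stand-ins, not the confirmed finetuner API.

```python
import itertools


class Run:
    """Hypothetical handle for a submitted synthesis job."""

    def __init__(self, name, backend):
        self.name = name
        self._backend = backend

    def status(self):
        # A real client would query the server here.
        return {'status': self._backend.poll(self.name)}


class DataSynthesisClient:
    """Hypothetical stand-in for the client behind ``finetuner.synthesize``."""

    def __init__(self):
        self._runs = {}
        self._ids = itertools.count()

    def synthesize(self, query_data, corpus_data):
        # Submit a data-synthesis job and hand back a run handle.
        name = f'synthesis-run-{next(self._ids)}'
        self._runs[name] = iter(['CREATED', 'STARTED', 'FINISHED'])
        return Run(name, self)

    def poll(self, name):
        # Each poll advances the fake job one state, standing in for
        # real server-side progress.
        return next(self._runs[name])


client = DataSynthesisClient()
run = client.synthesize(query_data='queries', corpus_data='corpus')
print(run.name)      # synthesis-run-0
print(run.status())  # {'status': 'CREATED'}
```

The essential shape, which the discussion below also revolves around, is that submission is asynchronous and the caller is responsible for polling `run.status()` until a terminal state.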

@LMMilliken LMMilliken linked an issue Apr 14, 2023 that may be closed by this pull request
@LMMilliken LMMilliken changed the title Feat support generation feat: support generation Apr 14, 2023
@github-actions github-actions bot added area/core area/entrypoint area/testing This issue/PR affects testing labels Apr 14, 2023
@LMMilliken LMMilliken changed the title feat: support generation feat: support data generation Apr 14, 2023
@LMMilliken LMMilliken marked this pull request as ready for review April 14, 2023 13:17
Member

@bwanglzu bwanglzu left a comment


Let's reword all occurrences of "generation" to "synthesis", to reserve that term for a real generation job.

finetuner/__init__.py Outdated Show resolved Hide resolved
tests/unit/test_client.py Show resolved Hide resolved
tests/unit/test_client.py Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
finetuner/experiment.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
Member

@bwanglzu bwanglzu left a comment


left some minor comments

Member

@bwanglzu bwanglzu left a comment


LGTM!

Member

@guenthermi guenthermi left a comment


Nice work, added some comments.

finetuner/experiment.py Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
tests/integration/test_runs.py Outdated Show resolved Hide resolved
Member

@guenthermi guenthermi left a comment


Added some more comments

finetuner/__init__.py Outdated Show resolved Hide resolved
finetuner/data.py Outdated Show resolved Hide resolved
finetuner/experiment.py Outdated Show resolved Hide resolved
finetuner/run.py Outdated Show resolved Hide resolved
status = run.status()[STATUS]
while status not in [FAILED, FINISHED]:
    time.sleep(10)
    status = run.status()[STATUS]
Member


If something really goes wrong here, it could end up in an infinite loop; we should not do things like this. Is there a reason why it could take more than 10 seconds if it uses a mocker?

Contributor Author

@LMMilliken LMMilliken Apr 19, 2023


I believe an actual run is created in staging for these tests, and they will time out and fail if something goes wrong.

Member


Yes, in CI it should time out at some point, but I think it is not good to rely only on that: you might want to see whether other tests are still running, and you might also want to run those tests locally. Maybe replace it with a for loop if it really needs multiple tries.

tests/unit/test_experiment.py Outdated Show resolved Hide resolved
finetuner/data.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
@LMMilliken LMMilliken self-assigned this Apr 21, 2023
Member

@guenthermi guenthermi left a comment


Added some more small comments

finetuner/experiment.py Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
finetuner/hubble.py Outdated Show resolved Hide resolved
@LMMilliken LMMilliken merged commit ea9c62e into main Apr 24, 2023
@LMMilliken LMMilliken deleted the feat-support-generation branch April 24, 2023 10:08

Successfully merging this pull request may close these issues.

Support data generation jobs in the client
4 participants