First attempt at a parametrized JobCreate #740

Merged
javiermtorres merged 23 commits into main from javiermtorres/issue-706-organize-creation-records
Feb 14, 2025

Conversation

javiermtorres
Contributor

@javiermtorres javiermtorres commented Jan 24, 2025

What's changing

The JobCreate schema now carries a separate, job-type-specific job_config field. The generated OpenAPI spec models it as a oneOf constraint discriminated by job_type:

      "JobCreate": {
        "properties": {
          "name": {
            "type": "string",
            "title": "Name"
          },
[...]
          "job_config": {
            "oneOf": [
              {
                "$ref": "#/components/schemas/JobEvalConfig"
              },
              {
                "$ref": "#/components/schemas/JobEvalLiteConfig"
              },
              {
                "$ref": "#/components/schemas/JobInferenceConfig"
              },
              {
                "$ref": "#/components/schemas/JobAnnotateConfig"
              }
            ],
            "title": "Job Config",
            "discriminator": {
              "propertyName": "job_type",
              "mapping": {
                "annotate": "#/components/schemas/JobAnnotateConfig",
                "eval_lite": "#/components/schemas/JobEvalLiteConfig",
                "evaluate": "#/components/schemas/JobEvalConfig",
                "inference": "#/components/schemas/JobInferenceConfig"
              }
            }
          }
        },

The jobs and experiments services are changed accordingly.
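
As a rough illustration of what the discriminator in the schema above means for a client, the following sketch resolves which component schema applies to a given job_config payload. The helper name `resolve_job_config_schema` and the example payload are hypothetical and not part of this PR; only the `job_type` property name and the mapping values are taken from the OpenAPI excerpt.

```python
# Discriminator copied from the OpenAPI excerpt in the PR description.
DISCRIMINATOR = {
    "propertyName": "job_type",
    "mapping": {
        "annotate": "#/components/schemas/JobAnnotateConfig",
        "eval_lite": "#/components/schemas/JobEvalLiteConfig",
        "evaluate": "#/components/schemas/JobEvalConfig",
        "inference": "#/components/schemas/JobInferenceConfig",
    },
}


def resolve_job_config_schema(job_config: dict) -> str:
    """Hypothetical helper: return the $ref selected by the payload's job_type."""
    prop = DISCRIMINATOR["propertyName"]
    if prop not in job_config:
        raise ValueError(f"job_config is missing the discriminator field {prop!r}")
    tag = job_config[prop]
    try:
        return DISCRIMINATOR["mapping"][tag]
    except KeyError:
        known = ", ".join(sorted(DISCRIMINATOR["mapping"]))
        raise ValueError(f"unknown {prop} {tag!r}; expected one of: {known}") from None


# Assumed minimal request body; real configs carry more fields than job_type.
payload = {
    "name": "my-inference-job",
    "job_config": {"job_type": "inference"},
}
schema_ref = resolve_job_config_schema(payload["job_config"])
print(schema_ref)
```

Because the discriminator names exactly one schema per job_type, a validator does not need to try each oneOf branch in turn, and an unknown job_type can be rejected with a targeted error message.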

Closes #706

How to test it

Tests should run correctly.

Additional notes for reviewers

N/A

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required
    • No DB migration needed

@github-actions github-actions bot added the backend, api, and schemas labels Jan 24, 2025
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch from cf6d9aa to d8ae072 Compare January 24, 2025 17:28
Contributor

@njbrake njbrake left a comment

Looking good so far! Made a code suggestion but looks like a very logical refactor. My only question would be whether you plan on addressing the custom logic of _get_job_params in this PR. If you don't plan on addressing it here, can you make a separate issue to track elevating that out of the service layer?

@javiermtorres
Contributor Author

The SDK needs to be updated. I have checked the backend unit and integration tests locally and they seem to work.

@github-actions github-actions bot added the sdk label Jan 28, 2025
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch 2 times, most recently from 72585b0 to a17f040 Compare January 28, 2025 15:30
@javiermtorres
Contributor Author

The SDK and notebook tests have been updated. @veekaybee @aittalam I've changed the code of the notebook slightly. One important difference is that I have removed the model param in the eval lite job. AFAICT, it's not needed there. The notebook takes it from the initial model spec in the notebook and not from the output of the summarization job. Since the output is a csv, it didn't make sense to put the model there, but I'll check the results metadata.

@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch 2 times, most recently from 0c8ef33 to 4ca7e65 Compare January 29, 2025 15:22
@javiermtorres javiermtorres marked this pull request as ready for review January 29, 2025 15:55
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch 2 times, most recently from 9159569 to e6f13f3 Compare January 31, 2025 11:43
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch 2 times, most recently from c9357b2 to 293ae98 Compare February 4, 2025 16:20
Contributor

@njbrake njbrake left a comment

My concern is that this PR drops support for JobType.EVALUATION, which is needed to support the current frontend design. I may be misunderstanding the code. Other than that, only minor comments. Thanks for the work on this! (Let me know about JobType.EVALUATION and I'll approve once that's worked out.)

@ividal
Contributor

ividal commented Feb 5, 2025

Thanks for this! One note, @javiermtorres: this PR should stay in sync with the UI, so keep an eye on how it interacts with /experiments.

Contributor

@ividal ividal left a comment

Just a note that uploading a file w/o ground-truth and then clicking on "generate ground-truth" produces a "job not found" error.

See log:
2025-02-10 - PR740.log

@njbrake njbrake changed the title First attempt at a parametrized JobCreate Jobs return a standardized and flexible output Feb 10, 2025
@njbrake njbrake changed the title Jobs return a standardized and flexible output First attempt at a parametrized JobCreate Feb 10, 2025
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch 2 times, most recently from bcad25e to ce76b2b Compare February 10, 2025 18:31
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-706-organize-creation-records branch from 9113f26 to 02c0b79 Compare February 12, 2025 09:19
@ividal ividal self-requested a review February 14, 2025 14:24
Contributor

@ividal ividal left a comment

Tested #847 in the context of reviewing #740. I've only looked at the backend code (#740), not the frontend code. On the testing front, I ran a demo:

  • uploading a dataset with and w/o gt
  • annotating one w/o gt
  • launching an experiment with both local and API-based models

Overall, works as expected (🥳 ). There are a number of smaller issues, but let's get 847 in here and then 740 into main - and iterate on smaller separate issues :)

javiermtorres and others added 2 commits February 14, 2025 15:42
* remove redundant folders

* refactor: fix imports

* refactor: change folder structure

* style: linting

* cleanup

* First attempt at a parametrized JobCreate

* Replace templates with pydantic models

* Adapt SDK and SDK tests

* Fix sdk unit tests

* Fix notebook tests

* Fix tests

* Fix job definition in workflows

* Fix job unit test

* Start a default workflow for experiments

* Rebase to main

* Align with routes in main

* Move to experiments new endpoint

* Streamline new experiments api

* remove redundant folders

* refactor: fix imports

* refactor: change folder structure

* style: linting

* cleanup

* WIP: migrate to new workflow apis

* refactor some more stuff

* use the new datastructure, hide runtime

* refactor: cleanup Job vs Experiment in ExperimentDetails mess

* style: linting

* fixing things

* style: linting

* current state

* results working

* style: linting

* current state

* after merge fixes

* checkpoint

* things working ish

* formatting

* style: linting

---------

Co-authored-by: Javier Torres <javier@mozilla.ai>
@ividal ividal self-requested a review February 14, 2025 15:13
Contributor

@ividal ividal left a comment

With #847 in here and checks green, we have ourselves a usable new experiments workflow :)

THANK YOU for the effort @javiermtorres @khaledosman

@javiermtorres javiermtorres enabled auto-merge (squash) February 14, 2025 15:27
@javiermtorres javiermtorres merged commit 4cf82e2 into main Feb 14, 2025
17 checks passed
@javiermtorres javiermtorres deleted the javiermtorres/issue-706-organize-creation-records branch February 14, 2025 15:52
Labels
api, backend, frontend, schemas, sdk
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate the different creation jobs into an organized class/package.
4 participants