Issue 525: Tracking Interface with Mlflow #750

Merged
merged 37 commits into from
Jan 31, 2025

@njbrake njbrake commented Jan 27, 2025

What's changing

This PR builds on top of the architecture changes introduced in #728.

It introduces the MLflow tracking client behind a layer of abstraction so that other tracking backends can be supported in the future. I implemented a few MLflow methods to confirm that I understand how to use the MLflow library correctly. I also have code that integrates the client into our service and dependency layers, but I'll put that in a separate PR.

This pull request integrates MLflow for tracking and managing machine learning experiments and workflows. The changes span environment configuration, Docker setup, backend settings, a new tracking client interface with MLflow-related classes and methods, and modifications to existing schemas to support MLflow.

MLflow Integration:

  • Environment Configuration:

    • Added MLFLOW_TRACKING_URI and related AWS environment variables to .env.template and .devcontainer/docker-compose.override.yaml.
    • Introduced a new mlflow service in the Docker Compose override file.
  • Backend Settings:

    • Added TrackingBackendType enum and TRACKING_BACKEND configuration in backend/settings.py.
    • Implemented a computed property TRACKING_BACKEND_URI for determining the tracking URI based on the backend type.
  • Tracking Client Implementation:

    • Created MLflowTrackingClient and MLflowClientManager classes in backend/tracking/mlflow.py to handle MLflow operations.
    • Added abstract base classes TrackingClient and TrackingClientManager in backend/tracking/tracking_interface.py.
    • Implemented TrackingClientFactory to create tracking client instances based on settings.
    • Updated __init__.py to include the new tracking client manager.
  • Schema Updates:

    • Modified WorkflowCreate, WorkflowResponse, WorkflowSummaryResponse, and WorkflowDetailsResponse schemas to use str instead of UUID for IDs.
  • Dependencies:

    • Added mlflow to the project dependencies in pyproject.toml.
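To make the layering described above easier to picture, here is a minimal, standard-library-only sketch. It is not the code from this PR: the class names `TrackingBackendType`, `TrackingClient`, and `TrackingClientFactory` and the settings names `TRACKING_BACKEND`, `TRACKING_BACKEND_URI`, and `MLFLOW_TRACKING_URI` come from the summary, but every method name, the `Settings` dataclass, the default URI, and the `InMemoryTrackingClient` stand-in (used in place of `MLflowTrackingClient` so the example runs without an MLflow server) are assumptions made for illustration.

```python
import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class TrackingBackendType(str, Enum):
    """Supported tracking backends; only MLflow is named in the PR."""
    MLFLOW = "mlflow"


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for backend/settings.py."""
    TRACKING_BACKEND: TrackingBackendType = TrackingBackendType.MLFLOW
    MLFLOW_TRACKING_URI: str = "http://localhost:5000"  # assumed default

    @property
    def TRACKING_BACKEND_URI(self) -> str:
        # Resolve the URI from the selected backend type.
        if self.TRACKING_BACKEND == TrackingBackendType.MLFLOW:
            return self.MLFLOW_TRACKING_URI
        raise ValueError(f"Unsupported backend: {self.TRACKING_BACKEND}")


class TrackingClient(ABC):
    """Backend-agnostic interface; these method names are hypothetical."""

    @abstractmethod
    def create_experiment(self, name: str) -> str:
        """Create an experiment and return its ID as a string."""

    @abstractmethod
    def get_experiment_name(self, experiment_id: str) -> Optional[str]:
        """Return the experiment name, or None if the ID is unknown."""


class InMemoryTrackingClient(TrackingClient):
    """Illustrative stand-in for MLflowTrackingClient (no server needed)."""

    def __init__(self, tracking_uri: str) -> None:
        self.tracking_uri = tracking_uri
        self._experiments: Dict[str, str] = {}

    def create_experiment(self, name: str) -> str:
        experiment_id = uuid.uuid4().hex  # plain string ID
        self._experiments[experiment_id] = name
        return experiment_id

    def get_experiment_name(self, experiment_id: str) -> Optional[str]:
        return self._experiments.get(experiment_id)


class TrackingClientFactory:
    """Builds a client for whichever backend the settings select."""

    @staticmethod
    def create(settings: Settings) -> TrackingClient:
        if settings.TRACKING_BACKEND == TrackingBackendType.MLFLOW:
            # A real implementation would return an MLflow-backed client here.
            return InMemoryTrackingClient(settings.TRACKING_BACKEND_URI)
        raise ValueError(f"Unsupported backend: {settings.TRACKING_BACKEND}")


client = TrackingClientFactory.create(Settings())
experiment_id = client.create_experiment("demo")
print(type(experiment_id).__name__, client.get_experiment_name(experiment_id))
```

Callers depend only on `TrackingClient`, so adding another backend would mean a new enum member, a new client class, and one more branch in the factory. The sketch returns IDs as plain strings; MLflow identifiers are strings rather than UUID objects, which may be why the schemas moved from UUID to str, though the PR description does not state the reason.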


Closes #525

How to test it

This PR does not wire the tracking client into the dependency or server layer, so application behavior is unchanged. To confirm that the mlflow service starts under Docker Compose, run:

make local-up
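Once the stack is up, one way to check that the MLflow container is reachable is to poll the tracking server's `/health` endpoint. The sketch below is a hedged example rather than part of this PR: the function name `mlflow_is_healthy` is hypothetical, and the base URL must match whatever `MLFLOW_TRACKING_URI` is set to in your environment (port 5000 in the usage comment is only MLflow's usual default, not something this PR confirms).

```python
import urllib.error
import urllib.request


def mlflow_is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False


# Example usage (assumed URL; adjust to your MLFLOW_TRACKING_URI):
#     mlflow_is_healthy("http://localhost:5000")
```

The function returns `False` instead of raising, so it can be called in a retry loop while the container is still starting.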

Additional notes for reviewers

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required

@github-actions github-actions bot added dependencies Pull requests that update a dependency file backend schemas Changes to schemas (which may be public facing) labels Jan 27, 2025
@njbrake njbrake changed the title 742: Tracking Interface with Mlflow implementation 525: Tracking Interface with Mlflow implementation Jan 27, 2025
@njbrake njbrake changed the title 525: Tracking Interface with Mlflow implementation Issue-525: Tracking Interface with Mlflow implementation Jan 27, 2025
@njbrake njbrake changed the title Issue-525: Tracking Interface with Mlflow implementation #525: Tracking Interface with Mlflow implementation Jan 27, 2025
@njbrake njbrake changed the title #525: Tracking Interface with Mlflow implementation Issue 525: Tracking Interface with Mlflow implementation Jan 27, 2025
@njbrake njbrake changed the title Issue 525: Tracking Interface with Mlflow implementation Issue 525: Tracking Interface with Mlflow Jan 28, 2025
@njbrake njbrake linked an issue Jan 28, 2025 that may be closed by this pull request
@njbrake njbrake marked this pull request as ready for review January 29, 2025 14:05
njbrake and others added 2 commits January 29, 2025 09:06
Signed-off-by: Nathan Brake <33383515+njbrake@users.noreply.github.com>
Base automatically changed from brake/route_rename_proposal to main January 29, 2025 16:05
@njbrake njbrake requested a review from javiermtorres January 31, 2025 00:09

@javiermtorres javiermtorres left a comment

A couple of comments, otherwise lgtm

@javiermtorres javiermtorres left a comment

Just a couple of comments.

@njbrake njbrake merged commit 560436e into main Jan 31, 2025
15 checks passed
@njbrake njbrake deleted the 741-tracking-interface branch January 31, 2025 14:34
Labels
backend dependencies Pull requests that update a dependency file schemas Changes to schemas (which may be public facing)
Development

Successfully merging this pull request may close these issues.

Write a tracking interface with an MLflow implementation
3 participants