Use tagging & aliasing system for model retrieval from registry #553

Closed
mck-star-yar opened this issue May 14, 2024 · 5 comments · Fixed by #586
Labels
enhancement New feature or request

Comments

@mck-star-yar

Description

MLflow now uses aliases and tags instead of stages for the model registry. MlflowModelRegistryDataset needs to be updated to support aliases and filtering by tags.

Context

Users of the plugin are starting to use newer versions of MLflow with the updated API, and the library needs to support that.

Possible Implementation

  • Alias support – remove the stage argument and replace it with an alias argument
  • Tagging support – provide a new argument that allows filtering models by tags (if more than one model is retrieved, warn the user and either fail or take the latest)
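
The tagging rule proposed above (filter by tags; if several versions match, warn and either fail or take the latest) could look roughly like the sketch below. It is an illustration only, not kedro-mlflow code: `ModelVersionStub`, `select_version_by_tags` and `fail_on_multiple` are hypothetical names, and the stub only mimics the `version` and `tags` fields one would expect on the objects returned by the MLflow registry.

```python
import warnings
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelVersionStub:
    """Hypothetical stand-in for a registered model version (version number + tags)."""

    version: str
    tags: Dict[str, str] = field(default_factory=dict)


def select_version_by_tags(
    versions: List[ModelVersionStub],
    required_tags: Dict[str, str],
    fail_on_multiple: bool = False,
) -> ModelVersionStub:
    """Return the version whose tags contain every required key/value pair."""
    matches = [
        v
        for v in versions
        if all(v.tags.get(key) == value for key, value in required_tags.items())
    ]
    if not matches:
        raise LookupError(f"No model version matches tags {required_tags}")
    if len(matches) > 1:
        if fail_on_multiple:
            raise ValueError(f"{len(matches)} model versions match tags {required_tags}")
        warnings.warn(
            f"{len(matches)} model versions match tags {required_tags}; using the latest"
        )
    # Version numbers are compared as integers so that "10" ranks above "2".
    return max(matches, key=lambda v: int(v.version))
```

Whether the dataset should fail or fall back to the latest match is exactly the open design question in the bullet above; the flag just makes both behaviours available.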
@mck-star-yar mck-star-yar changed the title Use tagging system for model retrieval from registry Use tagging & aliasing system for model retrieval from registry May 14, 2024
@Galileo-Galilei Galileo-Galilei added the enhancement New feature or request label May 15, 2024
@Galileo-Galilei
Owner

Galileo-Galilei commented May 15, 2024

Thank you very much for the issue (and all the other suggestions!), this is definitely something that I should implement!

That said, I need more examples of what you want to achieve. Indeed, the MlflowModelRegistryDataset relies on mlflow.pyfunc.load_model, which can read any supported URI, e.g. models:/<registered model name>@<alias> should already work. Can you show a pseudocode example of a filter you'd want to do that you can't currently do?
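
For reference, a minimal sketch of the URI forms in question, assuming plain MLflow outside of the dataset. `build_registry_uri` and `load_registered_model` are hypothetical helper names used only for illustration; the only real MLflow call is `mlflow.pyfunc.load_model`.

```python
from typing import Optional, Union


def build_registry_uri(
    model_name: str,
    alias: Optional[str] = None,
    version: Optional[Union[int, str]] = None,
) -> str:
    """Build a models:/ URI for a registered model (hypothetical helper)."""
    if alias is not None and version is not None:
        raise ValueError("Pass either 'alias' or 'version', not both.")
    if alias is not None:
        # Alias form: models:/<registered model name>@<alias>
        return f"models:/{model_name}@{alias}"
    # Version form; falls back to "latest" when no version is given.
    return f"models:/{model_name}/{version or 'latest'}"


def load_registered_model(model_name, alias=None, version=None):
    """Load the model through mlflow.pyfunc.load_model."""
    # Imported lazily so the URI helper works without MLflow installed.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(build_registry_uri(model_name, alias, version))
```

For example, `build_registry_uri("my_awesome_model", alias="champion")` gives `models:/my_awesome_model@champion`.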

@mck-star-yar
Author

mck-star-yar commented May 17, 2024

Yeah, sure

So the current workflow is the following:

  1. The model is registered in the model registry
  2. A downstream pipeline uses the model from the registry
    1. if this is a dev env, we just use the latest model
    2. if it is a prod env, we use the model with the alias champion

With pure MLflow, we'd have an if condition and load either the <uri>/latest or the <uri>@champion model.
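
A rough sketch of that if condition, assuming the environment is passed in as a plain string. `model_uri_for_env` and the `"dev"`/`"prod"` labels are illustrative names, not an existing API; the returned URI would then be handed to mlflow.pyfunc.load_model.

```python
def model_uri_for_env(model_name: str, env: str, prod_alias: str = "champion") -> str:
    """Pick the registry URI to load depending on the environment (hypothetical helper)."""
    if env == "prod":
        # Prod pins the model behind an alias, e.g. models:/<name>@champion
        return f"models:/{model_name}@{prod_alias}"
    if env == "dev":
        # Dev simply takes the most recently registered version.
        return f"models:/{model_name}/latest"
    raise ValueError(f"Unknown environment: {env!r}")
```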

So the desired behaviour is:

pipeline_inference_model:
  type: kedro_mlflow.io.models.MlflowModelRegistryDataset
  flavor: mlflow.pyfunc
  pyfunc_workflow: python_model
  artifact_path: inference_pipeline
  # stage_or_version: production    # can't use this with Unity Catalog (UC) on databricks (dbx)
  alias: champion    # dbx-UC-friendly API

@AngelPedroza

Hi, are there any updates on this issue? Currently, I'm working on a project with kedro-mlflow, and I've encountered this issue in my use case too.
I would be happy to know if this issue could be solved.

@Galileo-Galilei
Owner

I'll fix it in the next version.

@Galileo-Galilei
Owner

Galileo-Galilei commented Aug 27, 2024

@AngelPedroza @mck-star-yar Can you try to install the development version of the PR (pip install git+https://github.com/Galileo-Galilei/kedro-mlflow.git@553-model-registry-alias) and tell me if it suits your needs?

my_model:
    type: kedro_mlflow.io.models.MlflowModelRegistryDataset
    model_name: my_awesome_model
    alias: my_alias

I've only added fetching by alias, because tagging support seems much more complex and I am not sure of the added value. I'll accept PRs for it though.
