Define the client library in Python for model registry #46

Closed
rareddy opened this issue Sep 30, 2023 · 6 comments

@rareddy
Contributor

rareddy commented Sep 30, 2023

Is your feature request related to a problem? Please describe.

The DSP and Model Serving teams will need Python bindings to interact with the Model Registry.

Describe the solution you'd like

  • Define a Python API interface for the Model Registry for creating registered models (one possible shape is sketched below)
  • Adding versions
  • Adding (not updating) model artifact data, description, and any metadata defined in the requirements document. Updating means creating a new version.
  • Listing registered models
  • Listing models by search criteria - such as by deployment, by stage, by name & version, by id.
  • Add the ability to attach tags (we need to define a common tag definition across languages) to the ModelVersion to signal deployment.
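
A minimal sketch of what such an interface could look like (all class and method names below are hypothetical, for illustration only, not an agreed design):

```python
# Hypothetical sketch of the proposed Python client API; names and signatures
# are illustrative only, not an agreed-upon design.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelVersion:
    name: str
    description: str = ""
    metadata: dict = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)  # e.g. a "deployed" tag


@dataclass
class RegisteredModel:
    name: str
    versions: List[ModelVersion] = field(default_factory=list)


class ModelRegistryClient:
    """Hypothetical high-level client for the Model Registry."""

    def register_model(self, name: str) -> RegisteredModel: ...

    def add_version(self, model_name: str, version: ModelVersion) -> ModelVersion: ...

    def list_registered_models(self) -> List[RegisteredModel]: ...

    def find_models(self, *, name: Optional[str] = None,
                    version: Optional[str] = None, stage: Optional[str] = None,
                    deployment: Optional[str] = None,
                    model_id: Optional[str] = None) -> List[RegisteredModel]: ...

    def tag_version(self, model_name: str, version: str, tag: str) -> None: ...
```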
@tarilabs
Member

Suggesting using https://github.com/microsoft/kiota for generating a client (in Python or other languages) from the OpenAPI definition.

@dhirajsb
Contributor

I didn't realize we were going to build a brand new custom REST API and client on top of the existing CPP server. That's going to easily take a month or two.
I was thinking we were going to re-use the Envoy REST-to-gRPC proxy and the existing Python client to invoke the gRPC endpoint of the CPP server directly.
What is the value proposition of having a hand designed and built proxy in the middle to hide the CPP service?

@rareddy
Contributor Author

rareddy commented Oct 1, 2023

What is the value proposition of having a hand designed and built proxy in the middle to hide the CPP service?

When the UI or Model Serving makes calls into the CPP service, they can call a higher-level, coarse-grained API that represents a "logical" layer on top of the MLMD layer.

I was thinking we were going to re-use the Envoy REST-to-gRPC proxy and the existing Python client to invoke the gRPC endpoint of the CPP server directly.

There are three clients we are trying to serve: Python clients, the UI, and the Model Serving team. Are you suggesting we implement the logical layer independently in each of them?

The need is to provide a friendlier, higher-level API than what ml-metadata provides, unless I am misunderstanding how MLMD's logical definition of the data model works. This is similar to the DataSet concept exposed by the Python library. Whether to base the end client library on OpenAPI or gRPC is debatable. However, if we are mapping the logical API to the physical API, we need to do this once, in a single layer.
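
For illustration, a hedged sketch of what "mapping logical to physical once in a single layer" could look like, using the ml-metadata Python client; the mapping chosen here (a RegisteredModel as an MLMD Context, versions as Artifacts attributed to it), the type names, and the host/port are assumptions for the example, not a settled design:

```python
# Hypothetical single translation layer: logical Model Registry calls are
# mapped to physical MLMD (ml-metadata) entities in exactly one place, so
# Python, UI, and Model Serving clients could all share the same mapping.
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2


def make_store(host: str = "localhost", port: int = 8080):
    # Connect to the MLMD gRPC endpoint (host/port are placeholders).
    config = metadata_store_pb2.MetadataStoreClientConfig(host=host, port=port)
    return metadata_store.MetadataStore(config)


def get_registered_model(store, name: str):
    """Logical call: fetch a registered model and its versions.

    Assumed physical mapping (illustrative only):
      RegisteredModel -> MLMD Context of type "RegisteredModel"
      ModelVersion    -> MLMD Artifacts attributed to that Context
    """
    context = store.get_context_by_type_and_name("RegisteredModel", name)
    if context is None:
        return None
    versions = store.get_artifacts_by_context(context.id)
    return {"name": name, "versions": [a.name for a in versions]}
```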

@dhirajsb
Contributor

dhirajsb commented Oct 1, 2023

The issue is that the amount of extra work required to implement a logical-types API proxy is not trivial.

We are still going to expose the gRPC API for DSP, so why not re-use it for the existing Python Client? We can write some helper Python code (to look up type IDs, and maybe provide higher-level classes derived from the generic Python classes) that makes it easier to deal with the gRPC API.
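
A rough sketch of the kind of helper code meant here, assuming the existing ml-metadata Python client over gRPC; the class name, type names, and caching scheme are illustrative assumptions, not an agreed design:

```python
# Hypothetical helpers over the existing ml-metadata gRPC client: cache type
# IDs once and expose thin, friendlier wrappers instead of a new proxy layer.
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2


class RegistryHelper:
    """Thin convenience layer over the generic MLMD Python client."""

    def __init__(self, host: str = "localhost", port: int = 8080):
        config = metadata_store_pb2.MetadataStoreClientConfig(host=host, port=port)
        self._store = metadata_store.MetadataStore(config)
        self._type_ids = {}  # type-name -> id cache

    def artifact_type_id(self, type_name: str) -> int:
        # Look up (and cache) the numeric type id the gRPC API expects.
        if type_name not in self._type_ids:
            self._type_ids[type_name] = self._store.get_artifact_type(type_name).id
        return self._type_ids[type_name]

    def create_model_artifact(self, type_name: str, name: str, uri: str) -> int:
        # Register a model artifact of the given (pre-existing) type.
        artifact = metadata_store_pb2.Artifact(
            type_id=self.artifact_type_id(type_name), name=name, uri=uri)
        [artifact_id] = self._store.put_artifacts([artifact])
        return artifact_id
```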

If a higher-level API is needed for the UI, it's going to turn into a Backend For Frontend (BFF) pattern.
Andrew mentioned that one option is to have a Node proxy on the server that routes calls from the UI to the gRPC API in the backend.
That would be ideal for a BFF implementation and gives the UI team complete freedom over:

  • How they want the data represented in the UI
  • Which parts of the data to show on which page, like summaries, lists, all data for a particular section of the graph, etc.
  • How to create any kind of custom queries for the backend in the BFF, etc. The gRPC API supports full SQL WHERE-clause-style queries.

This is an ideal scenario to separate concerns and keep the backend scope from getting out of hand. The alternative is to implement what is essentially a hand-written version of the gRPC API in OpenAPI with a few higher-level types, which still leaves the problem of extracting UI queries or having to implement them in this proxy layer, turning it into a BFF. A BFF is really meant to be a specialized backend, ideally implemented by the UI team. Please read the BFF article I shared and how it solves the frontend-to-backend coupling problem.

Edited: link to BFF pattern from AWS blog.

@rareddy
Contributor Author

rareddy commented Oct 2, 2023

We are still going to expose the gRPC API for DSP, so why not re-use it for the existing Python Client?

I was not advocating for DSP, as it already has an existing Python library. How are we going to support Model Serving? We would need to extend a Python library using gRPC or something else.

If a higher-level API is needed for the UI

IMO, this is not limited to the UI. It involves the Model Serving and DSP teams too.

A BFF is really meant to be a specialized backend, ideally implemented by the UI team.

Agreed, I am familiar with the pattern. If the UI team wants to implement that, it is fine with me; it is not our decision to make. The question I am trying to answer today is: what API is the MR team exposing to the various teams?

For model registry access, we are currently saying we will support Python bindings; tomorrow we may also need to support R and Go language bindings. I also foresee the MR team implementing a CLI tool. In each case we would be translating the lower-level API into a higher-level API in the respective tool, with differing levels of API constructs. Is that what we want?

I am asking our team to:

  1. Not throw the MR tech over the wall and ask the DSP/Model Serving/UI tools to map to the underlying MLMD however they see fit. We must collaborate with them, understand their domain model, map that to the MLMD physical model, and provide APIs that they understand. Note that some of this may end up being used by end users.
  2. Avoid duplicating the translation in every scenario, given that we may need to support multiple language bindings and tools.

If it makes sense, can we then explore extending the proto layer with higher-level constructs instead of an OpenAPI implementation? Basically, what has been done in the YAML file would be done in a proto definition, which should serve us similarly and match the existing pattern.

@isinyaaa
Contributor

isinyaaa commented Nov 7, 2023

Closed by #77.
