Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Python client #77

Merged
merged 12 commits into from
Nov 7, 2023
Merged

Add Python client #77

merged 12 commits into from
Nov 7, 2023

Conversation

isinyaaa
Copy link
Contributor

@isinyaaa isinyaaa commented Oct 20, 2023

Add a preliminary Python client implementation.

Description

I've based my implementation on #68 and our OpenAPI definition.

How Has This Been Tested?

There are simple tests included. I still have to add instructions to run, a CI job to run them
automatically, and docstrings.

To test, first make sure you have at least Python 3.10 installed, if you don't:

  1. Setup pyenv
  2. pyenv install 3.10.13

Now, install nox, then, to test, follow the instructions on the README.md file.

TODO:

  • Document nox, etc on the README
  • Build Docs
  • Add CI jobs for building docs and running tests
  • Release the first version

Merge criteria:

  • The commits and have meaningful messages
  • The author will squash them after approval or will ask to merge with squash.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

I don't know if it makes sense to squash this, we can discuss that.

@isinyaaa isinyaaa added the MLMD Related to ml-metadata label Oct 20, 2023
@isinyaaa isinyaaa self-assigned this Oct 20, 2023
@tarilabs tarilabs mentioned this pull request Oct 20, 2023
3 tasks
@dhirajsb
Copy link
Contributor

@isinyaaa I think one of the suggestions from @rimolive from DSP team was that we could re-use mlmd base classes and extend our types from them.
Would that make it possible to inherit common properties and perhaps make it easier to implement the API somehow?
I am not sure about all the implementation details. @rimolive can you provide more information about what you were thinking?

@isinyaaa
Copy link
Contributor Author

isinyaaa commented Oct 26, 2023

v2 updates:

  • Moved the code into a src folder as per guidelines.
  • Removed map submodule: let's keep things simple with a single interface and helper methods.
  • Moved model registry client into submodule
  • Created a new interface for prefixable types
  • Simplified artifact state mapping (removed bidict dependency)
  • Added simple docstrings
  • Created a noxfile to simplify running tasks
  • Better separation of core functionality from MLMD integration/dependency
  • Dealt with TODOs regarding MLMD errors
  • Added proper UUID prefixing
  • Auto-register Base{Artifact,Context} subclasses using __init_subclass__
  • Add back model when fetching ModelVersion
  • Added pytest coverage plugin
  • Improved test coverage (now at 87%) and logic

dhirajsb
dhirajsb previously approved these changes Oct 30, 2023
Copy link
Contributor

@dhirajsb dhirajsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Just to clarify, is the primary purpose of the store wrapper type for parity with the Golang client's service layer API?

@isinyaaa
Copy link
Contributor Author

lgtm

Just to clarify, is the primary purpose of the store wrapper type for parity with the Golang client's service layer API?

I'm focusing on parity on the base type definitions and on the registry client. The store wrapper should simply provide a useful interface to gRPC focusing on our use case (upserting and querying single protos).

@isinyaaa isinyaaa marked this pull request as ready for review October 31, 2023 23:55
@isinyaaa
Copy link
Contributor Author

isinyaaa commented Nov 1, 2023

Changes in v3:

  • Added sphinx to build docs
  • Added basic usage on README
  • Added attrs
  • Made exceptions a common submodule
  • Brought back client as single file module as we don't need much more right now
  • Fixed license string
  • Added CI to build docs and test
  • Added release commit
  • Fixed (most) get_* calls to automatically populate references to other proto objects.(Artifacts are still pending)
  • Fixed ListOptions
  • Fixed ModelRegistry constructor to take in SSL connection args
  • Renamed *Error -> *Exception following python conventions
  • Added description fields to all MLMD wrapper types
  • Some more improvements to tests

@isinyaaa isinyaaa requested review from dhirajsb and dtrifiro November 1, 2023 00:05
@@ -0,0 +1,44 @@
[tool.poetry]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you encounter any issues so far by using poetry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some issues with Python versions, and it's also not as clear how to get proper integration with an editor (I've been using VenvSelector on vim with some hiccups on my Mac).

@@ -0,0 +1,12 @@
"""Main package for the ODH model registry."""

__version__ = "0.1.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also related to the above: I find it rather cumbersome to have version numbers hardcoded in code and/or pyproject.toml.

To solve this issue, I like building with setuptools and using setuptools_scm, which makes it easy to retrieve version information from git.

If you like to stick with poetry, something that might work with poetry is this poetry-dynamic-versioning, although I have not used it before. If you want to do some reading, there's quite a few issues in the poetry repo related to this problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd had a look at that and thought it was unnecessary for now, as we are still iterating on an alpha version. This plugin you sent seems to rely on git tags, which we can't use, unless we share tags for the go MR as well as the python client.

Comment on lines +143 to +153
if isinstance(value, bool):
mlmd_props[key].bool_value = value
elif isinstance(value, int):
mlmd_props[key].int_value = value
elif isinstance(value, float):
mlmd_props[key].double_value = value
elif isinstance(value, str):
mlmd_props[key].string_value = value
else:
raise Exception(f"Unsupported type: {type(value)}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tried this before only to get Irrefutable pattern is allowed only for the last case statement, but now I see I should be matching against an instance and not the type itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it looks like this isn't supported in 3.9... I wouldn't mind dropping compat for this version if we were able to use 3.11, but the MLMD py lib only goes so far as 3.10. Not sure if it's OK to only provide compat for a single version...

Copy link
Contributor

@dtrifiro dtrifiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

tarilabs
tarilabs previously approved these changes Nov 6, 2023
Copy link
Member

@tarilabs tarilabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho you can consider Dhiraj and Daniele's comments in further PR, if you prefer to merge this now, and could be further improved directly on main for their suggestions and further dev work.

For the records for QE and others;
I've tested this both in DevContainer (I personally find that convenient for IDE support):

Screenshot 2023-11-06 at 14 59 20

and Rosetta emulation:

$env /usr/bin/arch -x86_64 /bin/zsh --login
# then
poetry install
poetry build
poetry run pytest

clients/python/pyproject.toml Outdated Show resolved Hide resolved
Create a Python module for the model registry. We'll use gRPC to
communicate with a wrapper of the MLMD C++ server. For that, we'll use
the MLMD Python module because of its simplified interface, but also
define our own as we want to decouple from their implementation in the
future.

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Models are artifacts, but when versioned we want to associate them with
other artifacts that belong together and compose a deployable unit.

As a first implementation, let's not concern with other specific types
of artifacts.

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
@isinyaaa
Copy link
Contributor Author

isinyaaa commented Nov 6, 2023

Changes in v4:

  • Applied suggestions
  • Used structural pattern matching where possible

We want to be able to map those types to the MLMD gRPC types, so let's
use a interface to keep the fiddly data manipulation close to the data.

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
This simplifies the direct interface to gRPC provided by MLMD library
for the client we want to expose:
- we're usually dealing with single instances of proto types
- in step with the Go client

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
This sets the ground for our public interface, and should the client's
"low-level" interface to the registry.

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Because of a limitation in the MLMD C++ server, names can't be repeated
for the same types of artifacts, contexts, etc. so we need to prefix
names to be able to allow the users to have more freedom when choosing
them.

Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
Signed-off-by: Isabella Basso do Amaral <idoamara@redhat.com>
@isinyaaa
Copy link
Contributor Author

isinyaaa commented Nov 6, 2023

Changes in v5:

  • Dropped structural pattern matching to keep compatibility with 3.9.

@isinyaaa isinyaaa requested review from tarilabs and dtrifiro November 6, 2023 15:19
@isinyaaa isinyaaa merged commit 958630a into opendatahub-io:main Nov 7, 2023
5 checks passed
@isinyaaa isinyaaa deleted the add-python-client branch November 7, 2023 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MLMD Related to ml-metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants