Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDK] Get Trial Metrics from Katib DB #2050

Merged

Conversation

andreyvelich
Copy link
Member

@andreyvelich andreyvelich commented Dec 5, 2022

Related: #2022.

I've added API to get Trial metrics from the Katib DB using Katib DB Manager gRPC API.
Usage:

cl = KatibClient()
trial_name = cl.get_experiment(name="test-experiment-1")["status"]["succeededTrialList"][0]
cl.get_trial_metrics(name=trial_name)[0]

"
time_stamp: "2022-12-05T14:42:08Z"
metric {
  name: "Train-accuracy"
  value: "0.112390"
}
"

Since we don't publish our gRPC as Python package, we can't modify the requirements for our Katib SDK.
In the setup.py script I copied the Katib gRPC APIs file to the appropriate location, so we can use it in the Katib client.

What do you think about it @johnugeorge @gaocegege @tenzen-y @kimwnasptd @anencore94 ?

In the long-term (if we are going to create Katib API Server) we might want to consider publishing the Katib API Server Python APIs to PyPI, so other users can make integration on top of it (what @anencore94 mentioned here: #2022 (comment)).

Kubeflow Pipelines team also publish their python kfp-server-api to the public registry.

/assign @johnugeorge @kimwnasptd @tenzen-y @anencore94

/hold for review

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


# Copy Katib gRPC Python APIs to use it in the Katib SDK Client.
# We need to copy this file only on the SDK building stage, not on installation stage.
if not os.path.exists(katib_grpc_api_file):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we ensure files are uptodate and in sync?

Copy link
Member Author

@andreyvelich andreyvelich Dec 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnugeorge It's a good point.

Maybe instead of checking that file kubeflow/katib/katib_api_pb2.py doesn't exist, we should check if ../../../pkg/apis/manager/v1beta1/python/api_pb2.py file exists and always replace it with the newest ?

katib_grpc_api_file = "../../../pkg/apis/manager/v1beta1/python/api_pb2.py"

# Copy Katib gRPC Python APIs to use it in the Katib SDK Client.
# We need to copy this file only on the SDK building stage, not on installation stage.
if os.path.exists(katib_grpc_api_file):
    shutil.copy(
        katib_grpc_api_file, "kubeflow/katib/katib_api_pb2.py",
    )

In that case, on the SDK building stage we always copy the newest gRPC APIs file ?

Any other suggestions @johnugeorge @tenzen-y ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think, "always copy" is the safer solution

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, on the SDK building stage we always copy the newest gRPC APIs file?

sgtm

@andreyvelich
Copy link
Member Author

@johnugeorge @tenzen-y I've made the change. It would be great if you could take a look.

@johnugeorge
Copy link
Member

/lgtm

/hold
/assign @tenzen-y

@google-oss-prow google-oss-prow bot added the lgtm label Dec 8, 2022
Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreyvelich Sorry for the late response.
Thanks for implementing this! This feature looks great!
/lgtm
/hold cancel

@google-oss-prow google-oss-prow bot merged commit 54424f2 into kubeflow:master Dec 9, 2022
@andreyvelich andreyvelich deleted the issue-2022-api-get-trial-metrics branch December 9, 2022 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants