Pipeline and Katib Integration #331

Closed
gyliu513 opened this issue Jan 21, 2019 · 8 comments

@gyliu513
Member

gyliu513 commented Jan 21, 2019

Katib is used for hyperparameter tuning, and Pipelines is used for end-to-end ML workflows. A pipeline needs some parameters provided by Katib to improve its efficiency, for example by picking up the best parameters that Katib found. Is there any documentation or best practice for how a pipeline can get the parameters generated by Katib?

FYI @hougangliu @jinchihe
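
For readers wondering what "getting the best parameters" could look like in practice, here is a minimal sketch. It assumes the later Katib `v1beta1` Experiment API (not the StudyJob API that existed when this issue was opened), where a finished Experiment is assumed to report its winner under `status.currentOptimalTrial.parameterAssignments`. The helper name `best_parameters` is hypothetical, and fetching the Experiment object from the cluster is left out.

```python
from typing import Dict


def best_parameters(experiment: Dict) -> Dict[str, str]:
    """Map hyperparameter name -> best value from a Katib Experiment dict.

    Assumption: the Experiment follows the v1beta1 layout, with the winning
    trial under status.currentOptimalTrial.parameterAssignments as a list of
    {"name": ..., "value": ...} entries. Values are kept as strings.
    """
    optimal = experiment.get("status", {}).get("currentOptimalTrial", {})
    assignments = optimal.get("parameterAssignments") or []
    return {item["name"]: item["value"] for item in assignments}


# Hand-written example object, not real cluster output.
example = {
    "status": {
        "currentOptimalTrial": {
            "bestTrialName": "demo-trial",
            "parameterAssignments": [
                {"name": "lr", "value": "0.01"},
                {"name": "num-layers", "value": "3"},
            ],
        }
    }
}
params = best_parameters(example)
```

A downstream pipeline step could then receive `params` as its input, for instance serialized as JSON.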

@gyliu513 gyliu513 changed the title pipeline and Katib Integration Pipeline and Katib Integration Jan 21, 2019
@hougangliu
Member

hougangliu commented Jan 21, 2019

Maybe we can add an example of a pipeline that embeds a Katib StudyJob. I think it is better to put the example in the pipelines project.
@YujiOshima @richardsliu @jlewi any suggestions?

@hougangliu
Member

/assign

@jlewi
Contributor

jlewi commented Jan 22, 2019

An example of using Pipelines to orchestrate hyperparameter tuning would be great.

You can take a look at the TFJob launcher to get some sense of what a component for Katib might look like:
https://github.com/kubeflow/pipelines/tree/master/components/kubeflow/launcher

There is an open issue to figure out best practices with respect to integrating pipelines with K8s resources; see kubeflow/pipelines#677
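
As a rough illustration of the launcher idea, a Katib component would submit a tuning job and then wait for it to finish. The sketch below covers only the waiting logic and is not taken from the TFJob launcher. `fetch_experiment` is a hypothetical callable standing in for whatever Kubernetes client call returns the resource as a dict, and the `Succeeded` / `Failed` condition types are an assumption based on the later Katib `v1beta1` status conventions.

```python
import time
from typing import Callable, Dict


def wait_for_experiment(
    fetch_experiment: Callable[[], Dict],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> Dict:
    """Poll until the Experiment reports Succeeded or Failed, then return it.

    fetch_experiment is a hypothetical callable returning the current
    Experiment object as a dict (for example via a Kubernetes client).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        experiment = fetch_experiment()
        conditions = experiment.get("status", {}).get("conditions", [])
        # Assumption: terminal conditions are typed "Succeeded" or "Failed"
        # and carry status "True" once they are reached.
        for condition in conditions:
            if condition.get("status") != "True":
                continue
            if condition.get("type") == "Succeeded":
                return experiment
            if condition.get("type") == "Failed":
                raise RuntimeError(
                    "Katib experiment failed: " + condition.get("message", "")
                )
        if time.monotonic() >= deadline:
            raise TimeoutError("Katib experiment did not finish in time")
        time.sleep(poll_seconds)
```

Injecting `fetch_experiment` keeps the loop independent of any particular client library, so it can be exercised with canned responses before it is wired to a cluster.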

@hougangliu
Member

/close
kubeflow/pipelines#754

@k8s-ci-robot

@hougangliu: Closing this issue.

In response to this:

/close
kubeflow/pipelines#754

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gyliu513
Member Author

/cc @animeshsingh

@oadams

oadams commented Aug 22, 2022

What is the current status for integrating Katib into a Kubeflow pipeline?

I have a train pipeline with components for (a) fetching data, (b) preprocessing the data, and (c) training. I would like to do hyperparameter tuning over the train component. From a first look at the Katib documentation, it does not appear to have a native integration with Pipelines: you specify a container/command that does the training and start your Katib experiment, but the experiment cannot itself be a component without being wrapped somehow.

It seems my main options are:

  1. Have the container specified in my Katib YAML actually orchestrate the whole pipeline.
  2. Create a Pipeline component that runs Katib, which in turn runs the train script.
  3. Have my Pipeline only do the data preparation and then separately run Katib.

The first two seem overly complicated. The third approach seems the most natural, but to some extent undermines the point of using a pipeline in the first place, since the pipeline only strings together data downloading and preprocessing. It would be good to have a pipeline where the input is some data source and the final output is the best model from a hyperparameter tuning experiment.
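
To make the second option more concrete, here is a minimal sketch of the manifest that such a wrapping component might submit. It assumes the Katib `v1beta1` Experiment layout; the function name `build_katib_experiment`, the `train.py` script with its `--data` and `--lr` flags, the metric name `accuracy`, and the search range are all hypothetical placeholders. Only the dict is built here; submitting it to the cluster is left out.

```python
from typing import Dict


def build_katib_experiment(name: str, namespace: str, image: str, data_path: str) -> Dict:
    """Return a Katib Experiment manifest (as a dict) that tunes one train step."""
    # Each trial runs as a plain Kubernetes Job; Katib substitutes the
    # ${trialParameters.*} placeholders for every trial it creates.
    trial_spec = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "training-container",
                            "image": image,
                            "command": [
                                "python",
                                "train.py",  # hypothetical training script
                                "--data=" + data_path,  # output of the preprocessing step
                                "--lr=${trialParameters.learningRate}",
                            ],
                        }
                    ],
                }
            }
        },
    }
    return {
        "apiVersion": "kubeflow.org/v1beta1",
        "kind": "Experiment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Hypothetical objective: the train script must report "accuracy".
            "objective": {"type": "maximize", "objectiveMetricName": "accuracy"},
            "algorithm": {"algorithmName": "random"},
            "parallelTrialCount": 2,
            "maxTrialCount": 6,
            "maxFailedTrialCount": 2,
            "parameters": [
                {
                    "name": "lr",
                    "parameterType": "double",
                    "feasibleSpace": {"min": "0.001", "max": "0.1"},
                }
            ],
            "trialTemplate": {
                "primaryContainerName": "training-container",
                "trialParameters": [
                    {"name": "learningRate", "description": "Learning rate", "reference": "lr"}
                ],
                "trialSpec": trial_spec,
            },
        },
    }
```

In this shape the pipeline would pass the preprocessing output path into `data_path`, so the flow still runs from a data source to a tuned model, with Katib only owning the train step.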

@skliarpawlo

@oadams I think the first option is possible and not that complicated; there is an example of this approach at https://github.com/kubeflow/katib/blob/master/examples/v1beta1/argo/argo-workflow.yaml. It would allow tuning any part of the pipeline, or multiple parts at the same time, which is the most flexible approach IMO. However, I am struggling to make it work using the Python DSL. I'm not sure whether the two were designed to work together, but technically there shouldn't be a problem making it work.


6 participants