[feature] Allow KFP to specify Minio instance when configuring pipeline root #6517

Closed
capri-xiyue opened this issue Sep 7, 2021 · 19 comments
Assignees
Labels: help wanted (the community is welcome to contribute), kind/feature, lifecycle/frozen

Comments

@capri-xiyue
Contributor

Feature Area

What feature would you like to see?

Currently, when a user configures the pipeline root with MinIO, they cannot change the MinIO instance: Kubeflow Pipelines can only use the MinIO instance deployed with it. Allow KFP to specify a MinIO instance other than the one deployed with KFP when configuring the pipeline root.

What is the use case or pain point?

Is there a workaround currently?


Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.

@Bobgy
Contributor

Bobgy commented Sep 12, 2021

This can be achieved via #5753 too if implemented.

@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label (the issue / pull request is stale; any activity removes this label) Mar 2, 2022
@Bobgy
Contributor

Bobgy commented Mar 2, 2022

/lifecycle frozen

@google-oss-prow google-oss-prow bot added the lifecycle/frozen label and removed the lifecycle/stale label Mar 2, 2022
@dvaldivia

What needs to be implemented to support this? Maybe I can work on it.

@ConverJens
Contributor

@dvaldivia It would be awesome if you could look into this! Ping @Bobgy

@Bobgy
Contributor

Bobgy commented Apr 14, 2022

/assign @chensun @zijianjoy

@connor-mccarthy
Member

Users can now specify the Minio instance to use by using a KFP v2 deployment (alpha) and KFP SDK v2 (beta).

@dvaldivia
Copy link

@connor-mccarthy do you have an example?

@connor-mccarthy
Copy link
Member

connor-mccarthy commented Aug 4, 2022

@dvaldivia
This is specified via the pipeline_root argument in this example: https://www.kubeflow.org/docs/components/pipelines/sdk-v2/v2-compatibility/#compiling-and-running-pipelines-in-v2-compatibility-mode. The example uses v2-compatible mode, but the same applies to KFP v2.

pipeline_root: (Optional.) The root path where this pipeline’s outputs are stored. This can be a MinIO, Google Cloud Storage, or Amazon Web Services S3 URI. You can override the pipeline root when you run the pipeline.
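
The documentation quote above says the pipeline root can be overridden at run time. A minimal sketch of that idea follows. It is illustrative only: `resolve_pipeline_root` is a hypothetical helper, and the ordering between the pipeline-level value and the deployment-wide default is an assumption, not KFP's actual implementation. The fallback string is the default location quoted later in this thread.

```python
from typing import Optional

# Default MinIO location quoted later in this thread. Treating it as the
# deployment-wide fallback is an assumption made for this sketch.
DEPLOYMENT_DEFAULT_ROOT = "minio://mlpipeline/v2/artifacts"


def resolve_pipeline_root(
    run_override: Optional[str] = None,
    pipeline_default: Optional[str] = None,
    deployment_default: str = DEPLOYMENT_DEFAULT_ROOT,
) -> str:
    """Return the first non-empty root, most specific first (hypothetical helper)."""
    for candidate in (run_override, pipeline_default):
        if candidate:  # skips both None and the empty string
            return candidate
    return deployment_default


# A run-time value wins over the value set on the pipeline definition.
print(resolve_pipeline_root("s3://bucket/run", "gs://my-pipeline-root/example-pipeline"))
# With nothing set, the deployment-wide default is used.
print(resolve_pipeline_root())
```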

@ashujain2

@connor-mccarthy
The default location for the MinIO pipeline root is "minio://mlpipeline/v2/artifacts".
Assume I have installed MinIO in the cluster through the operator, reachable at:
minio.ns-1.svc.cluster.local:80

How can I specify this custom location?

@connor-mccarthy
Member

You can specify it when creating a pipeline with the @kfp.dsl.pipeline decorator:

from kfp import dsl

@dsl.pipeline(pipeline_root='gs://my-pipeline-root/example-pipeline')
def my_pipeline():
    ...
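
The example above uses a `gs://` root. For MinIO, the only URI shape that appears in this thread is `minio://<bucket>/<path>`, so a root for a MinIO bucket could be assembled as in the sketch below. `minio_pipeline_root` is a hypothetical helper, not part of the KFP SDK, and the sketch says nothing about which MinIO endpoint the backend connects to.

```python
def minio_pipeline_root(bucket: str, prefix: str = "v2/artifacts") -> str:
    """Build a `minio://<bucket>/<prefix>` string (hypothetical helper)."""
    bucket = bucket.strip("/")
    if not bucket:
        raise ValueError("bucket must not be empty")
    prefix = prefix.strip("/")
    # Omit the trailing slash when no prefix is given.
    return f"minio://{bucket}/{prefix}" if prefix else f"minio://{bucket}"


root = minio_pipeline_root("mlpipeline")
print(root)  # minio://mlpipeline/v2/artifacts
```

The resulting string would then be passed as `pipeline_root=root` to the `@dsl.pipeline` decorator shown above.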

@ashujain2

import kfp

# pipeline_func, experiment_name, run_name and arguments are defined elsewhere.
run_result = kfp.Client().create_run_from_pipeline_func(
    pipeline_func,
    experiment_name=experiment_name,
    run_name=run_name,
    arguments=arguments,
    mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE,
    pipeline_root='minio.ns-1://mlpipeline/v2/artifacts',
)

I tried that, but it's not working.

@connor-mccarthy
Member

@ashujain2, can you please provide an error message to help with debugging?

@ashujain2

pipeline_root='minio.ns-1://mlpipeline/v2/artifacts'

time="2022-08-11T18:29:46.802Z" level=info msg="capturing logs" argo=true
I0811 18:29:46.846397      28 cache.go:143] Cannot detect ml-pipeline in the same namespace, default to ml-pipeline.kubeflow:8887 as KFP endpoint.
I0811 18:29:46.846416      28 cache.go:120] Connecting to cache endpoint ml-pipeline.kubeflow:8887
F0811 18:29:46.955301      28 main.go:50] Failed to execute component: parse bucket config failed: unrecognized pipeline root format: "minio.cserver-minio.svc.cluster.local://ns-1/artifacts/v2/pipeline/example-metrics-viz/b29c9ad8-9123*********"
time="2022-08-11T18:29:46.956Z" level=error msg="cannot save artifact /tmp/outputs/metrics/data" argo=true error="stat /tmp/outputs/metrics/data: no such file or directory"
time="2022-08-11T18:29:46.956Z" level=error msg="cannot save artifact /tmp/outputs/output_text_path/data" argo=true error="stat /tmp/outputs/output_text_path/data: no such file or directory"
Error: exit status 1

Basically, I have set up MinIO and Kubeflow in the same EKS cluster, with the overall aim of segregating artifacts and metadata per namespace, one bucket per namespace.

MinIO endpoint: minio.cserver-minio.svc.cluster.local
On this dedicated MinIO server I have created a couple of buckets, such as ns-1, ns-2, ...

I was somehow able to get this working by following the steps described here: https://blog.min.io/how-to-kubeflow-minio/

Artifacts are going into the dedicated MinIO bucket, but pipeline outputs are still going to the default MinIO.
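
The "unrecognized pipeline root format" message in the log above is consistent with the root being rejected because a hostname was placed where the scheme goes. The sketch below reproduces that kind of check in plain Python. It is not the launcher's real parser: `parse_pipeline_root` and `PipelineRoot` are hypothetical names, and the accepted schemes are assumed from the documentation quote earlier in the thread (MinIO, Google Cloud Storage, S3).

```python
from typing import NamedTuple

# Assumption: only these schemes are accepted. The real launcher may differ.
KNOWN_SCHEMES = {"minio", "gs", "s3"}


class PipelineRoot(NamedTuple):
    scheme: str
    bucket: str
    prefix: str


def parse_pipeline_root(root: str) -> PipelineRoot:
    """Split `<scheme>://<bucket>/<prefix>` and reject unknown schemes."""
    scheme, sep, rest = root.partition("://")
    if not sep or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unrecognized pipeline root format: {root!r}")
    bucket, _, prefix = rest.partition("/")
    if not bucket:
        raise ValueError(f"pipeline root has no bucket: {root!r}")
    return PipelineRoot(scheme, bucket, prefix)


# The default root from this thread parses; the root with a hostname in the
# scheme position does not.
for candidate in (
    "minio://mlpipeline/v2/artifacts",
    "minio.ns-1://mlpipeline/v2/artifacts",
):
    try:
        print(parse_pipeline_root(candidate))
    except ValueError as err:
        print(f"rejected: {err}")
```

Under these assumptions the URI carries only a scheme, a bucket, and a path, so the MinIO endpoint itself would have to be configured somewhere other than the pipeline root string.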

@thesuperzapper
Member

thesuperzapper commented Jun 29, 2023

I have raised a specific proposal for achieving this in:

@xrwang8

xrwang8 commented Apr 1, 2024

Have you solved this problem, @ashujain2?

@dvaldivia

@xrwang8 You may have missed some configs. I usually shut down the internal MinIO after updating everything, and all the artifacts start going to the new MinIO. I have not tested this with newer Kubeflow releases, so there might be a new configuration that needs to be updated. What version of Kubeflow are you using?

@thesuperzapper
Member

@xrwang8

xrwang8 commented Apr 2, 2024

@dvaldivia I am using kubeflow-pipeline 2.0.5. What can I do to use my own MinIO?

Projects
Status: Done

10 participants