Multi-User support for Kubeflow Pipelines #1223

Closed
22 of 28 tasks
IronPan opened this issue Apr 24, 2019 · 67 comments
Labels
area/backend, area/frontend, area/wide-impact, help wanted, kind/feature, priority/p1, status/triaged

Comments

@IronPan
Member

IronPan commented Apr 24, 2019

[April/6/2020]
Latest design is in https://docs.google.com/document/d/1R9bj1uI0As6umCTZ2mv_6_tjgFshIKxkSt00QLYjNV4/edit?ts=5e4d8fbb#heading=h.5s8rbufek1ax

Areas we are working on:

Release

Areas related to integration with Kubeflow

=============== original description

Some users have expressed interest in isolating the cluster admin from the cluster user: the cluster admin deploys Kubeflow Pipelines as part of Kubeflow in the cluster, and the cluster user can use Kubeflow Pipelines functionality without being able to access the control plane.

Here are the steps to support this functionality.

  1. Provision the control plane in one namespace, and launch Argo workflow instances in another.
    • Provision the control plane in the kubeflow namespace, and Argo jobs in namespace FOO (parameterized).
    • The API server should update the incoming workflow definition to use namespace FOO. Sample code where the API server modifies the workflow
  2. Currently all workflows run under a ClusterRole, pipeline-runner (definition), which is specified during compilation (link). Workflows should run under a Role rather than a ClusterRole.
    • Change pipeline-runner to a Role, and specify the namespace during deployment (exposed as a deployment parameter).
    • The API server should update the incoming workflow definition to use the pipeline-runner Role.
  3. Cluster users can access the UI through an IAP/SimpleAuth endpoint instead of port-forwarding.
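
As a rough illustration of steps 1 and 2, the rewrite of an incoming workflow could look like the sketch below. It is written in Python for brevity and is not taken from the API server. The function name `retarget_workflow` and the namespace `foo` are hypothetical; `metadata.namespace` and `spec.serviceAccountName` are the standard Argo Workflow fields.

```python
import copy


def retarget_workflow(workflow: dict, namespace: str, service_account: str) -> dict:
    """Return a copy of an Argo Workflow manifest pinned to one namespace.

    Hypothetical helper: overwrites whatever namespace and service account
    the submitted manifest asked for, so the caller cannot pick its own.
    """
    patched = copy.deepcopy(workflow)  # leave the caller's manifest untouched
    patched.setdefault("metadata", {})["namespace"] = namespace
    patched.setdefault("spec", {})["serviceAccountName"] = service_account
    return patched


# Example manifest as a user might submit it (values are illustrative).
submitted = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Workflow",
    "metadata": {"generateName": "demo-", "namespace": "kubeflow"},
    "spec": {"entrypoint": "main"},
}

patched = retarget_workflow(submitted, namespace="foo", service_account="pipeline-runner")
```

Because the service account is namespaced, binding it to a Role (not a ClusterRole) in `foo` would keep the workflow's permissions inside that namespace.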
@IronPan
Member Author

IronPan commented Apr 24, 2019

Ideally this should be implemented in a way that gets Kubeflow Pipelines closer to supporting multi-user, e.g. by launching workflows in an arbitrary namespace.

@jlewi
Contributor

jlewi commented Apr 25, 2019

What's the priority of this?

How does this align with the broader plans in Kubeflow to support multiple users?

IronPan changed the title from "Cluster user and admin isolation" to "Multi-User support for Kubeflow Pipelines" on Apr 26, 2019
@IronPan
Member Author

IronPan commented Apr 26, 2019

This is not yet being prioritized, although I think it deserves a high priority.

In addition to admin/user isolation, here is a list of items needed to achieve full multi-user support for KFP:

  1. Every user (or group of users) will have a dedicated namespace, plus a service account, role, and role binding in that namespace. These resources should be created by the Kubeflow Profile CRD.
  2. With IAP integration, the incoming request contains the user's email. The Pipeline API server should authorize the email with the Kubernetes API by doing a user impersonation check.
  • When creating a job/run, the job/run should be created in the user's namespace and run by the service account in that namespace. The Argo CRD or scheduled workflow CRD should be able to control resources across all namespaces.
  • When creating any resource, the API server needs to add a column to the resource table recording the user's identity, namespace, or both, so it can filter resources in Get/List calls.
  • When getting or listing resources, the API server needs to filter them based on the user's privileges.
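
A minimal sketch of the request path described above, in Python and under several assumptions. The comment does not say which Kubernetes mechanism the check should use; this sketch builds a `SubjectAccessReview` body, which is one standard way to ask the Kubernetes API whether a named user may perform an action. The helper names, the row layout, and the choice of IAP's `x-goog-authenticated-user-email` header (with its `accounts.google.com:` prefix) as the identity source are illustrative assumptions, not the API server's implementation.

```python
from typing import Iterable, List, Mapping, Optional, Set


def user_from_headers(
    headers: Mapping[str, str],
    header_name: str = "x-goog-authenticated-user-email",  # assumed identity header
    prefix: str = "accounts.google.com:",                  # assumed value prefix
) -> Optional[str]:
    """Extract the user's email from request headers (keys assumed lowercase)."""
    value = headers.get(header_name)
    if not value:
        return None
    return value[len(prefix):] if value.startswith(prefix) else value


def subject_access_review(user: str, namespace: str, verb: str,
                          resource: str, group: str = "argoproj.io") -> dict:
    """Build a SubjectAccessReview body asking: may `user` do `verb` here?"""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": user,
            "resourceAttributes": {
                "namespace": namespace,
                "verb": verb,
                "group": group,
                "resource": resource,
            },
        },
    }


def visible_rows(rows: Iterable[dict], allowed_namespaces: Set[str]) -> List[dict]:
    """Filter stored resources by the (hypothetical) namespace column."""
    return [row for row in rows if row["namespace"] in allowed_namespaces]
```

The review body would be posted to the Kubernetes API and the `status.allowed` field of the response would decide whether the request proceeds. The namespace column is what makes the Get/List filter possible.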

IronPan added the help wanted, priority/p1, and kind/feature labels on Apr 26, 2019
@IronPan
Member Author

IronPan commented Apr 26, 2019

@jlewi
Contributor

jlewi commented Jul 22, 2019

@jessiezcc Any update on this work? Do you think this is something that will get done in Q3 and thus be part of 0.7?

@jessiezcc
Contributor

This work is not currently scheduled for Q3.

@IronPan
Member Author

IronPan commented Jul 24, 2019

Some customers have expressed interest in having ACLs for the API, e.g. locking down the API for deleting resources to admins only.

@krishnadurai

/cc @krishnadurai

@songole

songole commented Aug 14, 2019

/cc @songole

@yanniszark
Contributor

Hi @IronPan.
We (Arrikto) have been exploring this problem for the past month and we generally agree with your overview of the steps required to have multi-user functionality in pipelines.

I'm assigning this to myself. We have made good progress and should have initial support for multi-user pipelines in v0.7.

/assign @yanniszark

@Bobgy
Contributor

Bobgy commented Jul 10, 2020

Cross posting for clarification #4197 (comment):

EDIT: The features described below will be released with Kubeflow 1.1. You can use these instructions for a preview on GCP. It's NOT RELEASED YET.
Installation for Kubeflow 1.1 rc on GCP: https://github.com/kubeflow/gcp-blueprints/tree/v1.1-branch
KFP Multi User instructions: https://docs.google.com/document/d/1Ws4X1oNlaczhESNuEanZxbF-cnSfO78B1rBHWOkIAzo/edit?usp=sharing

Pipeline runs are already designed to run in user namespaces.
The only resource in the KFP core system that is not separated by namespace (as of today) is the static pipeline YAML files you upload to the server. They will remain public to anyone in the cluster. Users can try to launch any pipeline in their own namespaces.

For details about which resources and which services support namespace separation, please read this early access user instruction: https://docs.google.com/document/d/1Ws4X1oNlaczhESNuEanZxbF-cnSfO78B1rBHWOkIAzo/.

A quick list of things for which we don't support multi-user separation in the upcoming KF 1.1 release:

  • pipeline resources (the static yaml/tar files you upload)
  • minio artifact storage
  • MLMD

@Bobgy
Contributor

Bobgy commented Jul 10, 2020

If your organization would prefer pipeline resources to be separated by namespace, please upvote here. We can consider adding support if there is enough user interest.

EDIT: enough reactions collected, the issue is tracked in #4197 with priority

@animeshsingh
Contributor

@Bobgy it should be a feature which is enabled - if users want to "promote" their pipeline resource to be public, it's allowed. Otherwise it stays in their namespace by default.

@Bobgy
Contributor

Bobgy commented Jul 10, 2020

@Bobgy it should be a feature which is enabled - if users want to "promote" their pipeline resource to be public, it's allowed. Otherwise it stays in their namespace by default.

Yes, I agree. If we decide to implement it, we'll make it configurable.

@yaliqin

yaliqin commented Jul 10, 2020 via email

@jackwhelpton
Contributor

Just working my way through the documentation, thanks for pointing me in that direction. It seems geared around using kfp.Client to execute pipelines; what's the corresponding vision when executing through the UI? I was hoping that pipelines would execute in a namespace based on what's selected in the top drop-down, is that the idea?

@Bobgy
Contributor

Bobgy commented Jul 11, 2020

@jackwhelpton Yes, the feature you described is already there. It is not mentioned in the doc just because it works seamlessly.

@ca-scribner
Contributor

@Bobgy regarding the minio artifact store not being supported in the KF 1.1 release: does that mean a pipeline running in my namespace still writes to a shared artifact store? For example, is anything my pipeline writes implicitly (e.g. data written when piping results between steps in a pipeline, like consumer_op(producer_task.output)) accessible to anyone who can look inside that artifact store?

@Bobgy
Contributor

Bobgy commented Jul 17, 2020

@ca-scribner That's right.
The currently suggested workaround is to pass only URLs through minio, and let components read/write GCS/S3 directly and manage permissions there if you care about data separation.
(If you use TFX, that's already the case.)

Alternatively, I think minio supports multi-tenancy natively: https://docs.min.io/docs/multi-tenant-minio-deployment-guide.html. We'd welcome contributions on how that can be integrated with KFP multi-user mode.
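
To make the URL-passing workaround concrete, here is a small Python sketch. Everything in it is illustrative: the in-memory `store` dict stands in for GCS/S3, and the bucket naming scheme and function names are hypothetical. The point is only that the value handed between steps (and therefore stored in the shared artifact store) is a short URI, while the data itself lives where per-namespace permissions apply.

```python
from typing import Dict

# Stand-in for a blob store such as GCS or S3 (assumption: real components
# would use the storage client library and its access controls instead).
BlobStore = Dict[str, bytes]


def producer(store: BlobStore, namespace: str, run_id: str) -> str:
    """Write the real data to blob storage and return only its URI."""
    uri = f"gs://{namespace}-artifacts/{run_id}/output.csv"  # hypothetical layout
    store[uri] = b"col\n1\n2\n"
    return uri  # this short string is all that passes through the pipeline


def consumer(store: BlobStore, uri: str) -> int:
    """Read the data back from blob storage and count the data rows."""
    lines = store[uri].decode().splitlines()
    return len(lines) - 1  # subtract the header row
```

The trade-off is that each component has to know how to talk to blob storage itself.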

@ca-scribner
Contributor

@Bobgy ok we lose kfp's helpful automatic piping of real data, but the data is still secure. Only meaningful downside I think is that everyone has to teach their components how to talk to their blob storage rather than offloading it to reusable blob-put/blob-get components. That's a fair compromise.

You're right about minio multi-tenancy (I work in one atm). I'll ask around for ideas.

@blairdrummond

@ca-scribner I think Minio "multi-tenant" is slightly different from what we're doing; I think we're using OPA or Istio magic or something to provide every namespace with a private bucket on a single tenant. (We do have minimal vs. premium tenants, but that's different.) I think the term "tenant" is a bit overloaded here.

@RoyerRamirez

@jackwhelpton Yes, the feature you described is already there. It is not mentioned in the doc just because it works seamlessly.

Hi @Bobgy, we're hoping to get more clarification on multi-tenancy and the expected behavior. When you say "seamlessly", does that mean kubeflow will natively assign new experiments to the user's namespace as long as the headers are passed correctly, or do we need to add more components to our pipeline configuration to get the experiments to run under the user's namespace?

The reason I'm asking this is we're currently seeing the following msg in our [ ml-pipeline-scheduledworkflow ] logs:
time="2020-07-21T06:34:19Z" level=info msg="Processing object (inception-v3-transfer-hq5zv): object has no owner." Workflow=inception-v3-transfer-hq5zv

@Bobgy
Contributor

Bobgy commented Jul 23, 2020

@RoyerRamirez Yes, experiments will be assigned to the user's namespace (the namespace you selected in the Kubeflow dashboard). Actions will be authorized by the user's header.

The reason I'm asking this is we're currently seeing the following msg in our [ ml-pipeline-scheduledworkflow ] logs:
time="2020-07-21T06:34:19Z" level=info msg="Processing object (inception-v3-transfer-hq5zv): object has no owner." Workflow=inception-v3-transfer-hq5zv

Can you open a separate issue describing how you deployed and what problems you encountered?

@Jeffwan
Member

Jeffwan commented Nov 5, 2020

@Bobgy

A quick list of things for which we don't support multi-user separation in the upcoming KF 1.1 release:

  • pipeline resources (the static yaml/tar files you upload)
  • minio artifact storage
  • MLMD

Any plans for MLMD?
Are you talking about aggregation, i.e. we only read artifacts/executions that belong to KFP resources visible from the user's namespace?
Or native isolation on the MLMD side? I think the MLMD schema currently doesn't provide any concept of users.

@Bobgy
Contributor

Bobgy commented Nov 5, 2020

@Jeffwan Yes, your understanding is correct.
So far I'm not aware of any plan for MLMD multi-user separation.

/cc @neuromage @dushyanthsc
Is there anything you can share about this?

@maganaluis
Contributor

@Jeffwan @Bobgy Based on the initial documents that Karl shared as part of the Model Management group, MLMD was going to support a "Project" context, or at least the ability to create such a context. This project context could be tied to the user's Profile and provide the necessary isolation for metadata.

https://docs.google.com/presentation/d/1HiLIOm-ij0vdS_kEIQSAeICNsGSOl946qhT69WTgK5k/edit#slide=id.g8dfffc9b8a_0_37

@Jeffwan
Member

Jeffwan commented Nov 6, 2020

@maganaluis Hmm, it seems to remove context and bring in project, product, and workflow. Has this proposal been reviewed by the MLMD team? I feel like this is a big schema change, and some projects like TFX need to buy into the proposal, which may take some time. In the meantime, as a short-term solution, we can group artifacts/executions by the user's pipeline runs, as @Bobgy originally proposed. Currently, I think only KFP uses the metadata service, so it's fairly safe to do it this way.
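
A minimal Python sketch of that short-term grouping, assuming each metadata record can be traced back to the pipeline run that produced it. The record shapes and the `run_id` and `namespace` field names are hypothetical and are not part of the MLMD schema.

```python
from typing import Iterable, List


def artifacts_for_namespace(artifacts: Iterable[dict], runs: Iterable[dict],
                            namespace: str) -> List[dict]:
    """Keep only artifacts produced by runs that live in `namespace`.

    Isolation here is done by aggregation on the KFP side: the metadata
    store itself has no notion of users, so visibility is derived from
    which runs the user's namespace can see.
    """
    visible_run_ids = {run["id"] for run in runs if run["namespace"] == namespace}
    return [a for a in artifacts if a["run_id"] in visible_run_ids]
```

This filters what is shown; it does not stop a client that talks to the metadata service directly from reading other namespaces' records.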

@jlewi
Contributor

jlewi commented Nov 6, 2020

@maganaluis I think @karlschriek's doc is just a proposal, so it might change. In my discussions with @neuromage, we were talking about using labels to group metadata, so "project", "experiment", etc. might just be user-defined labels. As such, they probably wouldn't be closely tied to multi-user support.

@neuromage
Contributor

@Jeffwan Yes, your understanding is correct.
So far I'm not aware of any plan for MLMD multi-user separation.

/cc @neuromage @dushyanthsc
Is there anything you can share about this?

Hi, we currently have no plans to add multi-user support directly in MLMD. As you point out, there is unfortunately no support for users in the MLMD schemas right now. It would be worth exploring the use cases for multi-user MLMD to figure out the right approach as well.

@jlewi
Contributor

jlewi commented Nov 7, 2020

KFP multi-user shipped in KF 1.1.
I suggest closing this issue and opening up more actionable, scoped issues for further improvements.

@jlewi
Contributor

jlewi commented Nov 7, 2020

/close

@k8s-ci-robot
Contributor

@jlewi: Closing this issue.

