[Multi user] How do we release KFP multi user in Kubeflow? #3645

Bobgy · 2020-04-29T02:49:28Z

Part of #1223

/cc @jlewi @chensun @IronPan

Options:

- As a fork based on kubeflow 1.0.2
- Merge to kubeflow 1.0 branch and release as 1.0.3
- Wait for kubeflow 1.1 (any ETA?)

Note, it's only for gcp-iap deployment initially.

Bobgy · 2020-04-29T02:50:30Z

@jlewi I think we want a way to release it soon (maybe as experimental first).

So we could also release it as an experimental fork of kubeflow 1.0.2 while waiting for either 1.0.3 or 1.1.

What do you think?

jlewi · 2020-04-29T16:45:43Z

@Bobgy Why do we need a fork? If it's on master, can't people just deploy from master?

jlewi · 2020-04-29T17:04:36Z

@Bobgy Is all the relevant code/fixes on master?

What is the path forward for turning on Istio in the kubeflow namespace? kubeflow/kfctl#296

Does the KFP manifest (https://github.com/kubeflow/manifests/tree/master/pipeline) need to be updated with the multi-user changes?

Bobgy · 2020-04-30T09:12:03Z

@jlewi I verified that using the namespace resource works (I guess I misconfigured something when I tried it before), so we can just use kfctl 1.0.2.

All relevant code/fixes are on KFP master and my forked manifest branch: https://github.com/Bobgy/manifests/pull/3/files.

(It can be deployed normally by following https://www.kubeflow.org/docs/gke/deploy/deploy-cli/ and using CONFIG_URI="https://raw.githubusercontent.com/Bobgy/manifests/kfp-multi-user/kfdef/kfctl_gcp_iap.yaml" instead.)

Bobgy · 2020-04-30T09:17:16Z

I would prefer that, in this release, only KFP (in multi-user mode) is experimental while all other components are stable. Therefore, I think KF 1.0.2 might be a good candidate as a base.

What do you think?

In the meantime, I can try to merge the forked changes to Kubeflow master.

jlewi · 2020-05-01T01:03:38Z

@Bobgy My suggestion would be to get things working on master. I don't know that we want to push big changes onto release branches. Either way, getting it checked in and working on master is a precursor, so let's do that first.

Bobgy · 2020-05-01T04:29:58Z

I think we'd want to release a relatively stable version soon, in a forked branch, for the community to try out KFP multi-user mode.

In the meantime, I can try to start sending PRs to master.

jlewi · 2020-05-04T16:07:44Z

Would back-porting this onto the 1.0 branch be consistent with semantic versioning?
If we backport this onto 1.0, do we risk blocking our ability to release minor patch fixes to 1.0?

Bobgy · 2020-05-05T13:22:36Z

I see, that makes sense to me.

Shall we

- target the 1.1 release like before, and
- provide a quick try-out version based on forking 1.0.2 unofficially?

jlewi · 2020-05-06T04:07:23Z

My suggestion would be to get it working on master. Once we have it working on master, we can either add a git tag or possibly create a new release branch. I'd probably recommend we bump the minor version, e.g. 1.1.0-alpha1.
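
To illustrate why a tag such as 1.1.0-alpha1 leaves room for further 1.0.x patch releases, here is a minimal sketch of semantic-versioning order. The helper name `version_key` is hypothetical, and the pre-release comparison is simplified to a plain string compare (the full spec compares dot-separated identifiers), so treat it as an illustration rather than a complete implementation.

```python
import re
from typing import Tuple


def version_key(version: str) -> Tuple[int, int, int, bool, str]:
    """Return a sort key for a MAJOR.MINOR.PATCH[-PRERELEASE] string.

    Hypothetical helper for illustration only. Pre-release tags are
    compared as plain strings, which is a simplification of the spec.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?", version)
    if match is None:
        raise ValueError(f"not a recognized version: {version!r}")
    major, minor, patch, pre = match.groups()
    # A final release (no pre-release tag) sorts after any pre-release
    # that shares the same MAJOR.MINOR.PATCH core.
    return (int(major), int(minor), int(patch), pre is None, pre or "")


if __name__ == "__main__":
    versions = ["1.1.0", "1.0.2", "1.1.0-alpha1", "1.0.3"]
    print(sorted(versions, key=version_key))
```

Under this ordering, 1.0.3 still sorts below 1.1.0-alpha1, so an experimental pre-release on the 1.1 line does not collide with patch fixes on the 1.0 branch.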

Bobgy · 2020-05-06T05:03:36Z

@jlewi Sounds good.

After thinking it over again, I agree there's not much value added if the early access fork is based on 1.0.2, because KFP itself is also experimental.

I'll try to merge to kubeflow master first.

jlewi · 2020-05-07T14:20:58Z

@Bobgy +1
It would be good to get the multiuser support integrated into the GCP blueprint for blueprint users. That should be fairly straightforward. Blueprints should be easier to upgrade so it will be easier to roll out changes to customers using the blueprints.

I think the only changes would be:

- Add pipelines to the stacks kustomization file
  - Which is related to "[Pipelines] Reuse KFP standalone manifests in kubeflow/manifests" (manifests#1048) but not necessarily blocked on it.
- Replace the namespaces package in the blueprint ([this file](https://github.com/kubeflow/gcp-blueprints/blob/master/kubeflow/instance/kustomize/namespaces/namespaces.yaml)) with the one you added to kubeflow/manifests so the blueprint is in sync

Per GoogleCloudPlatform/kubeflow-distribution#5, I'm setting up autodeployments for GCP blueprints. That should make it easy for us to verify that it is all working and ultimately to add tests for it.

Bobgy · 2020-05-08T02:08:14Z

@jlewi Thanks for the suggestion! Because I have limited capacity to work on these, I will get these changes merged to kubeflow/manifests master first before trying out the GCP blueprint.

yanniszark · 2020-05-08T12:01:52Z

@Bobgy thanks a lot for confirming.
It seems that the user-identity parsing code is already present in a few places and probably more in the future.

While we (and other users) can test multi-user pipelines on our on-prem installations, it's very difficult to dig through every part of the code and make sure the headers are configurable.
The best time to incorporate these options is when the code is first written.

Is there a plan to make the headers configurable like the rest of Kubeflow's web apps? (Jupyter Web App, CentralDashboard, etc).
That would allow non-GCP users of Kubeflow to use multi-user pipelines.

@jlewi we had similar issues with web apps being GCP-only in v0.6 and we contributed changes throughout the code after the fact. After v0.6, we had agreed to use the established headers, so we wouldn't have to do it again. How can we make sure that new code is written in a way that it doesn't have an artificial dependence on GCP?

jlewi · 2020-05-08T13:59:11Z

> we had agreed to use the established headers

Do you mean we agreed to use environment variables to make the headers programmable? I recall discussing 2 options:

1. Using Istio to transform the headers to some standard values
2. Making the headers programmable for each application.

I thought we mostly went with the second option?

@yanniszark I think the best way would be to come up with ways to make it easy for application developers to follow best practices. Ideally, that would mean finding some way to handle it centrally so people don't have to think about it, or creating reusable libraries.

Tests would also help. If application developers write tests for the feature (e.g. multiuser pipelines) and we have CI set up for different platforms, then that test would fail on other platforms and provide an appropriate signal.

Finally documenting it and promoting it is also helpful.

Bobgy · 2020-05-08T14:31:31Z

Multi-user is a very complex feature, so I don't see a problem with prototyping on just one platform first. It was enough that we agreed on the high-level design at the beginning and made sure it should work with all platforms. Now, it's pretty trivial to port the feature.

There are only two places where we are using it: https://github.com/kubeflow/pipelines/search?q=x-goog-authenticated-user-email&unscoped_q=x-goog-authenticated-user-email

@yanniszark @jlewi Can you share what options for other apps look like for configuring headers? I'm not sure they have enough visibility for other developers.

yanniszark · 2020-05-08T15:32:36Z

> Do you mean we agreed to use environment variables to make the headers programmable?

@jlewi
Yes, exactly, that was bad phrasing on my part.
Since in v0.6 we agreed to make them configurable (through env vars, CLI args or config files), I would expect new applications integrating with Kubeflow's multi-tenancy to follow that approach.

> @yanniszark @jlewi Can you share what options for other apps look like for configuring headers? I'm not sure they have enough visibility for other developers.

Absolutely!
Here are some examples:

@Bobgy I agree that expectations of Web Apps in terms of auth should be clear.
To help with making the concepts more clear and streamline implementations of future components, I have started a design doc that gathers all the current knowledge that is spread through the code in one place:
https://docs.google.com/document/d/11Xi-I2OqJvUuy_Zg0NskMF9GmworRDJhlu68poDgK5c/edit#bookmark=id.d3wqgeictsp4
I expect we will also discuss it further in Tuesday's Kubeflow community meeting.

The KUBEFLOW_USERID_HEADER and KUBEFLOW_USERID_PREFIX options would be a great start to get Pipelines working on all platforms.
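
As a rough illustration of what those two options could look like in application code, here is a minimal sketch. The function name `get_user_id` is hypothetical, and the fallback values (the IAP header name seen in this thread and an `accounts.google.com:` prefix) are assumptions chosen for the example, not something the KFP code is confirmed to use.

```python
import os
from typing import Mapping, Optional

# Assumed defaults for illustration: the GCP IAP header mentioned in this
# thread, plus the prefix IAP puts in front of the email address.
DEFAULT_HEADER = "x-goog-authenticated-user-email"
DEFAULT_PREFIX = "accounts.google.com:"


def get_user_id(
    headers: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Extract the user identity from request headers (hypothetical helper).

    The header name and prefix are read from KUBEFLOW_USERID_HEADER and
    KUBEFLOW_USERID_PREFIX, so non-GCP platforms can override them.
    """
    header_name = environ.get("KUBEFLOW_USERID_HEADER", DEFAULT_HEADER).lower()
    prefix = environ.get("KUBEFLOW_USERID_PREFIX", DEFAULT_PREFIX)

    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get(header_name)
    if raw is None:
        return None

    # Strip the platform-specific prefix only when it is actually present.
    if prefix and raw.startswith(prefix):
        raw = raw[len(prefix):]
    return raw or None


if __name__ == "__main__":
    # An on-prem style configuration; the header name here is an example value.
    env = {"KUBEFLOW_USERID_HEADER": "kubeflow-userid", "KUBEFLOW_USERID_PREFIX": ""}
    print(get_user_id({"kubeflow-userid": "user@example.com"}, environ=env))
```

Passing the environment in as a parameter keeps the lookup easy to exercise in tests for several platforms without touching the real process environment.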

Bobgy · 2020-05-13T06:41:57Z

/close
As we have reached agreement, further progress will be tracked in #3693

k8s-ci-robot · 2020-05-13T06:42:02Z

@Bobgy: Closing this issue.

IronPan mentioned this issue Apr 29, 2020: Multi-User support for Kubeflow Pipelines #1223

Bobgy self-assigned this Apr 29, 2020

Bobgy added the status/triaged, cuj/multi-user, and area/release labels Apr 29, 2020

jlewi mentioned this issue May 4, 2020: Multiple Teams & Multi-User Kubeflow Support kubeflow/kubeflow#4983

Bobgy mentioned this issue May 6, 2020: [Multi User] Multi user mode early access release #3693

Bobgy mentioned this issue May 13, 2020: [Multi User] Make user identity header configurable #3752

k8s-ci-robot closed this as completed May 13, 2020

sylus mentioned this issue May 27, 2020: Add multitenancy support for Kubeflow pipelines StatCan/aaw#3