feat(backend): Added multi-user pipelines API. Fixes #4197 #4835

maganaluis · 2020-11-28T00:16:46Z

Added namespaced pipelines, with UI and API changes, as well as the ability to share pipelines.

Description of your changes:

Added a new field in Pipelines table for namespace.
Uploaded Pipelines are by default namespaced.
Ability to share Pipelines by selecting "shared" check-mark in the UI.
Authorization via SubjectAccessReview for Pipelines, PipelinesVersions, and Upload Pipelines endpoints.

k8s-ci-robot · 2020-11-28T00:16:57Z

Hi @maganaluis. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Bobgy · 2020-11-28T02:10:44Z

Awesome 👍👍
/ok-to-test
I will take a deeper look next week

Bobgy · 2020-11-30T06:25:29Z

/cc @yanniszark @elikatsis @Jeffwan @IronPan @chensun
CC all stakeholders

Bobgy

Did a quick high-level review.

The backend behavior LGTM.
I think we need some frontend design for how users can view and choose shared pipelines from the UI.

EDIT: I'll follow up with more detailed review once most people agree with the high-level behavior.

backend/api/pipeline.proto

frontend/src/pages/PipelineList.tsx

yanniszark · 2020-12-01T19:48:44Z

@Bobgy thanks for the ping, I'll take a look this week

backend/api/pipeline.proto

yanniszark

@maganaluis I took a quick look at the backend work. First of all, great work!
I have the following high-level comments:

I think the RBAC permissions are not aligned with the design outlined here: Multi-User Authorization: Add support for K8s RBAC via SubjectAccessReview #3513 (comment). More specifically, the design specifies that versions are a subresource of pipelines, but I see the checks on versions use the pipeline resource. In addition, several of the verbs used are wrong (e.g., checking for the list verb on a create handler). Could you go over the new endpoints and ensure they conform to the design?
I think that I saw the ListPipelines call return both namespaced and non-namespaced Pipelines. I could be wrong on this. Could you confirm that the List Pipelines API call only lists pipelines for the specified namespace?
What should we do about the separation of namespaced and non-namespaced pipelines? Should we differentiate between them in the authorization layer? (e.g., Pipeline vs ClusterPipelines). cc @Bobgy

Again, thanks for the great effort on this! 😄
cc @elikatsis to also take a look
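
To make the first point concrete, here is a minimal Go sketch of how each handler could map to the attributes checked in a SubjectAccessReview. The struct mirrors only a few fields of the Kubernetes `ResourceAttributes` type and is defined locally so the example is self-contained. The handler names, the `pipelines.kubeflow.org` group, and the `versions` subresource name are assumptions for illustration, not values confirmed by this PR.

```go
package main

import "fmt"

// ResourceAttributes mirrors a subset of the fields of the Kubernetes
// authorization/v1 ResourceAttributes type. It is defined locally for this sketch.
type ResourceAttributes struct {
	Namespace   string
	Verb        string
	Group       string
	Resource    string
	Subresource string
}

// attributesFor returns the attributes a handler would ask the authorizer about.
// The handler names and the verb/subresource table are hypothetical.
func attributesFor(handler, namespace string) (ResourceAttributes, error) {
	type rule struct{ verb, subresource string }
	rules := map[string]rule{
		"CreatePipeline":        {"create", ""},
		"GetPipeline":           {"get", ""},
		"ListPipelines":         {"list", ""},
		"DeletePipeline":        {"delete", ""},
		"CreatePipelineVersion": {"create", "versions"},
		"ListPipelineVersions":  {"list", "versions"},
	}
	r, ok := rules[handler]
	if !ok {
		return ResourceAttributes{}, fmt.Errorf("no RBAC rule for handler %q", handler)
	}
	return ResourceAttributes{
		Namespace:   namespace,
		Verb:        r.verb,
		Group:       "pipelines.kubeflow.org", // assumed API group
		Resource:    "pipelines",
		Subresource: r.subresource,
	}, nil
}

func main() {
	attrs, err := attributesFor("CreatePipelineVersion", "team-a")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", attrs)
}
```

Keeping the mapping in one table makes a mismatch such as a create handler checking the list verb easy to spot in review.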

backend/src/apiserver/model/pipeline.go

backend/src/apiserver/resource/resource_manager.go

backend/src/apiserver/server/pipeline_server.go

backend/src/apiserver/server/pipeline_upload_server.go

backend/src/apiserver/storage/pipeline_store.go

backend/src/apiserver/server/pipeline_server.go

backend/src/apiserver/server/pipeline_upload_server.go

maganaluis · 2020-12-05T01:06:40Z

@Bobgy @yanniszark I think returning only namespaced Pipelines makes sense, so I will remove the sharing capability for now, while still keeping the empty string "" as the default in the namespace field so that this capability can be added in the future. My main goal is to secure the Pipelines, so I'll go with whatever is simpler.

Bobgy · 2020-12-05T01:11:50Z

Thanks for covering both high level and low level problems!

> I think that I saw the ListPipelines call return both namespaced and non-namespaced Pipelines. I could be wrong on this. Could you confirm that the List Pipelines API call only lists pipelines for the specified namespace?

I can see he did it intentionally: if the API returns both, we don't need to adjust the UI to make both types of pipelines discoverable. This saves some initial cost. Furthermore, it is backward-compatible behavior that allows upgrading without breaking any SDK/UI client code.

I believe we need some further discussion on how to introduce the default behavior change. Shall we add a request field to switch the behavior, or add an API server configuration?

> What should we do about the separation of namespaced and non-namespaced pipelines? Should we differentiate between them in the authorization layer? (e.g., Pipeline vs ClusterPipelines). cc @Bobgy
>
> Again, thanks for the great effort on this! 😄
> cc @elikatsis to also take a look

I think we can discuss this after this PR, because this can be a progressive improvement.

Overall, I'd like to scope this PR down to the MVP changes needed to introduce pipeline separation. We can improve further on demand in follow-ups.

yanniszark · 2020-12-11T19:53:17Z

> I can see he did it intentionally, if API returns both, then we wouldn't need to adjust UI to make both types of pipelines discoverable. This is good for saving some initial cost. Further speaking, this is backward compatible behavior to allow upgrading without breaking any sdk/UI client code.

Returning pipelines only for the specified namespace is also backwards compatible. IMO, when the API call specifies a namespace, it doesn't make sense semantically to return things that are not in that namespace. It's a violation of the filter that the user has clearly specified.

@maganaluis please let me know when the PR is ready for another pass :)

maganaluis · 2020-12-12T00:02:10Z

@yanniszark I agree, let me go over the code one more time and re-test it from my side. Just a quick question, given we'll be making changes to the API, do I need to make modifications to the KFP SDK? Should this be a separate PR?

maganaluis · 2020-12-14T03:59:28Z

@yanniszark @Bobgy

I added the name_namespace index; thanks for the review, this was quite a big bug. :)

The current code will only display the Pipelines for the namespace being queried, and the "Shared" check-mark has been removed. I've already tested locally in MiniKube.

The only issue is that the Examples will not load for the users anymore, because these are being loaded in the "" (Public) namespace.
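
For readers following along, the sketch below illustrates why uniqueness presumably has to be enforced on the (name, namespace) pair rather than on the name alone once pipelines are namespaced. It uses a hypothetical in-memory store instead of the real pipelines table, and all type and function names are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// key models a composite unique key on (name, namespace).
type key struct{ name, namespace string }

// store is a hypothetical in-memory stand-in for the pipelines table.
type store struct{ rows map[key]bool }

var errDuplicate = errors.New("pipeline already exists in this namespace")

// create inserts a pipeline and rejects it only when the same name
// already exists in the same namespace.
func (s *store) create(name, namespace string) error {
	k := key{name, namespace}
	if s.rows[k] {
		return errDuplicate
	}
	s.rows[k] = true
	return nil
}

func main() {
	s := &store{rows: map[key]bool{}}
	fmt.Println(s.create("train", "team-a")) // first insert
	fmt.Println(s.create("train", "team-b")) // same name, different namespace
	fmt.Println(s.create("train", "team-a")) // same name, same namespace
}
```

With a unique key on the name alone, the second call would be rejected, so two teams could not both own a pipeline called `train`.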

Bobgy · 2020-12-15T06:28:59Z

> Just a quick question, given we'll be making changes to the API, do I need to make modifications to the KFP SDK? Should this be a separate PR?

You should regenerate the Python SDK with this script: https://github.com/kubeflow/pipelines/blob/master/backend/api/build_kfp_server_api_python_package.sh.
Adding any other helpers or fields in kfp/_client.py can be a separate PR.

Bobgy · 2020-12-15T06:45:03Z

> Returning pipelines only for the specified namespace is also backwards compatible. IMO, when the API call specifies a namespace, it doesn't make sense semantically to return things that are not in that namespace. It's a violation of the filter that the user has clearly specified.

Let me explain the end-to-end user journey for backward compatibility:

1. The user installs the current KFP with multi-user mode enabled, but pipelines are shared.
2. The user uses KFP, uploads some pipelines, and builds some automation around KFP using the KFP SDK and API.
3. The user upgrades KFP to a version in which pipelines are separated.
4. I'd hope that at this stage, all the existing KFP usages still give the user what they had before.
   4.1 The user should be able to open the KFP UI and see all the shared pipelines.
   4.2 The user should be able to use the KFP SDK to query all the shared pipelines.
   4.3 It's OK if the user uploads more namespaced pipelines, but those will not be returned by queries for shared pipelines.

One thing that will be breaking if pipeline endpoint only returns namespaced pipelines is that:

we will change KFP UI to query pipelines in one namespace only
so when the user opens KFP UI for a namespace, it will show nothing

Reiterating my goal here: I'd want a user of KFP to be able to upgrade to a new version while keeping all the existing functionality working, including KFP UI and KFP SDK usages. All the information that was available should still be available.

And after going over these scenarios, I think the minimal effort fix is to:
Add a special flag -- "includeShared" (you might find a better name for it) -- when querying KFP pipeline endpoints

The flag defaults to false, because that's the desired long term behavior of only showing namespaced pipelines
When KFP UI lists pipelines, it should query the namespace with includeShared=true, so that all previously shown pipelines still show up, while namespaced pipelines should also show up.

Other use-cases should have no problem with the new behavior:

if a create pipeline request does not specify a namespace, it's a shared pipeline as before
if a create pipeline request contains a namespace, it's a namespaced pipeline
if a list pipeline request does not specify a namespace, it lists all shared pipelines as before
if a list pipeline request contains a namespace, it lists namespaced pipelines
if a list pipeline request contains a namespace and includeShared=true, it lists both namespaced pipelines and shared pipelines (this will initially only be used by the KFP UI to keep the UI UX backward compatible; we can figure out a long-term path for this flag later)
if a create run request specifies a pipeline, we need to verify that the user has access to this pipeline: either it's shared, or the user has get access to pipelines in its namespace

This is for discussion, what do you think? I think it'll be better to wait until we all agree on this topic before starting coding.
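
The listing rules proposed above can be sketched in Go as a plain filter over an in-memory slice. This is only an illustration of the proposed semantics: the `pipeline` type, the `listPipelines` function, and the use of an empty namespace to mean "shared" are assumptions for the example, and `includeShared` is the tentative flag name from this comment.

```go
package main

import "fmt"

// pipeline is a hypothetical, minimal stand-in for the real model.
type pipeline struct {
	Name      string
	Namespace string // "" means the pipeline is shared
}

// listPipelines applies the proposed rules:
//   - no namespace in the request: only shared pipelines are listed;
//   - a namespace in the request: only pipelines in that namespace are listed;
//   - a namespace plus includeShared: both sets are listed.
func listPipelines(all []pipeline, namespace string, includeShared bool) []pipeline {
	var out []pipeline
	for _, p := range all {
		switch {
		case p.Namespace == namespace:
			out = append(out, p)
		case includeShared && p.Namespace == "":
			out = append(out, p)
		}
	}
	return out
}

func main() {
	all := []pipeline{
		{Name: "shared-example", Namespace: ""},
		{Name: "train", Namespace: "team-a"},
		{Name: "serve", Namespace: "team-b"},
	}
	for _, p := range listPipelines(all, "team-a", true) {
		fmt.Println(p.Name)
	}
}
```

Because the flag defaults to false, a client that passes a namespace and nothing else gets only that namespace's pipelines.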

maganaluis · 2021-02-22T01:03:47Z

@Bobgy As suggested, I removed all the UI changes, and ensured backwards compatibility.

1. User does not provide resource reference
2. User provides resource reference to public namespace ""
3. User provides resource reference to namespace x

I tested the namespaced functionality, and this also works as expected. If we are not going to make any changes on the UI, does it make sense to include the Python API changes or should we remove it as well?

Bobgy · 2021-02-22T08:16:11Z

@maganaluis Thanks! Let me take a look.

It's recommended to keep the Python API changes, because the SDK should reflect the latest API status.

Bobgy · 2021-02-22T08:58:01Z

/lgtm
/approve

Thank you again for the continued efforts!
I think this is good to go.

/hold
To give @StefanoFioravanzo @yanniszark @capri-xiyue a last chance to review.

google-oss-robot · 2021-02-22T08:58:07Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, maganaluis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~backend/OWNERS~~ [Bobgy]
~~manifests/kustomize/OWNERS~~ [Bobgy]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2021-02-22T08:58:07Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, maganaluis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~backend/OWNERS~~ [Bobgy]
~~manifests/kustomize/OWNERS~~ [Bobgy]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

yanniszark · 2021-02-23T10:55:14Z

Thanks @Bobgy @maganaluis, I'll try to take a look today or tomorrow.

yanniszark

@maganaluis thanks for the great work on this. I mainly have a few nits in the code and some questions around SQL indexes and migrations.

backend/api/pipeline.proto

backend/src/apiserver/model/pipeline.go

yanniszark · 2021-02-24T17:06:15Z

backend/src/apiserver/client_manager.go

+ response = db.Model(&model.Pipeline{}).RemoveIndex("Name")
+ if response.Error != nil {
+ glog.Fatalf("Failed to drop unique key on pipeline name. Error: %s", response.Error)
+ }
+


Should the migration code be wrapped in a transaction (remove index + add new)?
This is in order to not leave the database in an invalid state, in case the migration fails to complete.
Also, is this code idempotent? Meaning, what would happen if the "Name" index doesn't exist? Would it fail?

Maybe gorm does it automatically if the (name, namespace) index is declared on the Go struct (in AutoMigrate)?

That's my understanding as well, and yes, I also believe the operation is idempotent; I remember testing this locally. I added a comment here on what it would look like if KFP switches to the gorm main fork.
#5125

I think it's best to keep the operations separate, since that's the pattern currently used with this fork.

We have tested this starting from Kubeflow 1.1, and we are running the code I wrote, which includes this logic, in production. We did not observe any issues, and we tested it on both the in-cluster MySQL that ships with Kubeflow and an Azure MySQL.
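
As an illustration of the idempotency question, the sketch below models a table's indexes as an in-memory set and checks the current state before each step, so a second run changes nothing. It does not use gorm and makes no claim about how gorm's `RemoveIndex` behaves when the index is missing; the `indexSet` type and `migrate` function are hypothetical, and only the index names `Name` and `name_namespace` come from this thread.

```go
package main

import "fmt"

// indexSet is a hypothetical in-memory stand-in for the indexes on a table.
type indexSet map[string]bool

// migrate swaps the unique index on Name for one on (Name, Namespace).
// Each step checks the current state first, so running it again is a no-op.
// It reports whether anything was changed.
func migrate(idx indexSet) (changed bool) {
	if idx["Name"] {
		delete(idx, "Name")
		changed = true
	}
	if !idx["name_namespace"] {
		idx["name_namespace"] = true
		changed = true
	}
	return changed
}

func main() {
	idx := indexSet{"Name": true}
	fmt.Println(migrate(idx), idx) // first run performs the swap
	fmt.Println(migrate(idx), idx) // second run finds nothing to do
}
```

Against a real database, the two steps would additionally need a transaction, or a retry-safe ordering, to avoid a half-migrated state if the process dies between them.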

yanniszark · 2021-02-24T17:08:51Z

backend/src/apiserver/resource/resource_manager.go

+ if err != nil {
+ return "", util.Wrap(err, "Failed to get namespace from versionId ID")
+ }
+ pipeline, err := r.GetPipeline(pipelineVersion.PipelineId)


Maybe reuse GetNamespaceFromPipelineID here?

yanniszark · 2021-02-24T17:21:40Z

backend/src/apiserver/server/pipeline_server.go

+ refKey := filterContext.ReferenceKey
+ if refKey == nil {
+ // In single user mode, apply filter with empty namespace for backward compatibility.
+ filterContext = &common.FilterContext{
+ ReferenceKey: &common.ReferenceKey{Type: common.Namespace, ID: ""},
+ }
+ }
+ if refKey != nil && refKey.Type != common.Namespace {
+ return nil, util.NewInvalidInputError("Invalid resource references for pipelines. ListPipelines requires filtering by namespace.")
+ }
+ if refKey != nil && refKey.Type == common.Namespace {
+ namespace := refKey.ID
+ resourceAttributes := &authorizationv1.ResourceAttributes{
+ Namespace: namespace,
+ Verb: common.RbacResourceVerbList,
+ }
+ if err = s.CanAccessPipeline(ctx, "", resourceAttributes); err != nil {
+ return nil, util.Wrap(err, "Failed to authorize with API resource references")
+ }
+ }
+


nit: I would rewrite that as:

    refKey := filterContext.ReferenceKey
    // Validate first
    if refKey != nil && refKey.Type != common.Namespace {
        return nil, util.NewInvalidInputError("Invalid resource references for pipelines. ListPipelines requires filtering by namespace.")
    }
    if refKey == nil {
        // In single user mode, apply filter with empty namespace for backward compatibility.
        filterContext = &common.FilterContext{
            ReferenceKey: &common.ReferenceKey{Type: common.Namespace, ID: ""},
        }
    }
    namespace := refKey.ID
    resourceAttributes := &authorizationv1.ResourceAttributes{
        Namespace: namespace,
        Verb: common.RbacResourceVerbList,
    }
    if err = s.CanAccessPipeline(ctx, "", resourceAttributes); err != nil {
        return nil, util.Wrap(err, "Failed to authorize with API resource references")
    }

because it eliminates an if clause and imo makes it less complex. Your call though :)
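
One detail to watch in a restructuring like this: when `refKey` is nil, the default has to be assigned to the variable that is dereferenced afterwards, otherwise reading `refKey.ID` panics. The following is a minimal, self-contained sketch of the validate-then-default ordering. The `ReferenceKey` type, the `Namespace` constant, and `resolveNamespace` are local stand-ins written for this example, not the `common` package types.

```go
package main

import (
	"errors"
	"fmt"
)

// ReferenceKey is a local stand-in for the reference key used by the API server.
type ReferenceKey struct {
	Type string
	ID   string
}

// Namespace is a local stand-in for the namespace reference type.
const Namespace = "Namespace"

var errInvalidRef = errors.New("ListPipelines requires filtering by namespace")

// resolveNamespace validates the key first, then applies the default,
// and only then reads the ID, so a nil key never gets dereferenced.
func resolveNamespace(refKey *ReferenceKey) (string, error) {
	if refKey != nil && refKey.Type != Namespace {
		return "", errInvalidRef
	}
	if refKey == nil {
		// No reference given: fall back to the empty (shared) namespace.
		refKey = &ReferenceKey{Type: Namespace, ID: ""}
	}
	return refKey.ID, nil
}

func main() {
	ns, err := resolveNamespace(nil)
	fmt.Printf("%q %v\n", ns, err)

	ns, err = resolveNamespace(&ReferenceKey{Type: Namespace, ID: "team-a"})
	fmt.Printf("%q %v\n", ns, err)

	_, err = resolveNamespace(&ReferenceKey{Type: "Experiment", ID: "x"})
	fmt.Println(err)
}
```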

yanniszark · 2021-02-24T17:23:32Z

backend/src/apiserver/server/pipeline_server.go

+ */
+ refKey := filterContext.ReferenceKey
+ if refKey == nil {
+ // In single user mode, apply filter with empty namespace for backward compatibility.


This is for both multi-user and single-user at this moment right? It's for general backwards-compatibility.

Right, but you'll actually want to keep this code here going forward. Otherwise you'll run into issues with KFP standalone.

yanniszark · 2021-02-24T17:36:43Z

backend/src/apiserver/server/util.go

+func GetPipelineNamespace(queryString string) (string, error) {
+ pipelineNamespace, err := url.QueryUnescape(queryString)
+ if err != nil {
+ return "", util.NewInvalidInputErrorWithDetails(err, "Pipeline namespace in the query string has invalid format.")
+ }
+ return pipelineNamespace, nil
+}
+


nit: Since this function is only used in the upload server, maybe keep it there?

Makes sense. Done.
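
For reference, the behavior of the helper under discussion can be tried in isolation with only the standard library. The sketch below swaps the KFP `util` error helper for `fmt.Errorf`, and the lowercase function name is chosen for the example; `url.QueryUnescape` is the same standard library call used in the diff.

```go
package main

import (
	"fmt"
	"net/url"
)

// getPipelineNamespace decodes a URL-escaped namespace taken from a query string.
// The error wrapping uses fmt.Errorf here instead of the KFP util package.
func getPipelineNamespace(queryString string) (string, error) {
	ns, err := url.QueryUnescape(queryString)
	if err != nil {
		return "", fmt.Errorf("pipeline namespace in the query string has invalid format: %w", err)
	}
	return ns, nil
}

func main() {
	fmt.Println(getPipelineNamespace("team%2Da")) // escaped hyphen is decoded
	fmt.Println(getPipelineNamespace("%zz"))      // malformed escape yields an error
}
```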

yanniszark · 2021-02-24T17:40:22Z

backend/src/apiserver/storage/pipeline_store.go

+ if filterContext.ReferenceKey != nil && filterContext.ReferenceKey.Type == common.Namespace {
+ glog.Info("Using Namespace to filter the query")
+ query = query.Where(
+ sq.Eq{"pipelines.Status": model.PipelineReady,
+ "pipelines.Namespace": filterContext.ReferenceKey.ID},
+ )
+ } else {
+ query = query.Where(
+ sq.Eq{"pipelines.Status": model.PipelineReady},
+ )
+ }


Is this if/else needed? From what I understand, the filterContext always contains a namespace reference key, because the ListPipelines endpoint sets it to the default value if it's not present.

We still need to check this; otherwise you'll get a nil pointer dereference when reading the ID. The same applies to the List Pipelines method. It's mostly there to keep backwards compatibility with KFP standalone.
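
The nil guard described here can be sketched without the query builder by collecting the equality conditions in a map. The types below are local stand-ins, and the `"READY"` status value and the `whereClause` name are assumptions for the example rather than the values used by `model.PipelineReady` or the store.

```go
package main

import "fmt"

// ReferenceKey and FilterContext are local stand-ins for the API server types.
type ReferenceKey struct{ Type, ID string }
type FilterContext struct{ ReferenceKey *ReferenceKey }

// whereClause returns the equality conditions for the list query.
// The namespace condition is added only when a namespace key is present,
// so a missing key is never dereferenced.
func whereClause(fc *FilterContext) map[string]interface{} {
	cond := map[string]interface{}{"pipelines.Status": "READY"} // assumed status value
	if fc != nil && fc.ReferenceKey != nil && fc.ReferenceKey.Type == "Namespace" {
		cond["pipelines.Namespace"] = fc.ReferenceKey.ID
	}
	return cond
}

func main() {
	fmt.Println(whereClause(&FilterContext{}))
	fmt.Println(whereClause(&FilterContext{
		ReferenceKey: &ReferenceKey{Type: "Namespace", ID: "team-a"},
	}))
}
```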

google-oss-robot · 2021-02-26T01:09:54Z

New changes are detected. LGTM label has been removed.

maganaluis · 2021-02-26T13:02:57Z

@Bobgy @yanniszark Thank you for the reviews. It's good to go from my side.

Bobgy · 2021-02-26T13:27:09Z

/LGTM

Thank you a lot again @maganaluis @yanniszark!
I think a lot of people are looking forward to this.

Let's get this going! If there are any problems, we can always come back to fix.

Bobgy · 2021-02-26T13:30:34Z

/unhold

google-cla bot added the cla: yes label Nov 28, 2020

k8s-ci-robot added the size/XL label Nov 28, 2020

k8s-ci-robot requested review from Ark-kun and Bobgy November 28, 2020 00:16

k8s-ci-robot added the needs-ok-to-test label Nov 28, 2020

k8s-ci-robot added ok-to-test and removed needs-ok-to-test labels Nov 28, 2020

k8s-ci-robot requested review from chensun, elikatsis, IronPan, Jeffwan and yanniszark November 30, 2020 06:25

Bobgy reviewed Nov 30, 2020

k8s-ci-robot added size/XXL and removed size/XL labels Dec 4, 2020

maganaluis commented Dec 4, 2020

yanniszark suggested changes Dec 4, 2020

k8s-ci-robot added size/XL and removed size/XXL labels Dec 5, 2020

maganaluis force-pushed the master branch 2 times, most recently from b15ff17 to a3f0dda Compare February 22, 2021 00:56

k8s-ci-robot added the do-not-merge/hold label Feb 22, 2021

google-oss-robot assigned Bobgy Feb 22, 2021

google-oss-robot added the lgtm label Feb 22, 2021

google-oss-robot added the approved label Feb 22, 2021

yanniszark suggested changes Feb 24, 2021

k8s-ci-robot removed the lgtm label Feb 26, 2021

maganaluis force-pushed the master branch from 066df75 to df85f5b Compare February 26, 2021 01:32

Added multi-user pipelines backend

c5eefc1

corrected typo updating code based on review fixes for pipelines server reverting this back

maganaluis force-pushed the master branch from d73ca8a to c5eefc1 Compare February 26, 2021 02:57

removing unnecessary info logging

379cb83

k8s-ci-robot added the lgtm label Feb 26, 2021

k8s-ci-robot removed the do-not-merge/hold label Feb 26, 2021

google-oss-robot merged commit 5df2801 into kubeflow:master Feb 26, 2021

Bobgy changed the title ~~feat(backend): Added multi-user pipelines (UI + API); Fixes #4197~~ feat(backend): Added multi-user pipelines API; Fixes #4197 Jul 7, 2021

Bobgy changed the title ~~feat(backend): Added multi-user pipelines API; Fixes #4197~~ feat(backend): Added multi-user pipelines API. Fixes #4197 Jul 7, 2021

Bobgy mentioned this pull request Aug 18, 2021

support separate pipeline for each namespace #4197

Open

tasos-ale mentioned this pull request Feb 14, 2023

feat(frontend): Support namespaced pipelines from the UI. Closes #5084 #8831

Merged

aidandunlop mentioned this pull request May 22, 2024

Support namespaced custom resources on providers sky-uk/kfp-operator#40

Closed
