WG ownership of kubeflow/manifests - relation to application owners #400

Closed
jlewi opened this issue Aug 24, 2020 · 25 comments

Comments

@jlewi
Contributor

jlewi commented Aug 24, 2020

Which working group will own the kubeflow/manifests repo?

How will it relate to application owners?

Right now kubeflow/manifests doesn't cleanly separate the responsibilities of platform and application owners. For example (see kubeflow/manifests#1498), who decides how testing is done for the repository?

One option would be to allow/encourage application owners to host the source of truth for the kustomize manifests inside their own repositories, e.g.:

  • kubeflow/pipelines - would host manifests for kubeflow/pipelines
  • kubeflow/katib - manifests for katib
  • kubeflow/kfserving - manifests for kfserving
  • etc.

The platform owners could then aggregate these manifests and build automation to do so.

I believe a lot of applications (e.g. pipelines, katib, kfserving) are already storing/developing their manifests inside their repositories.
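
For illustration only, here is a minimal sketch of what such aggregation could look like using kustomize remote bases; the repo subpaths and `ref` pins below are assumptions, not the actual manifest layouts:

```yaml
# Hypothetical aggregation kustomization maintained by a platform owner.
# Each entry pulls the source-of-truth manifests from the application's own repo.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Subpaths and refs are illustrative; the real layouts would be owned by each application WG.
  - github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic?ref=master
  - github.com/kubeflow/katib/manifests/v1beta1?ref=master
  - github.com/kubeflow/kfserving/install?ref=master
```

A platform (e.g. AWS, GCP, OpenShift, MiniKF) could then layer its own overlays on top of an aggregate like this.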

cc @kubeflow/kfserving-owners
cc @yanniszark @swiftdiaries @animeshsingh
cc @Bobgy

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/question 0.80


@ellistarn
Contributor

KFServing has been maintaining standalone installations: https://github.com/kubeflow/kfserving/tree/master/install. As I understand it (needs hard data), the majority of our production customers install KFServing this way.

In general, I think this practice helps applications decouple from a monolithic Kubeflow release process. I do see some drawbacks regarding shared dependencies (Istio), and I expect that the majority of the integration work will involve compatibility testing due to these dependencies.

@yanniszark
Contributor

@jlewi that would be nice, because I understand that app teams already need to define some form of manifests for installation, e.g., for testing.
However, having a monolithic repo to gather all manifests has allowed us to enforce certain best practices and tests. How would we run the unit tests for manifests developed in other repos? This is also related to the test-infra restructure issue.
Should we start packaging our unit tests as GitHub Actions, so that other repos can easily apply them?
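
For illustration, a hedged sketch of how that could work: a composite action published from a central repo, which application repos call from their own workflows. The action name, paths, and steps below are hypothetical, not an existing Kubeflow action; they assume kubectl (with its built-in kustomize support) is available on the runner.

```yaml
# Hypothetical composite action, e.g. kubeflow/manifests/.github/actions/manifest-test/action.yaml
name: manifest-test
description: Render a kustomization and fail on errors (illustrative sketch only)
inputs:
  manifest-dir:
    description: Directory containing the kustomization to test
    required: true
runs:
  using: composite
  steps:
    - name: Render the kustomization with kubectl's built-in kustomize
      shell: bash
      # Fails the job if the overlay does not build; purely local, no cluster needed.
      run: kubectl kustomize "${{ inputs.manifest-dir }}" > /tmp/rendered.yaml
```

An application repo could then reference the shared action from its own workflow:

```yaml
# Hypothetical workflow in an application repo
name: manifest-tests
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: kubeflow/manifests/.github/actions/manifest-test@master  # hypothetical path
        with:
          manifest-dir: manifests/v1beta1  # hypothetical directory
```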

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
area/engprod 0.61


@jlewi
Contributor Author

jlewi commented Aug 24, 2020

@yanniszark

Should we start packaging our unit tests as GitHub Actions, so that other repos can easily apply them?

The details of how testing should be done should be the prerogative of the owners. So we need to figure out who owns and maintains kubeflow/manifests before we can resolve any questions about how testing should work.

I can think of two possible paths forward

  1. There is a WG related to deployments that owns kubeflow/manifests

    • This WG would be responsible for negotiating agreements with the respective WGs (e.g. pipelines, kfserving) to figure out how responsibility for maintaining the manifests is divided between them
  2. No more communal kubeflow/manifests repo

    • Instead, each application is responsible for its own manifests
    • Each platform (e.g. AWS, OpenShift, GCP, MiniKF) is responsible for figuring out how to assemble those manifests into a platform

@swiftdiaries
Member

Once #401 goes in, I can draft a wg-manifests(?) proposal.
Since most of the maintainers are common to both kf/manifests and kf/kfctl, my initial thought is to add them both as subprojects of the WG. The expectation would be to maintain both repos.

WDYT?
/cc @crobby @adrian555 @animeshsingh @Jeffwan @PatrickXYS

@swiftdiaries
Member

cc @yanniszark

@PatrickXYS
Member

Once #401 goes in, I can draft a wg-manifests(?) proposal.
Since most of the maintainers are common to both kf/manifests and kf/kfctl, my initial thought is to add them both as subprojects of the WG. The expectation would be to maintain both repos.

I would consider naming this WG wg-deployment. And I think the idea to combine kfctl and manifests is good.

The question is how we can

negotiate agreements with the respective WGs (e.g. pipelines, kfserving) to figure out how responsibility for maintaining the manifests is divided between them

@swiftdiaries Feel free to bring it up in the community meetings, and we can get some feedback from folks.

@crobby
Member

crobby commented Aug 25, 2020

Once #401 goes in, I can draft a wg-manifests(?) proposal.
Since most of the maintainers are common to both kf/manifests and kf/kfctl, my initial thought is to add them both as subprojects of the WG. The expectation would be to maintain both repos.

WDYT?
/cc @crobby @adrian555 @animeshsingh @Jeffwan @PatrickXYS

I'm a fan of this. In my world, kfctl and manifests are indeed closely related and it definitely makes sense to have some consistency in their maintenance.

@Bobgy
Contributor

Bobgy commented Aug 25, 2020

Will the WG also handle releases? All applications and platforms need to be ready before we can cut a release in the manifests repo.

@PatrickXYS
Member

@Bobgy I think so; the WG needs to pick up the release responsibility. Otherwise, version releases and code contributions would be handled separately, which would make the WG's responsibility unclear.

@Jeffwan
Member

Jeffwan commented Aug 26, 2020

I would like to have a wg-control-plane, which might be a superset of wg-manifests. We need a wider group to deal with the minimum Kubeflow offering. Besides kfctl and manifests, it would be better to also cover some basic Kubeflow components like notebook releases, the profile controller, conformance tests, etc. But if there are separate WGs taking care of those applications, I think wg-control-plane could stay lightweight.

@animeshsingh
Contributor

+1 wg-control-plane

And the Istio+Dex combination as well.

@swiftdiaries
Member

Proposal for wg-deployment: #402
I wanted to keep this WG's scope narrow and create a broader working group, wg-control-plane, for the core Kubeflow applications that don't necessarily fall under any specific working group.

@animeshsingh
Contributor

animeshsingh commented Aug 28, 2020

@swiftdiaries each working group involves meetings, sync-up infrastructure, Slack channels, et al. The more we have, the less attendance each gets. A control-plane WG covers the key repos (kfctl/manifests/istio+dex), with deployment and management of deployed Kubeflow as the key goal, and it also gets a critical mass of attendees.

We can leave notebooks + the UI dashboard out of the scope of this WG.

@PatrickXYS
Member

I think wg-control-plane's scope should start smaller.

That means we first take on a small set of areas for wg-control-plane; after we prove we have the ability/capacity to handle that small set, we move forward with integrating more Kubeflow components.

For example, we can start from:

  • deployment
  • jupyter-web-app
  • istio+dex

After that, we can gradually expand the scope, such as taking care of kubeflow/testing, etc.

@jlewi
Contributor Author

jlewi commented Aug 28, 2020

I don't think Jupyter should be in scope for a control-plane or deployment WG. I believe there is thinking around having a WG focused either on notebooks or, more broadly, the data-scientist user experience; see #379.

@swiftdiaries
Member

+1 on jupyter-web-app being separate.

Regarding @animeshsingh's and @PatrickXYS's comments, I was under the impression that since Istio+Dex is maintained under the kubeflow/manifests repo, it automatically falls within the scope of wg-deployment.

@animeshsingh
Contributor

Other than the name, I am fine with the scope, which is kfctl/manifests/istio+dex.

The reason 'deployment' doesn't cover the nuance of istio+dex is that they are common services for load balancing/ingress/authentication/authz; hence the name 'control-plane' would be more fitting.

If it's only kfctl/manifests, then the name captures it fine.

@animeshsingh
Contributor

Another point to note is that kfctl now also includes the Operator, which is not only 'deployment' but lifecycle management of deployed Kubeflow: it watches the deployed Kubeflow and takes corrective actions when things go south.

@jlewi
Contributor Author

jlewi commented Aug 28, 2020

@swiftdiaries I wouldn't overindex on current code locations. Where code is located is often due to organic evolution and may not be a good indication of appropriate owner/governance. We should figure out appropriate ownership and then fix code location when needed.

Istio and Dex are similar to PodPresets (#381) in that they are fairly generic K8s infrastructure, and it's not clear what the proper WG to own them would be.

One option is to move some of these applications upstream or downstream of Kubeflow so that Kubeflow can focus more on AI-specific aspects.

  • By upstream I mean moving them into the relevant OSS project; e.g. perhaps Istio or Dex is the right place to maintain tooling for integrating Istio and Dex. Is there really anything about it that's specific to KF?
  • By downstream I mean making them the responsibility of a specific platform owner that depends on them.

@Jeffwan
Member

Jeffwan commented Aug 28, 2020

@swiftdiaries
I feel good about the current scope of wg-deployment. I agree we can start with a narrow scope. The question I have is: will the community create wg-control-plane later, and if so, what's the relationship between wg-deployment and wg-control-plane?

@Jeffwan
Member

Jeffwan commented Aug 28, 2020

In order to deploy Kubeflow, I think the following are the minimum assets that need to be maintained long-term. We need a WG to take on all of them.

Manifests (Kubeflow doesn't own the code; these are upstream or third-party components):

  • istio
  • cert-manager
  • dex & auth service
  • application-controller
  • spartakus
  • Kubeflow-roles

Components (with corresponding manifests):

  • access-management
  • profile-controller
  • admission-webhook (podpreset)
  • centraldashboard

Jupyter is out of scope. I assume there's a separate WG taking care of:

  • notebook-controller
  • jupyter-web-app
  • tensorflow-notebook-image

@thesuperzapper
Member

Here is a link to my proposal for kubeflow/manifests, as this issue probably has most of the people who want to see it:
kubeflow/manifests#1554 (comment)

@stale

stale bot commented Dec 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot closed this as completed Dec 26, 2020