Restructuring Kubeflow docs proposal #440
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: RFMVasconcelos The full list of commands accepted by this bot can be found here.
Thank you for doing this @RFMVasconcelos, I left a few comments.
- Using IBM Cloud Container Registry (ICR)
- Pipelines on IBM Cloud Kubernetes Service (IKS)
- End-to-end Kubeflow on IBM Cloud
- **Kubeflow Operator**
Is the Kubeflow operator also part of the Kubeflow components?
Or would we say that `kfctl` is the tool to deploy, monitor, and manage the lifecycle of Kubeflow?
Kubeflow operator seems to be deployment tooling, loosely connected to `kfctl`, though under the same WG umbrella. It could fall into a "Lifecycle management" section or under "Getting started", as in effect it is an alternative deployment method for OpenShift.
I agree with that @RFMVasconcelos.
My personal thought is that all `Lifecycle management` tools should be under the same website section.
- Experiment with the Pipelines Samples
- Run a Cloud-specific Pipelines Tutorial
- Troubleshooting
- Reference
Pipelines has its own Reference.
Does it make sense to move the Pipelines reference under /docs/reference, where the other projects' references are located?
It seems that Pipelines and Fairing are the only components that have references outside of reference. Might make sense to merge everything.
I think this comes back to the question of splitting component applications onto separate sites. In that case, each component application's documentation will need to be self-contained.
From the conversations in the community meetings that I have attended, it sounds like we are moving towards separate sites for each component. Would it make sense to add a separate file for modeling out what that restructure would look like if we split the component docs onto other sites? Or, should we model that in this doc?
As for Reference, if we move to the separate domains method, I would argue that we should keep a copy of the reference on both, in the same manner as we currently do with `manifests` in the GitHub org.
@joeliedtke happy with either. I guess none of this is set in stone. With this document I aim to identify the quickest structural changes we can make to make the website clearer. We still need to rely on WG-leads to make sure their lower-level docs structure makes sense.
It looks like I can't edit files in this PR, so here is a quick sketch of what I have been thinking about for information architecture of Kubeflow.org after the component docs move to their own site:
- Kubeflow Home Page
- Getting started (same as current proposal, consider renaming to About Kubeflow)
- Community (same as current proposal)
- Components
  - Jupyter Notebooks (Single page that describes what this component is and where to learn more)
  - Central Dashboard (Single page that describes what this component is and where to learn more)
  - Metadata (Single page that describes what this component is and where to learn more)
  - Fairing (Single page that describes what this component is and where to learn more)
  - Feature Store (Single page that describes what this component is and where to learn more)
  - Frameworks for training (Single page that describes what this component is and where to learn more)
  - Hyperparameter Tuning (Single page that describes what this component is and where to learn more)
  - Kubeflow Pipelines (Single page that describes what this component is and where to learn more)
  - Jupyter Notebooks (Single page that describes what this component is and where to learn more)
  - Tools for Serving (Single page that describes what this component is and where to learn more)
  - Multi-Tenancy (Single page that describes what this component is and where to learn more)
  - Nuclio functions (Single page that describes what this component is and where to learn more)
- Docs
  - Deployment (same as current proposal)
  - Configuring Kubeflow (same as Setups in the current proposal)
  - Resources (Review content, if it is general to Kubeflow then keep this section)
  - Troubleshooting
  - Reference (Review content, if it is general to Kubeflow then keep this section)
The thought is to focus the Kubeflow.org site on describing the Kubeflow project, components, and deployment options. All of the docs for the component applications will be hosted on other sites.
Hi @joeliedtke, thank you for the feedback!
Regarding whether docs are hosted on other sites, I think the downsides might be inconsistency, driving people away from Kubeflow.org, and making it harder to ensure the websites are high quality. But despite that, I'm happy with what the community decides.
As per your suggestions:
- I like the `Kubeflow Home Page` on the sidebar.
- I think `Docs` is not a good word there as all are docs, but I see the point of merging those 5 things under an umbrella.
- I also like `Configuring` instead of `Setups`.
@RFMVasconcelos just to be clear, is this a proposal for a quick (but temporary) fix to the doc structure? Because seeing the discussion in kubeflow/website#2293, it's very clear the community wants to split the docs for each component into their own websites/subdomains. (While obviously leaving general stuff in
@thesuperzapper please see this comment. WDYT?
What about AgileStacks distribution of Kubeflow? It's available for AWS
and on-premise installation. This is an attractive option for Kubeflow
users looking for consistent, production-ready deployments of Kubeflow on
AWS and on-prem. We discussed the Distributions section that contains an
overview page about each distribution with a link to more detailed
documentation about installing Kubeflow using distribution specific steps.
There is a similar issue for OpenShift distribution, Arrikto distribution,
and Canonical distribution.
On Wed, Jan 13, 2021 at 3:38 PM davidyuyuan <notifications@github.com>
wrote:
… Thanks, Joe!
Best regards,
David Yuan
> On Jan 13, 2021, at 18:34, Joe Liedtke wrote:
>
>
> That's a good point, I hadn't noticed that those were moved out of the
> Getting Started section. Perhaps we should keep the Deployment and
> Configuration sections with docs and links to the different deployment
> options.
>
> Home
> Getting Started
> Community
> Components
> Documentation
> Deployment
> Configuration
> Kubeflow on AWS
> Kubeflow on Azure
> Kubeflow on Google Cloud
> Kubeflow on IBM Cloud
> Kubeflow on OpenShift
> Resources
> Reference
@joeliedtke, thank you for the feedback! I currently have @mameshini, thank you as well! I also see that this proposal does not have a good space for distributions yet. I believe the best way to bring clarity then is through the separation of: This would look like:
What are your thoughts on this? It would be great to reach a consensus soon, so we can make this actionable, given that this effort is now 2.5 months old :) And no changes have taken effect. cc @jbottum @jlewi @aronchick @Bobgy @james-jwu @8bitmp3 @PatrickXYS @thesuperzapper @andreyvelich @cvenets
@RFMVasconcelos your proposal for docs structure looks great to me, happy to help with the content.
Let's move forward with this doc proposal, overall lgtm. Thank you for your contributions @RFMVasconcelos
One of the challenges that I see in revising the information architecture at this time is that we have a couple of large changes within the project that are currently in progress, such as moving the component docs off of Kubeflow.org and moving towards providing different distributions of Kubeflow. In both cases, I don't think that we have enough information to make the best decisions, so I lean towards making smaller changes now and incrementally revising the plan as the larger changes within the project move forward. For example, will Distributions be distinct from Platforms? Or, will we have individual distributions to support different cloud providers or service providers? Are the docs for the AgileStacks distribution the same on-prem and AWS, or is this a separate set of documentation? If the documentation should support individual distributions, then we may not need a platforms section (since content for a platform could live within the section for the distribution). If there is some content that is shared between distributions (for example an AWS distribution and AgileStacks on AWS), then we may want to consider options for managing shared content within the distributions section. (That said, this increases the complexity of managing the docs.) I think that more discussions are required to understand the relationship between Distributions and Platforms going forward. So, I'm not sure that it makes sense to add Distributions and Platforms at this time. There is probably another set of questions: is Kubeflow.org the best place for documentation about distributions? Should Kubeflow.org describe the distributions that are available and help users determine which distribution is right for them and then link the user to the distribution's docs? (Similarly to the long term plan for components.) Also, please do not abbreviate Distributions to Distros in the docs.
@joeliedtke Yes, Distributions are separate from Platforms. Distributions are based on upstream Kubeflow but complement it with additional installation tools, opinionated configurations, or integrations. For example, the OpenShift distribution of Kubeflow brings it together with other OpenShift tools. It only helps Kubeflow users to know about the available deployment options: A) do it yourself from the latest upstream on a selected platform (such as Azure) or B) select a distribution and follow distribution-specific deployment instructions. There is also an option to engage professional services. Kubeflow.org is an excellent place to list all available distributions, similar to kubernetes.io and many other open source projects. This was already discussed many times and I am surprised you are asking this question. Distributions exist to make it easier to deploy a subset of Kubeflow components on a selected cloud platform, with available commercial support from a vendor. The detailed content specific to each distribution will consist of an overview page with links pointing to distribution-maintained. In most cases the instructions to deploy a distribution will be platform-dependent. For example, separate pages for on-prem, GCP, and AWS. But we don't have to worry about it - leave it to the distribution. We are building a community of users and companies that work together on making Kubeflow the best open source machine learning platform.
@joeliedtke, I think we shouldn't be stuck until we have the perfect information to move forward. Many people in the community have shown support for this initiative and honestly, if we don't start now, we will continue driving users through a poor UX, which we know hinders growth in the adoption of Kubeflow, which is a driver for sales in the organizations involved and hence feeds back to how much these orgs are able to invest in Kubeflow. It is my belief that we should be impatient in providing a comprehensible Kubeflow user experience.
@abhi-g @Bobgy @james-jwu @jlewi @richardsliu @theadactyl as the OWNERs able to approve this initiative, can you please review? Many thanks! 🚀
I'm not suggesting that we wait for a perfect solution. I am concerned about premature optimization. Making decisions about things that we don't fully understand is a good way to create future problems. I'm suggesting that we limit the scope and iteratively work towards improving the site.
@mameshini To what extent are distributions and platforms distinct? For example, is all of the documentation for Kubeflow on AWS applicable to Kubeflow on AWS when deployed by AgileStacks? Should the documentation for an AgileStacks distribution specify everything that someone will need to know to install Kubeflow? Or, is there a subset of content for Kubeflow on AWS that is applicable to the AgileStacks distribution? Analysis like this can help us determine if we need both sections (and I'm not arguing that we don't need distributions... I'm wondering if platform specific content may become part of the content for a distribution.) and techniques that will be helpful in managing this content.
@RFMVasconcelos Please change Distros to Distributions.
@joeliedtke There is a subset of content for Kubeflow on AWS that is applicable to the AgileStacks distribution, but certainly there are some differences. Each distribution is making some opinionated choices to provide better integration or user experience. For example, documentation on AWS walks a user through steps to configure ACM, Cognito, Auth0. For the AgileStacks distribution we decided to use Let's Encrypt, Dev, LDAP to make it work the same way as on-prem. Therefore, information about installing Kubeflow can be very distribution and platform specific, while information about using Kubeflow should be very common between distributions. We also provide a set of tutorials/examples that are based on the AgileStacks distribution, so we can expect S3 buckets, secrets, configmaps to be available for the tutorials to work out of the box. Platform specific content (like Kubeflow on AWS) is in addition to distribution-specific content, and in most cases it will apply completely accurately to the AgileStacks distribution. However, our goal is to maintain a distribution that is deployed using AgileStacks Hub CLI and make it work across cloud and on-prem consistently. The AWS team is doing a great job maintaining AWS specific documentation, however they don't have a goal for this documentation to also work for on-prem deployments. Therefore they are selecting some tools and configuration options that will not work on-prem or on GCP. Multiple distributions make different choices for installation and integration, which are fine-tuned for different Kubeflow end user personas.
@mameshini Sorry for the delay getting back to you on this. To me this sounds like the platform documentation should exist as part of the distribution content. We could help users understand which distributions work on their platform of choice, but it sounds like the bulk of the platform content is distribution specific. I would recommend the following changes:
What links are you envisioning appearing in the top nav?
Distributions documentation is in addition to platforms specific documentation. I agree that it's better to rename "Methods and Distributions" to simply "Distributions". Let's build it out and then we can fine tune it later. Overall the proposal lgtm.
@joeliedtke, thank you for all the comments. I have removed "Methods &", leaving only "Distributions". As @mameshini suggests, let's take this piece by piece, and eventually we will get to the creation of things like more index pages :) If in the meantime you'd like to start one of those efforts in parallel we can definitely attempt that. I think this has to look like a list of issues, and so tomorrow I will create a project with a list of issues, so we can start making all of this better 🚀
@RFMVasconcelos, to clarify, are you planning to add issues for the entire doc plan, or a project where we can create issues for different aspects of this project? (For example, moving Pipelines to Components, or reordering the content of the Components section.) I agree that it would be helpful to track this work in one place. I would currently recommend against creating all of the tasks now, since discussions are continuing on some aspects of this plan.
@RFMVasconcelos @joeliedtke as discussed in this week's community meeting, we should get moving on the "Components" second refactor. This is because there is no controversy around it, and we want WGs to start updating their docs for Kubeflow 1.3. For all those following this thread, please go to kubeflow/website#2465 for the "Components" issue which discusses the specifics of how we will do the refactor. Let's aim to get that refactor done within the next week.
Sounds great to me as a start! We can start with I've started this project and so we can track issues/PRs so we can break this beast into smaller tasks and address them effectively. @joeliedtke this is what I meant by project, no need to make all issues for the entire doc plan now, let's start small and soon we'll have improved the overall UX :) |
@joeliedtke @mameshini @thesuperzapper I have created the first set of issues & PRs all attached to this project, so we can get this up and running! 🚀
As per discussion in the last community meeting, adding "Methods" back in, as for example `kfctl` is not a distribution but rather an installation method.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Phase 1 - mapping out docs today.
See kubeflow/website#2293