Figure out what Kubernetes installation method we 'support' #593

Closed · yuvipanda opened this issue Mar 20, 2018 · 33 comments

@yuvipanda (Collaborator) commented Mar 20, 2018

We currently have instructions for creating a Kubernetes cluster on Google Cloud, Azure, and AWS. We've had people try to set up clusters in other places:

  1. JetStream (from @zonca https://zonca.github.io/2017/12/scalable-jupyterhub-kubernetes-jetstream.html)
  2. Digital Ocean (Jupyterhub on Digital Ocean #401)
  3. Bare metal clusters in various places (Jupyterhub with Kubernetes on local cluster, custom hub #591, namespaces forbidden error while deploying in baremetal environment #277, JupyterHub CrashLoopBackOff #493 and various questions on the Gitter Channel)
  4. OpenStack (Add support for additional cloud providers #88, Openstack LoadBalancer #340, Notes from aleks (South Big Data Hub) meeting #237 (comment) and various discussions)
  5. OpenShift

I'd love for us to answer the following questions:

  1. When do we have instructions in the guide, and when do we just link to an external location?
  2. What do we do when users have problems with Kubernetes installations and come to us? Do we try to help fix them? Do we redirect them to the k8s community? Where is the line between a 'Kubernetes problem' and a 'JupyterHub problem'?
  3. For the providers we have instructions for, should we grade them with 'levels of support'? Which ones do we test with Kubernetes version updates and JupyterHub chart version updates?

I think we need to communicate clearly that you can use z2jh without depending on a cloud provider. This is especially important in the current climate of intense scrutiny of the privacy practices of various platforms. Several people have run k8s clusters successfully on all these platforms, but everyone has to re-learn the same lessons, with no place to share them.

Here are the various options I see:

  1. Introduce various installation methods in the z2jh guide that can be contributed to by users. We would answer question (3) from above and mark these with appropriate levels. Particularly important is how much testing we do with new releases, since we might not have access to these platforms for testing.
  2. Have a place to put links to various blog posts. This takes all decisions out of our hands and puts readers in 'reader beware, check the date before trying!' mode.
  3. Have a wiki or a community repo for doing option (1), to make it clearer that it is not part of the main z2jh guide. This can have much wider merge rights - perhaps anyone who lands a PR gets merge rights?

These are not exclusive, and there are definitely other options! Just writing these here to kick off the conversation :)

@yuvipanda (Collaborator) commented:

Tagging @gsemet, who I believe was interested in doing this for OpenStack.

Tagging @ZachGlassman, who was interested in doing this for DigitalOcean.

Tagging @aculich, who was interested in seeing this happen for bare metal installations (but might not have time himself).

Tagging @choldgraf, who told me that this was the #1 asked-for feature in his recently concluded tour of Europe :)

@choldgraf (Member) commented:

Just adding a quick bit of info: the "#1 asked-for feature" was basically "I want to run JupyterHub on k8s without giving Google/Amazon/MS a bunch of money".

@manics (Member) commented Mar 20, 2018

I'm happy to help with OpenStack too. However, maybe this could be combined with a bare-metal installation, especially if the instructions are designed for a "minimal functionality" Kubernetes cluster.

@gsemet (Contributor) commented Mar 20, 2018

Hi. Thanks for this notification. We don't use OpenStack but a custom bare-metal Kubernetes cluster (based on CoreOS). I didn't need any major changes to the guide, which was an excellent support for my setup and self-education, just a few doc enhancements around JupyterLab.

JupyterHub works like a charm behind a traefik ingress; I'll be happy to document this use case plus some weird/experimental Helm tricks (postgres, ...).
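(For illustration, a minimal sketch of what such a traefik-fronted setup can look like in the chart's config.yaml, assuming a chart version with ingress support; the hostname is a placeholder and key names may differ between chart versions:)

```yaml
# Hypothetical config.yaml sketch: route traffic through an existing
# traefik ingress controller instead of a cloud LoadBalancer.
proxy:
  service:
    type: ClusterIP          # no cloud load balancer; traefik does the routing

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: traefik
  hosts:
    - hub.example.org        # placeholder hostname
```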

We do have weird restrictions imposed by the team in charge of the cluster - for instance, all non-officially-supported software has to live in the same namespace, no DBaaS, and so on - so I usually open new tickets here :)

But I would be glad to contribute to a community wiki! I do not recommend the blog-links option as the main "community" section (a blog section inside the wiki is perfectly acceptable). Blog links break and may no longer be maintained by their authors; a wiki is the perfect tool to keep useful information up to date even as authors rotate.

@choldgraf (Member) commented:

I am also +1 on option 3 and probably option 1, but -1 on linking to blog posts, for the reasons others have stated (e.g. we don't want to create a project norm of "oh, don't use that blog post, use this newer blog post").

@ZachGlassman commented:

Hi. I would be happy to contribute to a community wiki. Like the folks above, I have been using https://github.com/kubernetes-incubator/external-storage with success for a while and could probably provide some tips on wiring that up.
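(For readers wanting the general shape of this: the external-storage provisioners watch for PersistentVolumeClaims that reference a StorageClass whose provisioner field matches theirs. A hedged sketch, with placeholder names throughout:)

```yaml
# Hypothetical StorageClass for an nfs-client provisioner deployed from
# kubernetes-incubator/external-storage; the provisioner string must match
# whatever the provisioner deployment was configured with.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: example.com/nfs   # placeholder; must match the deployment
```

z2jh then just needs to request that class for user storage, e.g. in config.yaml (the exact key has varied across chart versions):

```yaml
singleuser:
  storage:
    dynamic:
      storageClass: nfs-client   # assumed key; older charts used storage.class
```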

@zonca (Contributor) commented Mar 20, 2018

I think a wiki or community repo would be fine, and it is important to publish guidelines that set out how much freedom a user has to edit an existing recipe. Or do we need a group of maintainers for each recipe (swcarpentry style)?

@yuvipanda (Collaborator) commented:

Also /cc @willingc and @jzf2101, who have probably put a lot of thought into structuring documentation.

@willingc (Collaborator) commented Mar 20, 2018

My initial instinct is that:

  • The major pain point for users is getting to a basic Kubernetes cluster. Each cloud provider or infrastructure uses its own naming, jargon, and approach. While things are perhaps functionally similar (e.g. storage, CPU, memory, networking), the path to a cluster capable of supporting Kubernetes differs widely.
  • The dividing line between basic infrastructure and JupyterHub is at the point where Helm is installed and charts can be loaded. I would focus our efforts from that point up to a functioning JupyterHub.
  • As for each vendor, I personally do not wish to be an expert in each one ;-) Things like the Kubernetes service catalog may make this easier in the long run. In the short term, I would be inclined to take the sections for each major infrastructure setup and create a recipe, similar to conda-forge. Here's a basic GCP recipe, an Azure recipe, etc. to get the cluster up and running - a vanilla recipe, if you will, as @zonca mentions.

I need to do a bit more thinking on the docs, but I suspect it may make sense to have a source subfolder for vendor-specific/infra-specific docs that we link to. I'll make a WIP PR later as an example. I think it will both modularize the cluster docs and provide extensibility for each major infra up to the point of installing a helm chart.

cc/ @minrk

@choldgraf (Member) commented Mar 20, 2018

Totally agree regarding the "dividing line" - I think there's one big difference that separates out two "types" of vendors: one group offers some kind of "click this button and we'll give you a k8s cluster" service, and the other requires you to set all of that up yourself.

My intuition is that the former group is much easier to "officially" support/maintain within z2jh, while the latter will require some other process for generating community knowledge on bootstrapping your own k8s cluster. Alternatively, we could include it as a sub-module of the docs, but with big letters saying "we do not have the bandwidth to support you if these docs don't work out for you, and your mileage may vary".

Either way, I like the idea of modularizing the vendor-specific docs rather than just glomming them onto the one page :-)

@willingc (Collaborator) commented:

FYI @choldgraf: I'm going to break out "Creating a Kubernetes cluster" on the main docs page into its own section, separate from the "Creating your JupyterHub" section. Working on a WIP PR that we can iterate on, or pitch if we come up with something better.

@choldgraf (Member) commented:

Cool! Excited to see how the PR looks :-) thanks for taking an initial stab!

@cam72cam (Contributor) commented:

I'd like to do an AWS kops guide, but without using Heptio. Would that be welcome?

@yuvipanda (Collaborator) commented:

@cam72cam most definitely! #243 is the issue for that.

@willingc (Collaborator) commented:

I think that PR #594 gives us a good base to work from for adding community docs.

@minrk (Member) commented Mar 21, 2018

I think that, for the most part, z2jh should not have docs on setting up Kubernetes. Copy/paste commands for cloud providers that give you easy, fully-functional clusters make sense, but openstack/kops/kubeadm ought to be out of scope. For those, I think we should link to external docs on setting up Kubernetes clusters; I don't want to put us in a position of getting bug reports about deploying a Kubernetes cluster, since there is an endless supply of options and things that can go wrong.

But some description of what we require of a Kubernetes setup is appropriate, so that people looking to deploy their own Kubernetes know what they need to have. The main points for supporting less-than-complete Kubernetes installations are dynamic storage and the load-balancer, which are often absent from self-deployed clusters, as mentioned above. It may be appropriate to include "if you don't have a load-balancer" and/or "if you don't have a storage provider" configuration examples.
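(A minimal sketch of what those two configuration examples could look like in config.yaml; the keys follow the z2jh chart's values, though names may vary by chart version:)

```yaml
# Hypothetical config.yaml sketch for a cluster that lacks a LoadBalancer
# implementation and a dynamic storage provisioner.
proxy:
  service:
    type: NodePort        # no LoadBalancer; expose the proxy on a node port
    nodePorts:
      http: 30080         # placeholder; reach the hub at http://<node-ip>:30080

singleuser:
  storage:
    type: none            # no dynamic PV provisioner; user data is ephemeral
```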

@gsemet (Contributor) commented Mar 21, 2018

I agree. There are many possible Kubernetes cluster configurations, and setting them up should not, IMHO, be under the umbrella of z2jh, but you can give hints on what to do given the user's configuration. Maybe a full example using minikube (with an ingress) could help newbies; I have a harder time using minikube now than using our cluster, since I now know better what our cluster can and cannot do.
And if I move to another cluster (GCE, ...), I will be happy to find these different options described in the z2jh guide.

@yuvipanda (Collaborator) commented:

I'll try to summarize the discussion and point out specific questions we can answer.

The consensus, of sorts, seems to be:

  1. We (z2jh maintainers) do not want to be in the business of telling people how to set up Kubernetes.
  2. We should provide minimal instructions for popular providers that have a very easy quickstart. But we should keep these as minimal as possible, and link out wherever we can. It's still unclear where we draw this line.
  3. We should clearly list the requirements of the Kubernetes clusters we can run on - load balancing, PVCs, etc.

I propose the following series of actions:

  1. Write up a guide on what features a Kubernetes cluster is required to have to run z2jh. This is provider-agnostic.
  2. Set up a grading system to describe the providers we will have instructions for. Possible grades:
    a. Grade A could mean 'Provider has marked it as non-beta; the z2jh team tests this prior to each release'. Currently this would only be Google Cloud. We'll have minimal quickstart instructions in the repo, and link out for everything else.
    b. Grade B could mean 'Provider has marked it as beta; the z2jh team tests this when they can (with the date of the last test in the instructions)'. Currently this would mark Azure and Amazon. We'll have minimal quickstart instructions in the repo, and link out for everything else.
    c. Grade C could mean 'Fully community owned; the z2jh team does not test this actively'. This would not have any quickstart instructions - only links to external resources would be present in our repo. We would accept PRs for any specific settings / versions required (structured in line with (1)) to run z2jh on this kind of cluster. OpenStack Magnum, OpenShift, bare metal clusters, etc. fall here. Most of the in-repo content would just be links.

What do people think of this?

@gsemet (Contributor) commented Apr 3, 2018

Sounds great! Makes sense. Do you think you (z2jh) could still host a wiki (MediaWiki, GitHub wiki, or any other) for the community to self-organise around?

@yuvipanda (Collaborator) commented:

@gsemet I'd like to, but let's open a separate issue to discuss that? I think these are all big questions, and keeping the discussion focused is important.

@consideRatio (Member) commented:

For reference: Digital Ocean is about to start providing Kubernetes!

/cc @jzf2101

@jzf2101 commented May 15, 2018

@consideRatio Also see #646 as a reference. We should keep an eye on it as they roll this out.

@willingc (Collaborator) commented:

I should be getting early access to the Digital Ocean service when I return home.

@jzf2101 commented May 15, 2018

Great. Let us know what you find @willingc

@consideRatio (Member) commented:

With #758, features are introduced that benefit from having a node-pool or equivalent concept whose nodes can all be labeled as well as tainted.

Is this possible within AWS / Azure etc.? Node-pool labeling and tainting?

Also, a Kubernetes-aware cluster autoscaler is available, but not official, on Amazon's cloud? Hmmm?
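(For what it's worth: outside GKE this is typically expressed in the provisioner's node-group definition rather than a console checkbox. A hedged sketch for AWS using a kops InstanceGroup; the machine type, sizes, and cluster name are placeholders, with the label/taint names following the convention the chart's scheduling features expect:)

```yaml
# Hypothetical kops InstanceGroup: a labeled and tainted node pool
# dedicated to user pods on AWS.
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  name: user-nodes
  labels:
    kops.k8s.io/cluster: example.k8s.local   # placeholder cluster name
spec:
  role: Node
  machineType: m5.large
  minSize: 0
  maxSize: 10
  nodeLabels:
    hub.jupyter.org/node-purpose: user
  taints:
    - hub.jupyter.org/dedicated=user:NoSchedule
```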

@alexmorley (Contributor) commented Mar 18, 2019

Hey all. From what I've experienced so far, the Digital Ocean Kubernetes offering (still in beta) is (quite a lot) easier to get started on than anything else I've tried so far (Azure and GCP). I've started a PR with setup instructions at #1192.

@ablekh commented Mar 18, 2019

@alexmorley Nice! Just keep in mind that, AFAIK, the current feature set of Digital Ocean's managed K8s offering is not fully comparable with the major public cloud providers, due to the lack of automatic upgrades, auto-scaling, multiple availability zones, integration with AD, etc. It would also be interesting to compare pricing with the usual suspects, as well as performance (cluster creation/deletion times). Do you have any comments on this?

@consideRatio (Member) commented:

Oh thanks for the insights about this @ablekh!

@ablekh commented Mar 18, 2019

@consideRatio My pleasure! :-)

@betatim (Member) commented Mar 18, 2019

I think we should keep our guides as unopinionated as possible. If features that a basic JupyterHub needs are missing, then we shouldn't have a guide; if features beyond that are missing, that should be left as a judgement call for the user. We shouldn't become a place that passes judgement on which k8s offering is "the best". We already struggle just to provide setup guides; also having to make defensible and fair judgements on additional features would sidetrack us, IMHO.

I'm not sure what the best place for sharing such additional info is, though, as it is clearly interesting and relevant for people trying to pick a provider :-/

@alexmorley (Contributor) commented Mar 19, 2019

> We already struggle just to provide setup guides; also having to make defensible and fair judgements on additional features would sidetrack us, IMHO.

I agree. If possible, finding an external resource for choosing a k8s provider and linking out to it seems like the ideal way to go.

@ablekh No, I find it hard to compare the pricing directly, but I'd guess it follows the standard pattern: DO is cheaper for smaller workloads but more expensive at scale.

@atulyadavtech commented:

Please add information about bare-metal cluster installation. Looking forward to it...

@consideRatio (Member) commented:

I think we need some notes on bare metal clusters, as they lead to questions like "why is my LoadBalancer k8s service pending?" which we could address there. Overall, though, the practical path we have taken so far has been: if someone does the big job of adding a big chunk on how to install z2jh on a certain cloud, then we have added it. Maintenance has been reasonable in my mind.
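(As an aside for readers hitting that pending-LoadBalancer question: one common bare-metal answer is a software load-balancer implementation such as MetalLB. A hedged sketch of its classic layer-2 ConfigMap with a placeholder address range; newer MetalLB releases configure this through CRDs instead:)

```yaml
# Hypothetical MetalLB layer-2 config: hands out LoadBalancer IPs from a
# local address range so `type: LoadBalancer` services stop pending.
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2
        addresses:
          - 192.168.1.240-192.168.1.250   # placeholder LAN range
```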

Closing this in favor of another issue now, where I summarize the wish for more on bare metal clusters.
