[Part 7 of #205] Affinities (node, pod, anti_pod - required and preferred) #239
Conversation
Force-pushed from d98ab70 to 83cfda4.
This PR makes me think we don't have the right strategy for exposing Kubernetes options. As much as possible, I would like to pass config through to Kubernetes itself, rather than requiring a new traitlet for every conceivable Kubernetes option. That's never going to be sustainable, as it would require us to develop as fast as Kubernetes and to duplicate their documentation for every option. Instead, I think we should rely as much as possible on "passthrough" options (e.g. extra_pod_config). So I would like to set this goal for exposing new options via traitlets:
Only if none of these conditions are met should we expose a new Kubernetes option via a new traitlet. DockerSpawner takes this approach and has very few options, yet it still exposes essentially everything you can possibly do with Docker, because it lets you pass arbitrary extra info through to each call to the Docker API via config. The advantage of the passthrough approach is that we don't have to keep pace with every option Docker or Kubernetes adds. What do you think? Are there reasons that can't work? As a test case, can any/all of these options be set via extra_pod_config?
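For illustration, here is a rough sketch (not from this PR) of what the passthrough approach could look like in a jupyterhub_config.py, assuming extra_pod_config merges its keys verbatim into the generated pod spec; the label selector and topology key are made-up examples:

```python
# jupyterhub_config.py -- `c` is the config object JupyterHub provides to this file.
# Hypothetical passthrough of a Kubernetes-shaped affinity via extra_pod_config,
# assuming these keys end up merged as-is into the generated pod spec.
c.KubeSpawner.extra_pod_config = {
    "affinity": {
        "podAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": 100,
                    "podAffinityTerm": {
                        "labelSelector": {
                            "matchLabels": {"component": "singleuser-server"}
                        },
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }
}
```

The point of the sketch is that the nested dict is exactly the structure the Kubernetes docs describe, so no KubeSpawner-specific naming has to be learned.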
Thank you @minrk for the review ❤️! TL;DR summary: this PR contains essential logic for constructing the affinity field of the PodSpec based on what's been set in six different arrays, and that logic needs to live either here or in Z2JH. I'm confident it will fit best here and will help the most people this way. I think the biggest piece of work in creating a field mapping from the kubespawner API to the k8s API is actually starting to support most available fields in the first place; after that, the PodSpec will be quite stable, I think. This is the k8s API definition for PodSpec-V1-Core. I figure there would be a big difference using the passthrough for these fields: node_affinity_required, for example, is nested as NodeSelectorTerm objects in the final array.
Several K8s API types are involved in these field names, for example NodeAffinity, NodeSelector, and NodeSelectorTerm.
So node_affinity_required, for example, is nested deep within the affinity field of the PodSpec; it is a mess to add it there, and updating the field after adding it would be even messier! This logic can either live in kubespawner or be reimplemented by every user of kubespawner; I'd love to see it in kubespawner rather than in each jupyterhub_config.py that builds up the final configuration. If it isn't in kubespawner, we would end up mitigating the issue by adding such a helper function to the z2jh.py file for the hub image. I would really want to avoid asking users to use kubespawner through extra_pod_config systematically, but perhaps we could make that easier in some other way, such as using the API definitions of PodSpec to declare automatically which fields are available. That would be a big project though. The nesting I mean is sketched below.
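A rough sketch of that nesting (illustrative only, not code from this PR), assuming node_affinity_required takes NodeSelectorTerm-shaped dicts; the label key and values are made up:

```python
# A NodeSelectorTerm, the level node_affinity_required is described as targeting:
node_selector_term = {
    "matchExpressions": [
        {"key": "example.com/node-purpose", "operator": "In", "values": ["user"]}
    ]
}

# To reach the Kubernetes API it has to be wrapped several levels deep in the PodSpec:
pod_spec = {
    "affinity": {                                               # Affinity
        "nodeAffinity": {                                       # NodeAffinity
            "requiredDuringSchedulingIgnoredDuringExecution": { # NodeSelector
                "nodeSelectorTerms": [node_selector_term],      # list of NodeSelectorTerm
            }
        }
    }
}
print(pod_spec)
```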
Ok for me: if an option is a bool or a string sitting right on top of the PodSpec, don't add it. Those kinds of fields are the simplest to add, though, and could help users of kubespawner avoid needing to know the full Kubernetes API.
There are two passthroughs currently, extra_pod_spec and extra_container_spec; I don't think there will be a need for more.
I think kubespawner could be remade to be more dynamic, flexible, and general, but the change I have in mind would require infrastructure additions that I don't see anyone being willing to make. I'm thinking about dynamically reading a schema the way kubeval does, using kubeval's underlying library that downloads the k8s schemas and converts them to JSON. We could also write logic to navigate the provided pod configuration using such a schema.
Force-pushed from 83cfda4 to 7808419.
Rebased on #238.
I think as much as possible our goal in KubeSpawner should be:
I think it is more complicated for users to have to look at the Kubernetes docs to see how the options behave and what values they can have, and then check the KubeSpawner docs for how to specify them. If we had a single "the pod spec goes here, docs are in Kubernetes" option, there would be less information users need, not more. In particular, I think we should try very hard to have config keys whose names are exactly identical to the Kubernetes names for the same things (snake_case vs camelCase notwithstanding). For instance, if I know how to specify pod affinity with Kubernetes, I still don't know how to specify pod affinity with KubeSpawner. I think naming things is pretty important, and when you are exposing an already-named public API, one you are already linking to for docs, as we are here, there is a nontrivial cost to giving it a different name. I may know that a pod affinity looks like:

```yaml
podAffinity:
  preferredDuringSchedulingIgnoredDuringExecution: ...
```

but with this PR, after I've read the Kubernetes docs for what an affinity looks like, I also need to look in the KubeSpawner docs for what we have decided to call the same option.
I specifically mean deep config values like podAffinity here. If a user had, in extraPodSpec:

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution: ...
```

in the helm chart, and that dict was passed through exactly as-is via jupyterhub_config, I think it would do exactly what we wanted. Is that not true?
I don't see how this is the case. What's "a mess" about the above config?
I don't see what infrastructure is required to do this. Doesn't it already work right now with extra_pod_config? What's missing? The only thing I can think of is convenient deep dictionary merging, so that config from multiple sources can be merged as conveniently as in helm YAML, but that's quite common in Python, with many implementations we can use; a sketch of the idea follows.
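For concreteness, a minimal sketch of that kind of deep dictionary merge (not code that exists in KubeSpawner, just an illustration of the idea), assuming only dict-into-dict merging is needed:

```python
def deep_merge(base, override):
    """Recursively merge `override` into `base`, returning a new dict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# e.g. chart-level defaults combined with additions from a user's extraConfig:
chart_defaults = {"affinity": {"nodeAffinity": {}}}
user_overrides = {"affinity": {"podAffinity": {}}, "schedulerName": "my-scheduler"}
print(deep_merge(chart_defaults, user_overrides))
# -> affinity ends up containing both nodeAffinity and podAffinity, plus schedulerName
```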
Yes, I share your view about not duplicating documentation. In this PR I'm referencing the Kubernetes API field that each new option corresponds to.
I totally agree. I'm very frustrated about image_pull_secrets, for example, which in KubeSpawner only accepts a string, while in the Kubernetes API it is an array of dicts with a name key. When we in z2jh use a partial jupyterhub_config.py to specify things in extra_pod_config, and a user of the chart then uses the extraConfig in config.yaml to modify c.KubeSpawner.extra_pod_config as well in some way, I figure things get troublesome. Well, as you say, deep dictionary merging solves a lot of issues, but only if you have access to Python or similar; I won't have access to that in the helm charts. I have already run into these challenges while writing the Z2JH Helm chart, working on user-placeholder pods with lower pod priority to be able to overprovision nodes, a practice recommended by the Kubernetes documentation. I want my user-placeholder pods to schedule exactly like the actual users; if they don't, they become pointless. To do this we could use kubespawner as well, I guess, but then we would need to add another hub or a script to manage the spawning etc. instead of controlling the placeholders through a Kubernetes Deployment directly. So, how do I make the user-placeholder pods schedule like the real users? If the real users have their settings nested within extraConfig, which is one big string, it's game over. If I have access to the scheduling configuration more directly, I can use a template to add or not add the affinity:

```
{{- $dummy := include "jupyterhub.prepareScope" $_ }}
{{- if $_.hasTolerations }}
{{- include "jupyterhub.tolerations" $_ | nindent 6 }}
{{- end }}
nodeSelector: {{ toJson .Values.singleuser.nodeSelector }}
{{- if $_.hasAffinity }}
{{- include "jupyterhub.affinity" $_ | nindent 6 }}
{{- end }}
```

The helper function called prepareScope sets up the flags used above, and the affinity helper renders the affinity itself.
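To make the image_pull_secrets mismatch from the start of this comment concrete, here is a small illustrative sketch (the secret name is made up; the KubeSpawner side reflects the string-only behaviour described above, the dict shows the documented shape of the Kubernetes PodSpec field):

```python
# What KubeSpawner's image_pull_secrets accepts today, per the comment above: one string.
kubespawner_value = "my-registry-secret"

# What the Kubernetes PodSpec's imagePullSecrets field expects: a list of objects,
# each a LocalObjectReference with a "name" key.
pod_spec_fragment = {"imagePullSecrets": [{"name": kubespawner_value}]}
print(pod_spec_fragment)
```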
Of all the things I've seen and configured in the K8s API, affinity has been the most troublesome: a lot of nested fields, and some of them are not named like their corresponding object models.
This is an issue that's not specific to this PR, so let's merge this one. But in the future, I think we should try to pursue documenting/using extraPodConfig, and only add new traits that expose options if/when that doesn't work, first trying to figure out a way to make it work before adding more traits.
@minrk I've concluded I was wrong: it was not vital to have kubespawner break down the affinity fields. This PR didn't pan out well :'( Should I take action and make it one big affinity field instead? Hmm... I will keep processing these things.
There are a lot of affinities that currently cannot be set through kubespawner without using extra_pod_config: the required and preferred variants of node affinity, pod affinity, and pod anti-affinity.
Without having these explicitly written out as options, I found it very hard to write concise Helm template code that generates proper affinities for the various pod specs declared in the helm templates while at the same time configuring kubespawner through traitlets, with values hardcoded in a jupyterhub_config.py file alongside a config.yaml file.
All in all, this is required to accomplish the best autoscaling experience in z2jh.
Fixes #200