[Part 7 of #758] Added `singleuser.tolerations`, for node taints #925

consideRatio · 2018-09-11T17:57:18Z

To review

Added singleuser.tolerations, for node taints

consideRatio · 2018-09-11T19:35:40Z

jupyterhub/templates/hub/deployment.yaml

+      {{- $dummy := include "jupyterhub.prepareScope" $_ }}
+      {{- if $_.hasTolerations }}
+      {{- include "jupyterhub.tolerations" $_ | nindent 6 }}
+      {{- end }}


prepareScope prepopulates $_ with various stuff, and then the jupyterhub.tolerations template utilizes that. The $_ will be reused by a future jupyterhub.affinity template so it makes sense to prepare the scope once instead of twice.

jupyterhub/templates/scheduling/_scheduling-helpers.tpl

minrk · 2018-09-12T08:02:37Z

I think we need to be careful with these helpers. I am finding that they are making it harder to understand and work on the chart. I genuinely can't tell what tolerations are being added to the 'core' pods in this PR, which have all this extra labelling and prepareScope logic. What are the resulting tolerations on each pod?

In particular, I'm interested in what this patch would look like if we:

didn't have the prepareScope or tolerations helpers
had no pod-kind label

I think we should try to minimize new helpers in the future, only adding them where they really are solving a problem. If we had a simple helper that was a macro for "tolerate-user-node-taint", with no other logic dependent on the parent scope, I think this might be simpler to write and understand.

consideRatio · 2018-09-12T08:17:31Z

Yes I certainly agree, it is a mess to grasp this code. The big crux is to have something DRY used by both the kubespawner configured in jupyterhub_config.py as well as resources created by helm templates.

The key solution to avoiding complexity would be to have the kubespawner not be configured by jupyterhub_config.py, but by a configmap or similar directly which it observed.

This PR could be simpler by not having the preparescope etc, but the future PR with affinities and presets for configuring affinities.

In order for configurable tolerations to be useful at the same time as supporting for example the image pullers, they must be configured with the same tolerations also, and hence we have the helm templates + jupyterhub_config.py configuration from the same source challenge. Also, by having presets the challenge becomes a bit harder.

consideRatio · 2018-09-12T08:19:54Z

I think we should try to minimize new helpers in the future, only adding them where they really are solving a problem. If we had a simple helper that was a macro for "tolerate-user-node-taint", with no other logic dependent on the parent scope, I think this might be simpler to write and understand.

👍

minrk · 2018-09-12T08:51:11Z

jupyterhub/templates/scheduling/_scheduling-helpers.tpl

+  jupyterhub.tolerations
+    Refactors the setting of the user pods tolerations field.
+*/}}
+{{- define "jupyterhub.tolerations" -}}


I think this helper is at least one too far. The define/end is more than the code it's deduplicating by being reused.

If you would add

tolerations:

without any entries within it, you get invalid YAML. To know if you should add the tolerations: entry or not, you must first figure out if you have content within it.

tolerationsList is the content for the tolerations. It would have been a better name with tolerationsContent.

A lot of the struggles I've faced relates to "should i add this or not, do i have content for the field or not?", the prepareScope helper takes care of a lot of those questions.

I have tried to make do with less helpers and It is really hard... =/ I could try more and more, but I'm currently out of ideas on how to solve the challenges. I'll try address things better in a single point to avoid spreading discussion.

minrk · 2018-09-12T08:55:02Z

jupyterhub/templates/scheduling/user-scheduler/deployment.yaml

@@ -23,6 +23,11 @@ spec:
      {{- if .Values.rbac.enabled }}
      serviceAccountName: user-scheduler
      {{- end }}
+      {{- $_ := merge (dict "podKind" "core") . }}


These lines {{- $_ := merge (dict "podKind" "core") . }} are I think the hardest to follow, and make contributing to/understanding the chart very difficult. I genuinely don't know what this means or how to use it, so I can't really evaluate whether this is the right thing to do and if/when we might need to do the same or similar thing in other templates.

There's now a lot of hidden logic that is several layers deep.

If, instead, this were a single:

{{- include "jupyterhub.tolerations" "core" | nindent 6 }}

I feel like that would be simpler and easier to grok as "include the tolerations for a core pod". Is that achievable?

Perhaps we can reduce the complexity slightly, I dare not prove a negative, but I have found it very challenging. The merge thing could be simplified by using one or more set statements at least.

The reason for the merge, is that you can only pass one object to a helper, and the helper needs a lof of stuff that exists within the default scope referred to as . --- By augmenting the default scope, we would change a state globally, which also isnt great. What I have done here is to create a new variable called $_ that creates a new scope by creating a new dictionary, and merging the current scope to it.

https://gitlab.com/charts/gitlab/blob/master/doc/development/README.md#passing-variables-between-control-structures

{{- include "jupyterhub.tolerations" "core" | nindent 6 }}

This helper would not be aware of what .Release.Name was if it isnt passed the . scope or something that got the . scope merged with it. If in the scope of the tolerations helper one would reference . it would get "core" only. This is sometimes enough for certain helpers, but tend not to be for a lot of helpers.

minrk · 2018-09-12T08:56:21Z

jupyterhub/templates/scheduling/_scheduling-helpers.tpl

+{{- /* Fetch relevant information */}}
+{{- $dummy := set . "tolerationsList" (include "jupyterhub.tolerationsList" .) }}
+{{- $dummy := set . "tolerations" (include "jupyterhub.tolerations" .) }}
+{{- $dummy := set . "hasTolerations" .tolerations }}


Let's try to do this without prepareScope. I think helpers that setup the state for other helpers, rather than being self-contained is something we should try to avoid.

Will attempt to address concerns at a single discussion entry, not in a review comment.

consideRatio · 2018-09-14T10:09:46Z

The pod-kind labels are not needed for any logic. I got confused.
@minrk, delete or retain?

I think I know your answer, I'll delete them. :D

consideRatio · 2018-09-14T10:49:50Z

jupyterhub/templates/hub/configmap.yaml

  singleuser.tolerations: |
-    {{- $_.tolerationsList | nindent 4 }}
-  {{- end }}
+    {{- include "jupyterhub.userTolerations" | nindent 4 }}


We can assume userTolerations has content, we cant with the affinity later though...

consideRatio · 2018-09-14T10:50:33Z

jupyterhub/templates/hub/deployment.yaml

-      {{- $dummy := include "jupyterhub.prepareScope" $_ }}
-      {{- if $_.hasTolerations }}
-      {{- include "jupyterhub.tolerations" $_ | nindent 6 }}
-      {{- end }}


There was a hatch for tolerations on core nodes, but it was not adding anything by default anyhow.

consideRatio · 2018-09-14T10:50:52Z

jupyterhub/templates/image-puller/_daemonset-helper.yaml

-      {{- include "jupyterhub.tolerations" $_ | nindent 6 }}
-      {{- end }}
+      tolerations:
+        {{- include "jupyterhub.userTolerations" . | nindent 8 }}


The image puller also needs the same tolerations

Extra storage labels are now configurable through config.yaml and `singleuser.storageExtraLabels`.

<core|user> pods now get `tolerations` for the node taint `hub.jupyter.org/dedicated=<user|core>:NoSchedule` that could optionally be added to nodes or all nodes in a node group (aka. node pool). Note that due to a bug with using the `gcloud` CLI, we also add the toleration for the same taint where `/` is replaced with `_`. In this commit, `singleuser.extraTolerations` are now also made configurable allowing you to add your own custom tolerations to the user pods.

consideRatio force-pushed the 758-7-tolerations branch from 5ea6df9 to 7d64171 Compare September 11, 2018 18:56

consideRatio commented Sep 11, 2018

View reviewed changes

jupyterhub/templates/scheduling/_scheduling-helpers.tpl Show resolved Hide resolved

consideRatio commented Sep 11, 2018

View reviewed changes

jupyterhub/templates/scheduling/_scheduling-helpers.tpl Outdated Show resolved Hide resolved

consideRatio mentioned this pull request Sep 11, 2018

Already developed - Extracting smaller PRs from #758 #841

Closed

20 tasks

minrk reviewed Sep 12, 2018

View reviewed changes

consideRatio force-pushed the 758-7-tolerations branch from 7d64171 to cbe8418 Compare September 14, 2018 10:47

consideRatio commented Sep 14, 2018

View reviewed changes

consideRatio force-pushed the 758-7-tolerations branch 3 times, most recently from 9b9f6dd to b263568 Compare September 15, 2018 22:57

minrk approved these changes Sep 17, 2018

View reviewed changes

Storage labels: configurable extras

a95873b

Extra storage labels are now configurable through config.yaml and `singleuser.storageExtraLabels`.

consideRatio force-pushed the 758-7-tolerations branch from b263568 to 09d8263 Compare September 19, 2018 12:08

consideRatio force-pushed the 758-7-tolerations branch from 09d8263 to f243103 Compare September 19, 2018 16:21

minrk merged commit 89ee137 into jupyterhub:master Sep 20, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Part 7 of #758] Added `singleuser.tolerations`, for node taints #925

[Part 7 of #758] Added `singleuser.tolerations`, for node taints #925

consideRatio commented Sep 11, 2018 •

edited

Loading

consideRatio Sep 11, 2018

minrk commented Sep 12, 2018

consideRatio commented Sep 12, 2018

consideRatio commented Sep 12, 2018

minrk Sep 12, 2018

consideRatio Sep 12, 2018

minrk Sep 12, 2018

consideRatio Sep 12, 2018

minrk Sep 12, 2018

consideRatio Sep 12, 2018

consideRatio commented Sep 14, 2018 •

edited

Loading

consideRatio Sep 14, 2018

consideRatio Sep 14, 2018

consideRatio Sep 14, 2018

[Part 7 of #758] Added singleuser.tolerations, for node taints #925

[Part 7 of #758] Added singleuser.tolerations, for node taints #925

Conversation

consideRatio commented Sep 11, 2018 • edited Loading

To review

Choose a reason for hiding this comment

minrk commented Sep 12, 2018

consideRatio commented Sep 12, 2018

consideRatio commented Sep 12, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

consideRatio commented Sep 14, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

[Part 7 of #758] Added `singleuser.tolerations`, for node taints #925

[Part 7 of #758] Added `singleuser.tolerations`, for node taints #925

consideRatio commented Sep 11, 2018 •

edited

Loading

consideRatio commented Sep 14, 2018 •

edited

Loading