Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

values.schema.json ships with chart and configuration reference now covers all options #2033

Merged
merged 43 commits into from
Feb 17, 2021

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Feb 13, 2021

Summary

Breaking changes

If someone has a configuration that overrides a default value to null or similarly, then they can get validation errors before the k8s api-server has been contacted by the helm CLI.

Implementation notes

Helper scripts

  • tools/compare-values-schema-content.py
    This new script isn't run as part of the CI system, but was helping me to spot if there were values in values.yaml that weren't in
    schema.yaml and vice versa.

  • tools/validate-against-schema.py
    This script was previously located in the chart folder and named validate.py, I've just moved it and made it validate with the lint-
    and-validate-values.yaml file as well. This script now runs in a dedicated job part of the test-chart workflow, before it was undocumented and not part of the CI system.

  • tools/generate-json-schema.py
    This new script reads schema.yaml, cleans it from descriptions that isn't relevant for the helm CLI's logic as far as I know, and emits jupyterhub/values.schema.json. This script runs in all chart tests and before we package and publish the Helm chart.

Accept null or not?

I have considered the type enforcement of all values that require a string or integer and considered if they should allow null or not. The principle I concluded was that if the field is a Helm chart native configuration is should typically not allow null. But, if it is a passthrough configuration that isn't required by the Helm chart templates, then we should allow it be null in order to be able to use the default values of the actual software such as JupyterHub or KubeSpawner. For such values, null implies the default value used in the software rather than the default value of the Helm chart typically.

If we would support null for all values, we would risk introducing unneeded runtime errors helm template rendering errors that are far harder to interpret. Due to this, I have opted to avoid allowing null.

Sometimes the chart has given a blank string different meaning to null, such as for fullnameOverride. At this times null is also allowed.

Notes on helm validation logic

I don't think it is possible to be very strict and warn for passed configuration not part of the schema which would be useful if a user has misspelled an option. What we can do, is to declare required fields (required), and enforce the fields passed have accepted data types (type).

In other words, this is not an end to the mistakes you can make, but it is a good improvement. While there may be some fields that are required, I have only used that keyword to require the root objects such as hub and proxy to be set.

@consideRatio consideRatio marked this pull request as draft February 13, 2021 05:30
@consideRatio consideRatio force-pushed the pr/100-pct-schema.yaml branch from af9f7d7 to a3d6140 Compare February 13, 2021 05:31
@consideRatio consideRatio changed the title Increasing coverage of schema.yaml / configuration reference values.schema.json ships with chart and configuration reference now covers all options Feb 15, 2021
@consideRatio consideRatio marked this pull request as ready for review February 15, 2021 03:36
Copy link
Collaborator

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I AM ABSOLUTELY AMAZED BY THE AMOUNT OF WORK AND CARE YOU PUT INTO THIS. WOW

import yaml

# Change current directory to this directory
os.chdir(os.path.dirname(sys.argv[0]))
Copy link
Collaborator

@yuvipanda yuvipanda Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
os.chdir(os.path.dirname(sys.argv[0]))
os.chdir(os.path.abspath(os.path.dirname(__file__)))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see comment below in validate-against-schema - I don't think os.chdir in scripts is usually something we want to do.

Copy link
Member Author

@consideRatio consideRatio Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like the script to function the same way independently of where they are executed from.

I guess the crux of using os.chdir is that it can influence the environment calling the script? Running it from my terminal didn't influence the terminal, but it could perhaps?

I'm not sure on the path forward here, suggestions?


UPDATE: Will resolve this using https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2033/files#r576103689 logic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved by cb0ae7d

url:
password:
subPath: ""
storageClassName: ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's set to empty rather than unset, will the resulting field be present? I think many fields in Kubernetes have adifferent effect when it's set to empty string - particularly, the defaults might not apply.

For example, an empty storageClassName has different behavior from having it be not present (see doc).

So my instinct is to allow anything that can be a non-container type (so strings, numbers, bools) be nullable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I think null for unset is often the right thing to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes storageClassName was one of those fields where we differentiate between unset and empty string!

  {{- if typeIs "string" .Values.hub.db.pvc.storageClassName }}
  storageClassName: {{ .Values.hub.db.pvc.storageClassName | quote }}
  {{- end }}

I agree. Resolved by 1153aa3.

tools/compare-values-schema-content.py Show resolved Hide resolved
@minrk
Copy link
Member

minrk commented Feb 15, 2021

This is awesome! ❤️

Accept null or not?

I think we need to be careful with this a bit. Setting to null and leaving unset are not always equivalent, but often are, and that seems to be the norm for helm/kubernetes, but not for traitlets. If they are interpreted by the helm chart with things like with .Values.something allowing null is usually fine as the result is to leave it unset. With passthrough config to traitlets, though, null and unset are not generally the same, and the schema should generally distinguish. Similarly, with kubernetes, setting many fields to null is the same as leaving it unset (imagePullPolicy is one, I believe, where we should allow null to select the cluster default behavior).

The exception is the subset of config that is not direct passthrough, and instead is specifically transformed by the chart from values.yaml to traitlets config (e.g. singleuser.cmd). For those, I agree that allowing explicit null to mean using the default is the right thing to do. This is what we've implemented with the set_config_if_not_none utility in juptyerhub_config.py. That utility function is required to implement this different interpretation - helm values null means apply no config, while None in traitlets config means explicit override default with None.

jupyterhub/schema.yaml Show resolved Hide resolved
.github/workflows/test-chart.yaml Show resolved Hide resolved
tools/validate-against-schema.py Outdated Show resolved Hide resolved
import yaml

# Change current directory to this directory
os.chdir(os.path.dirname(sys.argv[0]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see comment below in validate-against-schema - I don't think os.chdir in scripts is usually something we want to do.

url:
password:
subPath: ""
storageClassName: ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I think null for unset is often the right thing to do.

@consideRatio
Copy link
Member Author

@minrk regarding comments on Accept null or not? I think we are in agreement and have arrived at the same conclusions, but I find the topic quite complicated in general.

Is it correct that you don't suggest changes other than those you have already explicitly pointed out?

@consideRatio consideRatio requested a review from minrk February 15, 2021 11:45
@@ -35,11 +35,11 @@ hub:
annotations: {}
ports:
nodePort:
loadBalancerIP:
loadBalancerIP: ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also be unset, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand why some vars are empty string and some are unset. My intuition is that default should be either a value we want (like uid: 1000), unset, or empty container ([], {}, etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The principle ive had while writing these evolved to be:

1: if the helm chart provide an explicit default value: set it.
2: else if the helm chart config is a passthrough config for
jupyterhub/kubespawner/authenticator etc: we set it to null to represent the intent for the helm chart to not change the default value of the underlying config, and a falsy value to represent a wish for it to be explicitly set to falsy value.
3: else if, null snd a falsy instance type both imply a unset value, set it to the falsy instance type
4: else set it to null

To me, the most controversial point is the third. The question in my mind boils down to: if both null and a falsy type ("", {}, [], 0) by the helm templates imply that a value wont be set, should we specify the values falsy type of null?

I lean towards specifying the falsy instance type rather than null if the config isn't passthrough.

I think we break this logic on point 2 for array/objects passthrough config, which i think is motivated by avoiding warnings and help helm templste logic.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with loadBalancerIP, I see the following code

  {{- with .Values.proxy.service.loadBalancerIP }}
  loadBalancerIP: {{ . }}
  {{- end }}

So if I understand correctly, the loadBalancerIP key in the Service object won't be present at all in the following circumstances:

  1. .Values.proxy.service.loadBalancerIP is specified in helm chart values but not set (current master situation)
  2. .Values.proxy.service.loadBalancerIP is set to "" (as in the PR)

Is this understanding correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yepp! The with keyword using a falsy value is like a do nothing statement, it has become a practice to use it throughout the helm ecosystem over if ... Then ... - probably to avoid duplicating the reference needed in both the conditional statement and within

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. So if I see the values.yaml file, it looks like I'll actually get loadBalancerIP: "" in my Service object, which is confusing! Since k8s differentiates between unset values and "" values, so should we - otherwise it can be pretty confusing. I think the mental model of if a value is unspecified (null), it is not present in the target is better than if a value is falsey (by go template's definition of falsey) it will not be present in the target. The latter also has exceptions, like storageClassName, making it even more confusing.

So ideally, I'd like to hold the mental model of 'if it is null in my values.yaml, it will not be present in the output Service object'.

Copy link
Member Author

@consideRatio consideRatio Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuvipanda I can make it so for strings/integers without problem I think, but if we apply the same principle to {} and [] I believe there will be trouble such as helm emitting warnings, broken assumptions by helm template logic, and potentially a schema that fail to capture otherwise easy to capture issues.

I suggest we apply the principle that you suggest anyhow to our scalar strings and numbers specifically, but don't try to apply it on falsy values for object {} and array [], does that sound okay to you?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's perfect! I totally agree, @consideRatio :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for quickly iterating with me on this @yuvipanda!! I love it when I arrive at a concrete action point!

Copy link
Member Author

@consideRatio consideRatio Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a quite hard rule principle to follow without making exceptions. I was able to allow many strings to be null so our default values are now null rather than "" by default, but plenty of booleans remain required to be "boolean" rather than "boolean or null" still.

Example of complexity arisen:
If helm templates have logic making an eq comparison between a variable and a string, then it will error if it holds a null value. And, since helm template logic will evaluate Y in statements like like if X and Y, one is forced to do if X then if Y then ... or if X and (Y | default "") etc.

Anyhow, things work okay now, don't want to change back, I don't see a perfect tradeoff with regards to these considerations.


Resolved by 2c1cc57 (that also includes some misc changes deemed relevant)

@consideRatio
Copy link
Member Author

I think I've addressed all review points again and that this PR is potentially ready for a merge.

@yuvipanda
Copy link
Collaborator

@consideRatio happy to hit merge after the merge conflict is resolved.

@consideRatio consideRatio force-pushed the pr/100-pct-schema.yaml branch from 35fd5db to 6c2a38a Compare February 17, 2021 09:03
@consideRatio
Copy link
Member Author

@yuvipanda 🎉 rebased to resolve simple merge conflict

@yuvipanda yuvipanda merged commit 6778795 into jupyterhub:master Feb 17, 2021
@yuvipanda
Copy link
Collaborator

\o/ THANK YOU, @consideRatio!

@consideRatio
Copy link
Member Author

Wieeee ❤️ 🎉 thank you @yuvipanda!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

schema.yaml: not all configuration options are documented Helm 3: Values.yaml schema validations
3 participants