Only support default partition in server cluster #721
In almost all cases, users want to set bootstrapExpect to the number of server replicas. This change defaults it to null in values.yaml; if it is left as null, the template sets the -bootstrap-expect flag to the number of server replicas. This is backwards compatible: if users have been setting bootstrapExpect, their value continues to be used. The template also errors out if bootstrapExpect is less than the number of replicas, because this is definitely a misconfiguration: the servers won't wait until the proper number have started before electing a leader.
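The defaulting and validation rule described above can be sketched in shell as follows. This is only a model of the decision, not the chart's template code: the function name `resolve_bootstrap_expect` is hypothetical, and an empty string stands in for a null bootstrapExpect.

```shell
#!/bin/sh
# Hypothetical helper modelling the described rule; not part of the chart.
# $1 = bootstrapExpect as set by the user ("" stands in for null)
# $2 = number of server replicas
resolve_bootstrap_expect() {
  expect="$1"
  replicas="$2"
  if [ -z "$expect" ]; then
    # Left unset: fall back to the replica count.
    echo "$replicas"
    return 0
  fi
  if [ "$expect" -lt "$replicas" ]; then
    # Fewer expected servers than replicas is treated as a misconfiguration.
    echo "bootstrapExpect ($expect) is less than server replicas ($replicas)" >&2
    return 1
  fi
  # An explicit, valid value is passed through unchanged.
  echo "$expect"
}

resolve_bootstrap_expect "" 3                      # prints 3
resolve_bootstrap_expect 5 3                       # prints 5
resolve_bootstrap_expect 1 3 || echo "rejected"    # error on stderr, then "rejected"
```

Values above the replica count are passed through here because the description only calls out the "less than" case as an error.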
cd `chart_dir`
assert_empty helm template \
    -s templates/partition-init-role.yaml \
    --set 'global.adminPartitions.enabled=true' \
I think either the description or the test is wrong here about whether adminPartitions.enabled is true or false!
Good catch!
    --set 'global.adminPartitions.name=test' \
    .
[ "$status" -eq 1 ]
I like this test!
Great work.
I left a couple of comments to address before you merge, but otherwise it's good to go!
On a meta note: I wish there was a better way to make the $serverEnabled := expression easier to read, thx Helm!
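For readers untangling the $serverEnabled expression in this PR's diff, here is a rough shell model of what it computes, followed by the partition-name check that uses it. The function names are hypothetical, and the strings "true" and "false" stand in for Helm's truthy and falsy values.

```shell
#!/bin/sh
# Hypothetical model of the tri-state "enabled" pattern; names are illustrative.
# $1 = component value: "true", "false", or "-" (meaning: defer to global.enabled)
# $2 = global.enabled: "true" or "false"
component_enabled() {
  if [ "$1" = "-" ]; then
    # "-" means the component inherits the global setting.
    [ "$2" = "true" ]
  else
    # Any explicit value overrides the global setting.
    [ "$1" = "true" ]
  fi
}

# $1 = global.adminPartitions.enabled, $2 = server enabled, $3 = partition name
check_partition_name() {
  if [ "$1" = "true" ] && [ "$2" = "true" ] && [ "$3" != "default" ]; then
    echo 'global.adminPartitions.name has to be "default" in the server cluster' >&2
    return 1
  fi
  return 0
}

# server.enabled is "-" and global.enabled is true, so servers are enabled.
if component_enabled "-" true; then server=true; else server=false; fi
check_partition_name true "$server" test || echo "template would fail"
```

The check only fires when partitions are enabled and servers run in this cluster, which is why a non-default name remains acceptable in clusters without servers.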
The docs should be updated for the values to state that:
- Only "default" is allowed for the server cluster.
- Can't change the partition name.
- Only supported on initial installation; can't enable partitions after initial installation.
@@ -1,5 +1,7 @@
{{- if (or (and (ne (.Values.client.enabled | toString) "-") .Values.client.enabled) (and (eq (.Values.client.enabled | toString) "-") .Values.global.enabled)) }}
{{- if (and (and .Values.global.tls.enabled .Values.global.tls.httpsOnly) (and .Values.global.metrics.enabled .Values.global.metrics.enableAgentMetrics))}}{{ fail "global.metrics.enableAgentMetrics cannot be enabled if TLS (HTTPS only) is enabled" }}{{ end -}}
{{- $serverEnabled := (or (and (ne (.Values.server.enabled | toString) "-") .Values.server.enabled) (and (eq (.Values.server.enabled | toString) "-") .Values.global.enabled)) -}}
{{- if (and .Values.global.adminPartitions.enabled $serverEnabled (ne (.Values.global.adminPartitions.name | toString) "default"))}}{{ fail "global.adminPartitions.name has to be \"default\" in the server cluster" }}{{ end -}}
- {{- if (and .Values.global.adminPartitions.enabled $serverEnabled (ne (.Values.global.adminPartitions.name | toString) "default"))}}{{ fail "global.adminPartitions.name has to be \"default\" in the server cluster" }}{{ end -}}
+ {{- if (and .Values.global.adminPartitions.enabled $serverEnabled (ne .Values.global.adminPartitions.name "default"))}}{{ fail "global.adminPartitions.name has to be \"default\" in the server cluster" }}{{ end -}}
Does it need toString?
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "consul.fullname" . }}-partition-configmap
Can we drop the configmap suffix, since we don't need the type? It is implied by the kind of the resource.
charts/consul/values.yaml
Outdated
# The name of the Admin Partition. Must be "default" in the server cluster ie the Kubernetes cluster that
# the Consul server pods are deployed onto.
name: "default"
Should note that:
- Can be something other than default in "non-server" clusters.
- Can't be changed after initial installation. If users wish to change it, they need to uninstall Consul.
- Changing it does not stop the partition-init job from running or the agents from attempting to join the new partition, but the helm upgrade stays in a failed state until an upgrade is run with the partition name reverted.
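The "can't be changed after initial installation" rule can be pictured with a small shell sketch. It assumes the name used at install time is recorded somewhere an upgrade can read it back; the partition ConfigMap added in this PR may serve that purpose, but that is an assumption, and the function name `check_partition_unchanged` is hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the upgrade guard; not the chart's actual template logic.
# $1 = partition name recorded at install time ("" = nothing recorded, fresh install)
# $2 = partition name requested by the current install or upgrade
check_partition_unchanged() {
  if [ -n "$1" ] && [ "$1" != "$2" ]; then
    echo "global.adminPartitions.name cannot be changed from \"$1\" to \"$2\"; uninstall Consul to change it" >&2
    return 1
  fi
  return 0
}

check_partition_unchanged "" default && echo "fresh install: ok"
check_partition_unchanged default default && echo "upgrade, same name: ok"
check_partition_unchanged default test || echo "upgrade, new name: fails"
```

In this model a failed check blocks the upgrade until the original name is supplied again, which mirrors the "failed state until reverted" behaviour described in the comment above.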
* Fail if agents in server cluster are created in a partition not "default"
* Fail helm upgrades if partition name is updated.
  - This does not stop the partition-init job from running or the agents from attempting to join the new partition, but the helm upgrade stays in a failed state until an upgrade is run with the partition name reverted.
Changes proposed in this PR:
Cause helm upgrade to fail if the admin partition name is updated between the install and the upgrade.
Questions:
Even though the helm upgrade fails when the partition name is changed, the partition-init job runs and the agents get reassigned. I could make the partition-init job a pre-install job and not a pre-upgrade job, but the drawback is that anyone trying to upgrade an existing non-partitioned cluster to a partitioned cluster could not do so. Should we ignore this use case? Ignoring the upgrade use case will ensure that the partition-init job does not run on upgrade, so no new partition is created.
FOLLOW UP:
After conversations with the team, it was decided that in the near term we will not support updating to a partition that is non-default. This PR removes the pre-upgrade hook from the partition-init job. This ensures we don't create a partition when a user errantly updates the partition name and performs a helm upgrade.
How I've tested this PR:
Manual testing.
Bats tests
How I expect reviewers to test this PR:
Code review
Would love to hear thoughts on the above questions.
Checklist: