Introduce HA shoot control planes in alpha state #5741

Merged
@@ -27,7 +27,7 @@ spec:
- podSelector:
matchLabels:
app: etcd-statefulset
garden.sapcloud.io/role: controlplane
gardener.cloud/role: controlplane
- podSelector:
matchLabels:
app: prometheus
1 change: 1 addition & 0 deletions docs/concepts/scheduler.md
@@ -35,6 +35,7 @@ The following **sequence** describes the steps involved to determine a seed cand
* having `.spec.settings.shootDNS.enabled=false` (only if the shoot specifies a DNS domain or does not use the `unmanaged` DNS provider)
* whose taints (`.spec.taints`) are tolerated by the `Shoot` (`.spec.tolerations`)
* whose capacity for shoots would not be exceeded if the shoot is scheduled onto the seed, see [Ensuring seeds capacity for shoots is not exceeded](#ensuring-seeds-capacity-for-shoots-is-not-exceeded)
* which are labelled with `seed.gardener.cloud/multi-zonal` if the feature gate `HAControlPlanes` is turned on and the shoot requests a highly available control plane
1. Apply active [strategy](#strategies) e.g., _Minimal Distance strategy_
1. Choose the least utilized seed, i.e., the one with the fewest shoot control planes. It will be the winner and written to the `.spec.seedName` field of the `Shoot`.
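
The filtering and selection steps above can be sketched roughly as follows. This is a minimal illustration only, not the scheduler's actual code: the `seed` type, the `filterAndPick` function, and the label value `"true"` are assumptions made for the example. Only the label key `seed.gardener.cloud/multi-zonal` is taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// labelSeedMultiZonal mirrors the label key introduced in this PR.
const labelSeedMultiZonal = "seed.gardener.cloud/multi-zonal"

// seed is a hypothetical, simplified stand-in for the Seed resource.
type seed struct {
	Name   string
	Labels map[string]string
	Shoots int // number of shoot control planes currently hosted
}

// filterAndPick drops seeds without the multi-zonal label when the feature
// gate is on and the shoot requests an HA control plane, then returns the
// name of the candidate hosting the fewest shoot control planes.
func filterAndPick(seeds []seed, gateEnabled, shootWantsHA bool) (string, error) {
	var candidates []seed
	for _, s := range seeds {
		if gateEnabled && shootWantsHA {
			if _, ok := s.Labels[labelSeedMultiZonal]; !ok {
				continue
			}
		}
		candidates = append(candidates, s)
	}
	if len(candidates) == 0 {
		return "", fmt.Errorf("no seed candidate left after filtering")
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Shoots < candidates[j].Shoots
	})
	return candidates[0].Name, nil
}

func main() {
	seeds := []seed{
		{Name: "seed-a", Shoots: 2},
		// The label value "true" is an assumption; only the key is checked.
		{Name: "seed-b", Labels: map[string]string{labelSeedMultiZonal: "true"}, Shoots: 5},
	}
	name, err := filterAndPick(seeds, true, true)
	fmt.Println(name, err)
}
```

With the gate off, or for a shoot that does not request an HA control plane, the same call falls back to plain least-utilization and would pick `seed-a`.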

2 changes: 2 additions & 0 deletions docs/deployment/feature_gates.md
@@ -43,6 +43,7 @@ The following tables are a summary of the feature gates that you can set on diff
| DisableDNSProviderManagement | `false` | `Alpha` | `1.41` | |
| ShootCARotation | `false` | `Alpha` | `1.42` | |
| ShootSARotation | `false` | `Alpha` | `1.48` | |
| HAControlPlanes | `false` | `Alpha` | `1.49` | |

## Feature gates for graduated or deprecated features

@@ -140,3 +141,4 @@ A *General Availability* (GA) feature is also referred to as a *stable* feature.
| ShootSARotation | `gardener-apiserver`, `gardenlet` | Enables the feature to trigger automated service account signing key rotation for shoot clusters. |
| ShootMaxTokenExpirationOverwrite | `gardener-apiserver` | Makes the Gardener API server overwrite values in the `.spec.kubernetes.kubeAPIServer.serviceAccountConfig.maxTokenExpiration` field of Shoot specifications to<br>- be at least 720h (30d) when the current value is lower<br>- be at most 2160h (90d) when the current value is higher<br>before persisting the object to etcd. |
| ShootMaxTokenExpirationValidation | `gardener-apiserver` | Enables validations on Gardener API server that enforce that the value of the `.spec.kubernetes.kubeAPIServer.serviceAccountConfig.maxTokenExpiration` field<br>- is at least 720h (30d).<br>- is at most 2160h (90d).<br>Only enable this after `ShootMaxTokenExpirationOverwrite` is enabled and all shoots got updated accordingly. |
| HAControlPlanes | `gardener-scheduler`, `gardenlet` | HAControlPlanes allows shoot control planes to be run in high availability mode. |
2 changes: 2 additions & 0 deletions example/20-componentconfig-gardener-scheduler.yaml
@@ -24,6 +24,8 @@ server:
debugging:
enableProfiling: false
enableContentionProfiling: false
featureGates:
HAControlPlanes: false
#schedulers:
# backupBucket:
# concurrentSyncs: 5 # defaults to 5
1 change: 1 addition & 0 deletions example/20-componentconfig-gardenlet.yaml
@@ -126,6 +126,7 @@ featureGates:
DisableDNSProviderManagement: false
ShootCARotation: true
ShootSARotation: true
HAControlPlanes: false
# seedConfig:
# metadata:
# name: my-seed
21 changes: 21 additions & 0 deletions pkg/apis/core/v1beta1/constants/types_constants.go
@@ -29,6 +29,9 @@ const (
// SecretNameCAETCD is a constant for the name of a Kubernetes secret object that contains the CA
// certificate of the etcd of a shoot cluster.
SecretNameCAETCD = "ca-etcd"
// SecretNameCAETCDPeer is a constant for the name of a Kubernetes secret object that contains the CA
// certificate of the etcd peer network of a shoot cluster.
SecretNameCAETCDPeer = "ca-etcd-peer"
// SecretNameCAFrontProxy is a constant for the name of a Kubernetes secret object that contains the CA
// certificate of the kube-aggregator of a shoot cluster.
SecretNameCAFrontProxy = "ca-front-proxy"
@@ -272,6 +275,24 @@ const (
// Note that this annotation is alpha and can be removed anytime without further notice. Only use it if you know
// what you are doing.
ShootAlphaControlPlaneScaleDownDisabled = "alpha.control-plane.scaling.shoot.gardener.cloud/scale-down-disabled"
// ShootAlphaControlPlaneHighAvailability is a constant for an annotation on the Shoot resource stating that the
// high availability setup for the control plane should be enabled.
// Note that this annotation is alpha and can be removed anytime without further notice. Only use it if you know
// what you are doing.
ShootAlphaControlPlaneHighAvailability = "alpha.control-plane.shoot.gardener.cloud/high-availability"
// ShootAlphaControlPlaneHighAvailabilitySingleZone is a specific value that can be set for the shoot control
// plane high availability annotation, that allows gardener to spread the shoot control plane across
// multiple nodes within a single availability zone if it is possible.
// This enables shoot clusters having a control plane with a higher failure tolerance as well as zero downtime maintenance,
// especially for infrastructure providers that offer fewer than three zones in a region, where a multi-zone setup
// is not possible.
ShootAlphaControlPlaneHighAvailabilitySingleZone = "single-zone"
// ShootAlphaControlPlaneHighAvailabilityMultiZone is a specific value that can be set for the shoot control
// plane high availability annotation, that allows gardener to spread the shoot control plane across
// multiple availability zones if it is possible.
ShootAlphaControlPlaneHighAvailabilityMultiZone = "multi-zone"
// LabelSeedMultiZonal is used to identify whether the seed supports multi-zonal control planes for shoots.
LabelSeedMultiZonal = "seed.gardener.cloud/multi-zonal"
// ShootExpirationTimestamp is an annotation on a Shoot resource whose value represents the time when the Shoot lifetime
// is expired. The lifetime can be extended, but at most by the minimal value of the 'clusterLifetimeDays' property
// of referenced quotas.
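
As a rough illustration of how the new annotation and its two values might be consumed, the sketch below reads the annotation from a plain map. The helper name `haMode` and the choice to reject unknown values are assumptions made for the example; this PR only defines the constants.

```go
package main

import "fmt"

// These values are copied from the constants added in this PR.
const (
	annotationHA = "alpha.control-plane.shoot.gardener.cloud/high-availability"
	haSingleZone = "single-zone"
	haMultiZone  = "multi-zone"
)

// haMode is a hypothetical helper. It returns the requested HA mode, or an
// empty string if the annotation is absent. Rejecting any other value is an
// assumption of this sketch, not behavior confirmed by the PR.
func haMode(annotations map[string]string) (string, error) {
	v, ok := annotations[annotationHA]
	if !ok {
		return "", nil
	}
	switch v {
	case haSingleZone, haMultiZone:
		return v, nil
	default:
		return "", fmt.Errorf("unsupported value %q for annotation %s", v, annotationHA)
	}
}

func main() {
	mode, err := haMode(map[string]string{annotationHA: haMultiZone})
	fmt.Println(mode, err)
}
```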
13 changes: 13 additions & 0 deletions pkg/apis/core/validation/shoot.go
@@ -148,6 +148,7 @@ func ValidateShootObjectMetaUpdate(newMeta, oldMeta metav1.ObjectMeta, fldPath *
allErrs := field.ErrorList{}

allErrs = append(allErrs, validateShootKubeconfigRotation(newMeta, oldMeta, fldPath)...)
allErrs = append(allErrs, validateShootHAControlPlaneUpdate(newMeta, oldMeta, fldPath)...)

return allErrs
}
@@ -1796,6 +1797,18 @@ func validateShootOperation(operation string, shoot *core.Shoot, fldPath *field.
return allErrs
}

func validateShootHAControlPlaneUpdate(newMeta, oldMeta metav1.ObjectMeta, fldPath *field.Path) field.ErrorList {
	allErrs := field.ErrorList{}

	oldVal := oldMeta.Annotations[v1beta1constants.ShootAlphaControlPlaneHighAvailability]
	newVal := newMeta.Annotations[v1beta1constants.ShootAlphaControlPlaneHighAvailability]

	// The etcd cluster cannot be scaled up or down, and there is no automatic re-scheduling to move from single-zone to multi-zone or vice versa.
	allErrs = append(allErrs, apivalidation.ValidateImmutableField(oldVal, newVal, fldPath.Child("annotations").Key(v1beta1constants.ShootAlphaControlPlaneHighAvailability))...)

	return allErrs
}
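
The effect of this check can be sketched without the Kubernetes validation helpers. The following is a simplified, standard-library-only approximation, assuming the immutability check boils down to comparing the old and new annotation values; `validateHAUnchanged` is a hypothetical name, and the error text is illustrative rather than the real message.

```go
package main

import "fmt"

// annotationHA is copied from the constant added in this PR.
const annotationHA = "alpha.control-plane.shoot.gardener.cloud/high-availability"

// validateHAUnchanged is a hypothetical stand-in for the immutability check.
// A missing annotation reads as the empty string, so adding, changing, and
// removing the annotation on an existing shoot are all treated as changes.
func validateHAUnchanged(oldAnnotations, newAnnotations map[string]string) error {
	oldVal, newVal := oldAnnotations[annotationHA], newAnnotations[annotationHA]
	if oldVal != newVal {
		return fmt.Errorf("metadata.annotations[%s]: field is immutable (old %q, new %q)",
			annotationHA, oldVal, newVal)
	}
	return nil
}

func main() {
	oldAnn := map[string]string{annotationHA: "multi-zone"}

	fmt.Println(validateHAUnchanged(oldAnn, map[string]string{annotationHA: "multi-zone"}))
	fmt.Println(validateHAUnchanged(oldAnn, map[string]string{annotationHA: "single-zone"}))
	fmt.Println(validateHAUnchanged(oldAnn, map[string]string{"foo": "bar"}))
}
```

The three calls in `main` correspond to the three test cases added in `shoot_test.go`: unchanged, changed, and unset.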

func isShootReadyForRotationStart(lastOperation *core.LastOperation) bool {
if lastOperation == nil {
return false
51 changes: 51 additions & 0 deletions pkg/apis/core/validation/shoot_test.go
@@ -349,6 +349,57 @@ var _ = Describe("Shoot Validation Tests", func() {
))
})

Context("HAControlPlanes", func() {
	It("should pass as HAControlPlanes option is not changed", func() {
		shoot.Annotations = map[string]string{
			v1beta1constants.ShootAlphaControlPlaneHighAvailability: v1beta1constants.ShootAlphaControlPlaneHighAvailabilityMultiZone,
		}
		newShoot := prepareShootForUpdate(shoot)

		errorList := ValidateShootUpdate(newShoot, shoot)

		Expect(errorList).To(HaveLen(0))
	})

	It("should forbid to change the HAControlPlanes option", func() {
		shoot.Annotations = map[string]string{
			v1beta1constants.ShootAlphaControlPlaneHighAvailability: v1beta1constants.ShootAlphaControlPlaneHighAvailabilityMultiZone,
		}
		newShoot := prepareShootForUpdate(shoot)
		newShoot.Annotations = map[string]string{
			v1beta1constants.ShootAlphaControlPlaneHighAvailability: v1beta1constants.ShootAlphaControlPlaneHighAvailabilitySingleZone,
		}

		errorList := ValidateShootUpdate(newShoot, shoot)

		Expect(errorList).To(ConsistOf(
			PointTo(MatchFields(IgnoreExtras, Fields{
				"Type":  Equal(field.ErrorTypeInvalid),
				"Field": Equal("metadata.annotations[alpha.control-plane.shoot.gardener.cloud/high-availability]"),
			})),
		))
	})

	It("should forbid to unset the HAControlPlanes option", func() {
		shoot.Annotations = map[string]string{
			v1beta1constants.ShootAlphaControlPlaneHighAvailability: v1beta1constants.ShootAlphaControlPlaneHighAvailabilityMultiZone,
		}
		newShoot := prepareShootForUpdate(shoot)
		newShoot.Annotations = map[string]string{
			"foo": "bar",
		}

		errorList := ValidateShootUpdate(newShoot, shoot)

		Expect(errorList).To(ConsistOf(
			PointTo(MatchFields(IgnoreExtras, Fields{
				"Type":  Equal(field.ErrorTypeInvalid),
				"Field": Equal("metadata.annotations[alpha.control-plane.shoot.gardener.cloud/high-availability]"),
			})),
		))
	})
})

Context("exposure class", func() {
It("should pass as exposure class is not changed", func() {
shoot.Spec.ExposureClassName = pointer.String("exposure-class-1")
6 changes: 6 additions & 0 deletions pkg/features/features.go
@@ -172,6 +172,11 @@ const (
// beta: v1.46.0
// GA: v1.48.0
ShootMaxTokenExpirationValidation featuregate.Feature = "ShootMaxTokenExpirationValidation"

// HAControlPlanes allows shoot control planes to be run in high availability mode.
// owner: @shreyas-s-rao @timuthy
// alpha: v1.49.0
HAControlPlanes featuregate.Feature = "HAControlPlanes"
)

var allFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
@@ -196,6 +201,7 @@ var allFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
ShootSARotation: {Default: false, PreRelease: featuregate.Alpha},
ShootMaxTokenExpirationOverwrite: {Default: true, PreRelease: featuregate.GA, LockToDefault: true},
ShootMaxTokenExpirationValidation: {Default: true, PreRelease: featuregate.GA, LockToDefault: true},
HAControlPlanes: {Default: false, PreRelease: featuregate.Alpha},
}

// GetFeatures returns a feature gate map with the respective specifications. Non-existing feature gates are ignored.
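
The lookup described in the comment above could look roughly like the following. This is a sketch under stated assumptions: `featureSpec` is a simplified stand-in for the real `featuregate.FeatureSpec` type, the string-keyed map replaces `featuregate.Feature`, and `getFeatures` is a hypothetical lowercase counterpart whose body is not shown in this diff.

```go
package main

import "fmt"

// featureSpec is a hypothetical, simplified stand-in for featuregate.FeatureSpec.
type featureSpec struct {
	Default    bool
	PreRelease string
}

// allFeatureGates holds a small excerpt of the registry shown in the diff.
var allFeatureGates = map[string]featureSpec{
	"ShootSARotation": {Default: false, PreRelease: "ALPHA"},
	"HAControlPlanes": {Default: false, PreRelease: "ALPHA"},
}

// getFeatures returns the specs for the requested gates and silently skips
// names that are not registered, as the doc comment describes.
func getFeatures(names ...string) map[string]featureSpec {
	out := make(map[string]featureSpec, len(names))
	for _, name := range names {
		if spec, ok := allFeatureGates[name]; ok {
			out[name] = spec
		}
	}
	return out
}

func main() {
	specs := getFeatures("HAControlPlanes", "DoesNotExist")
	fmt.Println(len(specs), specs["HAControlPlanes"].Default)
}
```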
2 changes: 1 addition & 1 deletion pkg/gardenlet/controller/shoot/shoot_control_delete.go
@@ -257,7 +257,7 @@ func (r *shootReconciler) runDeleteShootFlow(ctx context.Context, o *operation.O
})
scaleETCD = g.Add(flow.Task{
Name: "Scaling up etcd main and event",
Fn: flow.TaskFn(botanist.ScaleETCDToOne).RetryUntilTimeout(defaultInterval, defaultTimeout).DoIf(cleanupShootResources),
Fn: flow.TaskFn(botanist.ScaleUpETCD).RetryUntilTimeout(defaultInterval, defaultTimeout).DoIf(cleanupShootResources),
Dependencies: flow.NewTaskIDs(deployETCD),
})
waitUntilEtcdReady = g.Add(flow.Task{
8 changes: 4 additions & 4 deletions pkg/gardenlet/controller/shoot/shoot_control_migrate.go
@@ -207,15 +207,15 @@ func (r *shootReconciler) runPrepareShootForMigrationFlow(ctx context.Context, o
Fn: flow.TaskFn(botanist.DeployEtcd).RetryUntilTimeout(defaultInterval, defaultTimeout).DoIf(cleanupShootResources || etcdSnapshotRequired),
Dependencies: flow.NewTaskIDs(initializeSecretsManagement),
})
scaleETCDToOne = g.Add(flow.Task{
scaleUpETCD = g.Add(flow.Task{
Name: "Scaling etcd up",
Fn: flow.TaskFn(botanist.ScaleETCDToOne).RetryUntilTimeout(defaultInterval, defaultTimeout).DoIf(wakeupRequired),
Fn: flow.TaskFn(botanist.ScaleUpETCD).RetryUntilTimeout(defaultInterval, defaultTimeout).DoIf(wakeupRequired),
Dependencies: flow.NewTaskIDs(deployETCD),
})
waitUntilEtcdReady = g.Add(flow.Task{
Name: "Waiting until main and event etcd report readiness",
Fn: flow.TaskFn(botanist.WaitUntilEtcdsReady).DoIf(cleanupShootResources || etcdSnapshotRequired),
Dependencies: flow.NewTaskIDs(deployETCD, scaleETCDToOne),
Dependencies: flow.NewTaskIDs(deployETCD, scaleUpETCD),
})
// Restore the control plane in case it was already migrated to make sure all components that depend on the cloud provider secret are restarted
// in case it has changed. Also, it's needed for other control plane components like the kube-apiserver or kube-
@@ -233,7 +233,7 @@ func (r *shootReconciler) runPrepareShootForMigrationFlow(ctx context.Context, o
wakeUpKubeAPIServer = g.Add(flow.Task{
Name: "Scaling Kubernetes API Server up and waiting until ready",
Fn: flow.TaskFn(botanist.WakeUpKubeAPIServer).DoIf(wakeupRequired),
Dependencies: flow.NewTaskIDs(deployETCD, scaleETCDToOne, waitUntilControlPlaneReady),
Dependencies: flow.NewTaskIDs(deployETCD, scaleUpETCD, waitUntilControlPlaneReady),
})
ensureResourceManagerScaledUp = g.Add(flow.Task{
Name: "Ensuring that the gardener resource manager is scaled to 1",
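
To make the dependency wiring in this flow easier to follow, here is a heavily simplified sketch of tasks with dependencies and a `DoIf`-style condition. It is not the Gardener `flow` package: the `task` type and `run` function are hypothetical, execution is sequential rather than concurrent, and it assumes that a task skipped by its condition still counts as completed for its dependents.

```go
package main

import "fmt"

// task is a hypothetical, much simplified stand-in for a flow task.
type task struct {
	name string
	fn   func() error
	doIf bool     // when false, fn is skipped but the task counts as done
	deps []string // names of tasks that must complete first
}

// run executes the tasks sequentially in dependency order and returns the
// names of the tasks whose fn was actually called.
func run(tasks []task) ([]string, error) {
	byName := make(map[string]task, len(tasks))
	for _, t := range tasks {
		byName[t.name] = t
	}
	done := map[string]bool{}
	visiting := map[string]bool{}
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		if done[name] {
			return nil
		}
		t, ok := byName[name]
		if !ok {
			return fmt.Errorf("unknown task %q", name)
		}
		if visiting[name] {
			return fmt.Errorf("dependency cycle at %q", name)
		}
		visiting[name] = true
		for _, dep := range t.deps {
			if err := visit(dep); err != nil {
				return err
			}
		}
		if t.doIf {
			if err := t.fn(); err != nil {
				return fmt.Errorf("task %q failed: %w", name, err)
			}
			order = append(order, name)
		}
		done[name] = true
		return nil
	}

	for _, t := range tasks {
		if err := visit(t.name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	wakeupRequired := false
	noop := func() error { return nil }
	order, err := run([]task{
		{name: "waitUntilEtcdReady", fn: noop, doIf: true, deps: []string{"deployETCD", "scaleUpETCD"}},
		{name: "deployETCD", fn: noop, doIf: true},
		{name: "scaleUpETCD", fn: noop, doIf: wakeupRequired, deps: []string{"deployETCD"}},
	})
	fmt.Println(order, err)
}
```

In this sketch, renaming `scaleETCDToOne` to `scaleUpETCD` only changes the task identifier that dependents reference; the ordering constraints stay the same.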
1 change: 1 addition & 0 deletions pkg/gardenlet/features/features.go
@@ -41,5 +41,6 @@ func RegisterFeatureGates() {
features.DisableDNSProviderManagement,
features.ShootCARotation,
features.ShootSARotation,
features.HAControlPlanes,
)))
}