-
Notifications
You must be signed in to change notification settings - Fork 730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor nodes orchestration to rely on StatefulSets #1463
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This facilitates a move to StatefulSets where the mounted secrets must be the same between all the Pods in the same StatefulSet
Store transport keys and certificates in a single shared secret.
…1205) Build a set of StatefulSets matching the ES CRD NodeSpec, and reconcile them instead of reconciling pods. This is a very first iteration. Bootstrapping an ES cluster seems to work fine. Upgrading this cluster (and everything related to zen1/zen2) does not. Main bits of this change: - Require the name field in the NodeSpec, to derive the StatefulSet name - Build a StatefulSet per NodeSpec, along with a headless service - Label the StatefulSet with a hash of its template, for comparison purpose - Create a single Config secret per StatefulSet (instead of per pod) - Label the pod template with a hash of the config (before zen2 initial master nodes), to rotate pods on config changes - Reconcile StatefulSet, service and config in the default driver, commenting out most of the existing pod reconciliation logic The code is left in a rather dirty state: I had to make a few changes deep down in functions that build pod spec and config. These changes bubble up pretty high, to changes calculations and the default driver. Most of the default driver code and the mutation package is probably doomed, I chose to keep it there commented instead of deleting it for now, since it I think it may help in the short-term refactoring ("where does that logic live in the new code? here? OK I can remove."). Let me know if it still deserves more cleanup. To be refactored and improved in follow-up commits.
Add support for scaling a StatefulSet up (just update sset replicas) or down (migrate data away then remove nodes when ready). Which also adds support for renaming a StatefulSet, by creating the new sset, then slowly remove the existing one with data migration. It ignores any zen1/zen2 consideration for now (works fine with zen2 in most cases). It ignores any change budget consideration for now. Unit tests and E2E tests are missing, but a refactoring is to be expected to handle zen2/zen2 and the changeBudget. I'd prefer waiting for this refactoring to happen before dealing with large unit testing. Consider this as still work in progress. The PR also modifies the way pods and ssets are watched by the Elasticsearch controller.
* Support for APM server configuration (#1181) * Add a config section to the APM server configuration * APM: Add support for keystore * Factorize ElasticsearchAuthSettings * Update dev setup doc + fix GKE bootstrap script (#1203) * Update dev setup doc + fix GKE bootstrap script * Update wording of container registry authentication * Ensure disks removal after removing cluster in GKE (#1163) * Update gke-cluster.sh * Implement cleanup for unused disks in GCP * Update Makefile * Update CI jobs to do proper cleanup * Normalize the raw config when creating canonical configs (#1208) This aims at counteracting the difference between JSON centric serialization and the use of YAML as the serialization format in canonical config. If not normalizing numeric values like 1 will differ when comparing configs as JSON deserializes integer numbers to float64 and YAML to uint64. * Homogenize logs (#1168) * Don't run tests if only docs are changed (#1216) * Update Jenkinsfile * Simplify notOnlyDocs() * Update Jenkinsfile * Push snapshot ECK release on successful PR build (#1184) * Update makefile's to support snapshots * Add snapshot releases to Jenkins pipelines * Cleanup * Rename RELEASE to USE_ELASTIC_DOCKER_REGISTRY * Update Jenkinsfile * Add a note on EKS inbound traffic & validating webhook (#1211) EKS users must explicitly enable communication from the k8s control plane and nodes port 443 in order for the control plane to reach the validating webhook. Should help with #896. * Update PodSpec with Hostname from PVC when re-using (#1204) * Bind the Debug HTTP server to localhost by default (#1220) * Run e2e tests against custom Docker image (#1135) * Add implementation * Update makefile's * Update Makefile * Rename Jenkisnfile * Fix review comments * Update e2e-custom.yml * Update e2e-custom.yml * Return deploy-all-in-one to normal * Delete GKE cluster only if changes not in docs (#1223) * Add operator version to resources (#1224) * Warn if unsupported distribution (#1228) The operator only works with the official ES distributions to enable the security available with the basic (free), gold and platinum licenses in order to ensure that all clusters launched are secured by default. A check is done in the prepare-fs script by looking at the existence of the Elastic License. If not present, the script exit with a custom exit code. Then the ES reconcilation loop sends an event of type warning if it detects that a prepare-fs init container terminated with this exit code. * Document Elasticsearch update strategy change budget & groups (#1210) Add documentation for the `updateStrategy` section of the Elasticsearch spec. It documents how (and why) `changeBudget` and `groups` are used by ECK, and how both settings can be specified by the user.
Handle cluster rolling upgrades with StatefulSets. Relies on manually updating the rollingUpdate.Partition index of a sset when we're ready to restart the ES node. For now, it only handles 1 node restart at a time, mostly because we rely on cluster health green checks. It also introduces 2 additional concepts: StatefulSet updates expectations, based on UID + Generation: there are certain ES calls we don't want to make if our StatefulSet cache is out of date "lazy" ES client calls: some ES calls we do to handle the upgrade cannot rely on out-of-date information from observers. We need fresh ES information on the state of the cluster before eg. re-enabling shard allocations. However we don't want to perform these calls at every single reconciliation. I see this as a "stop-gap" implementation before we go on with a larger refactoring of sync/async ES calls. This is still WIP since many things are still missing: changeBudget considerations. zen1/zen2 handling. Unit tests, everywhere. I still think a large portion of this code may be refactored in the short-term to account for change budget & zen1/zen2. Keeping unit tests for a later PR. E2E tests, kept for a later PR.
With many conflicts fixed.
Merge master - with conflicts fixed
Merge master into statefulset-refactoring
* Use the setvmmaxmapcount initcontainer by default in E2E tests (#1300) Let's keep our default defaults :) The setting is disabled explicitly for E2E tests where we enable a restricted security context. * Add docs for plugins, custom configuration files and secure settings (#1298) * Allow license secret webhook to fail (#1301) Webhooks on core k8s objects are just too debilitating in case our webhook service fails. This sets the failure policy for the secret webhook to ignore to strike a balance between UX (immediate feedback) and keeping the users k8s cluster in a working state. Also we have an additional validation run on controller level so this does not allow circumventing our validation logic. * Revert "Use the setvmmaxmapcount initcontainer by default in E2E tests (#1300)" (#1302) This reverts commit fff1526. This commit is breaking our E2E tests chain, which deploy a PodSecurityPolicy by default. Any privileged init container will not work. I'll open an issue for a longer-term fix to properly handle this. * Update quickstart (#1307) * Update the name of the secret for the elastic user * Bump the Elastic Stack version from 7.1.0 to 7.2.0 * Change Kibana readiness endpoint to return a 200 OK (#1309) The previous endpoint returned an http code 302. While this is fine for Kubernetes, some derived systems like GCP LoadBalancers mimic the container readiness check for their own readiness check. Except GCP Loadbalancers only work with status 200. It's not up to us to adapt GCP LoadBalancers to K8s, but this is a fairly trivial fix. * Fix pod_forwarder to support two part DNS names, adjust e2e http_client (#1297) * Fix pod_forwarder to support two part DNS names, adjust e2e http_client url * Revert removing .svc in e2e http_client * [DOC] Resources management and volume claim template (#1252) * Add resources and persistent volume templates documentation * Ignore resources reconciled by older controllers (#1286) * Document PodDisruptionBudget section of the ES spec (#1306) * Document PodDisruptionBudget section of the ES spec I suspect this might slightly change in the feature depending on how we handle the readiness check, so I'm keeping this doc minimal for now: * what is a PDB, briefly (with a link) * default PDB we apply * how to set a different PDB * how to disable the default PDB * Move version out from Makefile (#1312) * Add release note generation tool (#1314) * no external dependencies * inspects PRs by version label * generates structured release notes in asciidoc grouped by type label * Add console output to standalone apm sample (#1321) * Update Quickstart to 0.9.0 (#1317) * Update doc (#1319) * Update persistent storage section * Update kibana localhost url to use https * Update k8s resources names in accessing-services doc * Mention SSL browser warning * Fix bulleted list * Add CI job for nightly builds (#1248) * Move version to a file * Add CI implementation * Update VERSION * Depend on another PR for moving out version from Makefile * Update Jenkinsfile * Don't build and push operator image in bootstrap-gke (#1332) We don't need to do that anymore, since we don't use an init container based on the operator image. * Remove Docker image publishing from devops-ci (#1339) * Suppress output of certain commands from Makefile (#1342) * Document how to disable TLS (#1341) * Use new credentials for Docker registry (#1346) * Workaround controller-runtime webhook upsert bug (#1337) * Fix docs build on PR job (#1351) * Fix docs build on PR job * Cleanup workspace before doing other steps * APM: remove "output" element and add elasticsearchRef (#1345) * Don't rely on buggy metaObject Kind (#1324) * Don't rely on buggy metaObject Kind A bug in our client implementation may clear the object's Kind on certain scenarios. See kubernetes-sigs/controller-runtime#406. Let's avoid that by fixing a constant Kind returned by a method call on the resource.
Merge master into statefulset-refactoring
Change the way we setup zen1 and zen2 settings to adapt to StatefulSets upscale and downscale. Also improve StatefulSet downscale code, which is moved to its own file, and add a few helper functions around pods and master nodes. In a nutshell: set minimum_master_nodes (zen1) and initial_master_nodes (zen2) in the configuration of new nodes to create, corresponding to the expected nodes that will be created. The setting is automagically updated in existing nodes configuration , where it is ignored until nodes restart. update minimum_master_nodes (zen1) and voting_config_exclusions (zen2) whenever we remove a node. For now, we don't (yet) do any change on rolling upgrades, which are limited to a single node at a time at the moment. This only impacts sset scale up/down.
* Add mandatory NodeSpec name * Rename apm samples resources to something unique Otherwise, volumes of resources from other tests are reused for this one. Should be solved once we remove PVCs in #1288.
Refactor the main driver and ES pod creation logic, towards simplification of the codebase. Main changes: StatefulSet and PodSpec creation logic lives in the NodeSpec package No more catch-all version/common.go. The only difference between versions for now is explicitly zen1 vs. zen2. No more pod package (with conflicting variable/package name) No more nested indirection from driver -> defaultDriver -> NewPodSpecParams -> podSpec(func, func, func, func) -> buildPodTemplateSpec(func, func, func, func). Default driver code attempts to be minimalistic (while still being a sequential list of function calls) No more "generic resources" since that was only for the external service anyway Remove deprecated code in mutation, pvc, pod package (a lot of the comparison/CalculateChanges/PVC reuse/PodWithConfig/PodSpecContext mechanism we had before) * Move supported version check from driver to version pkg * Simplify driver creation * Extract driver nodes reconciliation to its own file * Move expectations to the common pkg and add unit tests * Move configmap reconciliation to its own file * Move DynamicEnvVars to defaults pkg and use it in more places * Extract es env vars away from the version pkg * Move keystore init container params away from the main driver * Move sset build logic into the nodespec pkg * Move readiness probe in the nodespec pkg * Move some es pod consts into the nodespec pkg * Refactor pod template build * Remove concept of generic resources since used for a single service * Remove deprecated pod creation + PVC reuse code * Remove most of the mutation package (RIP) * Remove the PVC package (RIP) * Remove PodWithConfig and PodSpecContext (yay!) * Get rid of the pod pkg (yay!) * Fix supported versions initialization * Add missing license header * Remove deprecated pv reuse e2e test * Remove a bunch of unused functions * Make linter happy with appendAssign * Fix downward env defaults in apm unit tests * Move PodsByName to the utils pkg * Fix typo * Fix reference to the keystore initcontainer and add a unit test * Remove dead code * Address PR review
18 tasks
@elasticmachine, run elasticsearch-ci/docs |
To cleanup resources and avoid any PVC reuse for different E2E stacks, let's remove PVCs at the end of E2E tests. This should probably be done by the operator directly, see #1288.
@anyasabo I think the samples close to the code (config/samples) are updated already. Are you referring to examples in our documentation? |
david-kow
reviewed
Aug 5, 2019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments and questions.
operators/pkg/controller/elasticsearch/certificates/transport/pod_secret.go
Outdated
Show resolved
Hide resolved
operators/pkg/controller/elasticsearch/certificates/transport/pod_secret.go
Outdated
Show resolved
Hide resolved
operators/pkg/controller/elasticsearch/certificates/transport/pod_secret.go
Show resolved
Hide resolved
operators/pkg/controller/elasticsearch/certificates/transport/pod_secret.go
Outdated
Show resolved
Hide resolved
operators/pkg/controller/elasticsearch/certificates/transport/pod_secret.go
Outdated
Show resolved
Hide resolved
david-kow
reviewed
Aug 5, 2019
david-kow
reviewed
Aug 6, 2019
pebrc
reviewed
Aug 6, 2019
pebrc
approved these changes
Aug 6, 2019
operators/pkg/controller/elasticsearch/version/zen1/compatibility_test.go
Outdated
Show resolved
Hide resolved
operators/pkg/controller/elasticsearch/version/zen1/compatibility_test.go
Outdated
Show resolved
Hide resolved
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Merge the
statefulset-refactoring
feature branch into master.Change the way Elasticsearch nodes are orchestrated to rely on StatefulSets instead of managing pods.
The implementation is not complete yet. In particular:
However, it seems to be good enough to spawn stacks, apply any mutation, and run E2E tests.
Relates #1299.