Add initial OpenShift swap enhancement
Showing 1 changed file with 142 additions and 0 deletions.
@@ -0,0 +1,142 @@
---
title: node-swap
authors:
  - "@ehashman"
reviewers:
  - "@rphilips"
  - "@sjenning"
  - "???"
approvers:
  - "@mrunalp"
creation-date: "2021-06-23"
status: provisional
---

# OpenShift Node Swap Support

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Operational readiness criteria are defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

The upstream Kubernetes 1.22 release introduced alpha support for configuring swap memory usage for Kubernetes workloads on a per-node basis.

Now that upstream supports swap on nodes, several use cases would benefit from OpenShift nodes supporting swap: improved node stability, better support for applications with high memory overhead but smaller working sets, the use of memory-constrained devices, and memory flexibility.

## Motivation

See [KEP-2400: Motivation].

[KEP-2400: Motivation]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#motivation

### Goals

- Swap can be provisioned and configured for nodes to use in an OpenShift cluster.

### Non-Goals

- Workload-specific swap accounting.
- Any of the non-goals in [KEP-2400: Non-goals].

[KEP-2400: Non-goals]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#non-goals

## Proposal

### User Stories

See [KEP-2400: User Stories](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#user-stories).

### Implementation Details/Notes/Constraints [optional]

See [KEP-2400: Notes/Constraints/Caveats](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#notesconstraintscaveats-optional).

### Risks and Mitigations

See [KEP-2400: Risks and Mitigations].

[KEP-2400: Risks and Mitigations]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#risks-and-mitigations

## Design Details

_very drafty_

- We will add `NodeSwap` to the [`TechPreviewNoUpgrade`] feature gate list.
- We can use [Ignition configs](https://coreos.github.io/ignition/configuration-v3_3/) to add swap partitions to worker nodes (`filesystems.format = swap`).
  - Options: partition on the root node, or perhaps provision and mount an NVMe volume? (https://github.com/openshift/machine-config-operator/issues/1619)
- We will need to ensure swap is enabled on the node (`swapon`) before the kubelet starts.
- The kubelet needs an appropriate KubeletConfiguration (e.g. `NodeSwap` feature gate enabled, `failSwapOn: false`, and [`memorySwap.swapBehavior` set](https://kubernetes.io/docs/concepts/architecture/nodes/#swap-memory)).

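The provisioning steps in the list above could look roughly like the following. This is a minimal sketch, not a settled design: it assumes a swap partition already exists at the hypothetical path `/dev/disk/by-partlabel/swap`, that `swapon` is installed at `/usr/sbin/swapon` on the node image, and that a oneshot unit (hypothetically named `swap-on.service`) ordered before `kubelet.service` is an acceptable way to enable swap. It only builds the Ignition v3.3.0 document; how that document reaches the node (for example through the MCO) is left open.

```python
import json

# Hypothetical device path; a real deployment would choose its own partition.
SWAP_DEVICE = "/dev/disk/by-partlabel/swap"


def swapon_unit(device: str) -> str:
    """Render a oneshot systemd unit that enables swap before the kubelet starts."""
    return "\n".join([
        "[Unit]",
        "Description=Enable swap before the kubelet starts",
        "Before=kubelet.service",
        "",
        "[Service]",
        "Type=oneshot",
        "RemainAfterExit=yes",
        # Assumption: swapon lives at /usr/sbin/swapon on the node image.
        f"ExecStart=/usr/sbin/swapon {device}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


def swap_ignition_config(device: str = SWAP_DEVICE) -> dict:
    """Build an Ignition v3.3.0 config that formats `device` as swap."""
    return {
        "ignition": {"version": "3.3.0"},
        "storage": {
            "filesystems": [
                {"device": device, "format": "swap"},
            ],
        },
        "systemd": {
            "units": [
                {
                    "name": "swap-on.service",  # hypothetical unit name
                    "enabled": True,
                    "contents": swapon_unit(device),
                },
            ],
        },
    }


# Print the config as JSON, the format Ignition consumes.
print(json.dumps(swap_ignition_config(), indent=2))
```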
TODO: How will we modify the MCO to roll this out? Do we need to modify it at all?

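However the rollout ends up working, the kubelet settings named in the design details can be sketched as a plain KubeletConfiguration document. The helper below is illustrative only: `kubelet_swap_config` and `SWAP_BEHAVIORS` are hypothetical names, and the set of accepted behaviors reflects the two values this proposal lists for the 1.22 alpha.

```python
import json

# Swap behaviors listed in this proposal for the upstream 1.22 alpha.
SWAP_BEHAVIORS = ("LimitedSwap", "UnlimitedSwap")


def kubelet_swap_config(swap_behavior: str = "LimitedSwap") -> dict:
    """Return a KubeletConfiguration fragment that allows the kubelet to run with swap."""
    if swap_behavior not in SWAP_BEHAVIORS:
        raise ValueError(
            f"unknown swap behavior {swap_behavior!r}; expected one of {SWAP_BEHAVIORS}"
        )
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # The alpha feature gate must be switched on explicitly.
        "featureGates": {"NodeSwap": True},
        # Without this the kubelet refuses to start on a node with swap enabled.
        "failSwapOn": False,
        "memorySwap": {"swapBehavior": swap_behavior},
    }


print(json.dumps(kubelet_swap_config("LimitedSwap"), indent=2))
```

Rejecting unknown behaviors up front is a design choice for the sketch; it keeps a typo from silently producing a config the kubelet would refuse.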
### Open Questions [optional]

- Will we eventually want to enable swap on all OpenShift nodes by default?
- Should swap just be limited to worker nodes, or should we consider adding it to control plane nodes too?

### Test Plan

In addition to the upstream e2e tests, we will need to add e2e suites to OpenShift in order to exercise provisioning and use of swap. This may include unit tests where appropriate, such as in the MCO.

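An OpenShift e2e suite will need some way to confirm that swap was actually provisioned on a node. As a minimal sketch, assuming the test can read the contents of the node's `/proc/swaps`, the check could parse that file and look for at least one swap area with a non-zero size. `parse_proc_swaps` and `swap_is_active` are hypothetical helper names, not existing test utilities.

```python
def parse_proc_swaps(text: str) -> list:
    """Parse the contents of /proc/swaps into a list of swap-area records."""
    entries = []
    # The first line is the column header: Filename Type Size Used Priority.
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip malformed or truncated lines
        filename, kind, size_kib, used_kib, priority = fields[:5]
        entries.append({
            "filename": filename,
            "type": kind,
            "size_kib": int(size_kib),  # the kernel reports sizes in KiB
            "used_kib": int(used_kib),
            "priority": int(priority),
        })
    return entries


def swap_is_active(text: str) -> bool:
    """Return True if at least one swap area with a non-zero size is listed."""
    return any(entry["size_kib"] > 0 for entry in parse_proc_swaps(text))
```

A test would pass the file contents it collected from the node, for example `swap_is_active(proc_swaps_text)`, and fail if the result is `False` on a node that was provisioned with swap.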
### Graduation Criteria

#### Dev Preview -> Tech Preview

Requires alpha support in upstream Kubernetes (1.22+).

- Support provisioning OpenShift nodes with swap enabled for all available upstream swap configurations (currently `LimitedSwap`, `UnlimitedSwap`).

JIRA: https://issues.redhat.com/browse/OCPNODE-470

_Graduation criteria below are tentative._

#### Tech Preview -> GA

Requires beta/GA support in upstream Kubernetes (possibly 1.25+).

- More testing (upgrade, downgrade, scale)
- Sufficient time for feedback
- Available by default
- Backhaul SLI telemetry
- Document SLOs for the component
- Conduct load testing

### Upgrade / Downgrade Strategy

The `NodeSwap` feature gate is not supported in Kubernetes versions prior to 1.22/OpenShift 4.9. We will add the upstream `NodeSwap` feature gate to the set of [`TechPreviewNoUpgrade`] gates, which prevents clusters that enable it from being upgraded.

Note that swap support does not require coordination between components and the configuration is limited to individual nodes.

See also [KEP-2400: Upgrade/Downgrade Strategy].

[`TechPreviewNoUpgrade`]: https://github.com/openshift/enhancements/blob/ce4d303db807622687159eb9d3248285a003fabb/guidelines/techpreview.md#official-processmechanism-for-delivering-a-tp-feature
[KEP-2400: Upgrade/Downgrade Strategy]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#upgrade--downgrade-strategy

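Because `NodeSwap` is unavailable before Kubernetes 1.22, tooling or tests that touch swap may want to guard on the kubelet version first. A minimal sketch of such a guard, assuming the version arrives as a string like `v1.22.0` (`supports_node_swap` is a hypothetical helper name):

```python
import re


def supports_node_swap(kubelet_version: str) -> bool:
    """Return True if the kubelet version is 1.22 or newer."""
    # Accept strings such as "v1.22.0", "1.22" or "v1.22.0-rc.0+abcdef".
    match = re.match(r"v?(\d+)\.(\d+)", kubelet_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {kubelet_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles both the major and the minor component.
    return (major, minor) >= (1, 22)
```

A suite could skip its swap tests when `supports_node_swap(version)` returns `False` rather than fail on a gate the kubelet does not recognize.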
### Version Skew Strategy

N/A. This is a compatible API change limited to the kubelet, and it does not require coordination with the API server.

## Implementation History

- [Upstream alpha swap support] completed in 1.22.

[Upstream alpha swap support]: https://github.com/kubernetes/enhancements/issues/2400#issuecomment-884327938

## Drawbacks and Alternatives

See [KEP-2400: Drawbacks] and [KEP-2400: Alternatives].

[KEP-2400: Drawbacks]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#drawbacks
[KEP-2400: Alternatives]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#alternatives

## Infrastructure Needed [optional]

- We will need to configure periodic e2e tests on VMs with swap enabled.
- We will need to enable swap on a [reliability cluster] to gauge long-term stability.

[reliability cluster]: https://issues.redhat.com/browse/OCPNODE-619