ECK: Document recovery from failed volume upsize #3459
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Pull Request Overview
This PR adds documentation to help users recover from failed Elasticsearch volume expansion operations in ECK (Elastic Cloud on Kubernetes). The change addresses user issues where volume expansion failures can leave deployments in an unrecoverable state.
- Adds a new troubleshooting section for volume expansion failures
- Documents the recommended recovery approach using nodeSet renaming
- Provides specific error message and solution context
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
## If a volume expansion failed [k8s-common-problems-volume-failed-expansion]
If you attempted to expand an Elasticsearch volume via its [volume claim template](/deploy-manage/deploy/cloud-on-k8s/volume-claim-templates.md#k8s-volume-claim-templates-update), the operation may have failed, for example because Azure does not allow volume expansion without shutting down the virtual machine to which the volume is attached. If you then try to set the volume claim template back to its original size, you will encounter the following failure:
Suggested change:
If you attempted an expansion of an {{es}} volume via its [volume claim template](/deploy-manage/deploy/cloud-on-k8s/volume-claim-templates.md#k8s-volume-claim-templates-update), you may have encountered scenarios where the operation failed such as Azure not allowing volume expansion without shutting down the Virtual Machine to which it is attached. If you try to adjust the volume claim template back to the original size you will encounter a failure:
```
Failed to apply spec change: handle volume expansion: decreasing storage size is not supported: an attempt was made to decrease storage size for claim elasticsearch-data
```
In this scenario the best course of action is to rename the existing `nodeSet` to a new name while simultaneously updating the volume claim template to the original size. This operation will bring a new `StatefulSet` online while moving all existing indices to the new volumes and will delete the old `StatefulSet` and its volumes once the operation is complete.
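As a rough illustration of the rename described above, the manifest below is a minimal sketch rather than a definitive recovery procedure. It assumes a cluster named `quickstart` whose original `nodeSet` was called `default`, with 3 nodes and an original volume size of `50Gi`. Those values, the new name `default-v2`, and the version number are all hypothetical and should be replaced with the values from your own deployment. Only the `nodeSet` name and the storage request change relative to the existing spec.

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart            # hypothetical cluster name
spec:
  version: 8.16.1             # hypothetical; keep the version you already run
  nodeSets:
  - name: default-v2          # renamed from the hypothetical original "default"
    count: 3                  # hypothetical; keep the same node count as before
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi     # hypothetical original size, before the failed expansion
```

Applying both changes in a single update matters here: changing only the storage request on the existing `nodeSet` is what triggers the "decreasing storage size is not supported" error.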
Suggested change:
In this scenario the best course of action is to rename the existing `nodeSet` to a new name while simultaneously updating the volume claim template to the original size. This operation will bring a new `StatefulSet` online while moving all existing indices to the new volumes, and will delete the old `StatefulSet` and its volumes once the operation is complete.
LGTM! 🚢
Just two super nit-picky comments. :-)
In elastic/cloud-on-k8s#4467 it's noted that some users are dealing with volume expansion failure issues, and documenting how to recover from this situation would be helpful. This is the attempt to update that documentation.