Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement][opensearch] Avoid rolling pods for chart updates unless necessary #557

Open
briend opened this issue Jul 13, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@briend
Copy link

briend commented Jul 13, 2024

Is your feature request related to a problem? Please describe.
Minor chart version updates will always roll opensearch pods even where there is no technical reason to do so. This can be disruptive and time-consuming for large clusters.

Describe the solution you'd like
Care should be taken to only include labels and other data in the statefulset template that are actually necessary to ensure the pods are deployed correctly. Minor chart updates that might change an Ingress (for example) really do not need to be rolling an entire cluster.

For instance, instead of checksumming the entire ConfigMap (which includes labels that will change with every chart version), why not just checksum data in the ConfigMap?

configchecksum: {{ include (print .Template.BasePath "/configmap.yaml") . | sha256sum | trunc 63 }}

The security config also doesn't seem to require rolling whenever it changes, since to apply those change you still need to run the securityadmin.sh tool manually:

https://opensearch.org/docs/latest/security/configuration/security-admin

We could probably just remove it:

{{- if .Values.securityConfig.config.data }}
securityconfigchecksum: {{ include (print .Template.BasePath "/securityconfig.yaml") . | sha256sum | trunc 63 }}
{{- end }}

Similarly, the labels on the statefulset pods include the chart version, etc, so that will always roll even if nothing else changed with the configuration or image, etc:

{{- include "opensearch.labels" . | nindent 8 }}

Perhaps just using opensearch.selectorLabels (which are stable across chart versions) would be a good compromise

I can submit an MR if this all sounds reasonable.

@briend briend added enhancement New feature or request untriaged Issues that have not yet been triaged labels Jul 13, 2024
@prudhvigodithi
Copy link
Member

[Triage]
I agree with you @briend, the restart for minor chart should only happen if there is an actual change, however please correct me If i'm wrong today it does rolling restart right, It should not impact the cluster traffic. However I like your idea, is there a way you can create a PR and add the changes your are suggesting ?
Thank you
@peterzhuamazon @getsaurabh02

@prudhvigodithi prudhvigodithi removed the untriaged Issues that have not yet been triaged label Jul 18, 2024
@getsaurabh02 getsaurabh02 moved this from 🆕 New to Backlog in Engineering Effectiveness Board Jul 18, 2024
@briend
Copy link
Author

briend commented Jul 18, 2024

Sounds good @prudhvigodithi , I'll make a PR.

however please correct me If i'm wrong today it does rolling restart right, It should not impact the cluster traffic.

We have some large clusters that can take days to roll. We use a custom readiness probe to avoid rolling the next pod while the cluster is in a non-green state, which slows us down a lot. We might have something to contribute here: #474

@prudhvigodithi
Copy link
Member

Got it thanks @briend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: 📦 Backlog
Development

No branches or pull requests

2 participants