This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[stable/redis] fix sentinel restart problems #21168

Closed · wants to merge 8 commits
Conversation

@btrappe commented Mar 2, 2020

Is this a new chart

no

What this PR does / why we need it:

Which issue this PR fixes

This is a renewal of PR #19464, which was closed due to inactivity after successfully passing the automated test bed. There seems to be some interest in this change, which is why I am resubmitting it as an updated PR.

About the reason for my changes:
I tested the HA behaviour of the Redis cluster and the Sentinel-aware functionality of our services that use the Redis/Sentinel cluster. To simulate failovers as they would happen in production, I simply deleted the slave or the master pod and found some abnormalities.

1. Sentinel on the master pod doesn't start up if only the sentinel container on the master pod is restarted (commit 911d2d6)

Interestingly, the redis chart creates a non-symmetrical set of pods: there is one "master" pod that initially owns the Redis master node, and (optionally) several "slave" pods that initially own a set of Redis (and sentinel) nodes in slave mode. As long as the master pod is left untouched, scaling slave pods up/down and even recreating suddenly deleted slave pods works as intended (at least at first glance), because new slave pods can reach a Redis node via the master service. Only when the master pod is recreated does it face the problem of deciding whether it is the first pod of a new cluster (initial cluster startup) or whether there are remains of a running cluster that has lost its master pod. To address this, sameersbn added some lines in commit 1adb31c (chart version 8.0.16) that ask an existing sentinel for its list of sentinel IPs and append them directly to sentinel.conf. That works perfectly on fresh pod creation (no existing sentinel.conf, so the pod starts from a copy of a template sentinel.conf), but it leads to a persistent "error in sentinel configuration at line 21" (the line number may vary) when only the sentinel container is restarted. You can easily simulate this by attaching to the sentinel container on the master pod and killing the sentinel process with "kill 1": Kubernetes immediately restarts the container in place (without wiping the folders), so sameersbn's code appends the well-known sentinels a second time (they are still in sentinel.conf from the previously running sentinel process). The resulting duplicate lines make the sentinel process fail with the configuration error instead of starting up. One solution would be to always start with a fresh sentinel.conf, but it is worth preserving sentinel.conf whenever possible, as it always contains the most current state of all sentinel cluster nodes. That's why I changed sameersbn's code so that it runs only if no previous well-known sentinel node list is found in sentinel.conf. A minimal sketch of that guard is shown below.
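For illustration, a minimal sketch of that guard, assuming the Bitnami config path used by this chart; the grep marker and the `$SENTINEL_PORT`/`$MASTER_SET` variables are placeholders for how Sentinel persists its peers and for the templated chart values, not the literal diff:

```bash
SENTINEL_CONF=/opt/bitnami/redis-sentinel/etc/sentinel.conf

# Only pull the peer list from a running sentinel when this container does not
# already carry one from a previous run; appending it a second time produces
# duplicate lines and Sentinel aborts with a configuration error.
if ! grep -q '^sentinel known-sentinel' "$SENTINEL_CONF"; then
    existing_sentinels=$(redis-cli --raw -h "$REDIS_MASTER_HOST" -a "$REDIS_PASSWORD" \
        -p "$SENTINEL_PORT" SENTINEL sentinels "$MASTER_SET")
    echo "$existing_sentinels" | awk -f /health/parse_sentinels.awk | tee -a "$SENTINEL_CONF"
fi
```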

2. Reset the sentinel list after the master pod is gone (commit 9be66eb)

While dealing with the problems of 1. and reading https://redis.io/topics/sentinel, I stumbled over the note "Sentinels never forget already seen Sentinels". In a statically configured environment this has no implications, since nodes are restarted/recreated with their former IP addresses. On Kubernetes, however, every pod (even after in-place updates or partial container restarts) always gets a new IP address. This causes the known-sentinel list in sentinel.conf to grow larger and larger. Worse, at some point the Kubernetes cluster will recycle unused IP addresses, and if a foreign Redis/Sentinel cluster gets such an address, two independent clusters will (at least partially) be merged, which leads to unexpected results. I observed exactly this problem on our Kubernetes cluster, which runs several environments/stages of our microservice architecture and therefore multiple Redis clusters in different namespaces, and at the time I could not explain why some sentinels reported Redis nodes from a foreign Redis cluster.
The fix is simple: just before starting the sentinel, I fork a shell that waits 120 s (a fixed value, long enough for the sentinels to get to know each other), then asks the (newly started) sentinel on the master pod for all known sentinels, and finally resets all returned sentinels as well as the sentinel on the master pod. That should work, since by then all sentinels should follow the same Redis master node, from which they can acquire the current sentinel list. A rough sketch of the forked helper follows below.
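Roughly, the forked helper looks like this (a simplified sketch rather than the exact diff; it assumes the chart's /health/parse_sentinels_reset.awk helper emits one `redis-cli ... SENTINEL reset ...` command per discovered peer, and `$SENTINEL_PORT`/`$MASTER_SET` stand in for the templated chart values):

```bash
# Forked into the background just before starting redis-sentinel: after a
# fixed grace period, ask the freshly started local sentinel for its peer
# list, reset every peer, and finally reset the local sentinel itself so
# that IP addresses of deleted/recreated pods are forgotten.
(
    sleep 120
    redis-cli --raw -h localhost -a "$REDIS_PASSWORD" -p "$SENTINEL_PORT" \
        SENTINEL sentinels "$MASTER_SET" \
        | awk -f /health/parse_sentinels_reset.awk \
        | sh
    redis-cli --raw -h localhost -a "$REDIS_PASSWORD" -p "$SENTINEL_PORT" \
        SENTINEL reset "$MASTER_SET"
) &
```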

3. Reset sentinels after a Helm upgrade (commit baa5b79)

Finally, the IP address problem described in 2. also occurs when upgrading the Helm chart, because pods are potentially rebuilt. Since it is not guaranteed that the master service pod is recreated, nor that the whole upgrade takes less than 120 s, I added a Job that is hooked into the Helm chart install/upgrade events and executes the same procedure as described in 2. (reset all sentinels); a sketch of such a hook follows below.
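The hook Job looks roughly like the following (a sketch under stated assumptions, not the template from the commit: the metadata name, the image, and how the awk helper is mounted are placeholders, only the helm.sh/hook annotations are standard Helm hook syntax, and `$SENTINEL_PORT`/`$MASTER_SET` again stand in for templated values):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: redis-sentinel-reset                 # placeholder name
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: sentinel-reset
          image: docker.io/bitnami/redis:latest   # placeholder image
          command: ["sh", "-c"]
          args:
            - |
              # Ask the well-known sentinel behind the master service for its
              # peer list, reset each peer, then reset that sentinel itself.
              redis-cli --raw -h "$REDIS_MASTER_HOST" -a "$REDIS_PASSWORD" -p "$SENTINEL_PORT" \
                SENTINEL sentinels "$MASTER_SET" | awk -f /health/parse_sentinels_reset.awk | sh
              redis-cli --raw -h "$REDIS_MASTER_HOST" -a "$REDIS_PASSWORD" -p "$SENTINEL_PORT" \
                SENTINEL reset "$MASTER_SET"
```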

Special notes for your reviewer:

Checklist


  • DCO signed
  • Chart Version bumped
  • Title of the PR starts with chart name (e.g. [stable/mychartname])

btrappe and others added 5 commits March 1, 2020 22:10
If only the sentinel container on the master svc pod crashes/restarts, the existing sentinels were added twice to sentinel.conf, which leads to an "error in configuration at line <something around 21>"

Signed-off-by: btrappe <btrappe@web.de>
Signed-off-by: Bernd Trappe <btrappe@web.de>
If the redis Helm chart is upgraded, pods may be restarted after an in-place container upgrade and get new IP addresses. Sentinels never forget IP addresses they have already seen. This job resets the sentinel lists after Helm chart updates.

Signed-off-by: btrappe <btrappe@web.de>
Signed-off-by: Bernd Trappe <btrappe@web.de>
Signed-off-by: Bernd Trappe <btrappe@web.de>
Signed-off-by: Bernd Trappe <btrappe@web.de>
Signed-off-by: Bernd Trappe <btrappe@web.de>
@helm-bot helm-bot added the Contribution Allowed If the contributor has signed the DCO or the CNCF CLA (prior to the move to a DCO). label Mar 2, 2020
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 2, 2020
@k8s-ci-robot (Contributor)

Hi @btrappe. Thanks for your PR.

I'm waiting for a helm member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: btrappe
To complete the pull request process, please assign tompizmor
You can assign the PR to them by writing /assign @tompizmor in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@btrappe btrappe changed the title [stable-redis] fix sentinel restart problems [stable/redis] fix sentinel restart problems Mar 2, 2020
@miguelaeh (Collaborator) left a comment


Thank you very much for the PR!
I left some comments.
Apart from that, I would like to share that we are working on a major Redis change that will use the Redis Cluster topology, so Sentinel won't be needed anymore. For this reason, I am interested in whether these changes break backward compatibility. If these changes are a major change too, we would like to avoid merging them.

@@ -1,6 +1,6 @@
 apiVersion: v1
 name: redis
-version: 10.5.6
+version: 10.5.7
Collaborator

This should be at least a minor version as you are adding a new feature. If this PR is going to break the backward compatibility it should be a major change.

@DaveOHenry (Contributor) commented Mar 4, 2020

This doesn't look like a new feature to me. It fixes some annoying bugs by using functionality that basically already exists in this chart.

Contributor

I stand corrected. Looks like resetting sentinels was never a "feature" of this chart.
The only difference I observed while using this patched version is that there are no longer any failed pods. Without the changes from this PR we had a randomly failing pod every 2-3 days. The upgrade process worked fine, too.

@btrappe (Author)

That's exactly my opinion. I don't want to introduce any features (the Helm chart already promises to provide an HA Redis cluster, otherwise it wouldn't need the sentinels); I just want to use the chart without the mentioned problems, which I suppose a cluster should keep away from me as a cluster consumer.

- -c
- |
$(timeout -s 9 {{ .Values.sentinel.initialCheckTimeout }} redis-cli --raw -h $REDIS_MASTER_HOST -a \"$REDIS_PASSWORD\" -p {{ .Values.sentinel.service.sentinelPort }} SENTINEL sentinels {{ .Values.sentinel.masterSet }} | awk -f /health/parse_sentinels_reset.awk)
redis-cli --raw -h $REDIS_MASTER_HOST -a \"$REDIS_PASSWORD\" -p 26379 SENTINEL reset {{ .Values.sentinel.masterSet }}
Collaborator

Do not hardcode the port here.

@btrappe (Author)

Fixed in d7c69fb

existing_sentinels=$(timeout -s 9 {{ .Values.sentinel.initialCheckTimeout }} redis-cli --raw -h {{ template "redis.fullname" . }} -a "$REDIS_PASSWORD" -p {{ .Values.sentinel.service.sentinelPort }} SENTINEL sentinels {{ .Values.sentinel.masterSet }})
echo "$existing_sentinels" | awk -f /health/parse_sentinels.awk | tee -a /opt/bitnami/redis-sentinel/etc/sentinel.conf
fi
bash -c "sleep 120; $(timeout -s 9 {{ .Values.sentinel.initialCheckTimeout }} redis-cli --raw -h localhost -a \"$REDIS_PASSWORD\" -p {{ .Values.sentinel.service.sentinelPort }} SENTINEL sentinels {{ .Values.sentinel.masterSet }} | awk -f /health/parse_sentinels_reset.awk) echo; redis-cli --raw -h localhost -a \"$REDIS_PASSWORD\" -p 26379 SENTINEL reset {{ .Values.sentinel.masterSet }}" &
Collaborator

Why is this sleep needed?

@btrappe (Author)

I know that sleeps are generally ugly. The problem is, however, that I have to ping (i.e. reset) all sentinel nodes. To get a full list of all (at least potential) sentinels, I ask a well-known sentinel: the one I just started on localhost. There are two cases: full pod creation with an empty/initial configuration, and a pod restart with configuration from before. The 120-second sleep ensures that my freshly started sentinel has enough time to communicate and negotiate with the other cluster members and can deliver a complete list of potential sentinels, even when the pod is newly created without any prior information from the master pod. On the other hand, I don't expect another master restart within 120 seconds (which actually shouldn't occur at all, of course).

@btrappe (Author) commented Mar 4, 2020

> For this reason, I am interested in whether these changes break backward compatibility. If these changes are a major change too, we would like to avoid merging them.

I don't expect any problems regarding backward compatibility: no new features are introduced, only bugs in the HA node creation process in the case of node/pod failure are fixed, and no configuration values were added or changed.

…entinel-job: wait for master node becoming ready

Signed-off-by: Bernd Trappe <btrappe@web.de>
@helm-bot helm-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 9, 2020
@btrappe btrappe requested a review from miguelaeh March 9, 2020 09:54
stale bot commented Apr 8, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2020
@DaveOHenry (Contributor)

ping

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2020
@carrodher (Collaborator)

Sorry, we missed the notification about this PR, we are going to re-review it. Thanks!

@miguelaeh (Collaborator) left a comment

Hi,
I am sorry for the delay; we didn't receive the notification about your comments.
It looks good to me, except for the REDIS_PASSWORD env var: redis-cli recognizes the REDISCLI_AUTH env var, so it is not necessary to use the -a option, which is insecure. In any case, as the full chart is using REDIS_PASSWORD, it is okay to keep it here to be consistent and to create another PR to change that across the full chart.

LGTM!
Let's wait for the CI/CD.
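For context, a minimal illustration of the REDISCLI_AUTH suggestion above (host, port, and master-set name are placeholders, not values from the chart):

```bash
# Insecure: the password is passed on the command line and shows up in the
# process list (redis-cli prints a warning unless --no-auth-warning is given).
redis-cli -h redis-master -p 26379 -a "$REDIS_PASSWORD" SENTINEL sentinels mymaster

# Preferred: redis-cli reads the password from the REDISCLI_AUTH environment variable.
export REDISCLI_AUTH="$REDIS_PASSWORD"
redis-cli -h redis-master -p 26379 SENTINEL sentinels mymaster
```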

@miguelaeh (Collaborator)

Hi again,
Given the stable deprecation timeline, this Bitnami-maintained Helm chart is now located at bitnami/charts. Please visit the bitnami/charts GitHub repository to create issues or PRs.
In this issue we tried to explain the reasons and motivations behind this transition more carefully; please don't hesitate to add a comment there if you have any questions related to the migration itself.
I am sorry for the noise; we didn't realize that before.

@zanhsieh (Collaborator)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 14, 2020
@helm-bot helm-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 14, 2020
@helm-bot helm-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 14, 2020
Signed-off-by: Bernd Trappe <btrappe@web.de>
stale bot commented May 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 15, 2020
@DaveOHenry (Contributor)

ping

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 15, 2020
stale bot commented Jun 14, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2020
stale bot commented Jun 28, 2020

This issue is being automatically closed due to inactivity.

@stale stale bot closed this Jun 28, 2020
existing_sentinels=$(timeout -s 9 {{ .Values.sentinel.initialCheckTimeout }} redis-cli --raw -h {{ template "redis.fullname" . }} -a "$REDIS_PASSWORD" -p {{ .Values.sentinel.service.sentinelPort }} SENTINEL sentinels {{ .Values.sentinel.masterSet }})
echo "$existing_sentinels" | awk -f /health/parse_sentinels.awk | tee -a /opt/bitnami/redis-sentinel/etc/sentinel.conf
fi
bash -c "sleep 120; $(timeout -s 9 {{ .Values.sentinel.initialCheckTimeout }} redis-cli --raw --no-auth-warning -h localhost -a \"$REDIS_PASSWORD\" -p {{ .Values.sentinel.service.sentinelPort }} SENTINEL sentinels {{ .Values.sentinel.masterSet }} | awk -f /health/parse_sentinels_reset.awk) echo; redis-cli --raw --no-auth-warning -h localhost -a \"$REDIS_PASSWORD\" -p {{.Values.sentinel.porti }} SENTINEL reset {{ .Values.sentinel.masterSet }}" &


If anyone like me is looking for this fix because of Sentinel issues in this chart version: note the typo on this line, ".Values.sentinel.porti" -> ".Values.sentinel.port".

@btrappe (Author)

Oh thanks - that is in fact a typo. Unfortunately, none of the maintainers wants to merge it anymore: #21168 (comment)
I'll try to reopen this PR with the corrected port variable name, but I'm afraid the whole chart is really deprecated.

@btrappe (Author)

This repo will be closed the day after tomorrow, so it makes no sense to update my PR, but thanks for mentioning this error!

Labels: Contribution Allowed · lifecycle/stale · ok-to-test · size/M
8 participants