fix: upgrade redis-ha chart and enable haproxy #3147
Conversation
This is still a bit rough, but it's one potential solution for #3070. With the current HA manifests, we run into errors where ArgoCD attempts to write to a Redis replica.
I took a look at the Redis HA helm chart and saw that we are currently on 3.3.1. The latest version is 4.3.4, which includes a fairly significant change intended to address this issue by running haproxy in front of Redis (helm/charts#15305). In this PR, I have bumped the redis-ha chart version used by the ArgoCD HA installation to 4.3.4, modified the Kustomizations to add the appropriate labels to the new haproxy components, and regenerated the installation manifests. If this is a change you're interested in accepting, I'll need some time to sort out the manifest tests.
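For context, a minimal sketch of what the chart bump and label wiring could look like, assuming a requirements.yaml-style chart dependency plus a kustomize commonLabels entry; file names and layout here are illustrative, not necessarily the exact repo structure:

# requirements.yaml (illustrative): pin the upstream redis-ha chart at 4.3.4
dependencies:
  - name: redis-ha
    version: 4.3.4
    repository: https://kubernetes-charts.storage.googleapis.com

# kustomization.yaml (illustrative): layer the Argo CD labels onto the rendered
# chart output, which now also includes the new haproxy Deployment and Service
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - redis-ha-generated.yaml   # output of helm template for redis-ha 4.3.4
commonLabels:
  app.kubernetes.io/part-of: argocd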
@shelby-moore Hm, an interesting approach the Redis folks are taking there. I think moving the complexity of accessing the master Redis instance into a dedicated component makes a lot of sense from the application's point of view (less code to maintain, and the same methods to access HA and non-HA Redis instances). HAProxy seems like a good choice for the frontend to Redis, too. And that this architecture is endorsed by upstream and probably used a lot in the wild is another argument in its favor. So, for me, this change would be more than welcome once polished up. @jessesuen and @alexmt, what's your opinion on this?
I read about the HAProxy implementation in the Redis HA chart, and the main motivation is to provide a reliable connection from outside of the cluster (k8s, I guess) due to the s_down problem on the Sentinel where the master node is running. There is another HA Redis chart, maintained by Bitnami, that handles this problem in a different way. I used the Redis HA chart in the past on production workloads, but I ended up migrating to the Bitnami chart because not all Redis clients were compatible with Sentinel, and personally I found the Bitnami approach more "Kubernetised".
Thank you for the contribution @shelby-moore! I agree with @jannfis, it's worth switching to HAProxy if it helps resolve #3070.
manifests/base/kustomization.yaml
Outdated
@@ -1,15 +1,15 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
The codegen CI job has failed because kustomization.yaml was upgraded to kustomize 3.x. This breaks compatibility with Replicated Ship, which is stuck on kustomize 2.x. Can you please run make manifests? It should regenerate the manifests using the right kustomize version.
util/cache/cache.go
Outdated
@@ -23,27 +23,14 @@ func NewCache(client CacheClient) *Cache {
// AddCacheFlagsToCmd adds flags which control caching to the specified command
func AddCacheFlagsToCmd(cmd *cobra.Command) func() (*Cache, error) {
	redisAddress := ""
	sentinelAddresses := make([]string, 0)
Please keep the ability to use failover clients. It should provide a quick way to switch back to Sentinel if HAProxy turns out not to be stable.
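To make this concrete, a rough sketch of how a component could be pointed back at the Sentinel-based failover client instead of HAProxy, assuming hypothetical --sentinel and --sentinelmaster flags registered by AddCacheFlagsToCmd; the flag, binary, and service names are illustrative assumptions, not confirmed from the source:

# Illustrative container spec fragment: bypass haproxy and talk to Sentinel
# directly; the flag names below are assumptions for the sake of the example
containers:
  - name: argocd-repo-server
    command:
      - argocd-repo-server
      - --sentinel
      - argocd-redis-ha-announce-0:26379
      - --sentinel
      - argocd-redis-ha-announce-1:26379
      - --sentinelmaster
      - argocd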
Codecov Report
@@           Coverage Diff            @@
##           master   #3147   +/-   ##
========================================
  Coverage    38.6%    38.6%
========================================
  Files         168      168
  Lines       18269    18269
  Branches      237      237
========================================
  Hits         7053     7053
  Misses      10342    10342
  Partials      874      874

Continue to review full report at Codecov.
a86780d to 19d6259
manifests/ha/install.yaml
Outdated
@@ -2354,6 +2484,7 @@ metadata:
    app.kubernetes.io/name: argocd-redis-ha
    app.kubernetes.io/part-of: argocd
  name: argocd-redis-ha-announce-0
  namespace: default
This should not be here. The namespace should not be hard-wired into the yaml.
This is being pulled in from the upstream redis-ha chart: helm/charts@e1b39e9#diff-a988587cdea159026a58cc36592ce812
I can work on a mechanism to strip it out if it should not be included.
Okay, I've added a kustomize patch to strip the namespace from the redis-ha manifest.
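For illustration, one way such a patch could be wired up, assuming kustomize's patchesJson6902 mechanism and targeting the argocd-redis-ha-announce-0 Service shown above; the actual patch in the PR may target more resources or use a different mechanism:

# kustomization.yaml fragment (illustrative): apply a JSON 6902 patch that
# drops the namespace the upstream chart hard-codes into its manifests
patchesJson6902:
  - target:
      version: v1
      kind: Service
      name: argocd-redis-ha-announce-0
    path: remove-namespace.yaml

# remove-namespace.yaml (illustrative): the patch itself
- op: remove
  path: /metadata/namespace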
LGTM. I think it is ready to merge after addressing @jessesuen's comment: #3147 (comment)
Thank you @shelby-moore! LGTM
CI should deploy the change to https://cd.apps.argoproj.io/applications :) We will know if the upgrade is smooth in ~20 mins :)
It works, the upgrade went smoothly and HA Redis just works! Will try it on an internal dogfood instance next.
Checklist: