Feat/set addrs #2190

Merged
merged 4 commits into master from feat/set-addrs on Oct 5, 2022

Conversation

vmihailenco
Collaborator

Closes #2093

szuecs and others added 3 commits August 1, 2022 22:06
…euse old connections

test: ring scale-in and scale-out

rewrite as suggested by @AlexanderYastrebov
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
@vmihailenco
Collaborator Author

@szuecs please take a look at the changes I've made. I removed the second mutex you added because I doubt it is safe, and it does not look like it should affect performance.

// decrease number of shards, that you use. It will reuse shards that
// existed before and close the ones that will not be used anymore.
func (c *ringSharding) SetAddrs(addrs map[string]string) {
	c.mu.Lock()
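
For context, a minimal sketch of how the SetAddrs API added by this PR is called on a Ring, matching the doc comment above (hypothetical shard names and addresses; import path as used by v9 at the time):

package main

import (
	"context"

	"github.com/go-redis/redis/v9"
)

func main() {
	ctx := context.Background()

	ring := redis.NewRing(&redis.RingOptions{
		Addrs: map[string]string{
			"shard1": "localhost:6379",
			"shard2": "localhost:6380",
		},
	})

	// Scale out: shard1 and shard2 are reused, shard3 is created.
	ring.SetAddrs(map[string]string{
		"shard1": "localhost:6379",
		"shard2": "localhost:6380",
		"shard3": "localhost:6381",
	})

	// Scale in: shard3 is no longer listed, so its client gets closed.
	ring.SetAddrs(map[string]string{
		"shard1": "localhost:6379",
		"shard2": "localhost:6380",
	})

	_ = ring.Ping(ctx).Err()
}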
Contributor

This will block all ongoing traffic through ring shards until it has connected all newRingShards to the connection pools, so we wait until all new shards are connected before we can serve another ring.$Operation() call.
For example, GetByKey is guarded by RLock:
https://github.com/go-redis/redis/blob/c561f3ca7e5cf44ce1f1d3ef30f4a10a9c674c8a/ring.go#L271-L275
which is used in cmdShard https://github.com/go-redis/redis/blob/c561f3ca7e5cf44ce1f1d3ef30f4a10a9c674c8a/ring.go#L573-L581 called by process() https://github.com/go-redis/redis/blob/c561f3ca7e5cf44ce1f1d3ef30f4a10a9c674c8a/ring.go#L592.
So a SetAddrs call will block all Redis ring shard operations until all new shards are connected. In my version it would block all operations only while swapping in the already connected shards; in my proposed PR, the only operation holding the shared lock that would be blocked is the close operation.
I care a lot about this because we run multiple clusters with up to 100 Redis ring shards and 2-3M operations per second. It will be visible in our latency metrics if we have to stop processing while connecting new shards.
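
To illustrate the concern with a generic Go sketch (not go-redis code): while one goroutine holds the write lock during slow work, every reader blocks on RLock, just as ring operations would block behind a SetAddrs that holds c.mu while connecting shards:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var mu sync.RWMutex

	// Reader: stands in for GetByKey / process().
	go func() {
		for i := 0; i < 5; i++ {
			mu.RLock()
			fmt.Println("read at", time.Now().Format("15:04:05.000"))
			mu.RUnlock()
			time.Sleep(100 * time.Millisecond)
		}
	}()

	time.Sleep(150 * time.Millisecond)

	// Writer: stands in for SetAddrs holding the lock while dialing new shards.
	mu.Lock()
	time.Sleep(1 * time.Second) // simulate slow shard creation / dialing
	mu.Unlock()

	time.Sleep(time.Second) // let the remaining reads finish; they resume only after Unlock
}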

Collaborator Author

So a SetAddrs call will block all Redis ringshard operations until all new shards are connected.

newRingShards should not open any connections and should be relatively fast.

In my version it would block all the operations only when we swap the already connected shards.

You are welcome to try to port those changes, but I am not sure they were done correctly; in other words, this PR can probably be further optimized.

Contributor

newRingShards will create a new connection pool via pool.NewConnPool, which eventually runs checkMinIdleConns(), which looks like this:

func (p *ConnPool) checkMinIdleConns() {
	if p.opt.MinIdleConns == 0 {
		return
	}
	// Reserve idle slots up to MinIdleConns (bounded by PoolSize) and dial
	// each reserved connection in its own goroutine.
	for p.poolSize < p.opt.PoolSize && p.idleConnsLen < p.opt.MinIdleConns {
		p.poolSize++
		p.idleConnsLen++

		go func() {
			err := p.addIdleConn()
			if err != nil && err != ErrClosed {
				// Dialing failed: give the reserved slots back.
				p.connsMu.Lock()
				p.poolSize--
				p.idleConnsLen--
				p.connsMu.Unlock()
			}
		}()
	}
}

So it starts some goroutines, depending on opt.PoolSize and opt.MinIdleConns, and each calls addIdleConn(), which creates a new connection via opt.Dialer.
So it does open new connections, which you have to do anyway. The question is when you do it, and now you just merged your own PR without discussing it. Honestly a bit odd, but it's your project, not mine...

Collaborator Author

@szuecs okay, go-redis opens new connections only when MinIdleConns is set, and it still does so asynchronously, which should be relatively fast.

now you just merged your own PR without discussing it

Apologies, but I assumed it is better to have something than nothing at all. We can unexport SetAddrs and continue looking for a solution, but there are not that many volunteers to work on this.

And as I said, I am not sure your PR was ready to be merged either...

Contributor

Yeah, no problem, I just wanted to say how I felt.
Let's test the current merged change, and if this is an issue we will reach out and propose a change.
Thanks for your collaboration and work on the project!

Collaborator Author

Thank you for the contribution and for your patience.

Contributor

Let's test the current merged change, and if this is an issue we will reach out and propose a change.

I agree, let's see how this performs in practice and have a second iteration.

okay, go-redis opens new connections only when MinIdleConns is set, and it still does so asynchronously, which should be relatively fast.

For a new shard it is useful to have some established connections before using it.
Asynchronous connection setup could in fact be a downside: the shard would be available for use before there is a working connection, even if we decouple "shard locking for use" from "shard locking for address update".
It should be possible to demonstrate this by adding a sleep() into the dialer, roughly as sketched below.
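
A hypothetical sketch of that demonstration (not part of the PR; assumes the v9 RingOptions field names): inject a sleeping Dialer, scale out with SetAddrs, and time the first commands afterwards. With asynchronous MinIdleConns dialing, the new shard becomes selectable before its connections are ready, so commands routed to it pay the dial latency.

package main

import (
	"context"
	"fmt"
	"net"
	"time"

	"github.com/go-redis/redis/v9"
)

func main() {
	ctx := context.Background()

	// Hypothetical demonstration: every dial sleeps, simulating slow
	// connection establishment to a new shard.
	slowDialer := func(ctx context.Context, network, addr string) (net.Conn, error) {
		time.Sleep(500 * time.Millisecond)
		return net.Dial(network, addr)
	}

	ring := redis.NewRing(&redis.RingOptions{
		Addrs:        map[string]string{"shard1": "localhost:6379"},
		MinIdleConns: 2, // idle connections are dialed asynchronously
		Dialer:       slowDialer,
	})

	// Scale out: shard2 becomes selectable immediately, but its pool is
	// still dialing in the background through the sleeping dialer.
	ring.SetAddrs(map[string]string{
		"shard1": "localhost:6379",
		"shard2": "localhost:6380",
	})

	// Time commands right after SetAddrs; the ones routed to shard2 pay
	// the dial latency because no working connection exists yet.
	for i := 0; i < 5; i++ {
		start := time.Now()
		err := ring.Set(ctx, fmt.Sprintf("key%d", i), i, 0).Err()
		fmt.Println(i, time.Since(start), err)
	}
}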

@vmihailenco vmihailenco merged commit 965ecde into master Oct 5, 2022
@vmihailenco vmihailenco deleted the feat/set-addrs branch October 5, 2022 07:18
AlexanderYastrebov added a commit to AlexanderYastrebov/go-redis that referenced this pull request Nov 14, 2022
Introduces a new lock to make `SetAddrs` calls exclusive.
This enables removal of `ringSharding` field locking for the duration of
potentially long `newRingShards` call.
Closes newly created shards if `ringSharding` was closed during `SetAddrs`.

`TestRingSetAddrsContention` observes number of pings increased from ~200 to ~30_000.

See related discussion redis#2190 (comment)
AlexanderYastrebov added a commit to AlexanderYastrebov/go-redis that referenced this pull request Nov 14, 2022
Introduces a new lock to make `SetAddrs` calls exclusive.
This enables removal of `ringSharding` field locking for the duration of
potentially long `newRingShards` call.
Closes newly created shards if `ringSharding` was closed during `SetAddrs`.

`TestRingSetAddrsContention` observes number of pings increased from ~200 to ~30_000.

See related discussion redis#2190 (comment)

Updates redis#2077
AlexanderYastrebov added a commit to AlexanderYastrebov/go-redis that referenced this pull request Nov 21, 2022
AlexanderYastrebov added a commit to AlexanderYastrebov/go-redis that referenced this pull request Nov 21, 2022
Introduces a new lock to make `SetAddrs` calls exclusive.
This allows release of shards lock for the duration of potentially long `newRingShards` call.

`TestRingSetAddrsContention` observes number of pings increased from <1000 to ~40_000.

See redis#2190 (comment)

Updates redis#2077
AlexanderYastrebov added a commit to AlexanderYastrebov/go-redis that referenced this pull request Nov 22, 2022
Introduces a new lock to make `SetAddrs` calls exclusive.
This allows release of shards lock for the duration of potentially long `newRingShards` call.

`TestRingSetAddrsContention` observes number of pings increased from <1000 to ~40_000.

See redis#2190 (comment)

Updates redis#2077
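
A simplified sketch of the locking scheme those commits describe (hypothetical types; not the actual go-redis code): a dedicated mutex serializes SetAddrs calls, the potentially slow shard-creation work runs without holding the shards lock, and only the final swap takes the write lock.

package sketch

import "sync"

// Hypothetical stand-in for a ring shard client.
type shard struct{ addr string }

type sharding struct {
	setAddrsMu sync.Mutex   // serializes SetAddrs calls
	mu         sync.RWMutex // guards shards for regular ring operations

	shards map[string]*shard
}

func (c *sharding) SetAddrs(addrs map[string]string) {
	// Only one SetAddrs runs at a time; regular operations keep using c.mu.RLock().
	c.setAddrsMu.Lock()
	defer c.setAddrsMu.Unlock()

	// The potentially slow part (creating clients, dialing) happens without
	// holding c.mu, so reads are not blocked while it runs.
	newShards := make(map[string]*shard, len(addrs))
	for name, addr := range addrs {
		newShards[name] = &shard{addr: addr}
	}

	// Only the swap itself takes the write lock, and it is quick.
	c.mu.Lock()
	old := c.shards
	c.shards = newShards
	c.mu.Unlock()

	_ = old // here the shards that are no longer used would be closed
}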