Add middleware to ensure Router only handles errors for its clients #310

KJTsanaktsidis · 2024-01-08T05:32:36Z

If you nest usages of RedisClient (perhaps becuase you're scanning from one client whilst inserting into a different cluster, perhaps), it's possible for errors which were raised from one cluster's RedisClients to instead be handled by the wrong cluster.

This is bad, because it can mean processing bogus topology "updates" when e.g. MOVED or ASKING responses get caught by the wrong client.

To fix this, we can add a middleware to our RedisClient instances (through the documented middleware interface), and tag errors at the source based on what config object they came from. This can then be used inside the router to identify errors that it should and should not handle.

Although at the moment you have to try pretty hard to get into this situation (sscan and hscan are probably the main ways to get it), solving this is important for my plan for transaction support. Recall that last year one of the reasons I wanted to wrap the returned RedisClient instances from#with in a proxy was to make sure they didn't leak errors to each other: #298 (comment)

One reason I think we might need the proxy implementation, though, is what happens if we have two different client instances, that point to totally different Redis clusters? What happens if we get a MOVED response here:
client1.with(key: '{slot1}', retry_count: 1) do |conn1|
  client2.with(key: '{slot2}', retry_count: 1) do |conn2|
    conn1.call('SET', '{slot1}key', 'value')
  end
end
The MOVED response is for client1's cluster, but client2's #with method is what catches the exception. We'd then wind up moving the slot on the wrong client instance. The benefit of the proxy object around conn is that we can catch the exceptions right there with the client that generated them, so we know for sure we're the cluster that the exception was for.

If we do this in RedisClient middleware instead, we can identify errors correctly right at the source, and that should make it safe to pass the raw RedisClient instances to callers, removing the need for a proxy over the top of it.

If you nest usages of RedisClient (perhaps becuase you're scanning from one client whilst inserting into a different cluster, perhaps), it's possible for errors which were raised from one cluster's RedisClients to instead be handled by the wrong cluster. This is bad, because it can mean processing bogus topology "updates" when e.g. MOVED or ASKING responses get caught by the wrong client. To fix this, we can add a middleware to our RedisClient instances (through the documented middleware interface), and tag errors at the source based on what config object they came from. This can then be used inside the router to identify errors that it should and should not handle.

supercaracal · 2024-01-09T05:44:01Z

Thank you for your pull request.

I feel this use case isn't a bit general. I'd say that most of the users don't need the handling with the middleware. I'm wondering either these logic should be included to our gem or not. Please give me some time for the consideration.

KJTsanaktsidis · 2024-01-09T06:36:28Z

If you’ll allow me to try and sell it a bit… this is needed for the gem to take responsibility for its own correctness, as it were. The underlying redis clients for the cluster clients are the property of the cluster client, and allowing user code “in between” the cluster client and the underlying redis client (like with #with, #scan etc) means that the cluster client can’t properly maintain its own invariants.

wrapping the raw clients is one way to fix this, but I think the solution in this PR (using middleware) is a more elegant and less invasive way.

I think this problem comes up practically in migration scenarios - like in a script to copy things from one cluster to another (like in the README), or perhaps during “dual writing” in applications where they might write to an old and new redis cluster at the same time.

KJTsanaktsidis · 2024-01-15T22:24:06Z

@supercaracal any further thoughts on this? If you like I can have another try at implementing this fix through a custom client_implementation instead, and we can compare how it looks?

supercaracal · 2024-01-16T01:33:49Z

Sorry for my delayed reviewing. As you said, I thought the middleware pattern is more elegant than wrapping the raw client. Although I thought it's not required and it seems like to be responsible for the user side, on second thought, I thought this safety net is nice-to-have. So I'll merge this pull request. Thank you for the improvement!

KJTsanaktsidis force-pushed the ktsanaktsidis/handle_own_errors branch 2 times, most recently from 95eab7f to 64fb6ab Compare January 8, 2024 05:51

KJTsanaktsidis force-pushed the ktsanaktsidis/handle_own_errors branch from 64fb6ab to 65cef5a Compare January 8, 2024 05:52

KJTsanaktsidis mentioned this pull request Jan 8, 2024

Implementation for transaction #299

Closed

supercaracal self-requested a review January 8, 2024 07:00

supercaracal merged commit 4f578b8 into redis-rb:master Jan 16, 2024

supercaracal mentioned this pull request Apr 12, 2024

fix: make the error identification middleware optional because of the use case #344

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add middleware to ensure Router only handles errors for its clients #310

Add middleware to ensure Router only handles errors for its clients #310

KJTsanaktsidis commented Jan 8, 2024 •

edited

Loading

supercaracal commented Jan 9, 2024 •

edited

Loading

KJTsanaktsidis commented Jan 9, 2024

KJTsanaktsidis commented Jan 15, 2024

supercaracal commented Jan 16, 2024

Add middleware to ensure Router only handles errors for its clients #310

Add middleware to ensure Router only handles errors for its clients #310

Conversation

KJTsanaktsidis commented Jan 8, 2024 • edited Loading

supercaracal commented Jan 9, 2024 • edited Loading

KJTsanaktsidis commented Jan 9, 2024

KJTsanaktsidis commented Jan 15, 2024

supercaracal commented Jan 16, 2024

KJTsanaktsidis commented Jan 8, 2024 •

edited

Loading

supercaracal commented Jan 9, 2024 •

edited

Loading