Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket_timeout does not work in CRuby > 3.0 #1016

Open
nickamorim opened this issue Nov 12, 2024 · 0 comments
Open

socket_timeout does not work in CRuby > 3.0 #1016

nickamorim opened this issue Nov 12, 2024 · 0 comments

Comments

@nickamorim
Copy link
Collaborator

Description

All operations except for get_multi do not respect the socket_timeout client option. I think this ultimately boils down to the usage of gets here and why in a patch using IO#timeout it works as expected (operation times out).

Reproduction

I've tested the following script on various Ruby versions and none of them time out on the Memcached call as I would expect. Note that I at least expect on Ruby versions < 3.0 it properly times out since SO_RCVTIMEO and SO_SNDTIMEO would be used since it's a blocking socket.

The gist of the script is configuring a Dalli client with a 100ms socket_timeout and using Toxiproxy to inject 5 seconds of latency to each call to Memcached. The expected behaviour is that the c.get("foo") line should time out after 100ms; but in reality it hangs 10 seconds (5s for the initial version call + 5s for the get call) on the gets("\r\n") line I referenced above.

I assume that this test was supposed to catch this bug but I don't think it does because:
a) the default timeout of 1s is less than the sleep duration (0.6s + 0.3s)
b) If you pass in socket_timeout: 1000 and remove the sleep, the test still passes

require "dalli"
require "toxiproxy"

c = Dalli::Client.new("127.0.0.1:22122", socket_timeout: 0.1)

Toxiproxy.populate([{
  name: 'memcached',
  listen: "127.0.0.1:22122",
  upstream: "127.0.0.1:11211"
}])

Toxiproxy[/memcached/].downstream(:latency, latency: 5000).apply do
  c.get("foo")
end

Note: If I replace the get call with get_multi, there is a timeout: W, [2024-11-12T11:31:08.099162 #59436] WARN -- : 127.0.0.1:22122 failed (count: 0) External timeout.

My understanding is that this is because we call readfull which ultimately calls read_nonblock and hits nonblock_timed_out?.

Proposed Solution

I have a branch that has been running in production for a couple of days without any issue that leverages IO#timeout here. I can submit a patch if you're willing to accept it @petergoldstein. Although longer term, the BufferedIO class in redis-client seems like the proper way to handle timeouts.

@nickamorim nickamorim changed the title socket_timeout does not work socket_timeout does not work in CRuby > 3.0 Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant