Hanging when performing read_multi against large number of keys #941
Comments
Also able to repro without Rails via the following:
require 'timeout'
require 'dalli'

key_count = 2**10
dalli_client = Dalli::Client.new(ENV.fetch('MEMCACHE_SERVERS'), timeout: 0.5, error_when_over_max_size: true)

while key_count < 10_000_000 do
  prefix = (rand * 2**64).to_i.to_s(32)
  puts "Prepping #{key_count} keys"
  keys = key_count.times.map { |i| "some_long_string:#{prefix}:#{i}" }

  t = Time.now
  puts "Starting set multi #{t}"
  keys.each do |k|
    dalli_client.set(k, 0)
  end
  diff = Time.now - t
  puts "Completed in #{diff}s"

  t = Time.now
  puts "Starting multi get #{t}"
  Timeout::timeout(diff) {
    dalli_client.get_multi(keys)
  }
  puts "Completed in #{Time.now - t}s"

  key_count *= 2
end
@pcorliss-provi I strongly suspect it's the same issue as #776. I don't have a mitigation for that yet, other than limiting the number of keys fetched at a time using
@petergoldstein Funny that I seem to have discovered on the same day as @pcorliss-provi that we are affected by this / #776 as well. Increasing I have started implementing interleaved reading/writing (which is what I suspect you mean by "batching internal to the client"), but it needs some polishing. If you want I can push a rough solution so we can have a discussion. I am especially not quite sure what you mean by "it's slightly more complicated because of the ring". For now, do you want to close this as duplicate in favor of #776?
@marvinthepa Yes, interleaved reading/writing is basically what I was thinking. The ring (the data structure that distributes among the memcached servers) makes it more complicated since the buffer problem is per-memcached server. So ideally the interleaving would occur per server, but that requires breaking the ring abstraction. So it may be simpler to ignore this detail and batch as if it's a single server. And yes, we can close this issue as a duplicate.
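To make the "per server" point concrete, here is a rough sketch of grouping keys by the server that owns them and capping each group, so the limit applies to each connection rather than to the call as a whole. The `server_for` lookup is a toy modulo hash and only a stand-in; it is not Dalli's ring, and both method names and the `per_server_limit` parameter are hypothetical.

```ruby
require 'zlib'

# Toy stand-in for a ring lookup. This is NOT Dalli's ring implementation;
# it only gives every key a stable owner so the grouping can be shown.
def server_for(key, servers)
  servers[Zlib.crc32(key) % servers.size]
end

# Group keys by owning server, then cut each group into slices of at most
# `per_server_limit` keys. Returns { server => [[key, ...], [key, ...]] }.
def per_server_batches(keys, servers, per_server_limit:)
  raise ArgumentError, 'per_server_limit must be positive' unless per_server_limit.positive?

  keys.group_by { |key| server_for(key, servers) }
      .transform_values { |group| group.each_slice(per_server_limit).to_a }
end
```

Batching "as if it's a single server" skips the grouping step and slices the full key list instead, which is simpler but sends smaller batches than necessary to each server when there are several.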
fixes petergoldstein#776. fixes petergoldstein#941.

When reading a large number of keys, memcached starts sending the response when dalli is not yet finished sending the request. As we did not start reading the response until we were finished writing the request, this could lead to the following problem:

* the receive buffer (rcvbuf) would fill up
* due to TCP backpressure, memcached would stop sending (okay as we are not reading anyway), but also stop reading
* the send buffer (sndbuf) would also fill up
* as we were using a blocking write without timeout, we would block forever (at least with Ruby < 3.2; Ruby 3.2 introduces IO#timeout, see petergoldstein#967)

This is addressed by using IO::select on the sockets for both read and write, and thus starting to read as soon as data is available.
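The select-based approach in this commit message can be illustrated with a small standalone sketch. It is not the code from the actual patch: `interleaved_exchange`, its `timeout` and `chunk` parameters, and the completion block are hypothetical names chosen for illustration, and only standard-library socket calls are used.

```ruby
require 'socket'

# Write `request` to `sock` while draining whatever the peer sends back,
# so neither side's socket buffer can fill up and stall the other.
# The block receives the response so far and returns true once it is complete.
# Hypothetical helper for illustration; not part of Dalli.
def interleaved_exchange(sock, request, timeout: 5, chunk: 16 * 1024)
  pending = request.b
  response = String.new(encoding: Encoding::BINARY)

  until pending.empty? && yield(response)
    # Only ask about writability while there is still something to send.
    writers = pending.empty? ? [] : [sock]
    readable, writable = IO.select([sock], writers, nil, timeout)
    raise IOError, "no socket progress within #{timeout}s" if readable.nil?

    unless readable.empty?
      data = sock.read_nonblock(chunk, exception: false)
      raise EOFError, 'peer closed the connection' if data.nil?
      response << data unless data == :wait_readable
    end

    unless writable.empty?
      written = sock.write_nonblock(pending, exception: false)
      pending = pending.byteslice(written..-1) unless written == :wait_writable
    end
  end

  response
end
```

Because `IO.select` returns `nil` when nothing becomes ready within the timeout, a stalled peer produces an error instead of an indefinite block, which is the failure mode the bullet list describes.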
We're seeing an issue where the dalli client hangs trying to read ~100-200K keys at once. We were able to repro with the following code and a cache store configuration.
It looks similar to #776 but wanted to confirm and potentially add our use case given the age of that issue. We're going to mitigate on our end by dropping the batch size to 10K keys per call.
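For anyone wanting the same mitigation, a minimal sketch of the batching follows. `batched_get_multi` is a hypothetical helper, not part of Dalli; it assumes only that `client` responds to `get_multi(keys)` and returns a Hash, as `Dalli::Client` does in the repro script in this thread. The 10K default mirrors the batch size mentioned above and is not a tuned value.

```ruby
# Fetch keys in fixed-size slices so that no single get_multi call sends
# more than `batch_size` keys to memcached before reading responses.
# Hypothetical helper: `client` is any object with get_multi(keys) => Hash.
def batched_get_multi(client, keys, batch_size: 10_000)
  raise ArgumentError, 'batch_size must be positive' unless batch_size.positive?

  keys.each_slice(batch_size).each_with_object({}) do |slice, results|
    # Each slice is a separate round trip; merge the partial results.
    results.merge!(client.get_multi(slice))
  end
end
```

With a real client this would be called as `batched_get_multi(dalli_client, keys)`. It trades one large round trip for several smaller ones, so it is slower than a working single call but keeps each request small.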
ruby 2.7.6p219
Rails 6.0.5.1
dalli (3.2.3)