-
Notifications
You must be signed in to change notification settings - Fork 392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vnode proxy stuck in overload state #767
Conversation
…box size. When starting up fallbacks that receive a lot of load (e.g. in riak_kv when a node running bitcask restarts it only starts primaries before marking the service as up, fallbacks are started on demand and if there is a large keydir already on the node, it may take several minutes to start while queuing up requests). The vnode is receiving messages from other sources than the vnode proxy (possibly messages to trigger handoff from the vnode manager if the primary recovers) and does not respond to the ping message before the direct message queue length check is invoked. The msgq is over the threshold and leaves the vnode proxy in overload, but also ignores the eventual response from the ping message. If enough requests go through the vnode proxy it will do another direct check, however if this happens after the primary is back in service then the message volume is much lower (probably just handoff folds), significantly delaying (possibly by many hours) the start of a handoff. This fix immediately sends a ping message after a direct check as the vnode was overloaded enough not to respond to the ping in 2500 requests. When the vnode responds to the ping message it will adjust the mailbox check and reset the counter for a new soft check. This should bring the vnode proxy out of overload much sooner. Here's a reproducer to play with the proxy code. ``` %% Provoke the nastiness PPid = whereis(proxy_riak_kv_vnode_0). {ok, VPid} = riak_core_vnode_proxy:command_return_vnode({riak_kv_vnode,0,node()}, timeout). Report = fun() -> io:format("PPid = ~p with ~p messages. VPid = ~p with ~p messages\n", [PPid, element(2, process_info(PPid, message_queue_len)), VPid, element(2, process_info(VPid, message_queue_len))]) end. sys:suspend(PPid), sys:suspend(VPid), [spawn(fun() -> riak_kv_vnode:local_get(0, {<<"b">>,<<"k">>}) end) || X <- lists:seq(1,15000)]. Report(). [spawn(fun() -> catch gen_fsm:sync_send_event(VPid, {ohai, X}, 1) end) || X <- lists:seq(1,15000)]. Report(). sys:resume(PPid). %%% Pause here and enjoy the misery. lists:usort([riak_kv_vnode:local_get(0, {<<"b">>,<<"k">>}) || X <- lists:seq(1,2498)]). sys:resume(VPid). Report(). Report(). Report(). Report(). Report(). rr(riak_core_vnode_proxy). sys:get_status(PPid). % look for check_mailbox > 0 ```
after recovery until possibly thousands of messages are sent. As the mailbox is only an estimate, simplify by not including the vnode proxy ping/pong in it - there is no need. The ping/pong mechanism will now just track proxied requests being completed. Previously the mailbox estimate grew incorrectly when in the overload state. Now the estimate is only updated when messages are proxied. If the proxy has to perform a hard check on the msgq len, *do* include the ping message as part of it so that the estimate will be correct once the ping/pong happens. QuickCheck test added for riak_test to exercise more than the unit tests.
You can't retry until you 👍 a commit. |
👍 c322af3 |
@@ -317,6 +326,10 @@ fake_loop() -> | |||
{get_count, Pid} -> | |||
Pid ! {count, erlang:get(count)}, | |||
fake_loop(); | |||
%% Original tests do not expect replies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually added this for future souls on purpose as I couldn't work out where my replies were. I'll improve the comment.
…erload-v2 Vnode proxy stuck in overload state Reviewed-by: cmeiklejohn
…erload-v2 Vnode proxy stuck in overload state Reviewed-by: cmeiklejohn
@cmeiklejohn updated comments per review. Would you mind +1ing the new SHA? |
Stupid racey test
Put the limit back so it cannot overload. EQC test coverage is much better anyway. |
On the builders, sending more than threshold messages could cause overload. This should *never* overload.
@cmeiklejohn once more, with feeling please :) |
👍 63ff909 |
…erload-v2 Vnode proxy stuck in overload state Reviewed-by: cmeiklejohn
…erload-v2 Vnode proxy stuck in overload state Reviewed-by: cmeiklejohn
@borshop merge |
Fix for #760 (RIAK-1914).
Previously, if a heavily loaded vnode received more than
check_threshold
msgs after it hit the direct mailbox check, the ping/pong would leave it in overload state. In cases where the load was removed (e.g. primary recovered) the vnode proxy would stay in overload state until enough messages were received to trigger pinging the vnode or rechecking the mailbox.This PR does a couple of things
overloaded/1
function to make testing easier.There is corresponding riak_test on branch bugfix/jdm/vnode-proxy-stuck-in-overload-v2
Customer ticket zd://11440