Partitions waiting to handoff indefinitely #1135
Given that these are hinted handoffs, I think it would be expected that they are handoffs from secondary partitions (i.e. fallback vnodes that were temporarily created to maintain n_val during an outage). There has been a lot of work in the last few versions of Riak to improve handoff reliability, as there were a lot of problems with handoff timeouts, particularly when handoffs occur during busy periods or when vnodes are particularly large.

In your version, the first thing is probably to reduce the riak_core handoff_acksync_threshold across your cluster; this reduces the number of batches between acknowledgements. There may also be value in increasing the riak_core handoff_timeout and the riak_core handoff_receive_vnode_timeout across the cluster. These changes can all be made via riak attach and application:set_env (which will take effect from the next handoff), and the same settings can be added to advanced.config (which will take effect following a restart). Finally, if you have increased the riak_core handoff_concurrency from the default setting, there may be value in reducing it back to the default.

Monitoring of these handoffs has been improved in recent versions; working out exactly what is going wrong in older Riak versions is hard. When a handoff fails, it re-sends all the data from the beginning, so if the fallback vnodes were created as part of an extended outage (and are quite large) then continuous failures are possible. If you are confident that all the data is sufficiently covered in your cluster (due to other replicas and anti-entropy mechanisms), in the worst case scenario you can stop each node in turn and manually delete the fallback vnodes. Obviously though, it would be more sustainable to find a configuration which will work for future handoffs.
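For reference, a minimal sketch of what those changes look like. The parameter names are the ones mentioned above, but the values are illustrative placeholders (not recommendations), so pick them based on your own vnode sizes and outage history:

```erlang
%% From the Erlang shell opened by `riak attach`, on each node.
%% These take effect from the next handoff attempt; values below are illustrative.
application:set_env(riak_core, handoff_acksync_threshold, 1).
application:set_env(riak_core, handoff_timeout, 120000).               %% ms
application:set_env(riak_core, handoff_receive_vnode_timeout, 120000). %% ms
application:set_env(riak_core, handoff_concurrency, 2).                %% only if previously raised
```

The equivalent advanced.config stanza (picked up at the next restart) is the same keys under the riak_core application:

```erlang
[
 {riak_core, [
   {handoff_acksync_threshold, 1},
   {handoff_timeout, 120000},
   {handoff_receive_vnode_timeout, 120000},
   {handoff_concurrency, 2}
 ]}
].
```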
Thanks Martin, I'll try these config changes and see how it goes. Will keep you updated.
I made some changes via riak attach and application:set_env.
The transfer seems to be in progress, but I don't understand how to fix this riak_core_ring:check_tainted error. I need your help again, thanks.
I don't really know. I believe the tainted flag was added so that, before a read-only cache of the ring is exported (using mochiglobal), it is marked as tainted, so it can be confirmed that such a cached ring is never mistakenly used as the version from which to make an updated ring - i.e. code should update the ring obtained from get_raw_ring, not get_my_ring. So the tainted state and the error messages were a check to make sure this never happens. But clearly, in some rare circumstance it can.

Because of this, the unset_tainted function was added so that this could be fixed from remote_console ... but that isn't available in older versions of Riak. If the error logs don't go away, there might be another method to clear this status. I don't think it will work, but perhaps ...
I'm running a cluster with 24 nodes and 1024 partitions.
riak_kv_version : <<"2.1.7-226">>
riak version : <<"2.0.5">>
I have 142 partitions waiting to handoff for more than 30 days. There's no ongoing transfer in the cluster.
On node riak@0037-internal.xx.com, I can see these error messages:
<0.30120.441>@riak_core_handoff_sender:start_fold:282 hinted transfer of riak_kv_vnode from 'riak@0037-internal.xx.com' 994791641816054037097625320706298110058774396928 to 'riak@0029-internal.xx.com' 994791641816054037097625320706298110058774396928 failed because of error:{badmatch,{error,closed}} [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,132}]}]
<0.9143.441>@riak_core_handoff_sender:start_fold:282 hinted transfer of riak_kv_vnode from 'riak@0037-internal.xx.com' 616571003248974668617179538802181898917346541568 to 'riak@0035-internal.xx.com' 616571003248974668617179538802181898917346541568 failed because of error:{badmatch,{error,closed}} [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,132}]}]
When I check the partitions list (riak-admin cluster partitions), I notice that all partitions which are waiting for handoff are marked as secondary. I was expecting those partitions to be of type primary.
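For reference, the state above was checked with the standard admin commands, along these lines (exact output and sub-command options may differ slightly between releases):

```sh
# cluster-wide view of active and pending handoffs/transfers
riak-admin transfers

# per-partition ownership listing, including primary vs secondary (fallback) status
riak-admin cluster partitions
```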
Any idea about how to fix this issue?