vshard.storage.cfg new options - sched_ref_quota and sched_move_quota #2014

Closed
@TarantoolBot

vshard.storage.cfg new options: sched_ref_quota and sched_move_quota

Product: Tarantool, vshard (external module)
Since: 0.1.17 (vshard version; vshard is compatible with Tarantool versions >= 1.9)
Audience/target: developers; Tarantool implementation teams
Root document: https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_ref/
SME: @Gerold103
Peer reviewer: @

Details

There are new options for vshard.storage.cfg: sched_ref_quota
and sched_move_quota. The options control how much time should
be given to storage refs and bucket moves - two incompatible but
important operations.

Storage refs are used by the router's map-reduce API. Each
map-reduce call creates storage refs on all storages to prevent
data migration on them while the map stage executes.

Bucket moves are used by the rebalancer. A bucket move is data
migration, so moves and storage refs cannot be active on the same
storage at the same time.

If vshard always preferred one operation over the other, the
other one would starve. For example, if storage refs were always
preferred, rebalancing might never run, because under constant
map-reduce load there are always active refs. If bucket moves were
always preferred, storage refs (and therefore map-reduce) would
stop for the entire rebalancing time, which can be quite long
(hours or days).

The new options control how much time each operation gets.

sched_ref_quota sets how many storage refs (and therefore
map-reduce requests) can be executed on the storage in a row while
bucket moves are pending. Once the quota is reached, new refs are
blocked to let the moves work. The default value is 300.

sched_move_quota controls the same in the other direction: how
many bucket moves can be done in a row while refs are pending.
The default value is 1.

Map-reduce requests are expected to be much shorter than bucket
moves, so storage refs by default have a higher quota.
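
As an illustration, the options could be passed to
vshard.storage.cfg next to the rest of the storage configuration.
This is a sketch only: the replica set UUID, instance UUID, URI,
and name below are hypothetical placeholders, and the quota values
shown are just the defaults named above.

```lua
local vshard = require('vshard')

-- Hypothetical single-replicaset topology; replace the UUIDs, URI,
-- and name with the real ones from your cluster.
local cfg = {
    sharding = {
        ['aaaaaaaa-0000-4000-a000-000000000001'] = {
            replicas = {
                ['aaaaaaaa-0000-4000-a000-000000000011'] = {
                    uri = 'storage:storage@127.0.0.1:3301',
                    name = 'storage_1_a',
                    master = true,
                },
            },
        },
    },
    -- Up to 300 storage refs in a row while bucket moves are pending.
    sched_ref_quota = 300,
    -- Up to 1 bucket move in a row while storage refs are pending.
    sched_move_quota = 1,
}

-- The second argument is the UUID of the instance being configured.
vshard.storage.cfg(cfg, 'aaaaaaaa-0000-4000-a000-000000000011')
```

A storage that serves mostly map-reduce traffic might raise
sched_ref_quota, while a cluster that has to finish rebalancing
quickly might raise sched_move_quota.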

Here is an example with the default quotas. Assume map-reduce
requests start and execute one after another, 150 in a row. Now
the rebalancer wakes up and wants to move some buckets. It joins
a queue and waits for the storage refs to be gone.

The ref quota is not reached yet, so the storage can still
execute 150 more map-reduce requests even with the bucket moves
queued. After that, new refs are blocked and the moves start.
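
The arithmetic of this example can be sketched as a toy model in
plain Lua. This is not vshard's scheduler code: new_sched, try_ref,
and try_move are hypothetical names, and the model assumes that
each side simply counts how many operations it was granted in a
row and resets the other side's counter. It ignores timeouts and
the wait for already active refs to finish.

```lua
-- Toy model of the quota idea. All names here are hypothetical.
local function new_sched(ref_quota, move_quota)
    return {
        ref_quota = ref_quota,
        move_quota = move_quota,
        ref_strike = 0,  -- refs granted in a row
        move_strike = 0, -- moves granted in a row
    }
end

-- Try to start a storage ref. It is refused only when bucket moves
-- are pending and the ref quota is already used up.
local function try_ref(s, moves_pending)
    if moves_pending and s.ref_strike >= s.ref_quota then
        return false
    end
    s.ref_strike = s.ref_strike + 1
    s.move_strike = 0
    return true
end

-- Try to start a bucket move. It is refused only when refs are
-- pending and the move quota is already used up.
local function try_move(s, refs_pending)
    if refs_pending and s.move_strike >= s.move_quota then
        return false
    end
    s.move_strike = s.move_strike + 1
    s.ref_strike = 0
    return true
end

local s = new_sched(300, 1)

-- 150 map-reduce requests run while no moves are pending.
for _ = 1, 150 do
    assert(try_ref(s, false))
end

-- The rebalancer queues a move; count the refs still allowed.
local extra = 0
while try_ref(s, true) do
    extra = extra + 1
end
print(extra) -- 150

-- One move is granted; a second one in a row has to yield to refs.
local first_move = try_move(s, true)
local second_move = try_move(s, true)
print(first_move, second_move) -- true false
```

With the default quotas the model alternates between up to 300
refs and 1 bucket move whenever both kinds of work are waiting.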

Definition of done

  • Describe the new vshard configuration options: sched_ref_quota and sched_move_quota.
  • Add links to the description of map_callrw()

Labels

  • 3.0
  • ecosystem: [area] Task relates to Tarantool's ecosystem (connector, module, other non-server functionality)
  • feature: A new functionality
  • reference: [location] Tarantool manual, Reference part
  • vshard: [area] Related to vshard module