vshard.storage.cfg new options - sched_ref_quota and sched_move_quota #2014

# vshard.storage.cfg new options: sched_ref_quota and sched_move_quota

**Product:** Tarantool, vshard (external module)
**Since:** 0.1.17 (vshard version; vshard is compatible with Tarantool versions >= 1.9)
**Audience/target:** developers; Tarantool implementation teams
**Root document:** https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_ref/
**SME:** @Gerold103
**Peer reviewer:** @

# Details

There are new options for `vshard.storage.cfg`: `sched_ref_quota`
and `sched_move_quota`. The options control how much time should
be given to storage refs and bucket moves - two incompatible but
important operations.

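Storage refs are used by the router's map-reduce API
([map_callrw()](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_router/#router-api-map-callrw)).
Each map-reduce call creates storage refs on all storages to
prevent data migration on them for the duration of the map stage.

Bucket moves are used by the rebalancer. Because storage refs
block data migration, bucket moves are incompatible with them.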

If vshard always preferred one operation over the other, the
other one would starve. For example, if storage refs were always
preferred, rebalancing might never run under constant map-reduce
load, because there would always be active refs. If bucket moves
were always preferred, storage refs (and therefore map-reduce)
would stop for the entire rebalancing time, which can be quite
long (hours or days).

The new options control how much time each operation gets.

`sched_ref_quota` sets how many storage refs (and therefore
map-reduce requests) can be executed on the storage in a row while
bucket moves are pending, before new refs are blocked to let the
moves run. The default value is 300.

`sched_move_quota` does the same in the opposite direction: it
sets how many bucket moves can be done in a row while refs are
pending. The default value is 1.

Map-reduce requests are expected to be much shorter than bucket
moves, so storage refs by default have a higher quota.
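
As a rough illustration, the two options could be passed to `vshard.storage.cfg` along with the rest of the storage configuration. This is a minimal sketch rather than a recommended setup: the quota values are simply the defaults quoted above, and the replica set UUID, instance UUID, URI, and instance name are made-up placeholders.

```lua
local vshard = require('vshard')

-- Placeholder topology: the UUIDs, URI and name below are hypothetical
-- and must be replaced with the values of a real cluster.
local cfg = {
    -- Up to 300 storage refs in a row while bucket moves are pending
    -- (the default value).
    sched_ref_quota = 300,
    -- Up to 1 bucket move in a row while storage refs are pending
    -- (the default value).
    sched_move_quota = 1,
    sharding = {
        ['aaaaaaaa-0000-4000-a000-000000000000'] = {
            replicas = {
                ['aaaaaaaa-0000-4000-a000-000000000001'] = {
                    uri = 'storage:storage@127.0.0.1:3301',
                    name = 'storage_1_a',
                    master = true,
                },
            },
        },
    },
}

-- The second argument is the UUID of the instance being configured.
vshard.storage.cfg(cfg, 'aaaaaaaa-0000-4000-a000-000000000001')
```

Raising `sched_ref_quota` relative to `sched_move_quota` favors map-reduce requests over rebalancing; lowering it does the opposite.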

Here is an example of how it works. Assume map-reduce requests
start. They execute one after another, 150 requests in a row. Now
the rebalancer wakes up and wants to move some buckets. It joins
a queue and waits for the storage refs to be gone.

But the ref quota is not reached yet, so the storage can still
execute 150 more map-reduce requests even with the queued bucket
moves. After that, new refs are blocked and the moves start.
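
The arithmetic of this scenario can be reproduced with a toy model in plain Lua. It is only a sketch of the quota rule as described in this issue, not vshard's actual scheduler code; the names `ref_strike`, `move_queue`, and `try_ref` are hypothetical and chosen for illustration.

```lua
-- Toy model of the ref quota rule; not vshard's implementation.
local sched = {
    ref_quota = 300,  -- default sched_ref_quota from the text
    ref_strike = 0,   -- refs admitted in a row (hypothetical name)
    move_queue = 0,   -- bucket moves waiting in the queue (hypothetical name)
}

-- Returns true if a new storage ref may start now, false if it must
-- wait so that the queued bucket moves can run.
local function try_ref(s)
    if s.move_queue > 0 and s.ref_strike >= s.ref_quota then
        return false
    end
    s.ref_strike = s.ref_strike + 1
    return true
end

-- 150 map-reduce requests run before the rebalancer wakes up.
for _ = 1, 150 do
    assert(try_ref(sched))
end

-- The rebalancer queues one bucket move.
sched.move_queue = 1

-- New refs are still admitted until the quota is used up.
local admitted = 0
while try_ref(sched) do
    admitted = admitted + 1
end
print(admitted) -- prints 150
```

In this model the quota counts all refs taken in a row, including those taken before the move was queued, which matches the 150 + 150 split in the example.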

*   Related development issues and/or commits: https://github.com/tarantool/vshard/commit/17b6d28aba302701a13e5a189123100afad8de0a

# Definition of done

- [ ] Describe new `vshard` configuration options: `sched_ref_quota` and `sched_move_quota`.
- [ ] Add links to the description of [map_callrw()](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_router/#router-api-map-callrw)

