Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Snapshot bandwidth "priority inversion" #15274

Closed
bdarnell opened this issue Apr 23, 2017 · 3 comments
Closed

storage: Snapshot bandwidth "priority inversion" #15274

bdarnell opened this issue Apr 23, 2017 · 3 comments
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@bdarnell
Copy link
Contributor

Snapshots are currently placed into two categories for bandwidth management, which effectively act as priorities. However, since we also only allow only one snapshot at a time (per target node), we have problems with priority inversion - a high-priority operation is not allowed to interrupt an existing low-priority operation that may take a while to finish. We should introduce some way to interrupt low-priority rebalance operations when they compete with high-priority repairs (unless this entire mechanism is reworked as discussed in #14768)

@bdarnell bdarnell added this to the 1.1 milestone Apr 23, 2017
@petermattis
Copy link
Collaborator

An alternative to interrupting a low-priority snapshot would be to adjust its bandwidth dynamically. I'm wondering if this is a real problem to solve, though. Recovery operations are already prioritized over rebalance operations, so the most a recovery operation will have to wait is for one rebalance operation to finish.

@bdarnell
Copy link
Contributor Author

Yeah, I think this is probably a theoretical concern for now. It will be a bigger issue when/if we increase the max range size since "one rebalance operation" could take longer.

@bdarnell bdarnell modified the milestones: Later, 1.1 Aug 14, 2017
@bdarnell bdarnell added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-kv-replication Relating to Raft, consensus, and coordination. labels Apr 26, 2018
@tbg tbg added A-coreperf and removed A-disaster-recovery A-kv-transactions Relating to MVCC and the transactional model. A-kv-distribution Relating to rebalancing and leasing. A-kv-client Relating to the KV client and the KV interface. A-storage Relating to our storage engine (Pebble) on-disk storage. A-kv-replication Relating to Raft, consensus, and coordination. labels Jul 31, 2018
@petermattis petermattis removed this from the Later milestone Oct 5, 2018
@tbg
Copy link
Member

tbg commented Oct 11, 2018

Folding into #14768.

@tbg tbg closed this as completed Oct 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

No branches or pull requests

3 participants