storage: Snapshot bandwidth "priority inversion" #15274
Comments
An alternative to interrupting a low-priority snapshot would be to adjust its bandwidth dynamically. I'm wondering if this is a real problem to solve, though. Recovery operations are already prioritized over rebalance operations, so the most a recovery operation will have to wait is for one rebalance operation to finish.
Yeah, I think this is probably a theoretical concern for now. It will be a bigger issue when/if we increase the max range size since "one rebalance operation" could take longer.
Folding into #14768.
Snapshots are currently placed into two categories for bandwidth management, which effectively act as priorities. However, since we also allow only one snapshot at a time (per target node), we have problems with priority inversion: a high-priority operation is not allowed to interrupt an existing low-priority operation that may take a while to finish. We should introduce some way to interrupt low-priority rebalance operations when they compete with high-priority repairs (unless this entire mechanism is reworked as discussed in #14768).