Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
kvserver: lower priority level for mvcc gc work
GC could be expected to be LowPri, so that it does not impact user-facing traffic when resources (e.g. CPU, write capacity of the store) are scarce. However long delays in GC can slow down user-facing traffic due to more versions in the store, and can increase write amplification of the store since there is more live data. Ideally, we should adjust this priority based on how far behind we are with respect to GC-ing a particular range. Keeping the priority level static at NormalPri proved disruptive when a large volume of MVCC GC work is suddenly accrued (if an old protected timestamp record was just released for ex. following a long paused backup job being completed/canceled, or just an old, long running backup job finishing). After dynamic priority adjustment, it's not yet clear whether we need additional pacing mechanisms to provide better latency isolation, similar to ongoing work for backups. MVCC GC work is CPU intensive: \#82955. This patch is also speculative in nature and in response to observed incidents where NormalPri proved too disruptive. Fuller treatment would entail working off of reliable reproductions of this behaviour. Release note (performance improvement): Previously if there was sudden increase in the volume of pending MVCC GC work, there was an impact on foreground latencies. These sudden increases commonly occurred when: - gc.ttlseconds was reduced dramatically over tables/indexes that accrue a lot of MVCC garbage (think "rows being frequently deleted") - a paused backup job from a while ago (think > 1 day) was canceled/failed - a backup job that started a while ago (think > 1 day) just finished Indicators of a large increase in the volume of pending MVCC GC work is a steep climb in the "GC Queue" graph found in the DB console page, when navigating to 'Metrics', and selecting the 'Queues' dashboard. With this patch, the effect on foreground latencies as a result of this sudden build up should be reduced.
- Loading branch information