backupccl: rework split and scatter mechanism #63471

This commit filters out importSpans that have no associated data.

File spans' start keys are set to be the .Next() of the EndKey of the previous file span. This was at least the case on the backup that is currently being run against restore2TB. This led to many importSpan entries that contained no files and covered an empty key range.

The presence of these import spans did not affect restore performance much, but they did cause unnecessary splits and scatters, which further contributed to the elevated NLHEs (NotLeaseHolderErrors) that were seen.

This commit generally calms down the leaseholder errors, but does not eliminate them entirely; further investigation is still needed. It also does not appear to affect RESTORE performance (in terms of job speed).

Release note: None
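A minimal sketch of the filtering idea described in this commit, not the actual backupccl code. The `importSpan` struct, its fields, and `filterEmptySpans` are hypothetical, simplified stand-ins chosen for illustration; keys are plain strings here rather than real key types.

```go
package main

import "fmt"

// importSpan is a hypothetical, simplified stand-in for a restore import
// span: a key range plus the backup files that hold data for it.
type importSpan struct {
	startKey, endKey string
	files            []string
}

// filterEmptySpans keeps only the spans that reference at least one file.
// Spans without files cover an empty key range, so splitting and
// scattering them would be wasted work.
func filterEmptySpans(spans []importSpan) []importSpan {
	nonEmpty := make([]importSpan, 0, len(spans))
	for _, s := range spans {
		if len(s.files) > 0 {
			nonEmpty = append(nonEmpty, s)
		}
	}
	return nonEmpty
}

func main() {
	spans := []importSpan{
		{startKey: "a", endKey: "c", files: []string{"1.sst"}},
		// Gap between one file's EndKey and the next file's start key
		// (EndKey.Next()): no file covers it.
		{startKey: "c", endKey: "c\x00"},
		{startKey: "c\x00", endKey: "f", files: []string{"2.sst"}},
	}
	for _, s := range filterEmptySpans(spans) {
		fmt.Printf("%q-%q: %v\n", s.startKey, s.endKey, s.files)
	}
}
```

The middle span in `main` models the gap that arises when a file span starts at the `.Next()` of the previous file span's EndKey; it is the only one dropped.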

This change reworks how RESTORE splits and scatters in 2 ways:

1. This commit updates the split and scatter mechanism that restore uses to split at the key of the next chunk/entry rather than at the current chunk/entry. This resolves a long-standing TODO.

   Previously, we would split at the start key of the span, and then scatter that span. Consider a chunk with entries that we want to split and scatter. If we split at the start of each entry and scatter the entry we just split off (the right-hand side of the split), we'll be continuously scattering the suffix keyspace of the chunk. It's preferable to split out a prefix first (by splitting at the key of the next chunk) and scatter the prefix.

   We additionally refactor the split and scatter processor interface.

2. Individual entries are no longer scattered; they are only split. They used to be scattered with the randomizeLeases option set to false, but there was no indication that this was beneficial to performance, so it was removed for simplicity.

Release note: None
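The ordering described in this commit message can be sketched as below. This is an illustration under stated assumptions, not the processor's real interface: `entry` and `planSplitAndScatter` are hypothetical names, keys are plain strings, every chunk is assumed to be non-empty, and the operations are only recorded as strings rather than issued against a cluster. The sketch also assumes the final chunk is scattered without a preceding split, since there is no next chunk to split at.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for one restore span entry.
type entry struct {
	startKey, endKey string
}

// planSplitAndScatter records the order of operations for a list of
// chunks, each assumed to hold at least one entry.
func planSplitAndScatter(chunks [][]entry) []string {
	var ops []string
	for i, chunk := range chunks {
		// Split at the start of the NEXT chunk so the current chunk becomes
		// a prefix range, then scatter that prefix.
		if i+1 < len(chunks) {
			ops = append(ops, "split@"+chunks[i+1][0].startKey)
		}
		ops = append(ops, "scatter@"+chunk[0].startKey)
		// Within the chunk, entries are only split, again at the start key
		// of the next entry. They are not scattered individually.
		for j := range chunk {
			if j+1 < len(chunk) {
				ops = append(ops, "split@"+chunk[j+1].startKey)
			}
		}
	}
	return ops
}

func main() {
	chunks := [][]entry{
		{{"a", "b"}, {"b", "c"}},
		{{"c", "d"}, {"d", "e"}},
	}
	for _, op := range planSplitAndScatter(chunks) {
		fmt.Println(op)
	}
}
```

Because each scatter targets a range that has already been cut off on its right-hand side, the remaining suffix of the keyspace is not moved repeatedly.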

Commits on Apr 22, 2021

Commits on Apr 23, 2021