-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: rework split and scatter mechanism #63471
Conversation
0f88201
to
0383802
Compare
Adding a few results from benchmarks for this diff. AddSSTable Request RateNote that we still see some nodes (unexpectedly) finish processing their export requests much sooner than other nodes. This manifests as restores slowing down towards the end. This will be addressed separately. AdminSplit Request Rate |
ef15083
to
9a8e318
Compare
This commit filters out importSpans that have no associated data. File spans' start keys are set to be the .Next() of the EndKey of the previous file span. This was at least the case on the backup that is currently being run against restore2TB. This lead to a lot of importSpan entries that contained no files and would cover an empty key range. The precense of these import spans did not effect the performance of restores that much, but they did cause unnecessary splits and scatters which further contributed to the elevated NLHEs that were seen. This commit generally calms down the leaseholder errors, but do not eliminate them entirely. Future investigation is still needed. It also does not seem to have any performance impact on RESTORE performance (in terms of speed of the job). Release note: None
33e3447
to
51c0b85
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @pbardea)
pkg/ccl/backupccl/split_and_scatter_processor_test.go, line 174 at r2 (raw file):
2: 2, // Entry 3, Entry 8 3: 1, // Entry 4 // 4: 0 // Entry 4 gets redirected to stream 0 since stream 4 does not exist.
nit: this should be Entry 5 gets redirected to stream 0, but you have CI passing so no need to re-push 🙂
This change reworks how RESTORE splits and scatter's in 2 ways: 1. This commits updates the split and scatter mechanism that restore uses to split at the key of the next chunk/entry rather than at the current chunk/entry. This resolves a long-standing TODO that updates the split and scattering of RESTORE to perform a split at the key of the next chunk/entry. Previously, we would split at a the start key of the span, and then scatter that span. Consider a chunk with entries that we want to split and scatter. If we split at the start of each entry and scatter the entry we just split off (the right hand side of the split), we'll be continuously scattering the suffix keyspace of the chunk. It's preferrable to split out a prefix first (by splitting at the key of the next chunk) and scattering the prefix. We additionally refactor the split and scatter processor interface and stop scattering the individual entries, but rather just split them. 2. Individual entries are no longer scattered. They used to be scattered with the randomizeLeases option set to false, but there was not any indication that this was beneficial to performance, so it was removed for simplicity. Release note: None
Backportable to 21.1? |
@shermanCRL I believe this is. I also think #64067 could be a good candidate to backport. Will wrangle them up today. bors r=adityamaru |
Build failed (retrying...): |
Build succeeded: |
64656: release-21.1: improve restore performance consistency r=dt a=pbardea Backport: * 2/2 commits from "backupccl: rework split and scatter mechanism" (#63471) * 1/1 commits from "backupccl: fix a bug in routing scattered ranges" (#63875) * 1/1 commits from "backupccl: create more chunks on larger clusters" (#64067) Please see individual PRs for details. /cc @cockroachdb/release Added do-not-merge label until release-21.1 opens for 21.1.1 backports once the release is out. Co-authored-by: Paul Bardea <pbardea@gmail.com>
backupccl: don't include empty import entries during restore
This commit filters out importSpans that have no associated data.
File spans' start keys are set to be the .Next() of the EndKey of the
previous file span. This was at least the case on the backup that is
currently being run against restore2TB. This lead to a lot of importSpan
entries that contained no files and would cover an empty key range. The
precense of these import spans did not effect the performance of
restores that much, but they did cause unnecessary splits and scatters
which further contributed to the elevated NLHEs that were seen.
This commit generally calms down the leaseholder errors, but do not
eliminate them entirely. Future investigation is still needed. It also
does not seem to have any performance impact on RESTORE performance (in
terms of speed of the job).
Release note: None
backupccl: rework split and scatter mechanism
This change reworks how RESTORE splits and scatter's in 2 ways:
This commits updates the split and scatter mechanism that restore uses
to split at the key of the next chunk/entry rather than at the current
chunk/entry.
This resolves a long-standing TODO that updates the split and scattering
of RESTORE to perform a split at the key of the next chunk/entry.
Previously, we would split at a the start key of the span, and then
scatter that span.
Consider a chunk with entries that we want to split and scatter. If we
split at the start of each entry and scatter the entry we just split off
(the right hand side of the split), we'll be continuously scattering the
suffix keyspace of the chunk. It's preferrable to split out a prefix
first (by splitting at the key of the next chunk) and scattering the
prefix.
We additionally refactor the split and scatter processor interface and
stop scattering the individual entries, but rather just split them.
Individual entries are no longer scattered. They used to be scattered
with the randomizeLeases option set to false, but there was not any
indication that this was beneficial to performance, so it was removed
for simplicity.
Release note: None