kv/kvserver: split incoming snapshot user keys into multiple sstables #67284
Comments
I believe the only change required is to |
Ack, I can make the change myself, but I probably won't get to it for a little while.
We ingest a fixed number of sstables, corresponding to the range's various contiguous keyspaces. When the default range size increased from 64 MB to 512 MB, we started ingesting user data sstables up to 512 MB, which negatively impacts compaction. This change creates a new sstable if the current one is too large (greater than 128 MiB). Release note: None Fixes: cockroachdb#67284
This PR consists of two main changes. The first moves the `MultiSSTWriter` to package `storage`. The second is motivated as follows. Currently, when receiving a snapshot, we ingest a fixed number of sstables, corresponding to the range's various contiguous keyspaces. When the default range size increased from 64 MB to 512 MB, we started ingesting user data sstables up to 512 MB. These large files cause more expensive compactions. Because limiting the size of sstables means a range key can straddle the boundary between two sstables, this PR adds range key buffering and truncation to the existing MultiSSTWriter logic. Release note: None Informs: cockroachdb#67284
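To illustrate why splitting sstables requires range key truncation, here is a minimal sketch in Go. It is not the actual `MultiSSTWriter` or Pebble `keyspan.Fragmenter` code; the `span` type and the `truncateAt` helper are hypothetical names invented for this example, and keys are plain strings rather than MVCC keys. The idea it shows: a buffered range key `[start, end)` that straddles the point where one sstable ends must be cut in two, with the left piece flushed to the finished sstable and the right piece kept buffered for the next one.

```go
package main

import "fmt"

// span is a hypothetical half-open range key [start, end).
type span struct {
	start, end string
}

// truncateAt (hypothetical helper) cuts s at splitKey. The left piece
// belongs to the sstable being finished; the right piece stays buffered
// for the next sstable. The boolean results report which pieces exist.
func truncateAt(s span, splitKey string) (left, right span, hasLeft, hasRight bool) {
	switch {
	case splitKey <= s.start:
		// The whole span lies at or after the split point.
		return span{}, s, false, true
	case splitKey >= s.end:
		// The whole span lies before the split point.
		return s, span{}, true, false
	default:
		// The span straddles the split point: cut it in two.
		return span{start: s.start, end: splitKey},
			span{start: splitKey, end: s.end}, true, true
	}
}

func main() {
	left, right, hasLeft, hasRight := truncateAt(span{start: "b", end: "k"}, "f")
	fmt.Println(left, hasLeft, right, hasRight)
}
```

Because both pieces are half-open and share `splitKey` as a boundary, the two resulting sstables cover the original range key without overlapping each other.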
127997: kvserver: split snapshot SSTables for mvcc keys into multiple SSTs r=jbowens a=itsbilal Previously, we'd only create one sstable for all mvcc keys in a range when ingesting a rebalance/recovery snapshot into Pebble. This increased write-amp in Pebble as more sstables would have to be compacted into it (or the sstable then split into smaller ones in Pebble), and had other consequences such as massive filter blocks in the large singular sstable. This change adds a new cluster setting, kv.snapshot_rebalance.max_sst_size, that sets the max size of the sstables containing user/mvcc keys in a range. If an sstable exceeds this size in multiSSTWriter, we roll over that sstable and create a new one. Epic: CRDB-8471 Fixes: #67284 Release note (performance improvement): Reduce the write-amplification impact of rebalances by splitting snapshot sstable files into smaller ones before ingesting them into Pebble. Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
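The rollover behavior described above can be sketched roughly as follows. This is an illustration only, not the real `multiSSTWriter`: the `rollingWriter` type, its methods, and the `kv` type are hypothetical, sstables are modeled as in-memory slices, and `maxSize` merely stands in for the value of `kv.snapshot_rebalance.max_sst_size`. Whether the real check happens before or after a write is not stated in the PR description; this sketch checks after each write.

```go
package main

import "fmt"

// kv is a hypothetical key/value pair; real snapshot data uses MVCC keys.
type kv struct {
	key, value string
}

// rollingWriter is a hypothetical stand-in for the user-key part of a
// multi-sstable writer. Each inner slice models one sstable.
type rollingWriter struct {
	maxSize int    // stand-in for the configured max sstable size, in bytes
	curSize int    // bytes written to the current sstable
	ssts    [][]kv // finished sstables plus the one being written
}

func newRollingWriter(maxSize int) *rollingWriter {
	return &rollingWriter{maxSize: maxSize, ssts: [][]kv{nil}}
}

// put appends a pair to the current sstable, then rolls over to a new
// sstable once the current one has reached maxSize. Keys are assumed to
// arrive in sorted order, so the resulting sstables never overlap.
func (w *rollingWriter) put(e kv) {
	last := len(w.ssts) - 1
	w.ssts[last] = append(w.ssts[last], e)
	w.curSize += len(e.key) + len(e.value)
	if w.curSize >= w.maxSize {
		w.ssts = append(w.ssts, nil)
		w.curSize = 0
	}
}

// finish returns the sstables written so far, dropping a trailing empty one.
func (w *rollingWriter) finish() [][]kv {
	if last := len(w.ssts) - 1; len(w.ssts[last]) == 0 {
		return w.ssts[:last]
	}
	return w.ssts
}

func main() {
	w := newRollingWriter(8)
	for _, k := range []string{"a", "b", "c", "d", "e"} {
		w.put(kv{key: k, value: "xxx"})
	}
	for i, sst := range w.finish() {
		fmt.Printf("sst %d: %d keys, [%s, %s]\n", i, len(sst), sst[0].key, sst[len(sst)-1].key)
	}
}
```

With a tiny limit of 8 bytes and 4-byte entries, the five keys land in three sstables of two, two, and one keys. Each ingested file is bounded by roughly the limit plus one entry.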
Based on the specified backports for linked PR #127997, I applied the following new label(s) to this issue: branch-release-24.1, branch-release-24.2. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
This change exports the truncateAndFlush method in keyspan.Fragmenter. Necessary to unblock cockroachdb/cockroach#67284 .
Based on the specified backports for linked PR #134526, I applied the following new label(s) to this issue: branch-release-23.2.15-rc. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Currently, when receiving a snapshot, we ingest a fixed number of sstables, corresponding to the range's various contiguous keyspaces. When the default range size increased from 64 MB to 512 MB, we started ingesting user data sstables up to 512 MB. These large files cause more expensive compactions.
Here's a relevant TODO.
Related thread: https://cockroachlabs.slack.com/archives/CAC6K3SLU/p1686237663708959?thread_ts=1686084258.076069&cid=CAC6K3SLU
Related to cockroachdb/pebble#1181.
cc @sumeerbhola
Jira issue: CRDB-8471