Thanks to the value blocks work, we now have base.LazyValue threaded throughout the iterator stack. I suspect there's an opportunity to flush less frequently, which can help reduce write-amp (especially in the context of CockroachDB, where delaying flushes makes raft log truncation more likely to drop keys before they're flushed).
A couple high-level thoughts:
If a value is large and compresses easily, a batch could compress it before entering the commit pipeline. When inserting into the memtable skiplist, it could set a flag on the node indicating the value is compressed and copy the smaller, compressed value into the arena. The value would need to be decompressed during flush, and during reads, but only when that individual KV is returned from the pebble.Iterator.
Similar to generalized blob storage (#112: db: blob storage / WiscKey-style value separation), the memtable entry could encode the position of the large value in the WAL. In the rare case that the value must be read before it's flushed, reads would suffer I/O. Additionally, at flush time these values would need to be read back from the WAL. This is complicated by the WAL's / record package's framing, which may split a value across frames/blocks. In those cases the memtable value could encode a list of (offset, length) tuples.
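One way the (offset, length) tuple list might be packed into a memtable value is with varints, sketched below. The walSlice type and the encoding layout are illustrative assumptions, not Pebble's record-package format.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// walSlice identifies one fragment of a value within the WAL. The record
// package's framing may split a single value across blocks, producing
// several fragments. (Hypothetical type, for illustration only.)
type walSlice struct {
	offset, length uint64
}

// encodeWALValue packs a fragment count followed by (offset, length)
// pairs into a compact memtable value using uvarints.
func encodeWALValue(slices []walSlice) []byte {
	buf := binary.AppendUvarint(nil, uint64(len(slices)))
	for _, s := range slices {
		buf = binary.AppendUvarint(buf, s.offset)
		buf = binary.AppendUvarint(buf, s.length)
	}
	return buf
}

// decodeWALValue reverses encodeWALValue, e.g. at flush time when the
// fragments must be read back from the WAL and reassembled.
func decodeWALValue(b []byte) []walSlice {
	n, sz := binary.Uvarint(b)
	b = b[sz:]
	out := make([]walSlice, 0, n)
	for i := uint64(0); i < n; i++ {
		off, sz := binary.Uvarint(b)
		b = b[sz:]
		l, sz := binary.Uvarint(b)
		b = b[sz:]
		out = append(out, walSlice{offset: off, length: l})
	}
	return out
}

func main() {
	// A value split across two WAL blocks by record framing.
	in := []walSlice{{offset: 4096, length: 32000}, {offset: 36864, length: 768}}
	enc := encodeWALValue(in)
	dec := decodeWALValue(enc)
	fmt.Println(len(dec) == 2, dec[0] == in[0], dec[1] == in[1])
}
```

A varint encoding keeps the common single-fragment case to a few bytes, which matters since the whole point is to avoid copying the large value into the arena.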
Jira issue: PEBBLE-63