Improve description of write amp in storage engine #19535

Open
wants to merge 1 commit into
base: main
2 changes: 1 addition & 1 deletion src/current/v25.1/architecture/storage-layer.md
@@ -112,7 +112,7 @@ A certain amount of read amplification is expected in a normally functioning Coc

<a name="write-amplification"></a>

-_Write amplification_ is more complicated than read amplification, but can be defined broadly as: "how many physical files am I rewriting during compactions?" For example, if the storage engine is doing a lot of [compactions](#compaction) in L5, it will be rewriting SST files in L5 over and over again. This is a tradeoff, since if the engine doesn't perform compactions often enough, the size of L0 will get too large, and an inverted LSM will result, which also has ill effects.
+_Write amplification_ measures the volume of data written to disk, relative to the volume of data logically committed to the storage engine. If you commit a value to the storage engine, CockroachDB writes it once to the [write-ahead log (WAL)](#memtable-and-write-ahead-log). Then it is written again when CockroachDB flushes it to an [SSTable](#ssts). Then CockroachDB will write it several additional times as a part of [compactions](#compaction) over the lifetime of the value. Most write amplification (and write bandwidth more broadly) originates from compactions. This is a tradeoff, since if the storage engine doesn't perform compactions often enough, the size of [L0](#lsm-levels) will get too large, and an inverted LSM will result, which also has ill effects. By contrast, writes to the WAL are a small fraction of a [store]({% link {{ page.version.version }}/cockroach-start.md %}#store)'s overall write bandwidth and IOPs.

Read amplification and write amplification are key metrics for LSM performance. Neither is inherently "good" or "bad", but they must not occur in excess, and for optimum performance they must be kept in balance. That balance involves tradeoffs.
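
The ratio in the revised definition can be illustrated with a minimal sketch. It assumes hypothetical byte counts: the function name, its parameters, and the figure of five compaction rewrites are illustrative only and are not taken from CockroachDB's metrics or APIs.

```python
def write_amplification(logical_bytes, wal_bytes, flush_bytes, compaction_bytes):
    """Return physical bytes written divided by bytes logically committed.

    All four arguments are hypothetical inputs for illustration; they are
    not names of real CockroachDB metrics.
    """
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    # Physical writes: one WAL write, one flush to an SSTable, and
    # every rewrite performed by compactions over the value's lifetime.
    physical_bytes = wal_bytes + flush_bytes + compaction_bytes
    return physical_bytes / logical_bytes


# Assumed scenario: 100 bytes committed, written once to the WAL, once at
# flush, and rewritten by five compactions (5 * 100 = 500 bytes).
example = write_amplification(
    logical_bytes=100,
    wal_bytes=100,
    flush_bytes=100,
    compaction_bytes=5 * 100,
)
print(example)  # 7.0
```

In this assumed scenario, compactions account for 500 of the 700 physical bytes, which matches the new text's point that most write amplification originates from compactions rather than from the WAL.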
