admission: control snapshot ingest disk write bandwidth #120708
Labels: A-admission-control, C-enhancement (solution expected to add code/behavior and preserve backward compatibility; pg compat issues are the exception)
Pre-requisite: #86857 (5 tasks)
aadityasondhi added a commit to aadityasondhi/cockroach that referenced this issue on Oct 7, 2024:
This patch integrates raft snapshot ingestion with the disk write mechanism in admission control. The following internal machinery changes were made to make that possible:

- `SnapshotQueue` was added as an implementation of the `requester` interface. Internally, it is a simple FIFO queue, unlike the other work queues, since we can assume that all snapshots have the same priority and are processed as system-tenant requests.
- A new `kvStoreTokenChildGranter` was created to grant tokens to snapshot requests.
- We now have a `StoreWorkType` that differentiates `regular`, `elastic`, and `snapshot` work for the store granters. This was necessary because snapshots do not incur the same write-amp as the other work types: they land in L6 of the LSM due to excises. We also want these requests to be subject to pacing based on disk bandwidth only.
- We now prioritize store writes in the following order: `regular`, `snapshot`, `elastic`.
- The `demuxHandle` of the `GrantCoordinator` now uses `StoreWorkType`.

The integration point for the `SnapshotQueue` is in `Receive()`, where a pacing mechanism processes incoming snapshots (see the sketch after this message). A snapshot may incur up to `snapshotBurstSize` of disk writes before asking for further admission of the same size. The `multiSSTWriter` uses Pebble's SST size estimates to maintain a running count of disk writes incurred by the snapshot ingest. Once the SST is finalized, we deduct or return further tokens to settle against the actual size.

Closes cockroachdb#120708.

Release note (ops change): Admission Control now has an integration for pacing snapshot ingest traffic based on disk bandwidth. `kvadmission.store.snapshot_ingest_bandwidth_control.enabled` is used to turn on this integration. Note that it requires provisioned bandwidth to be set for the store (or for the cluster, through the cluster setting) for it to take effect.
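The accumulate-then-block shape of that pacing loop is easy to illustrate. Below is a minimal Go sketch; `SnapshotPacer`, its `admit` callback, and the burst value are illustrative stand-ins, not CockroachDB's actual API:

```go
// A minimal sketch of burst-based pacing, assuming hypothetical names.
package main

import (
	"context"
	"fmt"
)

// snapshotBurstSize is the amount of snapshot disk writes permitted per
// admission request (illustrative value, not the real constant's value).
const snapshotBurstSize int64 = 1 << 20 // 1 MiB

// SnapshotPacer accumulates pending snapshot write bytes and asks
// admission control for another burst once one is used up, throttling
// the ingest at roughly the granted disk-write rate.
type SnapshotPacer struct {
	admit       func(ctx context.Context, bytes int64) error // blocks until tokens are granted
	unaccounted int64                                         // bytes written since the last grant
}

// Pace records count bytes of imminent disk writes; whenever a full
// burst has accumulated, it blocks for admission of the same size.
func (p *SnapshotPacer) Pace(ctx context.Context, count int64) error {
	p.unaccounted += count
	for p.unaccounted >= snapshotBurstSize {
		if err := p.admit(ctx, snapshotBurstSize); err != nil {
			return err
		}
		p.unaccounted -= snapshotBurstSize
	}
	return nil
}

func main() {
	p := &SnapshotPacer{admit: func(_ context.Context, b int64) error {
		fmt.Println("admitted burst of", b, "bytes") // stand-in for the real token grant
		return nil
	}}
	// Simulate SST chunks arriving during snapshot receipt.
	for _, chunk := range []int64{400 << 10, 700 << 10, 300 << 10} {
		_ = p.Pace(context.Background(), chunk)
	}
}
```

In the real integration this is threaded through `Receive()` with tokens granted via the `SnapshotQueue`; the sketch shows only the pacing arithmetic.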
aadityasondhi added further commits to aadityasondhi/cockroach referencing this issue between Oct 7 and Oct 11, 2024, all carrying the same commit message as above.
craig bot pushed a commit that referenced this issue on Oct 11, 2024:
#131243: admission, kvserver: snapshot integration for disk bandwidth r=sumeerbhola a=aadityasondhi

The merged PR carries the same commit message as above and closes #120708.

Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
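The priority ordering the commit message lists (`regular`, then `snapshot`, then `elastic`) amounts to walking work types in a fixed order when granting. A minimal sketch under that assumption follows; `grantNext` and the constant names are illustrative, and the real granter is considerably more involved (token buckets, child granters, `demuxHandle`):

```go
// A minimal sketch of the stated grant ordering, with hypothetical names.
package main

import "fmt"

// StoreWorkType distinguishes classes of store writes so snapshots can
// be paced on disk bandwidth without the write-amp model applied to
// regular and elastic work.
type StoreWorkType int

const (
	RegularStoreWork StoreWorkType = iota
	SnapshotStoreWork
	ElasticStoreWork
)

func (t StoreWorkType) String() string {
	return [...]string{"regular", "snapshot", "elastic"}[t]
}

// grantNext walks work types in priority order -- regular, then
// snapshot, then elastic -- and grants to the first one with waiting
// work, mirroring the ordering in the commit message.
func grantNext(hasWaiting func(StoreWorkType) bool) (StoreWorkType, bool) {
	for _, t := range []StoreWorkType{RegularStoreWork, SnapshotStoreWork, ElasticStoreWork} {
		if hasWaiting(t) {
			return t, true
		}
	}
	return 0, false
}

func main() {
	waiting := map[StoreWorkType]bool{SnapshotStoreWork: true, ElasticStoreWork: true}
	if t, ok := grantNext(func(t StoreWorkType) bool { return waiting[t] }); ok {
		fmt.Println("granting to:", t) // granting to: snapshot
	}
}
```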
annrpom pushed a commit to annrpom/cockroach that referenced this issue on Oct 14, 2024, with the same commit message as above.
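Finally, the `multiSSTWriter` accounting described in the commit message (charge tokens against Pebble's running size estimate, then settle when the SST is finalized) can be sketched as follows. `sstTokenAccount` and its methods are hypothetical names for illustration, not CockroachDB's API:

```go
// A minimal sketch of estimate-vs-actual token reconciliation.
package main

import "fmt"

// sstTokenAccount tracks disk-write tokens charged for an in-progress
// SST based on running size estimates, then settles against the actual
// file size when the SST is finalized.
type sstTokenAccount struct {
	charged int64             // tokens already deducted from the granter
	adjust  func(delta int64) // positive: deduct more; negative: return tokens
}

// onEstimate charges tokens for newly estimated bytes as the SST grows.
func (a *sstTokenAccount) onEstimate(estBytes int64) {
	if delta := estBytes - a.charged; delta > 0 {
		a.adjust(delta)
		a.charged = estBytes
	}
}

// onFinalize settles the account against the SST's actual size,
// deducting or returning the difference.
func (a *sstTokenAccount) onFinalize(actualBytes int64) {
	a.adjust(actualBytes - a.charged)
	a.charged = actualBytes
}

func main() {
	var granted int64
	acct := &sstTokenAccount{adjust: func(d int64) { granted += d }}
	acct.onEstimate(800)  // running Pebble size estimate
	acct.onEstimate(1200) // estimate grows: charge the delta
	acct.onFinalize(1100) // actual size below last estimate: tokens returned
	fmt.Println("net tokens charged:", granted) // 1100
}
```

Settling at finalize keeps the disk-bandwidth accounting honest whether the running estimates overshoot or undershoot the eventual file size.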
See the thread starting at #80607 (comment) for context.
@aadityasondhi @andrewbaptist
Jira issue: CRDB-36837
Epic: CRDB-37479