Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: maybe delay snapshot ingestion #77491

Closed
wants to merge 1 commit into from

Conversation

kvoli
Copy link
Collaborator

@kvoli kvoli commented Mar 8, 2022

Previously, upon receiving a snapshot the store would not check store
health before applying the snapshot. This can result in overload in the
case where the store is already unhealthy. This patch adds a
pre-ingestion check to avoid overloading the store.

The backpressure applied is configurable with
rocksdb.ingest_backpressure.l0_file_count_threshold and
rocksdb.ingest_backpressure.max_delay. To disable snapshot
backpressure, such that no delay is applied, the cluster settings can be
configured to:

SET CLUSTER SETTING rocksdb.ingest_backpressure.l0_file_count_threshold = 99999999
or
SET CLUSTER SETTING rocksdb.ingest_backpressure.max_delay = 0

touches: #74694

Release note ops change: Delay snapshot ingestion when the store is
unhealthy. The cumulative time that requests are delayed is tracked by
range.snapshots.delay.total. This backpressure is configurable with
rocksdb.ingest_backpressure.l0_file_count_threshold and
rocksdb.ingest_backpressure.max_delay.

@kvoli kvoli self-assigned this Mar 8, 2022
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@kvoli kvoli force-pushed the snap-delay branch 4 times, most recently from cfcecc7 to bd20b1a Compare March 8, 2022 18:20
@kvoli kvoli marked this pull request as ready for review March 8, 2022 19:12
@kvoli kvoli requested a review from a team March 8, 2022 19:12
@kvoli kvoli requested a review from a team as a code owner March 8, 2022 19:12
@kvoli kvoli force-pushed the snap-delay branch 2 times, most recently from 3e72471 to 2547c79 Compare March 8, 2022 19:17
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

pkg/kv/kvserver/replica_raftstorage.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_raftstorage.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_raftstorage.go Outdated Show resolved Hide resolved
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Previously, upon receiving a snapshot the store would not check store
health before applying the snapshot. This can result in overload in the
case where the store is already unhealthy. This patch adds a
pre-ingestion check to avoid overloading the store.

The backpressure applied is configurable with
`rocksdb.ingest_backpressure.l0_file_count_threshold` and
`rocksdb.ingest_backpressure.max_delay`. To disable snapshot
backpressure, such that no delay is applied, the cluster settings can be
configured to:

`SET CLUSTER SETTING rocksdb.ingest_backpressure.l0_file_count_threshold = 99999999`
or
`SET CLUSTER SETTING rocksdb.ingest_backpressure.max_delay = 0`

touches: cockroachdb#74694

Release justification: bug fixes

Release note ops change: Delay snapshot ingestion when the store is
unhealthy. The cumulative time that requests are delayed is tracked by
`range.snapshots.delay.total`. This backpressure is configurable with
`rocksdb.ingest_backpressure.l0_file_count_threshold` and
`rocksdb.ingest_backpressure.max_delay`.
@kvoli
Copy link
Collaborator Author

kvoli commented Mar 10, 2022

re 22.1 release - @nvanbenschoten agreed that we won't aim to merge this into this release, due to potential fallout this late.

I'm removing the release justification accordingly.

@kvoli
Copy link
Collaborator Author

kvoli commented Mar 10, 2022

re 22.1 release - @nvanbenschoten agreed that we won't aim to merge this into this release, due to potential fallout this late.

I'm removing the release justification accordingly.

To elaborate further - the concern is that this change doesn't discriminate upon the type of snapshot. There was a discussion of separate FIFO semaphores, which queue snapshots. The ingestion delay could reside in the less critical snapshot semaphore, however not the more critical.

This raises an issue regarding ticketing between them, to limit concurrency - which is open or tbd and makes this approach less desirable.

@kvoli kvoli added the do-not-merge bors won't merge a PR with this label. label Mar 23, 2022
@sumeerbhola
Copy link
Collaborator

@nvanbenschoten @lunevalex I'd like to understand how this PR relates to the discussion we are having in https://docs.google.com/document/d/1_YtzvlMg3IKcM5G_dvqsGyOP465GlFEVr16GPBBBxQk/edit?disco=AAAAXWDzV5s

  • if we want to backport, something like this make sense.
  • if we don't want to backport, I am a bit wary of adding another use case to a hard-to-tune capacity-unaware knob (as mentioned in admission,kv,bulk: unify (local) store overload protection via admission control #75066). If we did the work for byte-based tokens in admission control, which also benefits AddSSTable admission control (so there are other motivations), we could make snapshot application queue for admission. I'm assuming any change that delays snapshot application requires some experimental evaluation "due to potential fallout", so we may as well avoid repeating that work on a temporary solution.

@nvanbenschoten
Copy link
Member

@sumeerbhola I agree with your rationale. This is in a sensitive-enough code area that I don't think we would want to backport this change. That's especially true because none of the support issues that have motivated this issue ended up being due to snapshots.

If we did the work for byte-based tokens in admission control, which also benefits AddSSTable admission control (so there are other motivations), we could make snapshot application queue for admission.

Do you think this would allow us to get rid of the PreIngestDelay we use for AddSSTable ingestion both above and below Raft?

I'm assuming any change that delays snapshot application requires some experimental evaluation "due to potential fallout", so we may as well avoid repeating that work on a temporary solution.

100%. In the past, we've struggled to experimentally validate policy changes at this level. We'll need to do better if we want to add more sophistication here.

@kvoli kvoli closed this Apr 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge bors won't merge a PR with this label.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants