When debugging a relatively underprovisioned cluster, I found that the ingesters couldn't flush data fast enough to keep up with the ingestion rate. I propose we do the following:
1. Add an alert to the mixin that fires when the flush queue length (`cortex_ingester_flush_queue_length`) is continually increasing (see the rule sketch after this list).
2. Add log messages when the queue length is over some threshold (50?); the Go sketch further below shows one way to do this.
3. Increase the default flush queue concurrency.
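For the first item, a starting point might look something like the rule below. This is only a sketch, not the mixin's actual rule: the alert name, thresholds, `for` duration, and label conventions are illustrative, and the real rule would be expressed in the mixin's existing jsonnet rather than raw YAML.

```yaml
groups:
  - name: cortex_ingester_flush_alerts
    rules:
      - alert: CortexIngesterFlushQueueIncreasing
        # Fires when the flush queue is both non-trivially long and still
        # growing, i.e. the ingester is not keeping up with ingestion.
        expr: |
          cortex_ingester_flush_queue_length > 50
          and
          deriv(cortex_ingester_flush_queue_length[30m]) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          message: "Ingester {{ $labels.instance }} flush queue has been growing for at least 15 minutes."
```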
We could also look at more sophisticated strategies such as an adaptive goroutine pool based on outstanding queue length.
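To make the logging threshold and the adaptive-pool idea concrete, here is a rough, self-contained Go sketch. It is not Cortex's flush code; the names, thresholds, and intervals (`flushPool`, `queueWarnLength`, the 1s check interval, and so on) are made up for illustration. It logs when the backlog crosses the proposed threshold and adds flush workers, up to a cap, as the outstanding queue grows.

```go
package main

import (
	"log"
	"sync"
	"sync/atomic"
	"time"
)

const (
	minFlushWorkers  = 4  // baseline concurrency
	maxFlushWorkers  = 32 // cap so we don't overwhelm the chunk store
	queueWarnLength  = 50 // threshold suggested above
	scaleUpPerWorker = 10 // add a worker for every N queued flush ops
)

type flushPool struct {
	queue   chan func() // pending flush operations
	workers int32       // current number of workers
	wg      sync.WaitGroup
}

func newFlushPool(queueSize int) *flushPool {
	p := &flushPool{queue: make(chan func(), queueSize)}
	for i := 0; i < minFlushWorkers; i++ {
		p.startWorker()
	}
	go p.adapt()
	return p
}

func (p *flushPool) startWorker() {
	atomic.AddInt32(&p.workers, 1)
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for op := range p.queue {
			op()
		}
	}()
}

// adapt periodically checks the outstanding queue length, logs when it is
// above the warning threshold, and adds workers (up to a cap) when the
// backlog grows. Workers are never scaled back down in this sketch.
func (p *flushPool) adapt() {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for range ticker.C {
		backlog := len(p.queue)
		if backlog > queueWarnLength {
			log.Printf("flush queue length %d exceeds threshold %d", backlog, queueWarnLength)
		}
		want := minFlushWorkers + backlog/scaleUpPerWorker
		if want > maxFlushWorkers {
			want = maxFlushWorkers
		}
		for int(atomic.LoadInt32(&p.workers)) < want {
			p.startWorker()
		}
	}
}

func main() {
	pool := newFlushPool(1000)
	// Enqueue dummy flush operations to exercise the pool; the queue is
	// intentionally left open since this is only a demonstration.
	for i := 0; i < 200; i++ {
		pool.queue <- func() { time.Sleep(50 * time.Millisecond) }
	}
	time.Sleep(2 * time.Second)
}
```

A real implementation would presumably also scale the pool back down when the backlog drains and expose the current worker count as a metric.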
Hi! This issue has been automatically marked as stale because it has not had any
activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project.
A stalebot can be very useful in closing issues in a number of cases; the most common
is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.
We regularly review closed issues that have a stale label, sorted by thumbs-up count.
We may also:
- Mark issues as revivable if we think the issue is valid but is not something we are likely to prioritize in the future (the issue will still remain closed).
- Add a keepalive label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond to, organize, and prioritize all issues, but it can be a challenging task; our sincere apologies if you find yourself at the mercy of the stalebot.
stalebot added the stale label on Mar 2, 2022