When debugging a relatively underprovisioned cluster, I found that the ingesters couldn't flush data fast enough to keep up with the ingestion rate. I propose we do the following:
1. Add an alert to the mixin that fires when the flush queue length (`cortex_ingester_flush_queue_length`) is continually increasing (see the rule sketch after this list).
2. Add log messages when the queue length is over some threshold (50?); the Go sketch further below shows one way to do this.
3. Increase the default flush queue concurrency.
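For the first item, a starting point might look something like the rule below. This is only a sketch, not the mixin's actual rule: the alert name, thresholds, `for` duration, and label conventions are illustrative, and the real rule would be expressed in the mixin's existing jsonnet rather than raw YAML.

```yaml
groups:
  - name: cortex_ingester_flush_alerts
    rules:
      - alert: CortexIngesterFlushQueueIncreasing
        # Fires when the flush queue is both non-trivially long and still
        # growing, i.e. the ingester is not keeping up with ingestion.
        expr: |
          cortex_ingester_flush_queue_length > 50
          and
          deriv(cortex_ingester_flush_queue_length[30m]) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          message: "Ingester {{ $labels.instance }} flush queue has been growing for at least 15 minutes."
```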
We could also look at more sophisticated strategies such as an adaptive goroutine pool based on outstanding queue length.
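To make the logging threshold and the adaptive-pool idea concrete, here is a rough, self-contained Go sketch. It is not Cortex's flush code; the names, thresholds, and intervals (`flushPool`, `queueWarnLength`, the 1s check interval, and so on) are made up for illustration. It logs when the backlog crosses the proposed threshold and adds flush workers, up to a cap, as the outstanding queue grows.

```go
package main

import (
	"log"
	"sync"
	"sync/atomic"
	"time"
)

const (
	minFlushWorkers  = 4  // baseline concurrency
	maxFlushWorkers  = 32 // cap so we don't overwhelm the chunk store
	queueWarnLength  = 50 // threshold suggested above
	scaleUpPerWorker = 10 // add a worker for every N queued flush ops
)

type flushPool struct {
	queue   chan func() // pending flush operations
	workers int32       // current number of workers
	wg      sync.WaitGroup
}

func newFlushPool(queueSize int) *flushPool {
	p := &flushPool{queue: make(chan func(), queueSize)}
	for i := 0; i < minFlushWorkers; i++ {
		p.startWorker()
	}
	go p.adapt()
	return p
}

func (p *flushPool) startWorker() {
	atomic.AddInt32(&p.workers, 1)
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for op := range p.queue {
			op()
		}
	}()
}

// adapt periodically checks the outstanding queue length, logs when it is
// above the warning threshold, and adds workers (up to a cap) when the
// backlog grows. Workers are never scaled back down in this sketch.
func (p *flushPool) adapt() {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for range ticker.C {
		backlog := len(p.queue)
		if backlog > queueWarnLength {
			log.Printf("flush queue length %d exceeds threshold %d", backlog, queueWarnLength)
		}
		want := minFlushWorkers + backlog/scaleUpPerWorker
		if want > maxFlushWorkers {
			want = maxFlushWorkers
		}
		for int(atomic.LoadInt32(&p.workers)) < want {
			p.startWorker()
		}
	}
}

func main() {
	pool := newFlushPool(1000)
	// Enqueue dummy flush operations to exercise the pool; the queue is
	// intentionally left open since this is only a demonstration.
	for i := 0; i < 200; i++ {
		pool.queue <- func() { time.Sleep(50 * time.Millisecond) }
	}
	time.Sleep(2 * time.Second)
}
```

A real implementation would presumably also scale the pool back down when the backlog drains and expose the current worker count as a metric.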
Hi! This issue has been automatically marked as stale because it has not had any
activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project.
A stalebot can be very useful in closing issues in a number of cases; the most common
is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.
We regularly review closed issues that have a stale label, sorted by thumbs-up count.
We may also:
- Mark issues as revivable if we think the issue is valid but is not something we are likely to prioritize in the future (the issue will still remain closed).
- Add a keepalive label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond to, organize, and prioritize all issues, but it can be a challenging task; our sincere apologies if you find yourself at the mercy of the stalebot.
stalebot added the stale label on Mar 2, 2022