Enhance MimirRequestLatency runbook with more advice #1967
@@ -222,6 +222,14 @@ How to **investigate**:
  - Check `Memcached Overview` dashboard
  - If memcached eviction rate is high, then you should scale up memcached replicas. Check the recommendations by `Mimir / Scaling` dashboard and make reasonable adjustments as necessary.
  - If memcached eviction rate is zero or very low, then it may be caused by "first time" queries
- Cache query timeouts
  - Check store-gateway logs and look for warnings about timed out Memcached queries
  - If there are indeed a lot of timed out Memcached queries, consider whether the store-gateway Memcached timeout setting (`-blocks-storage.bucket-store.chunks-cache.memcached.timeout`) is sufficient
- If queries are waiting in queue due to busy queriers
  - Consider scaling up number of queriers if they're not auto-scaled; if auto-scaled, check auto-scaling parameters
- If queries are not waiting in queue due to busy queriers
  - Consider enabling query sharding if not already enabled, to increase query parallelism
aknuds1 marked this conversation as resolved.
  - If query sharding already enabled, consider increasing total number of query shards (`query_sharding_total_shards`) for tenants submitting slow queries, so their queries can be further parallelized
I seem to recall that tuning the number of shards isn't exactly as straightforward as it seems. Is there an existing doc we could link people to that describes how to pick a number of shards?

All doc we have is at

#### Alertmanager

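The investigation steps proposed in this hunk can be read as a small decision flow. The sketch below is only an illustration of that flow, not part of Mimir: the `Observations` fields and the `suggest_actions` helper are hypothetical names, and each field stands for a finding an operator would gather by hand from the dashboards and store-gateway logs mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observations:
    """Hypothetical yes/no findings collected manually during the investigation."""
    memcached_eviction_rate_high: bool
    many_memcached_timeouts: bool
    queries_waiting_in_queue: bool
    queriers_autoscaled: bool
    query_sharding_enabled: bool


def suggest_actions(obs: Observations) -> List[str]:
    """Map observations to the follow-up actions listed in the runbook text."""
    actions: List[str] = []
    if obs.memcached_eviction_rate_high:
        actions.append("scale up memcached replicas")
    if obs.many_memcached_timeouts:
        actions.append("review the store-gateway memcached timeout")
    if obs.queries_waiting_in_queue:
        # Busy queriers: add capacity, or check the autoscaler if one is in use.
        if obs.queriers_autoscaled:
            actions.append("check querier auto-scaling parameters")
        else:
            actions.append("scale up queriers")
    else:
        # Queriers are not the bottleneck: increase query parallelism instead.
        if obs.query_sharding_enabled:
            actions.append("consider increasing query_sharding_total_shards")
        else:
            actions.append("consider enabling query sharding")
    return actions
```

For example, an observation set with queued queries and no auto-scaling yields only the "scale up queriers" suggestion; the sharding advice applies only when queries are not waiting in the queue.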
We're not saying how to check it. The `Mimir / Queries` dashboard has panels named "Queue length". The goal is to keep that queue length at 0 (except for a few sporadic spikes). If the queue length stays above 0 for some time, then we need to scale up queriers.

Done.
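The rule of thumb from the review comment (queue length at 0 apart from sporadic spikes, scale up if it stays above 0) can be sketched as a small check. This is a minimal illustration under stated assumptions: the input is a list of queue-length samples read off the "Queue length" panel at regular intervals, and `max_spike_samples` is a hypothetical tolerance chosen for the example, not a Mimir setting.

```python
from typing import Sequence


def should_scale_up_queriers(
    queue_lengths: Sequence[float],
    max_spike_samples: int = 2,  # hypothetical tolerance for sporadic spikes
) -> bool:
    """Return True if the queue stays above 0 for longer than a short spike.

    A run of consecutive samples above 0 that is longer than
    `max_spike_samples` counts as sustained queueing.
    """
    run = 0
    for value in queue_lengths:
        if value > 0:
            run += 1
            if run > max_spike_samples:
                return True
        else:
            run = 0  # queue drained, so the spike is over
    return False
```

Isolated spikes such as `[0, 0, 3, 0, 0, 1, 0]` do not trigger the check, while a sustained backlog such as `[0, 2, 4, 5, 1, 0]` does.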