Enhance MimirRequestLatency runbook with more advice (grafana#1967)
* Enhance MimirRequestLatency runbook with more advice

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2 people authored and williamzelesny committed Jun 6, 2022
1 parent c2ca500 commit ed1bf2d
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions docs/sources/operators-guide/mimir-runbooks/_index.md
@@ -222,6 +222,15 @@ How to **investigate**:
- Check `Memcached Overview` dashboard
  - If memcached eviction rate is high, then you should scale up memcached replicas. Check the recommendations by `Mimir / Scaling` dashboard and make reasonable adjustments as necessary.
  - If memcached eviction rate is zero or very low, then it may be caused by "first time" queries
- Cache query timeouts
  - Check store-gateway logs and look for warnings about timed out Memcached queries (example query: `{namespace="example-mimir-cluster", name=~"store-gateway.*"} |= "level=warn" |= "memcached" |= "timeout"`)
  - If there are indeed a lot of timed out Memcached queries, consider whether the store-gateway Memcached timeout setting (`-blocks-storage.bucket-store.chunks-cache.memcached.timeout`) is sufficient
- By consulting the "Queue length" panel of the `Mimir / Queries` dashboard, determine if queries are waiting in queue due to busy queriers (an indication of this would be queue length > 0 for some time)
  - If queries are waiting in queue
    - Consider scaling up number of queriers if they're not auto-scaled; if auto-scaled, check auto-scaling parameters
  - If queries are not waiting in queue
    - Consider [enabling query sharding]({{< relref "../architecture/query-sharding/index.md#how-to-enable-query-sharding" >}}) if not already enabled, to increase query parallelism
    - If query sharding already enabled, consider increasing total number of query shards (`query_sharding_total_shards`) for tenants submitting slow queries, so their queries can be further parallelized
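
The last point above could be applied per tenant roughly as follows. This is a minimal sketch, assuming the limit is set through the `overrides` section of Mimir's runtime configuration file; the tenant ID `tenant-a` and the value `32` are hypothetical placeholders, not recommendations.

```yaml
# Runtime configuration file (sketch; tenant ID and value are hypothetical)
overrides:
  tenant-a:                           # hypothetical tenant submitting slow queries
    query_sharding_total_shards: 32   # illustrative value; tune for the tenant's workload
```

Raising the shard count only for the affected tenants keeps the extra querier load limited to the queries that need more parallelism.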

#### Alertmanager
