We recently upgraded Thanos and Prometheus to the following versions:
Thanos: from 0.23.0 to 0.29.0
Prometheus: from 2.34.0 to 2.40.0
In production we have around 25,000 targets. Because of the heavy load, we opted for Prometheus shards instead of a single Prometheus server and use Thanos components for aggregation. Our Thanos architecture comprises three components:
Prometheus shards - Scrape the targets and store the metrics in local storage.
Thanos Query - Provides a global view of the TSDB data from all the Prometheus shards.
Thanos Ruler - Evaluates rules and stores their output; the alerting system reads this data and fires alerts based on the rule conditions.
According to the Thanos documentation (https://thanos.io/tip/components/rule.md/#rule-aka-ruler), Thanos Ruler is not recommended except in some specific cases. So we migrated all the recording and alerting rules to the Prometheus shards and removed Thanos Ruler. We have 7 replicas for the Prometheus shards, and since this change we get duplicate alerts (7 alerts) for each rule. Multiple shards scrape the targets of a single namespace, and rule evaluation now happens locally on each shard, so every shard produces its own copy of the result.
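To illustrate what we think is happening (an assumption on our side, not something we have confirmed): as far as we understand, Alertmanager treats two alerts as the same alert only when their label sets are identical. If each shard attaches its own identifying label, the 7 copies would never collapse into one. The sketch below mimics that in plain Python; the label name `shard`, the alert name, the namespace value, and the helper `dedupe_alerts` are all hypothetical and only serve the illustration.

```python
def dedupe_alerts(alerts, drop_labels=()):
    """Keep one alert per distinct label set, ignoring any labels in drop_labels."""
    unique = {}
    for labels in alerts:
        # An alert's identity is its full label set (minus the dropped labels).
        identity = frozenset(
            (key, value) for key, value in labels.items() if key not in drop_labels
        )
        unique.setdefault(identity, labels)
    return list(unique.values())


# Hypothetical: the same rule fires on all 7 shards, each adding its own "shard" label.
alerts = [
    {"alertname": "HighErrorRate", "namespace": "payments", "shard": f"prometheus-{i}"}
    for i in range(7)
]

print(len(dedupe_alerts(alerts)))                          # shard label kept
print(len(dedupe_alerts(alerts, drop_labels=("shard",))))  # shard label dropped
```

With the per-shard label kept, all 7 alerts stay distinct; once it is dropped, they collapse into one. In Prometheus itself, the `alert_relabel_configs` section under `alerting` can drop such a label before alerts are sent, but we are not sure this applies here, since our shards may also be evaluating the rules over different subsets of the data.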
Is this the expected behaviour? When running multiple replicas of Prometheus shards, is using Thanos Ruler the recommended approach, or is Thanos Ruler not needed for our scenario?
Thank you!