Stateless Rule: Rollout - Alert For State is lost across restart - Blinking alerts #5219
Comments
If I understand correctly, in Prometheus, state persistence across restarts works by checking […]. I need to double-check whether this mode supports that when using the upstream rule manager. We also need more E2E tests to verify this.
Yeah, I can confirm this is a bug, or rather a feature, I would say. Working on the fix and tests now. We need to implement a […]. Another way to go is to just implement remote read for Thanos Querier? Then we can just connect to it using the remote storage client.
Yeah, I think that's correct. Right now we just need to implement the Querier interface for Thanos Querier so that we can fetch that time series from it.
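To make the idea in the comments above concrete, here is a minimal sketch of restoring an alert's activation time from a persisted series after a restart. It is an illustration only: the `Querier` protocol, `InMemoryQuerier`, `restore_active_at` and `alert_state` are hypothetical names, not the Prometheus or Thanos API. It assumes the series (Prometheus names it `ALERTS_FOR_STATE`) stores the activation time as a Unix timestamp, and it skips the extra safeguards the real restore logic applies.

```python
from typing import Optional, Protocol


class Querier(Protocol):
    """Hypothetical read interface; not the Prometheus or Thanos API."""

    def latest_value(self, metric: str, labels: dict) -> Optional[float]: ...


class InMemoryQuerier:
    """Stand-in for a store that still holds samples written before the restart."""

    def __init__(self, samples: dict):
        self._samples = samples

    def latest_value(self, metric: str, labels: dict) -> Optional[float]:
        # Samples are keyed by metric name plus a sorted tuple of label pairs.
        return self._samples.get((metric, tuple(sorted(labels.items()))))


def restore_active_at(querier: Querier, labels: dict, restart_ts: float) -> float:
    """Return the stored activation time, or the restart time if none is found."""
    stored = querier.latest_value("ALERTS_FOR_STATE", labels)
    return restart_ts if stored is None else stored


def alert_state(now_ts: float, active_at: float, for_seconds: float) -> str:
    """An alert fires once it has been active for at least the `for` duration."""
    return "firing" if now_ts - active_at >= for_seconds else "pending"


labels = {"alertname": "HighLatency"}  # hypothetical alert
key = ("ALERTS_FOR_STATE", tuple(sorted(labels.items())))
restart_ts, for_seconds = 4000.0, 1800.0

with_state = restore_active_at(InMemoryQuerier({key: 1000.0}), labels, restart_ts)
without_state = restore_active_at(InMemoryQuerier({}), labels, restart_ts)
print(alert_state(restart_ts, with_state, for_seconds))
print(alert_state(restart_ts, without_state, for_seconds))
```

With a stored sample the alert keeps its original activation time and is still firing right after the restart. Without one, the `for` timer starts over and the alert drops back to pending.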
Created prometheus/prometheus#10443 upstream to address one of the issues. Once that one is merged, I can open a new PR to fix this.
Hello 👋 Looks like there was no activity on this issue for the last two months.
Referencing #5230 for the implementation of the fix.
I will close this one as the feature was merged into main already. Let us know how the feature works.
Thanos, Prometheus and Golang version used:
thanos v0.25
What happened:
I implemented the stateless mode for the rule component.
When my rulers (k8s deployments) roll out, every alert that was already firing before the rollout and has a long `for` clause (let's say 30 minutes) goes back to Pending for that amount of time and only fires again after 30 minutes. And because of our Alertmanager config, alerts are `resolved` after 5 minutes...
What you expected to happen:
I would like the alerts to keep their state for the `for` clause, maybe with ALERT_FOR_STATE metrics sent to the receivers?
Is it a valid issue? Is there anything I misunderstood? Do others have the same issue?
Thanks
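The timing described in this report can be worked through with a small sketch. It assumes a simplified model: the restart happens at offset zero, receivers treat the alert as resolved a fixed `resolve_after` later, and the alert fires again once the full `for` duration has elapsed. The function name `blink_window` and its parameters are hypothetical, chosen only for illustration.

```python
from datetime import timedelta
from typing import Optional, Tuple


def blink_window(
    for_duration: timedelta,
    resolve_after: timedelta,
    state_restored: bool,
) -> Optional[Tuple[timedelta, timedelta]]:
    """Return (resolved_from, fires_again_at) as offsets from the restart.

    Returns None when receivers never see the alert as resolved.
    """
    if state_restored:
        # The `for` timer continues from its pre-restart start: no gap.
        return None
    if resolve_after >= for_duration:
        # The alert fires again before receivers would mark it resolved.
        return None
    return (resolve_after, for_duration)


window = blink_window(
    for_duration=timedelta(minutes=30),   # value taken from the report
    resolve_after=timedelta(minutes=5),   # value taken from the report
    state_restored=False,
)
if window is not None:
    resolved_from, fires_again_at = window
    print("alert looks resolved for", fires_again_at - resolved_from)
```

Under these assumptions, an alert with a 30 minute `for` clause looks resolved from minute 5 to minute 30 after each rollout, which is the blinking behaviour in the title.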