-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rule: fix panic when calling API /api/v1/rules?type=alert #6189
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, let's try this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing! 🙂
In the rules proto spec, we have four stdtime fields, the last_evaluation
of RuleGroup
, RecordingRule
, and Alert
and this active_at
in AlertInstance
. The gogo/protobuf issue does specify that the field needs to be non-nullable, but no harm in trying this out.
pkg/rules/manager.go
Outdated
@@ -97,12 +97,14 @@ func ActiveAlertsToProto(s storepb.PartialResponseStrategy, a *rules.AlertingRul | |||
active := a.ActiveAlerts() | |||
ret := make([]*rulespb.AlertInstance, len(active)) | |||
for i, ruleAlert := range active { | |||
// https://github.com/gogo/protobuf/issues/519 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep comments consistent with the other workarounds as well.
// https://github.com/gogo/protobuf/issues/519 | |
// UTC needed due to https://github.com/gogo/protobuf/issues/519. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello,
I updated the comment (+ another similar one found in the codebase)
Signed-off-by: Thibaut Ackermann <thibaut.ackermann@al-enterprise.com>
…6189) Signed-off-by: Thibaut Ackermann <thibaut.ackermann@al-enterprise.com>
Hello,
Sometimes, when I open the
/alerts
page on Ruler web UI, I get an error. When it happens, the situation is irremediable and I have to restart the Ruler component completely.. As you can see in the stacktrace linked below, the error look like and old issue fixed in #2925 . After digging, I think this is linked to theAlertInstance.ActiveAt
date, which is not 'formatted' to UTC() before protobuf encoding like the ones in the old PR.I tried to produce a testcase to reproduce this, but unfortunately I failed to find the exact combination of events...
I think this has someting to do with the
reload
process of Ruler (either SIGHUP or POST /-/reload) which might be copying the alerts and losing/changing thetime.Time
Locations (which is the real problem with protobuf encoding)..Here is the full stacktrace from server:
Changes
Verification