chore!: adopt log/slog, drop go-kit/log #4089
The bulk of this change set was automated by the following script which is being used to aid in converting the various exporters/projects to use slog:
https://gist.github.com/tjhop/49f96fb7ebbe55b12deee0b0312d8434

This commit includes several changes:
- bump exporter-toolkit to v0.13.1 for log/slog support
- updates golangci-lint deprecated configs
- enables sloglint linter
- removes old go-kit/log linter configs
- introduce some `if logger == nil { $newLogger }` additions to prevent nil references
- converts cluster membership config to use a stdlib compatible slog adapter, rather than creating a custom io.Writer for use as the membership `logOutput` config

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
cc: @SuperQ
Not sure why the lint stage took so long, but it failed due to a timeout:
I don't run into any lint issue locally -- don't know how much caching we do for golangci-lint, but maybe a rerun will work 🤞
Similarly, I don't run into this test issue locally, either 🤔
time=2024-10-31T03:41:02.745Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert=test1[5ae90ff][active]
time=2024-10-31T03:41:03.746Z level=DEBUG source=dispatch.go:531 msg=flushing component=dispatcher aggrGroup="{}:{alertname=\"test1\"}" alerts=[test1[5ae90ff][active]]
time=2024-10-31T03:41:03.747Z level=DEBUG source=notify.go:877 msg="Notify success" component=dispatcher receiver=default integration=webhook[0] aggrGroup="{}:{alertname=\"test1\"}" attempts=1 duration=897.598µs alerts=[test1[5ae90ff][active]]
cli_test.go:80:
collector "webhook":
interval [1,2]
---
- &{map[] 0001-01-01T00:00:00.000Z <nil> [] 0001-01-01T00:00:01.000Z <nil> <nil> { map[alertname:test1]}}[-9.223372036854776e+09:]
[ ✓ ]
Expected total of 1 alerts, got 2
received:
@ 1.019406235
- &{map[] 0001-01-01T00:00:00.000Z <nil> [] 2024-10-31T03:41:03.728Z <nil> <nil> { map[alertname:test1]}}[0.99986554:]
@ 2.087403109
- &{map[] 2024-10-31T03:41:04.728Z <nil> [] 2024-10-31T03:41:03.728Z <nil> <nil> { map[alertname:test1]}}[0.99986554:1.99986554]
FAIL
FAIL github.com/prometheus/alertmanager/test/cli/acceptance 2.814s
Looks like flaky tests. Re-ran and everything is green now.
@@ -75,7 +74,8 @@ func (n *Notifier) Notify(ctx context.Context, as ...*types.Alert) (bool, error)
 }
 data := notify.GetTemplateData(ctx, n.tmpl, as, n.logger)

-level.Debug(n.logger).Log("incident", key)
+// @tjhop: should this use `group` for the keyval like most other notify implementations?
+n.logger.Debug("extracted group key", "incident", key)
We have mixed use of `incident` and `group_key` in a couple of places. It would be nice to make these consistent in a follow-up PR.
Agreed. I went with `group_key` in most places for the sake of having something to key the log message on, since most of the previous calls were bare; the pushover implementation already used the `incident` key, so I left it for consistency in that regard.
If there's a preference for what to align on, we can definitely circle back.
-level.Error(n.logger).Log("err", err)
+// @tjhop: should we `return false, err` here as we do in most
+// other Notify() implementations?
+n.logger.Error("error extracting group key", "err", err)
Yes, I think so, this looks like a mistake!
Ack 👍
We've been trying to keep these logging conversions relatively straightforward -- I'd vote to fix this and the other `s/Info/Error/` logging calls in a separate PR
@@ -446,7 +446,8 @@ Loop:
 break Loop
 case <-t.C:
 if err := runMaintenance(doMaintenance); err != nil {
-level.Info(s.logger).Log("msg", "Running maintenance failed", "err", err)
+// @tjhop: this should probably log at error level
👍
@@ -456,7 +457,8 @@ Loop:
 return
 }
 if err := runMaintenance(doMaintenance); err != nil {
-level.Info(s.logger).Log("msg", "Creating shutdown snapshot failed", "err", err)
+// @tjhop: this should probably log at error level
👍
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
LGTM
Addresses feedback from another PR prometheus#4089 (comment) Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* [CHANGE] Templating errors in the SNS integration now return an error. #3531 #3879
* [CHANGE] Adopt log/slog, drop go-kit/log #4089
* [FEATURE] Add a new Microsoft Teams integration based on Flows #4024
* [FEATURE] Add a new Rocket.Chat integration #3600
* [FEATURE] Add a new Jira integration #3590 #3931
* [FEATURE] Add support for `GOMEMLIMIT`, enable it via the feature flag `--enable-feature=auto-gomemlimit`. #3895
* [FEATURE] Add support for `GOMAXPROCS`, enable it via the feature flag `--enable-feature=auto-gomaxprocs`. #3837
* [FEATURE] Add support for limits of silences including the maximum number of active and pending silences, and the maximum size per silence (in bytes). You can use the flags `--silences.max-silences` and `--silences.max-silence-size-bytes` to set them accordingly #3852 #3862 #3866 #3885 #3886 #3877
* [FEATURE] Muted alerts now show whether they are suppressed or not in both the `/api/v2/alerts` endpoint and the Alertmanager UI. #3793 #3797 #3792
* [ENHANCEMENT] Add support for `content`, `username` and `avatar_url` in the Discord integration. `content` and `username` also support templating. #4007
* [ENHANCEMENT] Only invalidate the silences cache if a new silence is created or an existing silence replaced - should improve latency on both `GET api/v2/alerts` and `POST api/v2/alerts` API endpoint. #3961
* [ENHANCEMENT] Add image source label to Dockerfile. To get changelogs shown when using Renovate #4062
* [ENHANCEMENT] Build using go 1.23 #4071
* [ENHANCEMENT] Support setting a global SMTP TLS configuration. #3732
* [ENHANCEMENT] The setting `room_id` in the WebEx integration can now be templated to allow for dynamic room IDs. #3801
* [ENHANCEMENT] Enable setting `message_thread_id` for the Telegram integration. #3638
* [ENHANCEMENT] Support the `since` and `humanizeDuration` functions to templates. This means users can now format time to more human-readable text. #3863
* [ENHANCEMENT] Support the `date` and `tz` functions to templates. This means users can now format time in a specified format and also change the timezone to their specific locale. #3812
* [ENHANCEMENT] Latency metrics now support native histograms. #3737
* [ENHANCEMENT] Add timeout option for webhook notifier. #4137
* [BUGFIX] Fix the SMTP integration not correctly closing an SMTP submission, which may lead to unsuccessful dispatches being marked as successful. #4006
* [BUGFIX] The `ParseMode` option is now set explicitly in the Telegram integration. If we don't HTML tags had not been parsed by default. #4027
* [BUGFIX] Fix a memory leak that was caused by updates silences continuously. #3930
* [BUGFIX] Fix hiding secret URLs when the URL is incorrect. #3887
* [BUGFIX] Fix a race condition in the alerts - it was more of a hypothetical race condition that could have occurred in the alert reception pipeline. #3648
* [BUGFIX] Fix a race condition in the alert delivery pipeline that would cause a firing alert that was delivered earlier to be deleted from the aggregation group when instead it should have been delivered again. #3826
* [BUGFIX] Fix version in APIv1 deprecation notice. #3815
* [BUGFIX] Fix crash errors when using `url_file` in the Webhook integration. #3800
* [BUGFIX] fix `Route.ID()` returns conflicting IDs. #3803
* [BUGFIX] Fix deadlock on the alerts memory store. #3715
* [BUGFIX] Fix `amtool template render` when using the default values. #3725
* [BUGFIX] Fix `webhook_url_file` for both the Discord and Microsoft Teams integrations. #3728 #3745
* [BUGFIX] Fix wechat api link #4084
* [BUGFIX] Fix build info metric #4166

Signed-off-by: SuperQ <superq@gmail.com>