-
Notifications
You must be signed in to change notification settings - Fork 770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor pg_stat_bgwriter metrics into standalone collector #556
refactor pg_stat_bgwriter metrics into standalone collector #556
Conversation
@SuperQ This should be ready to review now |
3dfb94f
to
9b9c1a7
Compare
Thanks, sorry for the delay in reviewing. I have some concerns about renaming the namespace on these metrics. But It may be a good idea long-term. I guess we need to make that decision first. |
@SuperQ Is that something to discuss here or an issue or would that be better posed to the mailing list? |
9b9c1a7
to
6909cef
Compare
This moves the metrics that are queried from pg_stat_bgwriter into a dedicated collector instead of dynamically generating queries and metrics from a map. It renames some metrics including adding the `_total` suffix on all of the counters to match prometheus standards. This implementation uses QueryRowContext to allow for later addition of context for cancellation. From the Postgres documentation, it states that there is one line per WAL sender process, but it is unclear how to differentiate between them in any meaningful way. When querying the table, there is no column to identify the row, only metrics about bgwriter. Signed-off-by: Joe Adams <github@joeadams.io>
d6c15d6
to
27d23d0
Compare
@SuperQ I refactored this now with the new collector registration and reverted the namespace changes. We can always revisit this in the future. |
collector/collector.go
Outdated
// TODO(@sysadmind): We need to discover the downstream servers and add them here. | ||
// if p.autoDiscoverDatabases { | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're intending to drop the autoDiscoverDatabases
feature. IIRC It's misused by the queries.yaml to turn this exporter into a generic SQL exporter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to hear that. I was going to ask you about the usage of this flag because it is challenging to understand from the code. I'm not surprised that it causes issues for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it's a confusing mess. Part of why I haven't done much maintenance work on this exporter myself. 😹
Signed-off-by: Joe Adams <github@joeadams.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks!
NOTE: pg_stat_bgwriter counter metrics had the `_total` suffix added #556 * [CHANGE] refactor pg_stat_bgwriter metrics into standalone collector #556 * [FEATURE] Add pg_database collector #613 * [ENHANCEMENT] Add pg_database_size_bytes metric #613 * [BUGFIX] Avoid parsing error from bogus Azure Flexible Server custom GUC #587 * [BUGFIX] Fix pg_stat_archiver error in 9.4 and earlier. #599 * [BUGFIX] Sanitize setting values because of Aurora irregularity #620 Signed-off-by: SuperQ <superq@gmail.com>
* Update build * Update to Go 1.18. * Update minimum Go version to 1.17. * Update Go modules for 1.17 format. * Bump Go modules * Enable dependabot. * Update Prometheus common files. * Fixup yamllint. Signed-off-by: SuperQ <superq@gmail.com> * Update common Prometheus files Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update common Prometheus files (prometheus-community#650) Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update common Prometheus files Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update readme to include Postgres 14 support It looks like postgres 14.1 was added to CI here: prometheus-community@fcb2535 See also: prometheus-community#651 (comment) Signed-off-by: Austin Godber <godber@uberhip.com> * Bump github.com/prometheus/common from 0.34.0 to 0.35.0 Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.34.0 to 0.35.0. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.34.0...v0.35.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Release v0.11.0 NOTE: pg_stat_bgwriter counter metrics had the `_total` suffix added prometheus-community#556 * [CHANGE] refactor pg_stat_bgwriter metrics into standalone collector prometheus-community#556 * [FEATURE] Add pg_database collector prometheus-community#613 * [ENHANCEMENT] Add pg_database_size_bytes metric prometheus-community#613 * [BUGFIX] Avoid parsing error from bogus Azure Flexible Server custom GUC prometheus-community#587 * [BUGFIX] Fix pg_stat_archiver error in 9.4 and earlier. prometheus-community#599 * [BUGFIX] Sanitize setting values because of Aurora irregularity prometheus-community#620 Signed-off-by: SuperQ <superq@gmail.com> * fix for exporter issue 633 fix for exporter issue 633: prometheus-community#633 "Scan error on column index 2, name \"checkpoint_write_time\": converting driver.Value type float64 (\"6.594096e+06\") to a int: invalid syntax prometheus-community#633" Signed-off-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> * Fix checkpoint_sync_time value type Error: sql: Scan error on column index 3, name \"checkpoint_sync_time\": converting driver.Value type float64 (\"1.876469e+06\") to a int: invalid syntax See also: prometheus-community#633 prometheus-community#666 Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr> * Release 0.11.1 * [BUGFIX] Fix checkpoint_write_time value type prometheus-community#666 * [BUGFIX] Fix checkpoint_sync_time value type prometheus-community#667 Signed-off-by: SuperQ <superq@gmail.com> * PMM-10820 missing metric restore * PMM-10820 missing metric drop * PMM-10820 merge fix * PMM-10820 PR review fix Signed-off-by: SuperQ <superq@gmail.com> Signed-off-by: prombot <prometheus-team@googlegroups.com> Signed-off-by: Austin Godber <godber@uberhip.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr> Co-authored-by: SuperQ <superq@gmail.com> Co-authored-by: prombot <prometheus-team@googlegroups.com> Co-authored-by: Austin Godber <godber@uberhip.com> Co-authored-by: Joe Adams <github@joeadams.io> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> Co-authored-by: Nicolas Rodriguez <nico@nicoladmin.fr>
…writer_refactor refactor pg_stat_bgwriter metrics into standalone collector
NOTE: pg_stat_bgwriter counter metrics had the `_total` suffix added prometheus-community#556 * [CHANGE] refactor pg_stat_bgwriter metrics into standalone collector prometheus-community#556 * [FEATURE] Add pg_database collector prometheus-community#613 * [ENHANCEMENT] Add pg_database_size_bytes metric prometheus-community#613 * [BUGFIX] Avoid parsing error from bogus Azure Flexible Server custom GUC prometheus-community#587 * [BUGFIX] Fix pg_stat_archiver error in 9.4 and earlier. prometheus-community#599 * [BUGFIX] Sanitize setting values because of Aurora irregularity prometheus-community#620 Signed-off-by: SuperQ <superq@gmail.com>
* Update build * Update to Go 1.18. * Update minimum Go version to 1.17. * Update Go modules for 1.17 format. * Bump Go modules * Enable dependabot. * Update Prometheus common files. * Fixup yamllint. Signed-off-by: SuperQ <superq@gmail.com> * Update common Prometheus files Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update common Prometheus files (prometheus-community#650) Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update common Prometheus files Signed-off-by: prombot <prometheus-team@googlegroups.com> * Update readme to include Postgres 14 support It looks like postgres 14.1 was added to CI here: prometheus-community@fcb2535 See also: prometheus-community#651 (comment) Signed-off-by: Austin Godber <godber@uberhip.com> * Bump github.com/prometheus/common from 0.34.0 to 0.35.0 Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.34.0 to 0.35.0. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.34.0...v0.35.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Release v0.11.0 NOTE: pg_stat_bgwriter counter metrics had the `_total` suffix added prometheus-community#556 * [CHANGE] refactor pg_stat_bgwriter metrics into standalone collector prometheus-community#556 * [FEATURE] Add pg_database collector prometheus-community#613 * [ENHANCEMENT] Add pg_database_size_bytes metric prometheus-community#613 * [BUGFIX] Avoid parsing error from bogus Azure Flexible Server custom GUC prometheus-community#587 * [BUGFIX] Fix pg_stat_archiver error in 9.4 and earlier. prometheus-community#599 * [BUGFIX] Sanitize setting values because of Aurora irregularity prometheus-community#620 Signed-off-by: SuperQ <superq@gmail.com> * fix for exporter issue 633 fix for exporter issue 633: prometheus-community#633 "Scan error on column index 2, name \"checkpoint_write_time\": converting driver.Value type float64 (\"6.594096e+06\") to a int: invalid syntax prometheus-community#633" Signed-off-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> * Fix checkpoint_sync_time value type Error: sql: Scan error on column index 3, name \"checkpoint_sync_time\": converting driver.Value type float64 (\"1.876469e+06\") to a int: invalid syntax See also: prometheus-community#633 prometheus-community#666 Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr> * Release 0.11.1 * [BUGFIX] Fix checkpoint_write_time value type prometheus-community#666 * [BUGFIX] Fix checkpoint_sync_time value type prometheus-community#667 Signed-off-by: SuperQ <superq@gmail.com> * PMM-10820 missing metric restore * PMM-10820 missing metric drop * PMM-10820 merge fix * PMM-10820 PR review fix Signed-off-by: SuperQ <superq@gmail.com> Signed-off-by: prombot <prometheus-team@googlegroups.com> Signed-off-by: Austin Godber <godber@uberhip.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> Signed-off-by: Nicolas Rodriguez <nico@nicoladmin.fr> Co-authored-by: SuperQ <superq@gmail.com> Co-authored-by: prombot <prometheus-team@googlegroups.com> Co-authored-by: Austin Godber <godber@uberhip.com> Co-authored-by: Joe Adams <github@joeadams.io> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: bravosierrasierra <bravosierrasierra@users.noreply.github.com> Co-authored-by: Nicolas Rodriguez <nico@nicoladmin.fr>
This moves the metrics that are queried from pg_stat_bgwriter into a dedicated collector instead of dynamically generating queries and metrics from a map. It adopts a new, consistent namespace
postgres
for its metrics and includes the_total
suffix on all of the counters to match prometheus standards.This implementation uses QueryRowContext to allow for later addition of context for cancellation. The Postgres docs state that this should only ever be a single row.
This ignores any sort of future Collector interface to keep the surface area small and to allow for that interface to get fleshed out before implementation.
The previous implementation had a label
server
, so I kept it but I am unsure if it's even useful.@SuperQ This should not be merged before the v0.10.0 tag. I am also very open to feedback as this is the first step of many in the refactor.