This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Add alerts for mean db blocked seconds #22822

Merged: daxmc99 merged 4 commits into main from dax/alert_on_db on Jul 16, 2021


daxmc99 (Contributor) commented Jul 14, 2021

During today's incident, one key signal that would have been helpful was the mean DB blocked time.

This value surged during the incident, and an alert on it would have called out the problem.

Screen Shot 2021-07-13 at 1 32 20 PM

daxmc99 requested review from ryanslade, tsenart and efritz on July 14, 2021 02:03
sourcegraph-bot (Contributor) commented Jul 14, 2021

Notifying subscribers in CODENOTIFY files for diff 76c4027...319c52e.

Notify: File(s)
@bobheadxi: monitoring/definitions/shared/dbconns.go, monitoring/monitoring/monitoring.go
@christinaforney: doc/admin/observability/alert_solutions.md, doc/admin/observability/dashboards.md
@slimsag: monitoring/definitions/shared/dbconns.go, monitoring/monitoring/monitoring.go
@sourcegraph/distribution: doc/admin/observability/alert_solutions.md, doc/admin/observability/dashboards.md, monitoring/definitions/shared/dbconns.go, monitoring/monitoring/monitoring.go

Comment on lines 66 to 67
Warning: monitoring.Alert().GreaterOrEqual(10, nil).For(5 * time.Minute),
Critical: monitoring.Alert().GreaterOrEqual(20, nil).For(10 * time.Minute),
Member commented:

I think these should be 10 and 20 milliseconds respectively, so in seconds they should be 0.01 and 0.02, right?


Contributor commented:

I think 5ms should be the warning threshold, and 10ms critical.

Contributor Author commented:

Updated to be 5ms & 10ms

daxmc99 changed the title from "Add alerts for db blocking" to "Add alerts for mean db blocked seconds" on Jul 15, 2021
daxmc99 enabled auto-merge (squash) on July 15, 2021 18:00
daxmc99 merged commit 7eb956e into main on Jul 16, 2021
daxmc99 deleted the dax/alert_on_db branch on July 16, 2021 01:21
Warning: monitoring.Alert().GreaterOrEqual(0.05, nil).For(5 * time.Minute),
Critical: monitoring.Alert().GreaterOrEqual(0.10, nil).For(10 * time.Minute),
Owner: monitoring.ObservableOwnerCoreApplication,
PossibleSolutions: "none",
Contributor commented:

Possible solutions could actually be:

  • Increase SRC_PGSQL_MAX_OPEN, giving the database more memory if needed.
  • Scale up Postgres memory / CPUs.
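As a rough illustration of the first suggestion, the sketch below parses a pool-size setting and falls back to a default when the value is unset or invalid. The helper `maxOpenConns` and the fallback of 30 are assumptions made for this example; how the real service reads `SRC_PGSQL_MAX_OPEN` is not shown in this PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxOpenConns is a hypothetical helper: it parses the raw value of
// SRC_PGSQL_MAX_OPEN and returns fallback if the value is empty, not a
// number, or not positive.
func maxOpenConns(raw string, fallback int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return fallback
	}
	return n
}

func main() {
	// In a service, raw would typically come from os.Getenv("SRC_PGSQL_MAX_OPEN")
	// and the result would be passed to (*sql.DB).SetMaxOpenConns.
	// The fallback of 30 is an assumed value for illustration.
	fmt.Println(maxOpenConns("60", 30)) // explicit setting is used
	fmt.Println(maxOpenConns("", 30))   // unset falls back to the default
}
```

A larger pool lets more requests run concurrently instead of blocking, but each open connection consumes memory on the Postgres side, which is why the suggestion pairs the setting with more database memory.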
