
Initialize facilities job error metric #3398

Merged
4 commits merged on Oct 7, 2019

Conversation

rtravitz
Contributor

@rtravitz rtravitz commented Oct 7, 2019

Description of change

This PR helps address #3016

Once a failure metric increments in StatsD, the Facilities alerts in Prometheus currently fire indefinitely until the vets-api-worker instance is redeployed. A PR exists in the devops repo to address this by alerting only when the failure counter for the most recent five-minute window is greater than the failure counter for the preceding five-minute window. However, that comparison doesn't capture the case where the metric goes from not existing to existing the first time a failure occurs, because there is no earlier value to compare against. To solve that case, we can either do some contortions in PromQL to check for the absence of the metric, or we can initialize it to zero in vets-api. This PR does the latter.
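A minimal sketch of the idea, not the code in this PR: incrementing a counter by zero creates the series without recording a failure, so the first real failure has a zero to be compared against. `FakeStatsD`, `ERROR_KEY`, `initialize_error_metric`, and `alert?` are hypothetical names introduced for illustration. The fake client stands in for a real StatsD client so the example runs on its own, and `alert?` is only a simplified model of the window comparison, not PromQL.

```ruby
# Hypothetical in-memory stand-in for a StatsD client, used only so this
# sketch runs without a StatsD server or any gem installed.
class FakeStatsD
  attr_reader :counters

  def initialize
    @counters = {}
  end

  # Counter semantics: add `value` to the named counter, creating the
  # counter if it does not exist yet.
  def increment(key, value = 1)
    @counters[key] = @counters.fetch(key, 0) + value
  end
end

# Hypothetical metric name; the PR only refers to the metric as `Facilities_.*`.
ERROR_KEY = 'facilities.example_job.error'

# Incrementing by zero makes the counter exist at 0 without counting a failure.
def initialize_error_metric(client, key = ERROR_KEY)
  client.increment(key, 0)
end

# Simplified model of the alert comparison (an assumption, not PromQL):
# a window with no sample (nil) cannot be compared, so no alert fires.
def alert?(previous_window, current_window)
  return false if previous_window.nil? || current_window.nil?

  current_window > previous_window
end
```

Under this model, a first failure on an uninitialized metric compares 1 against a missing value and stays silent, while the same failure on an initialized metric compares 1 against 0 and alerts.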

Testing done

  • Checked that the Facilities_.* error metric is initialized to zero in StatsD when Sidekiq starts up

Acceptance Criteria (Definition of Done)

Unique to this PR

  • StatsD error metric is initialized to zero

Applies to all PRs

  • Appropriate logging
  • Swagger docs have been updated, if applicable
  • Provide link to originating GitHub issue, or connected to it via ZenHub
  • Does not contain any sensitive information (i.e. PII/credentials/internal URLs/etc., in logging, hardcoded, or in specs)
  • Provide which alerts would indicate a problem with this functionality (if applicable)

@rtravitz rtravitz requested review from a team as code owners October 7, 2019 18:29
@va-vfs-bot va-vfs-bot temporarily deployed to rt/prometheus-facilities-alerts/master October 7, 2019 19:46 Inactive
3 participants