Catch error during maintenance window request #19482
Draft
+47
−1
Summary
The Sidekiq job `PollMaintenanceWindows` runs every 3 minutes in each env. I'm adding some error handling so it fails loudly. The PagerDuty API documentation says a `400` is returned if invalid arguments are sent, which happens if a service is deleted from PagerDuty but not from the list of service IDs in the manifest (see `maintenance_client`). Part 2 of this work is to make a Datadog Log monitor to catch this error and alert on it.
The service IDs are sent to PagerDuty as an array. Originally I thought I could modify the code to ignore bad service IDs, but since they're sent as a batch, this isn't possible. We could send each one individually, but that would be a lot of API calls.
Related issue(s)
Testing done
New code is covered by unit tests
Describe what the old behavior was prior to the change
The Sidekiq job, `PagerDuty::PollMaintenanceWindows.new.perform`, would fail silently if a bad PagerDuty service ID was passed in.
Describe the steps required to verify your changes are working as expected. Exclusively stating 'Specs run' is NOT acceptable as appropriate testing
Replace `pagerduty_api_token` in config/settings.yml with the real value from Parameter Store
Run `PagerDuty::PollMaintenanceWindows.new.perform` in the rails console
(… `carma`) and restart the rails console, then run the job again; it will pass (returns `=> []` and no error)
[Screenshots: bad id; valid id]
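The loud-failure behavior described in the Summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `FakePagerDutyApi`, `InvalidServiceIdsError`, and `poll_maintenance_windows` are hypothetical names, the response body is made up, and the fake API only mimics the batch behavior (one unknown ID causes the whole request to return `400`).

```ruby
# Hypothetical sketch; none of these names come from vets-api or the PagerDuty client.
Response = Struct.new(:status, :body)

# Raised so the job fails loudly instead of silently returning nothing.
class InvalidServiceIdsError < StandardError; end

# Stand-in for the real batched API request. If any ID in the batch is
# unknown, the whole request is answered with a 400.
class FakePagerDutyApi
  def initialize(known_ids)
    @known_ids = known_ids
  end

  def maintenance_windows(service_ids)
    unknown = service_ids - @known_ids
    return Response.new(400, { 'error' => 'invalid arguments' }) unless unknown.empty?

    Response.new(200, { 'maintenance_windows' => [] })
  end
end

# Sends all service IDs in one batch and raises on a 400 response.
def poll_maintenance_windows(api, service_ids)
  response = api.maintenance_windows(service_ids)
  if response.status == 400
    raise InvalidServiceIdsError,
          "PagerDuty returned 400 for service IDs #{service_ids.inspect}"
  end

  response.body.fetch('maintenance_windows')
end
```

Because the IDs go out as one batch, the sketch cannot tell which ID was rejected; it can only report the whole list, which is why the error message includes every ID that was sent.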
Screenshots
Note: Optional
What areas of the site does it impact?
Vets-api maintenance windows.
Acceptance criteria