ref(project-upstream): Emit error on multiple project fetch failures #2700

Merged: 4 commits, Nov 10, 2023

iker-barriocanal (Contributor)

This PR makes Relay emit an error when project config fetches keep failing for a configurable interval. Currently, Relay emits errors for some specific failure cases, but there's no metric to tell whether it's failing all project config fetches or just a few of them. The interval can be configured with http.project_failure_interval.

Approach

Relay records the time of a failed fetch and resets it on a successful one. If the configured interval elapses without a reset, Relay emits an error for each failed request. This should produce a large spike of errors after a sustained period of failed fetches, drawing attention to the issue.

The fetch time is represented as Option<Instant>. Without the Option, resetting the instant would mean setting it to the current time. In that case, Relay would emit an error if a fetch fails after a period with no fetches at all, even though nothing had been failing during that period. This scenario is more likely in low-volume environments, such as some Self-Hosted instances.
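A minimal sketch of the idea, not the code from this PR: the type and method names below are hypothetical, and it assumes the stored instant is the first failure since the last success. Passing the current time in as a parameter is also an assumption, made only to keep the example deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical tracker for continued fetch failures (illustrative names only).
struct FetchFailureTracker {
    /// Time of the first failed fetch since the last success.
    /// `None` means there is no ongoing run of failures.
    first_failure: Option<Instant>,
    /// How long failures must persist before an error is reported.
    failure_interval: Duration,
}

impl FetchFailureTracker {
    fn new(failure_interval: Duration) -> Self {
        Self {
            first_failure: None,
            failure_interval,
        }
    }

    /// A successful fetch ends the current run of failures.
    fn record_success(&mut self) {
        self.first_failure = None;
    }

    /// Records a failed fetch at `now`. Returns `true` if failures have
    /// persisted for longer than the interval and an error should be emitted.
    fn record_failure(&mut self, now: Instant) -> bool {
        // Keep the existing start of the failure run, or start one at `now`.
        let first = *self.first_failure.get_or_insert(now);
        now.duration_since(first) > self.failure_interval
    }
}
```

With a 90-second interval, a first failure returns false, and a second failure two minutes later returns true. After record_success, a single failure following an hour of no traffic returns false again, because None marks "no ongoing failures" rather than an old timestamp.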

Default interval

I've set an arbitrary default value of 90 seconds. This should be short enough to detect that Relay is having issues, without false positives, and long enough to retry the same request twice with default values (max_retry_interval = 60s).

iker-barriocanal self-assigned this on Nov 8, 2023
iker-barriocanal requested a review from a team on Nov 8, 2023, 15:41

TBS1996 (Contributor) left a comment

looks good to me, seems very sensible

iker-barriocanal merged commit 84a2526 into master on Nov 10, 2023
iker-barriocanal deleted the iker/ref/proj-fetch-fail-error branch on Nov 10, 2023, 08:10
2 participants