ref(project-upstream): Emit error on multiple project fetch failures #2700

iker-barriocanal · 2023-11-08T15:41:56Z

This PR makes Relay emit an error when project config fetches keep failing for a configurable interval. Currently, Relay emits errors for some specific failure cases, but there's no metric to tell whether it's failing all project config fetches or just a few of them. The interval can be configured with http.project_failure_interval.

Approach

Relay tracks the time of a failed fetch, and resets it on a successful one. If the configured interval elapses and Relay hasn't reset the time, it emits an error for each failed request. This should create a big spike of errors after a sustained period of failed fetches, drawing attention to the issue.

The fetch time is represented as Option<Instant>. Without the Option, resetting the instant would mean setting it to the most recent time. In that case, Relay would emit an error if a fetch fails after a period of not making any fetches at all. This scenario is more likely to happen in low-volume environments, like some Self-Hosted instances.
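The tracking described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `FailureTracker` type, its method names, and the choice to pass `now` in as a parameter (to keep the logic testable) are all assumptions made for the example.

```rust
use std::time::{Duration, Instant};

/// Hypothetical tracker for continued fetch failures.
/// The type and method names are illustrative, not Relay's actual API.
#[derive(Debug)]
pub struct FailureTracker {
    /// Start of the current failure streak.
    /// `None` means the last fetch succeeded, or no fetch has failed yet.
    first_failure: Option<Instant>,
    /// How long failures must continue before errors are emitted.
    failure_interval: Duration,
}

impl FailureTracker {
    pub fn new(failure_interval: Duration) -> Self {
        Self {
            first_failure: None,
            failure_interval,
        }
    }

    /// A successful fetch clears the streak entirely.
    pub fn on_success(&mut self) {
        self.first_failure = None;
    }

    /// Records a failed fetch at `now`.
    /// Returns `true` if an error should be emitted for this failure.
    pub fn on_failure(&mut self, now: Instant) -> bool {
        // Only the first failure of a streak sets the start time.
        let start = *self.first_failure.get_or_insert(now);
        now.duration_since(start) >= self.failure_interval
    }
}
```

With a 90-second interval, a first failure returns `false`, and only failures at least 90 seconds into an unbroken streak return `true`. Because a success stores `None` rather than a fresh timestamp, a failure that follows a long idle period starts a new streak instead of emitting an error immediately.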

Default interval

I've set an arbitrary default value of 90 seconds. This should be short enough to detect that Relay is having issues, without false positives, and long enough to retry the same request twice with default values (max_retry_interval = 60s).

CHANGELOG.md

TBS1996

looks good to me, seems very sensible

ref(project-upstream): Emit error on multiple project fetch failures

9c218a3

iker-barriocanal self-assigned this Nov 8, 2023

iker-barriocanal requested a review from a team November 8, 2023 15:41

update changelog

3ec5813

TBS1996 reviewed Nov 9, 2023

TBS1996 approved these changes Nov 9, 2023

iker-barriocanal added 2 commits November 9, 2023 16:04

Merge branch 'master' into iker/ref/proj-fetch-fail-error

dc12f63

update changelog

34857c7

iker-barriocanal merged commit 84a2526 into master Nov 10, 2023

iker-barriocanal deleted the iker/ref/proj-fetch-fail-error branch November 10, 2023 08:10
