fix: add backoff to operator retry mechanism #650

Merged
merged 12 commits into from
Aug 8, 2024

Conversation

@rjferguson21 rjferguson21 commented Aug 7, 2024

Description

Adds a backoff delay between operator retries to fix an issue where temporary failures caused Packages to end up with status=Failed.

Open to suggestions on the delay times; this is the current breakdown:

| Retry Attempt (c) | backOffSeconds (3^c) |
| --- | --- |
| 1 | 3 |
| 2 | 9 |
| 3 | 27 |
| 4 | 81 |
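
The schedule above can be expressed as one small function. This is a minimal sketch, not the code from this PR: `backOffSeconds` as a function name and `MAX_RETRIES` are hypothetical, and the limit of four attempts is assumed from the table.

```typescript
// Hypothetical helper: delay in seconds before retry attempt `c`,
// following the 3^c breakdown in the table above.
function backOffSeconds(retryAttempt: number): number {
  if (!Number.isInteger(retryAttempt) || retryAttempt < 1) {
    throw new RangeError("retryAttempt must be a positive integer");
  }
  return 3 ** retryAttempt;
}

// Assumption: four retry attempts, as listed in the table.
const MAX_RETRIES = 4;

const schedule: number[] = [];
for (let c = 1; c <= MAX_RETRIES; c++) {
  schedule.push(backOffSeconds(c));
}

// Total time spent waiting if every retry attempt is used.
const totalWaitSeconds = schedule.reduce((sum, s) => sum + s, 0);
console.log(schedule, totalWaitSeconds);
```

With these values, a Package that uses all four retries spends 3 + 9 + 27 + 81 = 120 seconds waiting in total.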

It may make more sense to retry more frequently with a shorter delay; exponential backoff may not be the right fit here.
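
To make that trade-off concrete, the sketch below compares the current 3^c schedule with one possible alternative that grows linearly. The linear variant and its 3-second step are purely illustrative assumptions. Nothing in this PR proposes those exact values, and both function names are hypothetical.

```typescript
// Hypothetical alternative: the delay grows linearly instead of exponentially.
// stepSeconds = 3 is an illustrative value, not one proposed in this PR.
function linearDelaySeconds(retryAttempt: number, stepSeconds = 3): number {
  return stepSeconds * retryAttempt;
}

// Current behavior per the breakdown above: 3^c seconds.
function exponentialDelaySeconds(retryAttempt: number): number {
  return 3 ** retryAttempt;
}

const attempts = [1, 2, 3, 4];
const linearTotal = attempts.reduce((sum, c) => sum + linearDelaySeconds(c), 0);
const exponentialTotal = attempts.reduce((sum, c) => sum + exponentialDelaySeconds(c), 0);

console.log({ linearTotal, exponentialTotal });
```

Over four attempts the linear variant waits 30 seconds in total against 120 for the exponential one. It retries sooner, but gives a slow dependency less time to recover before the final attempt.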

Example of what this looks like from the logs:

(⎈|k3d-uds:default) uds-core (backoff) kubectl logs -l pepr.dev/controller=watcher -n pepr-system --tail -1 | grep "seconds before" | jq '.msg'
"Waiting 3 seconds before processing Package grafana/grafana, status.phase: Retrying, observedGeneration: 1, retryAttempt: 1"
"Waiting 3 seconds before processing Package neuvector/neuvector, status.phase: Retrying, observedGeneration: 1, retryAttempt: 1"
"Waiting 9 seconds before processing Package grafana/grafana, status.phase: Retrying, observedGeneration: 1, retryAttempt: 2"
"Waiting 9 seconds before processing Package neuvector/neuvector, status.phase: Retrying, observedGeneration: 1, retryAttempt: 2"
"Waiting 27 seconds before processing Package grafana/grafana, status.phase: Retrying, observedGeneration: 1, retryAttempt: 3"
"Waiting 27 seconds before processing Package neuvector/neuvector, status.phase: Retrying, observedGeneration: 1, retryAttempt: 3"
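
The log lines above follow one consistent shape. As a rough sketch of how such a message could be assembled, the formatter below reproduces that shape. The `RetryInfo` interface and the `waitMessage` function are hypothetical names chosen for illustration and are not taken from the uds-core source.

```typescript
// Hypothetical shape holding the fields that appear in the log lines above.
interface RetryInfo {
  namespace: string;
  name: string;
  phase: string;
  observedGeneration: number;
  retryAttempt: number;
}

// Builds a message in the same format as the logged output,
// using the 3^c delay from the PR description.
function waitMessage(info: RetryInfo): string {
  const seconds = 3 ** info.retryAttempt;
  return (
    `Waiting ${seconds} seconds before processing Package ${info.namespace}/${info.name}, ` +
    `status.phase: ${info.phase}, observedGeneration: ${info.observedGeneration}, ` +
    `retryAttempt: ${info.retryAttempt}`
  );
}

console.log(
  waitMessage({
    namespace: "grafana",
    name: "grafana",
    phase: "Retrying",
    observedGeneration: 1,
    retryAttempt: 2,
  }),
);
```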

Related Issue

Fixes #649

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Other (security config, docs update, etc)

Checklist before merging

@rjferguson21 rjferguson21 marked this pull request as ready for review August 7, 2024 13:36
@rjferguson21 rjferguson21 requested a review from a team as a code owner August 7, 2024 13:36
@rjferguson21 rjferguson21 marked this pull request as draft August 7, 2024 13:50
@rjferguson21 rjferguson21 marked this pull request as ready for review August 7, 2024 14:17
@mjnagel mjnagel left a comment

LGTM in testing. Noting that the only real downside here is that all Package resources are in the same queue, so a backoff on a single Package delays any others from being reconciled. This would be a better experience if/when this pepr request is resolved.
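
The shared-queue downside can be illustrated with a small model. This is only a sketch of the behavior described in the comment, assuming strictly one-at-a-time processing. It is not Pepr's actual queue implementation, and `QueuedPackage`, `serialStartTimes`, and the one-second work time are hypothetical.

```typescript
// Hypothetical queue entry: how long the item backs off, then how long it works.
interface QueuedPackage {
  name: string;
  backOffSeconds: number;
  workSeconds: number;
}

// Returns the second at which each item's reconcile begins when the queue
// is processed strictly one item at a time.
function serialStartTimes(queue: QueuedPackage[]): Map<string, number> {
  const starts = new Map<string, number>();
  let clock = 0;
  for (const item of queue) {
    clock += item.backOffSeconds; // the whole queue waits out this item's backoff
    starts.set(item.name, clock);
    clock += item.workSeconds;
  }
  return starts;
}

const starts = serialStartTimes([
  { name: "grafana/grafana", backOffSeconds: 27, workSeconds: 1 },
  { name: "neuvector/neuvector", backOffSeconds: 0, workSeconds: 1 },
]);

console.log(starts.get("neuvector/neuvector")); // 28
```

In this model the second Package needs no backoff of its own, yet it cannot start until second 28 because it sits behind the first Package's 27-second wait.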

@rjferguson21 rjferguson21 merged commit 52c97fd into main Aug 8, 2024
14 checks passed
@rjferguson21 rjferguson21 deleted the backoff branch August 8, 2024 18:41
mjnagel pushed a commit that referenced this pull request Aug 9, 2024
🤖 I have created a release *beep* *boop*
---


## [0.25.2](v0.25.1...v0.25.2) (2024-08-09)


### Bug Fixes

* add backoff to operator retry mechanism ([#650](#650)) ([52c97fd](52c97fd))
* network allows for core netpols ([#652](#652)) ([e9b69e8](e9b69e8))


### Miscellaneous

* allow for extra keycloak gateway usage with client certs ([#648](#648)) ([7b1c474](7b1c474))
* **deps:** update dependency defenseunicorns/uds-common to v0.11.1 ([#647](#647)) ([768aa1c](768aa1c))
* **deps:** update dependency defenseunicorns/uds-common to v0.11.2 ([#653](#653)) ([f7d1ce8](f7d1ce8))
* **deps:** update grafana helm chart to v8.4.3 ([#660](#660)) ([81c7af0](81c7af0))
* **deps:** update grafana to 11.1.3 ([#607](https://github.com/defenseunicorns/uds-core/pull/607)) ([7b343ac](7b343ac))
* **deps:** update neuvector to 5.3.4 ([#606](#606)) ([526bff4](526bff4))
* **deps:** update pepr to 0.33.0 ([#588](#588)) ([6eee8f0](6eee8f0))
* update identity config to 0.6.0 ([#661](#661)) ([469fed8](469fed8))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Package fails re-reconcile on upgrade