Introduce delay in re-enqueuing machines during creation--deletion failures. #523

Merged
merged 1 commit into from
Oct 6, 2020

hardikdr
Member

@hardikdr hardikdr commented Oct 1, 2020

Co-authored-by: Prashanth prashanth@sap.com

What this PR does / why we need it: Currently, the machine object is re-enqueued immediately when creation or deletion fails. This PR adds a constant delay before the operation is retried. Eventually, exponential backoff should be built on top of, or alongside, this solution.

  • This PR introduces a new phase, `CrashLoopBackOff`. It should be used when machine creation fails but the machine set does not need to replace the machine object, since replacing it would not help anyway.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release note:

Introduced a backoff in re-enqueuing machines on creation/deletion failures. Avoids throttling APIServer & provider calls.
Adds a new phase `CrashLoopBackOff` that is set due to machine creation failures. 

@hardikdr hardikdr requested review from ggaurav10 and a team as code owners October 1, 2020 03:09
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Oct 1, 2020
@hardikdr
Member Author

hardikdr commented Oct 1, 2020

/hold - Need to adapt tests.

@gardener-robot gardener-robot added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Oct 1, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Oct 1, 2020
Resolved review thread: pkg/controller/machine.go
Resolved review thread: pkg/controller/machine.go
Resolved review thread: pkg/apis/machine/types.go
Resolved review thread: pkg/apis/machine/v1alpha1/machine_types.go
Resolved review thread: pkg/controller/machine.go
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Oct 1, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Oct 1, 2020
@hardikdr hardikdr force-pushed the throttle-api-calls branch from c8c663b to 080809a Compare October 3, 2020 12:44
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Oct 3, 2020
@hardikdr
Member Author

hardikdr commented Oct 3, 2020

Update: We decided to merge these changes on master-branch first, and hotfix it here only if we see a need later.
cc @prashanth26

@hardikdr hardikdr force-pushed the throttle-api-calls branch from 080809a to 99e262a Compare October 3, 2020 15:52
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Oct 3, 2020
…n and deletion failures

Co-authored-by: Prashanth <prashanth@sap.com>
@hardikdr hardikdr force-pushed the throttle-api-calls branch from 99e262a to b725343 Compare October 6, 2020 05:11
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Oct 6, 2020
@hardikdr
Member Author

hardikdr commented Oct 6, 2020

/unhold

@gardener-robot gardener-robot removed the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Oct 6, 2020
Contributor

@prashanth26 prashanth26 left a comment

/lgtm tested locally.

@gardener-robot gardener-robot added the reviewed/lgtm Has approval for merging label Oct 6, 2020
@prashanth26 prashanth26 merged commit cc32d3a into gardener:rel-v0.34.0 Oct 6, 2020
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging