Bug 1931025: pkg/cvo: Use shutdownContext for final status synchronization #525
@wking: This pull request references Bugzilla bug 1931025, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/bugzilla refresh
@wking: This pull request references Bugzilla bug 1931025, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
0e927b8 to 9bdc919
Address a bug introduced by cc1921d (pkg/start: Release leader lease on graceful shutdown, 2020-08-03, openshift#424), where canceling the Operator.Run context would leave the operator with no time to attempt the final sync [1]:

  E0119 22:24:15.924216 1 cvo.go:344] unable to perform final sync: context canceled

With this commit, I'm piping through shutdownContext, which gets a two-minute grace period beyond runContext, to give the operator time to push out that final status (which may include important information like the fact that the incoming release image has completed verification).

---

This commit picks c4ddf03 (pkg/cvo: Use shutdownContext for final status synchronization, 2021-01-19, openshift#517) back to 4.5. It's not a clean pick, because we're missing changes like:

* b72e843 (Bug 1822844: Block z level upgrades if ClusterVersionOverridesSet set, 2020-04-30, openshift#364).
* 1d1de3b (Use context to add timeout to cincinnati HTTP request, 2019-01-15, openshift#410).

which also touched these lines. But we've gotten this far without backporting rhbz#1822844, and openshift#410 was never associated with a bug in the first place, so instead of pulling back more of 4.6 to get a clean pick, I've just manually reconciled the pick conflicts.

Removing Start from pkg/start (again) fixes a buggy re-introduction in the manually-backported 20421b6 (*: Add lots of Context and options arguments, 2020-07-24, openshift#470).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1916384#c10
9bdc919 to 785b7b4
 	workerStopCh := make(chan struct{})

 	klog.Infof("Starting ClusterVersionOperator with minimum reconcile period %s", optr.minimumUpdateCheckInterval)
 	defer klog.Info("Shutting down ClusterVersionOperator")

 	if !cache.WaitForCacheSync(stopCh, optr.cacheSynced...) {
-		return fmt.Errorf("caches never synchronized: %w", ctx.Err())
+		return fmt.Errorf("caches never synchronized: %w", runContext.Err())
 	}

 	// trigger the first cluster version reconcile always
 	optr.queue.Add(optr.queueKey())

 	// start the config sync loop, and have it notify the queue when new status is detected
 	go runThrottledStatusNotifier(stopCh, optr.statusInterval, 2, optr.configSync.StatusCh(), func() { optr.queue.Add(optr.queueKey()) })
This method was changed in #517 as well. Did you miss it?
In #517, `runThrottledStatusNotifier` is using `runContext`, which is effectively the same as the `stopCh` I use here (see `stopCh := runContext.Done()` above). In master, `runThrottledStatusNotifier` was ported to use `Context` in #410, which was not associated with a bug, and which I'm not pulling back here. So while I could backport that portion of #410 here, and leave off the rest of #410, I haven't done that with the current PR tip.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jottofar, wking
/retest

Please review the full test history for this PR and help us cut down flakes.
@wking: All pull requests linked via external trackers have merged. Bugzilla bug 1931025 has been moved to the MODIFIED state.
/assign @jottofar