Bug 1822844: Block z level upgrades when ClusterVersionOverridesSet is set #364

jottofar · 2020-04-30T16:19:33Z

CVO Upgradeable=False should block all upgrades including z level.

openshift-ci-robot · 2020-04-30T16:19:39Z

@jottofar: This pull request references Bugzilla bug 1822844, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.5.0) matches configured target release for branch (4.5.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

WIP: Bug 1822844: Block z level upgrades

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

jottofar · 2020-05-01T00:18:46Z

/test e2e-aws-upgrade

jottofar · 2020-05-06T14:56:26Z

/test e2e-aws

jottofar · 2020-05-15T17:48:29Z

/retest

jottofar · 2020-06-01T17:08:31Z

/retest

jottofar · 2020-06-01T17:44:12Z

/test e2e-aws

jottofar · 2020-06-02T00:01:57Z

/test e2e-aws

wking · 2020-07-23T19:56:41Z

Looks good to me, but unit failed, including:

 --- FAIL: TestCVO_UpgradePreconditionFailing (0.00s)
    cvo_scenarios_test.go:1429: ([]testing.Action) (len=3 cap=3) {
         (testing.GetActionImpl) {
...

Possibly needs some CI updates for the Completed pivot like this one?

jottofar · 2020-07-23T20:30:15Z

/retest

wking · 2020-07-23T21:59:40Z

/lgtm

openshift-ci-robot · 2020-07-23T21:59:57Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jottofar, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jottofar,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jottofar · 2020-07-24T00:20:49Z

/retest

openshift-bot · 2020-07-24T00:46:19Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-07-24T02:17:22Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-07-24T02:43:20Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-07-24T04:10:57Z

@jottofar: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-gcp-upgrade	`a862696`	link	`/test e2e-gcp-upgrade`
ci/prow/e2e-gcp	`a862696`	link	`/test e2e-gcp`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-bot · 2020-07-24T04:53:22Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-07-24T06:11:22Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-07-24T08:22:50Z

@jottofar: All pull requests linked via external trackers have merged: openshift/cluster-version-operator#364. Bugzilla bug 1822844 has been moved to the MODIFIED state.

In response to this:

Bug 1822844: Block z level upgrades when ClusterVersionOverridesSet is set

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Address a bug introduced by cc1921d (pkg/start: Release leader lease on graceful shutdown, 2020-08-03, openshift#424), where canceling the Operator.Run context would leave the operator with no time to attempt the final sync [1]:

  E0119 22:24:15.924216 1 cvo.go:344] unable to perform final sync: context canceled

With this commit, I'm piping through shutdownContext, which gets a two-minute grace period beyond runContext, to give the operator time to push out that final status (which may include important information like the fact that the incoming release image has completed verification).

---

This commit picks c4ddf03 (pkg/cvo: Use shutdownContext for final status synchronization, 2021-01-19, openshift#517) back to 4.5. It's not a clean pick, because we're missing changes like:

* b72e843 (Bug 1822844: Block z level upgrades if ClusterVersionOverridesSet set, 2020-04-30, openshift#364).
* 1d1de3b (Use context to add timeout to cincinnati HTTP request, 2019-01-15, openshift#410).

which also touched these lines. But we've gotten this far without backporting rhbz#1822844, and openshift#410 was never associated with a bug in the first place, so instead of pulling back more of 4.6 to get a clean pick, I've just manually reconciled the pick conflicts.

Removing Start from pkg/start (again) fixes a buggy re-introduction in the manually-backported 20421b6 (*: Add lots of Context and options arguments, 2020-07-24, openshift#470).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1916384#c10

The argument landed in b72e843 (Bug 1822844: Block z level upgrades if ClusterVersionOverridesSet set, 2020-04-30, openshift#364) for use by Upgradeable.Run. But even then, that method opens by retrieving a (possibly cached) ClusterVersion resource from the configured lister, so there's no need to pass the explicit argument.

We should save explicit inputs for things that need to be passed in from memory at call-time, and not use them for information that can be retrieved from precondition-creation-time callbacks. And even for things that need to come from memory at call time, we should be using ReleaseContext so we can add and remove properties without having to touch function signatures for precondition implementations that don't care about the properties we're touching.

While I'm touching the Run call site, I replaced a context.TODO with a context.Background. As pointed out in the docs [1], Background is preferred for tests.

[1]: https://pkg.go.dev/context#Background

…mp scoping

Godocs for Upgradeable [1]:

  Upgradeable indicates whether the component (operator and all configured operands) is safe to upgrade based on the current cluster state. When Upgradeable is False, the cluster-version operator will prevent the cluster from performing impacted updates unless forced. When set on ClusterVersion, the message will explain which updates (minor or patch) are impacted. When set on ClusterOperator, False will block minor OpenShift updates. The message field should contain a human readable description of what the administrator should do to allow the cluster or component to successfully update. The cluster-version operator will allow updates when this condition is not False, including when it is missing, True, or Unknown.

So we specifically doc it as only about 4.y -> 4.(y+1) minor updates when seen on ClusterOperator. But we leave it unclear on ClusterVersion because when you set some ClusterVersion overrides, it can break patch updates, so QE asked us to also block patch updates in that case [2,3].

With this patch, I'm using availableUpdates and conditionalUpdates to look up a version associated with the proposed target release pullspec. That's a bit less reliable than the current cluster-version operator behavior, which is extracting the proposed target version from the proposed release image itself (e.g. see [4]). But it's probably sufficient for now, with the odds that the OpenShift Update Service serves bad data low. And we can refine further in the future if we want.

[1]: https://github.com/openshift/api/blob/cce310ad2932f6de24491052d506926e484c082c/config/v1/types_cluster_operator.go#L179-L190
[2]: openshift/cluster-version-operator#364
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1822844
[4]: openshift/cluster-version-operator#431
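The lookup described above might be sketched in Go as below. This is a simplified illustration, not the HyperShift or cluster-version operator implementation: the flat `Release` type, the helper names `versionForImage`, `majorMinor`, and `isMinorBump`, and the example pullspecs are all assumptions, and the real API nests conditional updates differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Release is a simplified, hypothetical stand-in for an update entry that
// pairs a version string with a release image pullspec.
type Release struct {
	Version string
	Image   string
}

// versionForImage searches the available updates and then the conditional
// updates for an entry whose pullspec matches the proposed target image.
func versionForImage(image string, available, conditional []Release) (string, bool) {
	for _, list := range [][]Release{available, conditional} {
		for _, r := range list {
			if r.Image == image {
				return r.Version, true
			}
		}
	}
	return "", false
}

// majorMinor extracts the leading major and minor numbers from a version
// such as "4.6.1".
func majorMinor(version string) (int, int, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q has no minor component", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor in %q: %w", version, err)
	}
	return major, minor, nil
}

// isMinorBump reports whether target is a 4.y -> 4.(y+1)-style (or larger)
// update relative to current, as opposed to a patch-level update.
func isMinorBump(current, target string) (bool, error) {
	cMajor, cMinor, err := majorMinor(current)
	if err != nil {
		return false, err
	}
	tMajor, tMinor, err := majorMinor(target)
	if err != nil {
		return false, err
	}
	return tMajor > cMajor || (tMajor == cMajor && tMinor > cMinor), nil
}

func main() {
	available := []Release{{Version: "4.5.4", Image: "example.com/release@sha256:aaa"}}
	conditional := []Release{{Version: "4.6.0", Image: "example.com/release@sha256:bbb"}}

	target, ok := versionForImage("example.com/release@sha256:bbb", available, conditional)
	if !ok {
		fmt.Println("target version unknown; caller must decide how to proceed")
		return
	}
	minor, err := isMinorBump("4.5.3", target)
	fmt.Println(target, minor, err)
}
```

As the commit message notes, a lookup like this depends on the update service's data being accurate; when the pullspec is not found, the sketch reports that rather than guessing a version.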

openshift-ci-robot requested review from smarterclayton and wking April 30, 2020 16:19

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2020

jottofar force-pushed the bug-1822844 branch 2 times, most recently from be0728a to 27cea86 Compare April 30, 2020 20:20

jottofar force-pushed the bug-1822844 branch from 1b15119 to 32221b2 Compare May 5, 2020 18:19

openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2020

jottofar force-pushed the bug-1822844 branch from 7b609b6 to 55c1d14 Compare May 15, 2020 15:58

openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2020

jottofar force-pushed the bug-1822844 branch from 55c1d14 to 7517544 Compare May 15, 2020 16:08

jottofar force-pushed the bug-1822844 branch from 7517544 to b1a0f92 Compare May 20, 2020 20:47

jottofar force-pushed the bug-1822844 branch from e9cd7f1 to 97530da Compare June 1, 2020 16:01

jottofar force-pushed the bug-1822844 branch 4 times, most recently from a27e4e2 to f63f3a2 Compare June 1, 2020 21:25

jottofar force-pushed the bug-1822844 branch 2 times, most recently from 5fddab0 to 0079459 Compare June 2, 2020 13:44

jottofar changed the title ~~WIP: Bug 1822844: Block z level upgrades~~ Bug 1822844: Block z level upgrades Jun 2, 2020

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2020

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 23, 2020

openshift-ci-robot assigned wking Jul 23, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 23, 2020

openshift-merge-robot merged commit b658b42 into openshift:master Jul 24, 2020

wking mentioned this pull request Feb 20, 2021

Bug 1931025: pkg/cvo: Use shutdownContext for final status synchronization #525

Merged

wking mentioned this pull request Dec 10, 2021

pkg/payload/precondition: File shuffling, drop ClusterVersion argument, etc. #708

Merged

wking mentioned this pull request Mar 28, 2023

HOSTEDCP-632: hypershift-operator/controllers/hostedcluster: isUpgradeable minor-bump scoping openshift/hypershift#2336

Closed
