pkg/cvo/sync_worker: Make expected/actual version mismatch fatal #431

wking · 2020-08-09T20:43:45Z

If, for example, the configured ClusterVersion spec.upstream advertised a given image as 4.0.1, but the version baked into the release's own metadata was 1.0.0-abc, report VerifyPayloadVersionFailed and continue to apply the previous release image, instead of optimistically moving on to apply the new release image. This protects users from compromised or man-in-the-middled upstreams who attempt downgrade and similar attacks by misrepresenting a recommended update.

Addressing a FIXME from 58356de (#419).

If, for example, the configured ClusterVersion spec.upstream advertised a given image as 4.0.1, but the version baked into the release's own metadata was 1.0.0-abc, report VerifyPayloadVersionFailed and continue to apply the previous release image, instead of optimistically moving on to apply the new release image. This protects users from compromised or man-in-the-middled upstreams who attempt downgrade and similar attacks by misrepresenting a recommended update. Addressing a FIXME from 58356de (*: Port from Update to Release for ClusterVersion status, 2020-07-24, openshift#419).

openshift-ci-robot · 2020-08-10T01:17:29Z

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e	`9be6175`	link	`/test e2e`
ci/prow/e2e-upgrade	`9be6175`	link	`/test e2e-upgrade`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

We had used InsecureEdgeTerminationPolicyAllow since the route landed in 1fdf865 (Create a route for Cincinnati service, 2020-05-01, commit message, but from discussion in the GitHub pull request [1], it was: * InsecureEdgeTerminationPolicyAllow is the default termination policy. * Cincinnati's docs have no preference [2]. However, we really, really want HTTPS security for cluster-version operators making upstream requests for update recommendations. There are long-term plans for tightening down guards against malicious, compromised, or man-in-the-middled update recommendation services, but today we have yet to land even guards as basic as "upstream is lying about the version string associated with a given release image" [3]. By removing HTTP termination [4], we force consumers to configure their clients, including the cluster-version operator, with https:// URIs (or do something else explicit like setting up their own HTTP termination) before they can access the policy-engine output, which reduces the risk that they will recieve and trust compromised update graphs. This may be a breaking change, but: * We're still in beta, and not yet in general-availability with backwards-compatability requirements. * Folks who have configured their cluster-version operators and other clients with http:// upstreams should *want* to be broken. We are protecting them from all sorts of compromised-upstream failure modes. * The cluster-version operator, and other well-behaved clients, will report understandable error messages for "I tried to connect over HTTP and there was nobody there", which will lead users into auditing and fixing their upstream URIs, so recovering from the breakage should not be to onerous. [1]: openshift#30 (comment) [2]: https://github.com/openshift/cincinnati/blame/0bb5f6f3228858f9e5d1807bd6f45f46e537cdea/docs/user/running-cincinnati.md#L87-L88 [3]: openshift/cluster-version-operator#431 [4]: https://github.com/openshift/api/blob/346618ed7d5e6396191efe6f10b2c36f1e95d8b7/route/v1/types.go#L258-L259

We had used InsecureEdgeTerminationPolicyAllow since the route landed in 1fdf865 (Create a route for Cincinnati service, 2020-05-01, openshift#30). The motivation for that value didn't make it into the Git commit message, but from discussion in the GitHub pull request [1], it was: * InsecureEdgeTerminationPolicyAllow is the default termination policy. * Cincinnati's docs have no preference [2]. However, we really, really want HTTPS security for cluster-version operators making upstream requests for update recommendations. There are long-term plans for tightening down guards against malicious, compromised, or man-in-the-middled update recommendation services, but today we have yet to land even guards as basic as "upstream is lying about the version string associated with a given release image" [3]. By removing HTTP termination [4], we force consumers to configure their clients, including the cluster-version operator, with https:// URIs (or do something else explicit like setting up their own HTTP termination) before they can access the policy-engine output, which reduces the risk that they will recieve and trust compromised update graphs. This may be a breaking change, but: * We're still in beta, and not yet in general-availability with backwards-compatability requirements. * Folks who have configured their cluster-version operators and other clients with http:// upstreams should *want* to be broken. We are protecting them from all sorts of compromised-upstream failure modes. * The cluster-version operator, and other well-behaved clients, will report understandable error messages for "I tried to connect over HTTP and there was nobody there", which will lead users into auditing and fixing their upstream URIs, so recovering from the breakage should not be to onerous. [1]: openshift#30 (comment) [2]: https://github.com/openshift/cincinnati/blame/0bb5f6f3228858f9e5d1807bd6f45f46e537cdea/docs/user/running-cincinnati.md#L87-L88 [3]: openshift/cluster-version-operator#431 [4]: https://github.com/openshift/api/blob/346618ed7d5e6396191efe6f10b2c36f1e95d8b7/route/v1/types.go#L258-L259

openshift-merge-robot · 2020-12-09T19:33:30Z

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-agnostic-upgrade	`9be6175`	link	`/test e2e-agnostic-upgrade`
ci/prow/e2e-agnostic	`9be6175`	link	`/test e2e-agnostic`
ci/prow/e2e-agnostic-operator	`9be6175`	link	`/test e2e-agnostic-operator`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

jottofar · 2021-02-03T14:25:23Z

/retest

jottofar · 2021-02-03T14:30:06Z

/lgtm

openshift-ci-robot · 2021-02-03T14:30:37Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jottofar, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jottofar,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2021-02-09T02:24:43Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-02-09T03:03:43Z

/retest

Please review the full test history for this PR and help us cut down flakes.

…mp scoping Godocs for Upgradeable [1]: Upgradeable indicates whether the component (operator and all configured operands) is safe to upgrade based on the current cluster state. When Upgradeable is False, the cluster-version operator will prevent the cluster from performing impacted updates unless forced. When set on ClusterVersion, the message will explain which updates (minor or patch) are impacted. When set on ClusterOperator, False will block minor OpenShift updates. The message field should contain a human readable description of what the administrator should do to allow the cluster or component to successfully update. The cluster-version operator will allow updates when this condition is not False, including when it is missing, True, or Unknown. So we specifically doc it as only about 4.y -> 4.(y+1) minor updates when seen on ClusterOperator. But we leave it unclear on ClusterVersion because when you set some ClusterVersion overrides, it can break patch updates, so QE asked us to also block patch updates in that case [2,3]. With this patch, I'm using availableUpdates and conditionalUpdates to look up a version associated with the proposed target release pullspec. That's a bit less reliable than the current cluster-version operator behavior, which is extracting the proposed target version from the proposed release image itself (e.g. see [4]). But it's probably sufficient for now, with the odds that the OpenShift Update Service serves bad data low. And we can refine further in the future if we want. [1]: https://github.com/openshift/api/blob/cce310ad2932f6de24491052d506926e484c082c/config/v1/types_cluster_operator.go#L179-L190 : [2]: openshift/cluster-version-operator#364 [3]: https://bugzilla.redhat.com/show_bug.cgi?id=1822844 [4]: openshift/cluster-version-operator#431

openshift-ci-robot requested review from LalatenduMohanty and sdodson August 9, 2020 20:44

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2020

wking mentioned this pull request Aug 9, 2020

*: Port from Update to Release for ClusterVersion status #419

Merged

wking force-pushed the fatal-cincinnati-version-mismatch branch from c53b9b2 to 9be6175 Compare August 9, 2020 20:46

wking mentioned this pull request Sep 15, 2020

pkg/controller/cincinnati: Use InsecureEdgeTerminationPolicyNone openshift/cincinnati-operator#64

Merged

wking mentioned this pull request Sep 30, 2020

Bug 1794466: modules/installation-mirror-repository: Pivot to by-digest pullspecs openshift/openshift-docs#19266

Closed

openshift-ci-robot assigned jottofar Feb 3, 2021

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 3, 2021

openshift-merge-robot merged commit ac702a5 into openshift:master Feb 9, 2021

wking deleted the fatal-cincinnati-version-mismatch branch April 6, 2021 22:23

wking mentioned this pull request Mar 28, 2023

HOSTEDCP-632: hypershift-operator/controllers/hostedcluster: isUpgradeable minor-bump scoping openshift/hypershift#2336

Closed

wking mentioned this pull request Sep 18, 2024

OSDOCS#12209: Sigstore signature verification for core OCP images openshift/openshift-docs#81985

Open

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pkg/cvo/sync_worker: Make expected/actual version mismatch fatal #431

pkg/cvo/sync_worker: Make expected/actual version mismatch fatal #431

wking commented Aug 9, 2020

openshift-ci-robot commented Aug 10, 2020

openshift-merge-robot commented Dec 9, 2020

jottofar commented Feb 3, 2021

jottofar commented Feb 3, 2021

openshift-ci-robot commented Feb 3, 2021

openshift-bot commented Feb 9, 2021

openshift-bot commented Feb 9, 2021

pkg/cvo/sync_worker: Make expected/actual version mismatch fatal #431

pkg/cvo/sync_worker: Make expected/actual version mismatch fatal #431

Conversation

wking commented Aug 9, 2020

openshift-ci-robot commented Aug 10, 2020

openshift-merge-robot commented Dec 9, 2020

jottofar commented Feb 3, 2021

jottofar commented Feb 3, 2021

openshift-ci-robot commented Feb 3, 2021

openshift-bot commented Feb 9, 2021

openshift-bot commented Feb 9, 2021