HOSTEDCP-632: hypershift-operator/controllers/hostedcluster: isUpgradeable minor-bump scoping #2336

Closed
wants to merge 1 commit into from

wking
Member

@wking wking commented Mar 28, 2023

Godocs for Upgradeable:

Upgradeable indicates whether the component (operator and all configured operands) is safe to upgrade based on the current cluster state. When Upgradeable is False, the cluster-version operator will prevent the cluster from performing impacted updates unless forced. When set on ClusterVersion, the message will explain which updates (minor or patch) are impacted. When set on ClusterOperator, False will block minor OpenShift updates. The message field should contain a human readable description of what the administrator should do to allow the cluster or component to successfully update. The cluster-version operator will allow updates when this condition is not False, including when it is missing, True, or Unknown.

So the docs scope it specifically to 4.y -> 4.(y+1) minor updates when it is set on ClusterOperator. The ClusterVersion wording is left less specific because some ClusterVersion overrides can break patch updates, and QE asked us to block patch updates in that case as well (openshift/cluster-version-operator#364, https://bugzilla.redhat.com/show_bug.cgi?id=1822844).

With this patch, I'm using availableUpdates and conditionalUpdates to look up a version associated with the proposed target release pullspec. That's a bit less reliable than the current cluster-version operator behavior, which extracts the proposed target version from the proposed release image itself (e.g. see openshift/cluster-version-operator#431). But it's probably sufficient for now, since the odds of the OpenShift Update Service serving bad data are low. And we can refine further in the future if we want.

@openshift-ci openshift-ci bot requested review from csrwng and sjenning March 28, 2023 17:21
@openshift-ci openshift-ci bot added the area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release label Mar 28, 2023
@openshift-ci
Contributor

openshift-ci bot commented Mar 28, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking
Once this PR has been reviewed and has the lgtm label, please assign enxebre for approval. For more information see the Kubernetes Code Review Process.

@wking wking force-pushed the isUpgradeable-for-minor-bumps branch 2 times, most recently from 285c632 to fd519c4 Compare March 28, 2023 19:55
…mp scoping

Godocs for Upgradeable [1]:

  Upgradeable indicates whether the component (operator and all
  configured operands) is safe to upgrade based on the current cluster
  state. When Upgradeable is False, the cluster-version operator will
  prevent the cluster from performing impacted updates unless forced.
  When set on ClusterVersion, the message will explain which updates
  (minor or patch) are impacted. When set on ClusterOperator, False
  will block minor OpenShift updates. The message field should contain
  a human readable description of what the administrator should do to
  allow the cluster or component to successfully update. The
  cluster-version operator will allow updates when this condition is
  not False, including when it is missing, True, or Unknown.

So we specifically doc it as only about 4.y -> 4.(y+1) minor updates
when seen on ClusterOperator.  But we leave it unclear on
ClusterVersion because when you set some ClusterVersion overrides, it
can break patch updates, so QE asked us to also block patch updates in
that case [2,3].

With this patch, I'm using availableUpdates and conditionalUpdates to
look up a version associated with the proposed target release
pullspec.  That's a bit less reliable than the current cluster-version
operator behavior, which is extracting the proposed target version
from the proposed release image itself (e.g. see [4]).  But it's
probably sufficient for now, with the odds that the OpenShift Update
Service serves bad data low.  And we can refine further in the future
if we want.

[1]: https://github.com/openshift/api/blob/cce310ad2932f6de24491052d506926e484c082c/config/v1/types_cluster_operator.go#L179-L190
[2]: openshift/cluster-version-operator#364
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1822844
[4]: openshift/cluster-version-operator#431
@wking wking force-pushed the isUpgradeable-for-minor-bumps branch from fd519c4 to 0c04d51 Compare March 28, 2023 20:06
@openshift-ci
Contributor

openshift-ci bot commented Mar 28, 2023

@wking: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/unit | 0c04d51 | link | true | /test unit |

Contributor

@csrwng csrwng left a comment

Thanks @wking !
One comment. It also looks like you need to fix the unit test.

@@ -4214,10 +4214,40 @@ func (r *HostedClusterReconciler) reconcileAWSSubnets(ctx context.Context, creat
	return nil
}

func releaseImageToVersion(hcluster *hyperv1.HostedCluster, image string) (semver.Version, error) {
Contributor

My preference would be to just look up the version from the release image. For example, see:

	releaseInfo, err := r.ReleaseProvider.Lookup(ctx, hcluster.Spec.Release.Image, pullSecretBytes)
	if err != nil {
		return ctrl.Result{}, fmt.Errorf("failed to lookup release image: %w", err)
	}
	releaseImageVersion, err = semver.Parse(releaseInfo.Version())
	if err != nil {
		return ctrl.Result{}, fmt.Errorf("failed to parse release image version: %w", err)
	}

Member Author

releaseInfo, err := r.ReleaseProvider.Lookup(ctx, hcluster.Spec.Release.Image, pullSecretBytes) 

Tunneling pullSecretBytes down from HostedClusterReconciler.reconcile through isProgressing to isUpgradeable feels awkward. What do you think about adjusting the ProviderWithRegistryOverrides interface to take WithPullSecret, so that information gets pushed down into the structures once we have it? Or alternatively, adjusting the release providers that need the pull secret (just PodProvider and RegistryClientProvider?) to take a callback when initializing them here?

Contributor

The thing is that the provider is a controller-level variable, while the pull secret we use is specific to each HostedCluster. We do want to use the pull secret provided with the HostedCluster, in case you are trying to use a release that only your pull secret can access.

@sjenning sjenning changed the title hypershift-operator/controllers/hostedcluster: isUpgradeable minor-bump scoping HOSTEDCP-632: hypershift-operator/controllers/hostedcluster: isUpgradeable minor-bump scoping Mar 30, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 30, 2023
@openshift-ci-robot

openshift-ci-robot commented Mar 30, 2023

@wking: This pull request references HOSTEDCP-632 which is a valid jira issue.

@sjenning
Contributor

@wking thanks! This has been on our list for a while. The unit test needs fixing up.

}

if requestedVersion.Major == currentTargetVersion.Major && requestedVersion.Minor == currentTargetVersion.Minor {
	// ClusterVersion's Upgradeable condition is mostly about minor bumps from x.y to x.(y+1) and larger. It is not intended to block patch updates from x.y.z to x.y.z' except under very limited circumstances which we can ignore for now.
Member

@wking Not sure why you said "mostly" in "ClusterVersion's Upgradeable condition is mostly about minor bumps from x.y to x.(y+1) and larger"?

Member

I do not think CVO overrides are applicable to managed OCP clusters, so we could just say that the Upgradeable condition is only about minor updates.
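
For reference, the scoping discussed in this thread can be sketched as a small standalone check. This is an illustration under assumptions, not the PR's code: `version` is a hypothetical stand-in for semver.Version with only the compared fields, `blocksUpdate` is a made-up name, and the condition status is passed as a plain string. It follows the godoc rule that only a False status blocks, reuses the major/minor comparison from the diff above, and ignores the ClusterVersion overrides case.

```go
package main

import "fmt"

// version is a hypothetical stand-in for semver.Version, keeping only the
// fields that the comparison needs.
type version struct {
	Major, Minor, Patch uint64
}

// blocksUpdate reports whether an Upgradeable condition with the given
// status should block moving from current to requested. Only "False"
// blocks; a missing, True, or Unknown status allows the update.
func blocksUpdate(upgradeableStatus string, current, requested version) bool {
	if upgradeableStatus != "False" {
		return false
	}
	// Same major and minor means a patch update, which is not blocked here.
	if requested.Major == current.Major && requested.Minor == current.Minor {
		return false
	}
	return true
}

func main() {
	current := version{Major: 4, Minor: 12, Patch: 8}
	// Patch update with Upgradeable=False: prints false.
	fmt.Println(blocksUpdate("False", current, version{Major: 4, Minor: 12, Patch: 9}))
	// Minor update with Upgradeable=False: prints true.
	fmt.Println(blocksUpdate("False", current, version{Major: 4, Minor: 13, Patch: 0}))
	// Minor update with Upgradeable=Unknown: prints false.
	fmt.Println(blocksUpdate("Unknown", current, version{Major: 4, Minor: 13, Patch: 0}))
}
```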

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2023
@openshift-merge-robot
Contributor

PR needs rebase.

@sjenning
Contributor

superseded by #2381
/close

@openshift-ci openshift-ci bot closed this Apr 12, 2023
@openshift-ci
Contributor

openshift-ci bot commented Apr 12, 2023

@sjenning: Closed this PR.
