-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1794466: modules/installation-mirror-repository: Pivot to by-digest pullspecs #19266
Conversation
@wking: This pull request references Bugzilla bug 1794466, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking, I have a few text requests/suggestions.
@jianlinliu, will you PTAL? |
Suggestions all committed; thanks :). Will squash back down to one commit shortly. |
@kalexand-rh Per Luke, 4.1.30+ also changes the arch, so this should go back into 4.1 as well. |
Add a MIRROR variable in case folks want to point at a different mirror for whatever reason. If neither OPENSHIFT_VERSION nor a Git tag are available, use the latest/ entry on the mirror to fetch the latest release (note that this is not pinned to a particular 4.y, it's just the latest stable release). Drop get_openshift_version and just set up OPENSHIFT_RELEASE_VERSION instead; no need to delay that. Echo the chosen release version and how it was determined to be more transparent and reduce surprise. Drop empty-OPENSHIFT_RELEASE_VERSION checks from the rest of the script, now that OPENSHIFT_RELEASE_VERSION is always set. Pull the release pullspec from the mirror's release.txt for the chosen version. This decouples us from the Quay tagging scheme [1,2], decouples us from the particular repository (doesn't even have to get pushed to Quay), and protects us from malicious registries that could point the mutable tag at a malicious image. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1794466 [2]: openshift/openshift-docs#19266
48c39c7
to
3649282
Compare
Squashed back down to a single commit. |
@@ -22,38 +22,41 @@ Complete the following steps on the bastion host: | |||
. Review the | |||
link:https://access.redhat.com/downloads/content/290/[{product-title} downloads page] | |||
to determine the version of {product-title} that you want to install. | |||
The image release advisory includes the pullspec repository and digest. For example, the link:https://access.redhat.com/errata/product/290/ver=4.3/rhel---7/x86_64/RHBA-2020:0062[put the title of the page that loads] release uses `quay.io/openshift-release-dev/ocp-release` for the pullspec repository and `sha256:3a516480dfd68e0f87f702b4d7bdd6f6a0acfdac5cd2e9767b838ceede34d70d` for the x86-64 4.3.0 release image digest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both image tag and image digest is showing on the same advisory page, why here we have to replace image tag with image digest. We just need tell user to get correct image tag form advisory page, no need to modify the following detailed steps. Do I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just need tell user to get correct image tag form advisory page, no need to modify the following detailed steps.
Then you have to worry about divergence between the advisory and actual tag names. Obviously we'd like to keep those synchronized, but didn't always do so over the multi-arch transition. Also, users got used to the tag == version pattern we had been using for amd64 and were surprised when arch suffixes broke that pattern. By-digest flows avoid implying a tag-name pattern. Personally, I'd rather remove tag names from Errata text entirely, and this change sets the stage for that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tag name is easier reading, and I saw this in another PR - #19380.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tag name is easier reading...
Do you need to read this? You just copy/paste out of the errata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am a customer, I want to get know which version my cluster is running on, which version my cluster will go forward to. But digest can not clearly tell me that. Of course, this is only my personal thoughts and preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this change make me remember my original testing mentioned in https://issues.redhat.com/browse/CORS-1105?focusedCommentId=13851937&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13851937. Do you mean this change will not cause that issue any more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...But digest can not clearly tell me that...
You just copied the digest off a specific release's errata page. So you should remember for the duration of the copy/paste.
And this change make me remember my original testing...
Copying out here, your issue seems to have been:
# oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release@sha256:ce9cc2ba29ab270091266f1f20240cfe6e6dcb11ccc61d5c717ca6b812925613 --force
Updating to release image registry.svc.ci.openshift.org/ocp/release@sha256:ce9cc2ba29ab270091266f1f20240cfe6e6dcb11ccc61d5c717ca6b812925613
# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.2.0-0.nightly-2019-08-26-202352 True False 8m55s Cluster version is registry.svc.ci.openshift.org/ocp/release@sha256:ce9cc2ba29ab270091266f1f20240cfe6e6dcb11ccc61d5c717ca6b812925613
First, in case anyone is tempted to copy/paste, that should use --allow-explicit-upgrade
and not --force
. About the digest in the status, seems like a CVO bug to me. As you can see from the version column, the CVO is able to retrieve the version string compiled into the release image. And that compiled-in version is covered by the image signature, so it's more reliable than the tag, which is not. Would addressing the status-string issue in the CVO cover your concerns with this doc change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would addressing the status-string issue in the CVO cover your concerns with this doc change?
Yeah, if the status string can clearly show version info too, that would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 4.6.0-0.nightly-2020-09-28-125756 -> 4.6.0-0.nightly-2020-09-28-171716, the e2e logs have:
Sep 28 18:04:05.430: INFO: Starting upgrade to version= image=registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-09-28-171716
STEP: continuously hitting pods through the service's LoadBalancer
Sep 28 18:04:35.732: INFO: cluster upgrade is Progressing: Working towards 4.6.0-0.nightly-2020-09-28-171716: 0% complete
And the final ClusterVersion includes:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1310630479774355456/artifacts/e2e-aws-upgrade/clusterversion.json | jq '.items[] | {spec: {desiredUpdate: .spec.desiredUpdate}, status}'
{
"spec": {
"desiredUpdate": {
"force": true,
"image": "registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-09-28-171716",
"version": ""
}
},
"status": {
"availableUpdates": null,
"conditions": [
{
"lastTransitionTime": "2020-09-28T18:00:46Z",
"message": "Done applying 4.6.0-0.nightly-2020-09-28-171716",
"status": "True",
"type": "Available"
},
{
"lastTransitionTime": "2020-09-28T18:00:46Z",
"status": "False",
"type": "Failing"
},
{
"lastTransitionTime": "2020-09-28T18:13:49Z",
"message": "Cluster version is 4.6.0-0.nightly-2020-09-28-171716",
"status": "False",
"type": "Progressing"
},
{
"lastTransitionTime": "2020-09-28T17:31:47Z",
"message": "Unable to retrieve available updates: currently reconciling cluster version 4.6.0-0.nightly-2020-09-28-171716 not found in the \"stable-4.6\" channel",
"reason": "VersionNotFound",
"status": "False",
"type": "RetrievedUpdates"
}
],
"desired": {
"image": "registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-09-28-171716",
"version": "4.6.0-0.nightly-2020-09-28-171716"
},
"history": [
{
"completionTime": "2020-09-28T18:13:49Z",
"image": "registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-09-28-171716",
"startedTime": "2020-09-28T18:04:33Z",
"state": "Completed",
"verified": false,
"version": "4.6.0-0.nightly-2020-09-28-171716"
},
{
"completionTime": "2020-09-28T18:00:46Z",
"image": "registry.svc.ci.openshift.org/ocp/release@sha256:5721030240b241996401f633506b91c7ad1f61acb0c45b669489df3b3917cc8c",
"startedTime": "2020-09-28T17:31:47Z",
"state": "Completed",
"verified": false,
"version": "4.6.0-0.nightly-2020-09-28-125756"
}
],
"observedGeneration": 2,
"versionHash": "WtyYPY_PXQ0="
}
}
So I think the --to-image
version name handling was fixed by openshift/cluster-version-operator#419.
Add a MIRROR variable in case folks want to point at a different mirror for whatever reason. If neither OPENSHIFT_VERSION nor a Git tag are available, use the latest/ entry on the mirror to fetch the latest release (note that this is not pinned to a particular 4.y, it's just the latest stable release). Drop get_openshift_version and just set up OPENSHIFT_RELEASE_VERSION instead; no need to delay that. Echo the chosen release version and how it was determined to be more transparent and reduce surprise. Drop empty-OPENSHIFT_RELEASE_VERSION checks from the rest of the script, now that OPENSHIFT_RELEASE_VERSION is always set. Pull the release pullspec from the mirror's release.txt for the chosen version. This decouples us from the Quay tagging scheme [1,2], decouples us from the particular repository (doesn't even have to get pushed to Quay), and protects us from malicious registries that could point the mutable tag at a malicious image. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1794466 [2]: openshift/openshift-docs#19266
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
/remove-lifecycle stale I still think we want this, but I need to look at version reporting in |
3649282
to
241b8a5
Compare
Rebased onto master and updated to include changes to the very-similar |
@jianlinliu, is this ready to merge now? |
QE already switched tag name to digest in some part of internal testing, but several CEE guys have asked some questions, about digest not easy reading, not having arch info and version info from its naming, where to get digest (if not read from advisories), which is similar concern raised in my previous comments. |
These can be extracted with
For updates, clusters extract this from Update Service Cincinnati graphs, e.g. $ curl -sH Accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=fast-4.5&arch=amd64' | jq -r '.nodes[] | select(.version == "4.4.23").payload'
quay.io/openshift-release-dev/ocp-release@sha256:0455e0201f475a836f2474d4af7864a55208a33eb6932027f63109bbbd821b65 But advisory content from the mirrors is signed with GPG, which is more secure than trusting that the Update Service is uncompromised (even if you trust the TLS between you and the Update Service). Still, it's an option, and if you follow up and check that the by-digest pullspec has a valid GPG signature and the name/arch you expected, then you are protected from any Update Service shenanigans. In this space is the in-flight openshift/cluster-version-operator#431. |
Thanks for your detailed explanation. As QE, I know how to get the image version/name/arch info from the pullspec, such as: you mentioned |
The OCP_RELEASE stuff landed in ced98c6 (osdocs-626 preparing for disconnected installation, 2019-08-26, openshift#16678). Since then, 4.2.14 and later (including some 4.1 releases) have been suffixing the Quay tags with architecture information [1]. By pivoting to digests pulled from the release errata, we decouple these docs from tag names. Also pull the repo portion of the pullspec out of the errata, so there's less hard-coding in the docs. Checking the to-disk flow using the 4.5.11 oc to mirror the 4.5.11 release image: $ oc version --client Client Version: 4.5.11 $ oc adm release mirror --to-dir=/tmp/mirror quay.io/openshift-release-dev/ocp-release@sha256:4d048ae1274d11c49f9b7e70713a072315431598b2ddbb512aee4027c422fe3e info: Mirroring 110 images to file://openshift/release ... ... Update image: openshift/release:4.5.11 To upload local images to a registry, run: oc image mirror --from-dir=/tmp/mirror 'file://openshift/release:4.5.11*' REGISTRY/REPOSITORY Configmap signature file /tmp/mirror/config/signature-sha256-4d048ae1274d11c4.yaml created I've gone with :4.* in the 'image mirror' examples, because I don't want folks to have to bother creating an environment variable, and we have no roadmap for a backwards-incompatible OpenShift v5 today. The reference to 'oc adm release info ...' should mitigate concerns about folks who want to map from by-digest pullspecs back to the release metadata [2]. I'm not linking to an existing section on that command, because the only section we have around that today is _unused_topics/exploring-cvo.adoc. There are some other mentions, e.g. with --changelog in modules/cli-administrator-other.adoc and with --commits in modules/querying-operator-status-after-installation.adoc, but an inline comment seemed simpler here than trying to link to any of those locations with enough guidance for the reader to re-orient once they followed the link. I also updated the very similar modules/update-mirror-repository.adoc with content that minimally diverged from the installation module. It might be worth unifying these modules with conditionals for imageContentSources vs. ImageContentSourcePolicy, but I'm punting on that for now. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1794466 [2]: openshift#19266 (comment)
241b8a5
to
62bad59
Compare
I've pushed 241b8a515c -> 62bad59 to mention
Both of these flows are setting up |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@wking: This pull request references Bugzilla bug 1794466. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
The
OCP_RELEASE
stuff landed in ced98c6 (#16678). Since then, 4.2.14 and later (including some 4.1 releases) have been suffixing the Quay tags with architecture information. By pivoting to digests pulled from the release errata, we decouple these docs from tag names. Also pull the repo portion of the pullspec out of the errata, so there's less hard-coding in the docs.