Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable gzip compression when unpacking bundles to ConfigMap #2240

Merged
merged 5 commits into from
Jul 16, 2021

Conversation

zcahana
Copy link
Contributor

@zcahana zcahana commented Jul 7, 2021

Signed-off-by: Zvi Cahana zvic@il.ibm.com

Description of the change:

This is a follow-up PR to operator-framework/operator-registry#685 which introduced Gzip compression support when unpacking bundle images to ConfigMap via opm alpha bundle extract command.
The current PR closes the loop by setting the corresponding -z/--gzip flag when spawning the bundle extract Job that runs the above opm command.

Motivation for the change:

Enable OLM users to install operator with bundles reasonably larger than the current 1 MiB limit.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 7, 2021
@openshift-ci openshift-ci bot requested review from ecordell and njhale July 7, 2021 15:30
@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 7, 2021
@openshift-ci
Copy link

openshift-ci bot commented Jul 7, 2021

Hi @zcahana. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zcahana
Copy link
Contributor Author

zcahana commented Jul 7, 2021

Setting this PR as WIP as there are a couple of open items I'd like to get the maintainers input on:

  • Should this be configurable at the catalog-operator level (i.e., add a catalog-operator flag to enable/disable gzip compression when spawning extract jobs)?
  • I'd like to add an e2e test that uses a large (> 1MiB) bundle and verifies it installs correctly. Which of the existing e2e tests would be a good candidate to extend with such a test case?

Thanks!

/cc @kevinrizza @joelanford @exdx @timflannagan @dinhxuanvu

@joelanford
Copy link
Member

Not all index images will have an opm version that supports the -z flag. I think that means:
a. All existing index images need to continue working with no changes required anywhere. So I don't think we can blindly add -z to the catalog operator's bundle unpacker job.
b. All existing versions of OLM should be able to handle index images that do support compression. Basically, we can't enable compression by default in the unpack command (we didn't so we're good here).

It seems like the ideal would be a technique that lets a new version of OLM attempt to use compression and then fall back to non-compression if the index image does not support compression.

@kevinrizza
Copy link
Member

Not all index images will have an opm version that supports the -z flag. I think that means:
a. All existing index images need to continue working with no changes required anywhere. So I don't think we can blindly add -z to the catalog operator's bundle unpacker job.
b. All existing versions of OLM should be able to handle index images that do support compression. Basically, we can't enable compression by default in the unpack command (we didn't so we're good here).

It seems like the ideal would be a technique that lets a new version of OLM attempt to use compression and then fall back to non-compression if the index image does not support compression.

The opm binary on catalog doesn't matter though. This binary is the one that OLM happens to pull down on cluster as part of the registry image it references when it starts up as a command line parameter.

The only backwards compatibility concern we need to deal with is, if a configmap already exists for a bundle before this change was applied on cluster, then the unpacker job needs to understand how to unpack both zipped an unzipped content

@zcahana
Copy link
Contributor Author

zcahana commented Jul 7, 2021

The only backwards compatibility concern we need to deal with is, if a configmap already exists for a bundle before this change was applied on cluster, then the unpacker job needs to understand how to unpack both zipped an unzipped content

This case is already handled as the configmap reading logic tries to decompress only if an olm.contentEncoding: gzip+basse64 annotation is present on the configmap, and otherwise falls to the previous (uncompressed) logic:
https://github.com/operator-framework/operator-registry/blob/eacb9a40431c56ae887e7696e4931e935e67e045/pkg/configmap/configmap.go#L59

Nevertheless, this might be worthy of a test. On what scenario does OLM try to install from a configmap without going through the extract job first. Does there happen to be an existing test that runs a similar scenario?

@zcahana
Copy link
Contributor Author

zcahana commented Jul 7, 2021

It seems like the ideal would be a technique that lets a new version of OLM attempt to use compression and then fall back to non-compression if the index image does not support compression.

The opm binary on catalog doesn't matter though. This binary is the one that OLM happens to pull down on cluster as part of the registry image it references when it starts up as a command line parameter.

Actually, since the registry image is configurable via the --configmapServerImage flag, a user might indeed set an image with an "old" opm (e.g. using a non-latest release tag). With the current approach, I'm not sure how easy it would be for a user to realize the actual failure reason. If you think this is a concern, we can use the fallback-to-non-compression suggested by @joelanford. or at least try to better communicate the failure reason.

@zcahana
Copy link
Contributor Author

zcahana commented Jul 8, 2021

The e2e test currently fails since the configmap-operator-registry:latest image includes an outdated opm binary that doesn't understand the new --gzip flag.

For my understanding, the configmap-operator-registry:latest image depends on the upstream-registry-builder:latest image which had last been successfully built more than 3 months ago, and ever since fails to build:

image

Any insights into why this is failing?

@zcahana
Copy link
Contributor Author

zcahana commented Jul 8, 2021

I think I figured out why upstream-registry-builder:latest fails to build; attempted to fix in operator-framework/operator-registry#697.

@zcahana
Copy link
Contributor Author

zcahana commented Jul 12, 2021

Note: this PR is currently blocked until the new opm lands in configmap-operator-registry:latest ; As of now, it seems most of the auto-built images are broken (here, here, here, and here - all have recent build failures).

The attempted fix in operator-framework/operator-registry#697 may have addressed one issue with the images, but it appears that there are other issues still breaking the builds (see this comment). It'd be best if someone with a broader understanding of these images (as well as access to the build logs) can take a look at this.

Thanks! :-)

@zcahana
Copy link
Contributor Author

zcahana commented Jul 13, 2021

As a workaround to the upstream-registry-builder:latest failing to build, I've changed the default opm image used by catalog-operator to upstream-opm-builder:latest (configurable via --opmImage). For my understanding this shouldn't be breaking anything, but users setting a custom image via --configmapServerImage will need to set --opmImage now too (release note?).

Also, I've just realized I'll need to bump the module dependency on operator-registry since the configmap decompression logic is vendored from there. Is there a new operator-registry release planned to be cut soon?

@exdx
Copy link
Member

exdx commented Jul 14, 2021

As a workaround to the upstream-registry-builder:latest failing to build, I've changed the default opm image used by catalog-operator to upstream-opm-builder:latest (configurable via --opmImage). For my understanding this shouldn't be breaking anything, but users setting a custom image via --configmapServerImage will need to set --opmImage now too (release note?).

Also, I've just realized I'll need to bump the module dependency on operator-registry since the configmap decompression logic is vendored from there. Is there a new operator-registry release planned to be cut soon?

We actually just had a new operator-registry release today, https://github.com/operator-framework/operator-registry/releases

@zcahana
Copy link
Contributor Author

zcahana commented Jul 14, 2021

We actually just had a new operator-registry release today, https://github.com/operator-framework/operator-registry/releases

Awesome, I'll update the go.mod with it.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 15, 2021
Signed-off-by: Zvi Cahana <zvic@il.ibm.com>
Signed-off-by: Zvi Cahana <zvic@il.ibm.com>
Signed-off-by: Zvi Cahana <zvic@il.ibm.com>
Signed-off-by: Zvi Cahana <zvic@il.ibm.com>
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 15, 2021
@zcahana
Copy link
Contributor Author

zcahana commented Jul 15, 2021

upstream-tests check fails with the following error:

GOOS=linux GOARCH=386 go build -mod=vendor -ldflags "-X github.com/operator-framework/operator-lifecycle-manager/pkg/version.GitCommit=fbbd87a49427add1edff0cde4b3b6c0211dfad20 -X github.com/operator-framework/operator-lifecycle-manager/pkg/version.OLMVersion=`cat OLM_VERSION`" -tags "json1" -o bin/catalog github.com/operator-framework/operator-lifecycle-manager/cmd/catalog
vendor/github.com/operator-framework/operator-registry/internal/property/property.go:8:2: cannot find package "." in:
	/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/io/fs
make: *** [Makefile:100: github.com/operator-framework/operator-lifecycle-manager/cmd/catalog] Error 1

I think this happens since the io/fs package has been introduced only with Go 1.16, and the build job probably runs with an older Go version; I'll update the job so that it'll explicitly use 1.16 as the others - if you prefer me to move that to a separate PR just let me know.

Signed-off-by: Zvi Cahana <zvic@il.ibm.com>
@zcahana
Copy link
Contributor Author

zcahana commented Jul 15, 2021

/retest

#2091

@openshift-ci
Copy link

openshift-ci bot commented Jul 15, 2021

@zcahana: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kevinrizza
Copy link
Member

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 15, 2021
@kevinrizza
Copy link
Member

/approve

@openshift-ci
Copy link

openshift-ci bot commented Jul 15, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kevinrizza, zcahana

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2021
@joelanford
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 16, 2021
@joelanford joelanford changed the title WIP: Enable gzip compression when unpacking bundles to ConfigMap Enable gzip compression when unpacking bundles to ConfigMap Jul 16, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 16, 2021
@openshift-merge-robot openshift-merge-robot merged commit 8f09c66 into operator-framework:master Jul 16, 2021
@zcahana zcahana deleted the bundle_gzip branch July 19, 2021 08:30
@jackson-chris
Copy link

Is it possible for this support be be back ported to earlier version of OCP 4.x? I have heard this functionality is only slated to go into OCP 4.9 currently, is that correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants