run bundle - ConfigMap is invalid: []: Too long: must have at most 1048576 bytes #6323

Closed
disaster37 opened this issue Feb 21, 2023 · 20 comments · Fixed by #6408
Labels: help wanted, needs discussion, priority/important-soon

@disaster37

Bug Report

What did you do?

I ran:

operator-sdk run bundle quay.io/webcenter/elasticsearch-operator-bundle:v0.0.1 -n default

What did you expect to see?

The operator is deployed at the end.

What did you see instead? Under which circumstances?

Failed to run bundle: create catalog: error creating registry pod: error building registry pod definition: configMap error: error updating ConfigMap: error creating ConfigMap: ConfigMap "elasticsearch-operator-catalog-configmap-partition-2" is invalid: []: Too long: must have at most 1048576 bytes 

Environment

Operator type:

language go

Kubernetes cluster type:

Rancher 2.6.8

$ operator-sdk version

operator-sdk version: "v1.26.1", commit: "4582a8414e65f40ebea65c74729f121e1f3e3b9a", kubernetes version: "1.25.0", go version: "go1.19.5", GOOS: "linux", GOARCH: "amd64"

$ go version (if language is Go)

go version go1.19.6 linux/amd64

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"2b525e8d2647a41e686bc7da5b7430667a13953e", GitTreeState:"clean", BuildDate:"2021-08-20T21:43:10Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.14", GitCommit:"3321ffc07d2f046afdf613796f9032f4460de093", GitTreeState:"clean", BuildDate:"2022-11-09T13:32:47Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}

Possible Solution

Additional context

@varshaprasad96
Member

varshaprasad96 commented Feb 27, 2023

Usually, the data stored in a ConfigMap cannot exceed 1 MiB (ref: https://kubernetes.io/docs/concepts/configuration/configmap/#motivation). However, issues about this have been raised against the SDK before. @everettraven @rashmigottipati are you aware of this and/or a possible workaround?

@everettraven
Contributor

Yeah, this should have been resolved by the partitioning logic that I implemented in #6182; however, based on this error message, it looks like there may be a bug.

It looks like it is correctly trying to create a partition, but at some point the partition seems to have gotten too large to fit into the etcd limits.

@everettraven
Contributor

I'm wondering if the problem is either here:

// Create a new ConfigMap
cm = f.makeBaseConfigMap()
// Since adding this data would have made the previous
// ConfigMap too large, add it to this new one.
// No chunk of YAML from the bundle should cause
// the ConfigMap size to exceed 1 MiB and if
// somehow it does then there is a problem with the
// YAML itself. We can't reasonably break it up smaller
// since it is a single object.
cm.Data[defaultConfigMapKey] = yamlDef

or here:
} else {
	// If there is no data in the ConfigMap
	// then this is the first pass. Since it is
	// the first pass go ahead and add the data.
	cm.Data[defaultConfigMapKey] = yamlDef
}

That being said, I'm thinking the issue is more that there is a single item that is too large to fit into a ConfigMap on its own. The partitioning logic is meant to split up the FBC YAML (which consists of multiple YAML definitions separated by ---) into single YAML definitions. My suspicion is that there is a single bundle definition in the FBC that is way too large to fit into a single ConfigMap. The partitioning logic in its current state doesn't know how to handle a situation where it would need to split up a single definition into multiple pieces (and I'm not sure how feasible that is with ConfigMaps, since this would essentially be a single file definition split across 2 ConfigMaps).
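
To make the single-definition limitation concrete, here is a minimal Go sketch of greedy partitioning. It is not the operator-sdk code: partitionYAML and maxConfigMapBytes are hypothetical names, and the sketch assumes the FBC documents are separated by a line containing only ---. It packs whole documents into partitions and returns an error when one document alone exceeds the limit, which is the suspected situation here.

```go
package main

import (
	"fmt"
	"strings"
)

// maxConfigMapBytes mirrors the limit quoted in the error message (1 MiB).
const maxConfigMapBytes = 1048576

// sep is the assumed separator between YAML documents in the FBC.
const sep = "\n---\n"

// partitionYAML is a hypothetical helper, not the operator-sdk implementation.
// It greedily packs whole YAML documents into partitions of at most limit
// bytes. A document larger than limit on its own is reported as an error,
// because there is no smaller unit to split it into.
func partitionYAML(fbc string, limit int) ([]string, error) {
	var partitions []string
	current := ""
	for i, doc := range strings.Split(fbc, sep) {
		if strings.TrimSpace(doc) == "" {
			continue
		}
		if len(doc) > limit {
			return nil, fmt.Errorf("document %d is %d bytes, over the %d byte limit", i, len(doc), limit)
		}
		switch {
		case current == "":
			// First document of a partition: just add it.
			current = doc
		case len(current)+len(sep)+len(doc) <= limit:
			// Still fits next to the documents already collected.
			current += sep + doc
		default:
			// Would overflow: close this partition and start a new one.
			partitions = append(partitions, current)
			current = doc
		}
	}
	if current != "" {
		partitions = append(partitions, current)
	}
	return partitions, nil
}

func main() {
	small := "schema: olm.package\nname: demo"
	fbc := strings.Join([]string{small, small, small}, sep)

	// A tiny limit keeps the example readable without megabytes of input.
	parts, err := partitionYAML(fbc, 70)
	fmt.Println("partitions:", len(parts), "error:", err)

	// A single document bigger than the limit cannot be placed anywhere.
	_, err = partitionYAML(strings.Repeat("x", maxConfigMapBytes+1), maxConfigMapBytes)
	fmt.Println("oversized document rejected:", err != nil)
}
```

In this sketch, a single document larger than the limit hits the error branch no matter how the other documents are packed.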

I would definitely be curious to see what exactly is trying to be placed into this ConfigMap making it too large to see if there is anything we can fix, but if it is indeed the scenario I mentioned above I think there would be a need for a much deeper discussion as to how we could go about fixing this problem.

@disaster37
Author

You can test it from this project: https://github.com/webcenter-fr/elasticsearch-operator

@everettraven
Contributor

Just wanted to follow up here and mention that I did take a look at this and the problem seems to be that a single olm.bundle specification is ~2.5Mb which is way over the traditional etcd limits. This definitely falls under the scenario I outlined in my previous comment where we can't really split up a singular definition.

That being said, I'm looking into what options may be available for us to take to remedy this.

@rnasani

rnasani commented Mar 27, 2023

I hit the same issue today. Is there any workaround or quick fix?

@everettraven
Contributor

Unfortunately I haven't been able to come up with a workaround or a quick fix for this. I apologize for the inconvenience!

@rnasani

rnasani commented Mar 30, 2023

I am hitting this issue only when the bundle contains more RBAC rules and CRDs (around 15 to 25 CRDs). Otherwise, it works without any problem.

@acornett21
Contributor

@everettraven I ran into this issue with a partner as well, I have a bundle that we could use to test this logic.

@everettraven
Contributor

Circling back around to this, the idea of using some form of compression has been brought up and is something we could do to help. Another idea is to add a new flag that allows users of operator-sdk run bundle to ignore the olm.bundle.object properties in the bundle resource (this field is so the packages installed with OLM are discoverable via the OLM packageserver) which, in some cases, could reduce the bundle size significantly.

Unfortunately, I don't think we will be able to entirely get rid of this problem but I think a combination of compression and the option to ignore/remove the olm.bundle.object properties from the bundle could make it much harder to hit the etcd limits.

I'm also not sure when we will be able to get the fix out for this - I'll take the milestone off so that this can be discussed in the community meeting next Monday (04/10/2023)

@everettraven removed this from the Backlog milestone Apr 4, 2023
@abhijeet2096-confluent

abhijeet2096-confluent commented Apr 14, 2023

I hit the same issue today. Is there any workaround or quick fix?

FATA[0046] Failed to run bundle: create catalog: error creating registry pod: error building registry pod definition: configMap error: error updating ConfigMap: error creating ConfigMap: ConfigMap "confluent-for-kubernetes-catalog-configmap-partition-2" is invalid: []: Too long: must have at most 1048576 bytes

@nunnatsa
Contributor

The same problem was already addressed in OLM by gzipping the content.

@kensipe added the help wanted label Apr 17, 2023
@kensipe added the priority/important-soon label Apr 17, 2023
@kensipe added this to the v1.30.0 milestone Apr 17, 2023
@nunnatsa
Contributor

/assign @nunnatsa

@abhijeet2096-confluent

Hey @nunnatsa ,

The same problem was already addressed in OLM, by gzip the content.

Can you explain what you meant by this? Is there an alternative way to install and test an operator in an OpenShift cluster?

Currently, to test the bundle, I follow this

I execute this command:

operator-sdk run bundle $BUNDLE_IMAGE_NAME -n $BUNDLE_TESTING_NS --pull-secret-name $PULL_SECRET_NAME  --timeout 15m

but it fails with the error above:

FATA[0046] Failed to run bundle: create catalog: error creating registry pod: error building registry pod definition: configMap error: error updating ConfigMap: error creating ConfigMap: ConfigMap "confluent-for-kubernetes-catalog-configmap-partition-2" is invalid: []: Too long: must have at most 1048576 bytes

@nunnatsa
Contributor

Hi. As far as I understand, using operator-sdk to install and run operators in OpenShift is mostly for development and testing. The main road, as far as I know, is to build an index image from the bundle image(s) and then use a CatalogSource and a Subscription to install.

@acornett21
Contributor

@abhijeet2096-confluent If you scroll down in the document in your second link, it explains how to build a catalog/index image and a CatalogSource. This is another way to test your operator during development.

@svghadi

svghadi commented Sep 6, 2023

I am facing the same issue. As I understand it, this has been fixed by #6408. However, I don't see any documentation or mention of the --gzip-configmap flag anywhere.

When I try to use this flag, I get the message Error: unknown flag: --gzip-configmap.

$ ./operator-sdk_darwin_arm64 version
operator-sdk version: "v1.31.0", commit: "e67da35ef4fff3e471a208904b2a142b27ae32b1", kubernetes version: "1.26.0", go version: "go1.19.11", GOOS: "darwin", GOARCH: "arm64"

$ ./operator-sdk_darwin_arm64 run bundle --gzip-configmap=true quay.io/webcenter/elasticsearch-operator-bundle:v0.0.1 -n default
Error: unknown flag: --gzip-configmap
Usage:
  operator-sdk run bundle <bundle-image> [flags]

@everettraven - can we reopen this issue or is there something I am missing?

@everettraven
Contributor

@svghadi IIRC we removed the flag and made it the default behavior

@acornett21
Contributor

acornett21 commented Sep 6, 2023

@everettraven is correct, there is no flag and the default behavior is to use gzip. I tried to install this operator and everything was successful.

operator-sdk run bundle quay.io/webcenter/elasticsearch-operator-bundle:v0.0.1 -n default
oc get csv
NAME                            DISPLAY                  VERSION   REPLACES   PHASE
elasticsearch-operator.v0.0.1   elasticsearch-operator   0.0.1                Succeeded

@svghadi

svghadi commented Sep 6, 2023

Thanks for the clarification @everettraven

After removing the flag, the operator was installed without any errors using the latest version of operator-sdk.
