Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bundle-upgrade operation stuck and never timeouts #5980

Closed
jotak opened this issue Aug 9, 2022 · 3 comments
Closed

bundle-upgrade operation stuck and never timeouts #5980

jotak opened this issue Aug 9, 2022 · 3 comments
Assignees
Labels
language/go Issue is related to a Go operator project

Comments

@jotak
Copy link

jotak commented Aug 9, 2022

Bug Report

Hi,

When doing a run bundle-upgrade operation to test an operator upgrade, the operation is stuck after logging Found existing catalog source with name etc.

I reproduce consistently, every time I try.

Also, it seems to never timeout (I've been waiting for ~10min so far), apparently not honouring the timeout option that is said to default to 2 minutes.

What did you do?

operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.1.3
# wait a little bit
operator-sdk run bundle-upgrade quay.io/netobserv/network-observability-operator-bundle:v0.1.4

What did you expect to see?

Success

What did you see instead? Under which circumstances?

Stdout:

INFO[0002] Found existing subscription with name netobserv-operator-v0-1-3-sub and namespace network-observability 
INFO[0002] Found existing catalog source with name netobserv-operator-catalog and namespace network-observability 

(and stuck at this point)

Pod logs:

$ oc logs quay-io-netobserv-network-observability-operator-bundle-v0-1-3
time="2022-08-09T10:54:14Z" level=info msg="serving registry" configs=/netobserv-operator-catalog-configs port=50051

catalog-operator

There are a couple of errors of type the object has been modified; please apply your changes to the latest version and try again, not sure if they are relevant for this ticket:

$ oc logs catalog-operator-847b966b64-f7rqf  -n openshift-operator-lifecycle-manager
[...]
time="2022-08-09T10:54:28Z" level=info msg=syncing id=pFDid ip=install-tz8jd namespace=network-observability phase=Installing
time="2022-08-09T10:54:29Z" level=info msg="added to bundle, Kind=ConfigMap" configmap=network-observability/9b34086d5b83308e052bd989ee88c04b7987b0245ac7c3f363215c6215961ba key=netobserv-manager-config_v1_configmap.yaml
time="2022-08-09T10:54:29Z" level=info msg="added to bundle, Kind=ClusterRole" configmap=network-observability/9b34086d5b83308e052bd989ee88c04b7987b0245ac7c3f363215c6215961ba key=netobserv-metrics-reader_rbac.authorization.k8s.io_v1_clusterrole.yaml
time="2022-08-09T10:54:29Z" level=info msg="added to bundle, Kind=Service" configmap=network-observability/9b34086d5b83308e052bd989ee88c04b7987b0245ac7c3f363215c6215961ba key=netobserv-metrics-service_v1_service.yaml
time="2022-08-09T10:54:29Z" level=info msg="added to bundle, Kind=ClusterServiceVersion" configmap=network-observability/9b34086d5b83308e052bd989ee88c04b7987b0245ac7c3f363215c6215961ba key=netobserv-operator.clusterserviceversion.yaml
time="2022-08-09T10:54:29Z" level=info msg="added to bundle, Kind=CustomResourceDefinition" configmap=network-observability/9b34086d5b83308e052bd989ee88c04b7987b0245ac7c3f363215c6215961ba key=flows.netobserv.io_flowcollectors.yaml
time="2022-08-09T10:54:30Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T10:54:30Z" level=info msg=syncing id=IeaE1 ip=install-tz8jd namespace=network-observability phase=Complete
time="2022-08-09T10:54:30Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T10:54:30Z" level=warning msg="an error was encountered during reconciliation" error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"netobserv-operator-v0-1-3-sub\": the object has been modified; please apply your changes to the latest version and try again" event=update reconciling="*v1alpha1.Subscription" selflink=
E0809 10:54:30.954906       1 queueinformer_operator.go:290] sync {"update" "network-observability/netobserv-operator-v0-1-3-sub"} failed: Operation cannot be fulfilled on subscriptions.operators.coreos.com "netobserv-operator-v0-1-3-sub": the object has been modified; please apply your changes to the latest version and try again
time="2022-08-09T10:54:30Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T10:54:30Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T10:54:31Z" level=info msg=syncing id=/vpfz ip=install-tz8jd namespace=network-observability phase=Complete
time="2022-08-09T10:59:07Z" level=error msg="UpdateStatus - error while setting CatalogSource status" error="Operation cannot be fulfilled on catalogsources.operators.coreos.com \"community-operators\": the object has been modified; please apply your changes to the latest version and try again" id=JD1cF source=community-operators
E0809 10:59:07.229352       1 queueinformer_operator.go:290] sync {"update" "openshift-marketplace/community-operators"} failed: Operation cannot be fulfilled on catalogsources.operators.coreos.com "community-operators": the object has been modified; please apply your changes to the latest version and try again
time="2022-08-09T11:02:06Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T11:02:15Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T11:02:15Z" level=info msg=syncing id=15vQh ip=install-tz8jd namespace=network-observability phase=Complete
time="2022-08-09T11:02:15Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T11:17:44Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T11:17:52Z" level=info msg=syncing id=OoS1E ip=install-tz8jd namespace=network-observability phase=Complete
time="2022-08-09T11:17:52Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-08-09T11:17:52Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
[...]

olm-operator

Here again, similar errors:

$ oc logs olm-operator-6854fbd4c6-d6l4r  -n openshift-operator-lifecycle-manager
[...]
time="2022-08-09T10:54:37Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=e5qLU namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:37Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=8Wlx7 namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:37Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=M3nR1 namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:37Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=jD+fF namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:38Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=9D1S5 namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:38Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=p8DoE namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:39Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=/IlK9 namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:39Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=tYayq namespace=network-observability phase=Installing strategy=deployment
time="2022-08-09T10:54:40Z" level=info msg="install strategy successful" csv=netobserv-operator.v0.1.3 id=Gd2sN namespace=network-observability phase=Installing strategy=deployment
I0809 10:54:40.362599       1 event.go:285] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"network-observability", Name:"netobserv-operator.v0.1.3", UID:"a22b0bd6-6fa8-480a-924f-0110469f6304", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"60812", FieldPath:""}): type: 'Normal' reason: 'InstallSucceeded' install strategy completed with no errors
E0809 11:03:18.334912       1 queueinformer_operator.go:290] sync {"update" "openshift-operator-lifecycle-manager/packageserver"} failed: could not update operatorgroups olm.providedAPIs annotation: Operation cannot be fulfilled on operatorgroups.operators.coreos.com "olm-operators": the object has been modified; please apply your changes to the latest version and try again
E0809 11:12:12.583008       1 queueinformer_operator.go:290] sync {"update" "openshift-operator-lifecycle-manager/packageserver"} failed: could not update operatorgroups olm.providedAPIs annotation: Operation cannot be fulfilled on operatorgroups.operators.coreos.com "olm-operators": the object has been modified; please apply your changes to the latest version and try again
E0809 11:16:39.652694       1 queueinformer_operator.go:290] sync {"update" "openshift-operator-lifecycle-manager/packageserver"} failed: could not update operatorgroups olm.providedAPIs annotation: Operation cannot be fulfilled on operatorgroups.operators.coreos.com "olm-operators": the object has been modified; please apply your changes to the latest version and try again
[...]

Environment

Operator type:

/language go

Kubernetes cluster type:

OpenShift (ocp 4.11 rc6)

$ operator-sdk version

operator-sdk version: "v1.22.2", commit: "da3346113a8a75e11225f586482934000504a60f", kubernetes version: "1.24.1", go version: "go1.18.4", GOOS: "linux", GOARCH: "amd64"

$ go version (if language is Go)

go version go1.17.11 linux/amd64

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v0.23.0", GitCommit:"6db43e22ecffd58dbd1cfb3b245967bde98f9a66", GitTreeState:"clean", BuildDate:"2022-03-14T13:42:44Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0+9546431", GitCommit:"0a57f1f59bda75ea2cf13d9f3b4ac5d202134f2d", GitTreeState:"clean", BuildDate:"2022-07-08T19:55:26Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
@openshift-ci openshift-ci bot added the language/go Issue is related to a Go operator project label Aug 9, 2022
@jotak
Copy link
Author

jotak commented Aug 9, 2022

PS: I saw this issue that could be similar: #5773 , but since symptoms seem different, I opened a new one

@jotak
Copy link
Author

jotak commented Aug 9, 2022

Found a workaround, inspired by #5946
Using --index-image pointing to my operator-sdk cli version worked:

operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.1.3 --index-image quay.io/operator-framework/opm:v1.22

operator-sdk run bundle-upgrade quay.io/netobserv/network-observability-operator-bundle:v0.1.4

@rashmigottipati
Copy link
Member

@jotak here's the temporary workaround for this issue before we pin opm's version: operator-framework/operator-registry#984 (comment)

Feel free to reopen this issue if you're still seeing issues with run bundle-upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language/go Issue is related to a Go operator project
Projects
None yet
Development

No branches or pull requests

2 participants